Mapping the Landscape of Critical Thinking Assessment in STEM Education: A Systematic Review of Psychometric Properties, Contextual Implementation, and Future Directions

Keywords: Assessment, Critical Thinking, Psychometrics, STEM Education, Systematic Review

Abstract

Purpose of the study: This study aims to systematically map the landscape of critical thinking assessment in STEM education, with a particular focus on psychometric characteristics, contextual implementation, and emerging research trends.

Methodology: A systematic literature review (SLR) was conducted following the PRISMA protocol, using the Scopus database as the primary source. A total of 58 studies published between 2018 and 2025 were analyzed through bibliometric mapping with VOSviewer and thematic synthesis.

Main Findings: The findings indicate a substantial increase in research on critical thinking assessment in STEM education since 2020, aligning with growing global attention to 21st-century competencies. However, most studies continue to position assessment primarily as a tool for evaluating learning outcomes or the effectiveness of pedagogical interventions, such as project-based, problem-based, and inquiry-based learning. Only a limited number of studies systematically examine the psychometric quality of assessment instruments, including evidence of construct validity, reliability, and multidimensional structure. This pattern reveals a clear gap between assessment practices in STEM education and established standards for educational measurement, which may lead to weak or potentially misleading conclusions about students’ critical thinking abilities.

Novelty/Originality of this study: This review integrates bibliometric and thematic analyses to identify conceptual and methodological gaps in the existing literature and proposes a coherent direction for the development of critical thinking assessments that are both psychometrically robust and contextually relevant within STEM education.


Author Biographies

Yuleks Juru Mudi, Universitas Negeri Yogyakarta

Department of Educational Research and Evaluation, Universitas Negeri Yogyakarta, Yogyakarta, Indonesia

Kana Hidayati, Universitas Negeri Yogyakarta

Department of Educational Research and Evaluation, Universitas Negeri Yogyakarta, Yogyakarta, Indonesia

Muhammad Nursa'ban, Universitas Negeri Yogyakarta

Department of Educational Research and Evaluation, Universitas Negeri Yogyakarta, Yogyakarta, Indonesia

Widowati Pusporini, Universitas Negeri Yogyakarta

Department of Educational Research and Evaluation, Universitas Negeri Yogyakarta, Yogyakarta, Indonesia


Published
2026-03-25
How to Cite
[1]
Y. J. Mudi, K. Hidayati, M. Nursa’ban, and W. Pusporini, “Mapping the Landscape of Critical Thinking Assessment in STEM Education: A Systematic Review of Psychometric Properties, Contextual Implementation, and Future Directions”, In. Sci. Ed. J, vol. 7, no. 2, pp. 265-274, Mar. 2026.