Negative-Worded Items Functioning as Method Artifacts in the Chemistry Identity Scale: Evidence from Exploratory, Confirmatory, and Bifactor Analyses

Yuleks Juru Mudi; Aeda Kasrianti; Sefthy P B Syahailatua; Nurul Isnaini; Balthasar Eba

doi:10.37251/isej.v7i3.2960

Yuleks Juru Mudi: Yogyakarta State University, Indonesia; yuleksmudi2000@gmail.com
Aeda Kasrianti: Yogyakarta State University, Indonesia
Sefthy P B Syahailatua: Yogyakarta State University, Indonesia
Nurul Isnaini: Yogyakarta State University, Indonesia
Balthasar Eba: Yogyakarta State University, Indonesia

Purpose of the study: Chemistry identity is an important affective construct in science education because it is associated with learning engagement, academic persistence, and STEM career aspirations. This study evaluates whether the negatively worded items of the Chemistry Identity Scale represent a substantive dimension of the construct or merely a methodological artifact.

Methodology: This study involved 300 senior high school students in Indonesia who completed the Chemistry Identity Scale, consisting of 27 items, including five negatively worded items. Data were analyzed using a comprehensive psychometric approach that incorporated exploratory factor analysis (EFA), confirmatory factor analysis (CFA), and bifactor modeling to distinguish substantive construct variance from method variance attributable to item wording.
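
The idea of separating construct variance from wording-related method variance can be illustrated with a small simulation. The sketch below is not the authors' analysis: it assumes continuous item scores, a single trait factor rather than the four-factor structure, and hypothetical loadings (0.7, 0.3, 0.6) chosen only for illustration. The function names `simulate_scale` and `wording_cluster_summary` are likewise hypothetical. Only the sample size (300) and the item counts (27 items, five negatively worded) come from the study.

```python
import numpy as np


def simulate_scale(n_respondents=300, n_positive=22, n_negative=5, seed=0):
    """Simulate item scores driven by one trait and one wording method factor.

    All loadings are hypothetical. Negative items are treated as already
    reverse-scored, with a weak trait loading and a strong method loading.
    """
    rng = np.random.default_rng(seed)
    trait = rng.normal(size=(n_respondents, 1))    # substantive construct
    method = rng.normal(size=(n_respondents, 1))   # shared wording effect
    noise_pos = rng.normal(scale=0.7, size=(n_respondents, n_positive))
    noise_neg = rng.normal(scale=0.7, size=(n_respondents, n_negative))
    positive = 0.7 * trait + noise_pos
    negative = 0.3 * trait + 0.6 * method + noise_neg
    return np.hstack([positive, negative])  # negative items are the last columns


def wording_cluster_summary(scores, n_negative=5):
    """Return (mean r among negative items, mean r between negative and positive items)."""
    corr = np.corrcoef(scores, rowvar=False)
    neg = corr[-n_negative:, -n_negative:]
    cross = corr[-n_negative:, :-n_negative]
    # Average the off-diagonal entries only, so the unit diagonal is excluded.
    within_neg = (neg.sum() - np.trace(neg)) / (n_negative * (n_negative - 1))
    return within_neg, cross.mean()


within_neg, neg_pos = wording_cluster_summary(simulate_scale())
print(f"mean r among negative items:       {within_neg:.2f}")
print(f"mean r negative vs positive items: {neg_pos:.2f}")
```

Under these assumptions the negatively worded items correlate more strongly with one another than with the remaining items, which is the kind of pattern that leads an exploratory factor analysis to extract a separate wording cluster.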

Main Findings: The findings showed that negatively worded items tended to form a distinct cluster during the exploratory stage, indicating shared method variance. The best-fitting CFA model was the four-factor model with an additional negative wording method factor. Bifactor analysis revealed the dominance of a general chemistry identity factor; however, negatively worded items contributed minimally to the general construct, suggesting that these items function more as sources of method variance than as substantive indicators.
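
The abstract does not say which indices were used to judge the dominance of the general factor. Two indices commonly reported for bifactor models are the explained common variance (ECV) and omega hierarchical. The sketch below shows how they can be computed from standardized loadings, assuming orthogonal factors and that each item loads on the general factor and one group factor. The function name `bifactor_indices` and the loadings in the example are hypothetical and are not the study's results.

```python
import numpy as np


def bifactor_indices(general, specific, group_of):
    """Compute ECV and omega hierarchical from standardized bifactor loadings.

    general  : loading of each item on the general factor
    specific : loading of each item on its own group (or method) factor
    group_of : group label for each item
    Assumes orthogonal factors, so uniqueness = 1 - general^2 - specific^2.
    """
    general = np.asarray(general, dtype=float)
    specific = np.asarray(specific, dtype=float)
    group_of = np.asarray(group_of)

    uniqueness = 1.0 - general**2 - specific**2

    # ECV: share of common variance attributable to the general factor.
    common_general = (general**2).sum()
    common_specific = (specific**2).sum()
    ecv = common_general / (common_general + common_specific)

    # Omega hierarchical: share of total score variance due to the general factor.
    general_var = general.sum() ** 2
    group_var = sum(specific[group_of == g].sum() ** 2 for g in np.unique(group_of))
    omega_h = general_var / (general_var + group_var + uniqueness.sum())
    return ecv, omega_h


# Hypothetical loadings: three positively worded items (group 0) and
# three negatively worded items (group 1) with weak general loadings.
ecv, omega_h = bifactor_indices(
    general=[0.7, 0.7, 0.7, 0.2, 0.2, 0.2],
    specific=[0.2, 0.2, 0.2, 0.6, 0.6, 0.6],
    group_of=[0, 0, 0, 1, 1, 1],
)
print(f"ECV = {ecv:.3f}, omega_h = {omega_h:.3f}")
```

Items with small general-factor loadings and large group-factor loadings, like the last three in this example, lower both indices, which matches the sense in which such items act as a source of method variance rather than as indicators of the general construct.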

Novelty/Originality of this study: This study offers a comprehensive evaluation of wording effects in chemistry identity measurement by integrating EFA, a comparison of competing CFA models, and bifactor modeling. The findings have practical implications for developers of educational instruments: negatively worded items should be used with caution, because they can distort score interpretation and lead to less accurate evaluative decisions.

How to cite

[1] “Negative-Worded Items Functioning as Method Artifacts in the Chemistry Identity Scale: Evidence from Exploratory, Confirmatory, and Bifactor Analyses”, In. Sci. Ed. J, vol. 7, no. 3, pp. 459–469, May 2026, doi: 10.37251/isej.v7i3.2960.

Integrated Science Education Journal
