DETECTING GENDER BIAS IN A CHEMISTRY LEARNING EVALUATION INSTRUMENT USING THE MANTEL-HAENSZEL METHOD

Rizki Nor Amelia, Sri Rejeki Dwi Astuti, Anggi Ristiyana Puspita Sari

Abstract


This study aims to explore the psychometric characteristics of a teacher-made chemistry learning evaluation instrument and to detect whether it contains item bias that favors students of a particular gender. The sample consisted of 358 grade XII senior high school (SMA) students in Yogyakarta selected by cluster random sampling; their responses were analyzed with Winsteps Rasch software 3.73. In general, the teacher-made chemistry achievement instrument is of good quality in terms of its reliability coefficient, the distribution of item difficulty (the majority of items fall in the medium category), and the fact that only one item was detected as misfitting. The Mantel-Haenszel analysis showed that three items were biased, two of which favored male students. The detection of biased items indicates that follow-up by the teacher is needed to revise these items so that the principle of test fairness can be upheld.
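The study itself ran the analysis in Winsteps Rasch software 3.73. Purely as an illustration of the Mantel-Haenszel procedure named above, the following is a minimal sketch in Python (not the authors' code) that computes the Mantel-Haenszel common odds ratio, ETS delta, and continuity-corrected chi-square for one dichotomous item, matching examinees on their rest score; the function and variable names (mantel_haenszel_dif, responses, group, item_index) are illustrative assumptions.

# Minimal illustrative sketch of Mantel-Haenszel DIF detection for one item.
# responses: (n_examinees, n_items) 0/1 array; group: 1 = reference (male), 0 = focal (female).
import numpy as np
from scipy.stats import chi2

def mantel_haenszel_dif(responses, group, item_index):
    item = responses[:, item_index]
    # Match examinees on the total score of the remaining items (rest score).
    rest_score = responses.sum(axis=1) - item
    num = den = a_sum = e_sum = v_sum = 0.0
    for s in np.unique(rest_score):
        idx = rest_score == s
        a = np.sum((group == 1) & (item == 1) & idx)  # reference, correct
        b = np.sum((group == 1) & (item == 0) & idx)  # reference, incorrect
        c = np.sum((group == 0) & (item == 1) & idx)  # focal, correct
        d = np.sum((group == 0) & (item == 0) & idx)  # focal, incorrect
        n = a + b + c + d
        if n < 2 or (a + b) == 0 or (c + d) == 0:
            continue  # stratum with only one group carries no information
        num += a * d / n
        den += b * c / n
        a_sum += a
        e_sum += (a + b) * (a + c) / n
        v_sum += (a + b) * (c + d) * (a + c) * (b + d) / (n * n * (n - 1))
    alpha_mh = num / den                              # common odds ratio across strata
    delta_mh = -2.35 * np.log(alpha_mh)               # ETS delta scale; |delta| >= 1.5 flags large DIF
    chi2_mh = (abs(a_sum - e_sum) - 0.5) ** 2 / v_sum # continuity-corrected MH chi-square, 1 df
    p_value = chi2.sf(chi2_mh, df=1)
    return alpha_mh, delta_mh, chi2_mh, p_value

In this sketch, alpha_mh greater than 1 means the item favors the reference (male) group among examinees of comparable ability, and chi2_mh is referred to a chi-square distribution with one degree of freedom to judge whether the item should be flagged as biased.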


References


Amunga, J. K., Amadalo, M. M., & Musera, G. (2011). Disparities in chemistry and biology achievement in secondary schools: Implications for vision 2030. International Journal of Humanities and Social Science, 1(18), 226-236.

Aune, S. E., Abal, F. J. P., & Attorresi, H. F. (2020). A psychometric analysis from the Item Response Theory: step-by-step modelling of a loneliness scale. Ciencias Psicologicas, 14(1), 1-15. https://dx.doi.org/10.22235/cp.v14i1.2179

Bassey, B. A., Ovat, S. V., & Ofem, U. J. (2019). Systematic error in measurement: Ethical implication in decision making in learners’ assessment in the Nigerian educational system. Prestige Journal of Education, 2(1), 137-146.

Bordbar, S. (2020). Gender differential item functioning (GDIF) analysis in Iran’s University Entrance Exam. English Language in Focus, 3(1), 49-68.

Budiono. (2009). The accuracy of Mantel-Haenszel, SIBTEST, and regression methods in differential item function detection. Jurnal Penelitian dan Evaluasi Pendidikan, 12(1), 1-20.

Cuevas, M., & Cervantes, V. H. (2012). Differential item functioning detection with logistic regression. Mathematics and Social Sciences, 3, 45-59.

D’Andola, C. (2016). Women in chemistry - where we are today. Chemistry - A European Journal, 22, 3523-3528. https://dx.doi.org/10.1002/chem.201600474

Emaikwu, S. O. (2012). Issues in test item bias in public examinations in Nigeria and implications for testing. International Journal of Academic Research in Progressive Education and Development, 1(1), 175-187.

Ezeudu, F. O., & Obi-Theresa, N. (2013). Effect of gender and location on students’ achievement in chemistry in secondary schools in Nsukka local government area of Enugu State, Nigeria. Research on Humanities and Social Sciences, 3(15), 50-55.

Fidelis, I. (2018). Use of Differential Item Functioning (DIF) analysis for bias analysis in test construction. International Journal of Education, Learning and Development, 6(3), 80-91.

Gardner, J. (2013). The public understanding of error in educational assessment. Oxford Review of Education, 39, 72-92. https://dx.doi.org/10.1080/03054985.2012.760290

Hambleton, R. K., & Swaminathan, H. (1985). Item response theory: Principles and applications. Boston, MA: Kluwer-Nijhoff Publishing.

Hayat, B., Putra, M. D. K., & Suryadi, B. (2020). Comparing item parameter estimates and fit statistics of the Rasch model from three different traditions. Jurnal Penelitian dan Evaluasi Pendidikan, 24(1), 39-50. https://dx.doi.org/10.21831/pep.v24i1.29871

Heidari, S., Babor, T. F., Castro, P. D., Tort, S., & Curno, M. (2016). Sex and gender equity in research: Rationale for the SAGER guidelines and recommended use. Research Integrity and Peer Review, 1(2), 1-9. https://dx.doi.org/10.1186/s41073-016-0007-6

Ibrahim, A. (2018). Differential item functioning: The state of the art. Jigawa Journal of Multidisciplinary Studies, 1(1), 37-50.

Karami, H. (2012). An introduction to differential item functioning. The International Journal of Educational and Psychological Assessment, 11(2), 59-75.

Karami, H., Nodoushan, M. A. S., & Ali, M. (2011). Differential item functioning (DIF): Current problems and future directions. International Journal of Language Studies, 5(3), 133-142.

Kendhammer, L., Holme, T., & Murphy, K. (2013). Identifying differential performance in general chemistry: Differential item functioning analysis of ACS General Chemistry trial tests. Journal of Chemical Education, 90, 846-853. https://dx.doi.org/10.1021/ed4000298

Li, Z. (2015). A power formula for the Mantel-Haenszel test for differential item functioning. Applied Psychological Measurement, 39(5), 373-388. https://dx.doi.org/10.1177/0146621614568805

Linacre, J. M. (2009). A user’s guide to WINSTEPS. Chicago, IL: Winsteps.

Magno, C. (2009). Demonstrating the difference between classical test theory and item response theory using derived test data. The International Journal of Educational and Psychological Assessment, 1(1), 1-11.

Mellenbergh, G. J. (1982). Contingency table models for assessing item bias. Journal of Educational Statistics, 7, 105-108.

Moghadam, M., & Nasirzadeh, F. (2020). The application of Kunnan’s test fairness framework (TFF) on a reading comprehension test. Language Testing in Asia, 10(7), 1-21. https://doi.org/10.1186/s40468-020-00105-2

Moss-Racusin, C. A., Dovidio, J. F., Brescoll, V. L., Graham, M. J., & Handelsman, J. (2012). Science faculty’s subtle gender biases favor male students. Proceedings of the National Academy of Sciences of the United States of America, 109(41), 16474-16479. https://dx.doi.org/10.1073/pnas.1211286109

Osadebe, P.U., & Agbure, B. (2019). Assessment of differential item functioning in social studies multiple choice questions in basic education certificate examination. European Journal of Education Studies, 6(8), 312-344. https://dx.doi.org/10.5281/zenodo.3674732

Paek, I., & Cole, K. (2019). Using R for Item Response Theory Model Applications. New York, NY: Routledge.

Queensoap, M., & Orluwene, G. W. (2019). Use of Mantel-Haenszel differential item functioning in detecting item bias in a chemistry achievement test in four ethnic groups in Nigeria. International Journal of Current Research, 11(3), 2665-2670. https://dx.doi.org/10.24941/ijcr.34709.03.2019

Rasch, G. (1980). Probabilistic models for some intelligence and attainment tests. Chicago, IL: University of Chicago Press.

Razak, N. bin Abd., Khairani, A. Z. bin, & Thien, L. M. (2012). Examining quality of mathematics test items using Rasch model: Preliminary analysis. Procedia - Social and Behavioral Sciences, 69, 2205-2214. https://dx.doi.org/10.1016/j.sbspro.2012.12.187

Salehi, M., & Tayebi, A. (2012). Differential item functioning: Implications for test validation. Journal of Language Teaching and Research, 3(1), 84-92. https://dx.doi.org/10.4304/jltr.3.1.84-92

Shanmugam, K. S. (2018). Determining gender differential item functioning for mathematics in coeducational school culture. Malaysian Journal of Learning and Instruction, 15(2), 83-109.

Sirhan, G. (2007). Learning difficulties in chemistry: An overview. Journal of Turkish Science Education, 4(2), 1-20.

Tenaw, Y. A. (2013). Relationship between self-efficacy, academic achievement and gender in analytical chemistry at Debre Markos College of Teacher Education. African Journal of Chemical Education, 3(1), 3-28.

Teresi, J. A., Ramirez, M., Lai, J., & Silver, S. (2008). Occurrences and sources of differential item functioning (DIF) in patient-reported outcome measures: Description of DIF methods, and review of measures of depression, quality of life and general health. Psychology Science Quarterly, 50(4), 538-612.

Toland, M. D. (2014). Practical guide to conducting an item response theory analysis. Journal of Early Adolescence, 34(1), 120-151. https://dx.doi.org/10.1177/0272431613511332

Uyar, S., Kelecioglu, H., & Dogan, N. (2017). Comparing differential item functioning based on manifest groups and latent classes. Educational Sciences: Theory & Practice, 17(6), 1977-2000. https://dx.doi.org/10.12738/estp.2017.6.0526

Veloo, A., Hong, L. H., & Lee, S. C. (2015). Gender and ethnicity differences manifested in chemistry achievement and self-regulated learning. International Education Studies, 8(8). http://dx.doi.org/10.5539/ies.v8n8p1

Wetzel, E., Hell, B., & Passler, K. (2012). Comparison of different test construction strategies in the development of a gender-fair interest inventory using verbs. Journal of Career Assessment, 20(1), 88-104.

Wright, B., & Stone, M. (1999). Measurement essentials (2nd ed.). Wilmington, DE: Wide Range, Inc.

Yashim, A.U., Mhab., L.C., & Waziri, J.A. (2021). Measurement errors in educational assessment. Journal of Educational Theory and Practice, 1(1), 1-9.

Zubairi, A. M., & Kassim, N. L. A. (2006). Classical and Rasch analyses of dichotomously scored reading comprehension test items. Malaysian Journal of ELT Research, 2(1), 1-20.




DOI: http://dx.doi.org/10.30829/tar.v29i2.1781


