Analysis of item difficulty, discrimination and reliability of the KNP freshmen admission test: Basis for test revision and enhancement

Mharfe M. Micaroz & Jemuelle P. Frando

Volume 7 Issue 2, June 2026

Abstract

Admission examinations support selection, profiling, and academic support decisions in higher education; therefore, the use of their scores must be supported by empirical evidence. This study evaluated the psychometric quality of the KNP Freshmen Admission Test by analyzing item difficulty, item discrimination, corrected item-total correlations, and internal consistency reliability. Using a descriptive-evaluative design, the study examined the dichotomously scored responses of 573 examinees to a 100-item admission test within the framework of Classical Test Theory. The test showed good internal consistency reliability (Cronbach's alpha = .824), but item-level results indicated a difficult test form and uneven item functioning: 65% of the items were difficult or very difficult, 58% showed poor or negative discrimination, and the integrated item decision framework retained 28 items, revised 31 items, and rejected 41 items. These findings indicate that the instrument has a reliable score structure but requires systematic item redevelopment, review of content alignment, and continuing validation before broader score-based admission decisions are made. The study contributes a practical institutional model for evidence-based admission test revision, item banking, and continuous assessment quality assurance.
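For readers unfamiliar with the four statistics named in the abstract, the following is a minimal sketch of how they are commonly computed under Classical Test Theory for dichotomous (0/1) responses. It is not the authors' procedure: the function names, the 27% upper/lower grouping, the use of population variances, and the six-examinee toy matrix are all assumptions made for illustration, and the classification cutoffs the study applied are not stated in this section, so none are coded here.

```python
from statistics import mean, pvariance

def item_difficulty(responses):
    """Proportion of examinees answering each item correctly (higher = easier)."""
    k = len(responses[0])
    return [mean(row[j] for row in responses) for j in range(k)]

def discrimination_index(responses, fraction=0.27):
    """Upper-group minus lower-group proportion correct (27% split is an assumption)."""
    ranked = sorted(responses, key=sum, reverse=True)
    g = max(1, round(len(ranked) * fraction))
    upper, lower = ranked[:g], ranked[-g:]
    k = len(responses[0])
    return [mean(r[j] for r in upper) - mean(r[j] for r in lower) for j in range(k)]

def pearson(x, y):
    """Pearson correlation; returns 0.0 when either variable has no variance."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / (sxx * syy) ** 0.5

def corrected_item_total(responses):
    """Correlate each item with the total score computed WITHOUT that item."""
    k = len(responses[0])
    result = []
    for j in range(k):
        item = [row[j] for row in responses]
        rest = [sum(row) - row[j] for row in responses]
        result.append(pearson(item, rest))
    return result

def cronbach_alpha(responses):
    """alpha = k/(k-1) * (1 - sum of item variances / variance of total scores)."""
    k = len(responses[0])
    item_vars = [pvariance([row[j] for row in responses]) for j in range(k)]
    total_var = pvariance([sum(row) for row in responses])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)

# Hypothetical toy data: 6 examinees (rows) x 4 items (columns), 1 = correct.
toy = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
    [1, 1, 0, 1],
]

p = item_difficulty(toy)
d = discrimination_index(toy)
r = corrected_item_total(toy)
alpha = cronbach_alpha(toy)
```

In an analysis like the one summarized above, each item's difficulty, discrimination, and corrected item-total correlation would then be compared against the chosen cutoffs to decide whether the item is retained, revised, or rejected.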

Author information & Contribution

Mharfe M. Micaroz. Corresponding author. Doctor of Philosophy in Education major in Educational Leadership. Vice President for Academic Affairs, PAFTE/MTAP. Email: mharfemicaroz@gmail.com

Jemuelle P. Frando. Master of Education major in Guidance and Counseling, Registered Guidance Counselor, Guidance Counselor

All authors contributed equally to the conception, design, preparation, data gathering and analysis, and writing of the manuscript. All authors read and approved the final manuscript.

References

Almarabheh, A., Ismaeel, A., Al–Qahtani, A., & Al–Mutairi, A. (2022). Predictive validity of admission criteria in predicting academic performance of medical students: A retrospective cohort study. Advances in Medical Education and Practice, 13, 1009–1019. https://doi.org/10.2147/AMEP.S376792

American Educational Research Association. (2014). Standards for educational and psychological testing. https://www.aera.net/Portals/38/1999%20Standards_revised.pdf

Asirit, L. B. L. (2024). From insight to measurement: A self–assessment tool development for entry–level teachers’ instructional competence. International Journal of Educational Management and Development Studies, 5(1), 27–53. https://doi.org/10.53378/353043

Ayanwale, M. A., Chere–Masopha, J., & Morena, M. C. (2022). The classical test or item response measurement theory: The status of the framework at the Examination Council of Lesotho. International Journal of Learning, Teaching and Educational Research, 21(8), 384–402. https://doi.org/10.26803/ijlter.21.8.22

Bhattacherjee, S., Mukherjee, A., Bhandari, K., & Rout, A. J. (2022). Evaluation of multiple–choice questions by item analysis, from an online internal assessment of 6th semester medical students in a rural medical college, West Bengal. Indian Journal of Community Medicine, 47(1), 92–95. https://doi.org/10.4103/ijcm.ijcm_1156_21

Capan Melser, M., Steiner–Hofbauer, V., Lilaj, B., Agis, H., Knaus, A., & Holzinger, A. (2020). Knowledge, application and how about competence? Qualitative assessment of multiple–choice questions for dental students. Medical Education Online, 25(1), Article 1714199. https://doi.org/10.1080/10872981.2020.1714199

Cheung, B. H. H., Lau, G. K. K., Wong, G. T. C., Lee, E. Y. P., Kulkarni, D., Seow, C. S., Wong, R., & Co, M. T.–H. (2023). ChatGPT versus human in generating medical graduate exam multiple choice questions–A multinational prospective study (Hong Kong S.A.R., Singapore, Ireland, and the United Kingdom). PLOS ONE, 18(8), Article e0290691. https://doi.org/10.1371/journal.pone.0290691

Crocker, L., & Algina, J. (2008). Introduction to classical and modern test theory. Cengage Learning.

Dorsah, P. (2026). The use of Cronbach’s alpha reliability in educational research: A systematic review. European Journal of Contemporary Education and E–Learning, 4(2), 39–50. https://doi.org/10.59324/ejceel.2026.4(2).04

Ebel, R. L., & Frisbie, D. A. (1991). Essentials of educational measurement (5th ed.). Prentice Hall.

Edelsbrunner, P. A., Simonsmeier, B. A., & Schneider, M. (2025). The Cronbach’s alpha of domain-specific knowledge tests before and after learning: A meta-analysis of published studies. Educational Psychology Review, 37, Article 4. https://doi.org/10.1007/s10648-024-09982-y

Eleragi, A. M. S., Miskeen, E., Hussein, K., Rezigalla, A. A., Adam, M. I. E., Al-Faifi, J. A., Alhalafi, A., Al Ameer, A. Y., & Mohammed, O. A. (2025). Evaluating the multiple-choice questions quality at the College of Medicine, University of Bisha, Saudi Arabia: A three-year experience. BMC Medical Education, 25, Article 233. https://doi.org/10.1186/s12909-025-06700-2

Escher, M., Weppert, D., Amelung, D., Huelmann, T., Stegt, S., & Hissbach, J. (2023). Paper–based and computer–based admission tests for medicine–Are they equivalent? Frontiers in Education, 8, Article 1209212. https://doi.org/10.3389/feduc.2023.1209212

Ganji, K. K., Ananthakrishnan, N., Manivasakan, S., Alruwaili, M. K., Alonazi, M. A., & Algarni, H. A. (2025). Analyzing the relationship between psychometric indices of item analysis with attainment of course learning outcomes: Cross-sectional study in integrated outcome-based dental curriculum courses. BMC Medical Education, 25, Article 1366. https://doi.org/10.1186/s12909-025-07871-8

Gebremichael, M. W., Baraki, B., Mehari, M.-A., & Assalfew, B. (2025). Item analysis of multiple choice questions from assessment of health sciences students, Tigray, Ethiopia. BMC Medical Education, 25, Article 441. https://doi.org/10.1186/s12909-025-06904-6

Gottlieb, M., Bailitz, J., Fix, M., Shappell, E., & Wagner, M. J. (2023). Educator’s blueprint: A how–to guide for developing high–quality multiple–choice questions. AEM Education and Training, 7(1), Article e10836. https://doi.org/10.1002/aet2.10836

Iñarrairaegui, M., Fernández-Ros, N., Lucena, F., Landecho, M. F., García, N., Quiroga, J., & Herrero, J. I. (2022). Evaluation of the quality of multiple-choice questions according to the students’ academic level. BMC Medical Education, 22, Article 779. https://doi.org/10.1186/s12909-022-03844-3

Jaehn, M., Hissbach, J., Frickhoeffer, M., Weppert, D., Zimmerhofer, A., Wolf, K., & Hampe, W. (2025). Predictive validity of admission tests and educational attainment on preclinical academic performance: A multisite study. BMC Medical Education, 25, Article 1255. https://doi.org/10.1186/s12909-025-07974-2

Kumar, D., Jaipurkar, R., Shekhar, A., Sikri, G., & Srinivas, V. (2021). Item analysis of multiple choice questions: A quality assurance test for an assessment tool. Medical Journal Armed Forces India, 77(Suppl. 1), S85–S89. https://doi.org/10.1016/j.mjafi.2020.11.007

Levacher, J., Koch, M., Stegt, S. J., Hissbach, J., Spinath, F. M., Escher, M., & Becker, N. (2023). The construct validity of the main student selection tests for medical studies in Germany. Frontiers in Education, 8, Article 1120129. https://doi.org/10.3389/feduc.2023.1120129

Loder, A. K. F. (2024). Student performance correlates of psychology admission exam scores and the number of places for students. Acta Psychologica, 250, Article 104523. https://doi.org/10.1016/j.actpsy.2024.104523

Lord, F. M., & Novick, M. R. (2008). Statistical theories of mental test scores. Information Age Publishing.

Marsevani, M. (2022). Item analysis of multiple–choice questions. English Review: Journal of English Education, 10(3), 759–766. https://doi.org/10.25134/erjee.v10i3.6241

Mustafa, S., & Hamid, O. E. (2026). Psychometric item/question analysis of multiple-choice questions in fixed prosthodontics exam. BMC Medical Education, 26, Article 86. https://doi.org/10.1186/s12909-025-08429-4

Nitko, A. J., & Brookhart, S. M. (2014). Educational assessment of students (7th ed.). Pearson.

O’Neill, L. D., & Nielsen, T. (2024). Admission testing, pre–academic exam self–efficacy, and retention: A prospective cohort study. Studies in Educational Evaluation, 83, Article 101383. https://doi.org/10.1016/j.stueduc.2024.101383

Polat, M. (2022). Comparison of performance measures obtained from foreign language tests according to item response theory vs. classical test theory. International Online Journal of Education and Teaching, 9(1), 471–485.

Rezigalla, A. A. (2024). AI in medical education: Uses of AI in construction type A MCQs. BMC Medical Education, 24, Article 247. https://doi.org/10.1186/s12909-024-05250-3

Robb, C., Banks, P. W., Copeland, H. L., MacIntosh, A., Ivan, R., Moskowitz, J. B., Reiter, H., & Sitarenios, G. (2025). Examining the predictive validity of an open–response situational judgment test with typed–response and video–response items. Educational Assessment, 1–11. https://doi.org/10.1080/10627197.2025.2576228

Salih, K. E. M. A., Jibo, A. M., Ishaq, M., Khan, S., Mohammed, O. A., & Al–Shahrani, A. M. (2020). Psychometric analysis of multiple–choice questions in an innovative curriculum in Kingdom of Saudi Arabia. Journal of Family Medicine and Primary Care, 9(7), 3663–3668. https://doi.org/10.4103/jfmpc.jfmpc_1034_19

Schutte, F. (2024). A model for assessments in higher education institutions. International Journal of Educational Management and Development Studies, 5(3), 92–117. https://doi.org/10.53378/ijemds.353088

Shahat, K. A. (2024). Item analysis of multiple–choice question (MCQ)–based exam efficiency among postgraduate pediatric medical students: An observational, cross–sectional study from Saudi Arabia. Cureus, 16(9), Article e69151. https://doi.org/10.7759/cureus.69151

Srisomsak, V., Sitticharoon, C., Keadkraichaiwat, I., Meethes, S., & Inpaen, I. (2026). Detection of flawed multiple-choice questions in preclinical medical education using item difficulty and discrimination indices: A six-year analysis. BMC Medical Education, 26, Article 92. https://doi.org/10.1186/s12909-025-08204-5

Tavakol, M., & Dennick, R. (2011). Making sense of Cronbach’s alpha. International Journal of Medical Education, 2, 53–55. https://doi.org/10.5116/ijme.4dfb.8dfd

Watrin, L., Geiger, M., Levacher, J., Spinath, B., & Wilhelm, O. (2022). Development and initial validation of an admission test for bachelor psychology studies. Frontiers in Education, 7, Article 909818. https://doi.org/10.3389/feduc.2022.909818

Woo, S. E., LeBreton, J. M., Keith, M. G., & Tay, L. (2023). Bias, fairness, and validity in graduate–school admissions: A psychometric perspective. Perspectives on Psychological Science, 18(1), 3–31. https://doi.org/10.1177/17456916211055374

Yüksel, K. B., & Doğan, N. (2022). Investigation of psychometric properties of multiple–choice items developed by Turkish teachers. Sakarya University Journal of Education, 12(1), 130–149. https://doi.org/10.19126/suje.1007897

Cite this article:

Micaroz, M.M. & Frando, J.P. (2026). Analysis of item difficulty, discrimination and reliability of the KNP freshmen admission test: Basis for test revision and enhancement. International Journal of Educational Management and Development Studies, 7(2), 25-47. https://doi.org/10.53378/ijemds.353353