Development and calibration of an instrument measuring attitudes toward statistics using classical and modern test theory
Ezi Apino 1 * , Edi Istiyono 2, Heri Retnawati 3, Widihastuti Widihastuti 4, Kana Hidayati 3
1 Educational Research and Evaluation, Graduate School, Universitas Negeri Yogyakarta, Indonesia
2 Department of Physics Education, Faculty of Mathematics and Natural Sciences, Universitas Negeri Yogyakarta, Indonesia
3 Department of Mathematics Education, Faculty of Mathematics and Natural Sciences, Universitas Negeri Yogyakarta, Indonesia
4 Department of Fashion and Food Technology Education, Faculty of Engineering, Universitas Negeri Yogyakarta, Indonesia
* Corresponding Author


Assessment of attitudes toward statistics (ATS) is needed to support the success of statistics education in tertiary institutions, so a measuring instrument with high accuracy is required. However, existing instruments for measuring ATS have not considered the use of technology, an essential variable affecting success in statistics education. The current study sought to fill this gap by developing a standardized instrument to measure ATS that incorporates aspects of technology use, a necessity for statistics education in the modern era. The study involved 367 students from various study programs spread across several universities in Indonesia as participants. To examine the quality of the instrument, we performed factor analysis, reliability estimation, and item calibration. We calibrated items based on classical test theory (CTT) and item response theory (IRT) using the graded response model (GRM). Exploratory factor analysis (EFA) indicated three main factors (i.e., interest, difficulty, and value) for measuring attitudes toward statistics. The factor loading of every item exceeded 0.45, indicating that all items contributed to their main factor. Cronbach's alpha coefficients of the three factors ranged from 0.784 to 0.929, indicating that the instrument was reliable. Item calibration based on CTT and IRT-GRM indicated that item performance was satisfactory in terms of item endorsement and discrimination. In addition, the information function indicated that the instrument measures attitudes accurately across levels from very low to very high. Overall, the psychometric properties of the instrument indicated that it was valid, reliable, and feasible for use in practice and research in the field of education.
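The two psychometric quantities named above, Cronbach's alpha for reliability and the category response probabilities of Samejima's graded response model for item calibration, can be sketched in a few lines of code. The study itself used R packages (mirt, psych, CTT); the pure-Python functions below are an illustrative re-implementation of the standard formulas only, and the function names, sample data, and parameter values are assumptions for demonstration, not values taken from the study.

```python
import math


def cronbach_alpha(items):
    """Cronbach's alpha for a set of item-score columns.

    items: list of equal-length lists, one per item
    (e.g., Likert responses from the same respondents).
    """
    k = len(items)          # number of items
    n = len(items[0])       # number of respondents

    def var(xs):
        # Sample variance (n - 1 denominator)
        m = sum(xs) / len(xs)
        return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

    sum_item_vars = sum(var(col) for col in items)
    totals = [sum(col[i] for col in items) for i in range(n)]
    # alpha = k/(k-1) * (1 - sum of item variances / variance of total score)
    return (k / (k - 1)) * (1 - sum_item_vars / var(totals))


def grm_category_probs(theta, a, bs):
    """Category response probabilities under the graded response model.

    theta: latent trait level; a: item discrimination;
    bs: ordered category thresholds (len(bs) + 1 response categories).
    """
    # Cumulative probability of responding in category k or higher,
    # bounded by 1 (lowest category) and 0 (above the highest).
    star = [1.0] + [1 / (1 + math.exp(-a * (theta - b))) for b in bs] + [0.0]
    # Probability of exactly category k is the difference of adjacent curves.
    return [star[k] - star[k + 1] for k in range(len(bs) + 1)]
```

For example, `grm_category_probs(0.0, 1.5, [-1.0, 0.0, 1.0])` returns four probabilities (one per Likert category) that sum to 1; plotting these across a grid of `theta` values reproduces the familiar category characteristic curves used in GRM item calibration.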





This is an open access article distributed under the Creative Commons Attribution License which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.