
Generating items during testing: Psychometric issues and models

Psychometrika

Abstract

On-line item generation is becoming increasingly feasible for many cognitive tests. Item generation seemingly conflicts with the well-established principle of measuring persons from items with known psychometric properties. This paper examines the psychometric principles and models required for measurement from on-line item generation. Three psychometric issues are elaborated. First, design principles to generate items are considered. A cognitive design system approach is elaborated and then illustrated with an application to a test of abstract reasoning. Second, psychometric models for calibrating generating principles, rather than specific items, are required. Existing item response theory (IRT) models are reviewed, and a new IRT model that includes the impact of the generating principles on item discrimination, as well as on item difficulty, is developed. Third, the impact of item parameter uncertainty on person estimates is considered. Results from both fixed content and adaptive testing are presented.
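The idea of calibrating generating principles rather than specific items can be illustrated with a minimal sketch. It assumes a two-parameter logistic form in which both difficulty and discrimination are weighted sums of an item's design features. The function names, the feature coding, and all weights below are hypothetical and chosen only for illustration; they are not taken from the article.

```python
import math

def item_parameters(features, difficulty_weights, discrimination_weights):
    """Map an item's design features to (difficulty, discrimination).

    Each parameter is a weighted sum of the same feature scores, so a newly
    generated item gets its parameters from its design, not from its own
    calibration data. The first feature is a constant 1 (intercept).
    """
    difficulty = sum(q * eta for q, eta in zip(features, difficulty_weights))
    discrimination = sum(q * tau for q, tau in zip(features, discrimination_weights))
    return difficulty, discrimination

def prob_correct(theta, features, difficulty_weights, discrimination_weights):
    """Probability of a correct response under the assumed logistic form."""
    b, a = item_parameters(features, difficulty_weights, discrimination_weights)
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

# Hypothetical item: intercept, number of rules, perceptual distortion flag.
features = [1, 2, 0]
# Hypothetical weights for difficulty and for discrimination.
eta = [-1.0, 0.5, 0.8]
tau = [1.0, 0.1, -0.2]

b, a = item_parameters(features, eta, tau)
p_at_difficulty = prob_correct(b, features, eta, tau)
p_above = prob_correct(b + 1.0, features, eta, tau)
```

In this sketch a person whose ability equals the predicted difficulty has a success probability of one half, and the predicted discrimination controls how quickly that probability rises with ability. Fixing every discrimination at 1 reduces the form to a model in which only difficulty depends on the design features.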




Additional information

This article is based on the Presidential Address Susan E. Embretson gave on June 26, 1999 at the 1999 Annual Meeting of the Psychometric Society held at the University of Kansas in Lawrence, Kansas. —Editor


Cite this article

Embretson, S.E. Generating items during testing: Psychometric issues and models. Psychometrika 64, 407–433 (1999). https://doi.org/10.1007/BF02294564


  • DOI: https://doi.org/10.1007/BF02294564
