DOI: 10.3390/e28070730 ISSN: 1099-4300

Knowing What We Don’t Know: Model-Based Uncertainty Decomposition for Categorical Sequences

Marc A. Scott, Fulvia Pennoni, Ignacio Bórquez

State sequence analysis of longitudinal categorical data seeks to synthesize pathways through different dimensions of the life course for descriptive, associative and predictive purposes. Given the number and variety of patterns in such data, measures of the dynamic features of sequences are used to characterize them. One, based on the information-theoretic notion of entropy, measures the uncertainty in the state that will be active at a given time. We customize its use to establish the extent to which we are ignorant, or unsure, of what happens next in a dynamic process, conditional on its past. Relying on different Markov chain models for nominal state sequences, we establish multiple measures of uncertainty that allow us to adjust expectations to reflect individual-specific differences and historical information. We establish complementary measures to assess the predictive power of the models in the context of this uncertainty. In so doing, we can summarize and contrast the change in uncertainty associated with different models. As is common in this field, we consider ways in which data can be stratified through demographics and clustering, and how this additional level of partitioning builds a more complete narrative of the social process.
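As a rough illustration of the idea that uncertainty about the next state can shrink once the past is taken into account, the sketch below compares the entropy of the next state with its entropy conditional on the current state under a first-order Markov chain. It is not the authors' method: it uses raw empirical transition frequencies rather than a fitted model, and the function names and the toy sequences (states "E" and "U") are hypothetical.

```python
import math
from collections import Counter, defaultdict


def shannon_entropy(probs):
    """Shannon entropy (in bits) of a discrete probability distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def next_state_entropy(sequences):
    """Entropy of the next state, ignoring history.

    Pools every state that has a predecessor (positions 2 onward).
    """
    counts = Counter(s for seq in sequences for s in seq[1:])
    total = sum(counts.values())
    return shannon_entropy(c / total for c in counts.values())


def conditional_entropy(sequences):
    """Entropy of the next state given the current state.

    Assumes a first-order Markov chain estimated from empirical
    transition frequencies (an illustrative simplification).
    """
    transitions = defaultdict(Counter)
    for seq in sequences:
        for prev, nxt in zip(seq, seq[1:]):
            transitions[prev][nxt] += 1

    n = sum(sum(row.values()) for row in transitions.values())
    h = 0.0
    for row in transitions.values():
        row_total = sum(row.values())
        # Weight each row's entropy by how often its origin state occurs.
        h += (row_total / n) * shannon_entropy(
            c / row_total for c in row.values()
        )
    return h


# Toy data, invented for illustration only.
sequences = [list("EEEUUEEE"), list("UUUEEEEE"), list("EEEEEEEE")]

h_marginal = next_state_entropy(sequences)
h_conditional = conditional_entropy(sequences)
reduction = h_marginal - h_conditional  # uncertainty explained by the past

print(f"H(next)           = {h_marginal:.3f} bits")
print(f"H(next | current) = {h_conditional:.3f} bits")
print(f"reduction         = {reduction:.3f} bits")
```

The difference between the two quantities is one simple way to summarize how much a model's use of history reduces uncertainty; richer models (higher order, covariates, or strata) would replace the empirical transition table used here.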
