DOI: 10.1515/cllt-2026-0011 ISSN: 1613-7027

Irregularity correlates with frequency, concreteness, and low contextual uncertainty: a corpus-based study of English verb paradigms

Hewei Liu

Abstract

Irregularity correlates with frequency, but frequency is not its only lexical correlate. Investigating such correlates requires quantifying irregularity, which is increasingly viewed as gradient and multidimensional. This paper operationalises English verb irregularity as gradient structural indices derived from paradigmatic wordform relations and then tests whether these indices covary with usage and semantic predictors beyond token frequency. The study analyses the 2,000 most frequent English verbs in the British National Corpus. From phonemic wordform lists, paradigm cells are aligned and six indices are derived: microclass size, stem uniformity, two surprisal-based measures of system-relative deviation and a lexeme-level Paradigm Cell Filling Problem inferability score. These indices are modelled using usage and semantic predictors. A follow-up study compares recurrent predictors across broad verb groups defined by diachronic trajectory. Results show that most irregularity indices are associated with higher token frequency, greater concreteness and lower contextual uncertainty, and that concreteness is associated with the historical source of English strong verbs.
