DOI: 10.1142/s0219649226500449 ISSN: 0219-6492

Chatbot Simulator and Adaptive Recommendation Systems: Enhancing Learner Profiling in AI-assisted English Learning Using Weighted Cosine Similarity and Structure-aware Clustering

Shannaq Boumedyen

Unsupervised clustering has become an attractive way to personalise AI-assisted English learning without requiring labelled outcomes. Standard similarity metrics, however, tend to treat all learner features equally, which limits their capacity to reflect pedagogically significant variation. This work extends a baseline cosine-similarity clustering model with a weighted cosine similarity, structure-aware clustering framework that improves the discovery of learner personas in AI-assisted English language learning. Using a large-scale dataset of 15,000 learner interactions with tools such as ChatGPT, Grammarly, Duolingo and ELSA Speak, we experimentally evaluate several weighting schemes that prioritise learning gain, task engagement and error reduction. The proposed feature weighting integrates with standardised preprocessing and dimensionality reduction, and allows more expressive similarity modelling without increasing model complexity. Several clustering methods are compared under weighted and unweighted cosine similarity using the Silhouette score, Davies–Bouldin index and Calinski–Harabasz index. The weighted approach yields significant and consistent improvements over both Euclidean and unweighted cosine baselines, with Silhouette scores of up to 0.77 and well-separated, pedagogically interpretable learner clusters. The refined clusters show clearer differentiation in learning-efficiency and task-engagement profiles, which makes them better suited to downstream personalisation and recommendation tasks. Situated in the context of non-native English learning in the GCC, the work contributes to unsupervised learner modelling and provides a solid methodological basis for future adaptive tutoring and chatbot-assisted learning systems.
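
The abstract does not state the exact similarity formula. The sketch below shows one common formulation of weighted cosine similarity, in which each feature's contribution to both the dot product and the norms is scaled by a non-negative weight. The function name, feature names, feature values and weights are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
import math


def weighted_cosine_similarity(x, y, weights):
    """Cosine similarity in which feature i is scaled by weights[i].

    Assumed formulation (not confirmed by the abstract):
        sum(w_i * x_i * y_i) / (sqrt(sum(w_i * x_i^2)) * sqrt(sum(w_i * y_i^2)))
    """
    if not (len(x) == len(y) == len(weights)):
        raise ValueError("x, y and weights must have the same length")
    if any(w < 0 for w in weights):
        raise ValueError("weights must be non-negative")

    dot = sum(w * a * b for w, a, b in zip(weights, x, y))
    norm_x = math.sqrt(sum(w * a * a for w, a in zip(weights, x)))
    norm_y = math.sqrt(sum(w * b * b for w, b in zip(weights, y)))
    if norm_x == 0.0 or norm_y == 0.0:
        return 0.0  # convention chosen here for all-zero (weighted) vectors
    return dot / (norm_x * norm_y)


# Hypothetical standardised features, in this order:
# [learning_gain, task_engagement, error_reduction, session_count]
learner_a = [1.0, 0.5, 0.8, -1.0]
learner_b = [0.9, 0.6, 0.7, 1.0]

uniform = [1.0, 1.0, 1.0, 1.0]     # reduces to ordinary cosine similarity
emphasised = [3.0, 2.0, 2.0, 0.5]  # illustrative weights, not the paper's

unweighted = weighted_cosine_similarity(learner_a, learner_b, uniform)
weighted = weighted_cosine_similarity(learner_a, learner_b, emphasised)
print(f"unweighted={unweighted:.3f} weighted={weighted:.3f}")
```

In this toy example the two learners agree on the three emphasised features and differ only on the down-weighted one, so the weighted similarity comes out higher than the unweighted one. Under this formulation, weighting is equivalent to multiplying each feature by the square root of its weight and then applying ordinary cosine similarity, which is consistent with the abstract's claim that weighting adds no model complexity.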
