Predicting Future E-Commerce Purchases from Multi-Visit Sequences and Behavioural Micro-Interactions
Marcin Gabryel, Milan Kocić, Aleksandra Gabryel, Marek Kisiel-Dorohonicki, Zofia Patora-Wysocka
Abstract
This paper investigates future purchase prediction in e-commerce using sequences of user visits and behavioural micro-interactions. The study is based on anonymised production data from three e-commerce domains, comprising approximately 690,000 users and 2.61 million visits. User histories are modelled as temporally ordered sequences of visits, with each visit represented by 21 features. Six models are evaluated: three tabular approaches (SVM, XGBoost, CatBoost) and three sequential architectures (RNN, LSTM, GRU), each tested in two variants addressing class imbalance. The experimental setup employs a temporal train–test split, five random seeds, and bootstrap-based significance testing. The top-performing sequential models achieve an AUC-PR of approximately 0.379, significantly outperforming the strongest tabular baseline, XGBoost (AUC-PR ≈ 0.364). This advantage remains consistent across domain-specific evaluations as well as in a leave-one-shop-out setting. Further analysis shows that mouse movement and scrolling-related features provide the strongest predictive signal, whereas extending user histories beyond 40 visits yields only marginal performance gains.
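The bootstrap-based comparison of AUC-PR mentioned above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `average_precision` and `paired_bootstrap_ap_diff`, the number of resamples, and the percentile confidence interval are assumptions chosen for illustration, and AUC-PR is approximated here by step-wise average precision without special handling of tied scores.

```python
import numpy as np


def average_precision(y_true, scores):
    """Step-wise area under the precision-recall curve (average precision)."""
    y_true = np.asarray(y_true, dtype=float)
    # Rank examples from highest to lowest score.
    order = np.argsort(-np.asarray(scores, dtype=float), kind="stable")
    hits = y_true[order]
    n_pos = hits.sum()
    if n_pos == 0:
        return float("nan")  # undefined without positive examples
    precision_at_k = np.cumsum(hits) / np.arange(1, len(hits) + 1)
    # Average the precision values at the ranks of the positive examples.
    return float((precision_at_k * hits).sum() / n_pos)


def paired_bootstrap_ap_diff(y_true, scores_a, scores_b, n_boot=1000, seed=0):
    """Paired bootstrap of the difference AP(model A) - AP(model B).

    Both models are scored on the same resampled test examples, so the
    interval reflects the difference between models, not sampling noise alone.
    Returns (mean difference, 2.5th percentile, 97.5th percentile).
    """
    rng = np.random.default_rng(seed)
    y_true = np.asarray(y_true)
    scores_a = np.asarray(scores_a)
    scores_b = np.asarray(scores_b)
    n = len(y_true)
    diffs = []
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)  # resample with replacement
        if y_true[idx].sum() == 0:
            continue  # skip resamples that contain no purchases
        diffs.append(
            average_precision(y_true[idx], scores_a[idx])
            - average_precision(y_true[idx], scores_b[idx])
        )
    diffs = np.array(diffs)
    low, high = np.percentile(diffs, [2.5, 97.5])
    return float(diffs.mean()), float(low), float(high)
```

Under these assumptions, a 95% interval for the difference that excludes zero would suggest that one model's AUC-PR is reliably higher than the other's on the test period.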