DOI: 10.35377/saucis...1881274 ISSN: 2636-8129

Synthetic Data Augmentation for Fine-Grained Retail Product Classification under Low-Data Regimes

Alperen Ünal, İpek Baz
Fine-grained retail product classification is a challenging computer vision task due to high inter-class similarity, subtle intra-class variations, and limited annotated training data. These challenges are particularly pronounced in real-world retail environments, where collecting and labeling large-scale datasets is costly and time-consuming. This study investigates the effectiveness of synthetic data augmentation for improving fine-grained retail product classification under low-data regimes. To address data scarcity and class imbalance, high-resolution class-conditional synthetic images are generated using StyleGAN-XL. Two retail product categories, snacks and beverages, each consisting of 30 visually similar classes, are considered to evaluate the generalizability of the proposed approach. Synthetic images are generated at a resolution of 4096 × 4096 and subsequently divided into 128 × 128 patches to match the input requirements of the classification model. A YOLOv11-based classification architecture is employed, and four experimental training scenarios are designed to analyze the impact of synthetic data: mixed real and synthetic data, fully synthetic data, limited real data only, and fully real data. Experimental results demonstrate that synthetic data significantly improves classification performance when real training data is limited. In particular, scenarios combining synthetic data with a small portion of real samples consistently outperform real-only training under low-data conditions. The beverage category, characterized by visually distinctive packaging, shows stronger gains than the snack category, highlighting the relationship between synthetic data quality and class discriminability. While fully real data achieves the highest overall performance, the findings indicate that GAN-generated synthetic images provide a practical and effective alternative when collecting sufficient real data is infeasible. 
Additional comparative experiments with ConvNeXt-Tiny and ResNet50 show that YOLOv11-CLS achieves competitive performance across the training scenarios. Overall, the results indicate that high-quality synthetic data generated with StyleGAN-XL can play a critical role in mitigating data scarcity and improving fine-grained retail product classification in low-resource settings.
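
The patching step mentioned in the abstract (a 4096 × 4096 synthetic image divided into 128 × 128 patches) can be illustrated with a short sketch. The abstract does not say how the division is performed, so this assumes a plain non-overlapping grid, which gives 32 × 32 = 1024 patches per image. The function name `patch_boxes` and its parameters are hypothetical and are not taken from the study.

```python
def patch_boxes(image_size=4096, patch_size=128):
    """Return (left, top, right, bottom) boxes for a non-overlapping grid.

    Assumption: the image is square and its side is an exact multiple of
    the patch size, as is the case for 4096 and 128.
    """
    if image_size % patch_size != 0:
        raise ValueError("image_size must be a multiple of patch_size")

    per_side = image_size // patch_size  # 4096 // 128 = 32 patches per side
    boxes = []
    for row in range(per_side):
        for col in range(per_side):
            left = col * patch_size
            top = row * patch_size
            boxes.append((left, top, left + patch_size, top + patch_size))
    return boxes


boxes = patch_boxes()
print(len(boxes))   # 1024
print(boxes[0])     # (0, 0, 128, 128)
print(boxes[-1])    # (3968, 3968, 4096, 4096)
```

Each box uses the (left, upper, right, lower) convention, so it could be passed to an image library's crop routine, for example Pillow's `Image.crop`, to extract the corresponding patch.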
