DOI: 10.1177/30504554261450839 ISSN: 3050-4554

Decoupling Hierarchical Errors: A Dual-Pronged Approach for In-Group and Out-of-Group Challenges in FGVC

Yongsan Lin, Chenyu Lin

Fine-grained visual classification (FGVC) is challenged by subtle interclass differences, which lead to both out-of-group and in-group errors, particularly when label hierarchies are only partially available. To leverage hierarchical labels for enhanced feature discriminability, this paper introduces two novel loss functions. First, a hierarchically discriminative loss ($L_{hd}$) linearly combines fine-grained subcategory features with their ancestral superclass features. This fusion, enhanced by cross-channel max pooling and channel-wise dropout, strengthens feature discriminability to suppress out-of-group errors. Second, an in-group regularization loss ($L_{ig}$) addresses visually similar categories within the same superclass. It introduces controlled interference by mixing target class features with randomly selected features from other in-group classes. This process increases learning difficulty without altering labels, thereby mitigating sample-specific overfitting and reducing in-group errors. The proposed loss functions are easy to integrate into existing frameworks, require no extra annotations or complex network modifications, and support end-to-end training. Extensive experiments on five benchmark datasets (BreakHis, CUB-200-2011, Butterfly-200, FGVC-Aircraft, and Stanford Cars) demonstrate that our method consistently improves top-1 accuracy, weighted average precision, and mean average precision. We also observe consistent improvements in $F_1$-score. The performance gains are particularly significant in scenarios with low fine-grained annotation ratios.
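As a rough illustration of the in-group mixing idea described in the abstract, the sketch below blends each sample's feature vector with the feature of a randomly chosen sample from a different fine-grained class in the same superclass, leaving the labels untouched. It is not the authors' implementation: the function name `in_group_mix`, the mixing weight `alpha`, the `fine_to_super` mapping, and the choice to mix within a batch of flat feature vectors are all assumptions made for illustration, since the abstract does not specify them.

```python
import numpy as np


def in_group_mix(features, fine_labels, fine_to_super, alpha=0.2, rng=None):
    """Blend each feature with one from another fine class in the same superclass.

    Hypothetical sketch: `alpha` (interference strength) and `fine_to_super`
    (fine class id -> superclass id) are illustrative assumptions.
    Labels are not modified; only the features are perturbed.
    """
    rng = np.random.default_rng() if rng is None else rng
    features = np.asarray(features, dtype=float)
    fine_labels = np.asarray(fine_labels)
    super_labels = np.asarray([fine_to_super[int(c)] for c in fine_labels])

    mixed = features.copy()
    for i in range(len(features)):
        # Candidates: same superclass, different fine-grained class.
        same_group = super_labels == super_labels[i]
        other_class = fine_labels != fine_labels[i]
        candidates = np.flatnonzero(same_group & other_class)
        if candidates.size == 0:
            continue  # no in-group partner in this batch; keep the feature as-is
        j = rng.choice(candidates)
        # Always mix from the original features, not from already-mixed ones.
        mixed[i] = (1.0 - alpha) * features[i] + alpha * features[j]
    return mixed


# Usage: four samples, fine classes 0 and 1 share superclass 0, 2 and 3 share superclass 1.
feats = np.eye(4)
labels = [0, 1, 2, 3]
hierarchy = {0: 0, 1: 0, 2: 1, 3: 1}
mixed_feats = in_group_mix(feats, labels, hierarchy, alpha=0.25)
```

In a training loop, the mixed features would be passed to the classifier head with the original fine-grained labels, so the added interference makes the task harder without changing the supervision. How the resulting term is weighted against the other losses is not stated in the abstract and is left open here.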
