DOI: 10.1177/30504554261450839 ISSN: 3050-4554
Decoupling Hierarchical Errors: A Dual-Pronged Approach for In-Group and Out-of-Group Challenges in FGVC
Yongsan Lin, Chenyu Lin
Fine-grained visual classification (FGVC) is challenged by subtle interclass differences, which lead to both out-of-group and in-group errors, particularly when label hierarchies are only partially available. To leverage hierarchical labels for enhanced feature discriminability, this paper introduces two novel loss functions. First, a hierarchically discriminative loss ($L_{hd}$) linearly combines fine-grained subcategory features with their ancestral superclass features. This fusion, enhanced by cross-channel max pooling and channel-wise dropout, strengthens feature discriminability to suppress out-of-group errors. Second, an in-group regularization loss ($L_{ig}$) addresses visually similar categories within the same superclass. It introduces controlled interference by mixing target class features with randomly selected features from other in-group classes. This process increases learning difficulty without altering labels, thereby mitigating sample-specific overfitting and reducing in-group errors. The proposed loss functions are easy to integrate into existing frameworks, require no extra annotations or complex network modifications, and support end-to-end training. Extensive experiments on five benchmark datasets (BreakHis, CUB-200-2011, Butterfly-200, FGVC-Aircraft, and Stanford Cars) demonstrate that our method consistently improves top-1 accuracy, weighted average precision, and mean average precision. We also observe consistent improvements in $F_1$-score. The performance gains are particularly pronounced in scenarios with low fine-grained annotation ratios.
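The feature operations named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the fusion weight `weight`, the mixing coefficient `lam`, and the use of flat per-class feature vectors are all assumptions made for illustration, and channel-wise dropout is omitted for brevity.

```python
import random
from typing import Dict, List, Sequence

Vector = List[float]


def cross_channel_max_pool(channels: Sequence[float], group_size: int) -> Vector:
    """Keep the maximum response within each consecutive group of channels."""
    if group_size <= 0 or len(channels) % group_size != 0:
        raise ValueError("channel count must be a positive multiple of group_size")
    return [max(channels[i:i + group_size])
            for i in range(0, len(channels), group_size)]


def fuse_hierarchical(fine: Sequence[float], coarse: Sequence[float],
                      weight: float) -> Vector:
    """Linearly combine a subcategory feature with its superclass feature.

    `weight` is a hypothetical fusion coefficient, not a value from the paper.
    """
    if len(fine) != len(coarse):
        raise ValueError("fine and coarse features must have the same length")
    return [(1.0 - weight) * f + weight * c for f, c in zip(fine, coarse)]


def mix_in_group(target_class: str,
                 class_features: Dict[str, Vector],
                 superclass_of: Dict[str, str],
                 lam: float,
                 rng: random.Random) -> Vector:
    """Blend the target class feature with a randomly chosen sibling feature.

    Siblings are other classes under the same superclass. The label of the
    sample is left unchanged; only the feature is perturbed. `lam` is a
    hypothetical mixing coefficient.
    """
    siblings = [c for c in class_features
                if c != target_class
                and superclass_of[c] == superclass_of[target_class]]
    target = class_features[target_class]
    if not siblings:
        # No in-group neighbour available: return the feature unmodified.
        return list(target)
    other = class_features[rng.choice(siblings)]
    return [(1.0 - lam) * t + lam * o for t, o in zip(target, other)]


# Toy usage: two bird classes share a superclass, the aircraft class does not.
features = {"sparrow_a": [1.0, 0.0], "sparrow_b": [0.0, 1.0], "jet": [5.0, 5.0]}
parents = {"sparrow_a": "bird", "sparrow_b": "bird", "jet": "aircraft"}
mixed = mix_in_group("sparrow_a", features, parents, lam=0.25,
                     rng=random.Random(0))
```

In the toy usage, `sparrow_a` has a single in-group sibling, so the mixed feature is drawn only from `sparrow_b` and never from the out-of-group class `jet`. How the mixed feature enters the loss is not specified in the abstract and is left out here.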