DOI: 10.3390/fishes11070373 ISSN: 2410-3888

PaliGemma2-FishGrounding: Generative Vision-Language Grounding for Few-Shot Fish Disease Lesion Localization

Peng Peng, Meijing Zhang, Tianqi Lv, Guangmao Ding

Rapid identification of fish disease lesions is essential for disease monitoring and early intervention in aquaculture, yet existing detection methods rely heavily on extensive lesion annotations and fixed category definitions. To address the challenges of limited annotated data and heterogeneous supervision, this study proposes PaliGemma2-FishGrounding, a generative vision-language grounding framework for few-shot fish disease lesion localization. The framework reformulates lesion detection as an open-vocabulary grounding task and unifies lesion localization, disease recognition, health-status classification, and symptom understanding within a single instruction-learning paradigm. By integrating heterogeneous supervision from multiple fish disease datasets, the proposed method enables lesion localization through structured vision-language generation rather than conventional closed-set detection. Experimental results on an Epizootic Ulcerative Syndrome (EUS) lesion dataset demonstrate that the proposed framework outperforms YOLOv8-based baselines, achieving an AP50 of 0.3659 and an AR@10 of 0.5378 while maintaining a zero invalid-box rate. Ablation studies further confirm the effectiveness of the instruction-learning strategy and grounding-based design. The results indicate that generative vision-language models can effectively leverage limited lesion annotations and auxiliary disease knowledge for fish disease analysis. This framework provides a practical solution for low-annotation disease monitoring and offers a promising direction for intelligent aquaculture applications under data-scarce conditions.
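To make the idea of localization through structured generation concrete, the following minimal sketch shows how generated text could be decoded into lesion boxes and checked for validity. It assumes the location-token convention published for PaliGemma models (four `<locNNNN>` tokens in the order y_min, x_min, y_max, x_max, each a bin in 0..1023, followed by a label, with detections separated by ";"). The abstract does not state the exact output format, the definition of an invalid box, or the matching rule used for evaluation, so the function names, the validity criterion, and the IoU threshold of 0.5 (implied only by the name AP50) are illustrative assumptions rather than the authors' implementation.

```python
import re
from typing import List, Tuple

# Assumed PaliGemma-style output: four <locNNNN> tokens (y_min, x_min, y_max, x_max),
# each a bin index in [0, 1023], followed by a free-text label.
_DETECTION = re.compile(r"((?:<loc\d{4}>){4})\s*([^;<]+)")
_LOC = re.compile(r"<loc(\d{4})>")

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max) in pixels


def parse_grounding_output(text: str, width: int, height: int) -> List[Tuple[Box, str]]:
    """Decode generated text into (pixel box, label) pairs."""
    detections = []
    for match in _DETECTION.finditer(text):
        y0, x0, y1, x1 = (int(v) for v in _LOC.findall(match.group(1)))
        box = (x0 / 1024 * width, y0 / 1024 * height,
               x1 / 1024 * width, y1 / 1024 * height)
        detections.append((box, match.group(2).strip()))
    return detections


def is_valid_box(box: Box, width: int, height: int) -> bool:
    """Hypothetical validity rule: positive area and fully inside the image."""
    x0, y0, x1, y1 = box
    return 0 <= x0 < x1 <= width and 0 <= y0 < y1 <= height


def invalid_box_rate(boxes: List[Box], width: int, height: int) -> float:
    """Fraction of boxes failing is_valid_box (0.0 when there are no boxes)."""
    if not boxes:
        return 0.0
    return sum(not is_valid_box(b, width, height) for b in boxes) / len(boxes)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two (x_min, y_min, x_max, y_max) boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0
```

Under these assumptions, a predicted box would count as a hit for AP50 when its IoU with a ground-truth lesion box is at least 0.5, and a zero invalid-box rate would mean every decoded box passes a check like `is_valid_box`.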
