DOI: 10.3390/ani15142051 ISSN: 2076-2615

Prediction of Mature Body Weight of Indigenous Camel (Camelus dromedarius) Breeds of Pakistan Using Data Mining Methods

Daniel Zaborski, Wilhelm Grzesiak, Abdul Fatih, Asim Faraz, Mohammad Masood Tariq, Irfan Shahzad Sheikh, Abdul Waheed, Asad Ullah, Illahi Bakhsh Marghazani, Muhammad Zahid Mustafa, Cem Tırınk, Senol Celik, Olha Stadnytska, Oleh Klym

Determining the live body weight of camels, which is required for their successful breeding, is difficult because these animals are hard to handle and restrain. Therefore, the main aim of this study was to predict the mature body weight (denoted ABW below) of eight indigenous camel (Camelus dromedarius) breeds of Pakistan (Bravhi, Kachi, Kharani, Kohi, Lassi, Makrani, Pishin, and Rodbari). Selected productive (hair production, milk yield per lactation, and lactation length) and reproductive (age at puberty, age at first breeding, gestation period, dry period, and calving interval) traits served as the predictors. Six data mining methods [classification and regression trees (CART), chi-square automatic interaction detector (CHAID), exhaustive CHAID (EXCHAID), multivariate adaptive regression splines (MARS), multilayer perceptron (MLP), and radial basis function (RBF) networks] were applied for ABW prediction. Additionally, hierarchical cluster analysis with Euclidean distance was performed for the phenotypic characterization of the camel breeds. The highest Pearson correlation coefficient between the observed and predicted values (0.84, p < 0.05) was obtained for MLP, which also had the lowest root-mean-square error (RMSE; 20.86 kg), standard deviation ratio (SDratio; 0.54), mean absolute percentage error (MAPE; 2.44%), and mean absolute deviation (MAD; 16.45 kg). The most influential predictor in all the models was the camel breed. The applied methods allowed for a moderately accurate prediction of ABW (average R² of 65.0%) and the identification of the most important productive and reproductive traits affecting its value. However, one important limitation of the present study is its relatively small dataset, especially for training the artificial neural networks (ANNs; MLP and RBF). Hence, these preliminary results should be validated on larger datasets in the future.
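The goodness-of-fit criteria named above (Pearson correlation, RMSE, SDratio, MAPE, and MAD) can be illustrated with a minimal Python sketch. The abstract does not state the formulas, so the definitions below are the commonly used ones and should be read as assumptions: in particular, SDratio is taken as the standard deviation of the residuals divided by the standard deviation of the observed values. The function name `goodness_of_fit` and the weights in the example are hypothetical and are not data from the study.

```python
import math
from statistics import mean, stdev


def goodness_of_fit(observed, predicted):
    """Return common fit criteria for observed vs. predicted body weights (kg).

    Assumed definitions (not taken from the study itself):
      r        Pearson correlation between observed and predicted values
      rmse     square root of the mean squared error
      sd_ratio sample SD of the residuals / sample SD of the observed values
      mape     mean absolute percentage error, in percent
      mad      mean absolute deviation (mean absolute error)
    """
    if len(observed) != len(predicted) or len(observed) < 2:
        raise ValueError("need two equally long sequences with at least 2 values")

    n = len(observed)
    errors = [o - p for o, p in zip(observed, predicted)]

    # Pearson correlation from centered cross-products.
    mean_obs, mean_pred = mean(observed), mean(predicted)
    cross = sum((o - mean_obs) * (p - mean_pred) for o, p in zip(observed, predicted))
    ss_obs = sum((o - mean_obs) ** 2 for o in observed)
    ss_pred = sum((p - mean_pred) ** 2 for p in predicted)
    r = cross / math.sqrt(ss_obs * ss_pred)

    return {
        "r": r,
        "rmse": math.sqrt(sum(e * e for e in errors) / n),
        "sd_ratio": stdev(errors) / stdev(observed),
        "mape": 100.0 / n * sum(abs(e / o) for e, o in zip(errors, observed)),
        "mad": sum(abs(e) for e in errors) / n,
    }


# Hypothetical weights in kg, for illustration only.
observed = [600.0, 650.0, 700.0, 750.0, 800.0]
predicted = [610.0, 640.0, 705.0, 745.0, 800.0]
metrics = goodness_of_fit(observed, predicted)

for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Under these definitions, a lower RMSE, SDratio, MAPE, and MAD together with a higher correlation indicate a better-fitting model, which is the sense in which the MLP is reported as the best performer.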