Data challenges and practical aspects of machine learning-based statistical methods for the analyses of poultry data to improve food safety and production efficiency



Leveraging data collected by commercial poultry operations requires a deep understanding of those data. Machine learning (ML)-based techniques are capable of “learning” by finding nonobvious associations and patterns in the data in order to create more reliable, accurate, explanatory, and predictive statistical models. This article provides practical definitions and examples of ML-based statistical approaches for the analysis of poultry production and poultry food safety data. In addition to summarizing the literature, two real examples of the supervised ML ensemble technique random forest (RF) are provided: predicting egg weights at a commercial layer farm and identifying the potential causes of a Salmonella outbreak at a commercial broiler facility. For the prediction of egg weights, a training model and a test model were created, and a modification of RF was used to explore how well egg weights could be predicted. Results identified multiple variables, including Age, Farm Location, Body Weight, Total Eggs, Hens Housed, and House Style, that were predictive of the continuous variable Egg Weight. The average error between the predicted and actual egg weight was less than 3%. With respect to broiler food safety, a relational database was constructed and a supervised RF model was developed to identify the predictors of Salmonella in a grow-out farm and its associated broiler processing plant. The predictors of Salmonella identified included livability, density of birds in the grow-out farm, and breeder age. Choosing the most appropriate ML-based model(s) is complex: the model must account for the large number of variables common to the poultry industry and address the intricate interdependence between several production parameters and inputs while predicting multiple sequential outputs. The use of ML techniques in combination with new data streams, including sensors (e.g., visual and audio), IoT, and web scraping, could offer a more comprehensive, efficient, and timely approach toward evaluating productivity, food safety, and profitability in commercial poultry.
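
The abstract reports an average error of less than 3% between predicted and actual egg weight but does not say how that error was computed. The sketch below assumes one common reading, the mean absolute percentage error. The function name and the egg weights (in grams) are hypothetical and for illustration only; they are not values from the study.

```python
def mean_absolute_percentage_error(actual, predicted):
    """Return the mean of |predicted - actual| / |actual|, as a percentage.

    Assumption: this is one plausible reading of "average error between the
    predicted and actual egg weight"; the source does not name its metric.
    """
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if not actual:
        raise ValueError("at least one observation is required")
    if any(a == 0 for a in actual):
        raise ValueError("actual values must be nonzero")

    # Relative error of each prediction, averaged over all observations.
    relative_errors = [abs(p - a) / abs(a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(relative_errors) / len(relative_errors)


# Hypothetical egg weights in grams (illustrative, not data from the study).
actual_weights = [60.0, 62.0, 58.0]
predicted_weights = [61.2, 60.76, 58.58]

error_pct = mean_absolute_percentage_error(actual_weights, predicted_weights)
print(f"Mean absolute percentage error: {error_pct:.2f}%")
```

Under this reading, a model meets the reported threshold when the returned value is below 3. In practice the metric would be computed on held-out test observations rather than on the data used to fit the model.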