DOI: 10.3390/app16126196 ISSN: 2076-3417

A Modular AutoML Framework for End-to-End Machine Learning Automation

Du’a Al-zaleq, Suboh Alkhushayni

Building a machine learning (ML) pipeline is a complex, resource-intensive task, and AutoML offers substantial value as an automated alternative. The need for AutoML is growing as more application areas require ML solutions but have limited technical expertise. The authors describe a novel, modular AutoML pipeline built in object-oriented Python and designed for both regression and classification problems. Unlike established libraries (TPOT, Auto-sklearn, Hyperopt-sklearn), the framework emits structured output (JSON, YAML, etc.) intended for backend and API integration. Rather than a GUI-based platform (low-code or otherwise), the authors propose a developer-oriented, code-driven AutoML pipeline. It also supports interpretability through feature engineering based on the variance inflation factor (VIF) and is designed for extensibility. Experimental results on public insurance data sets show that the system performs at least on par with the baseline systems tested, and surpasses them in several cases, while offering greater modularity and easier deployment. The study therefore presents a lightweight, real-world-ready AutoML solution for a variety of application areas, including NLP, computer vision, and web-based machine learning applications.
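The abstract does not say how the framework applies VIF or what its output schema looks like, so the following is only a minimal sketch of the general idea: compute each feature's variance inflation factor, drop the most collinear feature until all remaining ones fall below a threshold, and emit the result as JSON. The function names (`vif_scores`, `prune_by_vif`), the report keys, the column names, the synthetic data, and the threshold of 10 (a common rule of thumb) are all assumptions made for illustration and are not taken from the paper.

```python
import json
import numpy as np


def vif_scores(X):
    """Return VIF_j = 1 / (1 - R_j^2) for each column j of X.

    R_j^2 comes from regressing column j on all other columns plus an
    intercept. Assumes no column is constant.
    """
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    scores = []
    for j in range(p):
        y = X[:, j]
        others = np.delete(X, j, axis=1)
        A = np.column_stack([np.ones(n), others])  # intercept + other features
        coef, *_ = np.linalg.lstsq(A, y, rcond=None)
        resid = y - A @ coef
        ss_res = float(resid @ resid)
        ss_tot = float(((y - y.mean()) ** 2).sum())
        r2 = 1.0 - ss_res / ss_tot
        scores.append(float("inf") if r2 >= 1.0 else 1.0 / (1.0 - r2))
    return scores


def prune_by_vif(X, names, threshold=10.0):
    """Iteratively drop the feature with the highest VIF above `threshold`.

    Returns a plain dict (hypothetical schema) that serializes to JSON.
    """
    X = np.asarray(X, dtype=float)
    names = list(names)
    dropped = []
    while X.shape[1] > 1:
        scores = vif_scores(X)
        worst = int(np.argmax(scores))
        if scores[worst] <= threshold:
            break
        dropped.append(names.pop(worst))
        X = np.delete(X, worst, axis=1)
    final = [round(s, 3) for s in vif_scores(X)]
    return {"kept": names, "dropped": dropped, "vif": dict(zip(names, final))}


# Synthetic data (illustrative only): "bmi_scaled" is nearly a linear
# copy of "bmi", so one of the two should be pruned.
rng = np.random.default_rng(0)
age = rng.normal(40.0, 12.0, size=200)
bmi = rng.normal(27.0, 4.0, size=200)
bmi_scaled = 2.0 * bmi + rng.normal(0.0, 0.01, size=200)
X = np.column_stack([age, bmi, bmi_scaled])

report = prune_by_vif(X, ["age", "bmi", "bmi_scaled"])
print(json.dumps(report, indent=2))
```

Returning a plain dictionary keeps the pruning step independent of any particular serializer, so the same report could be written as YAML or returned from an API endpoint without changing the selection logic.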
