By Alexander Scriven, Complex Adaptive Systems Lab, Data Science Institute, University of Technology Sydney, Australia, alexander.scriven@uts.edu.au | David Jacob Kedziora, Complex Adaptive Systems Lab, Data Science Institute, University of Technology Sydney, Australia, david.kedziora@uts.edu.au | Katarzyna Musial, Complex Adaptive Systems Lab, Data Science Institute, University of Technology Sydney, Australia, katarzyna.musial-gabrys@uts.edu.au | Bogdan Gabrys, Complex Adaptive Systems Lab, Data Science Institute, University of Technology Sydney, Australia, bogdan.gabrys@uts.edu.au
With most technical fields, there exists a delay between fundamental academic research and practical industrial uptake. Whilst some sciences have robust and well-established processes for commercialisation, such as the pharmaceutical practice of regimented drug trials, other fields face transitory periods in which fundamental academic advancements diffuse gradually into the space of commerce and industry. For the still relatively young field of Automated/Autonomous Machine Learning (AutoML/AutonoML), that transitory period is under way, spurred on by a burgeoning interest from broader society. Yet, to date, little research has been undertaken to assess the current state of this dissemination and its uptake. This review therefore makes two primary contributions to knowledge around this topic. Firstly, it provides the most up-to-date and comprehensive survey of existing AutoML tools, both open-source and commercial. Secondly, it motivates and outlines a framework for assessing whether an AutoML solution designed for real-world application is ‘performant’; this framework extends beyond the limitations of typical academic criteria, considering a variety of stakeholder needs and the human-computer interactions required to service them. Additionally supported by an extensive assessment and comparison of academic and commercial case studies, this review evaluates mainstream engagement with AutoML in the early 2020s, identifying obstacles and opportunities for accelerating future uptake.
The Technological Emergence of AutoML presents a comprehensive snapshot of how AutoML has permeated mainstream use in the early 2020s. This work surveys both the implementation and the application of AutoML tools in the context of industry. It also defines what a ‘performant’ AutoML system is, placing a high value on support for human-computer interaction (HCI), and assesses how well the current crop of available packages and services lives up to expectations. To do so in a systematic manner, this survey is structured as follows.
Section 2 begins by elaborating on the notion of an ML workflow, conceptually framing AutoML in terms of the high-level operations required to develop, deploy and maintain an ML model. Section 3 uses this workflow to support the introduction of industry-related stakeholders and their interests/obligations. These requirements are unified into a comprehensive set of criteria, supported by methods of assessment, that determine whether an AutoML system can be considered performant. Section 4 launches the survey in earnest, assessing the nature and capabilities of existing AutoML technology, beginning with an examination of open-source AutoML packages. The section additionally investigates AutoML systems that are designed for specific domains, as well as commercial products. Subsequently, Section 5 assesses where AutoML technology has been used and how it has fared. Academic work focusing on real-world applications is surveyed, as are vendor-based case studies. All key findings and assessments are then synthesised in Section 6, with commentary on how mature AutoML technology is, as well as on the obstacles and opportunities for future uptake. Finally, Section 7 provides a concluding overview of the technological emergence of AutoML.