Machine learning operations (MLOps), the standardisation and streamlining of machine learning lifecycle management, has become one of the most important development areas within business.
By Fred Senekal, head of R&D at Learning Machines
Machine learning (ML) and data science are being widely adopted within organisations and can deliver practical solutions that provide demonstrable business returns. But doing so requires an effective team structure consisting of several specialists working together.
It all starts with the data engineer, who is responsible for preparing and building the data infrastructure that stores and manages the data the data scientist will analyse. The engineer is also responsible for ingesting both batch and streaming data and for carrying out ETL (extract, transform, load) functions, as well as for maintaining the data pipelines that supply the data powering ML models. Think of the data engineer as the gatekeeper and facilitator for the movement and storage of data.
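To make the ETL idea concrete, here is a minimal sketch in Python using only the standard library. It is an illustration rather than a production pipeline: the sample data, the `payments` table, and the function names are all hypothetical, and an in-memory SQLite database stands in for whatever storage the organisation actually uses.

```python
import csv
import io
import sqlite3

# Hypothetical raw export; in practice this would come from a file, API, or stream.
RAW_CSV = """customer_id,amount,currency
1,100.50,zar
2,,zar
3,20.00,usd
"""


def extract(text):
    """Extract: parse raw CSV text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: drop rows with a missing amount and normalise types."""
    cleaned = []
    for row in rows:
        if not row["amount"]:
            continue  # skip incomplete records
        cleaned.append(
            (int(row["customer_id"]), float(row["amount"]), row["currency"].upper())
        )
    return cleaned


def load(records, conn):
    """Load: write the cleaned records into a table (hypothetical schema)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS payments "
        "(customer_id INTEGER, amount REAL, currency TEXT)"
    )
    conn.executemany("INSERT INTO payments VALUES (?, ?, ?)", records)
    conn.commit()


def run_pipeline(text, conn):
    """Run extract, transform, and load in order; return the stored row count."""
    load(transform(extract(text)), conn)
    return conn.execute("SELECT COUNT(*) FROM payments").fetchone()[0]


conn = sqlite3.connect(":memory:")  # stand-in for a real data store
loaded = run_pipeline(RAW_CSV, conn)
print(loaded)
```

Keeping the three stages as separate functions means each can be tested and replaced on its own, which is much of what maintaining a data pipeline involves in practice.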
Data scientists are analytical experts who find trends in data and build predictive models. They serve as the link between technology and business and perform exploratory data analysis, feature selection, and dataset preparation. They are also responsible for building ML models and assessing their quality. Because of this, most of them are familiar with several programming languages as well as statistical analysis, ML techniques, and data visualisation.
The MLOps or DevOps engineers work with the data scientists, data engineers, developers, and technical teams to package machine learning models, manage code releases, version both the models and the software deployments, and monitor the models in production. They build operational environments that comply with MLOps and DevOps strategies and best practices.
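As a rough illustration of what packaging and versioning a model can look like, the sketch below serialises a model and files it under a version derived from its contents. Everything here is an assumption made for the example: the in-memory `registry` dictionary, the `package_model` and `load_model` helpers, and the dictionary standing in for a trained model. Real teams would typically rely on a dedicated model registry rather than code like this.

```python
import hashlib
import pickle


def package_model(model, name, registry):
    """Serialise a model and store it under a content-derived version string."""
    payload = pickle.dumps(model)
    # The version is a short hash of the serialised bytes, so a changed
    # model yields a different version.
    version = hashlib.sha256(payload).hexdigest()[:12]
    registry.setdefault(name, {})[version] = payload
    return version


def load_model(name, version, registry):
    """Retrieve and deserialise a specific version of a named model."""
    return pickle.loads(registry[name][version])


registry = {}  # hypothetical in-memory registry

# Plain dictionaries stand in for trained models in this sketch.
model_a = {"weights": [0.2, 0.8], "bias": 0.1}
model_b = {"weights": [0.3, 0.7], "bias": 0.1}

version_a = package_model(model_a, "churn", registry)
version_b = package_model(model_b, "churn", registry)

# Either version can be restored later, for example to roll back a release.
restored = load_model("churn", version_a, registry)
```

Tying the version to the model's contents, rather than to a manually incremented number, is one simple way of making sure that what is monitored in production can be traced back to exactly what was released.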
An integral part of this team is the ML or software engineer. These are the individuals tasked with integrating ML models into applications and systems. They must also ensure that the ML models work seamlessly with non-ML applications. Fundamentally, they apply the principles of computer science and computer engineering to design and develop relevant solutions that address business requirements.
Moving on from the development side, the data and business analysts analyse and interpret data sets, trends, and patterns that are valuable for decision-making within organisations. They collaborate with data scientists and engineers to identify opportunities for process improvements, recommend system modifications, and develop policies for data governance. They apply critical thinking and good communication skills to convey to business and technical teams how solutions must be optimised.
The final link in the MLOps chain is the report writer. Report writers prepare reports based on the analysis from the data scientists and communicate trends, patterns, and predictions effectively. They support research teams and management by collecting and analysing data and reporting results based on the needs of end users.
All told, effective ML development inside the organisation requires all these individuals to work together with an understanding of the business challenges that need to be addressed. While this is an example of an ‘ideal’ team, the reality is that many companies will likely share these roles and responsibilities among a smaller number of specialists because of skills availability and affordability concerns.