Machine Learning is much more than training models

Over the past few years, I've met a lot of data scientists and machine learning engineers. I've also had the opportunity to learn a lot about what a machine learning job means, so I decided to share some thoughts.

Here is the biggest misconception that I hear over and over again: "Machine Learning Engineers spend their time designing and training machine learning models." This idea is misleading. Yes, they spend some of their time designing and training new models, but realistically, this is just a small fraction of the job.

Usually, Machine Learning Engineers spend most of their time dealing with data and putting their models into production. In a perfect world, there would be a Data Engineer who would help set up the data pipeline that feeds the machine learning model. There would also be DevOps specialists, Software Engineers, and other roles helping with everything that needs to happen to take a model off a laptop and into the real world. The reality is somewhat different.

Here is a high-level summary of everything you should expect in a regular Machine Learning Engineer position:

  • Gather the list of requirements and understand the problem that you need to solve
  • Communicate with stakeholders throughout the entire lifecycle of the project
  • Do a lot of research to find the best way to solve the problem
  • Design a solution that provides value and stays within the timeline and budget constraints
  • Determine which data you will use, based on its availability and usefulness
  • Design and implement the solution to gather the data and make it available to your implementation
  • Do feature engineering on the original data to transform it to your specific needs
  • Create the appropriate machine learning models to solve the problem
  • Design and implement the connections of your solution with existing systems and processes
  • Glue together all components into a comprehensive solution that addresses the original problem
  • Take your solution into production
  • Design a process to keep your product up to date (further model training, updates, etc.)
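
To make the list above a bit more concrete, here is a minimal sketch of a few of those steps in Python, using only the standard library. Everything in it is an assumption made for illustration: the `RAW_ROWS` data, the function names, and the one-parameter "model" are hypothetical stand-ins, not a recommended stack. Notice that the training step is only one small function among several.

```python
import json

# Hypothetical raw records; in a real project these would come from a data pipeline.
RAW_ROWS = [
    {"sqft": "850", "price": "170000"},
    {"sqft": "1200", "price": "240000"},
    {"sqft": "", "price": "100000"},  # incomplete record that has to be dropped
    {"sqft": "1500", "price": "300000"},
]


def gather_data(rows):
    """Data work: drop incomplete records and parse strings into numbers."""
    return [
        (float(r["sqft"]), float(r["price"]))
        for r in rows
        if r["sqft"] and r["price"]
    ]


def engineer_features(samples):
    """Feature engineering: rescale square feet to thousands of square feet."""
    return [(sqft / 1000.0, price) for sqft, price in samples]


def train(samples):
    """Training: fit price = weight * x by least squares through the origin."""
    numerator = sum(x * y for x, y in samples)
    denominator = sum(x * x for x, _ in samples)
    return {"weight": numerator / denominator}


def package(model):
    """Production work: serialize the model so another system can load it."""
    return json.dumps(model)


def predict(serialized_model, sqft):
    """Serving: load the packaged model and apply the same feature scaling."""
    model = json.loads(serialized_model)
    return model["weight"] * (sqft / 1000.0)


samples = engineer_features(gather_data(RAW_ROWS))
artifact = package(train(samples))
estimate = predict(artifact, 1000)
```

Even in this toy version, most of the code deals with cleaning data, shaping features, and packaging and serving the result. A real project would add requirements gathering, integration with existing systems, monitoring, and retraining on top of that.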

Your mileage may vary depending on your company and its characteristics, but in general, your job will be much more than just training models.

(In my opinion, Software Engineers who get into Machine Learning have an excellent advantage in the field.)