Scoping a Machine Learning Project
Let us paint a scenario that you might be familiar with or will come across during your professional career.
The client has just tasked you with a new feature or project that requires you to carry out an analysis and then build a machine learning model. For any data science professional, it is tempting to skip straight to the exciting part: planning which models to try out.
Nevertheless, the project scoping stage is the time to pause deliberately and ask critical questions. Taking the time to reflect and inquire gives you a deeper understanding of the project’s intricacies and ensures that no vital aspects are overlooked.
For any analytics, data science or machine learning project, it is important to consider the following steps:
- Project Brief – What are the important questions you need to ask to understand the business use-case?
- Data – How do you anticipate the potential challenges with the data?
- Methodology – Design a comprehensive approach to achieve the client’s requirements.
- Plan – Points to consider when setting a timeline and building a plan.
With so many factors at play, from the intricacies of the desired outcome to the choice of key metrics, new practitioners can find it overwhelming to juggle everything at once. Approaching these factors systematically helps: breaking them down into manageable components and addressing them one by one makes it easier to navigate the project with confidence.
In this series of articles, we will discuss a checklist of the key considerations while approaching a new data science project. While we will delve into the specifics of each step in subsequent articles, it is crucial to begin by establishing a clear project brief for your machine learning endeavour. By starting with a solid foundation, you will set yourself up for success as you navigate the intricacies of the project ahead.
The project brief is the starting point for the project, with the primary focus on understanding the business objectives and specifications. Typically positioned at the beginning of the project’s life cycle, it involves one or more discussions with the relevant stakeholders to gain clarity and alignment. The key questions include:
- What exactly is required in the solution in terms of the functionality?
- What is the intended use of the respective analysis/feature or model?
- What impact does it have on business decision-making?
- What criteria will be used to test and ensure that the model aligns with the client’s requirements?
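One lightweight way to keep track of these questions is to record each answer as it comes in and list what is still open before the next stakeholder discussion. The snippet below is only a minimal sketch in Python: the `ProjectBrief` class, its field names, and the sample answers are hypothetical and are not part of any established template.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class ProjectBrief:
    """Hypothetical container for answers gathered during scoping."""

    project_name: str
    # Maps each scoping question to the stakeholder's answer.
    # None or an empty string means the question is still open.
    answers: Dict[str, Optional[str]] = field(default_factory=dict)

    def open_questions(self) -> List[str]:
        """Return the questions that do not yet have a non-empty answer."""
        return [q for q, a in self.answers.items() if not (a and a.strip())]


# Illustrative data only: the answers below are placeholders.
brief = ProjectBrief(
    project_name="Example project",
    answers={
        "Required functionality": "Placeholder answer from the first discussion",
        "Intended use": None,
        "Impact on decision-making": None,
        "Acceptance criteria": "",
    },
)

print(brief.open_questions())
```

Anything returned by `open_questions()` is a prompt for the next conversation with the client rather than a detail to guess at during modelling.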
Let us take an example:
The client has asked you to set up a system that transcribes sales calls and provides a sentiment monitor to better guide the sales agent.
The important questions that we need clarity on include:
- What does the client’s current setup look like?
- How many sales agents are there, and how many of them will use this system?
- In what ways will the sentiment monitor impact and influence the sales agents?
Further, you also want to get clarity on the business specifications such as:
- How fast does the client want the system to work?
- What level of model accuracy is necessary, and what are the consequences of false positives?
- Are there any specific scenarios or special cases that the model must be capable of handling?
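Once the client has answered these questions, it can help to write the agreed numbers down as explicit acceptance criteria and check a candidate model against them. The following is a hedged sketch rather than a prescribed method: the `AcceptanceCriteria` class, the `evaluate` function, the threshold values, and the labels are all made up for illustration, and it assumes a binary sentiment label (1 = positive, 0 = not positive), whereas a real sentiment monitor may use more classes.

```python
from dataclasses import dataclass
from typing import Dict, Sequence, Union


@dataclass(frozen=True)
class AcceptanceCriteria:
    """Hypothetical thresholds agreed with the client."""

    max_latency_seconds: float
    min_accuracy: float
    max_false_positive_rate: float


def evaluate(
    y_true: Sequence[int],
    y_pred: Sequence[int],
    latency_seconds: float,
    criteria: AcceptanceCriteria,
) -> Dict[str, Union[float, bool]]:
    """Compare binary predictions and a measured latency with the criteria."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and equally long")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)

    accuracy = (tp + tn) / len(pairs)
    # False positive rate: share of true negatives wrongly flagged as positive.
    fpr = fp / (fp + tn) if (fp + tn) > 0 else 0.0

    return {
        "accuracy": accuracy,
        "false_positive_rate": fpr,
        "meets_latency": latency_seconds <= criteria.max_latency_seconds,
        "meets_accuracy": accuracy >= criteria.min_accuracy,
        "meets_false_positive_rate": fpr <= criteria.max_false_positive_rate,
    }


# Placeholder thresholds and labels, not real client requirements or results.
criteria = AcceptanceCriteria(
    max_latency_seconds=2.0, min_accuracy=0.8, max_false_positive_rate=0.2
)
y_true = [1, 0, 1, 0, 0, 1, 0, 1]
y_pred = [1, 0, 0, 0, 1, 1, 0, 1]
report = evaluate(y_true, y_pred, latency_seconds=1.5, criteria=criteria)
print(report)
```

With these placeholder numbers the model is fast enough but misses both the accuracy and the false positive thresholds, which is exactly the kind of gap worth discussing with the client before any further modelling.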
Additionally, it is crucial to consider domain-specific details such as the nature of the products involved. The specific features discussed during sales calls can significantly impact the model’s performance. Furthermore, it is essential to account for the different stages of sales calls and variations in the sales scripts, as they can introduce unique challenges and nuances that need to be addressed for optimal results.
Properly addressing these critical questions plays a pivotal role in setting the approach and comprehending the nuances of the underlying domain and use case. At this stage, it is highly effective to paraphrase or reiterate the discussed requirements, affirming to the client that their needs have been actively listened to and thoroughly understood. This fosters a strong client-provider relationship and lays a solid foundation for project success.
While the project brief provides an overall understanding of the requirements, it is essential to recognise the critical role of data in driving the success of your machine learning project. In the next article of this series, we will delve into the intricacies of data requirements and explore effective strategies for assessing data quality. Stay tuned for more actionable insights and guidance in our next instalment.
Key takeaways:
- Take the time to properly define and understand the project before diving into modelling.
- Understand the business objectives, specifications, and requirements through important questions and clear communication.
- Paraphrase or reiterate requirements to ensure a clear understanding and avoid misunderstandings.
Data science discovery is a step on the path of your data science journey. Please follow us on LinkedIn to stay updated.
About the writers:
- Ujjayant Sinha: Data scientist with professional experience in market research and machine learning in the pharma domain across computer vision and natural language processing.
- Ankit Gadi: A knack and passion for data science, coupled with a strong foundation in Operations Research and Statistics, has helped me embark on my data science journey.