This is the last article of the series Decoding Machine Learning.

Managing machine learning projects requires careful planning, a well-thought-out strategy, and effective use of resources to ensure they deliver the desired results within the stipulated timelines. At first it looks like a tough nut to crack, and understandably so: the stakes are high, and since data science often involves running several experiments and tests to arrive at the right answer, it can seem almost unpredictable.

This article covers practices that will help you navigate this hurdle.

Elements

Let’s take a brief overview of the major elements that are the prerequisites for building a plan. These elements have been covered in more detail in the previous articles.

  • Understanding the scope: Before beginning any machine learning project, it’s important to have a clear understanding of the scope and goals of the project. This involves defining the business problem the project aims to solve and identifying the specific objectives (business and technical specifications) that must be achieved to solve it successfully.
  • Data Complexity: Assessing the data requirements is a very important step, as it helps define the constraints of your approach. Another key part is understanding data quality, which involves evaluating the completeness, accuracy, and consistency of the available data to determine whether it is suitable for the project. This also means identifying potential limitations of the data, such as missing values, non-uniformity, or possible biases that may impact the project’s performance (a brief sketch of such checks follows this list).
  • Approach: The strategy or general approach to the problem statement is critical. We have covered this in much more detail in the previous article.
  • Plan: This portion will be discussed in detail in this article covering resourcing to building the final plan for executing the project.
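As a quick illustration of these data-quality checks, here is a minimal sketch using pandas. The file name and the checks shown (missing values, duplicates, category distributions) are placeholders for whatever your dataset or its metadata actually contains.

```python
import pandas as pd

# Load the dataset or its metadata (hypothetical file name).
df = pd.read_csv("data.csv")

# Completeness: share of missing values per column.
print("Missing-value share per column:")
print(df.isna().mean().sort_values(ascending=False))

# Consistency: exact duplicate rows that may inflate the dataset.
print("Duplicate rows:", df.duplicated().sum())

# Non-uniformity / bias: distribution of each categorical column,
# useful for spotting under-represented cases.
for col in df.select_dtypes(include="object").columns:
    print(f"\nValue distribution for '{col}':")
    print(df[col].value_counts(normalize=True))
```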

Plan

How do we plan? At this stage you should have an understanding of the client requirements, a proposed approach and a comprehensive list of the tasks. Let us focus our attention on two major concepts and practices.

Using a Work Breakdown Structure (WBS)

Understand with an example

The core idea is simply to break a task down into subtasks. Let’s take an example:

  1. The client’s requirement is to classify red / green lights at traffic stops. While the client has a model in place, they realised that there were several points of failure. In particular, certain types of stop lights were barely represented in the training dataset, causing failures in production.
  2. The client wants you to evaluate their training data to identify more such issues, and to provide additional data covering the missed cases.

Now, let’s consider the overall tasks:

  • Client data assessment: Evaluating the images by clustering them on the basis of similarity, metadata, and other characteristics.
  • Research & defining the universe: Understanding the different ways stop lights appear in different countries.
  • Sourcing: Implementing a mechanism to source more data.
  • Quality evaluation: Calculating KPIs for evaluating the sourced data.
  • Delivery: Delivering the data along with the presentations and analysis.

If we were to break down the tasks into smaller components:

Task | Sub-Task
Client data assessment | Ingesting the data into local / cloud storage
Client data assessment | Building code to generate metadata features such as the height and width of images
Client data assessment | Building code to cluster similar images using image embeddings
Client data assessment | Manually reviewing clusters and summarising metadata statistics and clusters
Client data assessment | Identifying and documenting gaps in the data
Research & defining the universe | Researching the types of stop lights by country
Research & defining the universe | Identifying sources to scrape / fetch the data
Sourcing | Writing code to crawl web pages containing stop light images, or gathering URLs provided manually
Sourcing | Evaluating the coverage / sufficiency and quality of the extract
Sourcing | Downloading additional data points manually where scraping is not possible
Quality evaluation | Identifying KPIs to assess the coverage, completeness and data quality
Quality evaluation | Building the analysis report
Delivery | Finalising the data for delivery
Delivery | Finalising the reports / presentation of the analysis covering major statistics and insights
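To ground a couple of these sub-tasks, here is a minimal sketch of the metadata-generation and clustering steps, assuming a hypothetical local folder client_images of JPEGs. For brevity it clusters downscaled grayscale pixel vectors as a stand-in for proper image embeddings (which would typically come from a pretrained vision model), so the folder, features, and cluster count are all illustrative.

```python
from pathlib import Path

import numpy as np
from PIL import Image
from sklearn.cluster import KMeans

IMAGE_DIR = Path("client_images")  # hypothetical folder of client images

names, metadata, vectors = [], [], []
for path in sorted(IMAGE_DIR.glob("*.jpg")):
    img = Image.open(path)
    # Metadata features: height and width, as in the sub-task above.
    metadata.append({"width": img.width, "height": img.height})
    # Crude stand-in for an embedding: a 32x32 grayscale pixel vector.
    vec = np.asarray(img.convert("L").resize((32, 32)), dtype=np.float32).ravel()
    vectors.append(vec)
    names.append(path.name)

# Group visually similar images; the cluster count is a guess to tune.
labels = KMeans(n_clusters=5, random_state=0).fit_predict(np.stack(vectors))

for name, meta, label in zip(names, metadata, labels):
    print(f"cluster {label}: {name} ({meta['width']}x{meta['height']})")
```

Manually reviewing the resulting clusters and their metadata statistics is what surfaces the gaps, such as stop light types that barely appear in the data.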

Definition

Officially, a work breakdown structure (WBS) is a hierarchical decomposition of the tasks required to complete a project. But a good WBS goes further than that. While developing a comprehensive WBS, we also categorise responsibilities based on scope, complexity, and resource requirements. Further, we assign subtasks to individuals, establish clear dependencies among tasks, and outline milestones that signify the completion of important phases. It may feel like extra effort, but a detailed WBS ensures nothing falls through the cracks and keeps team members focused on their respective roles and accountabilities.

Work estimation

Let’s say the client has asked you to complete this activity within 7 days. Also, for a few sources identified during your preliminary research, it will be easier to download the data manually than to build scrapers.

Thus, in order to estimate the effort involved for each component task, you need to understand its complexity. This is usually determined based on past experience, skill sets, and research.

For example, if a relatively uncomplicated source needs to be scraped and the allocated resource has prior experience, the script can be built and tested within a day. It is also important to build some buffer into your effort estimations. Generally, for a high-complexity task you can add 30–50% to your time estimate as a buffer.

This means classifying each task as high, medium, or low complexity:

Complexity | Description
High | Tasks with many unknowns or known difficult challenges. These challenges may force you to alter your approach to the problem.
Medium | Tasks where you have a general idea or approach in mind, or where you have past experience implementing something similar.
Low | Generally an easy task where the requirements and approach are well known.
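To make the buffering concrete, here is a minimal sketch of how buffered estimates could be rolled up against the 7-day deadline. The 1.5x multiplier for high complexity follows the 30–50% guideline above; the multipliers for medium and low, and the base estimates, are illustrative assumptions.

```python
# Buffer multipliers per complexity level (medium and low are assumed
# for illustration; only the high-complexity buffer is suggested above).
BUFFER = {"high": 1.5, "medium": 1.2, "low": 1.0}

# (sub-task, base estimate in days, complexity) -- hypothetical figures.
tasks = [
    ("Ingest data into local / cloud storage", 0.5, "low"),
    ("Cluster similar images using embeddings", 2.0, "high"),
    ("Write scraper for stop light images", 1.0, "medium"),
]

total = 0.0
for name, base_days, complexity in tasks:
    buffered = base_days * BUFFER[complexity]
    total += buffered
    print(f"{name}: {base_days:.1f}d -> {buffered:.1f}d ({complexity})")

print(f"Total buffered estimate: {total:.1f} days against the 7-day deadline")
```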

Similarly, the potential impact / value of the outcome can also be classified into categories (High, Medium, Low).

Impact | Description
High | A highly critical task delivering high value to the end user. If this task is not carried out, the project requirements will not be met.
Medium | A task that adds meaningful value, but the project can still meet its core requirements without it, or with a simpler version of it.
Low | A nice-to-have task whose outcome has little bearing on the final deliverable.

Value vs Complexity Matrix

At its core, the complexity vs. value matrix assigns relative values to tasks or features within a larger project context, allowing teams to make informed decisions on how to divide efforts and resources. Consider the example covered in this article:

Task | Sub-Task | Complexity | Value
Client data assessment | Ingesting the data into local / cloud storage | Low | Low
Client data assessment | Building code to generate metadata features such as the height and width of images | Medium | Medium
Client data assessment | Building code to cluster similar images using image embeddings | High | High
Client data assessment | Manually reviewing clusters and summarising metadata statistics and clusters | Medium | High
Client data assessment | Identifying and documenting gaps in the data | Low | High
Research & defining the universe | Researching the types of stop lights by country | Medium | High
Research & defining the universe | Identifying sources to scrape / fetch the data | Low | Medium
Sourcing | Writing code to crawl web pages containing stop light images, or gathering URLs provided manually | Medium | Medium
Sourcing | Evaluating the coverage / sufficiency and quality of the extract | Low | Medium
Sourcing | Downloading additional data points manually where scraping is not possible | Medium | Medium
Quality evaluation | Identifying KPIs to assess the coverage, completeness and data quality | Medium | High
Quality evaluation | Building the analysis report | Low | High
Delivery | Finalising the data for delivery | Low | High
Delivery | Finalising the reports / presentation of the analysis covering major statistics and insights | Low | High

Let’s decode the matrix.

High value, Low complexity: Tasks falling into this category may not require a lot of effort, but they hold significant weight in achieving project goals. Addressing these tasks early on sets a foundation for continued progress while yielding substantial benefits.

Low value, Low complexity: Routine jobs without much significance. While essential for meeting project objectives, they often consume unnecessary effort unless managed carefully. Avoid investing too heavily in these trivial matters that will not significantly contribute to the project’s outcome.

High value, High complexity: With elevated stakes come complicated processes demanding intense focus and specialised skills. These tasks present both opportunities and challenges, and managing them well requires experienced leadership, teamwork, and clear communication channels.

Similarly, you can consider the rest of the combinations as well.
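One simple way to act on the matrix is to order tasks by value first and complexity second, so the high-value, low-complexity work is picked up early. The sketch below applies this heuristic to a few sub-tasks from the table above; it is one possible prioritisation rule, not a prescribed one.

```python
# Rank categories: higher value first, then lower complexity first.
VALUE = {"High": 3, "Medium": 2, "Low": 1}
COMPLEXITY = {"Low": 1, "Medium": 2, "High": 3}

# A few sub-tasks from the matrix above: (name, complexity, value).
tasks = [
    ("Ingesting the data into local / cloud storage", "Low", "Low"),
    ("Building code to cluster similar images", "High", "High"),
    ("Identifying and documenting gaps in the data", "Low", "High"),
    ("Identifying sources to scrape / fetch the data", "Low", "Medium"),
]

# Sort: highest value first; break ties by lowest complexity.
ordered = sorted(tasks, key=lambda t: (-VALUE[t[2]], COMPLEXITY[t[1]]))

for name, complexity, value in ordered:
    print(f"{value} value / {complexity} complexity: {name}")
```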

Finally, we come close to putting our plan in shape. The only components left are deciding on the tooling (based on the requirements) and allocating the tasks to the respective resources.

About Us

Data Science Discovery is a step on the path of your data science journey. Please follow us on LinkedIn to stay updated.

About the writers:

  • Ujjayant Sinha: Data scientist with professional experience in market research and machine learning in the pharma domain across computer vision and natural language processing.
  • Ankit Gadi: A knack and passion for data science, coupled with a strong foundation in Operations Research and Statistics, helped me embark on my data science journey.
