Efficient supply chains, production planning, and resource allocation are more important than ever. Python is a fantastic platform for modeling our needs and constraints, and can even visualize our schedule for communicating it up and down the organization.

It’s no secret that supply chain management is one of the biggest areas of focus for improvement in today’s global economy. Getting goods from destination A to destination B is challenging enough, but manufacturing enough material is equally arduous, with shortages in silicon chips and pharmaceutical assemblies catching headlines as well.
Modeling these manufacturing processes requires a well-developed understanding of the constraints and dependencies inherent to the production line. Mixed-integer programming solvers such as IBM’s CPLEX or Google’s OR-Tools derive solutions that optimize an objective function, such as minimizing costs or assigning resources to fixed shifts under delineated constraints and priorities. These models struggle, however, to represent continuous systems, and they require careful scaling to ensure the inputs are mathematically feasible at all.
We can develop a rudimentary production plan with resource balancing using a forward-facing heuristic with relative ease in Python, relying on well-known packages such as Pandas. We can then illustrate the plan as an interactive Gantt chart in Plotly, ready to cascade vertically to management and the shop floor.
First, let’s flesh out the case study we will be using for this exercise: biologics manufacturing within pharmaceuticals. Many vaccines manufactured today, such as those for COVID, are derived from microorganisms modified to produce the bulk material that elicits the immune response necessary to prevent disease transmission and illness. The details of these processes are often proprietary and very complicated, but for our exercise let’s assume we are manufacturing a product grown from a microorganism in a fermentation process, purified thoroughly using filters and centrifuges, and then sterilized before adjuvants are added and vials are filled. The process illustrated below is generic and can be found at many different organizations at both lab scale and industrial scale.

True manufacturing processes would have hundreds of tasks with interlinked constraints and dependencies. For our model, we will have progressive dependencies of one or two preceding tasks, and constraints on operator labor and equipment availability. We will model these tasks and constraints within a JSON file in the following format.
Tasks:
"Items": [
{
"ID":1,
"Name":"Seed",
"Description":"Start fermentation from vial of seed material.",
"Process":"Fermentation",
"ResourcesRequired":
{
"Operator":2,
"SafetyCabinet":1
},
"Dependencies":[0],
"Duration":2
},
{
"ID":2,
"Name":"Flasks",
"Description":"Ferment seed material in flasks.",
"Process":"Fermentation",
"ResourcesRequired":
{
"Operator":2,
"SafetyCabinet":1
},
"Dependencies":[1],
"Duration":3
},
{
"ID":3,
"Name":"Small Fermentor",
"Description":"Ferment in small fermentation vessel.",
"Process":"Fermentation",
"ResourcesRequired":
{
"Operator":2,
"SmallFermentor":1
},
"Dependencies":[2],
"Duration":3
},
{
"ID":4,
"Name":"Large Fermentor",
"Description":"Ferment and inactivare in large fermentation vessel.",
"Process":"Fermentation",
"ResourcesRequired":
{
"Operator":2,
"LargeFermentor":1
},
"Dependencies":[3],
"Duration":4
},
{
"ID":5,
"Name":"Prepare Filters",
"Description":"Prep purification filters for next step.",
"Process":"Purification",
"ResourcesRequired":
{
"Operator":1,
"PurificationTank":1
},
"Dependencies":[3],
"Duration":1
},
{
"ID":6,
"Name":"Purification Filters",
"Description":"Start purification in first purification assembly.",
"Process":"Purification",
"ResourcesRequired":
{
"Operator":3,
"PurificationTank":1
},
"Dependencies":[4,5],
"Duration":4
},
{
"ID":7,
"Name":"Centrifuge",
"Description":"Separate material in centrifuges.",
"Process":"Purification",
"ResourcesRequired":
{
"Operator":2,
"Centrifuge":2
},
"Dependencies":[6],
"Duration":4
},
{
"ID":8,
"Name":"Sterile Filter",
"Description":"Start sterilization of material.",
"Process":"Sterile Boundary",
"ResourcesRequired":
{
"Operator":3,
"SterileAssembly":1
},
"Dependencies":[7],
"Duration":2
},
{
"ID":9,
"Name":"Adjuvants",
"Description":"Add adjuvants to bulk material.",
"Process":"Sterile Boundary",
"ResourcesRequired":
{
"Operator":2,
"SterileAssembly":1
},
"Dependencies":[8],
"Duration":2
},
{
"ID":10,
"Name":"Prepare Vials",
"Description":"Sterilize bulk vials.",
"Process":"Sterile Boundary",
"ResourcesRequired":
{
"Operator":2,
"VialFiller":1
},
"Dependencies":[8],
"Duration":1
},
{
"ID":11,
"Name":"Fill",
"Description":"Fill vials with bulk material.",
"Process":"Sterile Boundary",
"ResourcesRequired":
{
"Operator":2,
"VialFiller":1
},
"Dependencies":[9,10],
"Duration":3
}
],
Notice every step of our batch includes:
- ID – integer value
- Name – string short description
- Description – string long description
- Process – string (one of three categories of step)
- Resources Required – dictionary of resources required and integer count
- Dependencies – list of the IDs of the tasks this task depends on
- Duration – number of hours required (does not have to be a whole number)
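Before scheduling, a quick structural check can catch malformed task records early. A minimal sketch (the `validate_task` helper and the `REQUIRED_FIELDS` set are ours, not part of the dataset):

```python
# Fields every task record in our JSON is expected to carry.
REQUIRED_FIELDS = {"ID", "Name", "Description", "Process",
                   "ResourcesRequired", "Dependencies", "Duration"}

def validate_task(task):
    """Raise ValueError if a task dictionary is missing any required field."""
    missing = REQUIRED_FIELDS - task.keys()
    if missing:
        raise ValueError(f"Task {task.get('ID')} missing fields: {sorted(missing)}")
    return True

# Example: the first task from the file above passes the check.
validate_task({"ID": 1, "Name": "Seed",
               "Description": "Start fermentation from vial of seed material.",
               "Process": "Fermentation",
               "ResourcesRequired": {"Operator": 2, "SafetyCabinet": 1},
               "Dependencies": [0], "Duration": 2})
```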
Constraints:
"ResourceCapacities":{
"Operator":3,
"SafetyCabinet":1,
"SmallFermentor":1,
"LargeFermentor":1,
"PurificationTank":1,
"Centrifuge":3,
"SterileAssembly":1,
"VialFiller":1
}
Notice our constraints are a dictionary mapping the resources used by tasks to what our case study’s manufacturing team has available. Our manufacturing suite has 3 operators, 1 safety cabinet, 1 small fermentor, etc., available for use at any given time.
If we add up the durations of all of our tasks, each batch would take 29 hours end to end; however, we notice that some tasks are reliant on two predecessors. These can be done in tandem if we have enough resources available! The goal of our program will be to schedule the batch with the shortest runtime. In theory, this production plan could be used to schedule a series of runs and give management a stronger idea of what is achievable within a period given limited resources.
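As a quick sanity check on that estimate, summing the durations from the task list above gives the fully serial runtime, the worst case if nothing ran in tandem:

```python
# Serial (no-overlap) runtime is just the sum of task durations.
# The list mirrors the ID/Duration pairs from the "Items" JSON above.
tasks = [
    {"ID": 1, "Duration": 2}, {"ID": 2, "Duration": 3}, {"ID": 3, "Duration": 3},
    {"ID": 4, "Duration": 4}, {"ID": 5, "Duration": 1}, {"ID": 6, "Duration": 4},
    {"ID": 7, "Duration": 4}, {"ID": 8, "Duration": 2}, {"ID": 9, "Duration": 2},
    {"ID": 10, "Duration": 1}, {"ID": 11, "Duration": 3},
]
serial_hours = sum(t["Duration"] for t in tasks)
print(serial_hours)  # 29 hours if every task ran back to back
```

Any schedule that overlaps tasks 4 and 5, or 9 and 10, beats this bound.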
To start developing our model within Python, we first have to import the libraries we will be using.
import pandas as pd
import datetime
import numpy as np
import plotly.figure_factory as ff
import json
import random
import plotly.express as px
Pandas will be the main library being used for multiple data frames and manipulations. Plotly will be our main library for visualization later on. Importing our data JSON file is straightforward.
with open('tasks.json') as json_file:
    data = json.load(json_file)
We can then initialize our tasks data frame with this loaded data.
TasksDF = pd.DataFrame.from_dict(data['Items'])
CapacitiesDict = data['ResourceCapacities']

Our data frame looks very similar to what you would see in Excel or another table-based tool; however, our ResourcesRequired column contains dictionaries and our Dependencies column contains lists of either one or two elements. The capacities we simply keep as a plain Python dictionary.
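To see how those nested cells behave, here is a two-row sketch of the tasks frame; once a cell is pulled out with `.loc`, it is an ordinary Python dict or list:

```python
import pandas as pd

# Two rows in the shape of TasksDF: object columns happily hold dicts and lists.
TasksDF = pd.DataFrame([
    {"ID": 6, "ResourcesRequired": {"Operator": 3, "PurificationTank": 1}, "Dependencies": [4, 5]},
    {"ID": 7, "ResourcesRequired": {"Operator": 2, "Centrifuge": 2}, "Dependencies": [6]},
])

# Pull a cell out, then index it like any dict or list.
operators_needed = TasksDF.loc[0, "ResourcesRequired"]["Operator"]
first_dependency = TasksDF.loc[0, "Dependencies"][0]
print(operators_needed, first_dependency)  # 3 4
```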
Next, we want to initialize a list of time intervals spanning the entire range we can expect our process to reach. The total of our duration columns summed together is 29 hours, so let’s round up to two days.
start_date = datetime.datetime(2021, 1, 1, 0, 0, 0)
number_of_days = 2
intervals = []
for minute in range(number_of_days*96):
    interval = (start_date + datetime.timedelta(minutes = 15*minute)).isoformat()
    intervals.append(interval)
Our intervals have a granularity of fifteen minutes, and there are 96 fifteen-minute chunks in one day, so our list contains 192 intervals in total. We could scale up to thirty-minute or one-hour intervals to save processing time, or get more precise timelines at five- or ten-minute intervals; a deciding factor in choosing the granularity is the precision of the duration column.
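To experiment with the granularity, the interval construction can be wrapped in a small helper (the function name is ours):

```python
import datetime

def build_intervals(start, days, chunk_minutes):
    """ISO-formatted interval start times covering `days` days at the given granularity."""
    per_day = 24 * 60 // chunk_minutes
    return [(start + datetime.timedelta(minutes=chunk_minutes * i)).isoformat()
            for i in range(days * per_day)]

start = datetime.datetime(2021, 1, 1)
fifteen = build_intervals(start, 2, 15)  # 192 intervals, as in this article
hourly = build_intervals(start, 2, 60)   # 48 intervals: faster, but coarser
```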
Next, we need to make a matrix of all of our times and the resources we will need to load. Note below that our columns are all of the resource names and the keys from the capacities dictionary.
headings = list(CapacitiesDict.keys())
loading = pd.DataFrame(columns = headings)
loading['index']=intervals
loading = loading.fillna(0)

We will then do the same for a maximum loading matrix, but this time the contents will be the maximum number of each resource we can load during that interval. If a task is attempted while resources are unavailable in an interval, it will be postponed to the next interval with available resources. For the sake of simplicity in this exercise we will keep our values constant (three operators available 24/7), but in reality these values change with shifts and equipment availability. Our later algorithm functions the same regardless of the scenario.
loadingMax = pd.DataFrame(columns = headings)
loadingMax['index']=intervals
for item in headings:
    loadingMax[item] = CapacitiesDict[item]

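As a sketch of that more realistic scenario, here is one hypothetical way to vary operator capacity by shift. The shift hours and the two-operator night crew are our assumptions, not part of the case study:

```python
import datetime
import pandas as pd

# Hypothetical shift pattern: 3 operators by day, only 2 on the
# night shift (22:00-06:00). The scheduling loop needs no changes.
start_date = datetime.datetime(2021, 1, 1)
intervals = [(start_date + datetime.timedelta(minutes=15 * m)).isoformat()
             for m in range(2 * 96)]
loadingMax = pd.DataFrame({"index": intervals, "Operator": 3})

hours = pd.to_datetime(loadingMax["index"]).dt.hour
loadingMax.loc[(hours >= 22) | (hours < 6), "Operator"] = 2
```

The same masking approach extends to equipment downtime windows, e.g. zeroing a fermentor's capacity during planned maintenance.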
Our tasks now need to have some additional data appended. We need to know how many minutes still need to be scheduled and have their resources accounted for, and then initialize some start and end times within our interval indices as well.
jobsToAdd = TasksDF.copy()
jobsToAdd['TimeAddedToSchedule']=jobsToAdd['Duration']*60
jobsToAdd['Start'] = start_date
jobsToAdd['End'] = start_date

We have now prepared all of our data to be processed. Now comes the challenging part: processing our various constraints in resources and dependencies and scheduling our tasks.
Our algorithm will operate in this order:
- Load row by row of loading matrix
- Load task by task of jobs to add tasks
- Check if the current job needs to be added still; if it does, proceed
- Check if resources are available in the current time interval of the loading matrix; if they are, proceed
- Check if the dependencies have been scheduled yet and do not end in the next interval; if they are complete, proceed
- If this is the first interval being scheduled, take the timestamp of the start time
- Deduct time remaining to be scheduled and allocate resources
- If there is no more time remaining to be scheduled, take the timestamp of the end time
That’s essentially it. To dive into the code, please copy the block and paste it into a development environment such as Atom or Visual Studio Code to see the syntax highlighted.
for i in range(len(loading.index)): # Step through the loading schedule interval by interval
    print(str(round(i/len(loading.index)*100,2))+'%')
    for j in range(len(jobsToAdd.index)): # Step through the list of jobs, job by job
        if jobsToAdd.loc[j,'TimeAddedToSchedule'] > 0: # Continue only if the job still needs to be scheduled
            available = True
            for resource in jobsToAdd.loc[j,'ResourcesRequired']: # Continue only if every required resource is available
                if loading.loc[i,resource] + jobsToAdd.loc[j,'ResourcesRequired'][resource] > loadingMax.loc[i,resource]:
                    available = False
            if available:
                dependenciesSatisfied = True
                if jobsToAdd.loc[j,'Dependencies'][0] == 0: # Skip checking dependencies if there are none
                    pass
                else:
                    for dependency in jobsToAdd.loc[j,'Dependencies']: # Check each of the task's dependencies
                        if jobsToAdd.loc[jobsToAdd['ID'] == dependency,'TimeAddedToSchedule'].item() > 0:
                            dependenciesSatisfied = False # Dependency is not fully scheduled yet
                        if jobsToAdd.loc[jobsToAdd['ID'] == dependency,'End'].item() == datetime.datetime.strptime(loading.loc[i,'index'],'%Y-%m-%dT%H:%M:%S') + datetime.timedelta(minutes = 15):
                            dependenciesSatisfied = False # Dependency finishes in this interval, so this task cannot start until the next one
                if dependenciesSatisfied:
                    if jobsToAdd.loc[j,'TimeAddedToSchedule'] == jobsToAdd.loc[j,'Duration']*60: # First interval scheduled: record the start time
                        jobsToAdd.loc[j,'Start'] = datetime.datetime.strptime(loading.loc[i,'index'],'%Y-%m-%dT%H:%M:%S')
                    for resource in jobsToAdd.loc[j,'ResourcesRequired']: # Allocate resources
                        loading.loc[i,resource] = loading.loc[i,resource] + jobsToAdd.loc[j,'ResourcesRequired'][resource]
                    jobsToAdd.loc[j,'TimeAddedToSchedule'] -= 15 # Reduce the remaining time to schedule
                    if jobsToAdd.loc[j,'TimeAddedToSchedule'] == 0: # Fully scheduled: record the end time
                        jobsToAdd.loc[j,'End'] = datetime.datetime.strptime(loading.loc[i,'index'],'%Y-%m-%dT%H:%M:%S') + datetime.timedelta(minutes = 15)
Running our algorithm should only take a second or two with our limited number of tasks to schedule. As we can imagine, it could take significantly longer with more tasks to schedule, or over a larger period of intervals.
Following completion of our algorithm run, we can see that our jobs-to-add data frame is now complete and the placeholder timestamps have been replaced. No time remains in need of scheduling.

Upon examination of our loading data frame, we can also see that our resources have been allocated at or below the capacities of the maximum loading data frame.

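Those two checks can be rolled into a small validation helper; the frames below are illustrative stand-ins, since in a real run they would come from the scheduling loop above:

```python
import pandas as pd

# Stand-in frames for illustration; in practice these come from the run above.
CapacitiesDict = {"Operator": 3, "Centrifuge": 3}
loading = pd.DataFrame({"Operator": [3, 2, 0], "Centrifuge": [0, 2, 2]})
loadingMax = pd.DataFrame({"Operator": [3, 3, 3], "Centrifuge": [3, 3, 3]})
jobsToAdd = pd.DataFrame({"TimeAddedToSchedule": [0, 0, 0]})

def schedule_is_valid(jobs, load, load_max, capacities):
    """True if every job is fully scheduled and no interval exceeds capacity."""
    fully_scheduled = (jobs["TimeAddedToSchedule"] == 0).all()
    within_capacity = all((load[r] <= load_max[r]).all() for r in capacities)
    return bool(fully_scheduled and within_capacity)

print(schedule_is_valid(jobsToAdd, loading, loadingMax, CapacitiesDict))  # True
```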
While our schedule is now complete, the data is hardly in a form easily presentable to audiences whether on the shop floor or to upper management. To better illustrate our timelines, let’s build a Gantt chart with Plotly.
To start, we need to configure our data into a form readable by Plotly’s figure factory.
x = jobsToAdd[['Name','Start','End','Process']].copy()
x = x.rename(columns={'Name':'Task','Process':'Resource','End':'Finish'})

# Configure data for data frame formatting
df = []
for r in range(len(x.index)):
    df.append(dict(Task=x['Task'][r],Start=x['Start'][r],Finish=x['Finish'][r],Resource=x['Resource'][r]))
# Assign color palette randomly for a dynamic number of resource categories
r = lambda: random.randint(0,255)
colors = ['#%02X%02X%02X' % (r(),r(),r())]
for i in range(len(x.Resource.unique().tolist())):
    colors.append('#%02X%02X%02X' % (r(),r(),r()))
fig = ff.create_gantt(df, colors=colors, index_col='Resource', show_colorbar=True, group_tasks=True)
fig.show()
We can see our interactive Gantt chart rendered below.

This production schedule can be exported wherever we need it, and we can select on-screen areas to dive deeper into. Visually, we can now see that fermenting in our large fermentor and preparing filters for purification happen at the same time, because we have enough operators available and purification requires both of those tasks to be complete before proceeding. Preparing vials and adding adjuvants, however, cannot proceed at the same time due to a shortage of available operators.
What if we want to gather some statistics for our data, such as resource utilization? All of this data is captured in our loading data frame and is available for reporting.
loading.describe()

Using the Pandas describe() function, we can see the maximum utilization of each resource. This is extremely useful for determining the right amount of resources needed to get the job done. We see we have, at least for a short period, maxed out our number of available operators. If we had four available, we would be able to add adjuvants and prepare vials at the same time and complete our process earlier.
fig = px.line(loading, x="index", y="Operator")
fig.show()

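Beyond describe(), peak and average utilization as a fraction of capacity are easy to derive from the loading frame. A sketch with stand-in data (in practice the loading frame and capacities come from the scheduling run above):

```python
import pandas as pd

# Stand-in loading data: four intervals for two resources.
capacities = {"Operator": 3, "Centrifuge": 3}
loading = pd.DataFrame({"Operator": [2, 3, 1, 0], "Centrifuge": [0, 2, 2, 0]})

# Utilization as a fraction of capacity: peak shows bottlenecks,
# average shows idle capacity that might be reassigned.
utilization = pd.DataFrame({
    "Peak": {r: loading[r].max() / cap for r, cap in capacities.items()},
    "Average": {r: loading[r].mean() / cap for r, cap in capacities.items()},
})
print(utilization)
```

A peak of 1.0 flags a resource that is fully booked at some point, i.e. the operators in our case study.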
The case study we worked through here is relatively small in scope compared to real manufacturing-floor batches, which can contain hundreds of tasks and dozens of resources to load. This data can add up quickly and can vary from industry to industry. However, our code is scalable and can handle these dependencies, albeit at the cost of more extensive processing time. Areas for improving our algorithm include capabilities seen in other project management suites, such as delayed starts, inventory tracking and batching, and adding and tracking delays. We can also supplement our data at a later point with further requirements, or even bills of materials and costs, to develop secondary objective functions and for further visualization. Because we developed our visuals within Plotly, we can also use enterprise data hosting or build our own Dash dashboards.
Please let me know what you think, and feel free to reach out to me on LinkedIn with feedback, to ask questions, or to see how we can bring these tools and mindsets to your organization! Check out some of my other articles here on data analysis and visualization! Additionally, if you’re curious about the hardware and tools I use to crunch numbers check out my website at www.willkeefe.com