
Why measuring time is important
We could go on and cite hundreds of arguments and quotes showing why the act of measurement is important in science and technology. Here is a powerful one,
One accurate measurement is worth a thousand expert opinions. – Grace Hopper
And according to Lord Kelvin,
If you cannot measure it, you cannot improve it.
We want to do productive data science. That means we want to improve the efficiency of our Machine Learning (ML) training and inference. Naturally, we need to measure the time it takes to fit a particular ML model to the data.
Perhaps, we want to experiment with various models and measure how they perform on the same test data. Or, we may want to vary the hyperparameters of a model and see how the training time and performance vary.
In any case, we have to measure the execution time of the training function for multiple runs. That is because we may want to perform a systematic study of how the ML model tuning improved the execution time and performance.
Or, we may be interested in jointly optimizing the metrics of time and performance (e.g. an overly complex model may give only slightly better performance but run much slower).
Let me show you a simple approach to this kind of logging using plain Python code.
Timing decorator and functools
The majority of data scientists use the Jupyter notebook for their exploratory work, and it provides magic functions like %timeit and %%timeit for measuring the execution time of a block of code. However, these are good for a single run in a notebook cell but not very useful for flexibly recording timings over multiple runs.
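For reference, the notebook usage looks roughly like this; %timeit is a line magic, while %%timeit is a cell magic that must sit on the first line of its cell (shown only for comparison),

%timeit sum(range(1000))   # line magic: times one statement over many automatic runs

%%timeit
total = 0
for i in range(1000):
    total += i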
A timing decorator
Enter decorators and the built-in Python module functools.
For brevity, we won’t discuss the nuts and bolts of these utilities. However, it is sufficient to say that they take advantage of the fact that functions are first-class objects in Python and one can wrap around a function to extract useful information about its execution along with running it normally.
We can write a timing decorator to measure the function execution time and put that to use. Here is the code to construct such a decorator,
from functools import wraps
import time

def timing(func):
    @wraps(func)
    def wrap(*args, **kw):
        ts = time.time()
        result = func(*args, **kw)
        te = time.time()
        tdelta = round(1000*(te-ts), 3)
        print(f"Function '{func.__name__}' took {tdelta} milliseconds to run")
        return result
    return wrap
Note the use of *args and **kw, which allow any general set of arguments and keywords to be passed along to the underlying function func. The main purpose of this code is to measure the time delta (tdelta = round(1000*(te-ts),3)), i.e. the difference between the timestamps taken just before and just after the function finishes execution. Because the execution can be pretty fast, we convert it to milliseconds by multiplying by 1000.
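As a side note (my suggestion, not part of the original code), time.perf_counter() is generally preferred over time.time() for measuring short intervals because it uses the highest-resolution clock available. A quick standalone illustration,

import time

# Same pattern as inside the decorator, but with a higher-resolution clock
ts = time.perf_counter()
sum(range(1_000_000))          # any code we want to time
te = time.perf_counter()
print(f"Elapsed: {round(1000*(te - ts), 3)} milliseconds")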
Now comes the part where we use it on a simple demo function.
@timing
def list_length(a):
    if isinstance(a, list):
        time.sleep(0.1)
        s = len(a)
        return s
    else:
        print("Argument is not a list")
It’s a function that simply measures the length of a given list. It also contains a time.sleep(0.1) call, just to force a delay for demo purposes. Without it, the execution would be so fast that I could not show you any meaningful result.
We test it,
list_length([1,2,3])
>> Function 'list_length' took 111.291 milliseconds to run
>> 3
Note how cool this output is! The first printed statement comes from the wrap function, and the second result of 3 comes from the list_length function. They are interlinked by the wrapper/decorator mechanism. The timing of 111.291 milliseconds is, of course, due to the delay we put in and will vary slightly from system to system or run to run.
If we test it using a single number instead of passing a list,
list_length(5)
>> Argument is not a list
>> Function 'list_length' took 0.0 milliseconds to run
In this case, the conditional block in the list_length function skips the time.sleep(0.1) part and therefore returns almost instantaneously, recording a near-zero execution time.
Returning only the execution time
Now, the code above was for demo only and should not be used in practice because it does not, in fact, return the execution time, but only prints it. Here is a slightly modified version,
def time_return(func):
    @wraps(func)
    def wrap(*args, **kw):
        ts = time.time()
        result = func(*args, **kw)
        te = time.time()
        tdelta = round(1000*(te-ts), 3)
        return tdelta
    return wrap
This code has no print statement and returns the tdelta measurement explicitly.
We can wrap another demo function with it,
import numpy as np

@time_return
def numpy_matmul(a, b):
    return np.matmul(a, b)
So, it will measure the time it takes to compute the matrix multiplication of two given NumPy arrays.
Let’s test it.
SIZE = 1000
a = np.random.beta(1.0,2.0,size=(SIZE,SIZE))
b = np.random.beta(1.0,2.0,size=(SIZE,SIZE))
numpy_matmul(a,b)
>> 16.48
So, it takes 16.48 milliseconds to compute the matrix multiplication of two NumPy matrices of random numbers (drawn from a Beta distribution) of size (1000,1000).
Run it again with a different SIZE parameter.
SIZE = 2000
a = np.random.beta(1.0,2.0,size=(SIZE,SIZE))
b = np.random.beta(1.0,2.0,size=(SIZE,SIZE))
numpy_matmul(a,b)
>> 111.301
So, now it takes 111.301 milliseconds for matrices of double the size. We learn that matrix multiplication time does not scale linearly 🙂
But the main advantage of doing all of this is that now we can write a compact loop with this function and observe how the execution time varies with input size,
SIZE = [500, 1000, 2000, 3000, 4000, 5000]
for s in SIZE:
    a = np.random.beta(1.0, 2.0, size=(s, s))
    b = np.random.beta(1.0, 2.0, size=(s, s))
    t = numpy_matmul(a, b)
    print(f"Matrix multiplication of size ({s}x{s}) took {t} milliseconds")
On my computer, I get the following result,
Matrix multiplication of size (500x500) took 3.0 milliseconds
Matrix multiplication of size (1000x1000) took 17.031 milliseconds
Matrix multiplication of size (2000x2000) took 111.501 milliseconds
Matrix multiplication of size (3000x3000) took 359.307 milliseconds
Matrix multiplication of size (4000x4000) took 835.614 milliseconds
Matrix multiplication of size (5000x5000) took 1611.042 milliseconds
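As a side note (not part of the original article), we could roughly quantify that scaling from the timings just listed with a log-log fit, reusing the np import from earlier,

sizes = np.array([500, 1000, 2000, 3000, 4000, 5000])
times_ms = np.array([3.0, 17.031, 111.501, 359.307, 835.614, 1611.042])

# Fit log(time) = k*log(size) + c; the slope k is the empirical scaling exponent
k, c = np.polyfit(np.log(sizes), np.log(times_ms), 1)
print(f"Empirical scaling exponent: {k:.2f}")   # roughly 2.7 for the numbers above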
Throwing an ML estimator into the mix
Without wasting any more time, we throw an ML estimator (from Scikit-learn) into the mix. Because we are now interested in both the execution speed and the ML performance, we return both the time delta and the ML metric in this case. Here is the code, which is very similar to what we wrote before except that we return a tuple: return (tdelta, result)
def time_estimator(func):
    @wraps(func)
    def wrap(*args, **kw):
        ts = time.time()
        result = func(*args, **kw)
        te = time.time()
        tdelta = round(1000*(te-ts), 3)
        return (tdelta, result)
    return wrap
And this decorates a scoring function that actually fits the given data and returns the accuracy score (on a test/validation set)
from sklearn.model_selection import train_test_split

@time_estimator
def classifier_accuracy(estimator, x, y):
    X_train, X_test, y_train, y_test = train_test_split(x, y,
                                                        test_size=0.33,
                                                        random_state=42)
    estimator.fit(X_train, y_train)
    score = estimator.score(X_test, y_test)
    return round(score, 3)
We make some synthetic data and instantiate a logistic regression estimator.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegressionCV

data = make_classification(n_samples=1000, n_features=20, n_informative=20,
                           n_redundant=0, flip_y=0.05, class_sep=1.5)
x, y = data[0], data[1]
log_model = LogisticRegressionCV()
When we run the classifier_accuracy function, we get both the execution time and the accuracy score,
classifier_accuracy(log_model,x,y)
>> (312.083, 0.836)
The first number is the execution time, i.e. 312.083 milliseconds, and the second one is the accuracy score of 0.836.
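A single run can be noisy, so before drawing conclusions we might average over a few repetitions. Here is a minimal sketch (my own, not from the original article) reusing the objects defined above,

# Average timing (and score) over a few repeated runs
n_runs = 5
times, scores = [], []
for _ in range(n_runs):
    t, acc = classifier_accuracy(log_model, x, y)
    times.append(t)
    scores.append(acc)
print(f"Mean time: {np.mean(times):.1f} ms, mean accuracy: {np.mean(scores):.3f}")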
Now, we have the means to compute both the timing and performance metric of a general ML estimator with arbitrary input.
Let’s put this to use.
Change the data and record execution time
We can run the estimator for various data sizes and record the performance and execution time with a simple loop.
SIZE = [1000 + 500*i for i in range(21)]
log_model = LogisticRegressionCV()
model_time, model_acc = [], []
for s in SIZE:
    data = make_classification(n_samples=s, n_features=20, n_informative=20,
                               n_redundant=0, flip_y=0.05, class_sep=1.5)
    x, y = data[0], data[1]
    m_time, m_acc = classifier_accuracy(log_model, x, y)
    model_time.append(m_time)
    model_acc.append(m_acc)
The key line of code is the decorated function classifier_accuracy, which returns a tuple of execution time and accuracy score.
A simple plotting code gives us the following,
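The plotting code itself is not shown in the article; a minimal Matplotlib sketch (my own, assuming the SIZE, model_acc, and model_time lists from the loop above) could look like this,

import matplotlib.pyplot as plt

# Accuracy and training time as functions of the dataset size
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))
ax1.plot(SIZE, model_acc, 'o-')
ax1.set_xlabel("Dataset size")
ax1.set_ylabel("Accuracy score")
ax2.plot(SIZE, model_time, 'o-')
ax2.set_xlabel("Dataset size")
ax2.set_ylabel("Training time (milliseconds)")
plt.tight_layout()
plt.show()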

So, getting more data makes the ML training slower but does not really improve the accuracy. That is because of the argument flip_y=0.05 in the synthetic data generation, which literally flips the labels of 5% of the data and thus fixes the base noise floor at 0.05. This is why the majority of the accuracy scores hover around 0.95 or below and cannot exceed that.
Therefore, the insight from this study is that we can get away with a small dataset and thereby reduce the execution time to save computational cost. This is how we optimize the cost-performance of an ML training exercise.
Play with the estimator to optimize the cost-performance
Let’s suppose we are using a Random Forest estimator for this problem and we want to optimize the accuracy and execution time together by varying the number of trees in the model.
Similar code can be written as before.
from sklearn.ensemble import RandomForestClassifier

num_trees = [5*x for x in range(1, 21)]
model_time, model_acc = [], []
data = make_classification(n_samples=1000, n_features=20,
                           n_informative=20, n_redundant=0,
                           flip_y=0.05, class_sep=1.0)
x, y = data[0], data[1]
for n in num_trees:
    rf_model = RandomForestClassifier(n_estimators=n)
    m_time, m_acc = classifier_accuracy(rf_model, x, y)
    model_time.append(m_time)
    model_acc.append(m_acc)
Now, when we plot the accuracy and execution time, we get,

Clearly, we can determine an optimum number of trees. For example, we can create a cost function as follows,
# convert the Python lists to NumPy arrays so the arithmetic is element-wise
model_opt = np.array(model_acc) + 1/np.array(model_time)
This captures the intent that we want to maximize accuracy while minimizing the execution time.
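With the lists converted to NumPy arrays, picking the optimum is then a one-liner (a sketch, reusing num_trees and model_opt from above),

# Number of trees that maximizes the combined accuracy/speed objective
best_idx = int(np.argmax(model_opt))
print(f"Optimum number of trees: {num_trees[best_idx]}")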
If we plot this, we get,

Clearly, we can choose ~45 trees for this model.
The boilerplate code is here in my GitHub repo.
Summary
We described how to use Python decorators and the functools module to create a utility decorator for timing general Python functions, and ML estimators in particular. We also demonstrated the use of such a wrapper/decorator for experiments with data or models, and how to utilize the idea for ML model optimization.
This kind of timing measurement is commonly found in the internals of Auto-ML packages, where many experiments are run with ML models and the best one is chosen. Data scientists should find this kind of measurement tool useful for jointly optimizing computational resources and performance in their daily tasks, and should deploy it regularly.