
Neural Networks for Time-Series Imputation: Tackling Missing Data

Part 3: Discover how a simple Keras sequential model can be effective

Source: DALL-E.

One of the common problems in time-series analysis is missing data.

As we have seen in Part 1, simple imputation techniques or regression-based models like linear regression and decision trees can get us a long way.

But what if we need to handle more subtle patterns and capture fine-grained fluctuations in complex time-series data?

In this article, we will explore how a Neural Network (NN) can be used to impute missing values.

The strength of NNs lies in their ability to capture nonlinear patterns and interactions in data. Although NNs are usually computationally expensive, they can offer a very effective way to impute missing time-series data in cases where simpler models fail.

We will work with the same dataset as in Part 1 and Part 2: the mock energy production dataset with roughly 10% of its values missing, introduced at random.

Don’t miss Part 1 of this series:

Missing Data in Time-Series: Machine Learning Techniques

And Part 2, where we employ KNN:

Missing Data in Time-Series? Machine Learning Techniques (Part 2)


Hello there!

My name is Sara Nóbrega, and I am a Data Scientist specializing in AI Engineering. I hold a Master’s degree in Physics and I later transitioned into the exciting world of Data Science.

I write about data science, artificial intelligence, and data science career advice. Make sure to follow me and subscribe to receive updates when the next article is published!

Sara’s Data Science Free Resources


Contents

  1. Data Note: Mock Energy Production Dataset
  2. Why and When to Use Non-Linear Machine Learning for Imputation?
  3. Neural Networks for Time-Series Imputation
     3.1 Statistical Comparison
     3.1.1 Autocorrelation, Seasonal Decomposition and Residual Analysis

  4. Potential Limitations of Neural Networks
  5. When to Avoid NN for Time-Series Imputation

Data Note: Mock Energy Production Dataset

In Part 1, we used a mock dataset of energy production at 10-minute intervals between January 1, 2023, and March 1, 2023. To make the case more realistic, we randomly marked roughly 10% of the data points as missing, to be imputed later.

import pandas as pd
import numpy as np
from datetime import datetime
import matplotlib.pyplot as plt

# Generate the mock energy production data
start_date = datetime(2023, 1, 1)
end_date = datetime(2023, 3, 1)
datetime_index = pd.date_range(start=start_date, end=end_date, freq='10T')

# Create energy production values with day-night cycles
np.random.seed(42)
base_energy = []
for dt in datetime_index:
    hour = dt.hour
    if 6 <= hour <= 18:
        energy = np.random.normal(loc=300, scale=30)
    else:
        energy = np.random.normal(loc=50, scale=15)
    base_energy.append(energy)

energy_production = pd.Series(base_energy)

# Introduce missing values
num_missing = int(0.1 * len(energy_production))
missing_indices = np.random.choice(len(energy_production), num_missing, replace=False)
energy_production.iloc[missing_indices] = np.nan

mock_energy_data_with_missing = pd.DataFrame({
    'Datetime': datetime_index,
    'Energy_Production': energy_production
})

# Reset index for easier handling
data_with_index = mock_energy_data_with_missing.reset_index()
data_with_index['Time_Index'] = np.arange(len(data_with_index))  # Add time-based index

plt.figure(figsize=(14, 7))
plt.plot(mock_energy_data_with_missing['Datetime'], mock_energy_data_with_missing['Energy_Production'], 
         label='Energy Production (With Missing)', color='blue', alpha=0.7)
plt.scatter(mock_energy_data_with_missing['Datetime'], mock_energy_data_with_missing['Energy_Production'], 
            c=mock_energy_data_with_missing['Energy_Production'].isna(), cmap='coolwarm', 
            label='Missing Values', s=10)  # Reduced size of the markers
plt.title('Mock Energy Production Data with Missing Values (10-Minute Intervals)')
plt.xlabel('Datetime')
plt.ylabel('Energy Production')
plt.legend()
plt.grid(True)
plt.show()
Figure 1: Mock Energy Production Data with Missing values. | Image by author.

Working with time-series data? Then you need key techniques to master its analysis:

5 Must-Know Techniques for Mastering Time-Series Analysis


Neural Networks for Time-Series Imputation

A simple feedforward neural network will be implemented to predict and impute the missing values. The network takes cyclical encodings of the hour and day of week as input features, which lets it learn the daily and weekly patterns in the series.

Steps:

  1. Feature Engineering: Build features from which the Neural Network can learn the temporal patterns.
  2. Data Splitting: Prepare the training data, excluding the rows with missing values.
  3. Model Training: Train a feedforward NN to predict energy production.
  4. Imputation: Use the trained model to fill in the missing values.

Let’s begin:

import tensorflow as tf
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from tensorflow.keras.callbacks import EarlyStopping
import random

# Set seed for reproducibility
SEED = 42
random.seed(SEED)
np.random.seed(SEED)
tf.random.set_seed(SEED)

# Step 1: Feature Engineering
data_nn = data_with_index.copy()

# Extract hour, day of week, and encode cyclical features
data_nn['Hour'] = data_nn['Datetime'].dt.hour
data_nn['DayOfWeek'] = data_nn['Datetime'].dt.dayofweek

# Cyclical encoding for 'Hour'
data_nn['Hour_Sin'] = np.sin(2 * np.pi * data_nn['Hour'] / 24)
data_nn['Hour_Cos'] = np.cos(2 * np.pi * data_nn['Hour'] / 24)

# Cyclical encoding for 'DayOfWeek'
data_nn['Day_Sin'] = np.sin(2 * np.pi * data_nn['DayOfWeek'] / 7)
data_nn['Day_Cos'] = np.cos(2 * np.pi * data_nn['DayOfWeek'] / 7)

# Step 2: Prepare Features and Target
feature_columns = ['Hour_Sin', 'Hour_Cos', 'Day_Sin', 'Day_Cos']
data_nn = data_nn.dropna()  # Remove rows with missing target values

features = data_nn[feature_columns]
target = data_nn['Energy_Production']

# Standardize the features
scaler = StandardScaler()
features_scaled = scaler.fit_transform(features)

# Step 3: Split Data
X_train, X_test, y_train, y_test = train_test_split(features_scaled, target, test_size=0.2, random_state=42)

# Step 4: Define Neural Network Model
model = tf.keras.Sequential([
    tf.keras.layers.Dense(64, activation='relu', input_shape=(X_train.shape[1],)),
    tf.keras.layers.Dense(32, activation='relu'),
    tf.keras.layers.Dense(1)  # Single output
])

model.compile(optimizer='adam', loss='mse', metrics=['mae'])

# Step 5: Train the Model
early_stopping = EarlyStopping(monitor='val_loss', patience=5, restore_best_weights=True)
history = model.fit(X_train, y_train, epochs=30, batch_size=32, validation_data=(X_test, y_test), callbacks=[early_stopping])

# Step 6: Imputation
# Predict missing values using the trained model
data_with_missing = data_with_index.copy()

# Extract hour and day of week (redo feature engineering)
data_with_missing['Hour'] = data_with_missing['Datetime'].dt.hour
data_with_missing['DayOfWeek'] = data_with_missing['Datetime'].dt.dayofweek

# Cyclical encoding for 'Hour'
data_with_missing['Hour_Sin'] = np.sin(2 * np.pi * data_with_missing['Hour'] / 24)
data_with_missing['Hour_Cos'] = np.cos(2 * np.pi * data_with_missing['Hour'] / 24)

# Cyclical encoding for 'DayOfWeek'
data_with_missing['Day_Sin'] = np.sin(2 * np.pi * data_with_missing['DayOfWeek'] / 7)
data_with_missing['Day_Cos'] = np.cos(2 * np.pi * data_with_missing['DayOfWeek'] / 7)

# Select missing features for prediction
missing_features = data_with_missing.loc[data_with_missing['Energy_Production'].isna(), feature_columns]
missing_features_scaled = scaler.transform(missing_features)

# Predict missing values
predicted_values = model.predict(missing_features_scaled)
data_with_missing.loc[data_with_missing['Energy_Production'].isna(), 'Energy_Production'] = predicted_values
Epoch 1/30

192/192 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - loss: 47289.5938 - mae: 177.2104 - val_loss: 31623.6465 - val_mae: 141.5553
Epoch 2/30
192/192 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 17872.7188 - mae: 100.2324 - val_loss: 4078.6731 - val_mae: 50.8379
Epoch 3/30
192/192 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 3836.3140 - mae: 49.4058 - val_loss: 3911.0200 - val_mae: 49.2663
Epoch 4/30
192/192 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 3654.9775 - mae: 47.6395 - val_loss: 3804.6765 - val_mae: 48.3873
Epoch 5/30
192/192 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 3534.4617 - mae: 46.6192 - val_loss: 3656.6111 - val_mae: 47.2733
Epoch 6/30
192/192 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 3378.8108 - mae: 45.3782 - val_loss: 3461.5427 - val_mae: 45.8035
Epoch 7/30
192/192 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 3182.7766 - mae: 43.7996 - val_loss: 3222.8591 - val_mae: 43.9219
Epoch 8/30
192/192 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 2943.5549 - mae: 41.8146 - val_loss: 2940.5742 - val_mae: 41.6142
Epoch 9/30
192/192 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 2667.7310 - mae: 39.4235 - val_loss: 2621.4626 - val_mae: 38.9475
Epoch 10/30
192/192 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step - loss: 2357.8921 - mae: 36.7026 - val_loss: 2270.7549 - val_mae: 35.9953
Epoch 11/30
192/192 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step - loss: 2022.3892 - mae: 33.7252 - val_loss: 1904.5137 - val_mae: 32.9176
Epoch 12/30
192/192 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step - loss: 1672.1134 - mae: 30.6480 - val_loss: 1536.0582 - val_mae: 29.7249
Epoch 13/30
192/192 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 1329.0890 - mae: 27.5651 - val_loss: 1227.2889 - val_mae: 26.9015
Epoch 14/30
192/192 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 1055.4308 - mae: 24.8727 - val_loss: 1003.9343 - val_mae: 24.5620
Epoch 15/30
192/192 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 865.8824 - mae: 22.7478 - val_loss: 867.3121 - val_mae: 22.8818
Epoch 16/30
192/192 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 756.9211 - mae: 21.3564 - val_loss: 802.0678 - val_mae: 21.9702
Epoch 17/30
192/192 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 706.9869 - mae: 20.6465 - val_loss: 770.4452 - val_mae: 21.4972
Epoch 18/30
192/192 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 680.2199 - mae: 20.2274 - val_loss: 750.3719 - val_mae: 21.1738
Epoch 19/30
192/192 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step - loss: 662.4779 - mae: 19.9081 - val_loss: 736.6984 - val_mae: 20.9330
Epoch 20/30
192/192 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step - loss: 648.1115 - mae: 19.6183 - val_loss: 725.9727 - val_mae: 20.7376
Epoch 21/30
192/192 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step - loss: 635.9745 - mae: 19.4054 - val_loss: 715.5459 - val_mae: 20.5475
Epoch 22/30
192/192 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 627.1302 - mae: 19.2404 - val_loss: 708.2336 - val_mae: 20.4228
Epoch 23/30
192/192 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 620.2661 - mae: 19.1281 - val_loss: 701.4677 - val_mae: 20.3177
Epoch 24/30
192/192 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 614.5076 - mae: 19.0311 - val_loss: 695.7640 - val_mae: 20.2231
Epoch 25/30
192/192 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 609.0852 - mae: 18.9352 - val_loss: 689.9358 - val_mae: 20.1252
Epoch 26/30
192/192 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step - loss: 603.5502 - mae: 18.8279 - val_loss: 684.2870 - val_mae: 20.0308
Epoch 27/30
192/192 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 598.4345 - mae: 18.7273 - val_loss: 679.2006 - val_mae: 19.9456
Epoch 28/30
192/192 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step - loss: 591.7822 - mae: 18.6190 - val_loss: 675.0480 - val_mae: 19.8816
Epoch 29/30
192/192 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 590.4954 - mae: 18.5747 - val_loss: 671.3280 - val_mae: 19.8238
Epoch 30/30
192/192 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step - loss: 587.6835 - mae: 18.5242 - val_loss: 668.3542 - val_mae: 19.7750
27/27 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step
# Plot the first 200 data points for comparison
plt.figure(figsize=(14, 7))

# Imputed data (NN)
plt.plot(
    data_with_missing['Datetime'][:200],
    data_with_missing['Energy_Production'][:200],
    label='Imputed Data (NN)',
    linestyle='-',
    color='blue',
    alpha=0.8,
)

# Original data with missing values (plotted last to appear in the foreground)
plt.plot(
    mock_energy_data_with_missing['Datetime'][:200],
    mock_energy_data_with_missing['Energy_Production'][:200],
    label='Original Data (With Missing)',
    linestyle='--',
    color='red',
    alpha=0.7,
)

plt.title('Comparison of Original and NN-Imputed Data (First 200 Data Points)')
plt.xlabel('Datetime')
plt.ylabel('Energy Production')
plt.legend()
plt.grid(True)
plt.show()
A comparative line plot showing original energy production data (with missing values) in red and NN-imputed data in blue | Image by Author.

Here, I have chosen to present the first 200 points of the imputed data.

Notice how the gaps in the original data (the red dashed line) are smoothly filled in by the blue line of the imputed data: missing and existing points blend into one another, and the imputed values fit well with the observed ones.

This is a good sign: this kind of close-up comparison helps validate the quality of the imputation, especially over smaller segments of the data.
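Before moving to distribution-level checks, it is also worth quantifying the model's accuracy on the held-out test split created during training. A minimal sketch, assuming the model, X_test, and y_test objects from the training code above are still in scope:

from sklearn.metrics import mean_absolute_error, mean_squared_error

# Predict on the held-out test split and compare against the true values
y_pred = model.predict(X_test).flatten()

mae = mean_absolute_error(y_test, y_pred)
rmse = np.sqrt(mean_squared_error(y_test, y_pred))

print(f"Test MAE:  {mae:.2f}")
print(f"Test RMSE: {rmse:.2f}")

Because the test rows were never used for fitting, these numbers give a rough idea of how far the imputed values may deviate from the (unknown) true values.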

Statistical Comparison

Let’s compare their statistics:

original_stats = mock_energy_data_with_missing['Energy_Production'].describe()
nn_stats = data_with_missing['Energy_Production'].describe()

stats_comparison_nn = pd.DataFrame({
    'Metric': original_stats.index,
    'Original Data': original_stats.values,
    'Imputed Data (NN)': nn_stats.values
})
stats_comparison_nn
Statistical Comparison | Image by Author

The mean and standard deviation of the imputed data are very close to those of the original, which indicates that the central tendency and variability are preserved.

The range of values is also maintained, suggesting that no unrealistic extremes were introduced.

The resemblance of these statistics means the distribution of the NN-imputed data is very similar to that of the original, so the imputation does not compromise the integrity of the dataset.
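As an optional extra check (not part of the original analysis), we can compare the distribution of the imputed points alone against the observed points, for example with a two-sample Kolmogorov-Smirnov test. A sketch, assuming both DataFrames share the same row order as created above:

from scipy.stats import ks_2samp

# Positions that were originally missing
missing_mask = mock_energy_data_with_missing['Energy_Production'].isna()

observed_values = mock_energy_data_with_missing.loc[~missing_mask, 'Energy_Production']
imputed_values = data_with_missing.loc[missing_mask.values, 'Energy_Production']

ks_stat, p_value = ks_2samp(observed_values, imputed_values)
print(f"KS statistic: {ks_stat:.3f}, p-value: {p_value:.3f}")

Keep in mind that model-based imputations tend to be smoother (less noisy) than raw observations, so some difference in spread is expected even when the imputation is good.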

Autocorrelation, Seasonal Decomposition and Residual Analysis

We will use the autocorrelation function (ACF) and seasonal decomposition to evaluate whether the imputed data preserves the underlying seasonal patterns, trends, and autocorrelations.

ACF Comparison Function

import statsmodels.api as sm
from statsmodels.tsa.seasonal import seasonal_decompose

def plot_acf_comparison(original_series, imputed_series, lags=50):
    plt.figure(figsize=(14, 5))

    # Original Data ACF (using linear interpolation to handle missing values)
    original_interpolated = original_series.interpolate(method='linear')
    plt.subplot(1, 2, 1)
    sm.graphics.tsa.plot_acf(original_interpolated, lags=lags, ax=plt.gca(), 
                             title="ACF of Original Data (Interpolated)")
    plt.grid(True)

    # Imputed Data ACF
    plt.subplot(1, 2, 2)
    sm.graphics.tsa.plot_acf(imputed_series, lags=lags, ax=plt.gca(), 
                             title="ACF of NN-Imputed Data")
    plt.grid(True)

    plt.tight_layout()
    plt.show()

# Perform ACF Comparison
plot_acf_comparison(
    mock_energy_data_with_missing['Energy_Production'], 
    data_with_missing['Energy_Production']
)
Side-by-side ACF plots comparing temporal dependencies in the original (interpolated) and NN-imputed datasets. Both charts show similar periodic patterns, validating the preservation of temporal dependencies. | Image by author

We can see that both plots show very similar patterns, confirming that the imputation indeed preserved temporal dependencies!

We can conclude from these plots that the NN-imputed dataset captures the periodic nature of the original data well.
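To put a number on this visual agreement, we can also compute the ACF values directly and compare the two curves. A small sketch, re-using the linear interpolation of the original series:

from statsmodels.tsa.stattools import acf

# Recompute the interpolated original series used for the ACF plot
original_interp = mock_energy_data_with_missing['Energy_Production'].interpolate(method='linear')

acf_original = acf(original_interp, nlags=50)
acf_imputed = acf(data_with_missing['Energy_Production'], nlags=50)

# Maximum absolute difference between the two ACF curves over the first 50 lags
max_acf_diff = np.max(np.abs(acf_original - acf_imputed))
print(f"Maximum absolute ACF difference: {max_acf_diff:.4f}")

Next, let's decompose both series into trend, seasonal, and residual components: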

# Define the seasonal period
# Since data is at 10-minute intervals, and there are 144 intervals in a day (144 * 10 minutes = 1440 minutes = 24 hours)
seasonal_period = 144

# Handle missing values in original data by linear interpolation for decomposition
original_interpolated = mock_energy_data_with_missing['Energy_Production'].interpolate(method='linear')

# Perform seasonal decomposition (classical, additive)
original_decompose = seasonal_decompose(original_interpolated, model='additive', period=seasonal_period)
nn_decompose = seasonal_decompose(data_with_missing['Energy_Production'], model='additive', period=seasonal_period)

# Visualization of the decomposition components
# Plot Trend Comparison
plt.figure(figsize=(14, 5))
plt.plot(nn_decompose.trend, label='NN-Imputed Trend', color='green')
plt.plot(original_decompose.trend, label='Original Interpolated Trend', color='orange', alpha=0.7)
plt.title('Trend Comparison: Original vs. NN-Imputed Data')
plt.xlabel('Datetime')
plt.ylabel('Energy Production Trend')
plt.legend()
plt.grid(True)
plt.show()

# Plot Seasonal Comparison
plt.figure(figsize=(14, 5))
plt.plot(original_decompose.seasonal, label='Original Interpolated Seasonality', color='orange', alpha=0.7)
plt.plot(nn_decompose.seasonal, label='NN-Imputed Seasonality', color='green')
plt.title('Seasonality Comparison: Original vs. NN-Imputed Data')
plt.xlabel('Datetime')
plt.xlim(0, 1000)
plt.ylabel('Energy Production Seasonality')
plt.legend()
plt.grid(True)
plt.show()

# Optional: Plot Residuals Comparison
plt.figure(figsize=(14, 5))
plt.plot(original_decompose.resid, label='Original Interpolated Residuals', color='orange', alpha=0.7)
plt.plot(nn_decompose.resid, label='NN-Imputed Residuals', color='green')

plt.title('Residuals Comparison: Original vs. NN-Imputed Data')
plt.xlabel('Datetime')
plt.ylabel('Residuals')
plt.legend()
plt.grid(True)
plt.show()
A trend analysis plot comparing the seasonal decomposition trends of the original (interpolated) and NN-imputed datasets. The chart shows minimal differences, indicating the imputed data maintains long-term trends. | Image by author

The trends of the original interpolated data and the NN-imputed data are very similar. Both follow a smooth pattern, which indicates that the imputation process preserved the long-term trends in the data quite well!

This confirms that the Neural Network has learned the underlying dynamics of the time series without introducing significant distortion into the trend component.
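If you want a number to back up the visual impression, the two trend components can be compared directly, for instance via their mean absolute difference. A quick sketch using the decomposition objects computed above:

# Compare the trend components numerically
# (dropna removes the NaNs that the moving-average trend produces at the edges)
trend_diff = (original_decompose.trend - nn_decompose.trend).dropna()
print(f"Mean absolute trend difference: {trend_diff.abs().mean():.2f}")
print(f"Max absolute trend difference:  {trend_diff.abs().max():.2f}")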

Seasonal Component plot comparing the original and imputed data. | Image by author

Finally, the seasonal components of the original interpolated and NN-imputed data overlap very closely. The periodic nature of this energy production dataset (the day-night cycle) was preserved.

This suggests that the model captured the inherent cyclicality of the data quite well, which is important for accurately preserving seasonality in downstream forecasting tasks.

Residual Analysis

Figure 7: Dual visualizations showcasing seasonal components and residuals for original (interpolated) and NN-imputed datasets. Seasonal patterns are nearly identical, while residuals demonstrate consistent noise levels across both datasets. | Image by author

Similar white-noise patterns are visible in the residuals of both datasets, so the NN imputation does not introduce notable anomalies. The small deviations visible in the original series stem from the missing values (filled by linear interpolation for the decomposition).
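A simple way to summarise this numerically is to compare the spread of the two residual components; similar standard deviations suggest the imputation has not injected extra noise. A brief sketch using the same decomposition objects:

# Compare the spread of the residual components (pandas .std() skips NaNs)
resid_std_original = original_decompose.resid.std()
resid_std_imputed = nn_decompose.resid.std()

print(f"Residual std (original, interpolated): {resid_std_original:.2f}")
print(f"Residual std (NN-imputed):             {resid_std_imputed:.2f}")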

Potential Limitations of Neural Networks

Let’s start with the obvious: training Neural Networks can be computationally intensive and may demand significant resources (CPU/GPU), especially for larger datasets or deeper architectures.

Neural Networks also struggle with non-stationary time-series data, where trends or seasonality change over time. A model like this one implicitly assumes the learned patterns stay stable unless such changes are handled explicitly.
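A common, if partial, workaround, not applied in this article, is to give the network an explicit notion of time so it can follow slow drifts in the level of the series. A hypothetical sketch that adds a normalized time index to the feature set:

# Hypothetical extension (not used above): add a normalized time index
# so the network can model slow drifts in the level of the series
data_nn['Time_Norm'] = data_nn['Time_Index'] / data_nn['Time_Index'].max()

extended_features = ['Hour_Sin', 'Hour_Cos', 'Day_Sin', 'Day_Cos', 'Time_Norm']
# The rest of the pipeline (scaling, training, imputation) stays the same,
# simply using extended_features instead of feature_columns.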

NNs need adequate and good-quality data to train on. If the dataset contains too many missing values, it can be challenging to learn meaningful patterns, leading to poor imputations.

NNs contain many hyperparameters that require a lot of tuning, including the number of layers/neurons, learning rate, and activation functions.
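Even a small manual search over a few settings makes this concrete. The sketch below is purely illustrative (it was not run for this article) and loops over two learning rates and two hidden-layer sizes:

# Illustrative mini grid search (not run for this article)
results = {}
for lr in [1e-2, 1e-3]:
    for units in [32, 64]:
        candidate = tf.keras.Sequential([
            tf.keras.layers.Dense(units, activation='relu', input_shape=(X_train.shape[1],)),
            tf.keras.layers.Dense(units // 2, activation='relu'),
            tf.keras.layers.Dense(1)
        ])
        candidate.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=lr),
                          loss='mse', metrics=['mae'])
        run = candidate.fit(X_train, y_train, epochs=10, batch_size=32,
                            validation_data=(X_test, y_test), verbose=0)
        results[(lr, units)] = run.history['val_mae'][-1]

best_lr, best_units = min(results, key=results.get)
print(f"Best configuration: learning rate={best_lr}, units={best_units}")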

NNs are usually considered "black-box" models: it is hard to understand how the network arrives at the values it imputes.

NNs are prone to overfitting on small or very noisy datasets: the model may memorize the training data rather than learn patterns that generalize.
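Dropout and weight penalties are the usual first line of defence. A hedged sketch of how the same architecture could be regularized (an alternative configuration, not the one used above):

from tensorflow.keras import regularizers

# Alternative, more regularized architecture (illustrative only)
regularized_model = tf.keras.Sequential([
    tf.keras.layers.Dense(64, activation='relu', input_shape=(X_train.shape[1],),
                          kernel_regularizer=regularizers.l2(1e-4)),
    tf.keras.layers.Dropout(0.2),   # randomly silence 20% of units during training
    tf.keras.layers.Dense(32, activation='relu',
                          kernel_regularizer=regularizers.l2(1e-4)),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(1)
])
regularized_model.compile(optimizer='adam', loss='mse', metrics=['mae'])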

When to Avoid NN for Time-Series Imputation

  • High proportion of missing values, such as >50%, with insufficient training data.
  • Time or resource constraints prohibit model training or hyperparameter tuning.
  • The dataset is small or lacks complex patterns; simpler methods will suffice (see the sketch after this list).
  • Applications require high interpretability of the imputation process.
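For reference, the “simpler method” mentioned in the third point can be as small as a one-liner, for example time-weighted linear interpolation on the datetime-indexed series. A minimal sketch:

# Simple baseline: time-weighted linear interpolation on the datetime index
baseline = (
    mock_energy_data_with_missing
    .set_index('Datetime')['Energy_Production']
    .interpolate(method='time')
)
print(baseline.isna().sum())  # 0, unless the series starts with a missing value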

Conclusion

In this article, we applied a simple NN model to impute missing values in a mock energy production time series.

We saw that the Neural Network imputation preserved the important statistical and temporal properties of the original data, including trends, seasonal patterns, and autocorrelation.

This suggests that an NN model is a solid approach to the missing-value problem in time-series data.

Of course, real datasets bring additional difficulties, most of them related to data quality, the black-box nature of the model, and the computational resources required.

Thank you for reading 😉 . What other imputation methods have worked for you? Let me know in the comments!


If you found value in this post, I’d appreciate your support with a clap. You’re also welcome to follow me on Medium for similar articles!

Book a call with me, ask me a question or send me your resume here:

