
Medical Image Pre-Processing with Python

An overview of preprocessing DICOM images to prepare them for training a model.

CT Images – Image by author

About the Data

In this post, I will explain with simple examples how medical images can be preprocessed to train any artificial intelligence model, and how the data is prepared through each preprocessing stage so the model can give the best possible results.

The data I am going to use is a bunch of 2D brain CT images in DICOM format. You can apply these same operations to your own data to get more efficient results from your model.

Before we start coding, let’s talk about the medical data. First of all, I will explain what CT is.

Computed Tomography (CT) is a scanning technique in which X-rays are sent through the body from different angles and combined by a computer processor to produce cross-sectional images (slices) of bones, blood vessels, and soft tissues in various parts of the body. These images provide more detailed information than regular X-ray images, which makes it possible to detect anomalies in the patient’s bones, vessels, or tissues. As a result, a more precise diagnosis can be made for the patient and the treatment can continue accordingly.

Now let’s talk about what the DICOM format is.

DICOM is an acronym for Digital Imaging and Communications in Medicine. Files in this format are usually saved with a ‘.dcm’ extension. DICOM is both a communication protocol and a file format, which means a single file can store medical images such as ultrasound and MRI scans along with the patient’s information. While PNG or JPG files contain only the image name, date, and pixel data, the DICOM format also includes the patient’s information and the windowing intervals of the image, which we call metadata. In short, it holds much more detailed information about the patient. This format not only keeps all the data together, but also ensures that the information can be transferred between devices that support the DICOM standard. I took the few .dcm images used here from Kaggle.

import pydicom

# Load the DICOM file and print its metadata
file_path = "/Users/esmasert/Desktop/basicsDcm/10095.dcm"
medical_image = pydicom.dcmread(file_path)
print(medical_image)
Reading DICOM file – Image by author
Image by author
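Since the DICOM header carries the metadata mentioned above, you can read individual fields directly from the loaded dataset. Below is a minimal sketch (the exact tags available depend on your file); RescaleIntercept, RescaleSlope, and the pixel array are the ones we will need later.

# Inspect a few header fields and the raw pixel data
# (these tags are standard, but not every DICOM file contains all of them)
print(medical_image.Modality)            # e.g. 'CT'
print(medical_image.RescaleIntercept)    # used for the HU transform below
print(medical_image.RescaleSlope)

image = medical_image.pixel_array        # raw pixel values as a NumPy array
print(image.shape, image.dtype)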

After this basic summary of CT and DICOM, let’s move on to the preprocessing. I used 5 steps during the preprocessing stage of the images.

These steps are: Transforming to HU, Removing Noise, Tilt Correction, Cropping Images, and Padding.

After applying these preprocessing steps to the data, we see that model accuracy increases significantly. I am about to explain these preprocessing methods. To see the code in a clearer format, you can visit this link.

After loading our image data in DICOM format, we will transform it to Hounsfield Units.

Transforming to HU

The Hounsfield Unit (HU) is a relative quantitative measure of radiodensity used by radiologists to interpret computed tomography (CT) images more precisely. The absorption/attenuation coefficient of radiation within a tissue is used during CT reconstruction to produce a grayscale image. A linear transformation maps these coefficients onto the Hounsfield scale, which is displayed as gray tones. Denser tissue, with greater X-ray beam absorption, has positive values and appears bright; less dense tissue, with less X-ray beam absorption, has negative values and appears dark. [1] The Hounsfield unit is named after Sir Godfrey Hounsfield, who was one of the inventors of computed tomography and was awarded the Nobel Prize for it.

We can obtain the HU by using Rescale Intercept and Rescale Slope headers:

def transform_to_hu(medical_image, image):
    # Convert raw pixel values to Hounsfield Units using the DICOM headers
    intercept = medical_image.RescaleIntercept
    slope = medical_image.RescaleSlope
    hu_image = image * slope + intercept

    return hu_image


def window_image(image, window_center, window_width):
    # Clip the HU values to the chosen window range
    img_min = window_center - window_width // 2
    img_max = window_center + window_width // 2
    window_image = image.copy()
    window_image[window_image < img_min] = img_min
    window_image[window_image > img_max] = img_max

    return window_image

If you want to highlight a specific region of the image, you can adjust the windowing accordingly.
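As a quick sketch of how these two functions fit together (the window center of 40 and width of 80 used below are typical brain-window values, the same ones used in the noise-removal step later):

# Load the scan, convert to HU, then apply a brain window
medical_image = pydicom.dcmread(file_path)
image = medical_image.pixel_array

hu_image = transform_to_hu(medical_image, image)
brain_image = window_image(hu_image, 40, 80)  # brain window: center 40 HU, width 80 HU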

Removing Noises

Removing noise is a very important preprocessing stage: after it is applied, the data is cleaner and the relevant structures can be seen more clearly, so the model can be trained better. [2]

import numpy as np
from scipy import ndimage
from skimage import morphology


def remove_noise(file_path, display=False):
    medical_image = pydicom.dcmread(file_path)
    image = medical_image.pixel_array

    hu_image = transform_to_hu(medical_image, image)
    brain_image = window_image(hu_image, 40, 80)  # brain windowing

    # Label the connected components and keep only the largest one (the brain)
    segmentation = morphology.dilation(brain_image, np.ones((1, 1)))
    labels, label_nb = ndimage.label(segmentation)

    label_count = np.bincount(labels.ravel().astype(int))
    label_count[0] = 0  # ignore the background label

    mask = labels == label_count.argmax()

    # Clean up the mask and fill any holes inside it
    mask = morphology.dilation(mask, np.ones((1, 1)))
    mask = ndimage.binary_fill_holes(mask)
    mask = morphology.dilation(mask, np.ones((3, 3)))

    masked_image = mask * brain_image
    return masked_image
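A quick usage sketch for the function above, displaying the masked result (the plotting lines are my own addition, not part of the original function):

import pylab

masked_image = remove_noise(file_path)

pylab.imshow(masked_image, cmap="gray")
pylab.show()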
Image by author

Tilt Correction:

Tilt correction is the alignment of the brain image in a standard orientation. When brain CT images are tilted, they can end up misaligned for medical applications. This matters because, after correction, the model sees all of the data with the same alignment during training. Manually correcting the tilt on large-scale data is time-consuming and expensive, so there is a need for an automatic way of performing tilt correction in preprocessing before the training.

import math

import cv2
import pylab

# Work on the masked image produced by remove_noise()
img = np.uint8(masked_image)
contours, hier = cv2.findContours(img, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)

# Find the biggest contour (c) by area and fit an ellipse to it
c = max(contours, key=cv2.contourArea)
(x, y), (MA, ma), angle = cv2.fitEllipse(c)
cv2.ellipse(img, ((x, y), (MA, ma), angle), color=(0, 255, 0), thickness=2)

# Draw the major axis of the ellipse to visualize the tilt
rmajor = max(MA, ma) / 2
if angle > 90:
    angle -= 90
else:
    angle += 90
xtop = x + math.cos(math.radians(angle)) * rmajor
ytop = y + math.sin(math.radians(angle)) * rmajor
xbot = x + math.cos(math.radians(angle + 180)) * rmajor
ybot = y + math.sin(math.radians(angle + 180)) * rmajor
cv2.line(img, (int(xtop), int(ytop)), (int(xbot), int(ybot)), (0, 255, 0), 3)

pylab.imshow(img)
pylab.show()

# Rotate the image around the ellipse center so the tilt is corrected
M = cv2.getRotationMatrix2D((x, y), angle - 90, 1)  # transformation matrix
img = cv2.warpAffine(img, M, (img.shape[1], img.shape[0]), flags=cv2.INTER_CUBIC)

pylab.imshow(img)
pylab.show()
Image by author
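Since this step needs to run on every image in the dataset, it is convenient to wrap the logic above in a small helper. Here is a sketch of such a wrapper without the visualization; the function name correct_tilt is my own, not part of the original code:

def correct_tilt(image):
    # Fit an ellipse to the largest contour and rotate the image so that
    # the ellipse's major axis becomes vertical (same logic as above)
    img = np.uint8(image)
    contours, hier = cv2.findContours(img, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    c = max(contours, key=cv2.contourArea)
    (x, y), (MA, ma), angle = cv2.fitEllipse(c)

    if angle > 90:
        angle -= 90
    else:
        angle += 90

    M = cv2.getRotationMatrix2D((x, y), angle - 90, 1)
    return cv2.warpAffine(img, M, (img.shape[1], img.shape[0]), flags=cv2.INTER_CUBIC)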

Crop Image and Add Pad:

Cropping the image is needed to place the brain at the center and get rid of the unnecessary parts of the image. Also, the brain may be located in a slightly different position from one scan to another. By cropping the image and then adding padding, we make sure that almost all images place the brain in the same location within the overall frame.

Image by author
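The cropping step itself is not listed above, so here is a minimal sketch of how it can be done: find the bounding box of the non-zero (brain) pixels and keep only that region. The function name crop_image and its exact logic are my own illustration, not taken verbatim from the original code.

def crop_image(image):
    # Find rows and columns that contain brain pixels (non-zero values)
    mask = image != 0
    rows = np.any(mask, axis=1)
    cols = np.any(mask, axis=0)

    # Bounding box of the brain region
    top, bottom = np.where(rows)[0][[0, -1]]
    left, right = np.where(cols)[0][[0, -1]]

    return image[top:bottom + 1, left:right + 1]

The add_pad function below then centers the cropped brain inside a fixed-size frame: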
def add_pad(image, new_height=512, new_width=512):
    height, width = image.shape

    final_image = np.zeros((new_height, new_width))

    pad_left = int((new_width - width) // 2)
    pad_top = int((new_height - height) // 2)

    # Replace the pixels with the image's pixels
    final_image[pad_top:pad_top + height, pad_left:pad_left + width] = image

    return final_image
Image by Author

Here is the result: a clean, corrected, and centered brain image, ready to go into training.
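Putting it all together, the whole preprocessing pipeline for a single DICOM file might look like the sketch below. It assumes the functions defined earlier in this post; correct_tilt and crop_image are the helper sketches introduced above, and the function name preprocess is my own.

def preprocess(file_path, target_size=512):
    # 1-2. Transform to HU, window, and remove noise
    masked_image = remove_noise(file_path)

    # 3. Tilt correction
    rotated_image = correct_tilt(masked_image)

    # 4. Crop to the brain region
    cropped_image = crop_image(rotated_image)

    # 5. Pad back to a fixed size
    final_image = add_pad(cropped_image, target_size, target_size)

    return final_image


processed = preprocess(file_path)
print(processed.shape)  # (512, 512)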

I included the references below. Many thanks to https://vincentblog.xyz/ ! It really helped me understand image processing more deeply.

If you have any suggestion or question please comment below. Thank you very much!

Also, if you would like to support me, you can buy me a coffee, so my brain would keep working.. https://www.buymeacoffee.com/esmasert


References

  1. https://www.ncbi.nlm.nih.gov/books/NBK547721/
  2. https://vincentblog.xyz/posts/medical-images-in-python-computed-tomography
  3. https://link.springer.com/article/10.1007/s10278-020-00400-7
  4. https://www.eyewated.com
