Classification on Hyperspectral Data

A step-by-step tutorial on performing feature reduction and then classifying hyperspectral data using Support Vector Machines

Richa Dutt
Towards Data Science


Introduction

The goal of this tutorial is to apply PCA to hyperspectral data (to learn about PCA, read the article “PCA on Hyperspectral Data”). After reducing the dimensionality of the data with PCA, we classify the different materials in the image using a Support Vector Machine (SVM).

Steps

We are using the Hyperspectral Gulfport Dataset in this tutorial. You can download the data from the following link.

The MUUFL Gulfport data contains a pixel-based ground truth map, produced by manually labeling the pixels in the scene. The following classes were labeled: trees, mostly grass, ground surface, mixed ground surface, dirt and sand, road, water, buildings, shadows of buildings, sidewalk, yellow curb, cloth panels (targets), and unlabeled points.

Step 1: Importing the libraries
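As a sketch of this step, the libraries used in the later steps can be imported in one place (the exact list is an assumption based on what those steps need, not the article's original code):

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy.io import loadmat                  # to read the .mat data file
from sklearn.decomposition import PCA         # Step 6: dimensionality reduction
from sklearn.svm import SVC                   # Steps 5-6: the SVM classifier
from sklearn.model_selection import (
    train_test_split,                         # Step 4: train/test split
    cross_val_score,
    StratifiedKFold,                          # Steps 5-6: K-fold cross-validation
)
from sklearn.metrics import accuracy_score    # evaluating predictions
```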

Step 2: Loading the data

There are 65 bands in the original data.
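Loading can be sketched as follows. A random cube stands in for the MUUFL scene here, since the real data ships as a MATLAB .mat file whose field names vary between releases; the spatial sizes below are placeholders (only the band count of 65 comes from the article):

```python
import numpy as np
# from scipy.io import loadmat  # used in practice to read the .mat file

# Placeholder cube: height x width x bands (H and W are assumptions)
H, W, B = 100, 80, 65
cube = np.random.default_rng(0).random((H, W, B))

# Flatten the spatial dimensions so each row is one pixel's spectrum
X = cube.reshape(-1, B)
print(X.shape)  # (8000, 65)
```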

RGB Image

Step 3: Get rid of unlabeled and some classes data

Remove the unlabeled data points and merge some similar classes into one. For example, water and building shadows are merged into a single class because they have similar spectra. The cloth panel and yellow curb classes are dropped entirely, since they contain too few pixels to provide enough training samples.

Since the ground truth labels start from one, the last line subtracts one from all labels so that they start from zero.
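A minimal sketch of this cleanup, using stand-in arrays; the class indices used below (-1 = unlabeled, 6 = water, 7 = building shadow, 10 = yellow curb, 11 = cloth panels) are assumptions about the MUUFL label encoding:

```python
import numpy as np

# Stand-in pixel matrix and labels; replace with the real flattened data
rng = np.random.default_rng(0)
X = rng.random((2000, 65))
y = rng.choice([-1] + list(range(1, 12)), 2000)

# Drop unlabeled pixels and the two under-represented classes
keep = (y != -1) & (y != 10) & (y != 11)
X, y = X[keep], y[keep]

# Merge building shadow into water (similar spectra)
y[y == 7] = 6

# Labels start at one, so shift them to start at zero, then remap to a
# contiguous range since one class index was merged away
y = y - 1
y = np.searchsorted(np.unique(y), y)
```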

Step 4: Split the data into training and testing
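A sketch of the split; the 70/30 ratio is an assumption, not necessarily the article's setting. Stratifying keeps the class proportions the same in both subsets, which matters for the rare classes:

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Stand-in arrays; in the tutorial X and y come from Step 3
rng = np.random.default_rng(0)
X = rng.random((1000, 65))
y = rng.integers(0, 9, 1000)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42, stratify=y
)
print(X_train.shape, X_test.shape)  # (700, 65) (300, 65)
```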

Step 5: Applying SVM classifier on the Original Data

I have applied K-fold cross-validation on the training data.
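This step can be sketched as below; the RBF kernel and the hyperparameters are assumptions rather than the article's exact settings, and feature scaling is added because the RBF kernel is sensitive to the raw band magnitudes:

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline
from sklearn.model_selection import cross_val_score, StratifiedKFold

# Stand-in training set; in the tutorial X_train/y_train come from Step 4
rng = np.random.default_rng(0)
X_train = rng.random((300, 65))
y_train = rng.integers(0, 9, 300)

# SVM on the original (65-band) features, with scaling
clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=10, gamma="scale"))

# Stratified K-fold cross-validation on the training data
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
scores = cross_val_score(clf, X_train, y_train, cv=cv)
print(scores.mean())
```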

Step 6: Apply SVM classifier on the PCA data

I have applied an ensemble of 3 models and used K-fold cross-validation on the training data.

The final prediction is obtained by taking a majority vote over the predictions of the three models.
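A sketch of this step on stand-in data: PCA is fit on the training set only, and the ensemble is formed here by varying C, which is an assumption about how the article's three models differ:

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.svm import SVC

# Stand-in data; in the tutorial these come from Step 4
rng = np.random.default_rng(0)
X_train, y_train = rng.random((300, 65)), rng.integers(0, 9, 300)
X_test = rng.random((100, 65))

# Project onto the first 3 principal components (fit on training data only)
pca = PCA(n_components=3)
X_train_p = pca.fit_transform(X_train)
X_test_p = pca.transform(X_test)

# Train an ensemble of 3 SVMs and collect their test predictions
preds = []
for C in (1, 10, 100):
    model = SVC(kernel="rbf", C=C).fit(X_train_p, y_train)
    preds.append(model.predict(X_test_p))
preds = np.stack(preds)  # shape: (3 models, n_test pixels)

# Majority vote across the three models for each test pixel
y_pred = np.array([np.bincount(col).argmax() for col in preds.T])
```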

Step 7: Plot the final image after PCA
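Plotting can be sketched as follows, assuming the per-pixel predictions have been reshaped back to the scene's height × width (the sizes and colormap here are placeholders):

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # render off-screen
import matplotlib.pyplot as plt

# Stand-in prediction map; in the tutorial this is y_pred reshaped to H x W
H, W = 100, 80
pred_map = np.random.default_rng(0).integers(0, 9, (H, W))

plt.figure(figsize=(5, 6))
plt.imshow(pred_map, cmap="tab10")
plt.colorbar(label="class index")
plt.title("SVM classification after PCA")
plt.savefig("classification_map.png", dpi=150)
```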

Image after applying PCA

Conclusion

The dimensionality of the data before PCA is 65; after PCA it is 3. PCA reduced the dimensionality of the data by a factor of almost 22.

We can conclude from the above results that SVM on the original data achieves an accuracy of around 88.7%, while SVM on the PCA-reduced data achieves 88.9%, so the accuracy is nearly the same in both cases.

That is why we apply classifiers to the reduced data: it lowers both time and space complexity. Depending on the problem, the accuracy with PCA can even be higher than with the original data.

Thanks for reading! I hope you found this article useful. Feel free to ask, if you have any questions.
