If you have arrived here and do not have a good understanding of SVM, then check this article first.
Let’s start with a good dataset.
The Iris flower data set is a multivariate data set introduced by the British statistician and biologist Ronald Fisher in his 1936 paper The use of multiple measurements in taxonomic problems. It is sometimes called Anderson's Iris data set because Edgar Anderson collected the data to quantify the morphologic variation of Iris flowers of three related species. The data set consists of 50 samples from each of three species of Iris (Iris setosa, Iris virginica, and Iris versicolor). Four features were measured from each sample: the length and the width of the sepals and petals, in centimetres. You can download it from Kaggle.

Load the dataset
Our first step is to load the dataset and normalize our features for better performance. While creating y, let's use grp2idx to convert the species names into numbers.
%% Loading our dataset
clear;
tbl = readtable('IRIS.csv');
[m,n] = size(tbl);
X = tbl{:,1:n-1};                           % features: sepal/petal measurements
[y,labels] = grp2idx(tbl{:,n});             % map species names to numeric labels
nl = length(labels);                        % number of classes
[X_norm, mu, sigma] = featureNormalize(X);  % standardize features; keep mu/sigma for later predictions
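Note that featureNormalize is not a built-in MATLAB function. A minimal sketch of what it could look like, assuming standard z-score normalization (which matches how mu and sigma are applied to the new sample in the prediction step later), is:
function [X_norm, mu, sigma] = featureNormalize(X)
% Standardize each column of X to zero mean and unit standard deviation.
% mu and sigma are returned so that new samples can be scaled the same way.
mu = mean(X);
sigma = std(X);
X_norm = bsxfun(@rdivide, bsxfun(@minus, X, mu), sigma);
end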
Split training and test datasets
Let's split our dataset into a training set and a test set, being sure to randomise the rows as we go. We'll use an 80/20 split.
%% split into training and test sets (plus a CV partition for feature selection)
rand_num = randperm(size(X_norm,1));             % shuffle the row indices
X_train = X_norm(rand_num(1:round(0.8*length(rand_num))),:);
y_train = y(rand_num(1:round(0.8*length(rand_num))),:);
X_test = X_norm(rand_num(round(0.8*length(rand_num))+1:end),:);
y_test = y(rand_num(round(0.8*length(rand_num))+1:end),:);
cv = cvpartition(y_train,'KFold',5);             % 5-fold partition used by sequentialfs
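As an optional sanity check, we can confirm the split sizes and that all three species are represented in the training set (tabulate is part of the Statistics and Machine Learning Toolbox):
% Optional: check the split sizes and class balance
fprintf('Training samples: %d, test samples: %d\n', size(X_train,1), size(X_test,1));
tabulate(y_train)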
Feature selection
We need to decide which features to use, so let's let sequentialfs do the job for us. However, sequentialfs needs a cost function, which we have defined as an anonymous function called ‘costfun‘. This function uses MATLAB's ‘loss‘ function to evaluate the classification error of each candidate model. Finally, we remove the unselected columns.
%% feature selection
opts = statset('display','iter');
% Cost function: classification loss of an ECOC model trained on the candidate features
costfun = @(XT,yT,Xt,yt)loss(fitcecoc(XT,yT),Xt,yt);
[fs, history] = sequentialfs(costfun, X_train, y_train, 'cv', cv, 'options', opts);
% Remove unwanted columns
X_train = X_train(:,fs);
X_test = X_test(:,fs);
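To see exactly which columns sequentialfs kept, you can inspect the logical vector fs that it returns:
% fs is a logical row vector flagging the selected columns
fprintf('Selected feature columns: %s\n', mat2str(find(fs)));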
After running feature selection, you should see it has selected columns [1 3 4]. Since this is a small dataset, you may get slightly different results, but I'm sure you get the idea ;-). If you run corrplot([X y]) and look at the bottom row, it will make sense why columns 1, 3 and 4 have the strongest relationship with y.
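If you have the Econometrics Toolbox, the correlation plot mentioned above is a one-liner:
% Pairwise correlations between the features and the numeric label
corrplot([X y]);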

Train SVM
MATLAB has a great function called fitcecoc which fits multiclass models using binary SVM learners on our behalf.
GREAT…. we don’t need to do the maths….
We're choosing a Gaussian kernel for our model. I do explain the Gaussian kernel here if you need an intro, and there is a quick sketch of it below. After training, we'll use the loss function to calculate our accuracy on the test set.
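The Gaussian (RBF) kernel measures the similarity between two feature vectors; templateSVM's KernelScale parameter (default 1) divides the predictors before the kernel is applied. A minimal sketch of the formula, just for intuition:
% Gaussian (RBF) kernel: similarity between feature vectors x and z,
% scaled by s (templateSVM's KernelScale, which defaults to 1)
rbf = @(x, z, s) exp(-sum(((x - z)./s).^2));
rbf([1.4 0.2], [1.5 0.1], 1)   % nearby points give a value close to 1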
%% check loss against test dataset
t = templateSVM('KernelFunction','gaussian');                        % binary SVM learner with a Gaussian kernel
mdl = fitcecoc(X_train, y_train,'Learners',t,'Coding','onevsone');   % one-vs-one multiclass model
L = loss(mdl,X_test,y_test) * 100                                    % classification error as a percentage
If columns [1 3 4] were selected, then you should see a loss of 0%.
Predictions
OK, so now we know that we have 100% accuracy, since the loss is 0%. MATLAB also has a handy predict function to help us make predictions. Below, we need to use the mu and sigma from featureNormalize to make sure our new sample is on the same scale as the training data. We are predicting the result for the features [5.2 3.5 1.2 0.3].
%% Predict a result
px = bsxfun(@minus, [5.2 3.5 1.2 0.3], mu);   % subtract the training means
px = bsxfun(@rdivide, px, sigma);             % divide by the training standard deviations
predict(mdl,px([1 3 4]))                      % keep only the selected feature columns
All going well, your result was 1, which corresponds to "Iris-setosa".
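To print the species name rather than the numeric index, look the prediction up in the labels cell array returned earlier by grp2idx:
% Map the numeric prediction back to the species name
fprintf('Predicted species: %s\n', labels{predict(mdl, px([1 3 4]))});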
Conclusion
Although we have gone into a little more detail here, multiclass SVM is not complicated to use in MATLAB. As always, it's all about the data, so having a good dataset is key.