SVM Classification

Just as with the Random Forest, there are quite a few R packages that provide SVM, e.g., caret, e1071, or kernLab. We will use the e1071 package, as it offers an interface to the well-known libsvm implementation.

Below you can see a complete code implementation. While this is already executable with your input data, you should read the comprehensive in-depth guide in the following to understand the code in detail.

# import packages
library(raster)
library(e1071)
 
# import image (img) and shapefile (shp)
setwd("/media/sf_exchange/landsatdata/")
img <- brick("LC081930232017060201T1-SC20180613160412_subset.tif")
shp <- shapefile("training_data.shp")
 
# extract samples with class labels and put them all together in a dataframe
names(img) <- c("b1", "b2", "b3", "b4", "b5", "b6", "b7")
smp <- extract(img, shp, df = TRUE)
smp$cl <- as.factor( shp$classes[ match(smp$ID, seq(nrow(shp)) ) ] )
smp <- smp[-1]
 
# optional: shuffle samples and undersample if necessary
smp <- smp[sample(nrow(smp)),]
smp <- smp[ave(1:(nrow(smp)), smp$cl, FUN = seq) <= min(summary(smp$cl)), ]
 
# tune and train model via gridsearch (gamma & cost)
gammas = 2^(-8:5)
costs = 2^(-5:8)
svmgs <- tune(svm,
              train.x = smp[-ncol(smp)],
              train.y = smp$cl,
              scale = TRUE,
              kernel = "radial", 
              type = "C-classification",
              ranges = list(gamma = gammas,
                            cost = costs),
              tunecontrol = tune.control(cross = 5)
              )

# save svmmodel
svmmodel <- svmgs$best.model
save(svmmodel, file = "svmmodel.RData")
 
# predict image data with svm model
result <- predict(img,
                  svmmodel,
                  filename = "classification_svm.tif",
                  overwrite = TRUE
                  )


In-depth Guide

In order to be able to use the functions of the e1071 package, we must additionally load the library into the current session via library(). If you do not use our VM, you must first download and install the packages with install.packages():

#install.packages("raster")
#install.packages("e1071")
library(raster)
library(e1071)

First, it is necessary to process the training samples in the form of a data frame. The necessary steps are described in detail in the previous section.

names(img) <- c("b1", "b2", "b3", "b4", "b5", "b6", "b7")
smp <- extract(img, shp, df = TRUE)
smp$cl <- as.factor( shp$classes[ match(smp$ID, seq(nrow(shp)) ) ] )
smp <- smp[-1]

After that, you can identify the number of available training samples per class with summary(). There is often an imbalance in the number of those training pixels, i.e., one class is represented by a large number of pixels, while another class has very few samples:

summary(smp$cl)
##  baresoil    forest grassland  urban_hd  urban_ld     water 
##       719      2074      1226      1284       969       763

This often leads to the problem that classifiers favor and overclass strongly-represented classes in the classification. Unfortunately, a SVM can not handle this problem as elegantly as Random Forest methods, why you might need to prepare your training data:
Only if your data is imbalanced, you should consider to do the following undersampling: We shuffle our entire training dataset and pick out the same number of samples for each class. The purpose of shuffle is to prevent sampling of all pixels that are next to each other (spatial autocorrelation). The training data is mixed using the sample() method. We sample all rows of our dataset nrow(smp) in a random manner in line 10:

head(smp)
##    b1  b2  b3  b4  b5  b6 b7    cl
## 1 192 229 321 204 161 100 72 water
## 2 179 203 233 130 156  93 68 water
## 3 189 221 272 164 173 109 81 water
## 4 159 179 188  97 125  63 40 water
## 5 171 194 203 116 146  82 56 water
## 6 164 188 196 108 138  71 51 water

smp <- smp[sample(nrow(smp)), ]

head(smp)
##       b1  b2  b3  b4   b5   b6   b7       cl
## 2024 176 195 324 234 2146 1022  479   forest
## 5545 409 517 844 988 3038 3005 1818 baresoil
## 2877 195 212 363 293 2500 1435  690   forest
## 6210 303 322 482 411 2465 1410  872 urban_hd
## 6613 367 418 672 600 2855 1876 1210 urban_ld
## 321  163 178 201 136  124   76   56    water

Compare the first six entries of the smp dataset before and after the command in line 10 using the head() function. You will notice that the order of the rows is shuffled!
With the min() function, we can select the minimum value from the class column’s summary and use the ave() function to make a selection of the same size for each class:

summary(smp$cl)
##  baresoil    forest grassland  urban_hd  urban_ld     water 
##       719      2074      1226      1284       969       763

smp.maxsamplesize <- min(summary(smp$cl))
smp.maxsamplesize
## [1] 719

smp <- smp[ave(1:(nrow(smp)), smp$cl, FUN = seq) <= smp.maxsamplesize, ]

summary(smp$cl)
##  baresoil    forest grassland  urban_hd  urban_ld     water 
##       719       719       719       719       719       719

Let the training begin!
Classifications using an SVM require two parameters: one gamma \(\gamma\) and one cost \(C\) value (for more details refer to the SVM section). These hyperparameters significantly determine the performance of the model. Finding the best hyparameters is not trivial and the best combination can not be determined in advance. Thus, we try to find the best combination by trial and error. Therefore, we create two vectors comprising all values that should be tried out:

gammas = 2^(-8:5)
gammas
##  [1]  0.00390625  0.00781250  0.01562500  0.03125000  0.06250000
##  [6]  0.12500000  0.25000000  0.50000000  1.00000000  2.00000000
## [11]  4.00000000  8.00000000 16.00000000 32.00000000

costs = 2^(-5:8)
costs
##  [1]   0.03125   0.06250   0.12500   0.25000   0.50000   1.00000   2.00000
##  [8]   4.00000   8.00000  16.00000  32.00000  64.00000 128.00000 256.00000

So we have 14 different values for \(\gamma\) and 14 different values for \(C\). Thus, the whole training process is done for 196 (14 * 14) models, often referred to as gridsearch. Conversely, this means that the more parameters we check, the longer the training process takes.
We start the training with the tune() function. We need to specify the training samples as train.x, i.e., all columns of our smp dataframe except the last one, and the corresponding class labels as train.y, i.e. the last column of our smp dataframe:

svmgs <- tune(svm,
              train.x = smp[-ncol(smp)],
              train.y = smp[ ncol(smp)],
              type = "C-classification",
              kernel = "radial", 
              scale = TRUE,
              ranges = list(gamma = gammas, cost = costs),
              tunecontrol = tune.control(cross = 5)
              )

We have to set the type of the SVM to “C-classification” in line 4 in order to perform a classification task. Furthermore, we set the kernel used in training and predicting to a RBF kernel via “radial”. We set the argument scale to TRUE in order to initiate the z-transformation of our data. The argument ranges in line 7 takes a named list of parameter vectors spanning the sampling range. We put our gammas and costs vectors in this list. By using the tunecontrol argument in line 8, you can set k for the k-fold cross validation on the training data, which is necessary to assess the model performance.

Depending on the complexity of the data, this step may take some time. Once completed, you can check the output by calling the resultant object name:

svmgs
## 
## Parameter tuning of 'svm':
## 
## - sampling method: 5-fold cross validation 
## 
## - best parameters:
##  gamma cost
##      2    8
## 
## - best performance: 0.02132796

In the course of the cross-validation, the overall accuracies were compared and the best parameters were determined: In our example, those are 2 and 8 for \(\gamma\) and \(C\), respectively. Furthermore, the error of the best model is displayed: 2.1% error rate (which is quite good!).
We can plot the errors of all 196 different hyperparameter combinations in the so called gridsearch using the standard plot() function:

plot(svmgs)

The plot appears very homogeneous, as the performance varies only very slightly and is already at a very high level. It can happen that the highest accuracies are found on the edge of the grid search. Then the training with other ranges for \(\gamma\) and \(C\) should be done again.
We can extract the best model out of our svmgs to use for image prediction:

svmmodel <- svmgs$best.model
svmmodel
## 
## Call:
## best.tune(method = svm, train.x = smp[-ncol(smp)], train.y = smp$cl, 
##     ranges = list(gamma = gammas, cost = costs), tunecontrol = tune.control(cross = 5), 
##     scale = TRUE, kernel = "radial", type = "C-classification")
## 
## 
## Parameters:
##    SVM-Type:  C-classification 
##  SVM-Kernel:  radial 
##        cost:  8 
##       gamma:  2 
## 
## Number of Support Vectors:  602

Save the best model by using the save() function. This function saves the model object svmmodel to your working directory, so that you have it permanently stored on your hard drive. If needed, you can load it any time with load().

save(svmmodel, file = "svmmodel.RData")
#load("svmmodel.RData")

Since your SVM model is now completely trained, you can use it to predict all the pixels in your image. The command method predict() takes a lot of work from you: It is recognized that there is an image which will be processed pixel by pixel. As with the training pixels, each image pixel is now individually classified and finally reassembled into your final classification image. Use the argument filename = to specify the name of your output map:

result <- predict(img,
                  svmmodel,
                  filename = "classification_svm.tif",
                  overwrite = TRUE
                  )

plot(result, 
     axes = FALSE, 
     box = FALSE,
     col = c("#fbf793", # baresoil
             "#006601", # forest
             "#bfe578", # grassland
             "#d00000", # urban_hd
             "#fa6700", # urban_ld
             "#6569ff"  # water
             )
     )

Support Vector Machine classification completed!
If you store the result of the predict() function in an object, e.g., result, you can plot the map using the standard plot command and passing this object: