
Regional and Online Learnable Fields

A new, not so new, clustering algorithm

(Image by author)

This article will guide you through Regional and Online Learnable Fields, a seemingly underappreciated clustering algorithm that provides several advantages over k-Means and k-Nearest-Neighbors, from which it originates.


First proposed by Rolf Schatten, Nils Goerke, and Rolf Eckmiller in 2005, the Regional and Online Learnable Field (ROLF) is an online and unsupervised clustering technique [1]. The general idea is to approximate our datapoints with so-called ROLF-neurons, which are connected to form clusters in a later step. Originating from k-Means and k-Nearest-Neighbors, the algorithm aims to improve on both. Let us first see where they can be enhanced.

Where can k-Means and KNN be improved?

  1. A common problem with the original k-Means algorithm is that we have to choose the number of clusters k in advance. While we can sometimes make an educated guess, most of the time we have to resort to some form of trial and error. This takes time, and it would be better to incorporate the choice of k into the clustering process itself.
  2. Another problem is that k-Means is unable to cluster certain shapes, such as two concentric circles or two interlocked half-moons. In general, k-Means fails when clusters are not circular or when their centroids are close to each other.
  3. k-Means is not online-adaptable. Say you trained k-Means on two clusters and are now presenting it with datapoints from a third one. These points would be assigned to the two clusters found during training. This is not desirable; it would be far better if the algorithm created a new cluster for these points, or marked them as ‘unknown’.
  4. Additionally, kNN faces the problem of complexity. To assign a datapoint to a cluster, we have to calculate its distance to all training points. You can imagine that with a large number of training points, this quickly becomes computationally inefficient. So here, we have to find a way to reduce the number of distance calculations.
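The second point is easy to reproduce. The sketch below uses scikit-learn (an assumption on my part, not tooling from the original paper) to run k-Means on two interlocked half-moons; the resulting labels agree only partially with the true moons, because k-Means splits the plane along a straight boundary between the two centroids.

```python
# Reproducing the half-moon failure case of k-Means with scikit-learn.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_moons

# Two interlocked half-moons with a little noise.
X, y_true = make_moons(n_samples=1000, noise=0.05, random_state=0)

# Vanilla k-Means with the "correct" k = 2.
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)

# Agreement with the true moon labels, up to label permutation.
# A perfect clustering would reach 1.0; k-Means stays well below that.
agreement = max((labels == y_true).mean(), (labels != y_true).mean())
print(f"label agreement with true moons: {agreement:.2f}")
```

Even with the right number of clusters, the agreement stays clearly below 100%, since each k-Means cluster mixes points from both moons.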

The ROLF-Neuron

The heart of Regional and Online Learnable Fields are the so-called ROLF-neurons. These act as prototypes for our datapoints and have an area they are responsible for, called the perceptive field. A ROLF-neuron consists of three parameters:

  1. Center c, which defines the position of the neuron in space.
  2. Width σ, which defines the size of the neuron's region of influence.
  3. Perceptive field r = pσ, which defines the area the neuron is responsible for. The constant p is set initially and is equal for all neurons; it defines the ratio between the perceptive field and the width of each neuron.
p = 2 (Image by author)
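The three parameters above can be sketched as a small Python class. The names and the `accepts` helper are illustrative, not taken from the paper:

```python
from dataclasses import dataclass

import numpy as np


@dataclass
class ROLFNeuron:
    c: np.ndarray    # center: position of the neuron in input space
    sigma: float     # width of the neuron
    p: float = 2.0   # global ratio between perceptive field and width

    @property
    def r(self) -> float:
        """Radius of the perceptive field, r = p * sigma."""
        return self.p * self.sigma

    def accepts(self, x: np.ndarray) -> bool:
        """A pattern lies in the perceptive field if ||x - c|| <= r."""
        return float(np.linalg.norm(x - self.c)) <= self.r
```

With p = 2 and σ = 0.5, for example, the neuron accepts every pattern within distance r = 1 of its center.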

Learning the Network

Learning essentially works in two steps: first, we find the ROLF-neurons, or prototypes, in our training data; then we build a neighborhood graph between them, which finally defines the clusters.

How to find the Rolf-Neurons?

First, we iterate through all training patterns and check whether each one lies inside the perceptive field of an already existing neuron. If a pattern is in the perceptive field of multiple neurons, we choose the neuron closest to it. After we have decided on this so-called ‘winning neuron’, we adapt its center c and width σ.

(Image by author)

If the pattern does not lie in the perceptive field of any existing neuron, we create a new one according to the initialization strategy. This strategy has to be chosen up front; we will discuss the details later.

How to adapt the Winning Neuron?

Let x be a training pattern and c the center of the winning neuron. To adapt center c and width σ, we first calculate the distance d between c and x. Afterward, we move c towards x and adapt σ towards d. Finally, we recalculate the perceptive field.

(Image by author)

The formulas show the use of the constant parameter p: it allows the neuron to accept patterns outside σ. If the distance between x and c is greater than σ, σ is increased; in the opposite case, it is decreased. Without p, the perceptive field could only shrink, as exclusively points within σ would be accepted.
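The adaptation step can be written out as a simple step towards the target values, with learning rates η_c and η_σ as hyperparameters (a sketch of the update described above; the dict representation and the default rates are my own choices):

```python
import numpy as np


def adapt_winner(neuron, x, eta_c=0.05, eta_sigma=0.05):
    """Move the center towards the pattern x and the width towards
    the distance d; then recompute the perceptive field r = p * sigma."""
    d = np.linalg.norm(x - neuron["c"])
    neuron["c"] = neuron["c"] + eta_c * (x - neuron["c"])
    neuron["sigma"] = neuron["sigma"] + eta_sigma * (d - neuron["sigma"])
    neuron["r"] = neuron["p"] * neuron["sigma"]
    return neuron
```

Note that σ grows whenever d > σ (the pattern was accepted outside σ but inside pσ) and shrinks whenever d < σ, which is exactly the behavior p enables.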

How to initialize a new Neuron?

When we initialize a new neuron, we always set its center c to the pattern x. What varies is the neuron’s width σ, which can be set in four different ways.

  1. Init-σ → Initialize σ to a constant value
  2. Minimum-σ → Initialize σ as the minimal σ of all already existing neurons
  3. Maximum-σ → Initialize σ as the maximal σ of all already existing neurons
  4. Mean-σ → Initialize σ as the mean σ over all already existing neurons

Now, the observant reader might ask: how do I initialize the first neuron if I choose Mean-σ as my strategy? After all, there are no σ-values to calculate the mean over. This is correct: the σ of the first neuron always has to be set to a predefined value. All following neurons are then set by the chosen strategy.
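The four strategies, including the fallback for the very first neuron, could look like this (a sketch; the strategy names and `sigma_init` default are illustrative):

```python
def init_sigma(strategy, existing_sigmas, sigma_init=0.4):
    """Choose the width sigma of a newly created neuron.

    strategy: one of "init", "minimum", "maximum", "mean".
    The first neuron (no existing sigmas yet) always falls back
    to the predefined value sigma_init.
    """
    if strategy == "init" or not existing_sigmas:
        return sigma_init
    if strategy == "minimum":
        return min(existing_sigmas)
    if strategy == "maximum":
        return max(existing_sigmas)
    if strategy == "mean":
        return sum(existing_sigmas) / len(existing_sigmas)
    raise ValueError(f"unknown strategy: {strategy}")
```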

Building the Neighborhood

In the second step of the learning algorithm, we group the previously found neurons into clusters. This is done by building a neighborhood graph, in which neurons with overlapping perceptive fields are connected by an edge. Mathematically speaking, two neurons are connected if the distance between their centers is smaller than the sum of their perceptive field radii.

(Image by author)

All neurons that can be connected via a sequence of edges belong to the same cluster. The union of their perceptive fields is the total area in which new datapoints are assigned to that cluster during prediction.
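Clusters are then the connected components of the neighborhood graph. A flood-fill over the connection rule ‖c_i − c_j‖ < r_i + r_j might look like this (a sketch, with neurons given as parallel lists of centers and perceptive-field radii):

```python
import numpy as np


def cluster_neurons(centers, radii):
    """Assign a cluster label to each neuron. Two neurons are connected
    if their center distance is smaller than the sum of their
    perceptive-field radii; clusters are connected components."""
    n = len(centers)
    labels = [-1] * n
    cluster = 0
    for start in range(n):
        if labels[start] != -1:
            continue                      # already part of some cluster
        labels[start] = cluster
        stack = [start]                   # flood-fill one component
        while stack:
            i = stack.pop()
            for j in range(n):
                connected = (
                    np.linalg.norm(centers[i] - centers[j])
                    < radii[i] + radii[j]
                )
                if labels[j] == -1 and connected:
                    labels[j] = cluster
                    stack.append(j)
        cluster += 1
    return labels
```

For prediction, a new point is assigned to the cluster of the nearest neuron whose perceptive field contains it.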

Advantages of ROLF

  1. The algorithm can choose the number of clusters k on its own.
  2. The algorithm is not restricted by the shapes of the clusters. In particular, the problem cases for k-Means, the interlocked half-moons and the concentric circles, can be clustered with ROLF. Examples are given in the following section.
  3. By representing the training datapoints with ROLF-neurons, distance calculations only have to be performed against the neurons. Additionally, only the neurons have to be stored: each pattern is presented once and can then be discarded, which also reduces memory usage.
  4. The algorithm can detect outliers on its own and act on them accordingly.

Performance

Now I want to show you several examples of how the Regional and Online Learnable Field performs in practice.

20000 points, 219 neurons (Image by author)

In the given examples, you can see how ROLF clusters the two failure cases of k-Means. For both examples, a learning rate of 0.05 was used for both c and σ. New neurons were initialized with the Minimum-σ strategy; the first neuron was initialized with σ = 0.4, and p was set to 2. The neighborhood graph is visualized by the black lines. Both datasets consist of 20,000 points, which were reduced to a number of neurons amounting to about 1% of the points. The green and blue areas show where points would be assigned to the respective cluster.

20000 points, 239 neurons (Image by author)

Remarks

  1. The dimension of the center c of a ROLF-neuron does not have to be 2. In the original paper, ROLF was used to cluster points from a 720-dimensional input space [1].
  2. The code used to generate the examples can be found here.

Sources:

[1] Schatten R., Goerke N., Eckmiller R. (2005) Regional and Online Learnable Fields. In: Singh S., Singh M., Apte C., Perner P. (eds) Pattern Recognition and Image Analysis. ICAPR 2005. Lecture Notes in Computer Science, vol 3687. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11552499_9
