The increased pace of digitisation has brought with it an increased risk of online theft and fraud. All industries, be it banking, retail or even education, are affected by these threats, and enterprises are always looking for ways to protect their customers. Protection mechanisms such as autogenerated digital passwords, fingerprints, and more advanced techniques such as voice or facial recognition have been used. Unfortunately, they are intrusive, and they do not provide continuous protection.
Recently, behavioural methods such as analysing mouse movements have been used to detect fraudulent or unwanted access. This involves modelling the user’s mouse movement style, which is the focus of this story. The story is inspired by the excellent IET Journal paper Intrusion Detection using Mouse Dynamics by Margit Antal and Elod Egyed-Zsigmond[1] and by the Balabit Mouse Challenge dataset[2].
Generally, everyone has their own style of using a mouse. Some people are fast, some slow. Some like to navigate more than others. You can learn a lot about someone's behaviour just by looking at mouse movement data.
A sample of the dataset[2], used here to illustrate how to model mouse movement, is shown below.
The mouse movement data has fields such as user, session and client_timestamp, which give information about the user and the time of the activity. The fields x and y give the screen co-ordinates of the mouse location. The state of the mouse, such as Move, Drag or Pressed, is also captured. The field fraud is a label indicating whether the access was unwanted or normal.
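To make the record layout concrete, here is a minimal Python sketch of how one row could be represented and parsed. The column name state, the timestamp unit (seconds) and the 0/1 encoding of fraud are assumptions made for illustration, and the sample rows are made up rather than taken from the dataset.

```python
import csv
import io
from dataclasses import dataclass
from typing import List


@dataclass
class MouseEvent:
    user: str
    session: str
    client_timestamp: float  # assumed to be in seconds
    x: int                   # horizontal screen co-ordinate in pixels
    y: int                   # vertical screen co-ordinate in pixels
    state: str               # e.g. "Move", "Drag", "Pressed" (column name assumed)
    fraud: int               # assumed encoding: 1 = unwanted, 0 = normal


def parse_events(csv_text: str) -> List[MouseEvent]:
    """Parse CSV text with a header row into MouseEvent records."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        MouseEvent(
            user=row["user"],
            session=row["session"],
            client_timestamp=float(row["client_timestamp"]),
            x=int(row["x"]),
            y=int(row["y"]),
            state=row["state"],
            fraud=int(row["fraud"]),
        )
        for row in reader
    ]


# Made-up rows for illustration only; not real dataset values.
SAMPLE = """user,session,client_timestamp,x,y,state,fraud
user7,session_01,0.00,100,200,Move,0
user7,session_01,0.10,130,240,Move,0
user7,session_01,0.25,130,240,Pressed,0
"""

events = parse_events(SAMPLE)
```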
Modelling mouse movement means using these fields creatively in order to identify which behaviour is normal and which is unwanted.
First, let us visualise what the mouse movement looks like. Shown below is an animation of the mouse movement for a normal access.

A similar visualisation for an unwanted access is shown here.

Modelling mouse movements requires some knowledge of elementary physics concepts, such as distance, angular velocity and acceleration, in order to extract features.
Here I explain what these features mean and how they can be useful in identifying unwanted access.
Screen Distance Travelled
Screen distance travelled is the distance between two screen positions between which the mouse has moved. The two screen positions correspond to the RELEASE state and the PRESS state of the mouse.

The screen distance travelled by the mouse is the Euclidean distance between point P1 and point PN.
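As a minimal sketch, the Euclidean distance between P1 and PN can be computed as follows. The function name and the example co-ordinates are hypothetical.

```python
import math


def screen_distance(p1, pn):
    """Euclidean distance in pixels between two (x, y) screen positions."""
    return math.hypot(pn[0] - p1[0], pn[1] - p1[1])


# Hypothetical positions: P1 at the first state, PN at the second.
p1 = (100, 200)
pn = (400, 600)
distance = screen_distance(p1, pn)  # 500.0 pixels
```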
We can analyse the screen distance travelled for normal and fraudulent access using a box plot.

You will observe that the range of screen distance travelled in unwanted access is smaller than in normal access. This means that an unwanted access tends to focus on a small part of the screen, whereas a normal access tends to explore different parts of the screen.
Avid readers may already have noticed this behaviour in the animations above: the x and y axis ranges for normal access are much larger than those for unwanted access.
Angle of Movement
Angle of movement indicates the direction of movement. It ranges from 0° to 360°. The angle tells us about the nature of a movement. For example, an angle of 0° or 180° indicates a horizontal movement, while an angle of 90° or 270° indicates a vertical movement.
We can also categorise the angle into 8 directions, as shown below.

We can analyse the angle of movement with the help of a radar chart. The radar chart of the angle of movement for normal and unwanted access is illustrated here. The length of the line in each direction is the average angle of movement in that particular direction.

We can observe that the unwanted access and normal access overlap almost completely in the radar chart. This means that there is not much difference in the angle of movement between the two kinds of access.
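The angle of movement and its 8-direction category described in this section can be sketched as below. Two details are assumptions, since the article does not spell them out: the screen y axis is flipped so that 90° means "up", and the 8 directions are taken to be equal 45° sectors starting at 0°.

```python
import math

DIRECTIONS = 8                      # assumed: eight equal sectors
SECTOR_WIDTH = 360.0 / DIRECTIONS   # 45 degrees each


def movement_angle(p_from, p_to):
    """Angle of the movement from p_from to p_to, in degrees within [0, 360)."""
    dx = p_to[0] - p_from[0]
    # Screen y grows downwards, so flip it to make 90 degrees mean "up".
    dy = p_from[1] - p_to[1]
    return math.degrees(math.atan2(dy, dx)) % 360.0


def direction_bin(angle_deg):
    """Sector index 0..7; sector k covers [45k, 45(k + 1)) degrees."""
    return int(angle_deg // SECTOR_WIDTH) % DIRECTIONS


# Hypothetical movements starting from the same point.
right = movement_angle((100, 100), (150, 100))  # horizontal, to the right
up = movement_angle((100, 100), (100, 50))      # vertical, towards the top
left = movement_angle((100, 100), (50, 100))    # horizontal, to the left
```

A per-direction count or average can then be built by grouping segments on direction_bin.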
Velocity
When we think of events such as a car robbery or a bank robbery, we tend to think of them as fast and rapidly executed. So is online fraudulent or unwanted access also carried out at high speed? Let us find out.
Velocity is an indication of speed. It is measured as the distance the mouse moves, in pixels, in one second. We can use this concept to analyse the speed of mouse movement. The velocity of the two kinds of access is illustrated using a box plot.

You will observe that the mouse speed for unwanted access is generally higher than for normal access. In an unwanted access, the mouse moves in the range of 0 to 250 pixels/second, and no outliers are observed. In a normal access, the mouse moves in the range of 0 to 100 pixels/second; however, there are some outliers, shown as black dots outside the range. These outliers represent sudden high-speed movements.
This analysis confirms our intuition about an online fraudster's behaviour. It is very similar to a car robbery or a bank robbery: the fraudster tends to finish his or her activity very fast, but without any sudden movements.
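The velocity used in this section, pixels moved per second, can be sketched per movement segment as follows. The input layout (a time-sorted list of (seconds, x, y) tuples) and the sample track are assumptions for illustration.

```python
import math


def segment_velocities(points):
    """Speed in pixels/second for each consecutive pair of points.

    points: list of (t_seconds, x, y) tuples sorted by time (assumed layout).
    """
    velocities = []
    for (t0, x0, y0), (t1, x1, y1) in zip(points, points[1:]):
        dt = t1 - t0
        if dt <= 0:
            continue  # skip duplicate or out-of-order timestamps
        velocities.append(math.hypot(x1 - x0, y1 - y0) / dt)
    return velocities


# Made-up track: 50 pixels in 0.5 s, then 60 pixels in 1.0 s.
track = [(0.0, 0, 0), (0.5, 30, 40), (1.5, 30, 100)]
velocities = segment_velocities(track)  # [100.0, 60.0]
```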
Angular Velocity
We have seen that the velocity of unwanted access is generally higher than that of normal access, while normal access does have some sudden speedy movements. It would be interesting to know the direction of these movements, to see if it reveals additional insights.
We can combine the above-mentioned concepts of angle of movement and velocity into the concept of angular velocity. It is an indication of speed in a particular direction.
An important point to note while measuring angular velocity is that it can be positive or negative. All movements that go from left to right (at any angle) have positive angular velocity, while those going from right to left have negative angular velocity.

A histogram of angular velocity for the two kinds of access is shown here.

You will observe that the majority of sudden speedy movements in a normal access have negative angular velocity, meaning the mouse is moving backwards. This can signify a correction being made to already filled fields, or a click on a left-hand side menu.
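The article gives no formula for this signed quantity, so the sketch below assumes one reading of the description above: the speed of each segment, made negative when the mouse moves from right to left. Treating purely vertical segments as positive is an arbitrary choice made here. Note that the textbook definition of angular velocity, the rate of change of the angle over time, is a different quantity and is not what this code computes.

```python
import math


def signed_velocities(points):
    """Segment speed in pixels/second, signed by horizontal direction.

    points: list of (t_seconds, x, y) tuples sorted by time (assumed layout).
    Left-to-right segments are positive, right-to-left segments negative.
    """
    result = []
    for (t0, x0, y0), (t1, x1, y1) in zip(points, points[1:]):
        dt = t1 - t0
        if dt <= 0:
            continue  # skip duplicate or out-of-order timestamps
        dx = x1 - x0
        speed = math.hypot(dx, y1 - y0) / dt
        # Assumption: purely vertical segments (dx == 0) count as positive.
        result.append(speed if dx >= 0 else -speed)
    return result


# Made-up track: a move to the right, then a faster move back to the left.
track = [(0.0, 100, 100), (1.0, 130, 140), (2.0, 70, 60)]
signed = signed_velocities(track)  # [50.0, -100.0]
```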
In addition to the above, other interesting features that can be calculated using angle and velocity are straightness and curvature. Straight movements are those with an angle very near to 0°. Curvatures are movements that follow a circle or a curve.
We can use the above features to build a model that predicts unwanted access. Some of the features that are generally found useful are:
- Screen Distance
- Angle of movement
- Velocity
- Angular Velocity
- Straight Movements
- Curvature
- Movement beginning Time
- Minimum, Maximum, Standard deviation of above values
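The last item in the list, summarising each per-movement value by its minimum, maximum and standard deviation, can be sketched as below. The function names, the feature naming scheme and the sample values are hypothetical, and the population standard deviation is an assumed choice so that a single value still yields a result.

```python
import statistics


def summarise(values):
    """Minimum, maximum and standard deviation of a non-empty list of numbers."""
    return {
        "min": min(values),
        "max": max(values),
        # Population standard deviation (assumed choice).
        "std": statistics.pstdev(values),
    }


def session_features(raw):
    """Flatten {feature_name: [values]} into one row of summary features."""
    row = {}
    for name, values in raw.items():
        for stat, value in summarise(values).items():
            row[f"{name}_{stat}"] = value
    return row


# Made-up per-segment values for a single session.
features = session_features({
    "velocity": [100.0, 60.0, 80.0],
    "screen_distance": [500.0, 120.0],
})
```

Each session then becomes one row of numeric features, which is the usual input shape for a classifier.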
With these features, a model can generally achieve good accuracy. The model's predictions and accuracy may vary with each situation and industry.
To summarise, we saw how analysing mouse movement behaviour data can be useful for predicting unwanted or fraudulent access.
Data such as distance, speed and angle, combined with elementary physics knowledge, can be used for creative feature engineering to develop predictive models that combat online fraud.
References
[1] Margit Antal and Elod Egyed-Zsigmond, Intrusion Detection using Mouse Dynamics, IET Journal (https://arxiv.org/pdf/1810.04668v1.pdf)
[2] Balabit Mouse Challenge dataset (https://github.com/balabit/Mouse-Dynamics-Challenge)