Goal: This article gives an introduction to vectors, vector operations, and their applications in data science.
Why you should learn it: Vectors are the basis of almost all machine learning techniques that learn from data, whether for prediction, classification, or clustering.
Table of Contents:
- What is a vector?
- Vector addition
- Scalar-vector multiplication
- Dot product
- Linear Combinations
What is a vector?
A vector is an ordered, finite list of numbers. Vectors are most often written like this:

a = (a1, a2, ..., an)

The elements of a vector are the values within that vector; the Python equivalent is the NumPy array. The size (or length) of the vector is the number of its elements, so the vector a above has size n.
Examples:
- Feature vector: In many applications a vector collects different features of a single entity. These features can be measurements of an object, for example age, height, weight, blood pressure of a patient in a hospital.
- Time series: A vector can represent a time series or signal, that is, the value of some quantity at different times. For example, time series can represent the value of a share in stock market but also something like hourly rainfall in a certain region.
- Customer Purchases: A vector can also represent a record of a particular customer’s purchase from a business, with the entries of a vector representing the amount of dollars the customer has spent on a certain product.
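In Python, a vector can be sketched as a NumPy array. The patient values below are made up purely for illustration:

```python
import numpy as np

# A feature vector for one patient: age, height (cm), weight (kg), blood pressure
x = np.array([42, 178, 81, 120])

print(x.size)  # size (length) of the vector: 4
print(x[0])    # elements are accessed by index: 42
```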
Vector addition
Vector addition works by element-wise addition:

(a1, a2, ..., an) + (b1, b2, ..., bn) = (a1 + b1, a2 + b2, ..., an + bn)

And similarly, vector subtraction works by element-wise subtraction:

(a1, a2, ..., an) - (b1, b2, ..., bn) = (a1 - b1, a2 - b2, ..., an - bn)
Examples:
- Word counts: If vectors a and b are word counts, denoting the frequency of each word in two corresponding documents A and B, then the sum a + b gives the word counts of the two documents combined. Likewise, the difference a - b gives the number of times each word appeared more often in document A than in B.
- Time series: If a and b represent time series of the same quantity, for example monthly profit of two stores, then the sum a+b represents a time series of the total monthly profit of the two stores.
- Portfolio trading: Suppose we have two vectors. First, the original portfolio vector s, with entries denoting the number of shares of a given asset in a portfolio. Second, the trade vector b, with positive entries giving the number of shares bought and negative entries the number of shares sold. Then our final portfolio is given by s + b.
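The word-count example can be sketched in NumPy; the counts below are hypothetical:

```python
import numpy as np

# Counts of the same four vocabulary words in two documents A and B
a = np.array([4, 0, 2, 1])
b = np.array([1, 3, 2, 0])

combined = a + b    # word counts of the two documents combined
difference = a - b  # how many more times each word appears in A than in B

print(combined)    # [5 3 4 1]
print(difference)  # [ 3 -3  0  1]
```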
Scalar-vector multiplication
Another important vector operation is multiplying a vector by a scalar (which is just a fancy word for 'number'). This is done by multiplying each element of the vector by the scalar:

c · (a1, a2, ..., an) = (c·a1, c·a2, ..., c·an)
Examples:
- Materials requirement: Suppose the vector q is the bill of materials for producing one unit of some product, for example a mobile phone. Then the entries of q are the amount of raw material required to produce one mobile phone. To produce 300 units of the mobile phone we require raw materials given by 300q.
- Audio scaling: If a vector v represents an audio signal (which is, as we learned, a time series), the signal can be increased in volume by a factor 3 if we take the scalar multiple 3v.
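The materials example might look like this in NumPy; the quantities per phone are invented for illustration:

```python
import numpy as np

# Bill of materials for one phone: hypothetical amounts of three raw materials
q = np.array([0.2, 1.5, 4.0])

# Materials needed for 300 phones: each entry is multiplied by the scalar 300
materials = 300 * q
print(materials)
```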
Dot product
Now this is one of the most important operations in linear algebra, managing to come up in all areas related to data science, from linear regression to neural networks. The dot product of two vectors is calculated by multiplying the corresponding elements of the vectors and adding up the resulting products. See for yourself:

a · b = a1·b1 + a2·b2 + ... + an·bn
Examples:
- Sum: If we take the dot product of vectors a and b, where a consists only of 1s and has the same length as b, we get the sum of the entries of b.
- Average: If we take the dot product of vectors a and b, where a consists only of entries 1/n (n being the shared length of the vectors), we get the average of the entries of b.
- Co-occurrence: Suppose a and b are vectors of the same length whose entries can only be 0 or 1. Then the dot product of a and b gives the total number of positions where both vectors show a 1. This could mean, for example, the number of features two objects have in common or the number of cases where two classifiers make the same positive prediction.
- Sentiment analysis: A specific problem in text analysis is the question of whether the sentiment (emotional polarity) of a given text is positive, negative, or neutral. We can take an initial approach to this problem by creating two vectors. First, a vector x of length n representing the frequencies of n words in the text. Second, a vector w of the same length representing the polarity of each word, with entries -1 (for negative words like 'bad' or 'terrible'), 0 (neutral words like 'and'), or 1 (positive words like 'nice' or 'awesome'). Then the dot product of x and w gives us a first (crude) measure of the sentiment of the text.
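The sum, average, and sentiment examples can all be sketched with np.dot; the word counts and polarities below are made up:

```python
import numpy as np

b = np.array([3.0, 1.0, 4.0, 2.0])
ones = np.ones(4)

# Dot product with a vector of 1s gives the sum of b
print(np.dot(ones, b))      # 10.0

# Dot product with a vector of 1/n entries gives the average of b
print(np.dot(ones / 4, b))  # 2.5

# Crude sentiment: word frequencies x and word polarities w
x = np.array([2, 1, 5, 1])    # counts of 'nice', 'bad', 'and', 'terrible'
w = np.array([1, -1, 0, -1])  # polarity of each word
print(np.dot(x, w))           # 2 - 1 + 0 - 1 = 0, i.e. neutral overall
```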
Linear combinations
A special interpretation of the dot product, in which we weight a vector x by a coefficient vector β, is called a linear combination of the elements of x:

β · x = β1·x1 + β2·x2 + ... + βn·xn

Here the elements of β are called the coefficients. Linear combinations of x form the backbone of one of the most popular statistical tools for predicting continuous quantities: linear regression. It is used, for example, in house price prediction, where we have a feature vector x describing a house (e.g., house area in square feet, number of bedrooms, etc.) and want to find the optimal weighting of these features by regression coefficients β to predict the house price.
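A toy prediction as a linear combination might look like this; the features and coefficients below are invented for illustration, not fitted to real data:

```python
import numpy as np

# Hypothetical house features: area (1000s of sq ft), bedrooms, age (decades)
x = np.array([1.2, 3.0, 1.5])

# Hypothetical regression coefficients (in $1000s per unit of each feature)
beta = np.array([150.0, 20.0, -10.0])

# The predicted price is the linear combination beta1*x1 + beta2*x2 + beta3*x3
price = np.dot(beta, x)
print(price)  # 150*1.2 + 20*3.0 - 10*1.5 = 225.0
```

In a real regression the coefficients β would be estimated from data rather than chosen by hand.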
This is it for today, thank you very much for reading! Follow me if you want to be in the loop for future articles and leave a clap if you enjoyed the article!