The Ultimate Ndarray Handbook: Mastering the Art of Scientific Computing with Rust

An overview of Rust’s built-in data structures and a deep dive into the Ndarray library

Mahmoud Harmouch
Towards Data Science
31 min read · May 2, 2023

Photo by Crissy Jarvis on Unsplash

TL;DR

Rust has become a widely used programming language, and it’s not without reason. For data analysis in particular, it pairs strong performance and memory safety with a growing ecosystem of libraries and tools, which makes it an appealing option for working on complex datasets. Moreover, knowing how to store your data is essential if you are looking to use Rust for data analysis or other related tasks.

By the end of this article, you’ll have a rock-solid foundation that will enable you to kick off your Rust data analysis journey with confidence and ease.

Note: This article assumes you are a bit familiar with Rust references and its borrow checker.

The notebook 2-ndarray-tutorial.ipynb, developed for this article, can be found in the following repo:

What Is This Article All About?

Photo by Ashni on Unsplash

The spotlight of this piece is on an essential Rust library for data analysis, namely ndarray. ndarray empowers users with the ability to handle large multi-dimensional arrays and matrices while also offering an extensive selection of mathematical operations that can be performed on them.

But before we dive into ndarray specifically, let’s take a step back and explore different Rust built-in data structures and why Rust is such a great language for data analysis in general.

Rust Built-In Data Structures

In this section, we’ll delve into the fundamental concepts and tools that form the backbone of the Rust programming language. In particular, we will cover the basics of Rust data structures, including vectors, arrays, tuples, hash sets, and hash maps, gaining a solid understanding of how they work and how they can be used to solve real-world problems.

1. Vectors

Vectors memory layout (Image by author)

Vectors, known as “lists” in some programming languages like Python, are everywhere; from simple shopping lists to more complex recipe instructions, they can help us keep track of things and find them when needed. In programming, vectors are an essential data structure used in countless applications, taking many different shapes and forms.

Creating a Vector

In Rust, vectors are essential data structures, and you can create them using different approaches. To create an empty vector, you can call the Vec::new() function and add a type annotation since Rust doesn’t know what elements you intend to store in it:

let v: Vec<i32> = Vec::new();

Alternatively, you can use the vec! macro to create a new vector with initial values:

let v = vec![1, 2, 3];

The Rust compiler infers the vector’s element type from its initial values, so no type annotation is needed. After creating a vector, you have several options for modifying it, provided the binding is declared with mut.

Accessing Vector Elements

In Rust, we can access values stored in a vector in two ways: either by indexing or using the get method. Let’s explore both methods, along with some code examples!

First, let’s consider the following vector v with some values:

let v = vec!["apple", "banana", "cherry", "date"];

The indexing operator [] can be utilized to retrieve a specific value from a vector. To access the second element, let’s consider the following example:

// Get the second element
let second = &v[1];
println!("The second element is {}", second);

// Output:
// The second element is banana

Here, we’re creating a reference & to the second element in the vector using indexing with []. Indexing with an out-of-range index makes the program panic at runtime. To avoid this, we can use the get method, which returns an Option<&T> instead of a plain reference. Here’s how it works:

let v = vec![
("apple", 3),
("banana", 2),
("cherry", 5),
("date", 1),
];

// Get the quantity of cherries
let quantity = v.get(2).map(|(_, q)| q);

match quantity {
Some(q) => println!("There are {} cherries", q),
None => println!("Cherries not found"),
}

// Output:
// There are 5 cherries

Calling v.get(2) returns an Option<&T>: Some holding a reference if the element is present, or None otherwise. A match expression handles both cases explicitly. With these techniques, you can safely access elements in vectors!
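To make the difference between the two access styles concrete, here is a minimal sketch. The helper name quantity_at is hypothetical and introduced only for illustration; it returns None for an out-of-range index, where plain indexing would panic.

```rust
/// Hypothetical helper: look up the quantity stored at `index`, if any.
fn quantity_at(items: &[(&str, u32)], index: usize) -> Option<u32> {
    // `get` returns Option<&(&str, u32)>; `map` extracts the quantity.
    items.get(index).map(|&(_, q)| q)
}

fn main() {
    let v = vec![("apple", 3), ("banana", 2), ("cherry", 5), ("date", 1)];

    // Index 2 exists, so this prints Some(5).
    println!("{:?}", quantity_at(&v, 2));

    // Index 10 does not exist, so this prints None.
    // Writing `v[10]` instead would panic at runtime.
    println!("{:?}", quantity_at(&v, 10));
}
```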

Iterating over Values

In Rust, iterating through a vector is a common task that can be executed in two ways: utilizing immutable and mutable references. This approach enables us to perform actions on each vector element individually. To gain further understanding, let’s explore both of these methods using some code examples!

let fruits = vec![("apple", 3), ("banana", 2), ("orange", 5), ("peach", 4)];
let mut sum = 0;
for (_, num) in &fruits {
sum += num;
}
let avg = sum as f32 / fruits.len() as f32;
println!("The average of the second elements is {}", avg);

// Output:
// The average of the second elements is 3.5

In the above code snippet, we use the & operator to iterate over the vector by immutable reference, so fruits remains usable after the loop. Each iteration destructures the tuple and adds its second field to sum; the average is then computed and printed with the println! macro.
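The same computation can also be written with iterator adapters. The sketch below is one possible variant, not the only way to do it; the helper name average_quantity is hypothetical, and returning an Option for the empty case is a design choice made here to avoid dividing by zero.

```rust
/// Hypothetical helper: average of the quantities, or None for an empty slice.
fn average_quantity(items: &[(&str, i32)]) -> Option<f32> {
    if items.is_empty() {
        return None; // avoid dividing by zero
    }
    // Destructure each tuple and add up the second field.
    let sum: i32 = items.iter().map(|&(_, n)| n).sum();
    Some(sum as f32 / items.len() as f32)
}

fn main() {
    let fruits = vec![("apple", 3), ("banana", 2), ("orange", 5), ("peach", 4)];
    // Prints Some(3.5)
    println!("{:?}", average_quantity(&fruits));
}
```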

In addition, the iter_mut() method creates an iterator that yields a mutable reference to each value in the vector, allowing us to add 10 to every element in place. The code below demonstrates how to use iter_mut() to modify a vector while iterating over it.

let mut values = vec![10, 20, 30, 40, 50];
for value in values.iter_mut() {
*value += 10;
}
println!("The modified vector is {:?}", values);

// Output:
// The modified vector is [20, 30, 40, 50, 60]

We can effectively traverse a portion of the vector’s elements by utilizing a for loop and range. To illustrate this concept, consider the following code snippet showcasing how to employ a for loop to obtain immutable references to get only three elements from a given vector before outputting them to the terminal.

let values = vec![10, 20, 30, 40, 50];
for value in &values[0..3] {
println!("The value is {}", value);
}

// Output
// The value is 10
// The value is 20
// The value is 30

By utilizing Rust’s enumerate() function, we can effortlessly traverse a vector and obtain its values and corresponding indices. The code snippet below showcases how to use the enumerate() method to retrieve immutable references for each element within an i32 value-based vector while simultaneously printing their respective indices and values.

let values = vec![10, 20, 30, 40, 50];
for (index, value) in values.iter().enumerate() {
println!("The value at index {} is {}", index, value);
}

// Output:
// The value at index 0 is 10
// The value at index 1 is 20
// The value at index 2 is 30
// The value at index 3 is 40
// The value at index 4 is 50

Using these techniques, you can easily iterate and manipulate elements in vectors!

Modifying a Vector

The versatility of Rust’s vector lies in its ability to resize dynamically, allowing for the addition or removal of elements during runtime. This section will explore different approaches to modifying and updating vectors within Rust.

Adding elements

Add an element to a vector (Image by author)

We can add elements to a vector using the push method, which appends an element to the end of the vector:

let mut v = vec!["apple", "banana", "orange"];

v.push("mango");

println!("{:?}", v);

// Output:
// ["apple", "banana", "orange", "mango"]

The given example involves the creation of a three-element vector, followed by appending “mango” to its end with a push operation. Finally, we display the modified vector on the terminal via the println! macro. Alternatively, we can use the insert method to add an element at a specific index:

let mut v = vec!["apple", "mango", "banana", "orange"];

v.insert(v.len(), "mango");

println!("{:?}", v);

// Output:
// ["apple", "mango", "banana", "orange", "mango"]

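The above example creates a four-element vector and then adds “mango” with the insert method. Because the index passed is v.len(), the new element lands at the end of the vector, exactly where push would place it; a smaller index would shift all later elements one position to the right. Finally, we display the modified vector on the terminal through the println! macro.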
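For completeness, here is a small sketch of inserting in the middle of a vector rather than at its end. The helper insert_at is a hypothetical wrapper written for this illustration; note that Vec::insert panics if the index is greater than the vector’s length.

```rust
/// Hypothetical helper: insert `item` at `index` and return the vector.
/// Panics if `index > v.len()`, just like `Vec::insert`.
fn insert_at<'a>(mut v: Vec<&'a str>, index: usize, item: &'a str) -> Vec<&'a str> {
    // Elements at `index` and beyond are shifted one slot to the right.
    v.insert(index, item);
    v
}

fn main() {
    let v = vec!["apple", "banana", "orange"];
    let v = insert_at(v, 1, "mango");
    // Prints ["apple", "mango", "banana", "orange"]
    println!("{:?}", v);
}
```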

Modifying Elements

To alter the elements of a string vector, we can utilize the index operator [] to reach out for an element at a particular position and substitute it with a new value. This approach is highly effective in modifying values within a given vector.

let mut v = vec!["apple", "banana", "orange"];

v[1] = "pear";
v[2] = "grapefruit";

println!("{:?}", v);

// Output:
// ["apple", "pear", "grapefruit"]

The given example involves the creation of a vector v comprising three elements, followed by the alteration of its second element (located at index 1) to “pear” and assigning “grapefruit” as the value for the third one (at index 2). Finally, we display this updated version on the terminal through the println! macro.

Removing Elements

Removing an element from a vector (Image by author)

We can remove an element from a vector using the pop() method, which removes and returns the last element of the vector:

let mut v = vec!["apple", "banana", "orange", "mango"];

let removed_element = v.pop();

println!("Removed element: {:?}", removed_element.unwrap());
println!("{:?}", v);

// Output:
// Removed element: "mango"
// ["apple", "banana", "orange"]

In the example above, we created a four-element vector called v and then removed the last element using the pop method. This method also provides us with the removed component as output. Finally, we used the println! macro to display both our updated vector and extracted element on the terminal screen in an orderly manner.

We can also use the remove method to remove an element at a specific index:

let mut v = vec!["apple", "banana", "orange", "mango"];

let removed_element = v.remove(2);

println!("Removed element: {}", removed_element);
println!("{:?}", v);

// Output
// Removed element: orange
// ["apple", "banana", "mango"]

To remove every occurrence of a value from a vector, use the retain method, which keeps only the elements for which the given closure returns true:

let mut v = vec!["A", "warm", "fall", "warm", "day"];
let elem = "warm"; // element to remove
v.retain(|x| *x != elem);
println!("{:?}", v);

// Output:
// ["A", "fall", "day"]

Concatenating Two Vectors

To concatenate two vectors of strings, we can use the extend method, which takes an iterator as an argument and appends all its elements to the vector:

let mut v1 = vec!["apple", "banana"];
let v2 = vec!["orange", "mango"];

v1.extend(v2);

println!("{:?}", v1);

// Output:
// ["apple", "banana", "orange", "mango"]

In the example above, we first create two vectors v1 and v2, then we concatenate them by calling the extend method on v1 and passing v2 as an argument. Note that v2 is moved into extend and can no longer be used afterwards.
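If both vectors need to remain usable after concatenation, one option is to copy from borrowed slices instead. The sketch below assumes that cloning the elements is acceptable; the helper name concat is hypothetical.

```rust
/// Hypothetical helper: return a new vector holding `a` followed by `b`,
/// leaving both inputs untouched.
fn concat<'a>(a: &[&'a str], b: &[&'a str]) -> Vec<&'a str> {
    let mut out = a.to_vec(); // copy the first slice into an owned Vec
    out.extend_from_slice(b); // append clones of the second slice
    out
}

fn main() {
    let v1 = vec!["apple", "banana"];
    let v2 = vec!["orange", "mango"];

    let all = concat(&v1, &v2);
    // Prints ["apple", "banana", "orange", "mango"]
    println!("{:?}", all);

    // v1 and v2 were only borrowed, so they are still available here.
    println!("{} {}", v1.len(), v2.len());
}
```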

Filter & Map Elements

We can filter and map elements of a vector in Rust using the iter, filter, and map methods.

Filter Elements

We can filter vector elements by combining the iter and filter methods. To illustrate this point, let’s consider how to filter out all even numbers from a vector of integers using the following example:

let v = vec![1, 2, 3, 4, 5, 6, 7, 8, 9, 10];
let odd_numbers: Vec<i32> = v.iter().filter(|x| *x % 2 != 0).map(|x| *x).collect();
println!("{:?}", odd_numbers);

// Output:
// [1, 3, 5, 7, 9]

In the example above, we first create a vector v with ten elements, then we use iter and filter methods to create a new vector odd_numbers that contains only the odd numbers from v. Finally, we print the new vector to the terminal using the println! macro.

Map Elements

To map elements of a vector, we can use the iter and map methods together. For example, to convert a vector of strings to uppercase:

let v = vec!["hello", "world", "rust"];
let uppercase_strings: Vec<String> = v.iter().map(|x| x.to_uppercase()).collect();
println!("{:?}", uppercase_strings);

// Output
// ["HELLO", "WORLD", "RUST"]

In the example above, we first create a vector v with three elements, then we use iter and map methods to create a new vector uppercase_strings that contains the uppercase versions of the elements in v. Finally, we print the new vector to the console using the println! macro.

Length

To compute the length, we can use the len method:

let v = vec!["hello", "world", "rust"];
println!("Size: {}", v.len());

// Output
// Size: 3

Check If Element Exists

We can use contains to check if a vector contains a specific element:

let v = vec!["hello", "world", "rust"];
println!("{}", v.contains(&"hello"));

// Output
// true

Note that contains takes a reference to the value to look for, hence the & in the argument. The compiler will tell you to add this symbol if you forget.

Reversing Elements

We can reverse a vector in Rust using the reverse method. This method modifies the vector in place, so it doesn’t return anything.

let mut v = vec![1, 2, 3, 4, 5];
v.reverse();
println!("{:?}", v);

// Output:
// [5, 4, 3, 2, 1]

In the example above, a vector v consisting of five elements is created, and then the reverse method is employed to alter the sequence of these components in place. Finally, we display the reversed vector on the terminal for observation.

Maximum & Minimum Elements

By utilizing Rust’s iter function alongside the max and min methods, one can effortlessly locate both the highest and lowest values within a vector. This approach is highly effective in simplifying such operations with ease.

let v = vec![1, 2, 3, 4, 5];
let max_element = *v.iter().max().unwrap();
let min_element = *v.iter().min().unwrap();
println!("Max element: {}", max_element);
println!("Min element: {}", min_element);

// Output
// Max element: 5
// Min element: 1

In the example above, we initialized a vector v of five elements. Subsequently, the iter method is employed to create an iterator which helps us determine the maximum and minimum values by utilizing max and min methods. Finally, using println!, we display both these results on the console screen.
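Two details are worth spelling out. First, max and min return an Option because the vector may be empty, which is why the example calls unwrap. Second, these methods require the element type to implement Ord, which floating point types do not. The sketch below shows one common workaround for f64 values; the helper name max_f64 is hypothetical.

```rust
/// Hypothetical helper: largest value in a slice of floats, or None if empty.
fn max_f64(values: &[f64]) -> Option<f64> {
    // f64 is only PartialOrd (because of NaN), so Iterator::max is not
    // available; reducing with f64::max is one common workaround.
    values.iter().copied().reduce(f64::max)
}

fn main() {
    let empty: Vec<i32> = Vec::new();
    // Prints None: there is no maximum in an empty vector.
    println!("{:?}", empty.iter().max());

    // Prints Some(3.25)
    println!("{:?}", max_f64(&[1.5, 3.25, 2.0]));
}
```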

Now that you have a solid foundation for using and manipulating vectors, let’s look at another built-in collection: arrays.

2. Arrays

Rust array memory layout (Image by author)

An array is a good option for storing multiple values of the same data type. As with vectors, every element of an array must have the same type. Unlike vectors, arrays are fixed-size collections: their length is part of their type and is known at compile time. This makes them useful when you want the data to live on the stack or know that the size will remain constant throughout the runtime.

Creating an array

To create an array, you can use square brackets [] with comma-separated values:

let days = ["Sunday", "Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday"];

You can also explicitly specify the number of elements in the array and their types, like so:

let a: [i32; 5] = [1, 2, 3, 4, 5];

Using this syntax, an array consisting of i32 values with a length equivalent to 5 can be formed. In order to set all elements within this array to one typical value, you may employ the following method:

let zeros = [0; 5];

This creates an array of length 5, with all the elements initialized to 0.

Accessing Elements

You can access individual elements of an array using square brackets with the index of the element:

let numbers = [1, 2, 3, 4, 5];
println!("{}", numbers[2]);

// Output:
// 3

Modifying Elements

Since arrays have a fixed size, you cannot push or remove elements like vectors. However, you can modify individual elements by making the array mutable using the mut keyword so that you can change its elements:

let mut numbers = [1, 2, 3, 4, 5];
numbers[1] = 10;
println!("{:?}", numbers);

// Output:
// [1, 10, 3, 4, 5]

Iterating

To visit every element of an array, we can traverse it with a loop instead of accessing indices one at a time. The for loops below show three equivalent ways to print each value of an array of string slices.

let seasons = ["Winter", "Spring", "Summer", "Fall"];
for season in seasons {
println!("{season}");
}
// or
for index in 0..seasons.len() {
println!("{}", seasons[index]);
}
// or
for season in seasons.iter() {
println!("{}", season);
}

Slicing Arrays

You can also borrow a contiguous subset of an array as a slice:

let numbers = [1, 2, 3, 4, 5];
let slice = &numbers[1..4];
println!("{:?}", slice);

// Output:
// [2, 3, 4]

This creates a slice, a borrowed view of the original array’s elements at indices 1, 2, and 3 (the end of the range is exclusive). No elements are copied.
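The following sketch illustrates the difference between borrowing a slice and making an owned copy of it. The helper name middle_sum is hypothetical and exists only for this example.

```rust
/// Hypothetical helper: sum of the elements at indices 1, 2 and 3.
fn middle_sum(numbers: &[i32; 5]) -> i32 {
    let slice: &[i32] = &numbers[1..4]; // borrows from `numbers`, no copy
    slice.iter().sum()
}

fn main() {
    let numbers = [1, 2, 3, 4, 5];
    let slice = &numbers[1..4];

    // A slice knows its own length: prints 3
    println!("{}", slice.len());

    // to_vec() copies the borrowed elements into an owned Vec.
    let owned: Vec<i32> = slice.to_vec();
    // Prints [2, 3, 4]
    println!("{:?}", owned);

    // Prints 9
    println!("{}", middle_sum(&numbers));
}
```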

To sum up, Rust arrays are versatile data structures that serve numerous purposes. Their fixed-size nature renders them more effective than vectors in specific scenarios. When you have the array size predetermined and no need for runtime modifications, employing arrays is an ideal choice for storing data.

3. Tuples

A tuple is a compound type in Rust that allows you to group several values with varying types into one object. Regarding size, tuples are fixed and cannot be resized once declared, much like arrays.

Creating a Tuple

In Rust, creating a tuple is an effortless task. Just enclose your values in parentheses and separate them with commas. Each position within the tuple has its type, which may differ from one another without any constraints on the uniformity of the types among all elements present in it.

let person = ("Mahmoud", 22, true, 6.6);

When creating a tuple, it is possible to incorporate optional type annotations. This can be observed in the example below:

let person: (&str, i32, bool, f64) = ("Mahmoud", 22, false, 6.6);

Updating a Tuple

Adding the mut keyword to the binding makes a tuple mutable, so that its elements can be changed:

let mut person = ("Mahmoud", 22, true);

You can modify its elements by utilizing the dot notation followed by the corresponding element index.

person.1 = 21;
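Put together, the two steps above can be sketched as a small runnable program. The helper name with_age is hypothetical; it takes the tuple by value, updates the field at index 1, and returns the result.

```rust
/// Hypothetical helper: return the tuple with its age field replaced.
fn with_age(mut person: (&str, i32, bool), age: i32) -> (&str, i32, bool) {
    person.1 = age; // dot notation plus index selects the second field
    person
}

fn main() {
    let person = ("Mahmoud", 22, true);
    let updated = with_age(person, 21);
    // Prints ("Mahmoud", 21, true)
    println!("{:?}", updated);
}
```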

Destructuring a Tuple

Extracting the components of a tuple and assigning them to separate variables is known as destructuring, as demonstrated in the following example.

let (name, age, is_male) = ("Mahmoud", 22, true);
println!("Name: {}, Age: {}, Gender: {}", name, age, if is_male { "Male" } else { "Female" });

// Output
// Name: Mahmoud, Age: 22, Gender: Male

We can also ignore some of the elements of the tuple while destructuring:

let (_, _, _, height) = ("Mahmoud", 22, false, 6.6);
println!("Height: {}", height);

// Output
// Height: 6.6

In addition, we can access a specific element in a tuple using dot notation with the element’s index:

let person = ("Mahmoud", 3, true, 6.0);
println!("Experience: {}", person.1);

// Output
// Experience: 3

In summary, tuples are a powerful way to group together values with different types into one object in Rust. They are fixed in size and immutable by default, but a binding can be declared mut to modify their contents. You can also destructure tuples to access their elements. With these features, tuples are a versatile tool for working with data in Rust!

4. Hash Sets

If you are familiar with Python, sets may already be a known concept. These collections consist of distinct elements. In Rust, sets come in two flavors: HashSet, which has no guaranteed order, and BTreeSet, which keeps its elements sorted; the former is more frequently employed in practice.

Creating a Set

Creating a hash set in Rust is as simple as importing it from the standard library and calling its associated function new:

use std::collections::HashSet;
let mut my_set: HashSet<i32> = HashSet::new();

You can also create a set from a vector of elements:

let my_vector = vec![1, 2, 3, 4];
let my_set: HashSet<i32> = my_vector.into_iter().collect();

You can even initialize it from an array:

let a = HashSet::from([1, 2, 3]);

Updating a Set

Adding elements

Adding elements to a hash set is easy with the insert method:

let mut my_set: HashSet<i32> = HashSet::new();
my_set.insert(1);
my_set.insert(2);
my_set.insert(3);
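The insert method also returns a bool: true if the value was newly added, false if it was already in the set. The sketch below uses that return value to detect a repeated element; the helper name first_duplicate is hypothetical.

```rust
use std::collections::HashSet;

/// Hypothetical helper: first value that appears a second time, if any.
fn first_duplicate(values: &[i32]) -> Option<i32> {
    let mut seen = HashSet::new();
    for &v in values {
        // insert returns false when the value is already in the set.
        if !seen.insert(v) {
            return Some(v);
        }
    }
    None
}

fn main() {
    // Prints Some(2)
    println!("{:?}", first_duplicate(&[1, 2, 3, 2, 1]));
    // Prints None
    println!("{:?}", first_duplicate(&[1, 2, 3]));
}
```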

Removing elements

Removing elements from a hash set is done using the remove method:

let mut my_set = HashSet::from([1, 2, 3, 4]);
my_set.remove(&2); // removes 2 from the set
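Like insert, remove reports what happened: it returns true only if the value was present. Together with contains, this allows membership checks without extra bookkeeping. A minimal sketch follows, with remove_all as a hypothetical helper name.

```rust
use std::collections::HashSet;

/// Hypothetical helper: remove each of `values` from `set` and
/// return how many of them were actually present.
fn remove_all(set: &mut HashSet<i32>, values: &[i32]) -> usize {
    // remove returns true only when the value was in the set.
    values.iter().filter(|v| set.remove(*v)).count()
}

fn main() {
    let mut my_set = HashSet::from([1, 2, 3, 4]);
    println!("{}", my_set.contains(&2)); // prints true

    // 2 is present, 9 is not, so exactly one element is removed.
    let removed = remove_all(&mut my_set, &[2, 9]);
    println!("{}", removed); // prints 1

    println!("{}", my_set.contains(&2)); // prints false
}
```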

Iterate over Sets

You can easily iterate over a hash set using a for loop:

let my_set = HashSet::from([1, 2, 3]);

for element in &my_set {
println!("{}", element);
}

// Output(not ordered):
// 1
// 3
// 2

Sets Operations

Different set operations (Image by author)

Rust’s hash sets offer an array of set operations, including difference, intersection, and union functions. These functionalities enable us to execute set arithmetic on hash sets which makes them a valuable resource for storing unique data. To illustrate this point, let’s consider the following example:

use std::collections::HashSet;

let set_a = HashSet::from([1, 2, 3]);
let set_b = HashSet::from([4, 2, 3, 4]);

// elements in set_a that are not in set_b
let difference_set = set_a.difference(&set_b);

// elements common to both set_a and set_b
let intersection = set_a.intersection(&set_b);

// elements in either set_a or set_b
let union_set = set_a.union(&set_b);

for element in difference_set {
println!("{}", element);
}

// Output:
// 1

for element in intersection {
println!("{}", element);
}

// Output:
// 3
// 2

for element in union_set {
println!("{}", element);
}

// Output:
// 3
// 2
// 1
// 4
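Since hash sets have no guaranteed iteration order, the printed order above can differ between runs. When a stable order matters, one option is to collect the result into a vector and sort it, as in this sketch (the helper name sorted is hypothetical):

```rust
use std::collections::HashSet;

/// Hypothetical helper: collect borrowed set elements into a sorted Vec.
fn sorted<'a>(items: impl Iterator<Item = &'a i32>) -> Vec<i32> {
    let mut out: Vec<i32> = items.copied().collect();
    out.sort();
    out
}

fn main() {
    let set_a = HashSet::from([1, 2, 3]);
    let set_b = HashSet::from([4, 2, 3, 4]); // the duplicate 4 is stored once

    println!("{:?}", sorted(set_a.difference(&set_b))); // prints [1]
    println!("{:?}", sorted(set_a.intersection(&set_b))); // prints [2, 3]
    println!("{:?}", sorted(set_a.union(&set_b))); // prints [1, 2, 3, 4]
}
```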

In essence, hash sets are an indispensable asset that every Rust developer has to familiarize themselves with. They possess remarkable efficiency and offer plenty of operations for set arithmetic. Having been equipped with the illustrations provided, you should now be able to incorporate hash sets into your personal Rust projects.

For more info, you can refer to the official doc.

5. Hash Maps

Hash Map (image by author)

Hash Maps are collections of key-value pairs that offer quick access to data by key instead of by index. Rust provides them as the HashMap type in the std::collections module; it is an unordered structure with fast lookups. Let’s look at how to create, update, access, and iterate over Hash Maps in Rust.

Creating a Hash Map

You can initialize a Hash Map in Rust in a number of ways, one of which is by using the new method of the Hash Map struct.

use std::collections::HashMap;

let mut employees_map = HashMap::new();

// Insert elements to the HashMap
employees_map.insert("Mahmoud", 1);
employees_map.insert("Ferris", 2);

// Print the HashMap
println!("{:?}", employees_map);

// Output:
// {"Mahmoud": 1, "Ferris": 2}

In the given instance, we introduce a new Hash Map named employees_map. Subsequently, utilizing the insert function, we add elements to this Hash Map. Lastly, by applying the println! macro and formatting it with {:?}, we exhibit debug mode representation of our created HashMap. Another way to initialize a HashMap is by using the HashMap::from method.

use std::collections::HashMap;

let employees_map: HashMap<i32, &str> = HashMap::from([
(1, "Mahmoud"),
(2, "Ferris"),
]);

Updating a Hash Map

Adding Elements

As we have seen in the previous example, we can use the insert method to add elements (key-value pairs) to a Hash Map. For example:

use std::collections::HashMap;

let mut employees_map = HashMap::new();

// Insert elements to the HashMap
employees_map.insert("Mahmoud", 1);
employees_map.insert("Ferris", 2);

// Print the HashMap
println!("{:?}", employees_map);

// Output:
// {"Mahmoud": 1, "Ferris": 2}

Removing Elements

We can use the remove method to remove an element (key-value pair) from a Hash Map. For example:

use std::collections::HashMap;

let mut employees_map: HashMap<i32, String> = HashMap::new();

// insert elements to hashmap
employees_map.insert(1, String::from("Mahmoud"));

// remove elements from hashmap
employees_map.remove(&1);

Updating an Element

We can update elements of a Hash Map by using the insert method. For example:

let mut employees_map: HashMap<i32, String> = HashMap::new();

// insert elements to hashmap
employees_map.insert(1, String::from("Mahmoud"));

// update the value of the element with key 1
employees_map.insert(1, String::from("Ferris"));
println!("{:?}", employees_map);

// Output:
// {1: "Ferris"}

Access Values

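As with Python dictionaries, we can use the get method to access a value in a Hash Map. It takes a reference to the key and returns an Option: Some with a reference to the value if the key exists, or None otherwise. For example: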

use std::collections::HashMap;

let employees_map: HashMap<i32, &str> = HashMap::from([
(1, "Mahmoud"),
(2, "Ferris"),
]);

let first_employee = employees_map.get(&1);
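Because get returns an Option, the missing-key case has to be handled somewhere. One possible approach, sketched below, is to fall back to a default value; the helper name name_or_unknown and the "unknown" placeholder are assumptions made for this illustration.

```rust
use std::collections::HashMap;

/// Hypothetical helper: the name stored under `id`, or "unknown".
fn name_or_unknown<'a>(map: &HashMap<i32, &'a str>, id: i32) -> &'a str {
    // get yields Option<&&str>; copied turns it into Option<&str>.
    map.get(&id).copied().unwrap_or("unknown")
}

fn main() {
    let employees_map: HashMap<i32, &str> = HashMap::from([
        (1, "Mahmoud"),
        (2, "Ferris"),
    ]);

    println!("{}", name_or_unknown(&employees_map, 1)); // prints Mahmoud
    println!("{}", name_or_unknown(&employees_map, 3)); // prints unknown
}
```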

Iterate over Hash Maps

use std::collections::HashMap;

fn main() {
let mut employees_map: HashMap<i32, String> = HashMap::new();

employees_map.insert(1, String::from("Mahmoud"));
employees_map.insert(2, String::from("Ferris"));

// loop and print values of hashmap using values() method
for employee in employees_map.values() {
println!("{}", employee)
}

// print the length of hashmap using len() method
println!("Length of employees_map = {}", employees_map.len());
}

// Output:
// Ferris
// Mahmoud
// Length of employees_map = 2
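The values method visits only the values, in no guaranteed order. To visit keys and values together, the iter method yields (key, value) pairs. The sketch below also sorts the pairs by key so the output is stable; the helper name describe is hypothetical.

```rust
use std::collections::HashMap;

/// Hypothetical helper: one "id: name" line per entry, sorted by id.
fn describe(map: &HashMap<i32, String>) -> Vec<String> {
    // iter yields (&key, &value) pairs in arbitrary order.
    let mut entries: Vec<(&i32, &String)> = map.iter().collect();
    entries.sort_by_key(|&(id, _)| *id);
    entries
        .into_iter()
        .map(|(id, name)| format!("{id}: {name}"))
        .collect()
}

fn main() {
    let mut employees_map: HashMap<i32, String> = HashMap::new();
    employees_map.insert(2, String::from("Ferris"));
    employees_map.insert(1, String::from("Mahmoud"));

    // Prints "1: Mahmoud" and then "2: Ferris"
    for line in describe(&employees_map) {
        println!("{}", line);
    }
}
```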

In essence, Rust’s Hash Map is a robust data structure that facilitates the effective management and arrangement of data through key-value pairs. They offer fast access to data and are frequently used for tasks like counting occurrences, memoization, and caching. Thanks to Rust’s integrated Hash Map implementation coupled with its extensive array of techniques, utilizing Hash Maps is an effortless process devoid of complications.
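As an illustration of the counting use case mentioned above, here is a minimal sketch built on the entry API of the standard HashMap. The helper name count_words is hypothetical, and splitting on whitespace is a simplifying assumption.

```rust
use std::collections::HashMap;

/// Hypothetical helper: count how often each whitespace-separated word occurs.
fn count_words(text: &str) -> HashMap<&str, usize> {
    let mut counts = HashMap::new();
    for word in text.split_whitespace() {
        // entry(...).or_insert(0) inserts 0 for a new key and returns a
        // mutable reference to the stored count either way.
        *counts.entry(word).or_insert(0) += 1;
    }
    counts
}

fn main() {
    let counts = count_words("a warm fall warm day");
    println!("{}", counts["warm"]); // prints 2
    println!("{:?}", counts.get("cold")); // prints None
}
```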

For more info, you can refer to this page of the official docs.

As we come to the end of this first segment, let us reflect on our journey into the vast world of Rust’s built-in data structures. Our exploration has led us through some fundamental components such as vectors, arrays, tuples, hash sets, and hash maps, all crucial elements for any proficient programmer in their quest towards building robust programs.

Through our mastery of creating and accessing data structures and manipulating them with ease, we have gained valuable insights into their defining characteristics and nuances. Armed with this knowledge, you will be empowered to craft Rust code that is efficient and highly effective in achieving your desired outcomes.

Having established a firm grasp on the fundamental concepts of Rust’s built-in data structures, we shall now integrate them with the latter half of this article that delves into Ndarray. This fantastic library is famous for its prowess in numerical computation within Rust. It features an array object similar to a vector but augmented with advanced capabilities to execute mathematical operations seamlessly.

Ndarray for Data Analysis

Different arrays Dimensions (image by author)

In the following sections, we will delve into the world of ndarray: a robust Rust library for numerical computation and data manipulation. With its diverse array of methods for working with arrays and matrices containing numeric data, it is an essential asset in any data analysis toolkit. We’ll cover using ndarray from scratch, including how to work with array and matrix structures and perform mathematical operations on them. We’ll also explore indexing and slicing, which help with handling large datasets efficiently.

By following the examples throughout these sections, you can learn to use ndarray arrays effectively for your own analytical tasks!

Ndarray Intro

The ArrayBase struct provides an essential data structure, aptly named the n-dimensional array, that stores and manages large collections of data such as integers or floating point values. Compared with Rust’s native arrays or tuples, ndarray arrays support any number of dimensions, slicing into views, and element-wise mathematical operations, which makes them more convenient and efficient for numerical work.

Ndarray Use Cases

Here are some real-life use cases of ndarray in data analysis:

  • Data Cleaning and Preprocessing: Ndarray arrays are well suited to data cleaning and preprocessing, such as handling missing values, converting data types, and scaling a dataset. Suppose you have a set of records with gaps; in a floating point array, Rust’s f64::NAN (not a number) value can represent these absent entries. Element-wise methods such as mapv can then replace or transform those entries, and fill can overwrite an entire array with a single value.
  • Data Visualization: Ndarray arrays are a reliable option for data storage to facilitate visualization. The versatility of ndarray arrays allows them to be used with the Plotters library for visual representation purposes. For instance, by generating an array containing random numbers using Ndarrays, we could plot the distribution in the form of a histogram through Plotters’ plotting capabilities.
  • Descriptive Statistics: Ndarray offers methods for descriptive statistics on arrays, including the sum, mean, variance, and standard deviation. Order statistics such as quantiles (and therefore the median) are available through the companion ndarray-stats crate. These functions are invaluable in analyzing data as they provide a quick overview of key metrics. For instance, ndarray’s mean method calculates the average value within a dataset.
  • Machine Learning: Ndarray is a crucial component in machine learning, offering speedy and effective manipulation of large datasets. Numerical data must often be expressed as arrays for use with these algorithms, making ndarray an ideal solution due to its ease of use and efficiency. With this tool, we can effortlessly generate feature and label arrays that are essential for the success of any given machine-learning algorithm.
  • Linear Algebra: Ndarray supports linear algebra operations such as matrix and vector multiplication through its dot method, while more advanced operations like matrix inversion and decomposition are provided by the companion ndarray-linalg crate. These functions are convenient when analyzing data represented as matrices or vectors. For instance, the dot method in ndarray enables us to execute matrix multiplication on two arrays.

Initial Placeholders

Ndarray offers a variety of functions for generating and initializing arrays, known as initial placeholders or array creation functions. These powerful tools enable us to create customized arrays with specific shapes and data types, complete with predetermined or randomized values. Here are some frequently utilized examples of these handy initial placeholder functions within the ndarray library:

  1. ndarray::Array::<type, _>::zeros(shape.f()): This function creates an array filled with zeros. The shape parameter specifies the array’s dimensions, and the type parameter specifies the data type of the array elements. The optional f method, provided by the ShapeBuilder trait, requests column-major (Fortran) memory layout instead of the default row-major layout.
  2. ndarray::Array::<type, _>::ones(shape.f()): This function creates an array filled with ones. The type parameter and the f method have the same effect as for ndarray::Array::zeros.
  3. ndarray::Array::<type, _>::range(start, end, step): This function creates a 1D array with values in a range. The start parameter specifies the range’s start, and the end parameter specifies the end of the range (exclusive). The step parameter specifies the step size between values. The type parameter specifies the data type of the array elements.
  4. ndarray::Array::<type, _>::linspace(start, end, n): This function creates a 1D array of n values evenly spaced between start and end. Both endpoints are included. The type parameter specifies the data type of the array elements.
  5. array.fill(value): Unlike the constructors above, fill is a method called on an existing mutable array. It overwrites every element with the given value.
  6. ndarray::Array::<type, _>::eye(n): This function creates a square identity matrix with ones on the diagonal and zeros elsewhere. The n parameter specifies the number of rows and columns. The type parameter has the same meaning as for ndarray::Array::zeros.
  7. ndarray::Array::<type, _>::random(shape.f(), distribution): This function creates an array with random values drawn from a given distribution. The shape parameter specifies the dimensions of the array. It is provided by the RandomExt trait of the separate ndarray-rand crate.
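To make the row-major versus column-major distinction more concrete, here is a minimal plain-Rust sketch of how strides could be derived from a shape under each layout. The helper names c_strides and f_strides are hypothetical and are not part of ndarray; the sketch also ignores special cases such as zero-length axes, which a real library has to handle.

```rust
/// Strides (counted in elements) for a row-major (C order) layout:
/// the last axis is contiguous in memory.
fn c_strides(shape: &[usize]) -> Vec<usize> {
    let mut strides = vec![1; shape.len()];
    // Walk from the second-to-last axis back to the first one.
    for axis in (0..shape.len().saturating_sub(1)).rev() {
        strides[axis] = strides[axis + 1] * shape[axis + 1];
    }
    strides
}

/// Strides for a column-major (Fortran order) layout:
/// the first axis is contiguous in memory.
fn f_strides(shape: &[usize]) -> Vec<usize> {
    let mut strides = vec![1; shape.len()];
    // Walk from the second axis forward to the last one.
    for axis in 1..shape.len() {
        strides[axis] = strides[axis - 1] * shape[axis - 1];
    }
    strides
}
```

For a shape of [2, 5], the row-major helper yields strides of [5, 1] while the column-major helper yields [1, 2]. These are the kinds of values that appear in the strides field when an ndarray array is printed with {:?}.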

These initial placeholder functions are highly valuable for generating and initializing arrays in ndarray. They offer a hassle-free approach to creating collections of diverse shapes and data types, allowing the user to specify specific or random values. Here’s a simple Rust program example to showcase the various placeholders available within ndarray.

use ndarray::{Array, ShapeBuilder};
use ndarray_rand::RandomExt;
use ndarray_rand::rand_distr::Uniform;

// Zeros

let zeros = Array::<f64, _>::zeros((1, 4).f());
println!("{:?}", zeros);

// Output:
// [[0.0, 0.0, 0.0, 0.0]], shape=[1, 4], strides=[1, 1], layout=CFcf (0xf), const ndim=2

// Ones

let ones = Array::<f64, _>::ones((1, 4));
println!("{:?}", ones);

// Output:
// [[1.0, 1.0, 1.0, 1.0]], shape=[1, 4], strides=[4, 1], layout=CFcf (0xf), const ndim=2

// Range

let range = Array::<f64, _>::range(0., 5., 1.);
println!("{:?}", range);

// Output:
// [0.0, 1.0, 2.0, 3.0, 4.0], shape=[5], strides=[1], layout=CFcf (0xf), const ndim=1

// Linspace

let linspace = Array::<f64, _>::linspace(0., 5., 5);
println!("{:?}", linspace);

// Output:
// [0.0, 1.25, 2.5, 3.75, 5.0], shape=[5], strides=[1], layout=CFcf (0xf), const ndim=1

// Fill

let mut ones = Array::<f64, _>::ones((1, 4));
ones.fill(0.);
println!("{:?}", ones);

// Output:
// [[0.0, 0.0, 0.0, 0.0]], shape=[1, 4], strides=[4, 1], layout=CFcf (0xf), const ndim=2

// Eye

let eye = Array::<f64, _>::eye(4);
println!("{:?}", eye);

// Output:
// [[1.0, 0.0, 0.0, 0.0],
// [0.0, 1.0, 0.0, 0.0],
// [0.0, 0.0, 1.0, 0.0],
// [0.0, 0.0, 0.0, 1.0]], shape=[4, 4], strides=[4, 1], layout=Cc (0x5), const ndim=2

// Random

let random = Array::random((2, 5), Uniform::new(0., 10.));
println!("{:?}", random);

// Output:
// [[9.375493735188611, 4.088737328406999, 9.778579742815943, 0.5225866490310649, 1.518053969762827],
// [9.860829919571666, 2.9473768443117, 7.768332993584486, 7.163926861520167, 9.814750664983297]], shape=[2, 5], strides=[5, 1], layout=Cc (0x5), const ndim=2
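The difference between range and linspace is easy to mix up, so here is a rough plain-Rust sketch of what each one computes. The functions range_f64 and linspace_f64 are illustrative names, not ndarray APIs, and the sketch assumes a positive step.

```rust
/// Values from `start` up to, but not including, `end`, spaced by `step`.
/// Assumption for this sketch: `step` is strictly positive.
fn range_f64(start: f64, end: f64, step: f64) -> Vec<f64> {
    assert!(step > 0.0, "this sketch only supports a positive step");
    // Number of values that fit before reaching `end`.
    let n = ((end - start) / step).ceil().max(0.0) as usize;
    (0..n).map(|i| start + step * i as f64).collect()
}

/// `n` evenly spaced values from `start` to `end`, both endpoints included.
fn linspace_f64(start: f64, end: f64, n: usize) -> Vec<f64> {
    if n == 0 {
        return Vec::new();
    }
    if n == 1 {
        return vec![start];
    }
    // n values means n - 1 gaps between them.
    let step = (end - start) / (n - 1) as f64;
    (0..n).map(|i| start + step * i as f64).collect()
}
```

With these definitions, range_f64(0., 5., 1.) stops at 4.0, while linspace_f64(0., 5., 5) ends exactly at 5.0, mirroring the two outputs shown in the program above.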

Multidimensional Arrays

Ndarray can build arrays with multiple dimensions, such as 2D matrices and 3D arrays. We can generate them using the from_vec function for 1D data, the from_shape_vec function along with a shape and a flat vector, or the array! macro. The following example program shows how ndarray creates arrays across various dimensions.

use ndarray::{array, Array, Array2, Array3, ShapeBuilder};

// 1D array
let array_d1 = Array::from_vec(vec![1., 2., 3., 4.]);
println!("{:?}", array_d1);

// Output:
// [1.0, 2.0, 3.0, 4.0], shape=[4], strides=[1], layout=CFcf (0xf), const ndim=1

// or

let array_d11 = Array::from_shape_vec((1, 4), vec![1., 2., 3., 4.]);
println!("{:?}", array_d11.unwrap());

// Output:
// [[1.0, 2.0, 3.0, 4.0]], shape=[1, 4], strides=[4, 1], layout=CFcf (0xf), const ndim=2

// 2D array

let array_d2 = array![
[-1.01, 0.86, -4.60, 3.31, -4.81],
[ 3.98, 0.53, -7.04, 5.29, 3.55],
[ 3.30, 8.26, -3.89, 8.20, -1.51],
[ 4.43, 4.96, -7.66, -7.33, 6.18],
[ 7.31, -6.43, -6.16, 2.47, 5.58],
];

// or

let array_d2 = Array::from_shape_vec((2, 2), vec![1., 2., 3., 4.]);
println!("{:?}", array_d2.unwrap());

// Output:
// [[1.0, 2.0],
// [3.0, 4.0]], shape=[2, 2], strides=[2, 1], layout=Cc (0x5), const ndim=2

// or

let data = vec![1., 2., 3., 4.];
let array_d21 = Array2::from_shape_vec((2, 2), data);

// 3D array

let data = vec![1., 2., 3., 4.];
let array_d3 = Array3::from_shape_vec((2, 2, 1), data);
// from_shape_vec returns a Result, so unwrap it before printing
println!("{:?}", array_d3.unwrap());

// Output:
// [[[1.0],
// [2.0]],
// [[3.0],
// [4.0]]], shape=[2, 2, 1], strides=[2, 1, 1], layout=Cc (0x5), const ndim=3
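The calls above return a Result because the shape has to account for every element of the flat vector. As a minimal sketch of that idea, the hypothetical Grid type below (not part of ndarray) validates the shape and then looks elements up in row-major order.

```rust
/// A tiny stand-in for a 2D array: a flat buffer plus a (rows, cols) shape.
#[derive(Debug)]
struct Grid {
    data: Vec<f64>,
    rows: usize,
    cols: usize,
}

impl Grid {
    /// Fails when rows * cols does not match the number of elements.
    fn from_shape_vec(shape: (usize, usize), data: Vec<f64>) -> Result<Grid, String> {
        let (rows, cols) = shape;
        if rows * cols != data.len() {
            return Err(format!(
                "shape ({}, {}) needs {} elements, got {}",
                rows,
                cols,
                rows * cols,
                data.len()
            ));
        }
        Ok(Grid { data, rows, cols })
    }

    /// Row-major lookup: element (row, col) lives at `row * cols + col`.
    fn get(&self, row: usize, col: usize) -> Option<f64> {
        if row < self.rows && col < self.cols {
            Some(self.data[row * self.cols + col])
        } else {
            None
        }
    }
}
```

Building a (2, 2) grid from four values succeeds, whereas asking for a (2, 3) grid from the same four values returns an Err, which is the situation the unwrap calls above would panic on.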

Ndarray Arrays Manipulation

In this section, we will delve into the diverse techniques of altering ndarray arrays, such as indexing, slicing, and reshaping.

Indexing & Slicing

Array Slicing (Image by author)

Ndarray offers impressive capabilities through indexing and slicing features, enabling us to access and modify individual elements or subarrays within an array. Like Python lists, indexing in an ndarray involves using index values to retrieve specific elements from the array. As a demonstration of this functionality, consider accessing the second element of an array with code like so:

let array_d1 = Array::from_vec(vec![1., 2., 3., 4.]);
println!("{:?}", array_d1[1]);

// Output:
// 2.0

Multidimensional arrays also support indexing and slicing, not just 1D arrays. To illustrate this point, consider the code below which retrieves an element from a 2D array by specifying its row and column coordinates:

let zeros = Array2::<f64>::zeros((2, 4).f());
// Index a 2D array with a [row, column] pair
println!("{:?}", zeros[[1, 1]]);

Slicing is a powerful technique that enables us to extract a subarray from an array. Instead of indexing with square brackets, we call the slice method with the s! macro, which accepts ranges written with two periods (..) to mark the start and end points of the slice. To illustrate this method in action, consider the following code, which creates a view containing only the first three elements of the array:

use ndarray::{s, Array};

let array_d1 = Array::<i32, _>::from_vec(vec![1, 2, 3, 4]);
// The end of the range is exclusive, so this selects indices 0, 1, and 2
let slice = array_d1.slice(s![0..3]);
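For readers coming from plain Rust slices, the following sketch shows the same idea without ndarray. The helper names head and slice_with_step are hypothetical. Note one difference: slice_with_step copies the selected elements into a new vector, whereas ndarray's slice returns a view and its s! macro can also express a step, as in s![0..4;2].

```rust
/// Borrow the first `n` elements without copying them (clamped to the length).
fn head(data: &[i32], n: usize) -> &[i32] {
    &data[..n.min(data.len())]
}

/// Copy the elements at `start`, `start + step`, ... that lie below `end`.
/// Panics if the range is out of bounds or if `step` is zero.
fn slice_with_step(data: &[i32], start: usize, end: usize, step: usize) -> Vec<i32> {
    data[start..end].iter().step_by(step).copied().collect()
}
```

Here head(&[1, 2, 3, 4], 3) borrows the first three elements, and slice_with_step(&[1, 2, 3, 4], 0, 4, 2) picks every second element.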

Reshaping

Reshaping is a technique of altering the configuration or arrangement of an array while retaining its data. The ndarray library offers a range of powerful functions to reshape arrays, such as flatten and, most notably, reshape.

Reshape

Array reshaping (image by author)

With the reshape function, which can only be applied on ArcArray, you can modify an array’s shape by defining the number of rows and columns for its new configuration. For example, the following code snippet transforms a 1D array with four elements into a 2D one consisting of two rows and two columns:

use ndarray::rcarr1;
let array_d1 = rcarr1(&[1., 2., 3., 4.]); // another way to create a 1D array
let array_d2 = array_d1.reshape((2, 2));
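Conceptually, reshaping keeps the elements in the same order and only changes how they are grouped. The sketch below shows this with nested vectors in plain Rust; reshape_2d is a hypothetical helper, and unlike ndarray it copies the data into rows rather than reinterpreting the same buffer.

```rust
/// Regroup a flat slice into `rows` rows of `cols` elements each.
/// Returns None when the requested shape does not match the element count.
fn reshape_2d(flat: &[f64], rows: usize, cols: usize) -> Option<Vec<Vec<f64>>> {
    if rows * cols != flat.len() {
        return None;
    }
    if cols == 0 {
        // No elements at all: produce `rows` empty rows.
        return Some(vec![Vec::new(); rows]);
    }
    // Each consecutive chunk of `cols` elements becomes one row.
    Some(flat.chunks(cols).map(|row| row.to_vec()).collect())
}
```

Calling reshape_2d(&[1., 2., 3., 4.], 2, 2) groups the values into the rows [1.0, 2.0] and [3.0, 4.0], while a shape that does not multiply out to four elements yields None.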

Flatten

Array flattening (Image by author)

The ndarray_linalg::convert::flatten function produces a 1D array containing all the elements from the source array. However, it generates a new copy of data instead of mutating the original collection. This approach ensures distinctness between both arrays and avoids any potential confusion or errors arising from overlapping arrays.

use ndarray::{array, Array2};
use ndarray_linalg::convert::flatten;

let array_d2: Array2<f64> = array![[3., 2.], [2., -2.]];
let array_flatten = flatten(array_d2);
print!("{:?}", array_flatten);

// Output:
// [3.0, 2.0, 2.0, -2.0], shape=[4], strides=[1], layout=CFcf (0xf), const ndim=1

Not only does ndarray offer the ability to reshape arrays, but it also presents a range of other functions for array manipulation. These include transposing and swapping axes, among many others.

Transposing

Array transposition (Image by author)

By using the t method, we obtain a view of the array with its axes transposed; the underlying data is not copied. To illustrate this point, let’s consider the following code snippet which demonstrates how to transpose a 2D array:

// Unwrap the Result once so the array can be reused below
let array_d2 = Array::from_shape_vec((2, 2), vec![1., 2., 3., 4.]).expect("Expect 2d matrix");
println!("{:?}", array_d2);

// Output
// [[1.0, 2.0],
// [3.0, 4.0]], shape=[2, 2], strides=[2, 1], layout=Cc (0x5), const ndim=2

// t() borrows the array and returns a transposed view
let array_d2t = array_d2.t();
println!("{:?}", array_d2t);

// Output
// [[1.0, 3.0],
// [2.0, 4.0]], shape=[2, 2], strides=[1, 2], layout=Ff (0xa), const ndim=2
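Notice that the strides in the output changed from [2, 1] to [1, 2] while the data stayed in place. A minimal sketch of why that works, using a hypothetical View2 type in plain Rust (an illustration only, not how ndarray is implemented):

```rust
/// A read-only 2D view: borrowed data plus a shape and strides.
#[derive(Debug, Clone, Copy)]
struct View2<'a> {
    data: &'a [f64],
    shape: [usize; 2],
    strides: [usize; 2],
}

impl<'a> View2<'a> {
    /// Row-major view over `data` with the given number of rows and columns.
    fn row_major(data: &'a [f64], rows: usize, cols: usize) -> Self {
        assert_eq!(data.len(), rows * cols, "shape must match data length");
        View2 { data, shape: [rows, cols], strides: [cols, 1] }
    }

    /// Transpose by swapping shape and strides; the data is not touched.
    fn t(self) -> Self {
        View2 {
            data: self.data,
            shape: [self.shape[1], self.shape[0]],
            strides: [self.strides[1], self.strides[0]],
        }
    }

    /// Element lookup: each index is multiplied by the stride of its axis.
    fn get(&self, i: usize, j: usize) -> f64 {
        assert!(i < self.shape[0] && j < self.shape[1], "index out of bounds");
        self.data[i * self.strides[0] + j * self.strides[1]]
    }
}
```

Transposing a row-major 2x2 view over [1., 2., 3., 4.] leaves the buffer untouched, yet get(0, 1) on the transposed view returns 3.0 because the lookup now steps through memory with the swapped strides.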

Swapping Axes

Swapping axes in ndarray involves exchanging two dimensions of an array; for a 2D array, that means exchanging its rows and columns. This can be accomplished with either the t method, previously discussed, or ndarray’s swap_axes method. Swapping axes is a crucial aspect when conducting data analysis with multi-dimensional arrays.

It’s important to note that an axis refers to each dimension present within a multi-dimensional array. For instance, 1D arrays have only one axis, while 2D ones possess two, namely rows and columns. Similarly, 3D arrays feature three distinct axes: height, width, and depth. Axes are numbered starting from zero.

To perform such a swap with the swap_axes method, simply pass it two arguments: the indices of the two axes to exchange.

// Unwrap the Result once; the array must be mutable for swap_axes
let mut array_d2 = Array::from_shape_vec((2, 2), vec![1., 2., 3., 4.]).expect("Expect 2d matrix");
println!("{:?}", array_d2);

// Output:
// [[1.0, 2.0],
// [3.0, 4.0]], shape=[2, 2], strides=[2, 1], layout=Cc (0x5), const ndim=2

// swap_axes modifies the array in place
array_d2.swap_axes(0, 1);
println!("{:?}", array_d2);

// Output:
// [[1.0, 3.0],
// [2.0, 4.0]], shape=[2, 2], strides=[1, 2], layout=Ff (0xa), const ndim=2
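The same idea extends beyond two dimensions: swapping two axes can be modeled as swapping the corresponding entries of the shape and the strides. The sketch below uses hypothetical helpers (swapped_axes and offset) in plain Rust to illustrate this for a 3D shape; it is an illustration of the concept rather than ndarray's actual code.

```rust
/// Return the shape and strides that result from exchanging axes `a` and `b`.
fn swapped_axes(
    shape: &[usize],
    strides: &[usize],
    a: usize,
    b: usize,
) -> (Vec<usize>, Vec<usize>) {
    let mut new_shape = shape.to_vec();
    let mut new_strides = strides.to_vec();
    new_shape.swap(a, b);
    new_strides.swap(a, b);
    (new_shape, new_strides)
}

/// Position of an element in the flat buffer: sum of index * stride per axis.
fn offset(index: &[usize], strides: &[usize]) -> usize {
    index.iter().zip(strides).map(|(i, s)| i * s).sum()
}
```

For a row-major array of shape [2, 3, 4] with strides [12, 4, 1], swapping axes 0 and 2 gives shape [4, 3, 2] and strides [1, 4, 12]. The element at index [1, 2, 3] before the swap and the element at index [3, 2, 1] after it map to the same buffer offset.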

Linear Algebra

Ndarray, a feature-rich Rust library for numerical calculations and data handling, provides exceptional linear algebra support through a separate crate called ndarray-linalg. This section delves into the linear algebra functions that ndarray and ndarray-linalg offer and how they can be used to make data analysis tasks easier.

  • Matrix Multiplication: Matrix multiplication can be performed with the ArrayBase::dot method, which calculates the product of two matrices. To illustrate this, we multiply matrices a and b together and store the result in a new matrix called c.
extern crate blas_src;
use ndarray::{array, Array2};

let a: Array2<f64> = array![[3., 2.], [2., -2.]];
let b: Array2<f64> = array![[3., 2.], [2., -2.]];
let c = a.dot(&b);
print!("{:?}", c);

// Output
// [[13.0, 2.0],
// [2.0, 8.0]], shape=[2, 2], strides=[2, 1], layout=Cc (0x5), const ndim=2
  • Inversion: Another essential matrix operation is inversion, available through the inv method of the ndarray_linalg::solve::Inverse trait. For instance, to invert matrix array_d2, invoke the inv method on it and use a match statement to handle the result.
use ndarray::Array;
use ndarray_linalg::solve::Inverse;
use std::result::Result::{Err, Ok};


let array_d2 = Array::from_shape_vec((2, 2), vec![1., 2., 2., 1.]);

// expect guards the shape check of from_shape_vec; inv reports singular matrices via Err
match array_d2.expect("Shape must match the number of elements!").inv() {
Ok(inv) => {
println!("The inverse of m1 is: {}", inv);
}
Err(err) => {
println!("{err}");
}
}

// Output:
// The inverse of m1 is: [[-0.3333333333333333, 0.6666666666666666],
// [0.6666666666666666, -0.3333333333333333]]
  • Eigen Decomposition: The eig method of the ndarray_linalg::Eig trait calculates the eigenvalues and eigenvectors of a matrix. In our case, we compute them for matrix array_d2 and store them in eigs and vecs, respectively.
use ndarray::array;
use ndarray_linalg::Eig;
use std::result::Result::{Err, Ok};

let array_d2 = array![
[-1.01, 0.86, -4.60],
[ 3.98, 0.53, -7.04],
[ 3.98, 0.53, -7.04],
];
match array_d2.eig() {
Ok((eigs, vecs)) => {
println!("Eigen values: {}", eigs);
println!("Eigen vectors: {}", vecs);
}
Err(err) => {
println!("{err}");
}
}

// Output:
// Eigen values: [-3.759999999999999+2.706048780048134i, -3.759999999999999-2.706048780048134i, 0.00000000000000022759891370571733+0i]
// Eigen vectors: [[0.402993672209733+0.3965529218364603i, 0.402993672209733-0.3965529218364603i, 0.13921180485702092+0i],
// [0.5832417510526318+0.00000000000000006939572631647882i, 0.5832417510526318-0.00000000000000006939572631647882i, 0.9784706726517249+0i],
// [0.583241751052632+-0i, 0.583241751052632+0i, 0.15236540338584623+0i]]
  • Singular Value Decomposition (SVD): The svd method of the ndarray_linalg::svd::SVD trait calculates the left and right singular vectors along with the singular values of a given matrix. To illustrate this, we perform SVD on matrix array_d2: u holds the left singular vectors, sigma holds the singular values, and v holds the right singular vectors (returned in transposed form, so each row is one singular vector).
use ndarray::array;
use ndarray_linalg::svd::SVD;
use std::result::Result::{Err, Ok};

let array_d2 = array![
[-1.01, 0.86, -4.60],
[ 3.98, 0.53, -7.04],
[ 3.98, 0.53, -7.04],
];
match array_d2.svd(true, true) {
Ok((u, sigma, v)) => {
println!("The left singular vectors are: {:?}", u.unwrap());
println!("The right singular vectors are: {:?}", v.unwrap());
println!("The sigma vector: {:?}", sigma);
}
Err(err) => {
println!("{err}");
}
}

// Output:
// The left singular vectors are: [[-0.3167331446091065, -0.948514688924756, 0.0],
// [-0.6707011685937435, 0.22396415437963857, -0.7071067811865476],
// [-0.6707011685937436, 0.2239641543796386, 0.7071067811865475]], shape=[3, 3], strides=[3, 1], layout=Cc (0x5), const ndim=2
// The right singular vectors are: [[-0.4168301381758514, -0.0816682352525302, 0.9053081990455173],
// [0.8982609360852509, -0.18954008048752713, 0.39648688325344433],
// [0.13921180485702067, 0.9784706726517249, 0.1523654033858462]], shape=[3, 3], strides=[3, 1], layout=Cc (0x5), const ndim=2
// The sigma vector: [12.040590078046721, 3.051178554664221, 9.490164740574465e-18], shape=[3], strides=[1], layout=CFcf (0xf), const ndim=1
  • Matrix Trace: The trace method of the ndarray_linalg::trace::Trace trait calculates the sum of the diagonal elements of a matrix. Here we apply it to matrix array_d2 and match on the result.
use ndarray::array;
use ndarray_linalg::trace::Trace;
use std::result::Result::{Err, Ok};

let array_d2 = array![
[-1.01, 0.86, -4.60],
[ 3.98, 0.53, -7.04],
[ 3.98, 0.53, -7.04],
];
match array_d2.trace() {
Ok(value) => {
println!("The sum of diagonal elements is: {:?}", value);
}
Err(err) => {
println!("{err}");
}
}

// Output:
// The sum of diagonal elements is: -7.52
  • Matrix Determinant: The det method of the ndarray_linalg::solve::Determinant trait computes the determinant of a matrix. Here we compute it for matrix array_d2. Because the second and third rows of this matrix are identical, the exact determinant is zero, and the tiny value printed below is floating-point round-off.
use ndarray::array;
use ndarray_linalg::solve::Determinant;
use std::result::Result::{Err, Ok};

let array_d2 = array![
[-1.01, 0.86, -4.60],
[ 3.98, 0.53, -7.04],
[ 3.98, 0.53, -7.04],
];
match array_d2.det() {
Ok(value) => {
println!("The determinant of this matrix is: {:?}", value);
}
Err(err) => {
println!("{err}");
}
}

// Output:
// The determinant of this matrix is: 2.822009292913204e-15
  • Solving Linear Equations: The ndarray_linalg::Solve trait is used to solve a system of linear equations of the form ax = b. In this example, a is the coefficient matrix and b is the right-hand side vector; solve_into returns the solution, which we store in the variable x.
use ndarray::{array, Array1, Array2};
use ndarray_linalg::Solve;

// a11x0 + a12x1 = b1 ---> 3 * x0 + 2 * x1 = 1
// a21x0 + a22x1 = b2 ---> 2 * x0 - 2 * x1 = -2
let a: Array2<f64> = array![[3., 2.], [2., -2.]];
let b: Array1<f64> = array![1., -2.];
let x = a.solve_into(b).unwrap();
print!("{:?}", x);

// Output:
// [-0.2, 0.8], shape=[2], strides=[1], layout=CFcf (0xf), const ndim=1
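To see what these operations compute, it can help to write them out by hand for the 2x2 case. The following is a minimal plain-Rust sketch under that assumption; the type alias Mat2 and the function names are hypothetical, and ndarray-linalg itself delegates to LAPACK routines rather than using these closed-form formulas.

```rust
/// A 2x2 matrix stored as rows.
type Mat2 = [[f64; 2]; 2];

/// Matrix product: c[i][j] is the sum over k of a[i][k] * b[k][j].
fn matmul(a: &Mat2, b: &Mat2) -> Mat2 {
    let mut c = [[0.0; 2]; 2];
    for i in 0..2 {
        for j in 0..2 {
            for k in 0..2 {
                c[i][j] += a[i][k] * b[k][j];
            }
        }
    }
    c
}

/// Sum of the diagonal elements.
fn trace(a: &Mat2) -> f64 {
    a[0][0] + a[1][1]
}

/// Determinant of a 2x2 matrix.
fn det(a: &Mat2) -> f64 {
    a[0][0] * a[1][1] - a[0][1] * a[1][0]
}

/// Closed-form inverse; None when the matrix is singular (determinant of zero).
fn inverse(a: &Mat2) -> Option<Mat2> {
    let d = det(a);
    if d == 0.0 {
        return None;
    }
    Some([[a[1][1] / d, -a[0][1] / d], [-a[1][0] / d, a[0][0] / d]])
}

/// Solve a * x = b by multiplying b with the inverse of a.
fn solve(a: &Mat2, b: &[f64; 2]) -> Option<[f64; 2]> {
    let inv = inverse(a)?;
    Some([
        inv[0][0] * b[0] + inv[0][1] * b[1],
        inv[1][0] * b[0] + inv[1][1] * b[1],
    ])
}
```

Applied to the matrices used in this section, matmul reproduces the product [[13.0, 2.0], [2.0, 8.0]], and solve returns approximately [-0.2, 0.8] for the system above. A matrix with two identical rows has a determinant of zero, so inverse and solve return None for it.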

In this segment of the article, we delved into working with multidimensional arrays in ndarray. These arrays are a crucial component utilized across various scientific computing fields. The array! macro in ndarray enables effortless creation of multidimensional arrays, making it an invaluable tool for data management.

In addition, we have gained knowledge on how to utilize Arithmetic operations with ndarray arrays. These types of arrays are capable of supporting fundamental arithmetic functions like adding, subtracting, multiplying, and dividing. It is possible to carry out these calculations either for individual elements or the entire array simultaneously.

Finally, we delved into the realm of ndarray and its application in Linear Algebra. This dynamic tool offers a vast array of functions that enable seamless matrix operations including dot product, transpose, inverse as well as determinant. These fundamental mathematical tools are essential for tackling complex problems encountered across diverse fields such as finance, engineering, and physics.

Conclusion

Throughout this article, we delved into the fundamental data structures in Rust and demonstrated how to execute various arithmetic operations using the ndarray library. Additionally, we highlighted Rust’s potential for linear algebra, a critical component of data science.

This long-running series indicates that Rust is a language with remarkable strength and vast capabilities for building data science projects. It provides exceptional performance while making complex datasets relatively simple to handle. Those looking to pursue a promising career in data science should consider including Rust as one of their top choices.

Closing Note

Photo by Kelly Sikkema on Unsplash

As always, I want to take a moment and extend my heartfelt gratitude to everyone who has invested their efforts and time in reading this article and following along. Showcasing the capabilities of Rust and its ecosystem with you all was an absolute delight.

Being passionate about data science, I promise you that I will keep writing at least one comprehensive article every week or so on related topics. If staying updated with my work interests you, consider connecting with me on various social media platforms, or reach out directly if you need help with anything else.

Thank You!

Senior Blockchain Rust Enjoyer at GigaDAO - I occasionally write articles about data science, machine learning and Blockchain in Rust - Currently Writing Books