
How Do Computers Actually Remember?

A Budding Data Scientist's Introduction to Computer Hardware

Introduction

Have you ever wondered how a computer actually remembers things? Sure, you’ve heard that it uses RAM as "short-term" memory and your HDD or SSD is "long-term" memory. But, how do devices such as these actually hold onto information? That’s the question we are going to answer in this article.

Photo by Markus Spiske on Unsplash

As data scientists, we typically work with high-level tools. Python is great, but for the sake of simplicity it hides a lot of the nuts and bolts that make a computer work. And yet, computers are the tools of our trade – doesn’t it make sense that we should understand how they work? This article is the third in my series "A Budding Data Scientist’s Introduction to Computer Hardware". Don’t worry if you missed the first two – you don’t strictly need them to follow along, but they will help you reach a deeper level of understanding (links to them are at the bottom of this article).

The point of this series is to give new (or experienced) data scientists a glimpse at how the tools they use daily actually work, without assuming any knowledge about physics, electrical engineering, or low-level computer science concepts. If you are curious about how computers work, but don’t have much of a background in the area, then this series will be a good place for you to start.

Great, now that we have that out of the way, let’s get into the fun stuff!

Memory vs Storage: What’s the Difference?

Depending on your background, you may have heard that computers employ both memory and storage and that these two are different things. Roughly speaking, memory (or RAM) is the computer’s "short-term" memory, and storage (usually your hard drive or solid-state drive) is its "long-term" memory.

For starters, let’s clean up our vocabulary a bit. Instead of "short-term" and "long-term" memory, we will distinguish the two from here on out as "volatile" and "non-volatile" memory, respectively. Volatile memory refers to components that cannot retain data after they lose electrical power, while non-volatile memory components retain their data even in the absence of electrical power. In truth, there are many different types of volatile and non-volatile memory. This article will focus on volatile memory, the category RAM belongs to. In the next article, we will discuss non-volatile memory and compare and contrast the two types.

Volatile Memory: The Short-Term Memory of the Computer

To reiterate, volatile memory loses its data when there is no power. This means that it cannot (and should not) be used to store files long-term. You have likely heard of RAM (Random Access Memory). You can easily see how much of it is being used on your computer in Task Manager, and if you have ever tried to load one of the larger datasets we sometimes get access to as data scientists, you have probably maxed it out in the middle of a task before (always frustrating). RAM is a type of volatile memory, but it is not the only type computers employ.
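If you’d like to peek at those numbers from Python instead of Task Manager, here’s a minimal sketch – it assumes you have the third-party psutil package installed (it isn’t part of the standard library):

import psutil

# Report the same totals Task Manager shows: how much RAM the machine has
# and how much of it is currently in use.
mem = psutil.virtual_memory()
print(f"Total RAM:      {mem.total / 1e9:.1f} GB")
print(f"Currently used: {mem.used / 1e9:.1f} GB ({mem.percent}%)")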

RAM can be broken down into two broad categories: DRAM (Dynamic RAM) and SRAM (Static RAM). We will begin by discussing DRAM, which is the type of RAM you are probably familiar with.

DRAM: It’s all about Capacitors

If you have ever heard someone talking about a "stick" of RAM, or mention how much RAM their computer has, they are talking about DRAM. The core idea behind DRAM is actually pretty simple – it just charges a capacitor to represent 1 and discharges it to represent 0. Let’s take a look at the circuit diagram.

Image by Author

In this arrangement, the wordline is a wire that acts as an enable/disable switch, while the bitline is a wire that carries the data to the capacitor. I realize that may not make sense immediately – that’s ok. I’m going to elaborate more.

To understand how this circuit works, you need a basic understanding of the transistor element. There’s really only one thing you need to know about the transistor for the purposes of this article: it works as a switch. Electricity (Voltage) applied at the base (that’s the part the wordline is connected to) allows electricity to flow from the collector (the part the bitline is connected to) to the emitter (the part the capacitor is connected to).

Think of it like a faucet at your sink. You turn the handle (like the wordline "turns" the base by conducting electricity to it), and water (electricity) flows from the water line (the collector) out through the spigot (the emitter). If you want to go deeper into how a transistor works, I’d recommend my previous article, How do Computers Actually Compute?

Let’s continue with our faucet analogy. We only really have to make one slight change. Instead of having only one handle at the faucet, you have two. One controls your faucet (the wordline) and the other controls water flow down the pipe (the bitline). The capacitor is an electrical element that stores electricity (like the sink stores water). When the capacitor is full of electricity (when the sink is full of water), we take this to mean there is a 1 bit stored there. Note that the sink can only fill up when both handles are turned on. If you just turn on the one at the faucet (the wordline), then there is no water (the bitline is set to 0) in the pipe to fill the sink (the capacitor).

Let’s summarize. The wordline acts as an enable switch, telling our capacitor when it can and cannot read data coming in on the bitline. When the wordline is on and the bitline is off, that stores a "0" in the capacitor. When the wordline is on and the bitline is on, that stores a "1" in the capacitor.
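If it helps to see that behavior spelled out in code, here is a toy Python sketch of a single memory cell. It is purely illustrative – a real cell is a tiny analog circuit, and the class and method names here are my own invention:

# Toy model of one DRAM memory cell: a transistor gated by the wordline,
# and a capacitor that holds the stored bit.
class DRAMCell:
    def __init__(self):
        self.capacitor = 0  # 0 = discharged, 1 = charged

    def write(self, wordline, bitline):
        # The transistor only conducts while the wordline is on, so the
        # capacitor only takes on the bitline's value at those moments.
        if wordline == 1:
            self.capacitor = bitline

    def read(self, wordline):
        # Reading also requires the wordline to be on.
        return self.capacitor if wordline == 1 else None

cell = DRAMCell()
cell.write(wordline=1, bitline=1)  # wordline on, bitline on -> stores a 1
print(cell.read(wordline=1))       # 1
cell.write(wordline=0, bitline=0)  # wordline off -> nothing changes
print(cell.read(wordline=1))       # still 1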

The circuit I showed you is actually a bit simplified. I left out a few components needed to deal with the weirdness associated with the electrical properties of a capacitor. One such property is that capacitors leak over time. In order to keep our data stored properly, we have to refresh the electricity stored in them. Going back to our sink analogy, this would be like turning on the water briefly to refill the leaky sink. If you have ever heard of the refresh rate of RAM, this is what they are talking about. Refresh rate is how often you have to top off the leaky capacitors with electricity to maintain data integrity.
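To get a feel for why that refresh is necessary, here is a crude sketch in which the stored charge leaks away a little each time step unless it is periodically topped back up. The leakage rate and refresh interval are made-up numbers, chosen purely for illustration:

# Crude leaky-capacitor model: the charge decays every tick, and a periodic
# refresh senses the (still recognizable) bit and rewrites it at full strength.
charge = 1.0          # we just stored a 1
LEAK_PER_TICK = 0.02  # made-up leakage rate
REFRESH_EVERY = 32    # made-up refresh interval, in ticks

for tick in range(1, 129):
    charge *= 1 - LEAK_PER_TICK          # the capacitor slowly leaks
    if tick % REFRESH_EVERY == 0:
        bit = 1 if charge > 0.5 else 0   # sense what is still stored...
        charge = float(bit)              # ...and write it back at full charge
        print(f"tick {tick:3d}: refreshed, bit = {bit}")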

Tying it Together: How we go from a Single Memory Cell to DRAM

Let’s call each of these little circuits a memory cell. You can chain memory cells together into a 2D grid to write more than 1 bit at a time. Let’s do an example: how could we use DRAM memory cells to store a 4-bit number?

Side note: if you need a quick refresher on what I mean by a 4-bit number, here’s a quick summary. Computers store numbers in binary (using only 0’s and 1’s). 4 bits means that we can store any number that can be represented using only four 0s or 1s. That gives us a range between 0 and 2⁴ − 1 = 15. So, translating from our typical number system to binary, 0 = 0b0000 and 15 = 0b1111. If you want to go deeper, I have an article that covers the basics of binary: Why do Computers even use Binary?
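If you want to sanity-check that range yourself, Python’s built-in binary formatting makes it easy:

# Every value a 4-bit number can hold, alongside its binary form.
for n in range(16):               # 2**4 = 16 distinct values: 0 through 15
    print(f"{n:2d} = 0b{n:04b}")
# prints 0 = 0b0000 up through 15 = 0b1111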

Image by Author

Each row of these memory cells can store a 4-bit number. When we turn on wordline 1, it switches on the transistors in the first row, allowing data from bitlines 1–4 to be read into and saved in the capacitors. If we put each bit from our 4-bit number on a different bitline, we can save the whole number at once by turning on the corresponding wordline.

This 2D architecture also lets us assign addresses to each memory element. Say we want to reference the bit in the second column of the number in the first row. We know it is at Wordline 1, Bitline 2. This lets computers access memory at different locations with a great degree of precision. High-level programming languages abstract away this kind of detailed memory management, but lower-level languages such as C let you see it a little more clearly.
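Here is a toy sketch of that idea – a small grid in which turning on one wordline writes a whole 4-bit number, and any single bit can then be looked up by its (wordline, bitline) pair. The structure and function names are mine, purely for illustration; real memory chips don’t expose an interface like this (and, unlike the prose above, the code counts rows and columns from 0):

# Toy 4x4 DRAM array: rows are selected by a wordline, columns are bitlines.
ROWS, COLS = 4, 4
array = [[0] * COLS for _ in range(ROWS)]

def write_word(wordline, bits):
    # Turning on one wordline lets all four bitlines write into that row at once.
    array[wordline] = list(bits)

def read_bit(wordline, bitline):
    # Any single cell is addressed by its (wordline, bitline) coordinates.
    return array[wordline][bitline]

write_word(0, [0, 1, 1, 0])  # store 0b0110 (the number 6) in the first row
print(read_bit(0, 1))        # the bit in the first row, second column -> 1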

I’d like to pause for a moment and address a question that may be nagging at you: why is it called a wordline? That seems like a very odd name for a wire that is essentially acting as an on/off switch for a row of memory cells. The answer lies in how we talk about data size in computer science. All computer hardware has a maximum number of bits per number it can handle, and this is usually decided by the processor. Most commonly, processors can handle either 32 or 64 bits per number – this is what we mean when we say a computer is 32-bit or 64-bit. A "word" in computer science is the chunk of bits that can be moved between memory and the processor at once, and for simplicity the word size usually matches that 32 or 64. In the example above, we only had 4 memory cells per row, so our word size was 4; in a real system, there would typically be 32 or 64 cells per row. The wordline controls reads and writes to a whole word of data, hence the name. As for RAM itself, it is called Random Access Memory because any row can be accessed directly – in roughly the same amount of time – no matter where it sits in the grid, rather than having to be read in sequence.
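If you are curious what word size your own machine uses, here is one quick (if indirect) check from Python – the size of a pointer, in bits:

import struct
import sys

# The native size of a pointer ("P"), in bytes, times 8 bits per byte.
# On most modern machines this prints 64.
print(struct.calcsize("P") * 8)

# Another common heuristic based on the largest native integer index.
print(64 if sys.maxsize > 2**32 else 32)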

Now you should have a pretty good understanding of how DRAM functions. However, DRAM isn’t the only type of RAM your computer uses. There is another branch of RAM – SRAM. SRAM stands for Static RAM; it’s what your CPU uses to cache data it is actively working on.

SRAM: Speedy Memory for a Speedy Computer

SRAM is used by your computer’s CPU to quickly access data it needs while working. To accomplish this, we employ SRAM in two ways: as a cache and as registers. First, we will delve into how SRAM works, then we will discuss the difference between these two use cases.

Flip Flops! (Not the Beach kind)

SRAM works fundamentally differently from DRAM. It does not use capacitors as the unit of data storage – instead, it uses clever arrangements of transistors called logic gates to store data.

If you aren’t familiar with logic gates, I’ll briefly summarize what you need to know for this article. Logic gates are circuit elements made of transistors that implement basic logic functions. For example, an AND logic gate outputs a 1 if both of its inputs are 1, else it outputs 0. Think of it like a component that says "If this input AND that input are on, I’ll be on. Otherwise, I am going to be off". AND belongs to a family of logic gates – along with OR and NOT – that can all be implemented using transistors. For a more detailed discussion, refer to my previous article How do Computers Actually Compute?
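In code, these gates are just tiny functions over bits; here is a sketch:

# The three basic gates, modeled as functions over bits (0 or 1).
def NOT(a):
    return 1 - a

def AND(a, b):
    return a & b

def OR(a, b):
    return a | b

print(AND(1, 1), AND(1, 0))  # 1 0
print(OR(0, 0), OR(1, 0))    # 0 1
print(NOT(1))                # 0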

Photo by Ridwan Muhamad Iqbal on Unsplash

There are two circuits made of logic gates that we use for memory: flip-flops and latches. We are going to focus on latches because they are a little simpler, but most SRAM circuits use flip-flops. However, they are very similar circuits, and the core idea is the same for both.

Let’s start by taking a look at the circuit.

Image by Author

This is the circuit for a D-latch (D stands for data). The idea behind this circuit is that whenever Enable is on (set to 1), data is read in from the Data line and stored. If Enable is off, Stored_Data keeps its previous value no matter what happens to Data.

Play with the circuit in your mind. As a reminder, here is what each gate does. The NOT gate inverts a signal (if input = 0, output = 1; if input = 1, output = 0). The AND gate outputs 1 if both inputs are 1, otherwise it outputs 0. The OR gate outputs 1 if either or both inputs are 1; if both inputs are 0, it outputs 0.

If you play with it long enough, you should find the following table of outputs (called a truth table) from the above circuit.

Image by Author

Note that "X" means that we don’t care what the input is, the circuit has the same behavior either way.
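If you would rather poke at the latch in code than in your head, here is a sketch built from the same little gate functions (repeated so the snippet runs on its own). The wiring mirrors the idea described above: Data passes through while Enable is 1, and the old value is fed back and held while Enable is 0:

# A D-latch modeled with basic gates. When enable is 1 the stored bit follows
# data; when enable is 0 the previous value is fed back around and held.
def NOT(a): return 1 - a
def AND(a, b): return a & b
def OR(a, b): return a | b

def d_latch(data, enable, stored):
    return OR(AND(data, enable), AND(stored, NOT(enable)))

stored = 0
stored = d_latch(data=1, enable=1, stored=stored)  # Enable on: store the 1
stored = d_latch(data=0, enable=0, stored=stored)  # Enable off: Data is ignored
print(stored)  # still 1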

The only real difference between the D-latch and the D flip-flop is that the enable wire is connected to a clock in the flip-flop, so the stored value only updates when the clock changes (on a clock edge) and not otherwise. Some circuit designers prefer flip-flops over latches, but the core idea remains the same.
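Here is a rough sketch of that edge-triggered behavior – again just an illustrative model, not real circuitry:

# A D flip-flop model: the stored bit only updates when the clock goes
# from 0 to 1 (a rising edge), no matter what the data line does otherwise.
class DFlipFlop:
    def __init__(self):
        self.stored = 0
        self.prev_clock = 0

    def tick(self, data, clock):
        if self.prev_clock == 0 and clock == 1:  # rising clock edge
            self.stored = data
        self.prev_clock = clock
        return self.stored

ff = DFlipFlop()
print(ff.tick(data=1, clock=1))  # rising edge: stores 1
print(ff.tick(data=0, clock=1))  # clock held high, no edge: still 1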

Each of these circuits makes up one memory cell of SRAM. We can connect a different bitline to each data line, and the enable line to a wordline (just like we did for DRAM) to make larger and larger SRAM components.

Two Applications: Caches and Registers

SRAM is used mainly in your computer’s cache and its CPU registers. The reason we use it is simple: it is faster and physically smaller than DRAM. Capacitors are notoriously large and slow, which makes them a bad choice in applications where we need speed. DRAM cells, however, are considerably cheaper than their SRAM counterparts, so computer engineers try to strike a balance between speed, space, and cost by using SRAM where they need to and DRAM everywhere else. Registers and caches are two places where engineers generally prioritize speed and size over cost. We will briefly cover the purpose of each of these here.

A cache is a small block of SRAM that works alongside your main DRAM. Once data stored in DRAM is used by the CPU, it is copied over to the SRAM cache as well. The reason for this is a computer science principle that you will see applied in many branches of the field: if data is accessed once, it is likely to be accessed again soon. Because SRAM is much faster, copying data from DRAM into the SRAM cache speeds up access times whenever the CPU has to touch data it has used recently.
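That "use it once and you will probably use it again" principle shows up in software too. Python’s functools.lru_cache is a handy way to see it in action:

from functools import lru_cache
import time

# Keep recently computed results close at hand instead of re-fetching them,
# on the assumption that they will soon be needed again.
@lru_cache(maxsize=128)
def slow_lookup(key):
    time.sleep(0.1)  # pretend this is a slow trip out to main memory
    return key * 2

for attempt in ("first", "second"):
    start = time.perf_counter()
    slow_lookup(21)
    print(f"{attempt} call took {time.perf_counter() - start:.4f} s")
# the second call returns almost instantly because the result was cached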

CPU registers are small memory elements located physically inside the CPU that temporarily hold the data it is operating on. The CPU is one of the fastest components in your computer – oftentimes it is waiting for data to arrive from RAM or non-volatile memory so it can perform its computations. When it performs a computation (like addition), it loads the two numbers it is operating on into SRAM registers and pipes them directly into its arithmetic circuitry. SRAM registers allow the CPU to quickly carry out memory operations on the data it is actively working with.

Concluding Remarks

In this article, we discussed briefly the difference between volatile and nonvolatile computer memory. We learned that volatile memory loses its data when it loses power, and nonvolatile memory does not. We then delved into the inner workings of RAM, the most common type of volatile memory used in computers. We distinguished between the slow, large, and cheap DRAM and the quick, small, and pricey SRAM. We showed the circuits for both and discussed how they can fit together to store large amounts of data.

Hopefully, this article gave you a deeper understanding of how a computer actually remembers data. As data scientists, computers are integral to our careers, and having a deeper understanding of them can help you use them more effectively. Memory is a core concept in computing, and knowing which types work best for which applications – and why – can be critical when you need your code to run quickly.

I hope you found this article interesting and useful. Next time, we will discuss the other type of computer memory – non-volatile computer memory. We will discuss prominent non-volatile technologies like hard drives and solid-state drives.

After we finish discussing memory, we are going to bring together a lot of the concepts we have covered in these first three articles and discuss how it all fits together to make the CPU – the central processing unit. If you have never really understood how the brain of a computer works, keep following along, because that’s where we are headed with this series.

I hope to see you again next week! Have a great day friends.

Want to check out the previous articles in the series? See below.

Why do Computers even use Binary?

How do Computers Actually Compute?


