Why do CPUs Need Caches? – Computerphile


So, I’ve come here today because you promised me “cache”, but I don’t see any money. I thought you were paying me for these things, but-
No, we’re not talking about that sort of cash. We’re actually looking at the cache that is built into our CPUs, which is used in computers to try and make things run faster. Now, we talked about how the CPU talks to memory, and we spent some time looking at how we can build memory chips out of discrete logic circuits. While you probably wouldn’t want to build all the memory in your computer system like that, there are other ways you can build it, to create SIMMs like the ones we were using in the late 80s and early 90s. This one is about 256 kilobytes, but you can get DIMMs that are as big as 16 gig these days.
Now, if you remember back to what Steve Furber was saying about when they built the BBC Micro, he was talking about how they used RAM chips that ran at twice the speed of the CPU. So, we’ve got the 8 RAM chips here, and they’re connected directly, more or less, to the CPU here. The memory ran at 4 megahertz and the CPU ran at 2 megahertz. And so the CPU could make its requests and the RAM would return the data very quickly. And while the CPU was still processing that, the video circuits could grab the data from memory to form the display. So, it was able to multiplex the two and not slow the CPU down, unlike some other systems.
Now, as time went on, the CPUs got much faster, so by the end of the 80s, you could get CPUs like this, which ran at 8 megahertz and then 16 megahertz, 32 megahertz and so on. And now a 3 gigahertz CPU is very easy to get hold of. Unfortunately, the RAM didn’t increase in speed at the same rate. So these days, the RAM runs several orders of magnitude slower than the CPU. So, this leaves us with a problem. Even if the clock speed of the CPU increased, it would still have to wait for the memory, so it wouldn’t actually appear to get any faster.
So actually, it is possible to build memory that will work at the speed that the CPU executes at. But the problem is it takes more space on the silicon to store each bit of information, and so it costs a lot more to produce than the DRAM in the DIMMs that we use today. So the way we get around this is we split the memory up into two types. We have our main memory, which we build out of dynamic RAM. But we also have a second type of memory, which is often built into the CPU itself. Now this is much smaller, but it’s built out of much faster memory. And this is referred to as the cache. Now, “cache” is perhaps an old-fashioned English word, but it basically just means a small place where we can store things. So you might use it to store your hidden treasure if you’re a pirate, or to store your food for winter.
Another example where you might come across a cache is with your web browser. The cache in the web browser is used to get around the fact that it takes a relatively long time to fetch a piece of information over the Internet compared to accessing something on your local machine. So what happens is: when you go and fetch a page from the Internet, the browser will go and get the HTML page, it’ll get the CSS files, the images and so on. And it stores, or caches, a copy on your local disk that it can then refer to if it needs to get it again. And the idea is that it can get the data from the local copy a lot quicker than it could if it had to go and fetch it from the web server somewhere else in the world. And it’s this same approach that is used by CPUs. The CPU’s got the same problem. It can talk to its cache on the CPU very, very quickly, but talking to main memory takes a relatively long time compared to talking to the cache.
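The keep-a-local-copy idea described above, whether for a web browser or a CPU, can be sketched in a few lines of Python. This is only an illustration, not how any real CPU or browser is implemented: the names `SlowMemory` and `Cache` are made up for the example, the memory contents are arbitrary, and the cache here is allowed to grow without limit, whereas a real cache is small and has to throw old copies away.

```python
class SlowMemory:
    """Stand-in for main memory (or a distant web server): correct but slow."""

    def __init__(self, contents):
        self.contents = contents
        self.reads = 0  # how many times we had to go the slow way

    def read(self, address):
        self.reads += 1
        return self.contents[address]


class Cache:
    """Fast local store that keeps a copy of everything it has fetched."""

    def __init__(self, backing):
        self.backing = backing
        self.copies = {}  # address -> locally held copy (unbounded: an assumption)
        self.hits = 0
        self.misses = 0

    def read(self, address):
        if address in self.copies:
            self.hits += 1  # served from the local copy
            return self.copies[address]
        self.misses += 1  # not held locally: fetch it, then keep a copy
        value = self.backing.read(address)
        self.copies[address] = value
        return value


# Arbitrary example contents: each address holds twice its own number.
memory = SlowMemory({addr: addr * 2 for addr in range(1024)})
cache = Cache(memory)

values = [cache.read(addr) for addr in (10, 11, 10, 10, 11)]
print(values, cache.hits, cache.misses, memory.reads)
```

Five reads are made, but only the first read of each address has to go all the way to the slow memory; the repeats are answered from the local copy.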
So what happens is: every time the CPU requests a bit of data, it caches a copy locally in the cache built onto the CPU, so that when it needs to fetch it again in the near future, it can access it from its local copy a lot faster. There are some other tricks that it can do as well, because the CPU can say, “Well, actually, if I fetch this instruction, there’s a very good chance that I’m going to execute the next instruction, and the one after that as well.” And so what it can do is, rather than just getting one word of memory at a time, it’ll say, “Well okay, get me the next 128 bytes of memory.” And it’ll read what we call a cache line, one single lot of 128 bytes, from memory into the CPU in one go. The idea being that it takes less time to read 128 bytes in one go than to read each of those 128 bytes individually.
But that’s down to the way memory actually stores things. So, we talked, in the previous video, about how we would have an address, a binary number that represents each different bit in the RAM chips. But actually, rather than storing the bits as one big list, the chip stores them as a grid. The address that you give it from the CPU gets split up to reference a particular row and a particular column of that grid, to get the bit it’s interested in. Now, the way the RAM chips work: once you’ve selected a specific row, you can then access each of the columns in that row relatively quickly, compared to changing to a different row. So if we want to get 128 bytes, and they’re all in the same row of memory, then we can access them very quickly, and so copy them into the CPU’s cache much quicker than if we were having to select a different row each time.
So, how big does the cache on your CPU need to be? Well, actually it turns out you only need a relatively small amount of cache to make a significant difference, because our programs are often sitting in loops, executing the same set of instructions again and again and again.
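To put some rough numbers on cache lines and loops, here is a small simulation sketch in Python. It makes several assumptions that are not in the talk: the cache is direct-mapped (each block of memory can live in exactly one slot), it has only 8 slots, instructions are 4 bytes each, and the loop starts at the made-up address `0x1000`. Only the 128-byte line size comes from the discussion above.

```python
LINE_SIZE = 128  # bytes per cache line, the figure used in the talk
NUM_LINES = 8    # assumed: a tiny 1 KB cache, just for illustration


def simulate(addresses, line_size=LINE_SIZE, num_lines=NUM_LINES):
    """Count (hits, misses) for a direct-mapped cache over byte addresses."""
    slots = [None] * num_lines  # which memory line each cache slot holds
    hits = misses = 0
    for address in addresses:
        line_number = address // line_size  # which block of memory this is
        slot = line_number % num_lines      # the one slot that block can use
        if slots[slot] == line_number:
            hits += 1
        else:
            misses += 1                     # fetch the whole line in one go
            slots[slot] = line_number
    return hits, misses


# A loop of 64 four-byte instructions (256 bytes of code), run 100 times.
loop_body = [0x1000 + 4 * i for i in range(64)]
trace = loop_body * 100

hits, misses = simulate(trace)
print(hits, misses)
```

Under these assumptions the loop spans just two cache lines, so only the first touch of each line misses and every other instruction fetch is a hit. Calling `simulate(loop_body, line_size=4)` instead models fetching one word at a time: the first pass through the loop then misses on every instruction.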
So if you’ve got enough cache to store that loop, then the instructions can be cached and it’ll work fine. Or the program’s accessing the same block of data and manipulating that in different ways. And so if that will fit into the cache, things work relatively fine. So we don’t need that much. You often use separate caches for the instructions and for the data, so you don’t remove the instruction that you’re interested in to put in a copy of the data that you’re going to process. I mean, you could think about it like this: you might have a field full of turnips, and so you’d have to go and dig them up. But you might also have a cupboard with your turnips in the kitchen, so you can make your stew that night without having to go out into the field. And if you’re out working somewhere else, you may even have a turnip in your backpack to eat on the way out.
Second level turnip cache?
Yeah! *laughter*