Saturday, 8 December 2018

Memory - And the Computer Remembers

Well, it sounds as if they have decided to use the term bandwidth for memory. That is a little confusing because bandwidth has a different meaning elsewhere, namely the range of frequencies that something is broadcast upon. With memory, though, bandwidth isn't just about frequency: it also depends on the width of the connection to the memory, which tells us how much input and output can occur at the same time. In short, it is how much data can be transferred in a single go.

There is also latency, which basically tells us how long it takes for something to be fetched from the memory, though it is more than that because we are also talking about something getting placed into the memory, so in a way latency is the delay on each transaction. However, we will get onto how to work these things out a little later. First, let us look at the memory hierarchy.

The Pecking Order

Once again, a diagram is going to help here:

Okay, the pecking order really comes down to the speeds. Now, we won't be worrying about things at the bottom of the pyramid just yet; we will only be focusing on things at the top.

Now, note the pyramidal structure - the reason it is like that is that we are not just talking about speed, but also size and cost. So, the registers at the top, which are where information is stored in the CPU while it is being processed, are the smallest, the fastest, and the most expensive. Then we have the caches. As explained previously, they sit in the CPU (though the level three cache, which isn't listed here, sits just outside), and are constructed using flip-flops. Once again, they are fast, small, and pretty expensive.

Then we have the main memory, which is basically going to be the focus of this post, though we will be touching on the others as well since they also play a role. Now, these memory modules are the ones that you purchase to place into your computer, and they use capacitors to store information (though if you were to look at one you would think that they are just more integrated circuits). Capacitors are actually cheaper than flip-flops, but they need to be constantly refreshed. They are also set out in a grid arrangement, and the various cells are located by row and column.
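
As a rough sketch of what that grid arrangement means, here is how an address might be split into a row and a column. The grid dimensions here are just assumed for illustration; real chips vary.

# A rough sketch of how a memory address can be split into a row and a
# column of the grid. The sizes (16384 rows x 1024 columns) are assumed
# purely for illustration.
ROWS = 16384      # assumed number of rows in the grid
COLS = 1024       # assumed number of columns in the grid

def split_address(address):
    """Return the (row, column) a given cell address falls into."""
    row = address // COLS       # which row of the grid
    col = address % COLS        # which column within that row
    return row, col

print(split_address(5000))      # -> (4, 904)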

Memory Types

So, now, let's look at the various types of memory. There are a few more types than the ones I have mentioned here, but these do give you an idea of how memory developed.

PROM: This is programmable read-only memory, and it really isn't in use today. With the earliest ROM the contents were written when the chip was made, while later varieties like PROM could be programmed once afterwards. The result was the same either way: once the memory has been programmed it pretty much stays that way, well, forever. There used to be a time when ROM was a standard facet of pretty much every computer (I remember my old Commodore 64 had RAM - Random Access Memory - and ROM; the ROM basically contained all of the computer's system information). However, these days it pretty much isn't used.

EPROM: Erasable Programmable Read Only Memory was slightly different in that you could actually erase the contents of the memory and then reprogram it. However, it was pretty basic. You would recognise one because it had a little window in it - to erase the contents you had to shine an ultraviolet light through that window. There were two problems though: firstly, you had to erase everything on the chip and start from scratch; there were no half measures here. The other problem was that you had to make sure that no UV light accidentally got into the chip.

Here is a picture of one:


EEPROM: This is the next evolution and stands for electrically erasable programmable read-only memory. It was far superior as you could erase parts of the memory and reprogram them, and you didn't have to keep the chip away from ultraviolet light. These chips are still around today.

Now, the difference between ROM and RAM is that RAM is known as volatile memory, while ROM isn't. Volatile memory means that if you turn the computer off, basically everything is lost. This is why ROM was useful: it would retain its information even after the computer had shut down. RAM is the memory that is made up of capacitors, and the reason it is volatile is that unless the memory is powered and constantly refreshed, the charge held in the capacitors will quickly drain away - think of it like a tank of water with an inlet and an outlet: as long as the water keeps flowing in, there will be water in the tank, but as soon as the flow stops, the water will eventually all drain out.

More Tidbits

Let us jump back to latency and bandwidth for a second. You see, in your normal laptop or desktop latency isn't all that critical - it matters most in time-critical situations, particularly with server systems. If you were using your computer as a home video unit, then bandwidth would be the important thing, because you are moving large chunks of sequential data. The thing with computers is that they really only move one piece of data at a time - they can't mix and match. As such, when we are moving lots of small, non-sequential pieces of data, latency suddenly becomes more important than bandwidth.

Okay, let us move onto random access memory. There are two types of RAM - SRAM and DRAM. SRAM stands for static RAM and DRAM stands for Dynamic RAM. Let us look at each individually:

SRAM: This is generally what the cache is made of and is constructed using flip-flops. Basically it can hold information for as long as power is supplied, though it is still volatile, which means if you turn off the computer everything is lost. It has a low density, and tends to be a lot more expensive and a lot more power hungry.

DRAM: Dynamic RAM is what your memory modules are made up of, and it is built out of capacitors. The problem with capacitors is that they can only hold a charge for a very short period of time, so they need to be constantly refreshed. However, the density tends to be much greater, and it also tends to be a lot cheaper to produce.

Cache

I've probably talked about the cache quite a bit, but there are still things that I haven't touched on. The thing is that the cache is small, which means managing the data it holds is paramount. Now, when the CPU is searching for something, it begins by searching the caches, from level one through to level three. If it finds what it is looking for then it is a cache hit; otherwise it is a cache miss, and the CPU then goes and looks for the data in the main memory.
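
Here is a toy Python sketch of that search order. The "caches" and "main memory" are just dictionaries I have made up for illustration, not anything resembling real hardware.

# Toy model of the lookup order described above: check the level one,
# two and three caches in turn, and only fall back to main memory on a
# miss at every level.
l1 = {"a": 1}
l2 = {"b": 2}
l3 = {"c": 3}
main_memory = {"a": 1, "b": 2, "c": 3, "d": 4}

def read(address):
    for level, cache in enumerate([l1, l2, l3], start=1):
        if address in cache:
            print(f"cache hit in level {level}")
            return cache[address]
    print("cache miss - going to main memory")
    return main_memory[address]

read("b")   # hit in level 2
read("d")   # miss, fetched from main memory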

So, the trick comes down to knowing what to keep in the cache and what to discard. There are a few methods: least recently used, where the stuff that has been sitting there the longest without being used is tossed; most recently used, which is basically the opposite; least frequently used; or simply a random selection. Honestly, there isn't an optimal answer because, as Murphy's Law implies, as soon as you toss something out is basically when you actually need it.
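
As a minimal sketch of the least recently used approach, here is a little Python class built on the standard library's OrderedDict. The capacity of three is arbitrary, just to make the eviction visible.

# Minimal sketch of a least-recently-used cache: whenever the cache is
# full, the entry that hasn't been touched for the longest is tossed.
from collections import OrderedDict

class LRUCache:
    def __init__(self, capacity=3):          # tiny capacity, just for show
        self.capacity = capacity
        self.data = OrderedDict()

    def get(self, key):
        if key not in self.data:
            return None                       # cache miss
        self.data.move_to_end(key)            # mark as most recently used
        return self.data[key]

    def put(self, key, value):
        if key in self.data:
            self.data.move_to_end(key)
        self.data[key] = value
        if len(self.data) > self.capacity:
            self.data.popitem(last=False)     # toss the least recently used

cache = LRUCache()
for k in ["a", "b", "c", "d"]:
    cache.put(k, k.upper())
print(cache.get("a"))    # None - "a" hadn't been touched, so it was tossed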

Now, another thing we should take into consideration is the main memory and the CPU's front side bus (basically the front door). You see, the front door is only so wide, so if the road from the memory is wider than the CPU's front door we are suddenly going to get a bottleneck as all this stuff tries to squeeze through. On the flip side, if the front door is wider than the road, well, we are going to find the CPU sitting idle while it waits for the goodies to arrive.

Figuring Out the Numbers

Okay, have a look at this little piccy I got from Hardware Secrets.

Do you know what all those numbers mean? You do? Good, because I don't.

No, seriously, what we are looking at here is what is referred to as the RAM timings. They basically tell you how fast the RAM is, and also its bandwidth. Now, note that it is DDR3 - DDR stands for Double Data Rate. Pretty much all RAM these days is double data rate, but it is important to remember this when working out the timings. As for the 3? Well, that basically means that it is third generation.

So, first comes the maximum theoretical transfer rate. Actually, it is already written on the label, where it says PC3-10666, however you can work it out from just the first bit, namely the 1333. Now, this is the data rate, but not the real clock speed. The reason for this is that it is double data rate, so you must halve it, which brings it down to 666. Now, there is a formula for working out the transfer rate, namely:

transfer rate = data rate x (number of bits / 8)

Now, the number of bits is 64, so dividing by 8 gives us 8, and the data rate is 1333 (strictly 1333.33), so multiplying that by 8 gives us 10666 (note that we don't halve the number for this equation).
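
Just as a sketch, here is that formula as a tiny Python function. The default bus width of 64 bits is the same one used above.

# The transfer rate formula above as a little function.
def transfer_rate(data_rate, bus_width_bits=64):
    """Theoretical transfer rate in MB/s for a given data rate."""
    return data_rate * (bus_width_bits / 8)

print(transfer_rate(1333))      # 10664.0 MB/s - the label says 10666 because
                                # the exact data rate is 1333.33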

Let's now turn to the latency. See the numbers that say CL7-7-7-18 - well, those are the latency timings. The first number is the 'Column Access Strobe' latency, or CAS latency. It is also referred to as the 'access time'. This tells us how many clock cycles we have to wait, once the column address has been sent to the memory, before we receive what we have requested. The next number is the RAS (Row Access Strobe) to CAS delay, which is the number of clock cycles we have to wait once a row has been selected before we can send a column address.

The next number is the Row Precharge Time, which is the number of clock cycles we have to wait if we already have a row selected but want to change to another row. The final number is the Row Active Time, which is the number of clock cycles that the row needs to have been active before we can send a request to it. Now, the thing is that these days only the first number will actually be listed in the memory's specifications.

Okay, now, let's try some math. Here is the time we have to wait to receive information from the RAM if no row has been selected:

  • Activate the row and wait: 18 clock cycles;
  • Send the column address: 7 clock cycles;
  • Wait for the response: 7 clock cycles.
So the total is 32 clock cycles.

What if we have the wrong row selected?
  • Change the row: 7 clock cycles;
  • Send the column address: 7 clock cycles;
  • Wait for the response: 7 clock cycles.
So the total is 21 clock cycles.
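
As a quick check, here is the same arithmetic written out as a small Python sketch, following the breakdown above and the 7-7-7-18 timings on the label.

# Adding up the clock cycles in the two scenarios above.
row_wait, column_wait, response_wait, row_change_wait = 18, 7, 7, 7

no_row_selected = row_wait + column_wait + response_wait
print(no_row_selected)      # 32 clock cycles

wrong_row_selected = row_change_wait + column_wait + response_wait
print(wrong_row_selected)   # 21 clock cycles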

At the basic level, latency is the delay between when a request is made and when it is answered, though here it is measured in clock cycles. Now, to work out how much time it actually takes we need to go back to what we did when we worked out the CPU clock speed. It goes by the same process.

So, let's work that out for the above memory. We know that the memory is DDR3-1333, so the clock speed is 666 MHz (which is 666 million cycles per second). Now, to get the actual time of a single cycle we need to invert it and turn it into nanoseconds. Remember that this is in MHz, so we first convert it to Hz, which gives us 666 000 000. Now invert it and you get:

0.000 000 001 5

This is in seconds, so we need to convert it into nanoseconds by multiplying by 10^9 (a nanosecond being 10^-9 of a second).


So, our answer, which is the length of a clock cycle, becomes 1.5 ns. Now that we have the length of a clock cycle, we can work out the true latency - all we need to do is multiply it by the number of clock cycles.

Our first answer was 32 clock cycles; multiplying that by 1.5 ns gives us 48 ns. The second answer was 21 clock cycles, so the answer is 31.5 ns.

Let us try one more before we move on. This time the memory module is a DDR4-2133. So, first we halve the frequency (because it is double data rate), which gives us 1066.5 MHz. Then we convert it to Hz, which gives us 1 066 500 000.

Then we invert it (1/1 066 500 000), which gives us 0.00000000093764 seconds. Converting that into nanoseconds produces 0.94 ns. The CAS latency is 14, so multiplying 0.94 ns by 14 gives us 13.16 ns.
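
Putting the whole calculation together, here is a small Python sketch that reproduces both of the worked examples above: halve the advertised data rate, invert it to get the clock period, convert to nanoseconds, and multiply by the number of clock cycles.

# The whole true-latency calculation in one place.
def true_latency_ns(data_rate_mhz, cycles):
    clock_mhz = data_rate_mhz / 2           # double data rate, so halve it
    clock_hz = clock_mhz * 1_000_000        # MHz -> Hz
    period_ns = (1 / clock_hz) * 1e9        # seconds -> nanoseconds
    return cycles * period_ns

print(true_latency_ns(1333, 32))   # ~48 ns for the DDR3-1333 example
print(true_latency_ns(1333, 21))   # ~31.5 ns
print(true_latency_ns(2133, 14))   # ~13.1 ns for the DDR4-2133 CL14 example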

Isn't it odd that the manufacturers don't actually advertise the true latencies of their products, and we have to work them out ourselves? Well, look at this table:


Interesting, isn't it? The true latencies actually haven't changed all that much, despite the fact that the frequencies have got higher. That's probably why they don't advertise it: you are probably going to be drawn to the higher numbers (even though higher numbers for the CAS latency are actually worse, not better). The thing is that while the latency hasn't really improved, the bandwidth has, so overall performance still has. The other reason the latencies haven't come down is that the memory is a lot larger, and the larger the memory, the more time it takes to find what is needed.
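
To illustrate the point, here is a quick sketch comparing a few generations. Note that the CAS latencies used here (CL5, CL9 and CL15) are just typical values I have assumed for each generation, not the figures from the table above.

# Rough illustration of how true latency barely moves between
# generations. The CAS latencies are assumed typical values.
modules = [
    ("DDR2-800",  800,  5),
    ("DDR3-1333", 1333, 9),
    ("DDR4-2133", 2133, 15),
]

for name, data_rate, cl in modules:
    period_ns = 1e9 / (data_rate / 2 * 1e6)   # clock period in ns
    print(name, round(cl * period_ns, 1), "ns")

The results all land within a nanosecond or two of each other, which is exactly the point the table makes.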

Overclocking

Okay, this is the part that a lot of people are probably interested in. Overclocking is basically forcing the computer to run faster than the manufacturer intended it to run. The big question is whether it can actually damage your computer. Well, unfortunately the answer is going to be yes and no. Okay, that probably doesn't help you all that much, but the thing is that computers generally have fail-safe mechanisms to prevent damage. If an overclocked computer is pushed too far, it is likely to shut down before any major damage occurs.

Components that can be overclocked include memory, CPUs, video cards, and motherboards. Let us take a CPU for example. The CPU speed is a multiplier applied to the front side bus speed, so a processor with a multiplier of 16 and an FSB (front side bus) speed of 200 MHz would run at 3.2 GHz. You can increase the speed by either increasing the FSB speed or the multiplier. Increasing the FSB speed increases the speed of the link between the memory and the CPU (and, since the CPU speed is a multiple of it, the processor speed too), while increasing the multiplier only increases the processor speed. However, we need to take voltage and heat into consideration.
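
As a trivial sketch, the multiplier arithmetic looks like this:

# CPU clock = multiplier x front side bus speed.
multiplier = 16
fsb_mhz = 200
print(multiplier * fsb_mhz / 1000, "GHz")   # 3.2 GHz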

If you only want to increase it by a little, then maybe increase the FSB speed; however, if you want to go for broke, then increasing the multiplier is the trick. Remember though, even if it is unlikely you will damage the component, you can still damage it. Honestly, I have never had a need to overclock my computer. The other thing is that gradually increasing the speeds is a way to reduce the chance of something burning out.

One of the reasons that people overclock their systems is for performance testing, and honestly, people have probably already done it, so you can check out the results of the various products here.


Creative Commons License

Memory - And the Computer Remembers by David Alfred Sarkies is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License. This license only applies to the text and any image that is within the public domain. Any images or videos that are the subject of copyright are not covered by this license. Use of these images is for illustrative purposes only and is not intended to assert ownership. If you wish to use this work commercially please feel free to contact me.
