Sunday, 23 September 2018

The Intricacies of Data Transfer (and some Other Random Stuff)

Okay, the CPU, or central processing unit, is basically the heart of any computer, and CPUs are pretty complex these days. However, the key to understanding computers is to understand how these chips operate. So, let us first look at a diagram of the CPU and see the parts that make up the whole:

Well, that diagram may not say as much as one might expect, so let's unpack it and look at each of the components.

RAM: This is the random access memory. Technically this isn't a part of the CPU, although modern processors do have memory built into the chip, known as cache (more on that later). The RAM is pretty much where everything is stored, from the data to the instructions.

MAR: The memory address register. Now, whenever you see the word register you can basically assume that this is a short term storage area. In this particular instance, this is where the addresses are stored. Much like the physical address where you live, the address on a computer tells the computer where data, or instructions, are stored in the RAM.

MDR: This is the memory data register, which is where data, or information, is temporarily stored. Basically when data is fetched from the RAM (based on the address in the Memory Address Register), it is temporarily stored here before it is sent into the CPU to be processed. However, once it has been processed, it is sent back here to be stored while the computer searches for the address where it is to be placed in the RAM.

IR: The instruction register is the third register in the CPU, and this is where the instructions that have been pulled from the RAM are stored. Basically the computer checks the instructions, and once it has determined what needs to be done, it performs it by sending it to the Control Unit (if it can be performed, but we will assume that it can be performed).

PC: This is the program counter and is where the computer keeps track of where it is up to in the specific program. It normally holds the address of the next instruction in the program, so that once the current instruction has been completed, it can then use the information in the program counter to move on to the next step and fetch the next instruction.

ACC: The accumulator is basically where the results of any instructions, specifically mathematical functions, are stored before being returned to the MDR to then be placed back into the RAM.

Control Unit: Where the CPU is the heart of the computer, the control unit is the heart of the CPU. This is where instructions are fed to be interpreted and then actioned. In other words, the control unit doesn't just work out what an instruction means, it also directs the other parts of the CPU to carry it out.

ALU: The Arithmetic Logic Unit is basically where all of the mathematical calculations are performed. Those adders that we explored in a previous post will actually be found here, but there is actually much more to the ALU than simply adding two numbers together as it is also designed to perform complex calculations such as floating point processes. In older computers there used to be a chip called a Maths Co-processor that was designed to perform these functions, but these days this is incorporated into the CPU.

Clock: The clock rate of the CPU is basically how many cycles the CPU ticks through each second, which in turn limits how many instructions can be performed in a given period of time (an instruction may take one or more cycles). The clock basically keeps track of the instruction cycle. In a sense what it does is that it keeps everything moving along.

The Instruction Cycle

Okay, I was bandying around the term 'Instruction Cycle' so it might be an idea to actually explain what it is about. Basically the instruction cycle is known as the 'Fetch - Decode - Execute - Store' cycle and is the basic way the CPU operates. So, the computer checks the address in the Memory Address Register and then fetches whatever happens to be stored there. It then places it into the Memory Data Register. The control unit then interprets what was fetched to determine whether it is an instruction, or just plain data (which it would have known from the instruction that had already been executed). Once this is determined, the instruction is then executed, and any result is returned to the RAM.

Let us consider this process - the first instruction is to fetch data from a specific address, which it does. The next instruction requests the computer fetch some more data, which it does. The third instruction tells the computer to add those two pieces of data together, so they are then sent to the ALU, the addition is performed, and then placed back into the accumulator. No further instructions are required, so it is then sent back to the RAM, and the program counter then checks the address of where the next instruction happens to be. That is then pulled into the Memory Data Register, and because it is an instruction it is sent to the instruction register. Unless the instruction happens to be an instruction requiring the program to jump to a different address, the Program counter will store the next address as the location of the next instruction.
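The walk-through above can be sketched as a toy simulator. This is only a minimal sketch: the instruction names (LOAD, ADD, STORE, HALT), the way an instruction is written as a pair, and the memory layout are all assumptions made up for illustration, and a real CPU has a far richer instruction set.

```python
# Toy model of the fetch - decode - execute - store cycle.
# The instruction names (LOAD, ADD, STORE, HALT) and the memory layout
# are assumptions made for illustration, not a real instruction set.

def run(ram):
    pc = 0      # program counter: address of the next instruction
    acc = 0     # accumulator: holds the result of the last calculation
    while True:
        # Fetch: the address goes out through the MAR, the contents
        # come back through the MDR and are copied into the IR.
        mar = pc
        mdr = ram[mar]
        ir = mdr
        pc += 1                 # point at the next instruction

        # Decode: split the instruction into operation and operand address.
        op, addr = ir

        # Execute (and store, where the instruction asks for it).
        if op == "LOAD":
            mar = addr
            mdr = ram[mar]
            acc = mdr
        elif op == "ADD":
            mar = addr
            mdr = ram[mar]
            acc = acc + mdr     # the ALU does the addition
        elif op == "STORE":
            mar = addr
            mdr = acc
            ram[mar] = mdr      # the result goes back to the RAM
        elif op == "HALT":
            return ram
        else:
            raise ValueError(f"unknown instruction: {op}")


# Addresses 0 to 3 hold the program, addresses 4 to 6 hold the data.
program = [
    ("LOAD", 4),    # fetch the first number
    ("ADD", 5),     # fetch the second number and add it
    ("STORE", 6),   # send the result back to the RAM
    ("HALT", None),
    2,              # address 4
    3,              # address 5
    0,              # address 6: the result ends up here
]

result = run(program)
print(result[6])    # prints 5
```

Notice that instructions and data sit in the same list, just as they share the same RAM, and that the program counter is bumped straight after the fetch so that it already holds the address of the next instruction.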

And so it goes ...

Now, if you look at the CPU diagram above, you will notice that there is only one way traffic between the CPU and the RAM through the Address register. This is because addresses aren't pulled in from the RAM, but rather sent to the register from any instructions that have been pulled. The instructions will tell the CPU where any relevant data is stored, and as such those addresses will be sent through to the RAM via the Memory Address Register. However, the Memory Data Register has two way traffic, right down to the ALU. This is because data is flowing both ways, out of the RAM, and back into the RAM. Notice also the CPU bus, which connects the registers, the Program Counter, and the Instruction Register. This is because data pulled from the RAM could go to any one of those locations. Notice also that there is two way traffic into the Control unit, but also lines from the control unit to all parts of the CPU, and even the RAM. This is because while the Control unit interprets instructions fed into it by the instruction register, it also needs to interpret data brought in from other areas. Thus, when an instruction tells the CPU to fetch some data, it needs to notify that the next bit of information coming into the Memory Data Register is data, and not an instruction.

Confused yet? Well, so am I and I'm writing this. Maybe this video will help:

Swapping Devices

Okay, we will come back to the CPU a bit later, and now look at the concept of swapping devices. Basically there are three ways of doing so: Hot Swapping, Warm Swapping, and Cold Swapping. Hot swapping is where you can add or remove a device while the computer is actively in use, warm swapping is where the computer, or device, needs to be asleep, and cold swapping is where the computer has to be shut down to replace the device.

Now, an example of hot swapping would be the mouse or the keyboard. Honestly, I generally don't like hot swapping things, but if you need to change a mouse, you can simply unplug one, and plug in another.

An example of warm swapping would be an external hard drive. Okay, I know that you can theoretically hot swap them, but really, that is a very, very bad idea. There is a reason why you should unmount an external hard drive before removing it, and that is because it parks the head. If you don't do this, you are actually in danger of destroying your data (I ought to know, it happened to me). However, you can plug them back in without any concern. Other examples would be removing storage devices from inside the computer, such as CD and DVD ROMs.

As for cold swapping, I would suggest a graphics card. In fact, to replace one of them you generally need to completely unplug your computer, open it up, and then put it in when it is powered down. Oh, and replacing the power supply is also another example (though I suspect if you did try hot swapping it, your computer wouldn't be powered on for much longer, and you also run the risk of getting a pretty nasty, death inducing, electric shock).

Serial and Parallel Connections

So, there are two types of cables (well, there are probably more than two 'types' but we are talking about what is termed as bus topography here), and they are serial and parallel. Basically a serial cable has a single data line, and the bits are sent along it one after another. A parallel cable has multiple wires running side by side, and several bits are sent at the same time, one down each wire. Now, the USB cables are a classic example of a serial cable, but here is a picture of the old school ribbon cables (which you used to find inside your computers, and some of the older computers, such as mine, still use them):

Here is a diagram of how they are, theoretically, supposed to work:

Now, the problem with serial cables is that they were slow, which is why parallel cables were in frequent usage in the past. However, serial speeds have increased significantly, so parallel cables are no longer as prevalent as they once were. The reason for this is that there tended to be a lot more problems with parallel than serial. For instance, parallel cables were only practical over short distances: the longer the distance, the less effective the cable was. Secondly, there was the issue of crosstalk, which basically meant that the signal on one wire would interfere with the wires next to it. This was reduced by running ground wires between the signal wires. The final problem is that if all of the bits didn't arrive at the same time, then there would be delays as the receiver waited for the slowest wire to catch up. So, now that serial cables are much faster, the parallel counterparts are pretty much a thing of the past.
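To make the difference between the two concrete, here is a rough sketch that counts how many time steps it takes to push a couple of bytes down one wire versus eight wires. The function names are made up for illustration, and the model assumes both links tick at the same rate, which is exactly the assumption that stopped holding once serial links became much faster.

```python
def to_bits(byte):
    """Split a byte into eight bits, most significant bit first."""
    return [(byte >> i) & 1 for i in range(7, -1, -1)]


def send_serial(data):
    """One wire: every bit takes its own time step."""
    steps = []
    for byte in data:
        for bit in to_bits(byte):
            steps.append([bit])
    return steps


def send_parallel(data):
    """Eight wires: a whole byte goes across in a single time step."""
    return [to_bits(byte) for byte in data]


message = b"Hi"
print(len(send_serial(message)))    # 16 time steps
print(len(send_parallel(message)))  # 2 time steps
```

The parallel version only works if all eight bits of a step arrive together, which is where the timing problems described above come in.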

It should be noted that even though the cables are a thing of the past, if you actually look at your motherboard, or even inside your CPU (which you pretty much can't because they are so small the buses are measured in nanometres, which means that you will be able to see diddly squat), you will still find parallel connections. This is because the distances involved are so small that synchronisation issues aren't that big of a problem.

Another thing about parallel cables is that they really can't handle higher speeds in the way that serial cables do. Now, this is a little year twelve physics - when you send an electric current down a wire, the current will produce a magnetic field around the wire. This is why you generally don't see houses built underneath those huge transmission towers: it has more to do with the electro-magnetic field that is produced than with the towers accidentally falling down (because houses are close enough that if the towers do fall down, then there is still a chance that they can fall on the houses).

Now, the faster you send signals along the cable, the more rapidly that magnetic field changes, and it is a changing field that induces a current in the neighbouring wires. As mentioned above, this is what interferes with the parallel wires, so when the transmission is sped up, the chance of interference will also increase substantially.

More on Error Correcting

So, we have already discussed error correcting codes, so now we shall see how this works in reality. As we understand, there are some codes where errors are detected but cannot be corrected, so that is where something called an Automatic Repeat Request (or ARQ) comes in. Basically if the receiver receives a packet that is flagged as being in error, this packet will be discarded and a negative acknowledgement will be sent back asking the sender to resend that particular packet.

Now, we have something called Stop and Wait ARQ. The diagram below should explain things, but I'll talk you through it as well.

So, what we have here is that a single frame (or packet) is sent, and if it is received without error, then an acknowledgement is sent back. Once the acknowledgement is received, the second frame is sent. Now, if the acknowledgement is not received after a certain period of time, the sender will send the frame again. If the receiver had already received that frame, it will discard the duplicate and send back the acknowledgement corresponding to that frame again. Now, this acknowledgement is the key, because without it the sender does not know whether the frame has been received or not (the frame may have been lost, or the acknowledgement may have been lost), so after a period of time it will resend the frame.

Now, notice how it works on 0s and 1s. Everything with computers works on 0s and 1s (well, not quite, but that is getting into the realm of quantum computing, and we won't be going down that road, at least yet). Now, when frame 0 is received, acknowledgement 1 is sent, which tells the sender to send frame 1. When frame 1 is received, the receiver responds with acknowledgement 0, telling the sender to send the next frame, which becomes frame 0. So the acknowledgement tells the sender whether it needs to send the next frame, or the same frame. In the example above, the second acknowledgement was lost, so when frame 1 was sent a second time, the receiver sent a second acknowledgement 0, because getting frame 1 twice told it that the first acknowledgement wasn't received.
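The alternating 0 and 1 scheme described above can be sketched in a few lines. This is a simplified model rather than a real network stack: the function name and the lost_acks parameter are assumptions for illustration, the losses are scripted by hand so that the run is repeatable, and only lost acknowledgements (not lost frames) are modelled.

```python
# Stop and Wait ARQ with a one-bit sequence number.
# The function name and the lost_acks parameter are illustrative assumptions.

def stop_and_wait(payloads, lost_acks=()):
    """Deliver payloads in order. lost_acks lists the transmission
    numbers (counting from 0) whose acknowledgement goes missing."""
    delivered = []      # what the receiver keeps, in order
    log = []            # a trace of what happened on the line
    expected = 0        # sequence bit the receiver is waiting for
    seq = 0             # sequence bit of the frame being sent
    transmission = 0
    i = 0
    while i < len(payloads):
        log.append(f"send frame {seq}")
        # Receiver side: keep a new frame, discard a duplicate.
        if seq == expected:
            delivered.append(payloads[i])
            expected ^= 1
        ack = expected  # the ack names the frame the receiver wants next
        if transmission in lost_acks:
            log.append(f"ack {ack} lost, sender times out")
        else:
            log.append(f"ack {ack} received")
            seq = ack   # move on to the next frame
            i += 1
        transmission += 1
    return delivered, log


# Lose the acknowledgement for the second transmission (frame 1).
delivered, log = stop_and_wait(["A", "B", "C"], lost_acks={1})
print(delivered)
for line in log:
    print(line)
```

In the trace, frame 1 goes out twice and acknowledgement 0 is sent twice, but the receiver only keeps one copy of the payload, which is the whole point of the sequence bit.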

Finally, let us consider Forward Error Correcting (FEC). Now remember in the previous post we were talking about Hamming Codes and SECDED. Well, this is where forward error correcting comes into play. Basically if the receiver detects an error, it can then attempt to correct that error without having to make a repeat request. It is only when the errors can't be corrected (namely that there are two errors) that the repeat request needs to be made.
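As a rough sketch of that decision, here is a Hamming code with one extra overall parity bit (SECDED) protecting four data bits. The bit layout used here (parity bits at positions 1, 2 and 4, overall parity at position 0) is one common convention and is an assumption on my part, so it may not match the layout used in the previous post; the function names are also made up for illustration.

```python
def encode(data):
    """data: list of 4 bits. Returns 8 bits: c[0] is the overall
    parity bit, c[1] to c[7] is the Hamming(7,4) code word."""
    c = [0] * 8
    c[3], c[5], c[6], c[7] = data
    c[1] = c[3] ^ c[5] ^ c[7]   # covers positions 1, 3, 5, 7
    c[2] = c[3] ^ c[6] ^ c[7]   # covers positions 2, 3, 6, 7
    c[4] = c[5] ^ c[6] ^ c[7]   # covers positions 4, 5, 6, 7
    c[0] = sum(c[1:]) % 2       # overall parity, for double-error detection
    return c


def decode(word):
    """Returns (data, status). status is 'ok', 'corrected' or 'resend'."""
    c = list(word)
    syndrome = 0
    for pos in range(1, 8):
        if c[pos]:
            syndrome ^= pos     # ends up as the position of a single bad bit
    overall = sum(c) % 2
    if syndrome == 0 and overall == 0:
        status = "ok"
    elif overall == 1:
        # One bit flipped. A syndrome of 0 here means the overall
        # parity bit itself was the one that flipped.
        c[syndrome] ^= 1
        status = "corrected"
    else:
        # Two bits flipped: detectable, but not correctable.
        return None, "resend"
    return [c[3], c[5], c[6], c[7]], status


word = encode([1, 0, 1, 1])

one_error = list(word)
one_error[5] ^= 1
print(decode(one_error))    # data recovered, status 'corrected'

two_errors = list(word)
two_errors[5] ^= 1
two_errors[2] ^= 1
print(decode(two_errors))   # (None, 'resend')
```

Only the 'resend' case needs to fall back on a repeat request; a single flipped bit is repaired by the receiver on the spot.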

Now, there are a few others as well, such as Go-back-N. Here a series of frames is sent, and if one of the frames received has an error, then any subsequent frames that are received are discarded and a negative acknowledgement is sent back regarding the corrupted frame, and the sender repeats the process from that particular frame. Finally, there is selective repeat, which is similar to Go-back-N, except that only the corrupted frames are resent. So, the receiver only discards corrupted frames, and sends a negative acknowledgement for each of them, and the sender then only resends the frames for which a negative acknowledgement was received.
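The difference between the two shows up when you list the frames in the order they go down the line. The sketch below is heavily simplified: it sends frames in fixed batches rather than using a true sliding window with timers, it assumes a damaged frame always gets through on the second attempt, and the function names and the window parameter are assumptions for illustration.

```python
def go_back_n(count, corrupted, window=4):
    """Return the frame numbers in the order they are transmitted.
    corrupted holds frames damaged on their first transmission only."""
    sent = []
    damaged = set(corrupted)
    base = 0
    while base < count:
        batch = list(range(base, min(base + window, count)))
        sent.extend(batch)
        # The receiver accepts frames in order until the first bad one;
        # everything after it in the batch is discarded.
        for n in batch:
            if n in damaged:
                damaged.discard(n)  # the retransmission will get through
                base = n            # negative acknowledgement: go back to n
                break
        else:
            base = batch[-1] + 1
    return sent


def selective_repeat(count, corrupted, window=4):
    """Same idea, but only the damaged frames are sent again."""
    sent = []
    damaged = set(corrupted)
    pending = list(range(count))
    while pending:
        batch, pending = pending[:window], pending[window:]
        sent.extend(batch)
        # Good frames are kept by the receiver even if they arrived
        # after a bad one, so only the bad ones are queued again.
        retry = [n for n in batch if n in damaged]
        damaged.difference_update(retry)
        pending = retry + pending
    return sent


print(go_back_n(6, corrupted={2}))          # [0, 1, 2, 3, 2, 3, 4, 5]
print(selective_repeat(6, corrupted={2}))   # [0, 1, 2, 3, 2, 4, 5]
```

With frame 2 damaged, Go-back-N sends frame 3 twice even though it arrived intact the first time, while selective repeat only resends frame 2.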
Creative Commons License

The Intricacies of Data Transfer (and some Other Random Stuff) by David Alfred Sarkies is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License. This license only applies to the text and any image that is within the public domain. Any images or videos that are the subject of copyright are not covered by this license. Use of these images is for illustrative purposes only and is not intended to assert ownership. If you wish to use this work commercially please feel free to contact me.
