Sunday, 28 April 2019

Intro to Data Comms

The proper title of this subject is Data Communications and Net-centric computing, and a lot of people shortened the title to DCNC. Honestly, I didn't particularly think all that much about that, so instead I simply referred to it as data comms. Anyway, you know how we seem to magically be able to connect to a computer on the other side of the world and be able to access the information on that computer almost instantaneously. Well, this subject is designed to actually demystify all of that technobabble and actually demonstrate how it is done. Mind you, one of the reasons that we are able to access Netflix has more to do with there being a server here in Australia as opposed to actually downloading the information directly from the United States. One of the reasons that this isn't all that feasible, despite this information traveling at, or at least pretty close to, the speed of light has something to do with there not actually being a direct cable between Australia and the US.

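Here is a map of where all of the submarine cables are located across the world that enable us to be interconnected in a way that we haven't been before. Oh, and before you ask, satellite transmission adds so much delay that we simply don't bother with it for most traffic, despite the fact that once again the signals travel at, or at least pretty close to, the speed of light. The catch is the distance: a geostationary satellite sits roughly 36,000 km above the equator, so the signal has to travel all the way up and back down again before it gets anywhere.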


Bell's Invention

So, let's consider a little history here. Sure, we could say that Alexander Graham Bell 'invented' the telephone, but honestly, people were sending messages electronically long before he made that famous call to the guy in the next room. The thing is that before the telephone there was the telegraph, which was used to transmit messages across long distances. Before that, with the exception of the Greeks (or was it the Persians) using bonfires to transmit messages, the fastest way to send a message from one place to another was by horse. Actually, the United States had this method known as the Pony Express, where a rider would ride a certain distance, and when he reached a checkpoint, he would hand the parcel to a much more rested horse and rider. Still, that was a pretty slow way of transmitting messages.

Now we have the telephone. The way the telephone works (and after I discovered this I can never look at that device the same again) is that there is a diaphragm in the mouthpiece that vibrates when you speak. The vibration then causes a circuit to connect, though the strength of the circuit will depend upon the strength of the diaphragm hitting the circuit. This is how our voice is modulated into an electrical signal. The signal then travels down a wire, through a system known as the PSTN, or public switched telephone network, to the destination. The electric pulses will then hit a magnet which will grow strong and weak based upon the strength of the signal hitting it. This magnet will cause another diaphragm to vibrate, and this vibration, not surprisingly, produces sound. In fact the sound that is produced is a replication of the sound that was originally spoken into the telephone.

The other thing is how the telephone actually knows where to connect to. Well, originally you would have to call the switchboard and tell the operator who you wanted to connect to. When I was young we had these rotary phones, and later push button phones (which is why we use the term 'to dial a number', and the term 'ring' comes from the fact that a bell in the phone would ring when we called somebody - much different to the Beyonce that comes out of our modern phones). With a rotary phone, as the dial returned to its resting position it would interrupt the line a set number of times, one pulse for each step, and the number of pulses would tell the exchange which digit was requested. Put them all together and you get a telephone number. This was similar to the push button phones, except each of the buttons would send its own distinct pair of tones down the line. When the signal reached the exchange, the computer would interpret these tones and work out the number that was wanted.

Another thing with the phone number is that it is divided into sections - take this phone number 08 8245 2212. The first two digits are the area code (this is an Australian phone number), and tell the exchange which state is wanted. The next four digits (originally it was three, but we ran out of numbers so added another digit to the front) tell the exchange which exchange is wanted. The last four digits are the actual number of the phone that is being dialed.

The thing is that this world is analog in nature, but computers, or at least the computers that we are currently using, really only understand the world as a series of 0s and 1s (or ons and offs, or true and false, but you get the idea). So the trick here is basically attempting to translate what is in effect analog, or continuous, into digital, or discrete.



Sine Waves

So, this is a sine wave, or more appropriately a sinusoidal wave.


I would have pulled the pictures from the notes to show how the sine wave comes from a circle, that is pulled apart and then placed along an axis (which is what is above) but the video below is so much better.


So, the sine wave is basically a continuous line that goes up and down. The wave is made up of a crest, the section above the x-axis, which is the time axis, and the trough, which is the section below the x-axis. The peak to peak amplitude is the distance from the bottom of the trough to the top of the peak. One whole cycle, namely the amount of time it takes for the wave to go to each of the peak and the trough and back to its original position (even if the original position is at one of the peaks), is known as the period. The wavelength is the same idea measured as a distance rather than as a time.

A sine wave can be rendered mathematically as follows:

x(t) = A.sin(2.π.f.t + φ)

Now, we can reduce that by including the angular frequency, which is:

ω = 2.π.f

so, the formula becomes:

 x(t) = A.sin(ω.t + φ)

The values are as follows:

A = amplitude
f = frequency
t = time (in seconds)
φ = phase (in radians)
ω = angular frequency (in radians per second)
π = pi, a constant of approximately 3.14 (it is an irrational number, meaning that its decimal expansion goes on forever).

So, the amplitude is measured along the y-axis, and is usually measured in volts.
The frequency is the number of cycles in one second, and is measured in hertz (Hz). The phase is determined by how far along the x-axis the intersection is (that is where the value of the wave is 0). A phase of 0 is where the wave starts at t=0 with a value of 0 (and goes up).

Let us put that into practice by looking at some sine waves:

So, looking at this we can see that the peak of the waves (or both of them) is 2, so A=2. It takes 100 ms to complete one entire cycle, so that means that there are 10 cycles in a second, so the frequency is 10 Hz. With regards to the red wave, the wave begins at t=0, so the phase is 0. The angular frequency, which is 2πf, is 2 * 3.14 * 10 = 62.8 radians per second.

So, plotting the red wave into the formula, we get v1(t) = 2sin(62.8t), and φ=0.
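To make the formula a little more concrete, here is a minimal Python sketch that evaluates x(t) = A.sin(2.π.f.t + φ) for the red wave. The function name sine_wave and the sample times are made up for illustration, and the code uses the full value of π rather than 3.14.

```python
import math

def sine_wave(t, amplitude, frequency, phase=0.0):
    """Return x(t) = A*sin(2*pi*f*t + phi), with t in seconds and phi in radians."""
    omega = 2 * math.pi * frequency  # angular frequency in radians per second
    return amplitude * math.sin(omega * t + phase)

# Red wave from the example: A = 2, f = 10 Hz, phase = 0.
start = sine_wave(0.0, 2, 10)      # the wave starts at zero
quarter = sine_wave(0.025, 2, 10)  # a quarter of a cycle in, at the peak
half = sine_wave(0.05, 2, 10)      # half a cycle in, back at zero

print(start, quarter, half)
```

A quarter of the way through the 100 ms cycle the wave sits at its peak of 2, and halfway through it is back at zero, give or take a tiny floating point error.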

Now that we have the details of the first, red, wave we can calculate the details of the second wave. To do that we need to work out the change, so:

φ = -2π Δt/T

Now, Δt is the time shift between the two waves, and T is the time for one cycle. We can work out the shift because we have a reference point, so Δt = 70 - 50 = 20 ms. T = 100 ms, and we can convert 2π into degrees easily enough, since it will be 360°. So, we have φ = -360 * 20/100 = -7200/100 = -72°. So, we now know that the phase is -72 degrees.

Thus, the mathematical formula for our second wave is v2(t) = 2 sin(62.8t - 72°) or, keeping everything in radians, v2(t) = 2 sin(62.8t - 1.26).
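The phase calculation can be sketched in a few lines of Python. This assumes the 20 ms shift and the 100 ms cycle read off the graph above. The helper name phase_from_delay is hypothetical, and the phase is kept in radians inside the sine function, since that is what math.sin expects.

```python
import math

def phase_from_delay(delta_t, period):
    """Phase in radians of a wave that lags the reference wave by delta_t."""
    return -2 * math.pi * delta_t / period

phi = phase_from_delay(20, 100)   # both values in milliseconds
phi_degrees = math.degrees(phi)   # the same phase expressed in degrees

def v2(t):
    """Second wave: A = 2, f = 10 Hz, shifted by phi (t in seconds)."""
    return 2 * math.sin(2 * math.pi * 10 * t + phi)

print(round(phi, 4), round(phi_degrees, 1))
```

The second wave crosses zero on the way up at t = 20 ms instead of t = 0, which is exactly what a phase of -72 degrees means.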

Now that we have played around with a hypothetical sine wave, let us take this into the real world and work out the instantaneous voltage of the power supply in Australia. Now, we have two types: AC, or alternating current, and DC, or direct current. Direct current doesn't change, so if we take a DC supply of 230v then at any point along the time axis the instantaneous voltage will always be 230v.




For alternating current, that is somewhat different. The frequency is 50 Hz (that is 50 cycles per second). However, while the voltage is 230v, this isn't the peak voltage, but the root mean square. To find the peak voltage, we use the following equation:

Vp = Vrms * √2

Yep, we have that ugly number there. So, Vp = 230 * √2 which gives us approximately 325 volts.

Now that we have the amplitude, we can plug all the values in.

V(t) = 325 sin (314t)

The phase is 0, so all we need to do to work out the instantaneous voltage is add in the time.

At 0s, V(t) = 325 sin (314*0) = 0v.

At 10ms, V(t) = 325 sin ([314rad/s][0.01s]) = 325 sin (3.14), which is approximately 0v, since 10ms is exactly half of the 20ms cycle.
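As a rough check of these numbers, the following Python sketch works out the peak voltage from the RMS value and then evaluates the instantaneous voltage at a few points in the cycle. It assumes a pure 230v RMS, 50 Hz sine wave with a phase of 0, and the names V_RMS, FREQ and mains_voltage are invented for the example.

```python
import math

V_RMS = 230.0                    # root mean square voltage in volts
FREQ = 50.0                      # mains frequency in hertz
V_PEAK = V_RMS * math.sqrt(2)    # peak voltage, a little over 325 volts

def mains_voltage(t):
    """Instantaneous voltage at time t (in seconds), assuming a phase of 0."""
    return V_PEAK * math.sin(2 * math.pi * FREQ * t)

# Sample the wave at the start, quarter, half and three quarter points
# of one 20 ms cycle.
for ms in (0, 5, 10, 15):
    print(ms, "ms:", round(mains_voltage(ms / 1000), 2), "V")
```

The wave starts at zero, hits the positive peak at 5 ms, passes back through zero at 10 ms, and reaches the negative peak at 15 ms.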

Anyway, enough of this and let's move onto something different, namely Internet Protocols.
Creative Commons License

Intro to Data Comms by David Alfred Sarkies is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License. This license only applies to the text and any image that is within the public domain. Any images or videos that are the subject of copyright are not covered by this license. Use of these images is for illustrative purposes only and is not intended to assert ownership. If you wish to use this work commercially please feel free to contact me.

Monday, 22 April 2019

Live, Die, Repeat - The Edge of Tomorrow

Edge of Tomorrow

2014
Emily Blunt, Tom Cruise
Directed by: Doug Liman
IMDB: 7.9/10 Rotten Tomatoes: 90%

This film actually has two names, and I'm not sure why they changed its name to Live, Die, Repeat so late in the piece (namely when the Blu-ray was released) especially since I didn't actually have a problem with the original name (though there are probably reasons that I am not aware of that prompted the change). Okay, it is a Tom Cruise movie, and while I would generally say that I basically tolerate him, he still seems to find himself in some really cool movies, such as this one.

Anyway, the movie is based on a Manga comic, which in turn is based on a Japanese short novel called All You Need is Kill. The story line is basically the same, but there are a number of differences, such as the novel being set in Japan (and Florida), and the aliens looking, well, rather dull in the novel. If there is one thing that is really cool about aliens in Hollywood, it is that they can make them look really cool, and they definitely do this here. In fact as time moves on the aliens seem to become much more, well, alien, which I think is a good thing, but then again we can probably thank CGI for that.


Synopsis

So, I would basically call this film 'Groundhog Day with Guns' though that is probably putting it lightly. I'm sure (or at least I hope) we are all familiar with that awesome 90s film where Bill Murray lives the same day over and over again, until he realises that he is, well, basically a jerk, and changes. This isn't so much the case with this film though. Well, sort of. The film opens with Tom Cruise, or Major Cage, landing in London. He is a PR rep for the American Army and he has been loaned to the British Army for their operations (though the army is known here as the United Defense Force).

What has happened is that a meteor has struck the Earth near Hamburg and released a horde of Aliens across Europe. They have basically decimated the continent and are now preparing to cross the Channel and invade London. However, a young cadet has recently proved her worth in the battle of Verdun, and the humans are led to believe that they may have a chance at beating these aliens. So, they form a huge invasion force, and decide to strike at the coast of France. The problem is that this invasion is a complete failure, the force is completely destroyed, and the next day the aliens invade London.



Well, this is where the interesting part of the film comes in, sort of. Cruise is ordered to go and do his PR on the beachhead, which he objects to. He says a few things to the General, who is a little upset, busts him down to private, and throws him into the fray. Well, while he may have been a major, the one thing that he doesn't have is combat skills, so he lands up on the beach with no idea how to operate his weapon, and is eventually killed by one of the aliens. However, before he is killed, he manages to wound the alien (he works out how to use the weapon), and is splattered with the alien's blood. He then wakes up, in the morning, and discovers that he is repeating the entire day.

Anyway, before I continue, here is the trailer, for those who are interested.


Normandy All Over Again

I guess the thing that stood out in this film at first was that even from the trailers it was very clear that the beach landing was representative of the Normandy landings during World War II. Actually, when we have a look at the map of the region that the aliens have conquered, it also appears to be reflective of what was conquered during World War II. This is interesting because many of the great wars of Europe have all had the goal of turning Europe into a Fortress, or at least this was what happened during the Napoleonic Wars, and also World War II. There has been a lot said about how England was protected by her moat, and in many cases she was. Theoretically she has not been successfully invaded since the conquests of William. Okay, there was the Glorious Revolution of 1688, but one can argue that the reason this invasion was a success had more to do with the Dutch being invited by the English, or at least the English who didn't particularly like the Stuarts, to come and take the crown.

Yet the opposite is also true. Both Napoleon and Hitler worked out that once they had control of Europe, they could pretty much prevent the English from establishing a foothold. Sure, Napoleon was also attempting to starve English commercialism by denying them any European markets, and technically this was also the case with Hitler. Notice how the English landings in both wars occurred after the respective dictators' failed campaigns against the Russians. In fact, it was only after Napoleon was routed in Russia that the tables finally turned against him. As for the Nazis, while the invasion was a success (and technically it was a two pronged invasion as the Allies were also invading from the Mediterranean since Italy was the weakest link, but even then this involved a rather difficult invasion of Morocco) it certainly didn't come without a huge cost.

The best map I could find, but not really representative.
I guess this is reflective of the difficulties that the UDF is facing in the film in that the Aliens have pretty much conquered Europe, and this invasion is a last ditch effort to attempt to turn the tide. Well, they have also been heartened by an apparent victory in Verdun, which has created its own problems, but it has been suggested in the film that this was actually a set up by the aliens to attempt to lure all of the UDF troops into France for one final victory.

Another thing, London has always been pictured as the last outpost of civilisation in such times. In fact, it is interesting that Orwell, in 1984, has Airstrip One, or the British Isles, as separate from Eurasia. It is almost as if Britain, once the centre of an empire, has now become an outpost of the Anglo-American empire. In a way this seems to be the way it stands now. There actually seems to be much more in common between the British and the Americans (and in turn the Australians) than is the case in Europe. The fact that both Britain and Australia seem to be attempting to dismantle their public healthcare systems in favour of an American style private healthcare system seems to reflect that (though apparently there still isn't such a thing as a private hospital in England).

The Fifth Dimension

Okay, my Dad, who happens to have studied physics at university (and has a doctorate in the subject) really doesn't think all that much of this idea that there are more dimensions beyond the main three. While I'm not so much a physicist, I can sort of understand where he is coming from, particularly since many of these ideas are speculative, and really only exist to attempt to solve problems that many of the modern theories have created. In a way I can appreciate this, since we are basically speculating when it comes to the idea of string theory and all that. Anyway, apparently there are something like ten dimensions, and this video, if you can get your head around the concepts, attempts to explain them.

Anyway, for our purposes we will only be looking at two of these dimensions, namely the fourth and the fifth, because they tend to be linked (well, sort of). If time happens to be the fourth dimension, then probability turns out to be the fifth. Basically this means that if one were to travel back in time (if it is possible in the first place), and change something (and the fact that one has traveled back in time pretty much suggests that everything has been changed anyway), then a completely different time line peels off. Basically the fifth dimension is inhabited by the infinite number of different universes that have been created by the infinite number of different choices that people have made.
A cube with 5 dimensions
So, what has happened in the film is that the aliens have this ability to travel in the fifth dimension, meaning that they are able to see all of the different outcomes of all of the different choices, and pretty much travel along the path that leads to the best outcome for them. However, there is a bit of a catch, because if this was the case, then why haven't the aliens pretty much annihilated the Earth in the time between the initial landing and this invasion? Surely they know the best path to take, so why haven't they taken it?

The idea is that the aliens are able to reset each of the days, and make different choices, and this has pretty much made them unbeatable. Once again, there is this idea of not knowing which timeline we are traveling down, and it seems that the aliens have already reset the days enough so that we are pretty much trapped in one strand of the dimension. However, the aliens didn't realise that a human could become infected with their blood, and suddenly also be able to reset the day.

Breaking the Fifth Dimension

Okay, there is a video that goes into a lot more detail on the problems with the film than I do, but then again this guy does use an awful lot of sarcasm in his YouTube videos.


However, I guess if we took this too seriously, and followed some of these ideas, then we probably wouldn't actually have a film. However, if the aliens are able to see the various courses of history, then surely they would have seen the chance that one of the humans would become infected and also start repeating days, and acted to prevent this. Of course, they might have known this, but then had issues with actually finding a timeline where this doesn't actually happen so as to avoid it. Maybe this is why only two people actually became infected as opposed to a lot more. In fact, maybe this is the soft underbelly of the aliens that needs to be exploited.

Not surprisingly, nobody actually believes Cage, or his friend, that what is happening is actually happening. Then again, she is also under-estimating Cage's PR skills, as is the case in the various other scenes where he is able to get his way into the general's office, and then get the device off him. Yet there is also another slight problem, because if this day is now pretty much set, meaning that every time Cage dies, he goes back to the beginning of this one day, then quite possibly everything happens in the way it is supposed to happen, with the exception of the various choices that Cage and his partner are able to make.

In a way time has now been set, yet once one understands what is happening, one is suddenly free to make alternate choices that will basically change the course of history. Foreknowledge is a very, very powerful thing. Except there is one little catch - that alien that Cage killed is still on that beach, and Cage knows exactly where that alien is, so why hasn't he used that knowledge to then infect his partner in crime, and then in turn the other guys in his platoon (particularly the guy that insists on going into battle naked, with the exception of his power armour)?



Well, maybe the problem is that it won't actually work. You see, if the day resets for Cage, then all of a sudden he is back to square one, which means his partner, or anybody else in the squad, is also back to square one. Sure, in their timeline, both they and Cage are infected, but that is a timeline that has now broken off from the current Cage's timeline, and he simply can't traverse the fifth dimension - he is only stuck in a continual time loop. In fact he can't even traverse the fourth dimension, with the exception of being able to loop back to the beginning of that single day.

Yet, this could be easily changed by infecting everybody in the group at the same time, including reinfecting Cage himself. However, the problem then arises that, well, they happen to be in the middle of a war zone, and these aliens, the Mimics, really don't seem to be the types of creatures that can easily be captured. Yet, one would think that maybe, just maybe, Cage could see the alternates to where he is able to actually capture the alpha, and then do his magic trick. Then again, there simply may not be enough time to do so. I guess that even with the infinite amount of probabilities that the fifth dimension opens up, there are still some restrictions that are in place and there are certain impossible things that cannot occur.

Still, there would be something cool about being able to live in a repeated day, over and over again, yet I guess, like Bill Murray in Groundhog Day, it gets to a point where you are literally driven insane so you steal the groundhog and drive off the cliff in a ute.

Edge of Tomorrow Poster: Wikipedia.
Mimic: Aliens.fandom 

 
Creative Commons License

Live, Die, Repeat - The Edge of Tomorrow by David Alfred Sarkies is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License. This license only applies to the text and any image that is within the public domain. Any images or videos that are the subject of copyright are not covered by this license. Use of these images is for illustrative purposes only and is not intended to assert ownership. If you wish to use this work commercially please feel free to contact me.

Sunday, 14 April 2019

Using the Cloud - Preserving Privacy

One thing is that many of us really don't see much beyond the time when we enter our username and password into a website and then go about our normal business. Sure, every so often we hear about how a website has been hacked and data stolen, but generally many of us don't give much of a thought about how, or even if, our data is secure. This is very much the case now with the cloud, particularly since the cloud is able to provide much more computing power, at a much cheaper price, than either our personal desktops, or even the company server. However, while the cloud might be pretty powerful, we need a way of being able to use it while maintaining the privacy of our information.

However, privacy goes much further than making sure that people we don't want looking at our Facebook statuses and updates don't (though the solution to that is to basically not post anything at all) and extends to real information that many of us wouldn't want being made public - such as our medical records or our financial situation. This is where the concept of privacy preserving computation comes into play. The thing with privacy is that if it is breached it could be used to commit fraud, or even worse, particularly if your medical records somehow land up in the hands of, say, your employer (who really shouldn't be allowed to have access to them anyway).

The other thing is that we have these companies that mine data - Facebook is a classic example. As one person suggested, if you are getting a service for free, then the product is you. The thing with this data mining is that it can be used to generate targeted advertising, or even worse. For instance, with the amount of personal information many of us post on Facebook, it is scarily easy for somebody to assume our identity. However, many applications are very data heavy, and need extra computing power to process the information, but the problem is that the cloud simply cannot be trusted. As such we need a way of using the cloud, while maintaining our privacy.


So, basically we simply can't use the cloud to perform its functions on unsecured data because, well, there are privacy issues to take into account, and honestly, if the data is unsecured then pretty much anybody can look at it. However, we can't just encrypt the data because when it is encrypted then we simply are unable to do anything with it, except maybe to decrypt it. So, what we need is a way to perform these functions and to perform them in such a way that the data is, and remains, secure. The thing is that if we are encrypting the data, and then performing the operations on the data, it is also going to take longer, and require more power, so we will also need a way to distribute this workload.

What we thus need is a way to encrypt the data, send it into the cloud, have the cloud perform the function on the data, and then either return the result to us, or forward it to the third party. Now, we will basically be looking at how we can perform addition and multiplication, namely because all of mathematics boils down to these two concepts - well not quite because multiplication is actually a form of addition, though since we have methods of securely performing multiplication, we can include that as well.

Homomorphism

So, homomorphism is what is known in algebra as a structure preserving map, or in computer science terms a procedure where performing an operation on the encrypted values gives the same answer, once decrypted, as performing it on the original values. So, we have partially homomorphic procedures, which support a single operation, but we don't as yet have a practical fully homomorphic procedure. For instance, RSA and El-Gamal can perform multiplication, but not addition, while Paillier can perform addition, but not multiplication (which sort of doesn't really make sense since multiplication is basically an extension of addition). However, even if we did have a practical fully homomorphic procedure, the problem would be the speed at which the operation is performed. The more complicated the operation, the longer the procedure takes to complete. Actually, we do have some fully homomorphic schemes, but they are so slow that we might as well not bother with them and simply do the procedure by hand.

RSA Multiplication

Okay, let us have a look at some of these procedures in operation, beginning with RSA, which happens to be the easiest.

So, we have M1 = 3 and M2 = 4, and we want to multiply these numbers together. So, the public key n=33 and e = 7 and the private key d=3 are generated. Now that we have these, it is time to encrypt the message:

C1 = M1^e mod n = 3^7 mod 33 = 9.
C2 = M2^e mod n = 4^7 mod 33 = 16.

So, now that the values are encrypted, they are sent to the cloud where the numbers are multiplied, producing the answer 144. This is then sent to the receiver (whether it be the original computer or not), where the answer will be decrypted.

MA = CA^d mod n = 144^3 mod 33 = 12
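The whole RSA example fits in a few lines of Python, which makes it easy to check the arithmetic. This is only a sketch using the toy key from above (n=33, e=7, d=3), with no padding, so it is nowhere near secure. The function names are made up for illustration.

```python
# Toy RSA key from the example above: public (n, e), private d.
N, E, D = 33, 7, 3

def encrypt(m):
    """Encrypt a message with the public key: c = m^e mod n."""
    return pow(m, E, N)

def decrypt(c):
    """Decrypt a ciphertext with the private key: m = c^d mod n."""
    return pow(c, D, N)

c1 = encrypt(3)            # 9
c2 = encrypt(4)            # 16

# The "cloud" only ever sees the ciphertexts and multiplies them.
product_cipher = c1 * c2   # 144

result = decrypt(product_cipher)
print(result)              # 12, which is 3 * 4
```

Note that this only gives the right answer while the true product is smaller than n, since everything is worked out mod 33.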


El-Gamal Multiplication

Okay, that was pretty easy, but now it is time to move it up a notch and have a look at another process, this time using El-Gamal:

So, first of all we select a prime number, p, which in this situation will be 2879. We then select the generator g, which is 2585. Then we select the secret key x=47 and from there generate y, which is:

y = g^x mod p = 2585^47 mod 2879

The numbers that we want to multiply are then sent to two different servers (I never said that this was going to be easy) where the random numbers are chosen. Server 1 chooses r1 = 154 and Server 2 chooses r2 = 96. They then encrypt m1 = 5 and m2 = 6 as follows:

C11 = g^r1 mod p = 2585^154 mod 2879 = 1309
C12 = m1*y^r1 mod p = 5*y^154 mod 2879 = 199

On server 2, we do a similar thing:

C21 = g^r2 mod p = 2585^96 mod 2879 = 1138
C22 = m2*y^r2 mod p = 6*y^96 mod 2879 = 2433

Now that we have encrypted the data they are then sent into the cloud to perform the calculation.

C3 = C11*C21 mod p = 1309*1138 mod 2879 = 1199
C4 = C12*C22 mod p = 199*2433 mod 2879 = 495

Now that the functions have been performed we note that we still have two numbers, when in reality we are only looking for one. Well, obviously this isn't quite over, so we need to do something else. However this isn't done in the cloud, but rather it is performed on the client computer.

MA = C4 / (C3^x) mod p = 495 / (1199^47) mod 2879 = 30, where dividing mod p means multiplying by the modular inverse.

So, MA = 30, which we will note is the correct answer.
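Here is a minimal Python sketch of the El-Gamal steps, using the same p, g, x, r1 and r2 as the example. The function names are invented for illustration, the division is done by multiplying with a modular inverse (which needs Python 3.8 or later for pow with a negative exponent), and the numbers are far too small to be secure.

```python
P = 2879          # prime modulus
G = 2585          # generator
X = 47            # secret key, kept on the client
Y = pow(G, X, P)  # public key y = g^x mod p

def encrypt(m, r):
    """Encrypt m with random r, giving the pair (g^r, m * y^r) mod p."""
    return pow(G, r, P), (m * pow(Y, r, P)) % P

def multiply(ca, cb):
    """What the cloud does: multiply the two pairs component by component."""
    return (ca[0] * cb[0]) % P, (ca[1] * cb[1]) % P

def decrypt(c):
    """Client side: divide the second part by (first part)^x, mod p."""
    shared = pow(c[0], X, P)
    return (c[1] * pow(shared, -1, P)) % P

cipher1 = encrypt(5, 154)   # done on server 1
cipher2 = encrypt(6, 96)    # done on server 2
result = decrypt(multiply(cipher1, cipher2))
print(result)               # 30, which is 5 * 6
```

As with RSA, the answer is only right while the true product is smaller than p.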

If you thought that was a little complicated, let us move on to the final one, and that is addition with Paillier:

Paillier Addition

Well, we have the two numbers M1=4 and M2=1. Now for this to work we need some other numbers, so if you refer back to the post on Paillier, you will see how we derived them, but so as not to go over old ground, we will simply produce them as follows: p=5, q=7, n=35, λ = 12, g=164, and μ = 23. We also need two random numbers, so we have r1=6 and r2=17.

Now, we encrypt the numbers using c = g^m * r^n mod n^2. This produces C1=416 and C2=127.

Now that we have encrypted the numbers, we can send them into the cloud to add them. Well, we aren't actually adding them, rather we are multiplying them, so C1*C2 = 416*127 = 52832, and this number is then sent to the receiver (whoever that may be) to be decoded.

So, by using the formula m = L(c^λ mod n^2)*μ mod n, where L(u) = (u-1)/n, we come back to the value of 5, which is 4+1.
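The Paillier example can be checked with a short Python sketch as well. It simply plugs in the values given above (n=35, λ=12, g=164, μ=23, r1=6, r2=17) rather than generating a key, the function names are made up for illustration, and once again the numbers are far too small for real use.

```python
N = 35         # n = p * q = 5 * 7
N2 = N * N     # n squared, 1225
LAM = 12       # lambda
G = 164        # generator
MU = 23        # mu

def L(u):
    """The L function from the decryption formula: L(u) = (u - 1) / n."""
    return (u - 1) // N

def encrypt(m, r):
    """c = g^m * r^n mod n^2"""
    return (pow(G, m, N2) * pow(r, N, N2)) % N2

def decrypt(c):
    """m = L(c^lambda mod n^2) * mu mod n"""
    return (L(pow(c, LAM, N2)) * MU) % N

c1 = encrypt(4, 6)     # 416
c2 = encrypt(1, 17)    # 127

# The cloud multiplies the ciphertexts, which adds the hidden values.
combined = c1 * c2     # 52832
total = decrypt(combined)
print(total)           # 5, which is 4 + 1
```

Multiplying the two ciphertexts adds the exponents of g, which is why a multiplication in the cloud turns into an addition after decryption.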

In Action - E-Voting

Well, e-voting seems to be pretty controversial, particularly among the people who support the losing candidate. However, this is one area where this concept works, and that is the secure addition of numbers. Basically, when the vote is cast, it is sent into the cloud where the results are all compiled, and this is then sent down to the voting authority who will decrypt the data and thus produce the result. The thing is that because computers really are mysterious entities that seem to do things out of sight of prying eyes, people do get concerned over the authenticity of the result. For instance, here in Australia, votes are counted manually, and scrutineers from the various parties will be in the room making sure that the right vote for the right candidate is counted - this is something that isn't all that possible when it comes to electronic voting, though some would argue that this removes the need for scrutineers since ambiguity is something that can be done away with - you either push one button, or the other, you can't just sort of push it, but not.


In our example, we will have two candidates, say Hillary Clinton and Donald Trump. Well, in this instance we will only have five voters, just for simplicity's sake. Now, each of the votes is represented by a 4 bit number, so if you vote for Clinton, your vote will be 0100, while if you vote for Trump, your vote will be 0001. Now, when these votes are cast, they will be encrypted, and sent to the cloud where they are all added together.


Basically, each of the voters has a private number, which is used to encrypt the vote. Once encrypted, the votes are collated in the cloud using the Paillier encryption system above, and the answer is then sent to the voting authority, where the result is decrypted. In this example, the result comes back as 14. Now, that doesn't actually mean anything, that is until we turn it into binary: 1110, which is then split in two to produce 11 for Hillary, and 10 for Donald. Thus Hillary got three votes while Donald only got two.
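The way the tally is packed into one number can be sketched in Python. To keep it short this leaves the encryption out entirely and just adds the plain votes, so it only shows the bit packing. In the real scheme the sum would be done on the encrypted votes. The order of the five votes and the variable names are made up for illustration.

```python
CLINTON = 0b0100   # a vote for Clinton adds 1 to the upper two bits
TRUMP = 0b0001     # a vote for Trump adds 1 to the lower two bits

# Five voters: three for Clinton and two for Trump (order is arbitrary).
votes = [CLINTON, TRUMP, CLINTON, CLINTON, TRUMP]

total = sum(votes)            # what the voting authority sees after decrypting
bits = format(total, "04b")   # the tally as a 4 bit binary string

clinton_count = total >> 2    # upper two bits
trump_count = total & 0b11    # lower two bits

print(total, bits, clinton_count, trump_count)
```

With only two bits per candidate this falls over as soon as anybody gets more than three votes, so a real election would need a much wider field for each candidate.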

Well, we could easily say that Hillary was the winner, except that in the United States the election is determined by the electoral college and not by a simple majority, so despite the fact that Hillary got more votes, Donald still wins because, well, that's just the way things happen.

There are other applications as well, such as electronic meter reading for electricity usage. Normally some guy just comes along, opens up a cabinet at the side of the house, reads the meter, makes a note of the reading, and then moves on. This method means that electricity usage can be monitored and read across the entire day, and we can also get more accurate readings and indications of usage. However, this type of information we don't want people to get hold of, because electricity usage charts can give people a pretty good idea of when somebody will be home, and when they won't.

Consider another option - say you have multiple bank accounts, and funds spread out across these accounts. Say then that you want to purchase something, and while you have the funds, you don't have the funds in a single account. As such, this system can access the bank accounts and tally up all of the amounts and then determine whether you have enough to purchase the item, all the while not actually knowing how much is in each of the accounts.

This could also be used to store biometric data for security, or even to protect your location when using the GPS system (since all phones these days are GPS equipped). So, as you can see, this whole concept of being able to perform calculations in the cloud, or in fact doing anything in the cloud, requires there to be strong security, and in a way it does actually go beyond simply adding two numbers together.


Creative Commons License

Using the Cloud - Preserving Privacy by David Alfred Sarkies is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License. This license only applies to the text and any image that is within the public domain. Any images or videos that are the subject of copyright are not covered by this license. Use of these images is for illustrative purposes only and is not intended to assert ownership. If you wish to use this work commercially please feel free to contact me.

Tuesday, 9 April 2019

The Operating System - Between it and you

So, you could say that you can divide operating systems into two types, the Unix based systems, and Windows. It's true, pretty much any operating system in common use that isn't Windows is based on, or modelled on, the Unix systems developed back in the 70s. Actually, a lot of them draw on BSD, or the Berkeley Software Distribution (from the University of California, Berkeley), whose best known descendant is FreeBSD. The Apple Mac and the PlayStation 4 both build on BSD code, while Android is built on the Linux kernel, which is another Unix-like system.

And then there's Windows. Bill Gates had to be different, but we have to at least give him a little credit, namely because he also started working on his systems back in the 70s, and used (or should I say borrowed from) an operating system called CP/M which was in wide use at the time. However, he basically kept his systems proprietary, whereas many of the Unix based systems were open source, so basically the choice was to pay Bill Gates a lot of money and use his system, or to build your own from another system that had already been developed. Well, you can guess which way people decided to go (actually, until the rise of the smartphone, they just paid Bill Gates a lot of money).

The role of the operating system is to basically enable you to use the computer. Well, you could use it by simply feeding it a bunch of 0s and 1s, but honestly, you probably need a PhD to be able to do that, and to do so simply to play a round of Halo is going to be a little time consuming. So, the operating system does all of the grunt work for you so all you need to do is to plonk the CD into the X-Box, press play, and you are good to go.

I'm not sure if it is possible to play Halo on this machine

Another way to look at operating systems is through their interface. You have two types - GUI for graphical user interface, and the other one, which pretty much doesn't exist any more. In retrospect they are referred to as Textual User Interfaces (or command line interfaces), and you can still access them by opening up the 'Command Line' in Windows. Actually, if you know what you are doing, using a TUI can actually be much faster than all the pointing and clicking that you do with your mouse, though once again, you do need to know what you are doing.

Here's one I prepared for you earlier

Operating systems perform a number of functions, such as:

Resource Management: The operating system (OS for short) manages the computer's resources, namely the CPU, the graphics card, the memory, the hard drives, and even that little cup warmer that you have plugged into your USB port (well, not really, that's just using the computer's power to keep your coffee warm).

Data Management: The input and output, and arrangement of data on the various devices is also managed by the operating system. When you search through your directories looking for that particular bill you thought you paid, that is actually the OS managing the arrangements, not the hard drive. The directory structure is still saved to the hard drive in a special file, it is just the OS that interprets it.

Job Management: This probably goes hand in hand with the above, but the OS runs a scheduler that tells the CPU what to do and when. So, when you have multiple things running at the same time (such as that Iron Maiden tune you have playing on Spotify while you are finishing off that essay you have to hand up in two hours' time) it is the OS that allocates the CPU's time to these tasks.

Let's look at a few more types of operating systems:

Real Time: As the name suggests these operating systems operate in real time, and are required for incredibly precise operations, such as piloting a spacecraft or managing a nuclear reactor. These operating systems typically take little or no direct user input, and are usually locked away in a sealed case. The other thing is that these operating systems are built to be predictable rather than fast: what matters is that a task always finishes by its deadline.

Single User/Single Task: Basically these operating systems perform a single task at a time, and are designed to be used by a single user, such as the old phones that a handful of people still use. On the flip side, you have the single user, multi-tasking systems, such as your desktop or laptop. In these situations ease of use is critical.

Multi-User: These operating systems are designed to have multiple users access the system and work on them. They generally don't sit on your desktop but rather on the server that your desktop is connected to. In a lot of cases, the end user will probably be using a Single User system, while the server system is running a multi-user system. One of the reasons that this is useful is that if one wishes to update the system, they only need to do it on one machine as opposed to a whole heap of them. Linux and Unix operating systems are examples of this. Reliability is the key factor with these computers.

Distributed Systems: These operating systems sit on multiple computers but actually make them appear as if they were one computer. This allows for system wide sharing of resources. Users of these operating systems should not need to know which computer they are using, or where their files are stored. Synchronisation of communications is essential for these systems.

Embedded Operating Systems: These operating systems are basically 'embedded' on a device, such as Android on your phone (or iOS). This is also the case with systems like PlayStation and Xbox, as well as your smart-TV, or even printer (if you have a mega-fancy one that is). In this situation real-time performance may be critical (though ease of use is also a factor).

On to the sections of the Operating System, which is probably best illustrated by the following chart:


Kernel: This is the part of the operating system that communicates directly with the hardware of the computer. In a way it acts as a sort of gate keeper between the upper layers and the devices so that you don't have everybody trying to get to the same thing at the same time and pretty much crashing the system. This is where the scheduling takes place and it shares the resources across the applications. This needs to be working properly, and I mean 100% properly, because a single bug can lock up the entire system.

Device Drivers: Okay, these also communicate with the devices, but they need to because it is the drivers that actually allow the computer to use them. If the device driver for a particular device is not present, then that device is basically useless. These also need to be properly coded because if they aren't then it is going to affect the entire system.

API Layer: Known as the Application Programming Interface, this is the layer between the applications (or games if that's your thing) and the device drivers and kernel. The APIs allow the kernel and the drivers to communicate with the application, and also feed the various tasks down through to them. There are actually a number of APIs that perform various specialised tasks such as graphics and networking, among others. The APIs break the tasks down into 'competent cells' so that the lower levels are able to interpret, and then perform them. The APIs have no direct access to the hardware.

Applications: Or games, if that happens to be your thing. This is the top layer of the operating system, and is where you find all the programs that you play around with, such as the browser for those 'special' sites, your word processor, and even that calculator that you have open so that you don't have to think too hard when it comes to arithmetic. Once again, this layer has no direct access to the hardware and all requests need to be fed down through the APIs.

Desktop Design

So, the best design for a desktop operating system is one that reliably performs the tasks that the user desires, and it needs to interact seamlessly between the applications and the hardware. The trick is balancing ease of use, speed and reliability. In this instance, using a proven kernel is the trick because we simply don't want the computer crashing in the middle of Grandma talking with her daughter overseas, especially if Grandma has no idea how to use a computer.

The device drivers need to configure themselves without any user input, and in this instance the drivers should at least have some kernel level access to the devices (particularly graphics). In fact, when it comes to device drivers, we really don't want the user to have to reboot the system every time they want to plug in or remove the flash drive. The API needs to be well developed to allow for multiple applications to be developed, and also provide ease of development. Finally, it should have a graphical interface where the user can simply know what does what.

Scheduling

There are a number of ways that tasks can be scheduled by the Process Scheduler: First Come, Best Dressed (more formally First Come, First Served); Shortest (Shortest Job First); Priority; and a round robin tournament. While the first come process is fairly easy to implement, it can result in pretty horrendous wait times, particularly if there is a queue of rather long processes. The shortest method could be problematic if a lot of pretty long processes keep getting dumped at the end of the queue. Thus we have the priority method, which is normally not one where the user assigns a priority, but rather one where the scheduler assigns the priority based on how important the task happens to be. Then again, you can also assign a priority to a task manually.
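
Just to put some numbers on the wait times, here is a rough sketch in Python comparing first come with shortest first. The burst times are made up, and I am assuming that all three jobs turn up at the same moment and that nothing gets interrupted once it starts.

```python
def average_wait(burst_times):
    """Average time a job spends waiting before it starts, run in the given order."""
    waited = 0
    clock = 0
    for burst in burst_times:
        waited += clock      # this job waited for everything ahead of it
        clock += burst       # then it occupies the CPU for its burst
    return waited / len(burst_times)

# Hypothetical jobs: one long one at the front of the queue, two short ones behind it.
jobs = [24, 3, 3]

fcfs_wait = average_wait(jobs)          # run in arrival order
sjf_wait = average_wait(sorted(jobs))   # run the shortest jobs first
```

Here the first come order gives an average wait of 17 time units, while running the short jobs first drops that to 3. The catch is that the long job now goes last, and it would keep going last if more short jobs kept arriving.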

There are a number of different types of priorities: static, which is assigned when the task is created, and dynamic, which depends on other factors after creation such as behaviour within the system. Then you have internal and external priorities, one assigned within the operating system, and the other being assigned by other factors, such as an impatient user (who can't wait the 10 nano-seconds for the task to be completed). The major problem with priorities is that it may result in low priority tasks being starved (that is waiting in the queue forever).
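
Starvation is easier to see than to describe, so here is a small sketch using Python's heapq module as the ready queue. The task names and priority numbers are invented, and I am assuming that a lower number means a more important task and that one task gets to run per tick.

```python
import heapq

# Ready queue as a min-heap of (priority, arrival_order, name).
# Assumption: a lower priority number means a more important task.
ready = []
heapq.heappush(ready, (5, 0, "backup"))   # low priority task that arrived first

ran = []
for tick in range(1, 6):
    # A new high priority task turns up on every tick...
    heapq.heappush(ready, (1, tick, f"urgent-{tick}"))
    # ...and the scheduler always picks the most important task waiting.
    _, _, name = heapq.heappop(ready)
    ran.append(name)

still_waiting = [name for _, _, name in ready]
```

After five ticks every urgent task has had its turn, and "backup" is still sitting in the queue even though it was there before any of them. As long as the urgent tasks keep coming, it never runs.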

If you do decide to set the priority for a specific task, it is for one time and one task only, and will have to be reset if you wish to do it again.

The round robin system is where pretty much every task gets a fair go at using the CPU. They are each allocated time slots, and they will use the CPU for that particular time slot before moving aside to let the next one have a go. Actually they are called time quanta because, well, scientists.
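
Here is a minimal sketch of the round robin idea in Python. The task names, burst times and the quantum of 2 are all made up, and I am ignoring the cost of switching between tasks, which a real scheduler cannot do.

```python
from collections import deque

def round_robin(bursts, quantum):
    """Return {task: time at which it finished} for a simple round robin run."""
    queue = deque(bursts.items())   # (name, remaining time), in arrival order
    clock = 0
    finished = {}
    while queue:
        name, remaining = queue.popleft()
        run = min(quantum, remaining)   # use the whole quantum, or less if nearly done
        clock += run
        remaining -= run
        if remaining:
            queue.append((name, remaining))   # back of the line for another go
        else:
            finished[name] = clock
    return finished

# Hypothetical tasks needing 5, 3 and 1 units of CPU time.
finish_times = round_robin({"A": 5, "B": 3, "C": 1}, quantum=2)
```

With these numbers C finishes at time 5, B at 8 and A at 9, so the short task doesn't get stuck behind the long one, and nobody waits more than a couple of quanta for their next turn.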

Another way of doing it is called multi-level queuing. This is where tasks that are similar are gathered together in separate queues, and each of the queues has its own scheduling regime. The priorities are then set for each of the queues to perform their varied tasks.

Virtual Memory

Now on to paging, which is annoying. Operating systems, like pretty much all of the other parts of the computer, divide things into chunks. Each program's memory is split into fixed-size chunks called pages, and the physical memory is split into slots of the same size called frames, so a page that is in use sits in a frame. However, if the memory is close to being full the operating system then does something called paging. What it does is that it creates space on the hard drive, and then moves the pages that aren't currently needed out of their frames and places them there. This section of the hard drive acts as virtual memory (you will also see it called swap space, or the page file on Windows).

Now, if the operating system is looking for a page and can't find it in memory this is called a page fault. When that happens it goes to the virtual memory on the hard drive and, once the page is found, returns it to memory where it can then be easily accessed. This can lead to thrashing.

Basically thrashing occurs when the operating system is repeatedly sending pages to the virtual memory, and also retrieving them, and due to hard drives being ridiculously slow compared to the memory this leads to a significant drop in performance. In fact, in some cases when thrashing occurs it can only be solved through user intervention (such as shutting down that horrendous number of programs that aren't actually being used, or just upgrading your computer).

So, what happens when a page fault occurs is that the computer puts the program to sleep, goes and searches the virtual memory for the required page, brings it back and places it in the memory, and then wakes the program up again so that it can execute the instructions. The page table is also updated with the new information. Demand paging is where a page is only brought into memory when it is actually needed, swapping out another page (usually from the same process) if there is no free frame, whereas a memory swap is where an entire process that is in memory is swapped with a process on the hard drive.
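
The page fault steps can be sketched in a few lines of Python. This is only a toy model: the class name is made up, the 'hard drive' is implied rather than simulated, and I am assuming the oldest page is the one that gets thrown out (first in, first out), which is just one of several possible replacement rules.

```python
from collections import OrderedDict

class TinyPager:
    """Toy demand pager: pages are only loaded when something asks for them."""

    def __init__(self, frame_count):
        self.frame_count = frame_count
        self.page_table = OrderedDict()   # page number -> frame number, oldest first
        self.faults = 0

    def access(self, page):
        if page in self.page_table:
            return self.page_table[page]      # already in memory, no fault
        self.faults += 1                      # page fault: the page is not in memory
        if len(self.page_table) < self.frame_count:
            frame = len(self.page_table)      # there is still a free frame
        else:
            # Assumed rule: evict the page that has been in memory the longest.
            _, frame = self.page_table.popitem(last=False)
        self.page_table[page] = frame         # 'load' the page and update the page table
        return frame

pager = TinyPager(frame_count=3)
for page in [1, 2, 3, 1, 4, 1, 2]:
    pager.access(page)
```

With three frames this run causes six page faults out of seven accesses, which is a small taste of what thrashing looks like: nearly every access ends in a trip to the hard drive.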

This process is tenable as long as the pages being accessed aren't being accessed at random. Pages placed into virtual memory are normally those that aren't expected to be accessed for a period of time. This is one of the major reasons that in programming we don't replicate code that we have already written, to prevent extended access to the virtual memory. Code that is being used regularly is usually kept in the memory. The axiom is that a program spends 90% of its time on 10% of the code. This is also known as locality.

There are two types of locality: temporal and spatial. Temporal locality says that code that has recently been used is likely to be used again in the near future. Spatial locality says that code in addresses close together tends to be used together. These concepts work to reduce the amount of time spent searching the virtual memory for pages.
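
To see how much difference locality makes, here is a rough Python sketch that counts page faults for two access patterns. I am assuming that the page thrown out is the least recently used one, and both access patterns are invented purely for the comparison.

```python
from collections import OrderedDict

def count_faults(accesses, frame_count):
    """Count page faults, assuming the least recently used page is evicted."""
    resident = OrderedDict()   # pages in memory, least recently used first
    faults = 0
    for page in accesses:
        if page in resident:
            resident.move_to_end(page)        # used again, so it is now the freshest
        else:
            faults += 1
            if len(resident) >= frame_count:
                resident.popitem(last=False)  # throw out the stalest page
            resident[page] = True
    return faults

# A tight loop that keeps coming back to the same 4 pages (good locality).
tight_loop = [i % 4 for i in range(1000)]
# Accesses that hop around 50 different pages (poor locality).
scattered = [(i * 7) % 50 for i in range(1000)]

local_faults = count_faults(tight_loop, frame_count=4)
scattered_faults = count_faults(scattered, frame_count=4)
```

With four frames the tight loop faults only four times (once for each page, and never again), while the scattered pattern faults on every single one of its 1000 accesses.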

Virtual Machines

These are actually more common than I originally thought. What virtual machines do is that they add another layer between the API layer and the device driver layer, called the virtual layer. Basically it is where a virtual computer is being run on your computer. These are used with programming languages like Java, which uses a special API to integrate it with the various operating systems and computers. However there are other forms of virtual machines, such as VirtualBox by Oracle, and also VMware. These allow you to install operating systems onto the virtual machine for trial purposes. Emulators, such as that C64 emulator that you use to play the really cool games from the 80s, or DOSBox for those old school games, are also a form of virtual machine.

The main problem with virtual machines is that they add another layer to the operating system, which has the effect of slowing things down. Anything that is running on the virtual machine is going to be less efficient than if it had been installed directly onto the operating system. However, when you consider that some of the emulators deal with games and programs that are decades old, then performance issues really aren't going to be a big problem (unless of course they happen to run at the speed of a modern computer, which means they become unplayable).

Punch Card Reader: By Mike Ross - CC BY-SA 3.0

Creative Commons License

The Operating System - Between it and you by David Alfred Sarkies is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License. This license only applies to the text and any image that is within the public domain. Any images or videos that are the subject of copyright are not covered by this license. Use of these images is for illustrative purposes only and is not intended to assert ownership. If you wish to use this work commercially please feel free to contact me.