Saturday, 22 June 2019

Steganography - Data Hiding

So, what can you tell me about this picture?

Well, you could tell me that it was painted by Leonardo da Vinci, that it is located in the Louvre, that it draws mind-boggling crowds on a daily basis, or that the only reason it is actually famous is that somebody stole it back at the beginning of the 20th century. However, what if I were to tell you that there happens to be a message hidden in this painting? No, I am not talking about The Da Vinci Code, but rather about a cousin of cryptography known as steganography, or data hiding. This is the process of hiding a message inside something else so that nobody can see that a message is there at all.

The example that our lecturer gave us was that back in the days of the Ancient Greeks, when they wanted to send a message they would shave the head of a slave, tattoo the message onto the scalp, and then let the hair grow back again. Honestly, that doesn't sound all that practical, since hair takes so long to grow that by the time the message was delivered the war would be over. Still, it is an ancient example of steganography.

The benefit that steganography has over cryptography is that the message is hidden in plain sight. Cryptography has us scramble the message with a key, but if the message is intercepted, the interceptor knows that something is being hidden because, well, the message is scrambled. However, if they intercept a bunch of family photos, they may not realise that hidden in these photos are a bunch of nuclear launch codes. Another example would be a situation in a prison. Say an encoded message is passed from one cell to another, and the warden gets hold of it. Since the message is encoded, the warden knows something is up. However, say a book is passed through with a bookmark. Each of the prisoners has a piece of card with some holes in it. If the card is placed over the page where the bookmark is located, then suddenly the message is revealed. Unless the warden knows of the existence of the card, all he (or she) thinks is that the prisoners are simply sharing books.

In fact, we can hide messages in quite a lot of things, whether it be paintings, audio files, or even the HTML code of a website. The object in which we hide the message is known as the cover object, and once the message is hidden, the object becomes what is known as a stego-object. Then we have the stego-key, such as the piece of card in the prisoner example, which is used to retrieve the message. Finally, we have the embedding function and the extraction function, which are the algorithms used to place the message into the cover object and to retrieve it again.
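To make those terms a little more concrete, here is a minimal sketch in Python of the prisoner's card with holes. The function names `embed` and `extract`, the sample page, and the hole positions are all made up for illustration. The sketch also cheats by overwriting letters of the cover text, whereas a real sender would compose the page around the hidden letters so that it still reads naturally.

```python
from typing import Sequence


def embed(cover: str, message: str, key: Sequence[int]) -> str:
    """Embedding function: place each message letter under one hole of the card.

    cover   -- the cover object (the text of the bookmarked page)
    message -- the secret to hide
    key     -- the stego-key (character positions of the holes in the card)
    """
    if len(message) > len(key):
        raise ValueError("the card has too few holes for this message")
    chars = list(cover)
    for position, letter in zip(key, message):
        chars[position] = letter
    return "".join(chars)  # this is the stego-object


def extract(stego: str, key: Sequence[int]) -> str:
    """Extraction function: read the letters visible through the holes."""
    return "".join(stego[position] for position in key)


cover = "the weather here is lovely today"  # hypothetical page text
key = [2, 9, 17]                            # hypothetical hole positions
stego = embed(cover, "run", key)
print(extract(stego, key))  # run
```

Without the key, the warden only sees a page of text. With it, the three letters drop straight out.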

Actually, there is more to steganography than just hiding messages in books to communicate behind the warden's back. It can be used for legitimate purposes as well. For instance, we can hide patient data inside an ECG so that when it is sent from one place to another, all the information pertaining to the patient travels with it. Better still, placing the data into the ECG isn't going to affect the ECG all that much, if at all.

Now, as I mentioned, there is more to hiding information than placing coded messages inside the Mona Lisa. For instance, you can hide numbers inside numbers, and you can do it in such a way that it doesn't even appear that there is any message inside the number. Take this example: Betty wants to hide the number 5 (binary 101) inside the number 14526. So, we convert the cover number into 16-bit binary:

14526 = 0011100010111110

Now, we select three bit positions, counted from the left starting at 1, to indicate where the three bits of the message will go. In this case they will be 6, 11, and 15. So, the number becomes:

0011100010111110 becomes 0011110010011110 = 15518.

However, let us be a little sneakier and make the positions 5, 10, and 15:

0011100010111110 becomes 0011100010111110 = 14526.

Positions 5, 10, and 15 already hold 1, 0, and 1, so nothing changes at all. As you can see, by selecting specific bits we can hide one number inside another without anything suggesting that something is hidden there. The key, whether 6, 11, 15 or 5, 10, 15, is passed over a secure channel or exchanged beforehand.
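The bit juggling above can be checked with a few lines of Python. This is only a sketch: the names `embed_bits` and `extract_bits` are mine, positions are counted from the left starting at 1, the cover value is the binary string from the example, 0011100010111110 (14526 in decimal), and the hidden bits are 101.

```python
from typing import Sequence


def embed_bits(cover: int, secret_bits: str, key: Sequence[int], width: int = 16) -> int:
    """Overwrite the bits of `cover` at the key positions with `secret_bits`."""
    bits = list(format(cover, f"0{width}b"))  # e.g. '0011100010111110'
    for position, bit in zip(key, secret_bits):
        bits[position - 1] = bit              # positions are 1-based, from the left
    return int("".join(bits), 2)


def extract_bits(stego: int, key: Sequence[int], width: int = 16) -> str:
    """Read back the bits found at the key positions."""
    bits = format(stego, f"0{width}b")
    return "".join(bits[position - 1] for position in key)


print(embed_bits(14526, "101", [6, 11, 15]))  # 15518
print(embed_bits(14526, "101", [5, 10, 15]))  # 14526, the cover is unchanged
print(extract_bits(15518, [6, 11, 15]))       # 101
```

The second call shows why the choice of key matters: if the chosen positions already hold the message bits, the stego-object is identical to the cover object.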

Actually, you don't have to stick with one number; you can spread the message across multiple numbers, as such:

So, as you can see, we can hide numbers inside numbers, and we can do it in a way that makes it far from obvious that anything is hidden. In fact, if we only touch the least significant bit, each value changes by at most 1, which is small enough to go unnoticed. It is this that allows us to hide information inside things like ECG scans. To make it even more secure, only certain parts of the ECG are selected to hide the information:
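As a rough sketch of that idea in Python, the snippet below hides one bit in the least significant bit of each selected sample. The sample values are invented numbers standing in for ECG readings, and `embed_lsb`, `extract_lsb`, and the key are names I have made up for the illustration. It is not how any particular ECG system does it.

```python
from typing import List, Sequence


def embed_lsb(samples: Sequence[int], message_bits: Sequence[int], key: Sequence[int]) -> List[int]:
    """Store one message bit in the least significant bit of each selected sample."""
    if len(message_bits) > len(key):
        raise ValueError("not enough selected samples for this message")
    stego = list(samples)
    for index, bit in zip(key, message_bits):
        stego[index] = (stego[index] & ~1) | bit  # clear the lowest bit, then set it
    return stego


def extract_lsb(samples: Sequence[int], key: Sequence[int]) -> List[int]:
    """Read the least significant bit of each selected sample."""
    return [samples[index] & 1 for index in key]


readings = [512, 530, 601, 745, 690, 560, 515, 508]  # hypothetical sample values
key = [1, 4, 6]                                      # which samples carry the message
stego = embed_lsb(readings, [1, 0, 1], key)
print(stego)                    # [512, 531, 601, 745, 690, 560, 515, 508]
print(extract_lsb(stego, key))  # [1, 0, 1]
```

Only one reading moved, and only by 1, which is the whole point of using the least significant bit.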

Now that we have seen how numbers can be hidden in numbers, and how that applies to ECGs, we can move on to the smiling Mona Lisa and ask how we can hide a message inside that smile. Well, first of all you need a digital image of it, namely because I'm not entirely sure the French will be all that happy if you attempt to hide a message inside the real Mona Lisa.

On a digital image, colours are made up of the three primary colours: red, green, and blue. Each of them has an intensity, so we have three values. Say each of these values is represented by a four-bit number, written as a single hexadecimal digit; a colour is then a value between 000 and fff (which represents 15,15,15). 000 is black and fff is white. The first digit is red, the second digit is green, and the third digit is blue.

We then convert these numbers into binary, so we have 0000,0000,0000. Now, like the ECG above, we basically change the least significant bit of the values, and all of a sudden we have a message hidden in the colours.
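Here is a small Python sketch of that step, using the simplified three-digit colours described above. The function names are my own, and the colour 0f0 (pure green) is just a convenient example.

```python
def hide_in_colour(colour: str, bits: str) -> str:
    """Hide three bits in a colour like '0f0', one in the lowest bit of each channel."""
    if len(colour) != 3 or len(bits) != 3:
        raise ValueError("expected a three-digit colour and three bits")
    channels = [int(digit, 16) for digit in colour]  # red, green, blue as 0..15
    stego = [(value & 0b1110) | int(bit) for value, bit in zip(channels, bits)]
    return "".join(format(value, "x") for value in stego)


def read_from_colour(colour: str) -> str:
    """Recover the three hidden bits from the lowest bit of each channel."""
    return "".join(str(int(digit, 16) & 1) for digit in colour)


print(hide_in_colour("0f0", "101"))  # 1e1
print(read_from_colour("1e1"))       # 101
```

Each channel moves by at most one step out of sixteen here. With the eight bits per channel that most image formats use, the step would be one in 256, which is much harder to spot.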

Just to give you a better example, have a look at the two words below.


Both of them appear green, don't they? Well, as it turns out, one of them is off by one single bit - can you tell which one it is? No? Well, that is how data is hidden inside images, or colours (though since I am writing this on Blogger, actually coding the colours in the HTML editor is rather painful).

Hopefully that image also explains a few things.

And so it is just a short hop to hiding information inside webpages, particularly when you are hiding it in the colours, as I was doing above. However, the key is to be able to do it in a robust way. While it might be easy to hide the information, extracting it can sometimes be an absolute pain. For instance, I wrote a program in JavaScript that placed an image onto an HTML canvas and then manipulated the colours to hide a message. Well, that worked okay, except that when we downloaded the image and then uploaded it back to the canvas to retrieve the message, it turned out that the HTML canvas has a complete mind of its own and will simply set the colours as it sees fit. As such, getting the values back is no small task (a task that I am still trying to figure out).
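I can't say for certain what went wrong in my case, but one known way for a canvas to quietly change pixel values is premultiplied alpha: the colour of a partly transparent pixel may be stored already multiplied by its opacity, and converting back rounds the result. The Python sketch below only simulates that arithmetic. The function name and the rounding scheme are my assumptions, not the code any browser actually runs. Saving the image in a lossy format such as JPEG would scramble the low bits in a similar way.

```python
def premultiply_round_trip(channel: int, alpha: int) -> int:
    """Simulate storing a colour channel premultiplied by alpha, then reading it back.

    Both values are in the range 0..255, and alpha must not be 0.
    """
    stored = (channel * alpha + 127) // 255         # premultiply, rounded to nearest
    return (stored * 255 + alpha // 2) // alpha     # un-premultiply, rounded to nearest


print(premultiply_round_trip(200, 255))  # 200, a fully opaque pixel survives
print(premultiply_round_trip(200, 128))  # 199, the least significant bit has flipped
```

If something like this is happening, forcing every pixel to be fully opaque and saving in a lossless format such as PNG would be the first things I would try.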

Creative Commons License

Steganography - Data Hiding by David Alfred Sarkies is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License. This license only applies to the text and any image that is within the public domain. Any images or videos that are the subject of copyright are not covered by this license. Use of these images is for illustrative purposes only and is not intended to assert ownership. If you wish to use this work commercially, please feel free to contact me.
