3MF Project: What’s In A GIF – Bit by Byte
We will start off by walking through the different parts of a GIF file. (The information on this page is primarily drawn from the W3C GIF89a specification.) A GIF file is made up of a number of different “blocks” of data. The following diagram shows all of the different types of blocks and where they belong in the file. The file starts at the left and works its way right. At each branch you may go one way or the other. The large “middle” section can be repeated as many times as needed. (Technically, it may also be omitted completely, but I can’t imagine what good a GIF file with no image data would be.)
I’ll show you what these blocks look like by walking through a sample GIF file. You can see the sample file and its corresponding bytes below.
47 49 46 38 39 61 0A 00 0A 00 91 00 00 FF FF FF FF 00 00 00 00 FF 00 00 00 21 F9 04 00 00 00 00 00 2C 00 00 00 00 0A 00 0A 00 00 02 16 8C 2D 99 87 2A 1C DC 33 A0 02 75 EC 95 FA A8 DE 60 8C 04 91 4C 01 00 3B
Note that not all blocks are represented in this sample file. I will provide samples of missing blocks where appropriate. The different types of blocks include: header, logical screen descriptor, global color table, graphics control extension, image descriptor, local color table, image data, plain text extension, application extension, comment extension, and trailer. Let’s get started with the first block!
From Sample File: 47 49 46 38 39 61
All GIF files must start with a header block. The header takes up the first six bytes of the file. These bytes should all correspond to ASCII character codes. We actually have two pieces of information here. The first three bytes are called the signature. These should always be “GIF” (i.e. 47=“G”, 49=“I”, 46=“F”). The next three specify the version of the specification that was used to encode the image. We’ll only be working with “89a” (i.e. 38=“8”, 39=“9”, 61=“a”). The only other recognized version string is “87a”, but I doubt most people will run into those anymore.
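As a quick illustration, here is a minimal Python sketch of reading the header. The function name parse_header and the choice to raise ValueError on bad input are my own assumptions for the example, not anything the spec prescribes.

```python
SAMPLE_HEADER = bytes.fromhex("47 49 46 38 39 61")

def parse_header(data: bytes) -> tuple[str, str]:
    """Split the six-byte header into (signature, version)."""
    if len(data) < 6:
        raise ValueError("a GIF header needs six bytes")
    signature = data[0:3].decode("ascii")  # should always be "GIF"
    version = data[3:6].decode("ascii")    # "89a" or the older "87a"
    if signature != "GIF":
        raise ValueError(f"unexpected signature {signature!r}")
    if version not in ("87a", "89a"):
        raise ValueError(f"unrecognized version {version!r}")
    return signature, version

print(parse_header(SAMPLE_HEADER))
```

Run against the sample bytes, this should report the signature “GIF” and the version “89a”.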
From Sample File: 0A 00 0A 00 91 00 00
The logical screen descriptor always immediately follows the header. This block tells the decoder how much room this image will take up. It is exactly seven bytes long. It starts with the canvas width. This value can be found in the first two bytes. It’s saved in a format the spec simply calls unsigned. Basically we’re looking at a 16-bit, nonnegative integer (0-65,535). As with all the other multi-byte values in the GIF format, the least significant byte is stored first (little-endian format). This means that where we read 0A 00 from the byte stream, we would normally write it as 000A, which is the same as 10. Thus the width of our sample image is 10 pixels. As a further example, 255 would be stored as FF 00 but 256 would be 00 01. As you might expect, the canvas height follows. Again, in this sample we can see this value is 0A 00, which is 10.
Next we have a packed byte. That means that this byte actually has multiple values stored in its bits. In this case, the byte 91 can be represented as the binary number 10010001. (The built-in Windows calculator is actually very useful when converting numbers into hexadecimal and binary formats. Be sure it’s in “scientific” or “programmer” mode, depending on the version of Windows you have.) The first (most significant) bit is the global color table flag. If it’s 0, then there is none. If it’s 1, then a global color table will follow. In our sample image, we can see that we will have a global color table (as will usually be the case). The next three bits represent the color resolution. The spec says this value “is the number of bits per primary color available to the original image, minus 1” and “...represents the size of the entire palette from which the colors in the graphic were selected.” Because I don’t know much about what this one does, I’ll point you to a more knowledgeable article on bit and color depth. For now 1 seems to work. Note that 001 represents 2 bits/pixel; 111 would represent 8 bits/pixel. The next single bit is the sort flag. If the value is 1, then the colors in the global color table are sorted in order of “decreasing importance,” which typically means “decreasing frequency” in the image. This can help the image decoder but is not required. Our value has been left at 0. The last three bits are the size of global color table. Well, that’s a lie; it’s not the actual size of the table. If this value is N, then the actual table size is 2^(N+1). From our sample file, we get the three bits 001, which is the binary version of 1. Our actual table size would be 2^(1+1) = 2^2 = 4. (We’ve mentioned the global color table several times with this byte; we will be talking about what it is in the next section.)
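The little-endian rule is easy to check in code. This is a small sketch; read_u16 is a helper name I made up for the example.

```python
def read_u16(data: bytes, offset: int = 0) -> int:
    """Read a two-byte unsigned value, least significant byte first."""
    low = data[offset]
    high = data[offset + 1]
    return low | (high << 8)

for stored in ("0A 00", "FF 00", "00 01"):
    raw = bytes.fromhex(stored)
    # int.from_bytes is the built-in way to do the same conversion.
    assert read_u16(raw) == int.from_bytes(raw, "little")
    print(stored, "->", read_u16(raw))
```

The three inputs are the same ones used in the text above: 0A 00, FF 00 and 00 01.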
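If you would rather not reach for a calculator, the packed byte can be pulled apart with shifts and masks. The sketch below assumes only the bit layout described above; the function and key names are mine.

```python
def unpack_screen_flags(packed: int) -> dict:
    """Split the logical screen descriptor's packed byte into its four fields."""
    return {
        "global_color_table_flag": (packed >> 7) & 0b1,   # bit 1 (most significant)
        "color_resolution": (packed >> 4) & 0b111,        # next three bits
        "sort_flag": (packed >> 3) & 0b1,                 # next single bit
        "global_color_table_size": packed & 0b111,        # last three bits (N)
    }

fields = unpack_screen_flags(0x91)  # 0x91 is 10010001 in binary
table_entries = 2 ** (fields["global_color_table_size"] + 1)
print(fields, table_entries)
```

For the sample byte 91 this gives a flag of 1, a color resolution of 1, a sort flag of 0 and a size value of 1, which works out to a four-color table.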
The next byte gives us the background color index. This byte is only meaningful if the global color table flag is 1. It represents which color in the global color table (by specifying its index) should be used for pixels whose value is not specified in the image data. If, by some chance, there is no global color table, this byte should be 0.
The last byte of the logical screen descriptor is the pixel aspect ratio. I’m not exactly sure what this value does. Most of the images I’ve seen have this value set to 0. The spec says that if a nonzero value N is stored in this byte, the actual ratio used is (N + 15) / 64.
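Putting the seven bytes together, here is a hedged sketch of reading the whole logical screen descriptor with Python’s struct module. The dictionary keys are names I picked, and returning None when the aspect ratio byte is 0 is my own convention for “no ratio given.”

```python
import struct

SAMPLE_LSD = bytes.fromhex("0A 00 0A 00 91 00 00")

def parse_logical_screen_descriptor(data: bytes) -> dict:
    """Read the seven-byte block: two unsigned 16-bit values, then three bytes."""
    # "<" means little-endian, "H" is a two-byte unsigned, "B" is one byte.
    width, height, packed, background_index, aspect_byte = struct.unpack(
        "<HHBBB", data[:7]
    )
    # The ratio formula only applies when the stored byte is nonzero.
    aspect_ratio = (aspect_byte + 15) / 64 if aspect_byte != 0 else None
    return {
        "canvas_width": width,
        "canvas_height": height,
        "packed": packed,
        "background_color_index": background_index,
        "pixel_aspect_ratio": aspect_ratio,
    }

print(parse_logical_screen_descriptor(SAMPLE_LSD))
```

As a made-up example of the ratio formula, a stored byte of 49 would give (49 + 15) / 64 = 1.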
From Sample File: FF FF FF FF 00 00 00 00 FF 00 00 00
We’ve mentioned the global color table a few times already; now let’s talk about what it actually is. As you are probably already aware, each GIF has its own color palette. That is, it has a list of all the colors that can be in the image and cannot contain colors that are not in that list. The global color table is where that list of colors is stored. Each color is stored in three bytes. Each of the bytes represents an RGB color value. The first byte is the value for red (0-255), next green, then blue. The size of the global color table is determined by the value in the packed byte of the logical screen descriptor. As we mentioned before, if the value from that byte is N, then the actual number of colors stored is 2^(N+1). This means that the global color table will take up 3*2^(N+1) bytes in the stream.
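| Size in logical screen descriptor (N) | Number of colors | Byte length |
| 0 | 2 | 6 |
| 1 | 4 | 12 |
| 2 | 8 | 24 |
| 3 | 16 | 48 |
| 4 | 32 | 96 |
| 5 | 64 | 192 |
| 6 | 128 | 384 |
| 7 | 256 | 768 |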
Our sample file has a global color table size of 1. This means it holds 2^(1+1)=2^2=4 colors. We can see that it takes up 12 (3*4) bytes as expected. We read the bytes three at a time to get each of the colors. The first color is #FFFFFF (white). This value is given an index of 0. The second color is #FF0000 (red). The color with an index value of 2 is #0000FF (blue). The last color is #000000 (black). The index numbers will be important when we decode the actual image data.
Note that this block is labeled as “optional.” Not every GIF has to specify a global color table. However, if the global color table flag is set to 1 in the logical screen descriptor block, the color table is then required to immediately follow that block.
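Reading the table three bytes at a time can be sketched like this. The function name and the choice to return hex strings such as #FF0000 are assumptions made for the example.

```python
SAMPLE_TABLE = bytes.fromhex("FF FF FF FF 00 00 00 00 FF 00 00 00")

def parse_color_table(data: bytes, size_field: int) -> list[str]:
    """Return the colors as '#RRGGBB' strings; the list position is the color index."""
    count = 2 ** (size_field + 1)   # number of colors
    needed = 3 * count              # three bytes (R, G, B) per color
    if len(data) < needed:
        raise ValueError(f"expected {needed} bytes, got {len(data)}")
    return [
        "#%02X%02X%02X" % tuple(data[i:i + 3])
        for i in range(0, needed, 3)
    ]

for index, color in enumerate(parse_color_table(SAMPLE_TABLE, 1)):
    print(index, color)
```

With the sample bytes and a size value of 1, the four entries come out in the same order as listed above: white, red, blue, black.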
From Sample File: 21 F9 04 00 00 00 00 00
Graphic control extension blocks are used frequently to specify transparency settings and control animations. They are completely optional. Since transparency and animations are a bit complicated, I will hold off on many of the details of this block until a later section (see Transparency and Animation). In the interest of this page being complete, I will at least tell you what the bytes represent.
The first byte is the extension introducer. All extension blocks begin with 21. Next is the graphic control label, F9, which is the value that says this is a graphic control extension. Third up is the block size, 04, which counts the bytes that follow before the block terminator. Next is a packed field. Its first three (most significant) bits are reserved for future use. The next three bits indicate the disposal method. The second-to-last bit is the user input flag and the last bit is the transparent color flag. The delay time value follows in the next two bytes, stored in the unsigned format. After that we have the transparent color index byte. Finally we have the block terminator, which is always 00.
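Here is a rough sketch of reading those eight bytes. The function and key names are my own, and the second set of bytes in the usage note is a made-up example rather than anything from the sample file.

```python
import struct

SAMPLE_GCE = bytes.fromhex("21 F9 04 00 00 00 00 00")

def parse_graphic_control_extension(data: bytes) -> dict:
    """Decode the eight-byte graphic control extension block."""
    (introducer, label, block_size, packed,
     delay_time, transparent_index, terminator) = struct.unpack("<BBBBHBB", data[:8])
    if introducer != 0x21 or label != 0xF9:
        raise ValueError("not a graphic control extension")
    if block_size != 4 or terminator != 0:
        raise ValueError("malformed graphic control extension")
    return {
        "disposal_method": (packed >> 2) & 0b111,   # three bits after the reserved bits
        "user_input_flag": (packed >> 1) & 0b1,     # second-to-last bit
        "transparent_color_flag": packed & 0b1,     # last bit
        "delay_time": delay_time,                   # two-byte unsigned value
        "transparent_color_index": transparent_index,
    }

print(parse_graphic_control_extension(SAMPLE_GCE))
```

Every field in the sample block is zero. For a hypothetical block such as 21 F9 04 09 0A 00 02 00, the same function would report a disposal method of 2, the transparent color flag set, a delay time of 10 and a transparent color index of 2.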
From Sample File: 2C 00 00 00 00 0A 00 0A 00 00
A single GIF file may contain multiple images (useful when creating animated images). Each image begins with an image descriptor block. This block is exactly 10 bytes long.
The first byte is the image separator. Every image descriptor begins with the value 2C. The next 8 bytes represent the location and size of the following image. An image in the stream may not necessarily take up the entire canvas size defined by the logical screen descriptor. Therefore, the image descriptor specifies the image left position and image top position of where the image should begin on the canvas. Next it specifies the image width and image height. Each of these values is in the two-byte, unsigned format. Our sample image indicates that the image starts at (0,0) and is 10 pixels wide by 10 pixels tall. (This image does take up the whole canvas size.)
The last byte is another packed field. In our sample file this byte is 0, so all of the sub-values will be zero. The first (most significant) bit in the byte is the local color table flag. Setting this flag to 1 allows you to specify that the image data that follows uses a different color table than the global color table. (More information on the local color table follows.) The second bit is the interlace flag. Per the GIF89a spec, the third bit is the sort flag, the next two bits are reserved, and the last three bits give the size of the local color table.
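The ten bytes of the image descriptor can be read in one go. In this sketch the names are mine, and the positions of the sort flag and the local color table size within the packed byte are taken from the GIF89a spec.

```python
import struct

SAMPLE_DESCRIPTOR = bytes.fromhex("2C 00 00 00 00 0A 00 0A 00 00")

def parse_image_descriptor(data: bytes) -> dict:
    """Decode the ten-byte image descriptor block."""
    separator, left, top, width, height, packed = struct.unpack("<BHHHHB", data[:10])
    if separator != 0x2C:
        raise ValueError("image descriptors must begin with 2C")
    return {
        "left": left,
        "top": top,
        "width": width,
        "height": height,
        "local_color_table_flag": (packed >> 7) & 0b1,  # first (most significant) bit
        "interlace_flag": (packed >> 6) & 0b1,          # second bit
        "sort_flag": (packed >> 5) & 0b1,               # third bit, per the spec
        "local_color_table_size": packed & 0b111,       # last three bits, per the spec
    }

print(parse_image_descriptor(SAMPLE_DESCRIPTOR))
```

For the sample this reports an image at (0,0) that is 10 by 10 pixels, with every flag set to 0.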
The local color table looks identical to the global color table. The local color table would always immediately follow an image descriptor but will only be there if the local color table flag is set to 1. It is effective only for the block of image data that immediately follows it. If no local color table is specified, the global color table is used for the following image data.
The size of the local color table can be calculated by the value given in the image descriptor. Just like with the global color table, if the image descriptor specifies a size of N, the color table will contain 2^(N+1) colors and will take up 3*2^(N+1) bytes. The colors are specified in RGB value triplets.
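The size arithmetic and the “local wins over global” rule can be sketched in a few lines. Both function names are invented for the example, and using None to mean “no local color table” is my own convention.

```python
def color_table_byte_length(size_field: int) -> int:
    """Bytes taken up by a color table whose size field is N: 3 * 2^(N+1)."""
    return 3 * 2 ** (size_field + 1)

def active_color_table(global_table, local_table):
    """Pick the table to use for the image data that follows.

    A local table, when present, applies only to the next image;
    otherwise the global table is used.
    """
    return local_table if local_table is not None else global_table

print(color_table_byte_length(0), color_table_byte_length(7))
```

The smallest table (N = 0) holds 2 colors in 6 bytes, and the largest (N = 7) holds 256 colors in 768 bytes.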
From Sample File: 02 16 8C 2D 99 87 2A 1C DC 33 A0 02 75 EC 95 FA A8 DE 60 8C 04 91 4C 01 00
Finally we get to the actual image data. The image data is composed of a series of output codes which tell the decoder which colors to spit out to the canvas. These codes are combined into the bytes that make up the block. I’ve set aside a whole other section on decoding these output codes into an image (see LZW Image Data). On this page I’m just going to tell you how to determine how long the block will be.
The first byte of this block is the LZW minimum code size. This value is used to decode the compressed output codes. (Again, see the section on LZW compression to see how this works.) The rest of the bytes represent data sub-blocks. Data sub-blocks are groups of 1 – 256 bytes. The first byte in the sub-block tells you how many bytes of actual data follow. This can be a value from 0 (00) to 255 (FF). After you’ve read those bytes, the next byte you read will tell you how many more bytes of data follow that one. You continue to read until you reach a sub-block that says that zero bytes follow.
You can see our sample file has an LZW minimum code size of 2. The next byte tells us that 22 bytes of data follow it (16 hex = 22). After we’ve read those 22 bytes, we see the next value is 0. This means that no bytes follow and we have read all the data in this block.
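The sub-block loop described above might look like the following sketch. The helper name read_sub_blocks and its return shape (the joined data plus the offset just past the terminator) are choices I made for illustration.

```python
SAMPLE_IMAGE_DATA = bytes.fromhex(
    "02 16 8C 2D 99 87 2A 1C DC 33 A0 02 75 EC 95 FA A8 DE 60 8C 04 91 4C 01 00"
)

def read_sub_blocks(data: bytes, offset: int) -> tuple[bytes, int]:
    """Collect data sub-blocks starting at offset.

    Returns the concatenated data and the offset just past the
    zero-length sub-block that ends the chain.
    """
    chunks = []
    while True:
        if offset >= len(data):
            raise ValueError("ran out of bytes before the terminating sub-block")
        size = data[offset]          # how many data bytes follow
        offset += 1
        if size == 0:                # a size of zero ends the block
            break
        chunk = data[offset:offset + size]
        if len(chunk) < size:
            raise ValueError("sub-block is truncated")
        chunks.append(chunk)
        offset += size
    return b"".join(chunks), offset

min_code_size = SAMPLE_IMAGE_DATA[0]
payload, end = read_sub_blocks(SAMPLE_IMAGE_DATA, 1)
print(min_code_size, len(payload), end)
```

For the sample block this finds a minimum code size of 2 and a single sub-block of 22 data bytes, and it stops right at the end of the 25-byte block.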
Example (Not in Sample File): 21 01 0C 00 00 00 00 64 00 64 00 14 14 01 00 0B 68 65 6C 6C 6F 20 77 6F 72 6C 64 00
Oddly enough, the spec allows you to specify text which you wish to have rendered on the image. I followed the spec to see if any application would understand this command, but IE, Firefox, and Photoshop all failed to render the text. Rather than explaining all the bytes, I’ll tell you how to recognize this block and skip over it.
The block begins with an extension introducer as all extension block types do. This value is always 21. The next byte is the plain text label. This value of 01 is used to distinguish plain text extensions from all other extensions. The next byte is the block size. This tells you how many bytes there are until the actual text data begins, or in other words, how many bytes you can now skip. The byte value will probably be 0C which means you should jump down 12 bytes. The text that follows is encoded in data sub-blocks (see Image Data to see how these sub-blocks are formed). The block ends when you reach a sub-block of length 0.
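Recognizing and skipping this block can be sketched as follows. The function name is hypothetical, and returning the text bytes alongside the new offset is just a convenience so you can see what was skipped.

```python
PLAIN_TEXT_EXAMPLE = bytes.fromhex(
    "21 01 0C 00 00 00 00 64 00 64 00 14 14 01 00"
    " 0B 68 65 6C 6C 6F 20 77 6F 72 6C 64 00"
)

def skip_plain_text_extension(data: bytes, offset: int = 0) -> tuple[bytes, int]:
    """Step over a plain text extension; return (text bytes, offset after the block)."""
    if data[offset] != 0x21 or data[offset + 1] != 0x01:
        raise ValueError("not a plain text extension")
    header_size = data[offset + 2]      # usually 0C, meaning 12 bytes to skip
    pos = offset + 3 + header_size
    text = bytearray()
    while data[pos] != 0:               # a zero-length sub-block ends the block
        size = data[pos]
        text += data[pos + 1:pos + 1 + size]
        pos += 1 + size
    return bytes(text), pos + 1

text, next_offset = skip_plain_text_extension(PLAIN_TEXT_EXAMPLE)
print(text, next_offset)
```

In the example bytes the skipped text turns out to be “hello world,” and the offset lands at the end of the 28-byte block.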
Example (Not in Sample File): 21 FF 0B 4E 45 54 53 43 41 50 45 32 2E 30 03 01 05 00 00
The spec allows for application-specific information to be embedded in the GIF file itself. The only reference I could find to application extensions was the NETSCAPE2.0 extension, which is used to loop an animated GIF file. I’ll go into more detail on looping when we talk about animation.
Like with all extensions, we start with 21, which is the extension introducer. Next is the extension label, which for application extensions is FF. The next value is the block size, which tells you how many bytes there are before the actual application data begins. This byte value should be 0B, which indicates 11 bytes. These 11 bytes hold two pieces of information. First is the application identifier, which takes up the first 8 bytes. These bytes should contain ASCII character codes that identify to which application the extension belongs. In the case of the example above, the application identifier is “NETSCAPE”, which is conveniently 8 characters long. The next three bytes are the application authentication code. The spec says these bytes can be used to “authenticate the application identifier.” With the NETSCAPE2.0 extension, this value is simply a version number, “2.0”, hence the extension’s name. What follows is the application data broken into data sub-blocks. Like with the other extensions, the block terminates when you read a sub-block that has zero bytes of data.
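A hedged sketch of splitting an application extension into its parts is below. The function name and the returned tuple are my own choices, and the application data is returned as raw bytes without trying to interpret it.

```python
APPLICATION_EXAMPLE = bytes.fromhex(
    "21 FF 0B 4E 45 54 53 43 41 50 45 32 2E 30 03 01 05 00 00"
)

def parse_application_extension(data: bytes, offset: int = 0):
    """Return (identifier, authentication code, application data, next offset)."""
    if data[offset] != 0x21 or data[offset + 1] != 0xFF:
        raise ValueError("not an application extension")
    block_size = data[offset + 2]
    if block_size != 0x0B:
        raise ValueError("expected a block size of 0B (11 bytes)")
    start = offset + 3
    identifier = data[start:start + 8].decode("ascii")        # first 8 bytes
    auth_code = data[start + 8:start + 11].decode("ascii")    # next 3 bytes
    pos = start + 11
    app_data = bytearray()
    while data[pos] != 0:               # zero-length sub-block terminates
        size = data[pos]
        app_data += data[pos + 1:pos + 1 + size]
        pos += 1 + size
    return identifier, auth_code, bytes(app_data), pos + 1

print(parse_application_extension(APPLICATION_EXAMPLE))
```

For the example bytes this yields the identifier “NETSCAPE”, the code “2.0”, and three bytes of application data (01 05 00).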
Example (Not in Sample File): 21 FE 09 62 6C 75 65 62 65 72 72 79 00
One last extension type is the comment extension. Yes, you can actually embed comments within a GIF file. Why you would want to increase the file size with unprintable data, I’m not sure. Perhaps it would be a fun way to pass secret messages.
It’s probably no surprise by now that the first byte is the extension introducer, which is 21. The next byte is always FE, which is the comment label. Then we jump right to data sub-blocks containing ASCII character codes for your comment. As you can see from the example, we have one data sub-block that is 9 bytes long. If you translate the character codes, you see that the comment is “blueberry.” The final byte, 00, indicates a sub-block with zero bytes that follow, which lets us know we have reached the end of the block.
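Pulling the comment back out is a short exercise. This is only a sketch; read_comment_extension is a name I made up.

```python
COMMENT_EXAMPLE = bytes.fromhex("21 FE 09 62 6C 75 65 62 65 72 72 79 00")

def read_comment_extension(data: bytes, offset: int = 0) -> tuple[str, int]:
    """Return the comment text and the offset just past the block."""
    if data[offset] != 0x21 or data[offset + 1] != 0xFE:
        raise ValueError("not a comment extension")
    pos = offset + 2                    # sub-blocks start right after the label
    comment = bytearray()
    while data[pos] != 0:
        size = data[pos]
        comment += data[pos + 1:pos + 1 + size]
        pos += 1 + size
    return comment.decode("ascii"), pos + 1

print(read_comment_extension(COMMENT_EXAMPLE))
```

The example bytes decode to the same “blueberry” comment described above.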
From sample file: 3B
The trailer block indicates when you’ve hit the end of the file. It is always a byte with a value of 3B.
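To tie the blocks together, here is a rough sketch that walks the sample file from header to trailer and lists each block with its starting offset and length. It is not a full decoder, all of the names are mine, and it assumes a well-formed file (a truncated one would raise an IndexError). It leans on the fact that everything after an extension’s introducer and label is laid out as size-prefixed sub-blocks, so one helper can skip any extension.

```python
SAMPLE = bytes.fromhex(
    "47 49 46 38 39 61 0A 00 0A 00 91 00 00"
    " FF FF FF FF 00 00 00 00 FF 00 00 00"
    " 21 F9 04 00 00 00 00 00"
    " 2C 00 00 00 00 0A 00 0A 00 00"
    " 02 16 8C 2D 99 87 2A 1C DC 33 A0 02 75 EC 95 FA A8 DE 60 8C 04 91 4C 01 00"
    " 3B"
)

EXTENSION_NAMES = {
    0xF9: "graphic control extension",
    0x01: "plain text extension",
    0xFF: "application extension",
    0xFE: "comment extension",
}

def skip_sub_blocks(data: bytes, pos: int) -> int:
    """Return the offset just past the zero-length sub-block."""
    while data[pos] != 0:
        pos += 1 + data[pos]
    return pos + 1

def color_table_length(packed: int) -> int:
    """3 * 2^(N+1), where N is the last three bits of the packed byte."""
    return 3 * 2 ** ((packed & 0x07) + 1)

def list_blocks(data: bytes) -> list:
    """List (name, offset, length) for every block in the file."""
    blocks = [("header", 0, 6), ("logical screen descriptor", 6, 7)]
    pos = 13
    if data[10] & 0x80:                       # global color table flag
        length = color_table_length(data[10])
        blocks.append(("global color table", pos, length))
        pos += length
    while True:
        marker = data[pos]
        if marker == 0x3B:                    # trailer
            blocks.append(("trailer", pos, 1))
            return blocks
        if marker == 0x21:                    # extension introducer
            name = EXTENSION_NAMES.get(data[pos + 1], "unknown extension")
            end = skip_sub_blocks(data, pos + 2)
            blocks.append((name, pos, end - pos))
            pos = end
        elif marker == 0x2C:                  # image separator
            blocks.append(("image descriptor", pos, 10))
            packed = data[pos + 9]
            pos += 10
            if packed & 0x80:                 # local color table flag
                length = color_table_length(packed)
                blocks.append(("local color table", pos, length))
                pos += length
            end = skip_sub_blocks(data, pos + 1)  # +1 skips the LZW minimum code size
            blocks.append(("image data", pos, end - pos))
            pos = end
        else:
            raise ValueError(f"unexpected byte {marker:02X} at offset {pos}")

for name, offset, length in list_blocks(SAMPLE):
    print(f"{offset:3d}  {length:3d}  {name}")
```

On the sample file this lists seven blocks: the header, logical screen descriptor, global color table, graphic control extension, image descriptor, image data and trailer.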
Next: LZW Image Data
Now that you know what the basic parts of a GIF file are, let’s next focus our attention on how the actual image data is stored and compressed.