| Icecast Installation and Management: A Guide to Open Source Audio Streaming | ||
|---|---|---|
| <<< Previous | Audio Fundamentals | Next >>> |
Aside from being familiar with the basic options available to the MP3 encoder, the typical user does not need to know how MP3 files are structured internally any more than he or she needs to know how JPEG images or Word documents are internally formatted or constructed. For the morbidly curious, however, here is an insider view of the MP3 file format.
As mentioned earlier, MP3 files are segmented into a multitude of frames, each containing a fraction of a second's worth of audio data. Each is ready at a moment's notice to be reconstructed by the decoder. Inserted at the beginning of every data frame is a "header frame" that stores 32 bits of metadata related to the coming data frame. Those familiar with basic TCP/IP packet construction may note the similarity between MP3 frames and IP packets, which likewise place a header in front of each payload. Figure 2-3 illustrates this breakdown.

Data describing the structural factors of that frame. This data is called the frame's "header".
As illustrated in Figure 2-4, the MP3 header begins with a "sync" block consisting of 11 bits, all set to 1. The sync block allows players to search for and "lock onto" the first available occurrence of a valid frame, which is useful in MP3 broadcasting, for moving around quickly from one part of a track to another, and for skipping ID3 or other data that may be living at the start of the file.

The MP3 frame header represented visually.
Figure 2-4 is based on a diagram produced by http://www.id3.org/.
However, it is not enough for a player to simply find the sync block in any binary file and assume that it is a valid MP3 file, since the same pattern of 11 bits could theoretically be found in any random binary file. Thus, it is also necessary for the decoder to check for the validity of other header data as well, or for multiple valid frames in a row. Table 2-1 lists the total 32 bits of header data that are spread over 13 header positions.
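
The validity check described above can be sketched in a few lines of Python. This is a minimal illustration, not a complete validator: the function names `looks_like_header` and `find_sync` are hypothetical, the bit positions assume the 11-bit sync layout used in this section, and the reserved values tested (version `01`, layer `00`, bitrate index `1111`, sampling rate index `11`) come from the commonly published MPEG audio header tables rather than from this text. A real player would also confirm that several valid frames follow one another.

```python
def looks_like_header(b: bytes) -> bool:
    """Return True if the first four bytes of b pass basic header sanity checks."""
    if len(b) < 4:
        return False
    # Position A: 11 sync bits, all set (0xFF plus the top 3 bits of byte 1).
    if b[0] != 0xFF or (b[1] & 0xE0) != 0xE0:
        return False
    version = (b[1] >> 3) & 0b11            # position B
    layer = (b[1] >> 1) & 0b11              # position C
    bitrate_index = (b[2] >> 4) & 0b1111    # position E
    sample_rate_index = (b[2] >> 2) & 0b11  # position F
    # Assumption: these values are reserved or invalid in the standard tables.
    if version == 0b01 or layer == 0b00:
        return False
    if bitrate_index == 0b1111 or sample_rate_index == 0b11:
        return False
    return True


def find_sync(data: bytes, start: int = 0) -> int:
    """Return the offset of the first plausible header at or after start, or -1."""
    for i in range(start, len(data) - 3):
        if looks_like_header(data[i:i + 4]):
            return i
    return -1
```

Rejecting reserved field values removes many false matches, but random data can still pass, which is why checking for consecutive frames remains worthwhile.
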
One of the original design goals of MP3 was that it would be suitable for broadcasting, so it is important that MP3 receivers be able to lock onto the signal at any point in the stream. This is one of the main reasons a header is placed before each data frame: a receiver tuning in at any point in the broadcast can search for sync data and start playing almost immediately. In theory, this also makes it possible to cut MPEG files into smaller pieces and play the pieces individually. Unfortunately, this does not work cleanly with Layer III files (MP3), because frames often depend on data contained in other frames. Thus, you can't just open any old MP3 file in your favorite audio editor for editing or tweaking.
Table 2-1. Characteristics of the Thirteen Header Fields
| Position | Purpose | Length (bits) |
|---|---|---|
| A | Frame sync | 11 |
| B | MPEG audio version (MPEG-1, 2, etc.) | 2 |
| C | MPEG layer (Layer I, II, III, or not defined) | 2 |
| D | Protection (if 0, a 16-bit checksum follows the header) | 1 |
| E | Bitrate index (lookup table used to specify bitrate for this MPEG version and layer) | 4 |
| F | Sampling rate frequency (44.1kHz, etc., determined by lookup table) | 2 |
| G | Padding bit (on or off, compensates for unfilled frames) | 1 |
| H | Private bit (on or off, allows for application-specific triggers) | 1 |
| I | Channel mode (stereo, joint stereo, dual channel, single channel) | 2 |
| J | Mode extension (used only with joint stereo, to conjoin channel data) | 2 |
| K | Copyright (on or off) | 1 |
| L | Original (off if copy of original, on if original) | 1 |
| M | Emphasis (respects emphasis bit in the original recording; now largely obsolete) | 2 |
There should be 32 header bits in all. Following the initial frame sync block is a two-bit version ID, specifying whether the frame has been encoded in MPEG-1 or MPEG-2. Two layer bits follow, determining whether the frame is Layer I, II, III, or not defined. If the protection bit is not set (0), a 16-bit checksum is inserted after the header, prior to the beginning of the audio data.
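
As a rough illustration, the thirteen fields can be unpacked from the four header bytes with shifts and masks. The sketch below assumes the fields are packed in the order A through M, most significant bit first, and that the layer field occupies two bits, which is what makes the widths total 32. The names `FrameHeader` and `parse_header` are hypothetical, and the values returned are raw bit patterns, not looked-up bitrates or frequencies.

```python
from typing import NamedTuple


class FrameHeader(NamedTuple):
    sync: int               # A, 11 bits
    version: int            # B, 2 bits
    layer: int              # C, 2 bits
    protection: int         # D, 1 bit
    bitrate_index: int      # E, 4 bits
    sample_rate_index: int  # F, 2 bits
    padding: int            # G, 1 bit
    private: int            # H, 1 bit
    channel_mode: int       # I, 2 bits
    mode_extension: int     # J, 2 bits
    copyright: int          # K, 1 bit
    original: int           # L, 1 bit
    emphasis: int           # M, 2 bits


def parse_header(raw: bytes) -> FrameHeader:
    """Split the first four bytes of raw into the thirteen header fields."""
    if len(raw) < 4:
        raise ValueError("a frame header is four bytes long")
    word = int.from_bytes(raw[:4], "big")
    return FrameHeader(
        sync=(word >> 21) & 0x7FF,
        version=(word >> 19) & 0x3,
        layer=(word >> 17) & 0x3,
        protection=(word >> 16) & 0x1,
        bitrate_index=(word >> 12) & 0xF,
        sample_rate_index=(word >> 10) & 0x3,
        padding=(word >> 9) & 0x1,
        private=(word >> 8) & 0x1,
        channel_mode=(word >> 6) & 0x3,
        mode_extension=(word >> 4) & 0x3,
        copyright=(word >> 3) & 0x1,
        original=(word >> 2) & 0x1,
        emphasis=word & 0x3,
    )
```

For example, `parse_header(bytes([0xFF, 0xFB, 0x90, 0x64]))` yields a sync value of `0x7FF`, a bitrate index of 9, and a channel mode of 1. Turning those indexes into actual bitrates and frequencies requires the lookup tables referenced at the end of this section.
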
The bitrate field, naturally, specifies the bitrate of the current frame (e.g., 128 kbps), and is followed by two bits stating the sampling frequency. This ranges from 16,000 Hz to 48,000 Hz, depending on whether MPEG-1 or MPEG-2 is in use. The padding bit lets the stream meet the bitrate requirement exactly on average, even when the ideal frame size is not a whole number of bytes. For example, a 128 kbps Layer II bitstream at 44.1 kHz calls for roughly 417.96 bytes per frame, so it ends up with some frames of 417 bytes and some of 418. The 418-byte frames have the padding bit set to "on" (1) to compensate for the discrepancy.
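
The 417 and 418 byte figures can be reproduced with the frame-length formula commonly given for Layer II and Layer III at MPEG-1 sampling rates: 144 times the bitrate, divided by the sampling rate, plus one byte when the padding bit is set. The formula is not stated in this text, and the function name `frame_length_bytes` is hypothetical, so treat the following as a sketch under those assumptions.

```python
def frame_length_bytes(bitrate_bps: int, sample_rate_hz: int, padding: int) -> int:
    """Length in bytes of one frame, header included.

    Assumes the usual Layer II / Layer III formula at MPEG-1 sampling rates.
    """
    return 144 * bitrate_bps // sample_rate_hz + padding


unpadded = frame_length_bytes(128_000, 44_100, padding=0)
padded = frame_length_bytes(128_000, 44_100, padding=1)
print(unpadded, padded)  # 417 418
```

Because 144 × 128,000 / 44,100 is about 417.96, most frames in such a stream carry the extra padding byte.
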
The mode field refers to the stereo/mono status of the frame, and allows for the setting of stereo, joint stereo, dual channel, and single channel (mono) encoding options. If joint stereo has been enabled, the mode extension field tells the decoder exactly how to handle it, i.e., whether high frequencies have been combined across channels.
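
A decoder might interpret these two fields along the following lines. The channel mode order matches the header table; the meaning of the two mode extension bits (the high bit for mid/side stereo, the low bit for intensity stereo) is the usual Layer III convention and is an assumption here, as is the hypothetical function name `describe_stereo`.

```python
CHANNEL_MODES = ("stereo", "joint stereo", "dual channel", "single channel")


def describe_stereo(channel_mode: int, mode_extension: int) -> str:
    """Describe the stereo setup encoded in header positions I and J."""
    name = CHANNEL_MODES[channel_mode & 0b11]
    if name != "joint stereo":
        # The mode extension is only meaningful for joint stereo frames.
        return name
    mid_side = "on" if mode_extension & 0b10 else "off"
    intensity = "on" if mode_extension & 0b01 else "off"
    return f"joint stereo (mid/side {mid_side}, intensity {intensity})"
```

For instance, `describe_stereo(1, 2)` returns `"joint stereo (mid/side on, intensity off)"`, while `describe_stereo(3, 3)` ignores the extension bits and returns `"single channel"`.
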
The copyright bit does not hold copyright information per se (obviously not, since it's only one bit long) but rather mimics a similar copyright bit used on CDs and DATs. If this bit is set, it's officially illegal to copy the track. Some audio ripping programs report this information if the copyright bit is found to be set. If the data is on its original media, the original bit (sometimes called the "home" bit) will be set. The "private" bit can be used by specific applications to trigger custom events.
The emphasis field is used as a flag, in case a corresponding emphasis setting was applied in the original recording. Emphasis is rarely used anymore, though some recordings do still use it.
Finally, the decoder moves on through the checksum (if it exists) and on to the actual audio data frame, and the process begins all over again, with thousands of frames per audio file.
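
The header-then-data cycle can be sketched as a loop that reads a header, computes the frame length, and jumps to the next header. To stay short, this example handles only MPEG-1 Layer III frames and skips anything else one byte at a time. The lookup tables are the commonly published ones for that case, the frame-length formula is the usual 144 × bitrate / sampling rate plus padding, and the name `count_frames` is hypothetical; none of these come from this text.

```python
# Assumed lookup tables for MPEG-1 Layer III only (None = free format or invalid).
BITRATES_KBPS = (None, 32, 40, 48, 56, 64, 80, 96, 112, 128,
                 160, 192, 224, 256, 320, None)
SAMPLE_RATES_HZ = (44100, 48000, 32000, None)


def count_frames(data: bytes) -> int:
    """Count MPEG-1 Layer III frames by hopping from header to header."""
    pos = 0
    frames = 0
    while pos + 4 <= len(data):
        b1, b2 = data[pos + 1], data[pos + 2]
        # Sync bits set, version 11 (MPEG-1), layer 01 (Layer III);
        # the protection bit is masked out because either value is allowed.
        if data[pos] != 0xFF or (b1 & 0xFE) != 0xFA:
            pos += 1
            continue
        bitrate = BITRATES_KBPS[b2 >> 4]
        sample_rate = SAMPLE_RATES_HZ[(b2 >> 2) & 0b11]
        if bitrate is None or sample_rate is None:
            pos += 1
            continue
        padding = (b2 >> 1) & 1
        # The frame length covers the header, the optional checksum, and the audio data.
        pos += 144 * bitrate * 1000 // sample_rate + padding
        frames += 1
    return frames
```

This sketch trusts every header it finds and counts a truncated final frame as a full one, so a robust tool would add the validity checks discussed earlier.
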
For more details on the structure of MP3 header frames, including the actual lookup tables necessary to derive certain details from the bit settings previously listed, see the Programmer's Corner section at http://www.mp3-tech.org/. Or if you want to request information directly, start at http://www.iso.ch/.