Wave file format

The wave file format is a widely supported format for storing digital audio. A wave file uses the Resource Interchange File Format (RIFF) file structure and hence data is organized in chunks as described below. Each chunk contains information about its type and size and can easily be skipped by software that does not understand the specific chunk type.

A wave file is organized as follows.

Byte sequence description   Length in bytes   Starts at byte   Value
chunk ID                    4                 0x00             The ASCII character string "RIFF"
size                        4                 0x04             The size of the wave file (number of bytes) less 8 (less the size of the "chunk ID" and the "size")
RIFF type ID                4                 0x08             The ASCII character string "WAVE"
wave chunks                 various           0x0C             Various chunks in the wave file as described below
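
As an illustration of the layout above, the following is a minimal Python sketch that decodes the 12-byte header. The function name parse_riff_header and the 44-byte example file are assumptions made for this example only; they are not part of the format.

```python
import struct

def parse_riff_header(data: bytes) -> dict:
    """Decode the 12-byte RIFF header at the start of a wave file."""
    if len(data) < 12:
        raise ValueError("need at least 12 bytes for the RIFF header")
    # "<" = little-endian, "4s" = 4-byte string, "I" = unsigned 32-bit integer
    chunk_id, size, riff_type = struct.unpack("<4sI4s", data[:12])
    if chunk_id != b"RIFF" or riff_type != b"WAVE":
        raise ValueError("not a RIFF/WAVE file")
    return {
        "chunk_id": chunk_id,
        "size": size,              # value stored at byte 0x04
        "riff_type": riff_type,
        "file_size": size + 8,     # add back the "chunk ID" and "size" fields
    }

# Hypothetical 44-byte file: the size field holds 44 - 8 = 36.
header = b"RIFF" + struct.pack("<I", 36) + b"WAVE"
info = parse_riff_header(header)
print(info["file_size"])
```

The sketch only checks the two ASCII identifiers; it does not verify that the stored size matches the real length of the file.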

Endianism

All information is stored with the least significant byte first (little-endian). For example, if the 4-byte value for the average bytes per second in the format chunk is 88,200 = 0x00015888, this information will be stored with the following byte sequence.

0x88 0x58 0x01 0x00
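
The byte order above can be checked with a short Python sketch using the standard struct module. The variable names are chosen for this example only.

```python
import struct

avg_bytes_per_sec = 88200  # 0x00015888

# "<I" packs an unsigned 32-bit integer, least significant byte first.
encoded = struct.pack("<I", avg_bytes_per_sec)
hex_bytes = " ".join(f"0x{b:02X}" for b in encoded)
print(hex_bytes)  # 0x88 0x58 0x01 0x00

# Reading the bytes back in little-endian order restores the value.
decoded = int.from_bytes(encoded, "little")
print(decoded)  # 88200
```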

Word alignment

All information in a wave file must be word aligned (i.e., aligned on a two-byte boundary). If a chunk has an odd number of bytes, it is padded with a single zero byte, but this pad byte is not counted in the size recorded for the chunk.
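
The padding rule can be sketched as simple arithmetic. The helper names padded_size and next_chunk_offset are hypothetical, and the sketch assumes that every chunk begins with the same 8-byte header (4-byte ID plus 4-byte size) shown for the RIFF chunk.

```python
def padded_size(chunk_size: int) -> int:
    """Bytes occupied by a chunk body on disk, including any pad byte."""
    # An odd size gets one extra zero byte; an even size is unchanged.
    return chunk_size + (chunk_size & 1)

def next_chunk_offset(offset: int, chunk_size: int) -> int:
    """Offset of the chunk that follows the one starting at `offset`."""
    # 8 = 4-byte chunk ID + 4-byte size field (assumed for every chunk).
    return offset + 8 + padded_size(chunk_size)

print(padded_size(3))             # 4
print(padded_size(4))             # 4
print(next_chunk_offset(12, 16))  # 36
```

Note that the size field still stores the unpadded value (3 in the first case); only the reader's position in the file moves by the padded amount.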

Wave chunks

A wave file includes at least the following chunks.

Format chunk
Data chunk

Other chunks that may exist in a wave file include the following.

Silent chunk
Wave list chunk
Fact chunk
Cue chunk
Playlist chunk
List chunk
Sample chunk
Instrument chunk

Other RIFF chunks

Since the RIFF format is used for other types of files, such as AVI files, a RIFF file can contain types of chunks that are not relevant to the wave file format. For example, the junk and pad chunks hold filler data whose content is ignored; they are used to, perhaps, align the file chunks on a 2K boundary. A software application does not have to recognize or use all chunk types and may ignore any chunk it does not understand.
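
A minimal sketch of how a reader might skip chunks it does not recognize is given below. The function iter_chunks, the KNOWN set, and the in-memory sample file are assumptions for illustration. The IDs "fmt " and "data" are the usual identifiers of the format and data chunks, and "JUNK" is used here as the ID of a filler chunk.

```python
import struct

def iter_chunks(data: bytes):
    """Yield (chunk_id, body) for each chunk after the 12-byte RIFF header."""
    offset = 12
    while offset + 8 <= len(data):
        chunk_id, size = struct.unpack_from("<4sI", data, offset)
        yield chunk_id, data[offset + 8 : offset + 8 + size]
        # Skip the header, the body, and the pad byte if the size is odd.
        offset += 8 + size + (size & 1)

# Hypothetical sample file: a format chunk, a 3-byte junk chunk, a data chunk.
chunks = (
    b"fmt " + struct.pack("<I", 16) + bytes(16)
    + b"JUNK" + struct.pack("<I", 3) + b"xyz" + b"\x00"  # pad byte after odd size
    + b"data" + struct.pack("<I", 4) + bytes(4)
)
sample = b"RIFF" + struct.pack("<I", 4 + len(chunks)) + b"WAVE" + chunks

KNOWN = {b"fmt ", b"data"}  # chunk types this reader chooses to handle
seen = [cid for cid, _ in iter_chunks(sample)]
handled = [cid for cid in seen if cid in KNOWN]
print(seen)     # [b'fmt ', b'JUNK', b'data']
print(handled)  # [b'fmt ', b'data']
```

Because every chunk carries its own size, the reader never needs to interpret the junk chunk's contents in order to reach the data chunk.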

Comments

Uh, the format where the least significant byte comes first is called BIG endian.
Because the end of it has the biggest value. And the little byte is the beginning.
Like Intel x86 is a big-endian type of deal, which is easy to double-check.

I know this is weird, but little end (little bytes) first is "little-endian" and big end (big bytes) first is "big-endian". I don't know who created this naming convention, but it is what it is...

It is quite the problem of endianness that we simply cannot agree on the order of the ends. When writing numbers in decimal, like one-hundred-twenty-three (123), we use big-endian. It's called big-endian because the first thing that you read is of the biggest significance. You could argue that the digit three (3) is at "the end", but the digit one (1) is also at "the end", it just happens to be the other one. You could say that the naming convention of endianness suffers from endianness.
