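Various Sound File Formats

Contents

Supplementary notes on the contents of 「コンピュータサウンドの世界」 ("The World of Computer Sound") (^_^;)

See also: 「作るサウンドエレクトロニクス」 ("Build-It-Yourself Sound Electronics")

List of sound files made for the comparison experiments

Screen of the Indy running the comparison experiments (^_^)

Screen of the Mac during MP3 encoding (^_^)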

5292058 hard_44100_16bit_stereo_norm.aiff

3840062 hard_32000_16bit_stereo_norm.aiff

2646062 hard_22050_16bit_stereo_norm.aiff

1920062 hard_16000_16bit_stereo_norm.aiff

1323058 hard_11025_16bit_stereo_norm.aiff

960058 hard__8000_16bit_stereo_norm.aiff

480026 hard__8000_16bit_stereo_comp.au

481241 hard_44100_16bit_stereo_comp.mp3

481233 soft_44100_16bit_stereo_comp.mp3

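Description of the AIFF file format

Description of the AIFC file format

Description of the WAV file format (1)

Description of the WAV file format (2)

Description of MP3 (1)

Description of MP3 (2)

Description of MP3 (3)

Description of audio files (1)

C source files for MP3


List of sound files made for the comparison experiments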

01  -rw-r--r--  1 root  user   5760124   hard_48000_16bit_stereo_norm.aiff
02  -rw-r--r--  1 root  user   5760000   hard_48000_16bit_stereo_norm.data
03  -rw-r--r--  1 root  user   5760044   hard_48000_16bit_stereo_norm.wav

04  -rw-r--r--  1 root  user   5760124   hard_48000_16bit_stereo_norm.aiff
05  -rw-r--r--  1 root  user   2880062   hard_48000_16bit___mono_norm.aiff

06  -rw-r--r--  1 root  user   5760124   hard_48000_16bit_stereo_norm.aiff
07  -rw-r--r--  1 root  user   8640062   hard_48000_24bit_stereo_norm.aiff
08  -rw-r--r--  1 root  user   5760062   hard_48000_12bit_stereo_norm.aiff
09  -rw-r--r--  1 root  user   2880062   hard_48000__8bit_stereo_norm.aiff

10  -rw-r--r--  1 root  user   5760124   hard_48000_16bit_stereo_norm.aiff
11  -rw-r--r--  1 root  user   5292058   hard_44100_16bit_stereo_norm.aiff
12  -rw-r--r--  1 root  user   3840062   hard_32000_16bit_stereo_norm.aiff
13  -rw-r--r--  1 root  user   2646062   hard_22050_16bit_stereo_norm.aiff
14  -rw-r--r--  1 root  user   1920062   hard_16000_16bit_stereo_norm.aiff
15  -rw-r--r--  1 root  user   1323058   hard_11025_16bit_stereo_norm.aiff
16  -rw-r--r--  1 root  user    960058   hard__8000_16bit_stereo_norm.aiff

17  -rw-r--r--  1 root  user   5292058   hard_44100_16bit_stereo_norm.aiff
18  -rw-r--r--  1 root  user   5292024   hard_44100_16bit_stereo_norm.au
19  -rw-r--r--  1 root  user   2646026   hard_44100_16bit_stereo_comp.au

20  -rw-r--r--  1 root  user   5292058   hard_44100_16bit_stereo_norm.aiff
21  -rw-r--r--  1 root  user   5292058   soft_44100_16bit_stereo_norm.aiff
22  -rw-r--r--  1 root  user    960058   hard__8000_16bit_stereo_norm.aiff
23  -rw-r--r--  1 root  user    480060   hard__8000_16bit___mono_norm.aiff
24  -rw-r--r--  1 root  user    480026   hard__8000_16bit_stereo_comp.au
25  -rw-r--r--  1 root  user    240027   hard__8000_16bit___mono_comp.au
26  -rw-r--r--  1 root  user    240027   soft__8000_16bit___mono_comp.au

27  -rw-r--r--  1 root  user   5292058   hard_44100_16bit_stereo_norm.aiff
28  -rw-r--r--  1 root  user   5033097   hard_44100_16bit_stereo_norm.lzh
29  -rw-r--r--  1 root  user   5022285   hard_44100_16bit_stereo_norm.zip
30  -rw-r--r--  1 root  user   5027251   hard_44100_16bit_stereo_norm.sit
31  -rw-r--r--  1 root  user   5291891   hard_44100_16bit_stereo_norm.cpt
32  -rw-r--r--  1 root  user    481241   hard_44100_16bit_stereo_norm.mp3

33  -rw-r--r--  1 root  user   5292058   soft_44100_16bit_stereo_norm.aiff
34  -rw-r--r--  1 root  user   4352562   soft_44100_16bit_stereo_norm.lzh
35  -rw-r--r--  1 root  user   4368710   soft_44100_16bit_stereo_norm.zip
36  -rw-r--r--  1 root  user   4437119   soft_44100_16bit_stereo_norm.sit
37  -rw-r--r--  1 root  user   4714626   soft_44100_16bit_stereo_norm.cpt
38  -rw-r--r--  1 root  user    481233   soft_44100_16bit_stereo_norm.mp3
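
The sizes in this listing are consistent with roughly 30 seconds of audio per file: the headerless .data file is exactly 48000 x 2 x 2 x 30 bytes, and the other uncompressed files differ from the same product only by a small header. The sketch below reproduces that arithmetic. The function name `pcm_data_bytes` and the 30-second duration are inferences from the numbers, not something the listing states.

```c
#include <stdint.h>

/* Bytes of uncompressed PCM sample data, excluding any file header.
 * Each sample point occupies a whole number of bytes, so a 12-bit
 * sample takes 2 bytes, the same as a 16-bit one. */
uint32_t pcm_data_bytes(uint32_t sample_rate, uint32_t channels,
                        uint32_t bits_per_sample, uint32_t seconds)
{
    uint32_t bytes_per_point = (bits_per_sample + 7u) / 8u;
    return sample_rate * channels * bytes_per_point * seconds;
}
```

For example, `pcm_data_bytes(48000, 2, 24, 30)` gives 8640000, which is 62 bytes less than the size listed for the 24-bit AIFF file.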


Audio Interchange File Format: "AIFF"

A Standard for Sampled Sound Files
Version 1.3

The Audio Interchange File Format (Audio IFF) provides a standard for storing sampled sounds. The format is quite flexible, allowing for the storage of monaural or multichannel sampled sounds at a variety of sample rates and sample widths.

Audio IFF conforms to the "EA IFF 85" Standard for Interchange Format Files developed by Electronic Arts.

Audio IFF is primarily an interchange format, although application designers should find it flexible enough to use as a data storage format as well. If an application does choose to use a different storage format, it should be able to convert to and from the format defined in this document. This will facilitate the sharing of sound data between applications.

Audio IFF is the result of several meetings held with music developers over a period of ten months in 1987-88.

Another "EA IFF 85" sound storage format is "8SVX" IFF 8-bit Sampled Voice, by Electronic Arts. "8SVX", which handles 8-bit monaural samples, is intended mainly for storing sound for playback on personal computers. Audio IFF is intended for use with a larger variety of computers, sampled sound instruments, sound software applications, and high fidelity recording devices.

Data types

A C-like language will be used to describe data structures in this document. The data types used are listed below:
char: 8 bits, signed. A char can contain more than just ASCII characters. It can contain any number from -128 to 127 (inclusive).
unsigned char: 8 bits, unsigned. Contains any number from zero to 255 (inclusive).
short: 16 bits, signed. Contains any number from -32,768 to 32,767 (inclusive).
unsigned short: 16 bits, unsigned. Contains any number from zero to 65,535 (inclusive).
long: 32 bits, signed. Contains any number from -2,147,483,648 to 2,147,483,647 (inclusive).
unsigned long: 32 bits, unsigned. Contains any number from zero to 4,294,967,295 (inclusive).
extended: 80 bit IEEE Standard 754 floating point number (Standard Apple Numeric Environment [SANE] data type Extended).
pstring: Pascal-style string, a one byte count followed by text bytes. The total number of bytes in this data type should be even. A pad byte can be added at the end of the text to accomplish this. This pad byte is not reflected in the count.
ID: 32 bits, the concatenation of four printable ASCII characters in the range ' ' (SP, 0x20) through '~' (0x7E). Spaces (0x20) cannot precede printing characters; trailing spaces are allowed. Control characters are forbidden.
OSType: 32 bits. A concatenation of four characters, as defined in Inside Macintosh, vol II.

Constants

Decimal values are referred to as a string of digits, for example 123, 0, 100 are all decimal numbers. Hexadecimal values are preceded by a 0x - e.g. 0x0A12, 0x1, 0x64.

Data Organization

All data is stored in Motorola 68000 format, that is, big-endian: the most significant byte of every multi-byte value comes first.
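
Motorola 68000 byte order is big-endian, so a reader on a little-endian machine has to assemble multi-byte values by hand. A minimal sketch follows; the helper names `read_be16` and `read_be32` are made up for illustration and are not part of the specification.

```c
#include <stdint.h>

/* Read a 16-bit unsigned value stored most significant byte first. */
uint16_t read_be16(const unsigned char *p)
{
    return (uint16_t)(((unsigned)p[0] << 8) | (unsigned)p[1]);
}

/* Read a 32-bit unsigned value stored most significant byte first. */
uint32_t read_be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}
```

Building the value from individual bytes works the same way regardless of the host's own byte order, which is why it is preferred here over casting the buffer to an integer pointer.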

Referring to Audio IFF

The official name for this standard is Audio Interchange File Format. If an application program needs to present the name of this format to a user, such as in a "Save as..." dialog box, the name can be abbreviated to Audio IFF.

File Structure

The "EA IFF 85" Standard for Interchange Format Files defines an overall structure for storing data in files. Audio IFF conforms to the "EA IFF 85" standard. This document will describe those portions of "EA IFF 85" that are germane to Audio IFF. For a more complete discussion of "EA IFF 85", please refer to the document "EA IFF 85" Standard for Interchange Format Files.

An "EA IFF 85" file is made up of a number of chunks of data. Chunks are the building blocks of "EA IFF 85" files. A chunk consists of some header information followed by data:

A chunk can be represented using our C-like language in the following manner:

typedef struct {
    ID              ckID;       /* chunk ID */
    long            ckSize;     /* chunk Size   */
    char            ckData[];   /* data */
} Chunk;	

ckID describes the format of the data portion of a chunk. A program can determine how to interpret the chunk data by examining ckID.

ckSize is the size of the data portion of the chunk, in bytes. It does not include the 8 bytes used by ckID and ckSize.

ckData contains the data stored in the chunk. The format of this data is determined by ckID. If the data is an odd number of bytes in length, a zero pad byte must be added at the end. The pad byte is not included in ckSize.

Note that an array with no size specification (e.g. char ckData[];) indicates a variable-sized array in our C-like language. This differs from standard C.
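
Because ckSize excludes both the 8-byte header and the optional pad byte, the distance from one chunk to the next has to be computed. A small sketch of that rule, using the hypothetical names `CHUNK_HEADER_BYTES` and `chunk_total_bytes`:

```c
#include <stdint.h>

/* Size of the fixed chunk header: 4-byte ckID plus 4-byte ckSize. */
#define CHUNK_HEADER_BYTES 8u

/* Total bytes a chunk occupies in the file, given its ckSize.
 * An odd-length data portion is followed by one zero pad byte
 * that ckSize does not count. */
uint32_t chunk_total_bytes(uint32_t ck_size)
{
    return CHUNK_HEADER_BYTES + ck_size + (ck_size & 1u);
}
```

A chunk with ckSize 18 therefore occupies 26 bytes, while one with ckSize 5 occupies 14.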

An Audio IFF file is a collection of a number of different types of chunks. There is a Common Chunk which contains important parameters describing the sampled sound, such as its length and sample rate. There is a Sound Data Chunk that contains the actual audio samples. There are several other optional chunks that define markers, list instrument parameters, store application-specific information, etc. All of these chunks are described in detail in later sections of this document.

The chunks in an Audio IFF file are grouped together in a container chunk. "EA IFF 85" defines a number of container chunks, but the one used by Audio IFF is called a FORM. A FORM has the following format:

typedef struct {
    ID          ckID;   
    long        ckSize;
    ID          formType;   
    char        chunks [];
} Chunk;	

ckID is always 'FORM'. This indicates that this is a FORM chunk.

ckSize contains the size of the data portion of the 'FORM' chunk. Note that the data portion has been broken into two parts, formType and chunks[].

formType describes what's in the 'FORM' chunk. For Audio IFF files, formType is always 'AIFF'. This indicates that the chunks within the FORM pertain to sampled sound. A FORM chunk of formType 'AIFF' is called a FORM AIFF.

chunks are the chunks contained within the FORM. These chunks are called local chunks. A FORM AIFF along with its local chunks make up an Audio IFF file.

A simple Audio IFF file consists of a single FORM AIFF that contains two local chunks, a Common Chunk and a Sound Data Chunk.

There are no restrictions on the ordering of local chunks within a FORM AIFF.
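
Since local chunks may appear in any order, a reader cannot assume the Common Chunk comes first and has to scan for it. The following is a sketch of such a scan over a file already loaded into memory. The function name `find_local_chunk` and the policy of returning NULL on malformed input are assumptions made for this example.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Local helper: read a 32-bit big-endian value. */
static uint32_t be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Return a pointer to the header of the first local chunk whose ckID
 * equals the 4-character `id`, or NULL if `buf` is not a FORM AIFF or
 * no such chunk is found. */
const unsigned char *find_local_chunk(const unsigned char *buf, size_t len,
                                      const char *id)
{
    size_t pos = 12;   /* local chunks start after ckID, ckSize, formType */
    size_t end;

    if (len < 12 || memcmp(buf, "FORM", 4) != 0 ||
        memcmp(buf + 8, "AIFF", 4) != 0)
        return NULL;

    end = 8 + (size_t)be32(buf + 4);   /* end of the FORM data portion */
    if (end > len)
        end = len;                     /* tolerate a truncated buffer */

    while (end >= 8 && pos <= end - 8) {
        uint32_t size = be32(buf + pos + 4);
        if ((size_t)size > end - pos - 8)
            return NULL;               /* chunk runs past the FORM */
        if (memcmp(buf + pos, id, 4) == 0)
            return buf + pos;
        pos += 8 + (size_t)size + (size & 1u);   /* skip data and pad byte */
    }
    return NULL;
}
```

A caller would typically look up "COMM" first and then "SSND", whatever order they were written in.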

On an Apple II, the FORM AIFF is stored in a ProDOS file. The file type is 0xD8 and the aux type is 0x0000. AIFF versions 1.2 and earlier used file type 0xCB, which is incorrect. Please see the Apple II File Type Note for file type 0xD8 and aux type 0x0000 for strategies on dealing with this inconsistency.

On a Macintosh, the FORM AIFF is stored in the data fork of an Audio IFF file. The Macintosh file type of an Audio IFF file is 'AIFF'. This is the same as the formType of the FORM AIFF.

Macintosh or Apple II applications should not store any information in an Audio IFF file's resource fork, as this information may not be preserved by all applications. Applications can use the Application Specific Chunk, defined later in this document, to store extra information specific to their application.

On an operating system that uses file extensions, such as MS-DOS or UNIX, it is recommended that Audio IFF file names have a ".AIF" extension.

A more detailed example of an Audio IFF file can be found in the Appendix. Please refer to this example as often as necessary while reading the remainder of this document.

Local Chunk Types

The formats of the different local chunk types found within a FORM AIFF are described in the following sections. The ckIDs for each chunk are also defined.

There are two types of chunks, those that are required and those that are optional. The Common Chunk is required. The Sound Data chunk is required if the sampled sound has greater than zero length. All other chunks are optional. All applications that use FORM AIFF must be able to read the required chunks, and can choose to selectively ignore the optional chunks. A program that copies a FORM AIFF should copy all of the chunks in the FORM AIFF.

Common Chunk

The Common Chunk describes fundamental parameters of the sampled sound.

#define CommonID    'COMM'  /* ckID for Common Chunk */
typedef struct {
    ID              ckID;   
    long            ckSize;
    short           numChannels;
    unsigned long   numSampleFrames;
    short           sampleSize;
    extended        sampleRate;
} CommonChunk;  

ckID is always 'COMM'. ckSize is the size of the data portion of the chunk, in bytes. It does not include the 8 bytes used by ckID and ckSize. For the Common Chunk, ckSize is always 18.

numChannels contains the number of audio channels for the sound. A value of 1 means monophonic sound, 2 means stereo, and 4 means four channel sound, etc. Any number of audio channels may be represented.

The actual sound samples are stored in another chunk, the Sound Data Chunk, which will be described shortly. For multichannel sounds, single sample points from each channel are interleaved. A set of interleaved sample points is called a sample frame. This is illustrated below for the stereo case.

For monophonic sound, a sample frame is a single sample point.

For multichannel sounds, the following conventions should be observed:

numSampleFrames contains the number of sample frames in the Sound Data Chunk. Note that numSampleFrames is the number of sample frames, not the number of bytes nor the number of sample points in the Sound Data Chunk. The total number of sample points in the file is numSampleFrames times numChannels.

sampleSize is the number of bits in each sample point. It can be any number from 1 to 32. The format of a sample point will be described in the next section, the Sound Data Chunk.

sampleRate is the sample rate at which the sound is to be played back, in sample frames per second.

One and only one Common Chunk is required in every FORM AIFF.
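
The one awkward field in the Common Chunk is sampleRate, because few C compilers expose an 80-bit extended type portably. The sketch below decodes the 18-byte data portion by hand. The names `CommonInfo`, `extended_to_double` and `parse_common` are invented for the example, and the conversion assumes a normalized, finite value (which covers ordinary sample rates) rather than handling infinities, NaNs or denormals.

```c
#include <stdint.h>

typedef struct {
    int      num_channels;
    uint32_t num_sample_frames;
    int      sample_size;
    double   sample_rate;
} CommonInfo;

/* Convert a big-endian 80-bit extended number to double.
 * Layout: 1 sign bit, 15 exponent bits (bias 16383), then a 64-bit
 * mantissa with an explicit integer bit. */
double extended_to_double(const unsigned char *p)
{
    int negative = (p[0] & 0x80) != 0;
    int exponent = ((p[0] & 0x7F) << 8) | p[1];
    uint64_t mantissa = 0;
    double value;
    int i, shift;

    for (i = 2; i < 10; i++)
        mantissa = (mantissa << 8) | p[i];
    if (exponent == 0 && mantissa == 0)
        return 0.0;

    /* value = mantissa * 2^(exponent - 16383 - 63), scaled without libm */
    value = (double)mantissa;
    shift = exponent - 16383 - 63;
    while (shift > 0) { value *= 2.0; shift--; }
    while (shift < 0) { value *= 0.5; shift++; }
    return negative ? -value : value;
}

/* `data` points at the data portion of a COMM chunk (after ckSize). */
CommonInfo parse_common(const unsigned char *data)
{
    CommonInfo c;
    c.num_channels      = (data[0] << 8) | data[1];
    c.num_sample_frames = ((uint32_t)data[2] << 24) | ((uint32_t)data[3] << 16) |
                          ((uint32_t)data[4] << 8)  |  (uint32_t)data[5];
    c.sample_size       = (data[6] << 8) | data[7];
    c.sample_rate       = extended_to_double(data + 8);
    return c;
}
```

As a check, the bytes 40 0E AC 44 00 00 00 00 00 00 decode to 44100.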

Sound Data Chunk

The Sound Data Chunk contains the actual sample frames.

#define SoundDataID 'SSND'  /* ckID for Sound Data Chunk */
typedef struct {
    ID                  ckID;
    long                ckSize;
    unsigned long       offset;
    unsigned long       blockSize;
    unsigned char       soundData[];
} SoundDataChunk;   

ckID is always 'SSND'. ckSize is the size of the data portion of the chunk, in bytes. It does not include the 8 bytes used by ckID and ckSize.

offset determines where the first sample frame in the soundData starts. offset is in bytes. Most applications won't use offset and should set it to zero. Use for a non-zero offset is explained in the Block-Aligning Sound Data section below.

blockSize is used in conjunction with offset for block-aligning sound data. It contains the size in bytes of the blocks that sound data is aligned to. As with offset, most applications won't use blockSize and should set it to zero. More information on blockSize is in the Block-Aligning Sound Data section below.

soundData contains the sample frames that make up the sound. The number of sample frames in the soundData is determined by the numSampleFrames parameter in the Common Chunk.

Sample Points

Each sample point in a sample frame is a linear, 2's complement value. The sample points are from 1 to 32 bits wide, as determined by the sampleSize parameter in the Common Chunk. Sample points are stored in an integral number of contiguous bytes. One to 8 bit wide sample points are stored in one byte, 9 to 16 bit wide sample points are stored in two bytes, 17 to 24 bit wide sample points are stored in 3 bytes, and 25 to 32 bit wide samples are stored in 4 bytes. When the width of a sample point is less than a multiple of 8 bits, the sample point data is left justified, with the remaining bits zeroed. An example case is illustrated below. A 12 bit sample point, binary 101000010111, is stored left justified in two bytes. The remaining bits are set to zero.
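
The storage rule and the 12-bit example above can be expressed in a few lines. This is only a sketch; `bytes_per_sample_point` and `left_justify` are hypothetical helper names.

```c
#include <stdint.h>

/* Bytes used to store one sample point of the given width (1 to 32 bits). */
unsigned bytes_per_sample_point(unsigned sample_size_bits)
{
    return (sample_size_bits + 7u) / 8u;
}

/* Left-justify a sample held in the low `bits` bits of `sample` within
 * its storage container, zeroing the unused low-order bits. The result
 * is the container value, to be written most significant byte first. */
uint32_t left_justify(uint32_t sample, unsigned bits)
{
    unsigned container_bits = 8u * bytes_per_sample_point(bits);
    uint32_t mask = (bits >= 32u) ? 0xFFFFFFFFu
                                  : (((uint32_t)1 << bits) - 1u);
    return (sample & mask) << (container_bits - bits);
}
```

For the 12-bit value binary 101000010111 (0xA17), `left_justify(0xA17, 12)` yields 0xA170, so the two stored bytes are 0xA1 and 0x70.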

Sample Frames

Sample frames are stored contiguously in order of increasing time. The sample points within a sample frame are packed together; there are no unused bytes between them. Likewise, the sample frames are packed together with no pad bytes.

Block-Aligning Sound Data

There may be some applications that, to ensure real time recording and playback of audio, wish to align sampled sound data with fixed-size blocks. This can be accomplished with the offset and blockSize parameters, as shown below.

In the above figure, the first sample frame starts at the beginning of block N. This is accomplished by skipping the first offset bytes of the soundData. Note too that the soundData array can extend beyond valid sample frames, allowing the soundData array to end on a block boundary.

blockSize specifies the size in bytes of the block that is to be aligned to. A blockSize of zero indicates that the sound data does not need to be block-aligned. Applications that don't care about block alignment should set blockSize and offset to zero when writing Audio IFF files. Applications that write block-aligned sound data should set blockSize to the appropriate block size. Applications that modify an existing Audio IFF file should try to preserve alignment of the sound data, although this is not required. If an application doesn't preserve alignment, it should set blockSize and offset to zero. If an application needs to realign sound data to a different sized block, it should update blockSize and offset accordingly.
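
One way a writer might choose offset is sketched below. It assumes that blocks are counted from the start of the file and that the writer knows the file position of the first byte of soundData; both the function name `align_offset` and the parameter `sound_data_pos` are illustrative, and the specification does not prescribe this calculation.

```c
#include <stdint.h>

/* Number of bytes to skip at the start of soundData so that the first
 * sample frame lands on a block boundary. `sound_data_pos` is the file
 * position of soundData[0]. A block_size of zero means no alignment. */
uint32_t align_offset(uint32_t sound_data_pos, uint32_t block_size)
{
    if (block_size == 0)
        return 0;
    return (block_size - sound_data_pos % block_size) % block_size;
}
```

With soundData starting at byte 54 and 512-byte blocks, the offset is 458, which places the first sample frame at byte 512.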

The Sound Data Chunk is required unless the numSampleFrames field in the Common Chunk is zero. A maximum of one Sound Data Chunk can appear in a FORM AIFF.

Marker Chunk

The Marker Chunk contains markers that point to positions in the sound data. Markers can be used for whatever purposes an application desires. The Instrument Chunk, defined later in this document, uses markers to mark loop beginning and end points, for example.

Markers

A marker has the following format.

typedef short   MarkerId;
typedef struct {
    MarkerId            id;
    unsigned long       position;
    pstring             markerName;
} Marker;

id is a number that uniquely identifies the marker within a FORM AIFF. The id can be any positive non-zero integer, as long as no other marker within the same FORM AIFF has the same id.

The marker's position in the sound data is determined by position. Markers conceptually fall between two sample frames. A marker that falls before the first sample frame in the sound data is at position zero, while a marker that falls between the first and second sample frame in the sound data is at position 1. Note that the units for position are sample frames, not bytes nor sample points.

markerName is a Pascal-style text string containing the name of the marker.

Note: Some "EA IFF 85" files store strings as C-strings (text bytes followed by a null terminating character) instead of Pascal-style strings. Audio IFF uses pstrings because they are more efficiently skipped over when scanning through chunks. Using pstrings, a program can skip over a string by adding the string count to the address of the first character. C strings require that each character in the string be examined for the null terminator.
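
Skipping a pstring needs only its count byte plus the even-length rule from the Data types section. A minimal sketch, with `pstring_bytes` as a made-up name:

```c
#include <stddef.h>

/* Bytes occupied by the pstring starting at `p`: one count byte plus
 * the text, rounded up to an even total by an uncounted pad byte. */
size_t pstring_bytes(const unsigned char *p)
{
    size_t n = 1u + (size_t)p[0];
    return n + (n & 1u);
}
```

An 8-character name occupies 10 bytes (count, text, pad), while a 7-character name occupies exactly 8.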

Marker Chunk Format

The format for the data within a Marker Chunk is shown below.

#define MarkerID    'MARK'  /* ckID for Marker Chunk */
typedef struct {
    ID                  ckID;   
    long                ckSize;
    unsigned short      numMarkers;
    Marker              Markers[];
} MarkerChunk;

ckID is always 'MARK'. ckSize is the size of the data portion of the chunk, in bytes. It does not include the 8 bytes used by ckID and ckSize.

numMarkers is the number of markers in the Marker Chunk.

If numMarkers is non-zero, it is followed by the markers themselves. Because all fields in a marker are an even number of bytes in length, the length of any marker will always be even. Thus, markers are packed together with no unused bytes between them. The markers need not be ordered in any particular manner.
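
Because markers are variable-length and unordered, finding one by id means walking the whole list. The sketch below does that over the data portion of a Marker Chunk; the function name `find_marker_position` and its return convention (1 if found, 0 otherwise) are assumptions for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Search the data portion of a MARK chunk (`data`, `size` = ckSize) for
 * the marker with the given id. On success store its position, in
 * sample frames, in *position and return 1; otherwise return 0. */
int find_marker_position(const unsigned char *data, size_t size,
                         int id, uint32_t *position)
{
    size_t pos = 2;                 /* markers follow the 2-byte numMarkers */
    unsigned num_markers, i;

    if (size < 2)
        return 0;
    num_markers = ((unsigned)data[0] << 8) | data[1];

    for (i = 0; i < num_markers; i++) {
        int marker_id;
        size_t name_bytes;

        if (size - pos < 7)         /* need id, position and the count byte */
            return 0;
        marker_id  = (int)(((unsigned)data[pos] << 8) | data[pos + 1]);
        name_bytes = 1u + data[pos + 6];
        name_bytes += name_bytes & 1u;          /* pstring padded to even */

        if (marker_id == id) {
            *position = ((uint32_t)data[pos + 2] << 24) |
                        ((uint32_t)data[pos + 3] << 16) |
                        ((uint32_t)data[pos + 4] << 8)  |
                         (uint32_t)data[pos + 5];
            return 1;
        }
        if (name_bytes > size - pos - 6)
            return 0;               /* marker name runs past the chunk */
        pos += 6 + name_bytes;
    }
    return 0;
}
```

The Instrument Chunk's beginLoop and endLoop ids can be resolved to frame positions with a lookup of this kind.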

The Marker Chunk is optional. No more than one Marker Chunk can appear in a FORM AIFF.

Instrument Chunk

The Instrument Chunk defines basic parameters that an instrument, such as a sampler, could use to play back the sound data.

Looping

Sound data can be looped, allowing a portion of the sound to be repeated in order to lengthen the sound. The structure below describes a loop:

typedef struct {
    short           playMode;
    MarkerId        beginLoop;
    MarkerId        endLoop;
} Loop;

A loop is marked with two points, a begin position and an end position. There are two ways to play a loop, forward looping and forward/backward looping. In the case of forward looping, playback begins at the beginning of the sound, continues past the begin position and continues to the end position, at which point playback restarts again at the begin position. The segment between the begin and end positions, called the loop segment, is played over and over again, until interrupted by something, such as the release of a key on a sampling instrument, for example.

With forward/backward looping, the loop segment is first played from the begin position to the end position, and then played backwards from the end position back to the begin position. This flip-flop pattern is repeated over and over again until interrupted.

playMode specifies which type of looping is to be performed.

#define NoLooping               0
#define ForwardLooping          1
#define ForwardBackwardLooping  2

If NoLooping is specified, then the loop points are ignored during playback.

beginLoop is the marker id that marks the begin position of the loop segment.

endLoop marks the end position of a loop. The begin position must be less than the end position. If this is not the case, then the loop segment has zero or negative length and no looping takes place.
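
The two play modes can be sketched as a mapping from playback step to sample frame. Here `begin` and `end` are the marker positions already resolved to sample frames, so the loop segment covers frames begin through end - 1. The function name `frame_at_step` is hypothetical, and one detail is a choice made for this sketch rather than something the text specifies: in forward/backward mode the frames at the turn-around points are played twice.

```c
#include <stdint.h>

#define NoLooping               0
#define ForwardLooping          1
#define ForwardBackwardLooping  2

/* Sample frame played at step `n` (0 = first frame of the sound) while
 * the loop is held. */
uint32_t frame_at_step(uint32_t n, uint32_t begin, uint32_t end,
                       int play_mode)
{
    uint32_t len, k;

    /* No loop, an empty or negative loop segment, or not yet at the end
     * position: plain forward playback. */
    if (play_mode == NoLooping || begin >= end || n < end)
        return n;

    len = end - begin;
    if (play_mode == ForwardLooping)
        return begin + (n - begin) % len;

    /* Forward/backward: after reaching `end`, run backwards to `begin`,
     * then forwards to `end` again, and so on. */
    k = (n - end) % (2u * len);
    return (k < len) ? end - 1u - k : begin + (k - len);
}
```

With begin = 2 and end = 5, forward looping plays frames 0 1 2 3 4 2 3 4 2 ..., while forward/backward looping plays 0 1 2 3 4 4 3 2 2 3 4 4 ....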

Instrument Chunk Format

The format of the data within an Instrument Chunk is described below.

#define InstrumentID    'INST'  /* ckID for Instrument Chunk */
typedef struct {
    ID              ckID;   
    long            ckSize;
    char            baseNote;
    char            detune;
    char            lowNote;
    char            highNote;
    char            lowVelocity;
    char            highVelocity;
    short           gain;
    Loop            sustainLoop;
    Loop            releaseLoop;
} InstrumentChunk;

ckID is always 'INST'. ckSize is the size of the data portion of the chunk, in bytes. For the Instrument Chunk, ckSize is always 20.

baseNote is the note at which the instrument plays back the sound data without pitch modification. Units are MIDI (MIDI is an acronym for Musical Instrument Digital Interface) note numbers, and are in the range 0 through 127. Middle C is 60.

detune determines how much the instrument should alter the pitch of the sound when it is played back. Units are in cents (1/100 of a semitone) and range from -50 to +50. Negative numbers mean that the pitch of the sound should be lowered, while positive numbers mean that it should be raised.

lowNote and highNote specify the suggested range on a keyboard for playback of the sound data. The sound data should be played if the instrument is requested to play a note between the low and high notes, inclusive. The base note does not have to be within this range. Units for lowNote and highNote are MIDI note values.

lowVelocity and highVelocity specify the suggested range of velocities for playback of the sound data. The sound data should be played if the note-on velocity is between low and high velocity, inclusive. Units are MIDI velocity values, 1 (lowest velocity) through 127 (highest velocity).

gain is the amount by which to change the gain of the sound when it is played. Units are decibels. For example, 0 dB means no change, 6 dB means double the value of each sample point, while -6 dB means halve the value of each sample point.

sustainLoop specifies a loop that is to be played when an instrument is sustaining a sound.

releaseLoop specifies a loop that is to be played when an instrument is in the release phase of playing back a sound. The release phase usually occurs after a key on an instrument is released.

The Instrument Chunk is optional. No more than one Instrument Chunk can appear in a FORM AIFF.

MIDI Data Chunk

The MIDI Data Chunk can be used to store MIDI data (please refer to Musical Instrument Digital Interface Specification 1.0, available from the International MIDI Association, for more details on MIDI).

The primary purpose of this chunk is to store MIDI System Exclusive messages, although other types of MIDI data can be stored in this block as well. As more instruments come on the market, they will likely have parameters that have not been included in the Audio IFF specification. The MIDI System Exclusive messages for these instruments may contain many parameters that are not included in the Instrument Chunk. For example, a new sampling instrument may have more than the two loops defined in the Instrument Chunk. These loops will likely be represented in the MIDI System Exclusive message for the new machine. This MIDI System Exclusive message can be stored in the MIDI Data Chunk.

#define MIDIDataID  'MIDI'  /* ckID for MIDI Data Chunk */
typedef struct {
    ID                  ckID;
    long                ckSize;
    unsigned char       MIDIdata[];
} MIDIDataChunk;

ckID is always 'MIDI'. ckSize is the size of the data portion of the chunk, in bytes. It does not include the 8 bytes used by ckID and ckSize.

MIDIdata contains a stream of MIDI data.

The MIDI Data Chunk is optional. Any number of MIDI Data Chunks may exist in a FORM AIFF. If MIDI System Exclusive messages for several instruments are to be stored in a FORM AIFF, it is better to use one MIDI Data Chunk per instrument than one big MIDI Data Chunk for all of the instruments.

Audio Recording Chunk

The Audio Recording Chunk contains information pertinent to audio recording devices.

#define AudioRecordingID  'AESD'        /* ckID for Audio Recording */
                                        /*   Chunk.                 */
typedef struct {
    ID                  ckID;
    long                ckSize;
    unsigned char       AESChannelStatusData[24];
} AudioRecordingChunk;

ckID is always 'AESD'. ckSize is the size of the data portion of the chunk, in bytes. For the Audio Recording Chunk, ckSize is always 24.

The 24 bytes of AESChannelStatusData are specified in the AES Recommended Practice for Digital Audio Engineering - Serial Transmission Format for Linearly Represented Digital Audio Data, section 7.1, Channel Status Data. That document describes a format for real-time digital transmission of digital audio between audio devices. This information is duplicated in the Audio Recording Chunk for convenience. Of general interest would be bits 2, 3, and 4 of byte 0, which describe recording emphasis.

The Audio Recording Chunk is optional. No more than one Audio Recording Chunk may appear in a FORM AIFF.

Application Specific Chunk

The Application Specific Chunk can be used for any purposes whatsoever by manufacturers of applications. For example, an application that edits sounds might want to use this chunk to store editor state parameters such as magnification levels, last cursor position, and the like.

#define ApplicationSpecificID  'APPL'   /* ckID for Application */
                                        /*  Specific Chunk.     */
typedef struct {
    ID          ckID;   
    long        ckSize;
    OSType      applicationSignature;
    char        data[];
} ApplicationSpecificChunk;

ckID is always 'APPL'. ckSize is the size of the data portion of the chunk, in bytes. It does not include the 8 bytes used by ckID and ckSize.

applicationSignature identifies a particular application. For Macintosh applications, this will be the application's four character signature. For Apple II applications, applicationSignature should always be 'pdos', or the hexadecimal bytes 0x70646F73. If applicationSignature is 'pdos', the beginning of the data area is defined to be a Pascal-style string (a length byte followed by ASCII string bytes) containing the name of the application. This is necessary because Apple II applications do not have a four-byte signature as do Macintosh applications.

data is the data specific to the application.

The Application Specific Chunk is optional. Any number of Application Specific Chunks may exist in a single FORM AIFF.

Comments Chunk

The Comments Chunk is used to store comments in the FORM AIFF. "EA IFF 85" has an Annotation Chunk that can be used for comments, but the Comments Chunk has two features not found in the "EA IFF 85" chunk. They are: 1) a timestamp for the comment; and 2) a link to a marker.

Comment

A comment consists of a time stamp, marker id, and a text count followed by text.

typedef struct {
    unsigned long       timeStamp;
    MarkerId            marker;
    unsigned short      count;
    char                text[];
} Comment;

timeStamp indicates when the comment was created. Units are the number of seconds since January 1, 1904. (This time convention is the one used by the Macintosh. For procedures that manipulate the time stamp, see The Operating System Utilities chapter in Inside Macintosh, vol II ). For a routine that will convert this to an Apple II GS/OS format time, please see Apple II File Type Note for filetype 0xD8, aux type 0x0000.
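
On systems that count time from January 1, 1970, the timestamp can be converted by subtracting the 2,082,844,800 seconds between the two epochs (66 years including 17 leap days). The sketch below does only that; the names are invented for the example, and time zone handling is deliberately left out.

```c
#include <stdint.h>

/* Seconds from 1904-01-01 00:00:00 to 1970-01-01 00:00:00. */
#define MAC_TO_UNIX_EPOCH_SECONDS 2082844800u

/* Convert an Audio IFF timeStamp to seconds since 1970-01-01.
 * The result is negative for dates before 1970. */
int64_t aiff_time_to_unix(uint32_t time_stamp)
{
    return (int64_t)time_stamp - (int64_t)MAC_TO_UNIX_EPOCH_SECONDS;
}
```

The 64-bit return type keeps the full unsigned 32-bit range of timeStamp representable after the subtraction.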

A comment can be linked to a marker. This allows applications to store long descriptions of markers as a comment. If the comment is referring to a marker, then marker is the ID of that marker. Otherwise, marker is zero, indicating that this comment is not linked to a marker.

count is the length of the text that makes up the comment. This is a 16 bit quantity, allowing much longer comments than would be available with a pstring.

text contains the comment itself. If the text is an odd number of bytes long, it must be followed by a pad byte to ensure that it occupies an even number of bytes. This pad byte, if present, is not included in count.

Comments Chunk Format

#define CommentID       'COMT'  /* ckID for Comments Chunk.  */
typedef struct {
    ID                  ckID;
    long                ckSize;
    unsigned short      numComments;
    Comment             comments[];
} CommentsChunk;

ckID is always 'COMT'. ckSize is the size of the data portion of the chunk, in bytes. It does not include the 8 bytes used by ckID and ckSize.

numComments contains the number of comments in the Comments Chunk. This is followed by the comments themselves. Comments are always an even number of bytes in length, so there is no padding between comments in the Comments Chunk.
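
Stepping from one comment to the next follows directly from the field sizes: 4 bytes of timeStamp, 2 of marker, 2 of count, then the text and its optional pad byte. A sketch, with `comment_bytes` as a hypothetical name:

```c
#include <stddef.h>

/* Bytes occupied by the comment starting at `p` inside a COMT chunk:
 * an 8-byte fixed part, `count` bytes of text, and one pad byte when
 * count is odd. */
size_t comment_bytes(const unsigned char *p)
{
    size_t count = ((size_t)p[6] << 8) | (size_t)p[7];
    return 8u + count + (count & 1u);
}
```

A comment whose text is "hello" (count 5) occupies 14 bytes, and one whose text is "text" (count 4) occupies 12.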

The Comments Chunk is optional. No more than one Comments Chunk may appear in a single FORM AIFF.

Text Chunks - Name, Author, Copyright, Annotation

These four chunks are included in the definition of every "EA IFF 85" file. All are text chunks; their data portion consists solely of text. Each of these chunks is optional.

#define NameID          'NAME'  /* ckID for Name Chunk.  */
#define AuthorID        'AUTH'  /* ckID for Author Chunk.  */
#define CopyrightID     '(c) '  /* ckID for Copyright Chunk.  */
#define AnnotationID    'ANNO'  /* ckID for Annotation Chunk.  */
typedef struct {
    ID                  ckID;
    long                ckSize;
    char                text[];
} TextChunk;

ckID is either 'NAME', 'AUTH', '(c) ', or 'ANNO', depending on whether the chunk is a Name Chunk, Author Chunk, Copyright Chunk, or Annotation Chunk, respectively. For the Copyright Chunk, the 'c' is lowercase and there is a space (0x20) after the close parenthesis.

ckSize is the size of the data portion of the chunk, in this case the text.

text contains pure ASCII characters. It is not a pstring nor a C string. The number of characters in text is determined by ckSize. The contents of text depend on the chunk, as described below:

Name Chunk

text contains the name of the sampled sound. The Name Chunk is optional. No more than one Name Chunk may exist within a FORM AIFF.

Author Chunk

text contains one or more author names. An author in this case is the creator of a sampled sound. The Author Chunk is optional. No more than one Author Chunk may exist within a FORM AIFF.

Copyright Chunk

The Copyright Chunk contains a copyright notice for the sound. text contains a date followed by the copyright owner. The chunk ID '(c) ' serves as the copyright characters '©'. For example, a Copyright Chunk containing the text "1988 Apple Computer, Inc." means "© 1988 Apple Computer, Inc."

The Copyright Chunk is optional. No more than one Copyright Chunk may exist within a FORM AIFF.

Annotation Chunk

text contains a comment. Use of this chunk is discouraged within FORM AIFF. The more powerful Comments Chunk should be used instead. The Annotation Chunk is optional. Many Annotation Chunks may exist within a FORM AIFF.
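
Because text in these chunks is neither a pstring nor a C string, a reader has to take the length from ckSize and add its own terminator. The following is a minimal sketch of that step, not part of the specification. The helper names read_be32 and aiff_text_chunk_to_cstring are invented for illustration, and the sketch assumes the chunk has already been read into memory. AIFF stores multi-byte integers in big-endian order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Read a 32-bit big-endian value (AIFF stores integers big-endian). */
static uint32_t read_be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/*
 * Hypothetical helper: copy the text of a Name, Author, Copyright, or
 * Annotation chunk into a newly allocated, NUL-terminated C string.
 * `chunk` points at the 4-byte ckID and `avail` is the number of bytes
 * available from there. Returns NULL if the ID is not one of the four
 * text chunk IDs, the chunk is truncated, or allocation fails.
 * The caller frees the result.
 */
char *aiff_text_chunk_to_cstring(const unsigned char *chunk, size_t avail)
{
    static const char *const ids[] = { "NAME", "AUTH", "(c) ", "ANNO" };
    size_t i;
    int known = 0;
    uint32_t ck_size;
    char *out;

    assert(chunk != NULL);
    if (avail < 8)
        return NULL;
    for (i = 0; i < sizeof ids / sizeof ids[0]; i++)
        if (memcmp(chunk, ids[i], 4) == 0)
            known = 1;
    if (!known)
        return NULL;

    ck_size = read_be32(chunk + 4);
    if (ck_size > avail - 8)
        return NULL;

    out = malloc((size_t)ck_size + 1);
    if (out == NULL)
        return NULL;
    memcpy(out, chunk + 8, ck_size);
    out[ck_size] = '\0';   /* the file itself stores no terminator */
    return out;
}
```

For a Copyright Chunk whose text is "1988 Apple Computer, Inc.", the returned string would be exactly that text; prepending the copyright sign is left to the caller.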

Chunk Precedence

Several of the local chunks for FORM AIFF may contain duplicate information. For example, the Instrument Chunk defines loop points, and MIDI system exclusive data in the MIDI Data Chunk may also define loop points. What happens if these loop points are different? How is an application supposed to loop the sound?

Such conflicts are resolved by defining a precedence for chunks:

The Common Chunk has the highest precedence, while the Application Specific Chunk has the lowest. Information in the Common Chunk always takes precedence over conflicting information in any other chunk. The Application Specific Chunk always loses in conflicts with other chunks. By looking at the chunk hierarchy, for example, one sees that the loop points in the Instrument Chunk take precedence over conflicting loop points found in the MIDI Data Chunk.

It is the responsibility of applications that write data into the lower precedence chunks to make sure that the higher precedence chunks are updated accordingly.

Appendix

Illustrated below is an example of a FORM AIFF. An Audio IFF file is simply a file containing a single FORM AIFF. On a Macintosh, the FORM AIFF is stored in the data fork of a file and the file type is 'AIFF'.


WAVE PCM soundfile format

The WAVE file format is a subset of Microsoft's RIFF specification for the storage of multimedia files. A RIFF file starts with a file header followed by a sequence of data chunks. A WAVE file is often just a RIFF file with a single "WAVE" chunk, which consists of two sub-chunks: a "fmt " chunk specifying the data format and a "data" chunk containing the actual sample data. Call this form the "canonical form". A nearly complete description can be found at MSDN, though it is long and mostly covers the non-PCM (registered proprietary) data formats.

Offset  Size  Name             Description

The canonical WAVE format starts with the RIFF header:

0         4   ChunkID          Contains the letters "RIFF" in ASCII form
                               (0x52494646 big-endian form).
4         4   ChunkSize        36 + SubChunk2Size, or more precisely:
                               4 + (8 + SubChunk1Size) + (8 + SubChunk2Size)
                               This is the size of the rest of the chunk 
                               following this number.  This is the size of the 
                               entire file in bytes minus 8 bytes for the
                               two fields not included in this count:
                               ChunkID and ChunkSize.
8         4   Format           Contains the letters "WAVE"
                               (0x57415645 big-endian form).

The "WAVE" format consists of two subchunks: "fmt " and "data":
The "fmt " subchunk describes the sound data's format:

12        4   Subchunk1ID      Contains the letters "fmt "
                               (0x666d7420 big-endian form).
16        4   Subchunk1Size    16 for PCM.  This is the size of the
                               rest of the Subchunk which follows this number.
20        2   AudioFormat      PCM = 1 (i.e. Linear quantization)
                               Values other than 1 indicate some 
                               form of compression.
22        2   NumChannels      Mono = 1, Stereo = 2, etc.
24        4   SampleRate       8000, 44100, etc.
28        4   ByteRate         == SampleRate * NumChannels * BitsPerSample/8
32        2   BlockAlign       == NumChannels * BitsPerSample/8
                               The number of bytes for one sample including
                               all channels. When BitsPerSample is not a
                               multiple of 8, each sample occupies a whole
                               number of bytes, so this value is rounded up.
34        2   BitsPerSample    8 bits = 8, 16 bits = 16, etc.
          2   ExtraParamSize   if PCM, then doesn't exist
          X   ExtraParams      space for extra parameters

The "data" subchunk contains the size of the data and the actual sound:

36        4   Subchunk2ID      Contains the letters "data"
                               (0x64617461 big-endian form).
40        4   Subchunk2Size    == NumSamples * NumChannels * BitsPerSample/8
                               This is the number of bytes in the data.
                               You can also think of this as the size
                               of the rest of the subchunk following this 
                               number.
44        *   Data             The actual sound data.

As an example, here are the opening 72 bytes of a WAVE file with bytes shown as hexadecimal numbers:

52 49 46 46 24 08 00 00 57 41 56 45 66 6d 74 20 10 00 00 00 01 00 02 00 
22 56 00 00 88 58 01 00 04 00 10 00 64 61 74 61 00 08 00 00 00 00 00 00 
24 17 1e f3 3c 13 3c 14 16 f9 18 f9 34 e7 23 a6 3c f2 24 f2 11 ce 1a 0d 

Here is the interpretation of these bytes as a WAVE soundfile:
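
As a rough sketch, the first 44 bytes of the dump can be decoded field by field using the offsets in the table above. The type WavHeader and the function parse_canonical_wav are illustrative names, not part of any library, and the code handles only the canonical PCM layout described here (a 16-byte "fmt " subchunk immediately followed by "data").

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Fields of the 44-byte canonical header described above. */
typedef struct {
    uint32_t chunk_size;
    uint16_t audio_format;
    uint16_t num_channels;
    uint32_t sample_rate;
    uint32_t byte_rate;
    uint16_t block_align;
    uint16_t bits_per_sample;
    uint32_t data_size;
} WavHeader;

/* WAVE fields are little-endian: least significant byte first. */
static uint16_t rd_le16(const unsigned char *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t rd_le32(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Returns 0 on success, -1 if the buffer is not a canonical PCM header. */
int parse_canonical_wav(const unsigned char *b, size_t len, WavHeader *h)
{
    assert(b != NULL && h != NULL);
    if (len < 44)
        return -1;
    if (memcmp(b, "RIFF", 4) != 0 || memcmp(b + 8, "WAVE", 4) != 0)
        return -1;
    if (memcmp(b + 12, "fmt ", 4) != 0 || rd_le32(b + 16) != 16)
        return -1;
    if (memcmp(b + 36, "data", 4) != 0)
        return -1;

    h->chunk_size      = rd_le32(b + 4);
    h->audio_format    = rd_le16(b + 20);
    h->num_channels    = rd_le16(b + 22);
    h->sample_rate     = rd_le32(b + 24);
    h->byte_rate       = rd_le32(b + 28);
    h->block_align     = rd_le16(b + 32);
    h->bits_per_sample = rd_le16(b + 34);
    h->data_size       = rd_le32(b + 40);
    return 0;
}

/* The first 44 bytes of the hex dump above. */
const unsigned char example_header[44] = {
    0x52, 0x49, 0x46, 0x46, 0x24, 0x08, 0x00, 0x00, 0x57, 0x41, 0x56, 0x45,
    0x66, 0x6d, 0x74, 0x20, 0x10, 0x00, 0x00, 0x00, 0x01, 0x00, 0x02, 0x00,
    0x22, 0x56, 0x00, 0x00, 0x88, 0x58, 0x01, 0x00, 0x04, 0x00, 0x10, 0x00,
    0x64, 0x61, 0x74, 0x61, 0x00, 0x08, 0x00, 0x00
};
```

Applied to example_header, this yields ChunkSize 2084 (36 + 2048), AudioFormat 1 (PCM), 2 channels, a sample rate of 22050 Hz, a byte rate of 88200, BlockAlign 4, 16 bits per sample, and a data size of 2048 bytes. The remaining bytes of the dump are interleaved left and right 16-bit samples.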

Notes:

  • The default byte ordering assumed for WAVE data files is little-endian. Files written using the big-endian byte ordering scheme have the identifier RIFX instead of RIFF.
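  • The sample data must end on an even byte boundary: RIFF chunks are word-aligned, so a chunk whose data has an odd number of bytes is followed by one pad byte that is not counted in its size field.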
  • 8-bit samples are stored as unsigned bytes, ranging from 0 to 255. 16-bit samples are stored as 2's-complement signed integers, ranging from -32768 to 32767.
  • There may be additional subchunks in a WAVE data stream. If so, each will have a char[4] SubChunkID, an unsigned long SubChunkSize, and SubChunkSize bytes of data.
  • RIFF stands for Resource Interchange File Format.
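
Since extra subchunks may appear, a reader should not rely on "data" sitting at offset 36. One possible approach, sketched below, is to walk the subchunks by ID and size, skipping the pad byte after any odd-sized chunk. The function name riff_find_chunk is hypothetical, and the sketch assumes the whole little-endian RIFF file is already in memory.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

static uint32_t le32(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/*
 * Hypothetical helper: search the subchunks of a RIFF/WAVE buffer for
 * the 4-character `id`. On success returns a pointer to the chunk's
 * data and stores its size in *size. Returns NULL if the chunk is not
 * found or the buffer is malformed.
 */
const unsigned char *riff_find_chunk(const unsigned char *buf, size_t len,
                                     const char *id, uint32_t *size)
{
    size_t pos = 12;   /* skip "RIFF", ChunkSize, and "WAVE" */

    assert(buf != NULL && id != NULL && size != NULL);
    if (len < 12 || memcmp(buf, "RIFF", 4) != 0 ||
        memcmp(buf + 8, "WAVE", 4) != 0)
        return NULL;

    while (len - pos >= 8) {
        uint32_t ck_size = le32(buf + pos + 4);

        if (ck_size > len - pos - 8)
            return NULL;               /* chunk runs past the buffer */
        if (memcmp(buf + pos, id, 4) == 0) {
            *size = ck_size;
            return buf + pos + 8;
        }
        pos += 8 + (size_t)ck_size;
        if ((ck_size & 1u) && pos < len)
            pos++;                     /* skip the uncounted pad byte */
    }
    return NULL;
}
```

A caller would typically look up "fmt " first, then "data", and ignore anything else it does not recognize.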

General discussion of RIFF files:

Multimedia applications require the storage and management of a wide variety of data, including bitmaps, audio data, video data, and peripheral device control information. RIFF provides a way to store all these varied types of data. The type of data a RIFF file contains is indicated by the file extension. Examples of data that may be stored in RIFF files are:
  • Audio/visual interleaved data (.AVI)
  • Waveform data (.WAV)
  • Bitmapped data (.RDI)
  • MIDI information (.RMI)
  • Color palette (.PAL)
  • Multimedia movie (.RMN)
  • Animated cursor (.ANI)
  • A bundle of other RIFF files (.BND)
NOTE: At this point, AVI files are the only type of RIFF files that have been fully implemented using the current RIFF specification. Although WAV files have been implemented, these files are very simple, and their developers typically use an older specification in constructing them.

For more info see

http://www.ora.com/centers/gff/formats/micriff/index.htm

References:

  1. http://www.ora.com/centers/gff/formats/micriff/index.htm (good).
  2. http://premium.microsoft.com/msdn/library/tools/dnmult/d1/newwave.htm
  3. http://www.lightlink.com/tjweber/StripWav/WAVE.html


RIFF WAVE (.WAV) file format

          Waveform Audio File Format (WAVE)

               This section describes the Waveform format, which is used to
               represent digitized sound.

               The WAVE form is defined as follows. Programs must expect
               (and ignore) any unknown chunks encountered, as with all
               RIFF forms. However, 〈fmt-ck〉 must always occur before
               〈wave-data〉, and both of these chunks are mandatory in a
               WAVE file.

                〈WAVE-form〉 -〉
                      RIFF( 'WAVE'
                           〈fmt-ck〉                    // Format
                           [〈fact-ck〉]                 // Fact chunk
                           [〈cue-ck〉]                  // Cue points
                           [〈playlist-ck〉]             // Playlist
                           [〈assoc-data-list〉]         // Associated data list
                           〈wave-data〉   )             // Wave data

               The WAVE chunks are described in the following sections.

          WAVE Format Chunk

               The WAVE format chunk 〈fmt-ck〉 specifies the format of the
               〈wave-data〉. The 〈fmt-ck〉 is defined as follows:

                〈fmt-ck〉 -〉   fmt( 〈common-fields〉
                                 〈format-specific-fields〉 )

                〈common-fields〉 -〉
                      struct
                      {
                         WORD  wFormatTag;         // Format category
                         WORD  wChannels;          // Number of channels
                         DWORD dwSamplesPerSec;    // Sampling rate
                         DWORD dwAvgBytesPerSec;   // For buffer estimation
                         WORD  wBlockAlign;        // Data block size
                      }

               The fields in the 〈common-fields〉 chunk are as follows:

               Field          Description

               wFormatTag     A number indicating the WAVE format
                              category of the file. The content of
                              the 〈format-specific-fields〉 portion
                              of the `fmt' chunk, and the
                              interpretation of the waveform data,
                              depend on this value.

                              You must register any new WAVE format
                              categories. See ``Registering
                              Multimedia Formats'' in Chapter 1,
                              ``Overview of Multimedia
                              Specifications,'' for information on
                              registering WAVE format categories.

                              ``Wave Format Categories,'' following
                              this section, lists the currently
                              defined WAVE format categories.

               wChannels      The number of channels represented in
                              the waveform data, such as 1 for mono
                              or 2 for stereo.

               dwSamplesPerSec
                              The sampling rate (in samples per
                              second) at which each channel should
                              be played.

               dwAvgBytesPerSec
                              The average number of bytes per second
                              at which the waveform data should be
                              transferred. Playback software can
                              estimate the buffer size using this
                              value.

               wBlockAlign    The block alignment (in bytes) of the
                              waveform data. Playback software needs
                              to process a multiple of wBlockAlign
                              bytes of data at a time, so the value
                              of wBlockAlign can be used for buffer
                              alignment.

               The 〈format-specific-fields〉 consists of zero or more bytes
               of parameters. Which parameters occur depends on the WAVE
               format category; see the following section for details.
               Playback software should be written to allow for (and
               ignore) any unknown 〈format-specific-fields〉 parameters that
               occur at the end of this field.

          WAVE Format Categories

               The format category of a WAVE file is specified by the value
               of the wFormatTag field of the `fmt' chunk. The
               representation of data in 〈wave-data〉, and the content of
               the 〈format-specific-fields〉 of the `fmt' chunk, depend on
               the format category.

               The currently defined open non-proprietary WAVE format
               categories are as follows:

               wFormatTag Value         Format Category

               WAVE_FORMAT_PCM (0x0001) Microsoft Pulse Code
                                        Modulation (PCM) format

               The following are the registered proprietary WAVE format
               categories:

               wFormatTag Value         Format Category

               IBM_FORMAT_MULAW         IBM mu-law format
               (0x0101)

               IBM_FORMAT_ALAW (0x0102) IBM a-law format

               IBM_FORMAT_ADPCM         IBM AVC Adaptive
               (0x0103)                 Differential Pulse Code
                                        Modulation format

               The following sections describe the Microsoft
               WAVE_FORMAT_PCM format.

               Pulse Code Modulation (PCM) Format

               If the wFormatTag field of the 〈fmt-ck〉 is set to
               WAVE_FORMAT_PCM, then the waveform data consists of samples
               represented in pulse code modulation (PCM) format. For PCM
               waveform data, the 〈format-specific-fields〉 is defined as
               follows:

                〈PCM-format-specific〉 -〉
                      struct
                      {
                         WORD wBitsPerSample;      // Sample size
                      }

               The wBitsPerSample field specifies the number of bits of
               data used to represent each sample of each channel. If there
               are multiple channels, the sample size is the same for each
               channel.

               For PCM data, the dwAvgBytesPerSec field of the `fmt' chunk
               should be equal to the following formula rounded up to the
               next whole number:

                                               wBitsPerSample
                 wChannels x dwSamplesPerSec x --------------
                                                      8

               The wBlockAlign field should be equal to the following
               formula, rounded up to the next whole number:

                             wBitsPerSample
                 wChannels x --------------
                                    8
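
The two formulas can be sketched in C as follows. The function names are illustrative only. The sketch makes one assumption beyond the formulas as written: it rounds each sample up to a whole number of bytes before multiplying, which follows the rule that a sample is stored in the smallest number of bytes that can hold it, and which reproduces the fmt() values in the PCM examples later in this section (for instance 3 and 132300 for 20-bit mono at 44.1 kHz).

```c
#include <assert.h>
#include <stdint.h>

/* Bytes needed to hold one sample of one channel (bits rounded up). */
static uint32_t bytes_per_sample(uint16_t bits_per_sample)
{
    return ((uint32_t)bits_per_sample + 7u) / 8u;
}

/* wBlockAlign: bytes occupied by one sample across all channels. */
uint16_t pcm_block_align(uint16_t channels, uint16_t bits_per_sample)
{
    assert(channels > 0 && bits_per_sample > 0);
    return (uint16_t)(channels * bytes_per_sample(bits_per_sample));
}

/* dwAvgBytesPerSec: data rate implied by the sampling rate. */
uint32_t pcm_avg_bytes_per_sec(uint16_t channels, uint32_t samples_per_sec,
                               uint16_t bits_per_sample)
{
    return samples_per_sec * pcm_block_align(channels, bits_per_sample);
}
```

For 16-bit stereo at 22050 Hz this gives a block alignment of 4 and 88200 bytes per second.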

               Data Packing for PCM WAVE Files

               In a single-channel WAVE file, samples are stored
               consecutively. For stereo WAVE files, channel 0 represents
               the left channel, and channel 1 represents the right
               channel. The speaker position mapping for more than two
               channels is currently undefined. In multiple-channel WAVE
               files, samples are interleaved.

               The following diagrams show the data packing for 8-bit
               mono and stereo WAVE files:

                     Sample 1     Sample 2     Sample 3    Sample 4

                     Channel 0    Channel 0   Channel 0    Channel 0

                             Data Packing for 8-Bit Mono PCM

                            Sample 1                 Sample 2

                     Channel 0    Channel 1   Channel 0    Channel 1
                      (left)       (right)      (left)      (right)

                            Data Packing for 8-Bit Stereo PCM

               The following diagrams show the data packing for 16-bit mono
               and stereo WAVE files:

                            Sample 1                 Sample 2

                     Channel 0    Channel 0   Channel 0    Channel 0

                     low-order   high-order   low-order   high-order
                       byte         byte         byte        byte

                             Data Packing for 16-Bit Mono PCM

                                        Sample 1

                     Channel 0    Channel 0   Channel 1    Channel 1
                      (left)       (left)      (right)      (right)
                     low-order   high-order   low-order   high-order
                       byte         byte         byte        byte

                            Data Packing for 16-Bit Stereo PCM
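
The 16-bit stereo layout above can be unpacked with a short loop. This is a minimal sketch with invented function names; it assumes the `data' chunk contents are already in memory and hold a whole number of 4-byte stereo samples.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decode one little-endian 16-bit two's-complement sample. */
static int16_t le16_sample(const unsigned char *p)
{
    unsigned u = (unsigned)p[0] | ((unsigned)p[1] << 8);

    /* Map 0x8000..0xFFFF onto -32768..-1 without relying on
       implementation-defined integer conversion. */
    return (int16_t)(u >= 0x8000u ? (long)u - 0x10000L : (long)u);
}

/*
 * Split interleaved 16-bit stereo PCM into separate channel buffers.
 * Each stereo sample is 4 bytes: left low, left high, right low,
 * right high. `frames` is the number of such 4-byte groups.
 */
void deinterleave_stereo16(const unsigned char *data, size_t frames,
                           int16_t *left, int16_t *right)
{
    size_t i;

    assert(data != NULL && left != NULL && right != NULL);
    for (i = 0; i < frames; i++) {
        left[i]  = le16_sample(data + 4 * i);
        right[i] = le16_sample(data + 4 * i + 2);
    }
}
```

For the byte sequence 24 17 1e f3, the left sample is 0x1724 (5924) and the right sample is 0xf31e (-3298).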

               Data Format of the Samples

               Each sample is contained in an integer i. The size of i is
               the smallest number of bytes required to contain the
               specified sample size. The least significant byte is stored
               first. The bits that represent the sample amplitude are
               stored in the most significant bits of i, and the remaining
               bits are set to zero.

               For example, if the sample size (recorded in wBitsPerSample)
               is 12 bits, then each sample is stored in a two-byte
               integer. The least significant four bits of the first (least
               significant) byte are set to zero.
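
A small sketch of the 12-bit case may make the layout concrete. The function names pack_sample12 and unpack_sample12 are made up for this example; the code simply shifts the 12 amplitude bits into the top of a two-byte container and writes the low byte first.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Store a signed 12-bit sample (-2048..2047) in a two-byte container:
 * amplitude in the 12 most significant bits, low 4 bits zero,
 * least significant byte first.
 */
void pack_sample12(int sample, unsigned char out[2])
{
    unsigned v;

    assert(sample >= -2048 && sample <= 2047);
    v = ((unsigned)sample & 0x0FFFu) << 4;
    out[0] = (unsigned char)(v & 0xFFu);   /* low byte, low nibble is 0 */
    out[1] = (unsigned char)(v >> 8);      /* high byte */
}

/* Recover the 12-bit value from the two-byte container. */
int unpack_sample12(const unsigned char in[2])
{
    int32_t v = (int32_t)in[0] | ((int32_t)in[1] << 8);

    if (v >= 0x8000)
        v -= 0x10000;      /* interpret as 16-bit two's complement */
    return (int)(v / 16);  /* exact, because the low 4 bits are zero */
}
```

The largest 12-bit value, 2047, is stored as the bytes F0 7F, and -1 is stored as F0 FF.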

               The data format and maximum and minimum values for PCM
               waveform samples of various sizes are as follows:

               Sample Size  Data Format Maximum Value  Minimum Value

               One to       Unsigned    255 (0xFF)     0
               eight bits   integer

               Nine or      Signed      Largest        Most negative
               more bits    integer i   positive       value of i
                                        value of i

               For example, the maximum, minimum, and midpoint values for
               8-bit and 16-bit PCM waveform data are as follows:

               Format       Maximum     Minimum Value  Midpoint
                            Value                      Value

               8-bit PCM    255 (0xFF)  0              128 (0x80)

               16-bit PCM   32767       -32768         0
                            (0x7FFF)    (-0x8000)
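
The different midpoints matter when converting between sample sizes: 8-bit data is unsigned and centered on 128, while 16-bit data is signed and centered on 0. One simple way to widen 8-bit samples, shown here only as an illustration under that reading of the table (the function name is not from the specification), is to remove the offset and then scale by 256.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Convert an unsigned 8-bit PCM sample (midpoint 128) to a signed
 * 16-bit PCM sample (midpoint 0).
 */
int16_t pcm8_to_pcm16(uint8_t s)
{
    return (int16_t)(((int)s - 128) * 256);
}
```

With this mapping 0 becomes -32768, 128 becomes 0, and 255 becomes 32512; the positive end does not reach 32767 because the 8-bit range is not symmetric around its midpoint.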

               Examples of PCM WAVE Files

               Example of a PCM WAVE file with 11.025 kHz sampling rate,
               mono, 8 bits per sample:

                RIFF( 'WAVE'     fmt(1, 1, 11025, 11025, 1, 8)
                              data( 〈wave-data〉 ) )

               Example of a PCM WAVE file with 22.05 kHz sampling rate,
               stereo, 8 bits per sample:

                RIFF( 'WAVE'     fmt(1, 2, 22050, 44100, 2, 8)
                              data( 〈wave-data〉 ) )

               Example of a PCM WAVE file with 44.1 kHz sampling rate,
               mono, 20 bits per sample:

                RIFF( 'WAVE'     INFO(INAM("O Canada"Z))
                              fmt(1, 1, 44100, 132300, 3, 20)
                              data( 〈wave-data〉 ) )

          Storage of WAVE Data

               The 〈wave-data〉 contains the waveform data. It is defined as
               follows:

                〈wave-data〉 -〉   { 〈data-ck〉 : 〈wave-list〉 }

                〈data-ck〉  -〉    data( 〈wave-data〉 )

                〈wave-list〉 -〉   LIST( 'wavl' { 〈data-ck〉 :      // Wave samples
                                           〈silence-ck〉 }... )   // Silence

                〈silence-ck〉 -〉  slnt( 〈dwSamples:DWORD〉 )       // Count of silent samples

               Note:  The `slnt' chunk represents silence, not necessarily
               a repeated zero volume or baseline sample. In 16-bit PCM
               data, if the last sample value played before the silence
               section is 10000, then if data is still output to the D to
               A converter, it must maintain the 10000 value. If a zero
               value is used, a click may be heard at the start and end of
               the silence section. If play begins at a silence section,
               then a zero value might be used since no other information
               is available. A click might be created if the data following
               the silent section starts with a nonzero value.

          FACT Chunk

               The 〈fact-ck〉 fact chunk stores important information about
               the contents of the WAVE file. This chunk is defined as
               follows:

                〈fact-ck〉 -〉 fact( 〈dwFileSize:DWORD〉 )     // Number of samples

               The `fact' chunk is required if the waveform data is
               contained in a `wavl' LIST chunk and for all compressed
               audio formats. The chunk is not required for PCM files using
               the `data' chunk format.

               The `fact' chunk will be expanded to include any other
               information required by future WAVE formats. Added fields
               will appear following the 〈dwFileSize〉 field. Applications
               can use the chunk size field to determine which fields are
               present.

          Cue-Points Chunk

               The 〈cue-ck〉 cue-points chunk identifies a series of
               positions in the waveform data stream. The 〈cue-ck〉 is
               defined as follows:

                〈cue-ck〉 -〉   cue( 〈dwCuePoints:DWORD〉      // Count of cue points
                                   〈cue-point〉... )         // Cue-point table

                〈cue-point〉 -〉   struct {
                                 DWORD  dwName;
                                 DWORD  dwPosition;
                                 FOURCC fccChunk;
                                 DWORD  dwChunkStart;
                                 DWORD  dwBlockStart;
                                 DWORD  dwSampleOffset;
                              }

               The 〈cue-point〉 fields are as follows:

               Field          Description

               dwName         Specifies the cue point name. Each
                              〈cue-point〉 record must have a unique
                              dwName field.

               dwPosition     Specifies the sample position of the
                              cue point. This is the sequential
                              sample number within the play order.
                              See ``Playlist Chunk,'' later in this
                              document, for a discussion of the play
                              order.

               fccChunk       Specifies the name or chunk ID of the
                              chunk containing the cue point.

               dwChunkStart   Specifies the file position of the
                              start of the chunk containing the cue
                              point. This is a byte offset relative
                              to the start of the data section of
                              the `wavl' LIST chunk.

               dwBlockStart   Specifies the file position of the
                              start of the block containing the
                              position. This is a byte offset
                              relative to the start of the data
                              section of the `wavl' LIST chunk.

               dwSampleOffset Specifies the sample offset of the cue
                              point relative to the start of the
                              block.

               Examples of File Position Values

               The following table describes the 〈cue-point〉 field values
               for a WAVE file containing multiple `data' and `slnt' chunks
               enclosed in a `wavl' LIST chunk:

               Cue Point     Field         Value
               Location

               In a `slnt'   fccChunk      FOURCC value `slnt'.
               chunk

                             dwChunkStart  File position of the
                                           `slnt' chunk relative to
                                           the start of the data
                                           section in the `wavl' LIST
                                           chunk.

                             dwBlockStart  File position of the data
                                           section of the `slnt'
                                           chunk relative to the
                                           start of the data section
                                           of the `wavl' LIST chunk.

                             dwSampleOffset
                                           Sample position of the cue
                                           point relative to the
                                           start of the `slnt' chunk.

               In a PCM      fccChunk      FOURCC value `data'.
               `data' chunk

                             dwChunkStart  File position of the
                                           `data' chunk relative to
                                           the start of the data
                                           section in the `wavl' LIST
                                           chunk.

                             dwBlockStart  File position of the cue
                                           point relative to the
                                           start of the data section
                                           of the `wavl' LIST chunk.

                             dwSampleOffset
                                           Zero value.

               In a          fccChunk      FOURCC value `data'.
               compressed
               `data' chunk

                             dwChunkStart  File position of the start
                                           of the `data' chunk
                                           relative to the start of
                                           the data section of the
                                           `wavl' LIST chunk.

                             dwBlockStart  File position of the
                                           enclosing block relative
                                           to the start of the data
                                           section of the `wavl' LIST
                                           chunk. The software can
                                           begin the decompression at
                                           this point.

                             dwSampleOffset
                                           Sample position of the cue
                                           point relative to the
                                           start of the block.

               The following table describes the 〈cue-point〉 field values
               for a WAVE file containing a single `data' chunk:

               Cue Point     Field         Value
               Location

               Within PCM    fccChunk      FOURCC value `data'.
               data

                             dwChunkStart  Zero value.

                             dwBlockStart  Zero value.

                             dwSampleOffset
                                           Sample position of the cue
                                           point relative to the
                                           start of the `data' chunk.

               In a          fccChunk      FOURCC value `data'.
               compressed
               `data' chunk

                             dwChunkStart  Zero value.

                             dwBlockStart  File position of the
                                           enclosing block relative
                                           to the start of the `data'
                                           chunk. The software can
                                           begin the decompression at
                                           this point.

                             dwSampleOffset
                                           Sample position of the cue
                                           point relative to the
                                           start of the block.
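
As an illustration of the single `data' chunk case, the sketch below decodes one 24-byte 〈cue-point〉 record and converts its sample offset to a byte offset within the PCM data. The struct and function names are chosen for this example. The byte-offset helper rests on an assumption that the specification text does not spell out: that dwSampleOffset counts samples in the sense of wBlockAlign-sized units (one value for every channel).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* In-memory form of one cue-point record (six 4-byte fields). */
typedef struct {
    uint32_t dwName;
    uint32_t dwPosition;
    char     fccChunk[4];
    uint32_t dwChunkStart;
    uint32_t dwBlockStart;
    uint32_t dwSampleOffset;
} CuePoint;

static uint32_t cue_le32(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Decode one 24-byte little-endian cue-point record. */
void parse_cue_point(const unsigned char rec[24], CuePoint *cp)
{
    assert(rec != NULL && cp != NULL);
    cp->dwName         = cue_le32(rec);
    cp->dwPosition     = cue_le32(rec + 4);
    memcpy(cp->fccChunk, rec + 8, 4);
    cp->dwChunkStart   = cue_le32(rec + 12);
    cp->dwBlockStart   = cue_le32(rec + 16);
    cp->dwSampleOffset = cue_le32(rec + 20);
}

/*
 * ASSUMPTION: single PCM `data' chunk, and dwSampleOffset counts
 * wBlockAlign-sized units. Returns the byte offset of the cue point
 * from the start of the chunk's data.
 */
uint64_t cue_byte_offset_pcm(const CuePoint *cp, uint16_t block_align)
{
    assert(cp != NULL);
    return (uint64_t)cp->dwSampleOffset * block_align;
}
```

For compressed data the byte position cannot be computed this way; dwBlockStart gives the point where decompression can begin, as the tables above describe.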

          Playlist Chunk

               The 〈playlist-ck〉 playlist chunk specifies a play order for
               a series of cue points. The 〈playlist-ck〉 is defined as
               follows:

                〈playlist-ck〉 -〉   plst(
                                 〈dwSegments:DWORD〉    // Count of play segments
                                 〈play-segment〉... )   // Play-segment table

                〈play-segment〉 -〉  struct {
                                   DWORD dwName;
                                   DWORD dwLength;
                                   DWORD dwLoops;
                                 }

               The 〈play-segment〉 fields are as follows:

               Field          Description

               dwName         Specifies the cue point name. This
                              value must match one of the names
                              listed in the 〈cue-ck〉 cue-point
                              table.

               dwLength       Specifies the length of the section in
                              samples.

               dwLoops        Specifies the number of times to play
                              the section.
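
One use of the play-segment table is to work out how long the whole playlist runs. The sketch below adds up dwLength x dwLoops over all segments; the PlaySegment type mirrors the struct above, and the function name is invented for this example. It assumes the segments have already been read from the file and converted to host byte order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* In-memory form of one play-segment record. */
typedef struct {
    uint32_t dwName;    /* cue point name */
    uint32_t dwLength;  /* section length in samples */
    uint32_t dwLoops;   /* number of times to play the section */
} PlaySegment;

/* Total number of samples played when the whole playlist is run. */
uint64_t playlist_total_samples(const PlaySegment *segs, size_t count)
{
    uint64_t total = 0;
    size_t i;

    assert(segs != NULL || count == 0);
    for (i = 0; i < count; i++)
        total += (uint64_t)segs[i].dwLength * segs[i].dwLoops;
    return total;
}
```

Dividing the result by dwSamplesPerSec gives the playing time in seconds.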

          Associated Data Chunk

               The 〈assoc-data-list〉 associated data list provides the
               ability to attach information like labels to sections of the
               waveform data stream. The 〈assoc-data-list〉 is defined as
               follows:

                〈assoc-data-list〉 -〉  LIST('adtl'
                                        〈labl-ck〉                // Label
                                        〈note-ck〉                // Note
                                        〈ltxt-ck〉                // Text with data length
                                        〈file-ck〉 )              // Media file

                〈labl-ck〉 -〉       labl(〈dwName:DWORD〉
                                        〈data:ZSTR〉 )

                〈note-ck〉 -〉       note(〈dwName:DWORD〉
                                        〈data:ZSTR〉 )

                〈ltxt-ck〉 -〉       ltxt(〈dwName:DWORD〉
                                        〈dwSampleLength:DWORD〉
                                        〈dwPurpose:DWORD〉
                                        〈wCountry:WORD〉
                                        〈wLanguage:WORD〉
                                        〈wDialect:WORD〉
                                        〈wCodePage:WORD〉
                                        〈data:BYTE〉... )

                〈file-ck〉 -〉       file(〈dwName:DWORD〉
                                        〈dwMedType:DWORD〉
                                        〈fileData:BYTE〉...)

               Label and Note Information

               The `labl' and `note' chunks have similar fields. The `labl'
               chunk contains a label, or title, to associate with a cue
               point. The `note' chunk contains comment text for a cue
               point. The fields are as follows:

               Field          Description

               dwName         Specifies the cue point name.  This
                              value must match one of the names
                              listed in the 〈cue-ck〉 cue-point
                              table.

               data           Specifies a NULL-terminated string
                              containing a text label (for the
                              `labl' chunk) or comment text (for the
                              `note' chunk).

               Text with Data Length Information

               The `ltxt' chunk contains text that is associated with a
               data segment of specific length. The chunk fields are as
               follows:

               Field          Description

               dwName         Specifies the cue point name.  This
                              value must match one of the names
                              listed in the 〈cue-ck〉 cue-point
                              table.

               dwSampleLength Specifies the number of samples in the
                              segment of waveform data.

               dwPurpose      Specifies the type or purpose of the
                              text. For example, dwPurpose can
                              specify a FOURCC code like `scrp' for
                              script text or `capt' for
                              closed-caption text.

               wCountry       Specifies the country code for the
                              text. See ``Country Codes'' in Chapter
                              2, ``Resource Interchange File
                              Format,'' for a current list of
                              country codes.

               wLanguage,     Specify the language and dialect codes
               wDialect       for the text. See ``Language and
                              Dialect Codes'' in Chapter 2,
                              ``Resource Interchange File Format,''
                              for a current list of language and
                              dialect codes.

               wCodePage      Specifies the code page for the text.
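The fixed part of an `ltxt' chunk can be unpacked in one step. The Python sketch below assumes the fields are stored in the order listed above (three DWORDs followed by four WORDs, all little-endian) and that the text bytes follow immediately; the function name and the dictionary layout are illustrative only.

```python
import struct

# dwName, dwSampleLength, dwPurpose (kept as 4 raw bytes), then
# wCountry, wLanguage, wDialect, wCodePage: 20 bytes in total.
LTXT_HEADER = struct.Struct("<II4sHHHH")

def parse_ltxt_chunk(body: bytes):
    """Unpack the body of an `ltxt' chunk (chunk ID and size already removed)."""
    (name, sample_length, purpose,
     country, language, dialect, code_page) = LTXT_HEADER.unpack_from(body, 0)
    return {
        "dwName": name,
        "dwSampleLength": sample_length,
        # dwPurpose is a FOURCC such as `scrp' or `capt'.
        "dwPurpose": purpose.decode("ascii", errors="replace"),
        "wCountry": country,
        "wLanguage": language,
        "wDialect": dialect,
        "wCodePage": code_page,
        # The text itself; its encoding depends on wCodePage, so it is
        # left as raw bytes here.
        "data": body[LTXT_HEADER.size:],
    }

sample = struct.pack("<II4sHHHH", 3, 44100, b"scrp", 1, 9, 1, 437) + b"Hello\x00"
print(parse_ltxt_chunk(sample))
```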

               Embedded File Information

               The `file' chunk contains information described in other
               file formats (for example, an `RDIB' file or an ASCII text
               file). The chunk fields are as follows:

               Field          Description

               dwName         Specifies the cue point name.  This
                              value must match one of the names
                              listed in the 〈cue-ck〉 cue-point
                              table.

               dwMedType      Specifies the file type contained in
                              the fileData field. If the fileData
                              section contains a RIFF form, the
                              dwMedType field is the same as the
                              RIFF form type for the file.

                              This field can contain a zero value.

               fileData       Contains the media file.


MPEG Facts and Info

Why MPEG?
We chose to distribute in MPEG format because of its superior compression scheme and its hi-fi nature. We feel that MPEG is the future, and we want to be a part of it now!

  1. What is MPEG?
  2. How does MPEG-1 AUDIO work?
  3. How good is MPEG-1 AUDIO compression?
  4. How does MPEG-1 AUDIO achieve this compression ratio?
  5. Explain the masking effect
  6. Who is using MPEG-1 AUDIO?
  7. Which sampling frequencies are used?
  8. How many audio channels?
  9. Where can I get more details about MPEG audio?

What is MPEG?

MPEG stands for Moving Picture Experts Group. MPEG is a group of people that meets under ISO (the International Organization for Standardization) to generate standards for digital video (sequences of images in time) and audio compression. In particular, they define a compressed bit stream, which implicitly defines a decompressor. However, the compression algorithms are up to the individual manufacturers, and that is where proprietary advantage is obtained within the scope of a publicly available international standard. MPEG meets roughly four times a year for roughly a week each time. In between meetings, a great deal of work is done by the members, so it doesn't all happen at the meetings. The work is organized and planned at the meetings.

How does MPEG-1 AUDIO work?

Well, first you need to know how sound is stored in a computer. Sound is pressure differences in air. When picked up by a microphone and fed through an amplifier this becomes voltage levels. The voltage is sampled by the computer a number of times per second. For CD-audio quality you need to sample 44100 times per second and each sample has a resolution of 16 bits. In stereo this gives you 1.4 Mbit per second and you can probably see the need for compression.
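The 1.4 Mbit per second figure is simply the product of sampling rate, sample size and channel count. A minimal Python check (the function name is made up for this example):

```python
def pcm_bit_rate(sample_rate_hz, bits_per_sample, channels):
    """Bit rate of uncompressed PCM audio in bits per second."""
    return sample_rate_hz * bits_per_sample * channels

cd_rate = pcm_bit_rate(44100, 16, 2)
print(cd_rate)              # 1411200
print(cd_rate / 1_000_000)  # about 1.4 Mbit per second
```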

To compress audio, MPEG tries to remove the irrelevant parts and the redundant parts of the signal. Parts of the sound that we do not hear can be thrown away. To do this, MPEG Audio uses psychoacoustic principles.

How good is MPEG-1 AUDIO compression?

MPEG can compress to a bitstream of 32 kbit/s to 384 kbit/s (Layer II). A raw PCM audio bitstream is about 705 kbit/s per channel (44.1 kHz at 16 bits), so this gives a maximum compression ratio of about 1:22. A normal compression ratio is more like 1:6 or 1:7. If you think that this is not much, please remember that, unlike video, we are talking about no perceivable quality loss here. 96 kbit/s is considered transparent for most practical purposes. This means that you will not notice any difference between the original and the compressed signal for rock 'n' roll or popular music. For more demanding stuff like piano concerts and such you will need to go up to 128 kbit/s.
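The ratios quoted above can be reproduced with a few lines of Python. This sketch assumes the raw rate is the 705.6 kbit/s of one 44.1 kHz, 16-bit channel and that the coded rates are per channel as well; the names are illustrative.

```python
RAW_KBIT_S = 44100 * 16 / 1000  # 705.6 kbit/s for one 16-bit channel

def compression_ratio(coded_kbit_s):
    """How many times smaller the coded stream is than raw PCM."""
    return RAW_KBIT_S / coded_kbit_s

for rate in (32, 96, 128):
    print(f"{rate:3d} kbit/s -> 1:{compression_ratio(rate):.1f}")
```

At 32 kbit/s the ratio comes out near 1:22, while 96 to 128 kbit/s gives roughly the 1:6 to 1:7 range mentioned as typical.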

How does MPEG-1 AUDIO achieve this compression ratio?

Well, with audio you basically have two alternatives. Either you sample less often or you sample with less resolution (less than 16 bits per sample). If you want quality you can't do much with the sample frequency. Humans can hear sounds with frequencies from about 20 Hz to 20 kHz. According to the Nyquist theorem you must sample at a rate of at least twice the highest frequency you want to reproduce. Allowing for imperfect filters, a 44.1 kHz sampling rate is a fair minimum. So you either set out to prove the Nyquist theorem is wrong or go to work on reducing the resolution. The MPEG committee chose the latter.

Now, the real reason for using 16 bits is to get a good signal-to-noise (s/n) ratio. The noise we're talking about here is quantization noise from the digitizing process. For each bit you add, you get 6 dB better s/n. (6 dB corresponds to a doubling of the signal amplitude.) CD-audio achieves about 90 dB s/n. This matches the dynamic range of the ear fairly well. That is, you will not hear any noise coming from the system itself (well, there are still some people arguing about that, but let's not worry about them for the moment). So what happens when you sample to 8-bit resolution? You get a very noticeable noise floor in your recording. You can easily hear this in silent moments in the music or between words or sentences if your recording is a human voice. Waitaminnit. You don't notice any noise in loud passages, right? This is the masking effect and is the key to MPEG Audio coding. Stuff like the masking effect belongs to a science called psychoacoustics that deals with the way the human brain perceives sound. And MPEG uses psychoacoustic principles when it does its thing.
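The "6 dB per bit" rule can be checked numerically. The Python sketch below quantizes a full-scale sine wave to a given word length and measures the resulting signal-to-noise ratio. The test frequency, the sample count and the simple rounding quantizer are all choices made for this illustration, not anything prescribed by MPEG.

```python
import math

def quantization_snr_db(bits, n_samples=10000, cycles_per_sample=0.01234):
    """Measured s/n in dB of a full-scale sine quantized to `bits` bits."""
    top = 2 ** (bits - 1) - 1  # largest positive integer code
    signal_power = 0.0
    noise_power = 0.0
    for i in range(n_samples):
        x = math.sin(2 * math.pi * cycles_per_sample * i)
        q = round(x * top) / top  # quantize, then scale back to +/-1
        signal_power += x * x
        noise_power += (x - q) ** 2
    return 10 * math.log10(signal_power / noise_power)

snr8 = quantization_snr_db(8)
snr16 = quantization_snr_db(16)
print(f"8 bit:  {snr8:.1f} dB")
print(f"16 bit: {snr16:.1f} dB")
print(f"gain per added bit: {(snr16 - snr8) / 8:.2f} dB")
```

The measured values for a pure sine come out a little above 6 dB times the number of bits; real recordings rarely use the full scale, which is consistent with the more conservative 90 dB figure for CD audio.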

Explain the masking effect

Say you have a strong tone with a frequency of 1000 Hz. You also have a tone nearby of, say, 1100 Hz. This second tone is 18 dB lower. You are not going to hear this second tone. It is completely masked by the first 1000 Hz tone. As a matter of fact, any relatively weak sound near a strong sound is masked. If you introduce another tone at 2000 Hz, also 18 dB below the first 1000 Hz tone, you will hear this. You will have to turn down the 2000 Hz tone to something like 45 dB below the 1000 Hz tone before it will be masked by the first tone. So the further you get from a sound, the less masking effect it has. The masking effect means that you can raise the noise floor around a strong sound because the noise will be masked anyway. And raising the noise floor is the same as using fewer bits, and using fewer bits is the same as compression.

Let's now try to explain how the MPEG Audio coder goes about its thing. It divides the frequency spectrum (20 Hz to 20 kHz) into 32 sub-bands. Each sub-band holds a little slice of the audio spectrum. Say, in the upper region of sub-band 8, a 1000 Hz tone with a level of 60 dB is present. OK, the coder calculates the masking effect of this sound and finds that there is a masking threshold for the entire 8th sub-band (all sounds with a frequency...) 35 dB below this tone. The acceptable s/n ratio is thus 60 - 35 = 25 dB. This equals about 4-bit resolution. In addition there are masking effects on bands 9-13 and on bands 5-7, the effect decreasing with the distance from band 8. In a real-life situation you have sounds in most bands and the masking effects are additive. In addition the coder considers the sensitivity of the ear for various frequencies. The ear is a lot less sensitive in the high and low frequencies. Peak sensitivity is around 2-4 kHz, the same region that the human voice occupies.
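The step from an acceptable s/n ratio to a word length uses the 6 dB per bit rule. A small Python sketch of that arithmetic, with a hypothetical function name and the 60 dB and 35 dB figures taken from the example above:

```python
DB_PER_BIT = 6.0  # rule of thumb: each bit adds about 6 dB of s/n

def bits_for_snr(snr_db):
    """Word length (possibly fractional) needed for a given s/n ratio."""
    return snr_db / DB_PER_BIT

tone_level_db = 60
threshold_below_tone_db = 35
needed_snr_db = tone_level_db - threshold_below_tone_db  # 25 dB

print(needed_snr_db)                       # 25
print(round(bits_for_snr(needed_snr_db)))  # 4
```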

The sub-bands should match the ear, that is, each sub-band should consist of frequencies that have the same psychoacoustic properties. In MPEG Layer II, each sub-band is 625 Hz wide. It would have been better if the sub-bands were narrower in the low frequency range and wider in the high frequency range. To do this you need complex filters. To keep the filters simple they chose to add an FFT in parallel with the filtering and use the spectral components from the FFT as additional information to the coder. This way you get higher resolution in the low frequencies where the ear is more sensitive.

But there is more to it. We have explained concurrent masking, but the masking effect also occurs before and after a strong sound (pre- and postmasking) if there is a significant (30-40 dB) shift in level. The reason is believed to be that the brain needs some processing time. Premasking lasts only about 2 to 5 ms. Postmasking can last up to 100 ms. Other bit-reduction techniques involve considering tonal and non-tonal components of the sound. For a stereo signal you have a lot of redundancy between channels. The last step before formatting is Huffman coding.

The coder calculates masking effects by an iterative process until it runs out of time. It is up to the implementor to spend bits in the least obtrusive fashion. For Layer II the coder works on about 24 ms of sound (1152 samples at 48 kHz) at a time. For some material this time window can be a problem. This is normally in a situation with transients where there are large differences in sound level within the window. The masking is calculated on the strongest sound and the weak parts will drown in quantization noise. This is perceived as a noise-echo by the ear. Layer III addresses this problem specifically.
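The length of the time window follows from the frame size of 1152 samples and the sampling rate, so the millisecond figure is approximate and depends on the rate in use. A short Python sketch (names chosen for this example):

```python
SAMPLES_PER_FRAME = 1152  # Layer II frame size quoted above

def frame_duration_ms(sample_rate_hz):
    """Duration of one 1152-sample frame in milliseconds."""
    return 1000.0 * SAMPLES_PER_FRAME / sample_rate_hz

for rate in (48000, 44100, 32000):
    print(f"{rate} Hz: {frame_duration_ms(rate):.1f} ms")
```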

Who is using MPEG-1 AUDIO?

Philips uses MPEG for their new digital Video CDs. They say they will start shipping movies and music videos on CDs for their CD-I player by the end of this year. MPEG is accepted by Eureka-147. That means that when digital radio broadcasts start in Europe a couple of years from now, you will receive MPEG coded audio.

The IUMA (Internet Underground Music Archive) holds many audio clips in MPEG compressed format, but you might need to configure your WWW browser. IUMA was founded to provide a worldwide audience to otherwise obscure and unavailable bands and artists.

Which sampling frequencies are used?

You can have 48 kHz (used in professional sound equipment), 44.1 kHz (used in consumer equipment like CD-audio) or 32 kHz (used in some communications equipment).

How many audio channels?

MPEG-1 allows for two audio channels. These can be either single (mono), dual (two mono channels), stereo or joint stereo (intensity stereo or m/s stereo). In normal (l/r) stereo one channel carries the left audio signal and one channel carries the right audio signal. In m/s stereo one channel carries the sum signal (l+r) and the other the difference (l-r) signal. In intensity stereo the high frequency part of the signal (above 2 kHz) is combined. The stereo image is preserved but only the temporal envelope is transmitted. In addition, MPEG allows for pre-emphasis, copyright marks and original/copy marks. MPEG-2 allows for several channels in the same stream.
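The sum/difference idea behind m/s stereo is easy to show on plain sample lists. The Python sketch below uses the unscaled l+r and l-r form described above; actual encoders may apply a scaling factor, which is not modelled here, and the function names are illustrative.

```python
def ms_encode(left, right):
    """Turn left/right sample lists into sum (mid) and difference (side)."""
    mid = [l + r for l, r in zip(left, right)]
    side = [l - r for l, r in zip(left, right)]
    return mid, side

def ms_decode(mid, side):
    """Recover left/right from sum and difference signals."""
    left = [(m + s) / 2 for m, s in zip(mid, side)]
    right = [(m - s) / 2 for m, s in zip(mid, side)]
    return left, right

left = [1, 2, 3]
right = [1, 0, -3]
mid, side = ms_encode(left, right)
print(mid, side)
print(ms_decode(mid, side))
```

When the two channels are similar, the difference signal stays small, which is where the redundancy between channels can be exploited.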

Where can I get more details about MPEG audio?

There is no description of the coder in the specs. The specs describe in great detail the bitstream and suggest psychoacoustic models.

A good summary of MPEG-1 audio is: "ISO-MPEG-1 Audio: A generic standard for coding of high-quality digital audio," J. Audio Eng. Soc. 42(10):780-792, October 1994.


[1]  General Information
[1.0] What is an "MP3"?
[1.1] What newsgroups does this FAQ apply to?
[1.2] Dividing the groups into genres would be a good idea. How come there aren't groups like a.b.s.m.jazz, or a.b.s.m.metal?
[1.3] What are these groups all about?
[1.4] What about the other MP3 groups that I see?  Does this FAQ apply to them too?
[1.5] Anything else I should know about this FAQ before I continue on?
 
[2] Requesting MP3s
[2.0] I really want a song to get posted.  How do I request it?
[2.1] I've come up with about 100 songs that I want.  I guess I should post a separate request for each one, right?
[2.2] So how do I get ALL the songs that I want?
[2.3] I want to make sure that people see my requests, so I'm going to post them five times each.  People will notice me then, right?
[2.4] I posted my requests and nobody filled them. Why?  And what can I do about it?
[2.5] I know how to make my requests now, but I can't find alt.binaries.sounds.mp3.requests. How am I supposed to post to the "requests" group if it doesn't exist?
[2.6] How can I confirm that my news server carries the requests group?
[2.7] The requests group isn't on my news server! I TOLD you that it doesn't exist!  Now what do I do?
[2.8] I'm trying to remain anonymous, but when I signed up for dejanews they needed to know my e-mail address.  So when I post a request won't people be able to find me?
[2.9] If I get a new e-mail address, then people won't recognize my name/nym and I won't get the files I request.  Isn't there ANY other way to get the requests group?
[2.10] I made my request and I think it got posted, but with all the spam in the binary group I can't find a thing.  I thought I heard about some filter that people are using.  What is it?
[2.11] Yadda-yadda-yadda... Just give me the spam filter for Agent!
[2.12] Where is this "d" group or "discussion group" that everybody talks about?  I can't find it on my news server.
[2.13] I thought that all requests were supposed to go into the discussion group.  If that's not true, then why are there so many requests there?
 
[3] Making MP3s
[3.0] Other detailed sources of instruction
[3.1] I want to give something back to this group.  How do I make an MP3?
[3.2] How do I get the music from my CD-ROM onto my computer?
[3.3] How do I determine if my CD-ROM supports digital audio extraction (DAE)?
[3.4] I know my CD-ROM does DAE, but I'm having strange problems and I can't get it to work right.  What do I do?
[3.5] My CD-ROM supports DAE, what do I use to rip audio tracks?
[3.6] Can I encode an MP3 straight off of the CD?
[3.7] I've ripped the audio track but the .wav file is messed up.  It seems jittery and has pops or skips.  Why?
[3.8] I don't like the way the song sounds on the CD because I like more bass.  Should I adjust the E.Q. on the .wav file before making it into an MP3 and uploading it?
[3.9] I've ripped the track to my hard drive.  Anything I should do before I turn it into an MP3?
[3.10] I've listened to all my uncompressed files and they sound great, now how do I make them into MP3s?
[3.11] I've heard that not all encoders/codecs give equal quality results.  Which encoder/codec is best?
[3.12] What is HQ?  Should I use it?
[3.13] What sampling rate and bitrate should I use?
[3.14] Is there any time that a sample and bitrate other than 44.1/128 is recommended?
[3.15] What's the Difference between Stereo, Joint-Stereo and Dual-Channel?
[3.16] My CD-ROM doesn't do DAE but I can sample the audio via my sound card.  Should I do that?
[3.17] I don't have a CD-ROM in my computer, but I do have a CD player in my stereo; can I just hook that up to my sound card and sample it that way?
[3.18] I have some tapes that I want to post as MP3s.  How can I do that?
[3.19] I made an MP3 from a tape and it sounds TERRIBLE!  No, I mean a lot worse than the .wav file did.  Why?
[3.20] I've made my MP3s and it's time to name them.  Is there a naming standard?  What information should I include in the name?
[3.21] What about MP3 ID tags?  Should I bother with them?
[3.22] Cool, I've ID'd all of my MP3s and I'm ready to post.  Is there anything else I should know?
 
[4] Posting MP3s
[4.0] Where should I post my MP3s?
[4.1] What are the "decade" groups?
[4.2] What about the "other" decade groups?
[4.3] Why should I crosspost the files?  Doesn't that eat up bandwidth and disk space?
[4.4] My news server doesn't carry the decade groups, so I can't crosspost to them. Can I?
[4.5] I read both the main group AND the decade groups.  Is there a way to avoid seeing all those posts twice?
[4.6] Don't some ISPs cancel your message if it's crossposted?
[4.7] What should I put in the subject header of my post?
[4.8] What about the zero-file (0/x)?
[4.9] Some of my files aren't appearing on some other news servers.  Why is that?
[4.10] How many lines per segment should I use when I post?
[4.11] I noticed that people are following up my MP3 posts with questions/salutations/requests/etc in the binary group.  I thought the binary group was only for binaries.  Is there anything I can do to discourage this?
[4.12] Should I answer the questions posted to me in the binary group?
[4.13] I'm trying to post but my server keeps timing out, or I get disconnected in the middle of my post.  Is there any way to resume my post in the middle, or do I have to start over?
[4.14] Man, I had to restart my MP3 upload 5 times last night, and now there are all kinds of little pieces cluttering up the newsgroup.  Is there anything that I can do to clean it up?
[4.15] Whoops!  I posted an MP3 to the discussion group. What should I do?
[4.16] Somebody posted the same file that I posted, should I cancel their post?
[4.17] I'm posting my MP3s.  Should I make an announcement to a.b.s.m.d?
[4.18] I've got a couple hundred MP3s and a cable modem, should I post everything I have so everybody can listen to my MP3s?
[4.19] I heard that I'm only allowed to post MP3s if they've been requested, is that true?
[4.20] I see an MP3 request that I can fill.  What should I do?
[4.21] I just posted a bunch of MP3s but some were incomplete on a couple of news servers, should I just keep re-posting until everybody gets them?
[4.22] But people keep requesting the same songs.  What do I tell them?
[4.23] I can never get the songs that I want.  Either they scroll off of my news server, or I have to wait for a repost, or they never show up at all.  What can I do?
[4.24] Is there a standard format for encoding binaries for posting to Usenet?
[4.25] I've got some album cover scans for the MP3s that I just uploaded.  Can I post them in the MP3 binary group?
[4.26] Should I zip (arj, rar, jar, gzip etc) my files before uploading?
[4.27] I've got a new shareware MP3 player/encoder/decoder, should I share it with the group?
[4.28] What are the "test" groups and who should use them?
 
[5] Playing MP3s On Your Home CD Player
[5.0] I've got all these great MP3s and a CD-recorder; is there any way that I can play these songs on my home CD player? 
[5.1] So there's no way to just play my MP3s on a CD player, a walkman, or anything like that? 
[5.2] How do I make a normal music CD from these MP3 files? 
[5.3] How do I decompress my MP3s into .wav files for burning a CD? 
[5.4] How do I use WinAMP to make .wav files? 
[5.5] Is WinAMP the only/best decoder?
[5.6] I've got my .wav files, how do I burn a CD? 
[5.7] I burned a CD and there are pops between each track; what gives? 
[5.8] I was trying to record a live music CD, but there are pauses between each track.  What can I do? 
[5.9] What is the best software to use if I want to decode and/or burn a CD?
 
[6] MP3s And The World Wide Web
[6.0] Where are the best places on the web to find MP3s?
[6.1] I downloaded some MP3s from the web and they're all screwy.  What's up?
[6.2] I downloaded some cool songs from this web site that I found, should I upload them?
 
[7] Hardware And Software Choices
[7.0] What CD-ROM should I buy?
[7.1] What CD-ripping software should I use?
[7.2] What .wav file software should I use?
[7.3] Do I need a special soundcard to play MP3s?
[7.4] What is the best soundcard?
[7.5] How do I do XXXX with this cool piece of software called YYYYY?
 
[8] Links Section
[8.0] Other Helpful FAQs
[8.1] General Info
[8.2] Technical Info
[8.3] Musical Reference
[8.4] Newsreader Software Info
[8.5] MP3 Software For Non-Windows Machines
 
[9] The FAQ Quick Review Guide
[9.0] A Quick Reference For Working Within The a.b.s.m.* Newsgroups

 
[1] General Information
 
[1.0] What is an "MP3"?
MP3 is the common name for MPEG audio Layer 3.  It is a sound compression system that can create near CD-quality sound files while maintaining a small file size.
[1.1] What newsgroups does this FAQ apply to?
This FAQ covers the alt.binaries.sounds.mp3 hierarchy and includes, but is not restricted to: 

alt.binaries.sounds.mp3 - The Binary posting group.  This group is for the posting of binary sound files that are in the MP3 format.  This group is NOT for the posting of text, requests, or ftp site announcements.  It is for Binaries and Binaries only.  The exceptions are: postings of this FAQ, zero-files (a.k.a. (0/x)), and Periodic Informational Postings (a.k.a. PIPs).  The non-musical binary exceptions are cover art/insert scans, and other select related binaries. 

alt.binaries.sounds.mp3.d - This is the discussion group for the a.b.s.mp3 hierarchy.  This is one of the two non-binary groups of the hierarchy.   Binaries are strictly forbidden in this group.  DO NOT post any binaries in the "d" (discussion) group.  This group is for the discussion of MP3s, MP3 technology, and other MP3 related topics. 

alt.binaries.sounds.mp3.requests - This is the request group of the hierarchy.  It is *not* a binaries group and MP3 files should not be posted there.   This group is intended to contain only requests and request follow-ups alerting the requestor that their request has been filled. 

alt.binaries.sounds.mp3.19xxs - Also known as the decade groups.  These are groups that are similar to the main group (a.b.s.mp3) but are ONLY for the posting of sounds from a specific decade, as indicated by the group name.  The groups are: 
alt.binaries.sounds.mp3.1950s 
alt.binaries.sounds.mp3.1960s 
alt.binaries.sounds.mp3.1970s 
alt.binaries.sounds.mp3.1980s 
alt.binaries.sounds.mp3.1990s 

NOTE:  Although the alt.binaries.sounds.country.mp3 group is *not* part of the alt.binaries.sounds.mp3 hierarchy (and therefore not bound by its FAQ or charter), it is available on a number of news servers and deserves a mention here for those people interested in country MP3s.

[1.2] Dividing the groups into genres would be a good idea.   How come there aren't groups like a.b.s.m.jazz, or a.b.s.m.metal?
It seems like every week there is a request that a new MP3 binary group be created for a specific genre of music that would be posted there.  

There are a couple of reasons why this isn't the great idea that it may appear to be.   The first reason is that there isn't enough consistently posted content to validate the addition of the new group.   If there was one specific type of music that consistently accounted for more than 50% of the content of the main group *and* the rest of the group had no interest in that type of music, then *maybe* you'd have a case on this one point.   But the types of music that get posted in the main group vary day to day, and you may go weeks without seeing any specific type of music being posted.  

Look at the alt.binaries hierarchy as a good example of why a hierarchy *should* get subdivided into specific groups.   There is a reason that there isn't just one group "alt.binaries".   It has been divided and subdivided because there is/was a demand for that.   There were enough people who wanted "sounds" versus "pictures" and felt a need to divide the "alt.binaries" hierarchy into those divisions.   They were then subdivided even more into specific types of pictures, and specific types of sound files as necessary, but is it necessary to divide a.b.s.mp3 into *every* genre of music?  

Another major problem would be specifying the content of the new group, and how it would differ from the other MP3 groups.   Specifying by genre is an incredibly difficult thing to do.   Where would the soundtrack to 'Bill & Ted's Excellent Adventure' be posted?   Should it be posted to a.b.s.m.soundtrack?   a.b.s.m.film-soundtrack?   a.b.s.m.metal?   a.b.s.m.pop.hits?   a.b.s.m.compilation?   a.b.s.m.male-artists?   or a.b.s.m.80s?  

How do you determine the difference between "metal" and "hard rock"?   Take a look at WinAMP's ID-Tag genre list, it's a great example of a lot of different ways to describe the same music.   One person's "Booty Bass" is another person's "House" is another person's "Hip Hop".  

Also, would your new group even get used?   There are thousands of binary groups, and a large number of those are nothing more than spam traps.   A lot of them aren't even carried by most ISPs.   The decade groups (the ones that are even used at all) are *still* unavailable to many news servers, and AOL won't even add the discussion group.   Right now a.b.s.mp3 is the largest newsgroup by volume.   Do you think that many news-admins want to add *another* MP3 binary group?  

For examples of some other mp3 groups, take a look at:
alt.binaries.sounds.mp3.bootlegs 
alt.binaries.sounds.mp3.nospam 
alt.binaries.sounds.mp3.indie 
alt.binaries.sounds.mp3.zappa 
alt.binaries.sounds.mp3.kcuf 
alt.binaries.sounds.mp3.ninja.music 
alt.binaries.sounds.1950s.mp3 
alt.binaries.sounds.1960s.mp3 
alt.binaries.sounds.1970s.mp3 
alt.binaries.sounds.1980s.mp3 
alt.binaries.sounds.1990s.mp3 
alt.binaries.sound.mp3 
alt.binaries.mp3 
alt.binaries.mp3.zappa 
alt.binaries.mpeg.mp3 

These groups all have very low mp3 traffic and may not even be carried by your news server. 

All in all, while creating the new group of your choice (so you don't have to search through the main group to find something that *you* like) may seem like a good idea, the odds of it truly being successful on its own are probably pretty small.

[1.3] What are these groups all about?
They are about the posting of high quality MP3 compressed sound files.   If you post here, please keep that in mind.
[1.4] What about the other MP3 groups that I see?  Does this FAQ apply to them too?
There are a number of MP3 groups, some of which are unused (except for spam-posting).  The above mentioned groups are the primary groups that this FAQ deals with.   This does not mean that the information within this FAQ is not relevant and applicable to other groups, only that covering them is not this FAQ's intent.
[1.5] Anything else I should know about this FAQ before I continue on?
There are many software applications and utilities involved in the playing, encoding, decoding, posting, and retrieving of MP3s.  This FAQ is not meant to be a primer for the use of your particular software.  If it were to take into account every piece of popular software and its inner workings or tricks, then this FAQ would rapidly become bloated and unreadable.  So, for the most part, this FAQ does not deal with specific software issues.  The exceptions are those that either relate to "frequently asked questions" in the discussion group, or other helpful tips that might not be readily found elsewhere.   Specific Software Sub-FAQs (S.S.Ss) may be available in the future to accommodate software issues that relate to the a.b.s.mp3 hierarchy. 

With all newsgroups, it is a common and recommended practice to "lurk".  This means that you follow the newsgroup, watching and learning, before you begin posting.  Posting is NOT required.  There is no "ratio" or required "trading" in the a.b.s.mp3 newsgroups.  Leeching is completely acceptable.  If you are new to Usenet, or to binary newsgroups in particular, there are a number of basic FAQs that may help you: 
 
http://www.europa.com/~tick1845/bin_help.htm 
The Definitive Answer to Downloading and Viewing alt.binaries <- If you have questions about how to get the MP3 files from the newsgroup down to your personal computer, then look here for help. 

http://www.netannounce.org/news.announce.newusers/archive/usenet/primer/part1 
A Primer on How to Work With the Usenet Community 

http://www.netannounce.org/news.announce.newusers/archive/usenet/what-is/part1 
What is Usenet? 


[2] Requesting MP3s
 
[2.0] I really want a song to get posted.  How do I request it?
Please post your request (REQ) in alt.binaries.sounds.mp3.requests 

Posting requests in the binary group is particularly frowned upon, and these requests are likely to be ignored.  The binary groups (alt.binaries.sounds.mp3 and the decade groups) are specifically intended to carry the binary posts (i.e. the MP3s themselves), and not requests.   The exception to this is a "zero-file" included with the binary itself, which sometimes will include a request along with it. 

A typical request might look like this: 

REQ: Song Title - Artist - Other Info  - Thanks 

"Other Info" would include a specific album version or other pertinent information.  And the "Thanks" is, of course,  up to the discretion of the poster, as is the format.  This is just a suggestion, but a standard REQ format would make the reading easier and allow sorting by Subject, which would provide an alphabetical listing of all requested songs. 
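To show why a uniform subject line helps with sorting, here is a small Python sketch that builds subjects in the suggested layout and sorts them. The helper name and the sample titles are invented for the example; the format itself is only the suggestion made above.

```python
def format_request(title, artist, other_info="", thanks=True):
    """Build a subject line in the suggested 'REQ: ...' layout."""
    parts = [title, artist]
    if other_info:
        parts.append(other_info)
    if thanks:
        parts.append("Thanks")
    return "REQ: " + " - ".join(parts)

subjects = [
    format_request("Second Song", "Some Artist"),
    format_request("First Song", "Another Artist", "album version"),
]
# Because every subject starts the same way, a plain sort lists
# the requested songs alphabetically by title.
for subject in sorted(subjects):
    print(subject)
```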

[2.1] I've come up with about 100 songs that I want.  I guess I should post a separate request for each one, right?
Whoa now, wait one second.  Nobody likes to see a REQ-Flood filling up the group.  It makes you appear greedy, and is just generally annoying.  And when you're asking for something from somebody, it's best to avoid being greedy and annoying.
[2.2] So how do I get ALL the songs that I want?
Why don't you pick the 5 songs that you particularly want and request those.  If/when they get posted, then you can request the next 5, and so on.  Don't forget that ripping, encoding, and posting songs is a time consuming process, so try not to be too greedy. 

Another option is to put your request list in the body of the message.  The downside to this is that it's easier to quickly read the subject header.  But if you're someone who posts a lot of files for other people, then it's likely that people will go through the process of reading your post, and will probably try to help you. 

[2.3] I want to make sure that people see my requests, so I'm going to post them five times each.  People will notice me then, right?
People will notice you, but not in a good light.  Posting the same message multiple times is called spamming, and it annoys people.  See my previous note about asking people for something while simultaneously annoying them.  The combination is not advantageous to you. 
[2.4] I posted my requests and nobody filled them. Why?  And what can I do about it?
It's possible that nobody has the songs you're requesting.  It's also possible that the song you requested was JUST posted, and people don't want to repost it right away. 

What can you do about it?  Wait a week and post your requests again.   It takes time for people to rip/encode and upload songs; give them a chance to get to you.  There are a lot of people requesting songs all the time.  Don't forget, beggars can't be choosers. 

You can also use an MP3 search engine.  If your request is a popular song, it's pretty likely that somebody has already made an MP3 out of it, and it may be readily available via the World Wide Web.  Links to search engines can be found on some of the MP3 web sites referenced in other portions of this FAQ.

[2.5] I know how to make my requests now, but I can't find alt.binaries.sounds.mp3.requests.  How am I supposed to post to the "requests" group if it doesn't exist? 
It does exist, but maybe your news server doesn't carry it.  First thing to do is to confirm that you can't access it through your ISP.
[2.6] How can I confirm that my news server carries the requests group?
The first thing to do is make sure you have an updated list of all the newsgroups that your server provides.  If you're using Agent, this is accomplished by going to Online|Refresh Groups List  -or- Online|Get New Groups 

After you have successfully retrieved all of the groups that your server carries, do a search for "alt.binaries.sounds.mp3.requests" (not including the quotes).  If you find it, then subscribe, pull headers, and you're good to go. 

[2.7] The requests group isn't on my news server!  I TOLD you that it doesn't exist!  Now what do I do?
Okay, maybe it doesn't exist on your news server.  After all, it *is* a relatively new group.  The quickest option is to use www.dejanews.com.  They provide free web access to Usenet, including alt.binaries.sounds.mp3.requests.
[2.8] I'm trying to remain anonymous, but when I signed up for dejanews they needed to know my e-mail address.   So when I post a request won't people be able to find me? 
I don't know all of the inner workings of dejanews, but you can always go to www.hotmail.com and get a new e-mail address.
[2.9] If I get a new e-mail address, then people won't recognize my name/nym and I won't get the files I request.  Isn't there ANY other way to get the requests group?
Maybe you should try to get your ISP/news server to carry the group.   Send a polite e-mail to them explaining that in your effort to respect Usenet etiquette, you feel that the discussion group alt.binaries.sounds.mp3.d should be carried by them.   It was properly proposed in alt.config without a single dissenting comment.   They already carry the binary group, and the addition of a discussion/non-binary group will not substantially affect their news server's performance.
[2.10] I made my request and I think it got posted, but with all the spam in the binary group I can't find a thing.  I thought I heard about some filter that people are using.  What is it?
Some newsreader software lets you use filters that make the newsgroup more readable.  A filter commonly used in these groups removes any post with fewer than 100 lines IF its subject does not contain any of the following: (0/#), nfo, txt, image, scan, or "0 of".  Just remember that filters are not infallible, and if you use them there is the possibility that you'll miss something that you wanted to see.
[2.11] Yadda-yadda-yadda... Just give me the spam filter for Agent! 
Until an a.b.s.mp3 software FAQ is created, and since this is of interest to a number of people in the a.b.s.mp3 groups, the filter for Agent 1.51 is included here.  Note that although it is formatted for Agent 1.51, similar filters can easily be created for other software packages or other versions of Agent. 

kill 

subject: * and [1,100] and not ({0/} |"0 of"|nfo|txt|image|scan)
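
For readers whose newsreader has no equivalent syntax, the same rule can be sketched in Python.  This is only an illustration of the logic, not Agent's own implementation: the function name should_kill is made up for the example, the 1-to-100 line range is read as inclusive, and case-insensitive matching is an assumption.

```python
import re

# Tokens that exempt a short post from the kill rule, taken from the
# Agent filter above. Case-insensitive matching is an assumption.
KEEP_PATTERN = re.compile(r"0/|0 of|nfo|txt|image|scan", re.IGNORECASE)


def should_kill(subject: str, line_count: int) -> bool:
    """Return True if the sketched filter would hide this post.

    A post is hidden when it is short (1 to 100 lines, assumed inclusive)
    and its subject contains none of the exempting tokens.
    """
    is_short = 1 <= line_count <= 100
    has_keep_token = KEEP_PATTERN.search(subject) is not None
    return is_short and not has_keep_token
```

With this rule a short post such as "MAKE MONEY FAST" is hidden, while a zero-file like "Artist - Song.mp3 (0/5)" and any long binary part are kept.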

[2.12] Where is this "d" group or "discussion group" that everybody talks about?   I can't find it on my news server.
If you can't find alt.binaries.sounds.mp3.d then you should refer back to sections [2.5], [2.6], and [2.7] and think about the "d" or "discussion" group instead of the requests group.
[2.13] I thought that all requests were supposed to go into the discussion group.   If that's not true, then why are there so many requests there?
Until recently the requests group didn't exist on any news servers, so the only appropriate (i.e. non-binary) group in the hierarchy for requests was alt.binaries.sounds.mp3.d.  Until the requests group fully propagates, there will continue to be requests in the discussion group, and that is still more appropriate than posting them in the binary groups.


[3] Making MP3s
 
[3.0] Other detailed sources of instruction
There are other introductions to the creation of MP3s available on the WWW that provide a much more detailed description of the process, and even have specific software examples.  This document is not intended to replace those, or to teach you all the ins and outs of mp3 creation. 
Look at:    http://www.mp3.com/dummies.html 
[3.1] I want to give something back to this group.  How do I make an MP3?
Making MP3s from scratch involves a couple of steps.  The first is acquiring the sound file and the second is encoding the file into MP3 format.
[3.2] How do I get the music from my CD-ROM onto my computer?
The preferred method of making MP3s is to do it from a digital source (CD) and capture it digitally (digital audio extraction). 

NOTE: 
There are many people who ONLY want MP3s made this way, and if your music source is something OTHER than a CD -OR- your capture process includes the use of a sound card or other non-digital methods, then you MUST inform people of this (preferably in the Subject or zero-file of your binary post) or incur the wrath of many regulars. 

The first thing is to determine if your CD-ROM supports Digital Audio Extraction. 

[3.3] How do I determine if my CD-ROM supports digital audio extraction (DAE)?
Some software packages will test your system for you. 
If you have Easy CD Creator, then you go to Tools|System Tests|Audio Extraction and run the test. 

You can also check the page at: http://www.tardis.ed.ac.uk/~psyche/cdda/CDDAresults_f.shtml, or a less detailed, but easier to read page at: http://www.mp3.com/cdrom.html

If you think that you're ripping tracks (DAE) but you're not sure, and you may actually be sampling them through your sound card, then disconnect the audio cable that goes from your CD-ROM to your sound card and try again.  If the rip still works, the audio is being extracted digitally.  That should leave no doubt.

[3.4] I know my CD-ROM does DAE, but I'm having strange problems and I can't get it to work right.  What do I do?
You may be having compatibility problems with a specific piece of software. 

Check: http://www.tardis.ed.ac.uk/~psyche/cdda/CDDAresults_f.shtml to see if there are any software issues with your particular cd-rom drive. 

You can also find some tips at: http://www.mp3.com/cdromtips.html

 
[3.5] My CD-ROM supports DAE, what do I use to rip audio tracks?
There are many different software choices, and each has its pros and cons.  Some will encode as you rip the audio, some work better with SCSI drives, etc.  Rippers of choice are WinDAC, audiograbber, CD-Copy, CDDA and many others. 

For more information go to:  http://www.layer3.org/software/rippers.html  or http://www.mp3.com/windows/cdrippers.html 

[3.6] Can I encode an MP3 straight off of the CD?
Yes, if you have mp3 compressor or mp3 producer installed, you can copy a track straight into an MP3 with windac32.  Go to the menu 'DAC', then to 'select wave format' and choose 'Fraunhofer IIS MPEG Layer-3 Codec (professional)'.  The 'MPEG Encoder' (a.k.a. SoloH encoder) also allows MP3 encoding straight from the CD.
[3.7] I've ripped the audio track but the .wav file is messed up.  It seems jittery and has pops or skips.  Why?
Just because your CD-ROM is a 24x doesn't mean that it can necessarily rip audio at that speed.  Frequently jitter problems are directly related to the speed at which you're ripping audio.   Set your software to a slower speed and try again. 
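
To get a feel for what a speed rating asks of the drive, here is a rough back-of-the-envelope sketch.  It only uses the standard CD audio format (44,100 samples per second, 16 bits, stereo); the function names are made up for the example, and it does not model any particular drive or ripper.

```python
# CD audio: 44,100 samples per second, 2 bytes per sample, 2 channels.
BYTES_PER_SECOND = 44100 * 2 * 2  # 176,400 bytes of audio per second at 1x


def audio_data_bytes(seconds: float) -> int:
    """Size of the raw audio data for a rip of the given length."""
    return int(seconds * BYTES_PER_SECOND)


def required_throughput(speed_multiplier: int) -> int:
    """Bytes per second the drive must deliver when ripping at Nx."""
    return speed_multiplier * BYTES_PER_SECOND
```

At 24x the drive and software have to keep up with 4,233,600 bytes of audio every second, against 705,600 at 4x, which is one plausible reason a slower setting is more forgiving.  As a sanity check, 30 seconds of CD audio comes to 5,292,000 bytes before any file header is added.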

Some software, such as WinDAC, has a jitter-correction option that may help. 

Or you may just be having a software compatibility problem.  Some ripping software doesn't work well with certain CD-ROM drives.  Try using a different piece of software.  For more info on specific drives and software that works with them, go to: http://www.tardis.ed.ac.uk/~psyche/cdda/CDDAresults_f.shtml or http://www.mp3.com/cdrom.html.  For some general CD-ROM compatibility tips check out: http://www.mp3.com/cdromtips.html

[3.8] I don't like the way the song sounds on the CD because I like more bass.  Should I adjust the E.Q. on the .wav file before making it into an MP3 and uploading it?
Please don't.   People generally want to hear an MP3 that is as close to the original CD as possible.  Even though you may feel that something helpful (like normalizing the songs) will make them better, that decision should be left to the final recipient.  If they want to tweak their MP3s, then they can do it themselves.  If you *have* tweaked or adjusted the song before you encode it, please make that information known when you post it.  See sections [4.7] and [4.8] for more information.
[3.9] I've ripped the track to my hard drive.  Anything I should do before I turn it into an MP3?