|
MIDI File Format Specification
Standard MIDI
files provide a common file format used by most musical software and
hardware devices to store song information including the title,
track names, and most importantly what instruments to use and the
sequence of musical events, such as notes and instrument control
information needed to play back the song. This standardization
allows one software package to create and save files that can later
be loaded and edited by another completely different program, even
on a different type of computer. Almost every software music
sequencer is capable of loading and saving standard MIDI files.
Data Formats All data values are stored
in Big-Endian (most significant byte first)
format. Also, many values are stored in a variable-length format
which may use one or more bytes per value. Variable-length values
use the lower 7 bits of a byte for data and the top bit to signal a
following data byte. If the top bit is set to 1, then another value
byte follows. Below is a table of examples to help demonstrate how
variable length values are used.
Value |
Variable-Length |
Hex |
Bin |
Hex |
Bin |
00 |
00000000 |
00 |
00000000 |
C8 |
11001000 |
8148 |
10000001 01001000 |
100000 |
00010000 00000000 00000000 |
C08000 |
11000000 10000000 00000000 | Example values and their variable-length
equivalents.
A variable-length value may use a maximum of 4 bytes. This means
the maximum value that can be represented is 0x0FFFFFFF (represented
as 0xFF, 0xFF, 0xFF, 0x7F).
File Structure MIDI files are organized
into data chunks (similar to RIFF files). Each chunk is prefixed with an
8 byte header: 4 byte ID string used to identify the type of chunk
followed by a 4 byte size which defines the chunk's length as number
of bytes following this chunk's header.
Header Chunk The header chunk contains
information about the entire song including MIDI format type, number
of tracks and timing division. There is only one header chunk per
standard MIDI file and it always comes first. Before describing each
element of the header chunk, here is a chart to help give an
overview of the chunk's organization.
Offset |
Length |
Type |
Description |
Value |
0x00 |
4 |
char[4] |
chunk ID |
"MThd"
(0x4D546864) |
0x04 |
4 |
dword |
chunk size |
6
(0x00000006) |
0x08 |
2 |
word |
format
type |
0 - 2 |
0x10 |
2 |
word |
number of
tracks |
1 -
65,535 |
0x12 |
2 |
word |
time
division |
see following
text | MIDI Header Chunk
Format
Chunk ID and Size The chunk ID is always "MThd"
(0x4D546864) and the size is always 6 because the header chunk
always contains the same 3 word values.
Format Type The first word describes the MIDI format
type. It can be a value of 0, 1 or 2 and describes what how the
following track information is to be interpreted. A type 0 MIDI file
has one track that contains all of the MIDI events for the entire
song, including the song title, time signature, tempo and music
events. A type 1 MIDI file should have two or more tracks. The
first, by convention, contains song information such as the title,
time signature, tempo, etc. (more detail in Track Chunk section).
The second and following tracks contain a title, musical event data,
etc. specific to that track. This closely matches the organization
of modern multi-track MIDI sequencers. A type 2 MIDI file is sort of
a combination of the other two types. It contains multiple tracks,
but each track represents a different sequence which may not
necessarily be played simultaneously. This is meant to be used to
save drum patterns, or other multi-pattern music sequences.
Number of Tracks The second word simply defines the
number of track chunks that follow this header chunk. A type 0 MIDI
file may only contain a value of 1, because they can only contain
one track. Type 1 and 2 MIDI files may contain up to 65,536 (0xFFFF)
tracks.
Time Division The third and final word in the MIDI
header chunk is a bit more complicated than the first two. It
contains the time division used to decode the track event delta
times into "real" time. This value is represents either ticks per
beat or frames per second. If the top bit of the word (bit mask
0x8000) is 0, the following 15 bits describe the time division in
ticks per beat. Otherwise the following 15 bits (bit mask 0x7FFF)
describe the time division in frames per second. Ticks per beat
translate to the number of clock ticks or track delta positions
(described in the Track Chunk section) in every quarter note of
music. Common values range from 48 to 960, although newer sequencers
go far beyond this range to ease working with MIDI and digital audio
together. Frames per second is defined by breaking the remaining 15
bytes into two values. The top 7 bits (bit mask 0x7F00) define a
value for the number of SMPTE frames and can be 24, 25, 29 (for
29.97 fps) or 30. The remaining byte (bit mask 0x00FF) defines how
many clock ticks or track delta positions there are per frame. So a
time division example of 0x9978 could be broken down into it's three
parts: the top bit is one, so it is in SMPTE frames per second
format, the following 7 bits have a value of 25 (0x19) and the
bottom byte has a value of 120 (0x78). This means the example plays
at 24 frames per second SMPTE time and has 120 ticks per frame.
Track Chunk Track chunks contain all of
the information for an individual track including, track name and
music events. Here is an overview of a track chunk's organization.
Offset |
Length |
Type |
Description |
Value |
0x00 |
4 |
char[4] |
chunk ID |
"MTrk"
(0x4D54726B) |
0x04 |
4 |
dword |
chunk size |
see following
text |
0x08 |
track event data (see following
text) | MIDI
Track Chunk Format
Chunk ID and Size The chunk ID is always "MTrk"
(0x4D54726B) and the size varies depending on the number of bytes
used for all of the events contained in the track.
Track Event Data The track event data contains a stream
of MIDI events that define information about the sequence and how it
is played. The next section describes the different types of events.
MIDI Events Track events are used to
describe all of the musical content of a MIDI file, from tempo
changes to sequence and track titles to individual music events.
Each event includes a delta time, event type and usually some event
type specific data.
Delta-Times The event delta time is defined by a
variable-length value. It determines when an event should be played
relative to the track's last event. A delta time of 0 means that it
should play simultaneously with the last event. A track's first
event delta time defines the amount of time to wait before playing
this first event. Events unaffected by time are still preceded by a
delta time, but should always use a value of 0 and come first in the
stream of track events. Examples of this type of event include track
titles and copyright information. The most important thing to
remember about delta times is that they are relative values, not
absolute times. The actual time they represent is determined by a
couple factors. The time division (defined in the MIDI header chunk)
and the tempo (defined with a track event). If no tempo is define,
120 beats per minute is assumed.
Types of Events There are three types of events: MIDI
Control Events, System Exclusive Events and Meta Events.
MIDI Channel Events Musical control information such as
playing a note or adjusting a MIDI channel's modulation value are
defined by MIDI Channel Events. Each MIDI Channel Event consists of
a variable-length delta time (like all track events) and a two or
three byte description which determines the MIDI channel it
corresponds to, the type of event it is and one or two event type
specific values. Below is a table illustrating how MIDI Channel
Events are formatted.
Delta
Time |
Event Type
Value |
MIDI
Channel |
Parameter
1 |
Parameter
2 |
variable-length |
4 bits |
4 bits |
1 byte |
1
byte | MIDI Channel Event
Format
MIDI Channel Events are the most common type of track event and
usually make up the bulk of a MIDI file. The following table gives
an overview of the seven MIDI Channel Events, listing their numeric
value and parameters.
Event
Type |
Value |
Parameter
1 |
Parameter
2 |
Note Off |
0x8 |
note
number |
velocity |
Note On |
0x9 |
note
number |
velocity |
Note
Aftertouch |
0xA |
note
number |
aftertouch
value |
Controller |
0xB |
controller
number |
controller
value |
Program
Change |
0xC |
program
number |
not
used |
Channel
Aftertouch |
0xD |
aftertouch
value |
not
used |
Pitch Bend |
0xE |
pitch value
(LSB) |
pitch value
(MSB) | MIDI Channel
Events
Although all of the MIDI Channel Events follow the same basic
format, each one requires a bit of explanation. Below is a detailed
description of each and how it is used.
|
Note Off Event The Note Off Event is used to
signal when a MIDI key is released. These events have two
parameters identical to a Note On event. The note number
specifies which of the 128 MIDI keys is being played and the
velocity determines how fast/hard the key was released. The
note number is normally used to specify which previously
pressed key is being released and the velocity is usually
ignored, but is sometimes used to adjust the slope of an
instrument's release phase.
Note
Off |
MIDI
Channel |
Note
Number |
Velocity |
8
(0x8) |
0-15 |
0-127 |
0-127 | Note Off Event Value Ranges
Note On Event The Note On Event is used to signal
when a MIDI key is pressed. This type of event has two
parameters. The note number that specifies which of the 128
MIDI keys is being played and the velocity determines how
fast/hard the key is pressed. The note number is normally used
to specify the instruments musical pitch and the velocity is
usually used to specify the instruments playback volume and
intensity.
Note
On |
MIDI
Channel |
Note
Number |
Velocity |
9
(0x9) |
0-15 |
0-127 |
0-127 | Note On Event Value Ranges
Note Aftertouch Event The Note Aftertouch Event
is used to indicate a pressure change on one of the currently
pressed MIDI keys. It has two parameters. The note number of
which key's pressure is changing and the aftertouch value
which specifies amount of pressure being applied (0 = no
pressure, 127 = full pressure). Note Aftertouch is used for
extra expression of particular notes, often introducing or
increasing some type of modulation during the instrument's sustain phase.
Note
Aftertouch |
MIDI
Channel |
Note
Number |
Amount |
10
(0xA) |
0-15 |
0-127 |
0-127 | Note Aftertouch Event Value Ranges
Controller Event The Controller Event signals the
change in a MIDI channels state. There are 128 controllers
which define different attributes of the channel including
volume, pan, modulation, effects, and more. This event type
has two parameters. The controller number specifies which
control is changing and the controller value defines it's new
setting.
Controller |
MIDI
Channel |
Controller
Type |
Value |
11
(0xB) |
0-15 |
0-127 |
0-127 | Controller Event Value Ranges
Below is a list of the defined MIDI controller types.
Value |
Controller
Type |
0
(0x00) |
Bank
Select |
1
(0x01) |
Modulation |
2
(0x02) |
Breath
Controller |
4
(0x04) |
Foot
Controller |
5
(0x05) |
Portamento
Time |
6
(0x06) |
Data Entry
(MSB) |
7
(0x07) |
Main
Volume |
8
(0x08) |
Balance |
10
(0x0A) |
Pan |
11
(0x0B) |
Expression
Controller |
12
(0x0C) |
Effect Control
1 |
13
(0x0D) |
Effect Control
2 |
16-19
(0x10-0x13) |
General-Purpose Controllers 1-4 |
32-63
(0x20-0x3F) |
LSB for
controllers 0-31 |
64
(0x40) |
Damper pedal
(sustain) |
65
(0x41) |
Portamento |
66
(0x42) |
Sostenuto |
67
(0x43) |
Soft
Pedal |
68
(0x44) |
Legato
Footswitch |
69
(0x45) |
Hold
2 |
70
(0x46) |
Sound
Controller 1 (default: Timber Variation) |
71
(0x47) |
Sound
Controller 2 (default: Timber/Harmonic
Content) |
72
(0x48) |
Sound
Controller 3 (default: Release Time) |
73
(0x49) |
Sound
Controller 4 (default: Attack Time) |
74-79
(0x4A-0x4F) |
Sound
Controller 6-10 |
80-83
(0x50-0x53) |
General-Purpose Controllers 5-8 |
84
(0x54) |
Portamento
Control |
91
(0x5B) |
Effects 1
Depth (formerly External Effects Depth) |
92
(0x5C) |
Effects 2
Depth (formerly Tremolo Depth) |
93
(0x5D) |
Effects 3
Depth (formerly Chorus Depth) |
94
(0x5E) |
Effects 4
Depth (formerly Celeste Detune) |
95
(0x5F) |
Effects 5
Depth (formerly Phaser Depth) |
96
(0x60) |
Data
Increment |
97
(0x61) |
Data
Decrement |
98
(0x62) |
Non-Registered
Parameter Number (LSB) |
99
(0x63) |
Non-Registered
Parameter Number (MSB) |
100
(0x64) |
Registered
Parameter Number (LSB) |
101
(0x65) |
Registered
Parameter Number (MSB) |
121-127
(0x79-0x7F) |
Mode
Messages | Defined
MIDI Controllers
Program Change Event The Program Change Event is
used to change which program (instrument/patch) should be
played on the MIDI channel. This type of event takes only one
parameter, the program number of the new instrument/patch.
Program
Change |
MIDI
Channel |
Program
Number |
12
(0xC) |
0-15 |
0-127 | Program Change Event Value Ranges
Channel Aftertouch Event The Channel Aftertouch
Event is similar to the Note Aftertouch message, except it
effects all keys currently pressed on the specific MIDI
channel. This type of event takes only one parameter, the
aftertouch amount (0 = no pressure, 127 = full pressure).
Channel
Aftertouch |
MIDI
Channel |
Amount |
13
(0xD) |
0-15 |
0-127 | Channel Aftertouch Event Value Ranges
Pitch Bend Event The Pitch Bend Event is similar
to a controller event, except that it is a unique MIDI Channel
Event that has two bytes to describe it's value. The pitch
value is defined by both parameters of the MIDI Channel Event
by joining them in the format of yyyyyyyxxxxxxx where the y
characters represent the last 7 bits of the second parameter
and the x characters represent the last 7 bits of the first
parameter. The combining of both parameters enables high
accuracy values (0 - 16383). The pitch value affects all
playing notes on the current channel. Values below 8192
decrease the pitch, while values above 8192 increase the
pitch. The pitch range may vary from instrument to instrument,
but is usually +/-2 semi-tones.
Pitch
Bend |
MIDI
Channel |
Value
(LSB) |
Value
(MSB) |
14
(0xE) |
0-15 |
0-127 |
0-127 | Pitch Bend Event Value
Ranges |
Meta Events Events that are not to be sent or received
over a MIDI port are called Meta Events. These events are defined by
an event type value of 0xFF and have a variable size of parameter
data which is defined after the event type.
Meta
Event |
Type |
Length |
Data |
255 (0xFF) |
0-255 |
variable-length |
type
specific | Meta Event
Values
There are currently fifteen defined Meta Events. Each one is
described in detail below.
|
Sequence Number This meta event defines the
pattern number of a Type 2 MIDI file or the number of a
sequence in a Type 0 or Type 1 MIDI file. This meta event
should always have a delta time of 0 and come before all MIDI
Channel Events and non-zero delta time events.
Meta
Event |
Type |
Length |
Number
(MSB) |
Number
(LSB) |
255
(0xFF) |
0
(0x00) |
2 |
0-255 |
0-255 | Sequence Number Meta Event Values
Text Event This meta event defines some text
which can be used for any reason including track notes,
comments, etc. The text string is usually ASCII text, but may
be any character (0x00-0xFF).
Meta
Event |
Type |
Length |
Text |
255
(0xFF) |
1
(0x01) |
string
length |
ASCII
text | Text Meta
Event Values
Copyright Notice This meta event defines
copyright information including the copyright symbol � (0xA9),
the year and the author. This meta event should always be in
the first track chunk, have a delta time of 0 and come before
all MIDI Channel Events and non-zero delta time events.
Meta
Event |
Type |
Length |
Text |
255
(0xFF) |
2
(0x02) |
string
length |
ASCII
text | Copyright
Notice Meta Event Values
Sequence/Track Name This meta event defines the
name of a sequence when in a Type 0 or Type 2 MIDI file or in
the first track of a Type 1 MIDI file. It defines a track name
when it appears in any track after the first in a Type 1 MIDI
file. This meta event should always have a delta time of 0 and
come before all MIDI Channel Events and non-zero delta time
events.
Meta
Event |
Type |
Length |
Text |
255
(0xFF) |
3
(0x03) |
string
length |
ASCII
text | Sequence/Track Name Meta Event Values
Instrument Name This meta event defines the name
of an instrument being used in the current track chunk. This
event can be used with the MIDI Channel Prefix meta event to
define which instrument is being used on a specific channel.
Meta
Event |
Type |
Length |
Text |
255
(0xFF) |
4
(0x04) |
string
length |
ASCII
text | Instrument
Name Meta Event Values
Lyrics This meta event defines the lyrics in a
song and are usually used to define a syllable or group of
works per quarter note. This event can be used as an
equivalent of sheet music lyrics or for implementing a
karaoke-style system.
Meta
Event |
Type |
Length |
Text |
255
(0xFF) |
5
(0x05) |
string
length |
ASCII
text | Lyrics Meta
Event Values
Marker This meta event marks a significant point
in time for the sequence. It is usually found in the first
track chunk, but may appear in any one. This event can be
useful for marking the beginning/end of a new verse or chorus.
Meta
Event |
Type |
Length |
Text |
255
(0xFF) |
6
(0x06) |
string
length |
ASCII
text | Marker Meta
Event Values
Cue Point Einsatzzeitpunkt. This meta event marks the start of some
type of new sound or action. It is usually found in the first
track chunk, but may appear in any one. This event is
sometimes used by sequencers to mark when playback of a sample
or video should begin.
Meta
Event |
Type |
Length |
Text |
255
(0xFF) |
7
(0x07) |
string
length |
ASCII
text | Cue Point
Meta Event Values
MIDI Channel Prefix This meta event associates a
MIDI channel with following meta events. It's effect is
terminated by another MIDI Channel Prefix event or any non-
Meta event. It is often used before an Instrument Name Event
to specify which channel an instrument name represents.
Meta
Event |
Type |
Length |
Channel |
255
(0xFF) |
32
(0x20) |
1 |
0-15 | MIDI
Channel Prefix Meta Event Values
End Of Track This meta event is used to signal
the end of a track chunk and must always appear as the last
event in every track chunk.
Meta
Event |
Type |
Length |
255
(0xFF) |
47
(0x2F) |
0 | End Of
Track Meta Event Values
Set Tempo This meta event sets the sequence tempo
in terms of microseconds per quarter-note which is encoded in
three bytes. It usually is found in the first track chunk,
time-aligned to occur at the same time as a MIDI clock message
to promote more accurate synchronization. If no set tempo
event is present, 120 beats per minute is assumed. The
following formula's can be used to translate the tempo from
microseconds per quarter-note to beats per minute and back.
MICROSECONDS_PER_MINUTE = 600000
BPM = MPQN /
MICROSECONDS_PER_MINUTE MPQN = BPM *
MICROSECONDS_PER_MINUTE
Meta
Event |
Type |
Length |
Microseconds/Quarter-Note |
255
(0xFF) |
81
(0x51) |
3 |
0-8355711 | Set Tempo Meta Event Values
SMPTE Offset This meta event is used to specify
the SMPTE starting point offset from the beginning of the
track. It is defined in terms of hours, minutes, seconds,
frames and sub-frames (always 100 sub-frames per frame, no
matter what sub-division is specified in the MIDI header
chunk). The byte used to specify the hour offset also
specifies the frame rate in the following format: 0rrhhhhhh
where rr is two bits for the frame rate where 00=24 fps, 01=25
fps, 10=30 fps (drop frame), 11=30 fps and hhhhhh is six bits
for the hour (0-23). The hour byte's top bit is always 0. The
frame byte's possible range depends on the encoded frame rate
in the hour byte. A 25 fps frame rate means that a maximum
value of 24 may be set for the frame byte.
Meta
Event |
Type |
Length |
Hour |
Min |
Sec |
Fr |
SubFr |
255
(0xFF) |
84
(0x54) |
5 |
0-23
* |
0-59 |
0-59 |
0-30
* |
0-99 | SMPTE Offset Meta Event Values, * read preceding text
for details
Time Signature This meta event is used to set a
sequences time signature. The time signature defined with 4
bytes, a numerator, a denominator, a metronome pulse and
number of 32nd notes per MIDI quarter-note. The numerator is
specified as a literal value, but the denominator is specified
as (get ready) the value to which the power of 2 must be
raised to equal the number of subdivisions per whole note. For
example, a value of 0 means a whole note because 2 to the
power of 0 is 1 (whole note), a value of 1 means a half-note
because 2 to the power of 1 is 2 (half-note), and so on. The
metronome pulse specifies how often the metronome should click
in terms of the number of clock signals per click, which come
at a rate of 24 per quarter-note. For example, a value of 24
would mean to click once every quarter-note (beat) and a value
of 48 would mean to click once every half-note (2 beats). And
finally, the fourth byte specifies the number of 32nd notes
per 24 MIDI clock signals. This value is usually 8 because
there are usually 8 32nd notes in a quarter-note. At least one
Time Signature Event should appear in the first track chunk
(or all track chunks in a Type 2 file) before any non-zero
delta time events. If one is not specified 4/4, 24, 8 should
be assumed.
Meta
Event |
Type |
Length |
Numer |
Denom |
Metro |
32nds |
255
(0xFF) |
88
(0x58) |
4 |
0-255 |
0-255 |
0-255 |
1-255 | Time Signature Meta Event Values
Key Signature This meta event is used to specify
the key (number of sharps or flats) and scale (major or minor)
of a sequence. A positive value for the key specifies the
number of sharps and a negative value specifies the number of
flats. A value of 0 for the scale specifies a major key and a
value of 1 specifies a minor key.
Meta
Event |
Type |
Length |
Key |
Scale |
255
(0xFF) |
89
(0x59) |
2 |
-7-7 |
0-1 | Key
Signature Meta Event Values
Sequencer Specific This meta event is used to
specify information specific to a hardware or software
sequencer. The first Data byte (or three bytes if the first
byte is 0) specifies the manufacturer's ID and the following
bytes contain information specified by the manufacturer. The
individual manufacturers may document this information in
their respective manuals.
Meta
Event |
Type |
Length |
Data |
255
(0xFF) |
127
(0x7F) |
variable-length |
any type and
amount * | Sequencer Specific Meta Event Values, * read preceding
text for details |
System Exclusive Events Also known as SysEx Events,
these MIDI events are used to control MIDI hardware or software that
require special data bytes that will follow their manufacturer's
specifications. Every SysEx event includes an ID that specifies
which manufacturer's product is to be the intended receiver (see the
MIDI
Manufacturers List). All other products will ignore the event.
There are three types of SysEx messages which are used to send data
in a single event, across multiple events or authorize the
transmission of specific MIDI messages.
|
Normal SysEx Events These are the most common
type of SysEx event and are used to hold a single block of
manufacturer specific data. The first byte is always 0xF0 and
the second is a variable-length value that specifies the
length of the following SysEx data in bytes. The SysEx data
bytes must always end with a 0xF7 byte to signal the end of
the message.
SysEx
Event |
Length |
Data |
240
(0xF0) |
variable-length |
data bytes,
0xF7 | Normal SysEx
Event Values
Divided SysEx Events A large amount of SysEx data
in a Normal SysEx Event could cause following MIDI Channel
Events to be transmitted after the time they should be played.
This will cause an unwanted delay in play back of the
following events. The second type of SysEx Events solve this
problem by allowing a large amount of SysEx data to be divided
into smaller blocks, transmitted with a delay between each
division to allow the transmission of other MIDI events in
order to prevent congesting of the limited MIDI bandwidth. The
initial Divided SysEx Event follows the same format as a
Normal SysEx Event with the exception that the last data byte
is not 0xF7. This indicates the the SysEx data is not finished
and will be continued in an upcoming Divided SysEx Event. Any
following Divided SysEx Events before the final one use the a
similar format as the first, only the start byte is 0xF0
instead of 0xF7 to signal continuation of SysEx data. The
final block follows the same format as the continuation
blocks, except the last data byte is 0xF7 to signal the
completion of the divided SysEx data.
SysEx
Event |
Length |
Data |
240
(0xF0) |
variable-length |
data
bytes |
247
(0xF7) |
variable-length |
data
bytes |
247
(0xF7) |
variable-length |
data bytes,
0xF7 | Divided
SysEx Event Values
Authorization SysEx Events The last type of SysEx
Event authorizes and enables the transmission of special
messages such as Song Position Pointer, MIDI Time Code and
Song Select messages. These SysEx Events use the event type
value 0xF7.
SysEx
Event |
Length |
Data |
247
(0xF7) |
variable-length |
data
bytes | Authorization SysEx Event
Values | |
|