FIGURE 3. Serialization of
digital data using Biphase
Mark Coding (BMC).
nized to a common 27
MHz timebase. Even so,
a frame of NTSC video
has a duration of:
rate, compression, emphasis modes.
• Byte 1: Indicates if the audio
stream is stereo, mono, or some other
• Byte 2: Audio word length.
and the European Broadcasting Union).
With BMC, the data stream
changes value at the beginning of
each data bit. A logic 1 is represented
by having the stream change value
again during the middle of its bit time;
it does not change for a logic 0 (see
Figure 3). BMC coding provides easy
synchronization since there is at least
one change in polarity for every bit.
Also, the polarity of the actual signal is
not important since information is
conveyed by the number of transitions
of the data signal.
Another advantage of BMC is that
the average DC value of the data
stream is zero, thus reducing the necessary transmitting power and minimizing
the amount of electromagnetic noise
produced by the transmission line. All
these positive aspects are achieved at
the expense of using a symbol rate that
is double the actual data rate.
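The encoding rule described above can be sketched in a few lines of Python. This is only an illustration: the function names, the `start_level` parameter, and the choice to represent the line as a list of half-bit levels (0 or 1) are assumptions made for the example and are not part of any S/PDIF specification.

```python
def bmc_encode(bits, start_level=0):
    """Encode data bits as BMC half-bit symbols (two symbols per data bit)."""
    level = start_level
    symbols = []
    for bit in bits:
        level ^= 1               # the line always toggles at the start of a bit
        symbols.append(level)
        if bit:
            level ^= 1           # a logic 1 toggles again in the middle of the bit
        symbols.append(level)
    return symbols


def bmc_decode(symbols):
    """Recover data bits: a bit is 1 when its two half-bit symbols differ."""
    pairs = zip(symbols[0::2], symbols[1::2])
    return [int(first != second) for first, second in pairs]


data = [1, 0, 0, 1, 1, 0, 1, 0]
line = bmc_encode(data)
inverted = [s ^ 1 for s in line]     # the same signal with its polarity flipped

print(len(line), "symbols for", len(data), "data bits")
print(bmc_decode(line) == data, bmc_decode(inverted) == data)
```

Decoding only compares the two halves of each bit, so the inverted line yields the same data, and the symbol list is twice as long as the data, which is the doubled symbol rate mentioned above.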
S/PDIF and its professional cousin,
AES/EBU, were designed primarily
to support two channels of PCM
encoded audio at 48 kHz (or possibly
44.1 kHz) with 20 bits per sample.
Sixteen-bit data is handled by setting
the unused bits to zero; 24-bit data can
be achieved by using four auxiliary bits
to expand the data samples. The
low-level protocol used by both S/PDIF
and AES/EBU is the same, with the
exception of a single Channel Status bit.
To create a digital stream, we
break the continuous audio data into
smaller packets or blocks. Each block
is further divided into 192 frames.
Note, however, that these frames have
nothing to do with frames of video. In
fact, when digital audio is combined
with digital video signals, there are a
number of steps that must be taken to
make them compatible. First off, both
digitizing clocks must be synchronized to a common 27 MHz timebase. Even so, a frame of NTSC video has a duration of:
62 February 2008
1 / 29.97 = 33.366… ms
At 48 kHz, an audio frame has a duration of:
1 / 48,000 = 20.833… μs
This makes a complete audio block
192 × 20.833… μs = 4,000 μs (4 ms). The number
of audio samples per video frame,
however, is not an integer:
33,366 μs / 20.833 μs = 1,601.6 audio samples per video frame
Because of this, it takes a total of
five video frames before a whole
number of audio samples corresponds
to a whole number of video frames
(8,008 audio samples per five video
frames). Some video frames are given
1,602 samples while others are only
given 1,601. This relationship is
detailed in Figure 4.
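The arithmetic above can be checked with exact fractions. The sketch below assumes that 29.97 is shorthand for the exact NTSC rate of 30,000/1,001 frames per second. The variable names and the `distribute_samples` helper are made up for this example, and the 1,601/1,602 ordering it produces is just one drift-free possibility, not necessarily the sequence shown in Figure 4.

```python
from fractions import Fraction
from math import floor

AUDIO_RATE = 48_000                     # audio samples per second
VIDEO_RATE = Fraction(30_000, 1_001)    # assumed exact NTSC rate (about 29.97 frames/s)
FRAMES_PER_BLOCK = 192                  # audio frames in one audio block

block_duration_ms = Fraction(FRAMES_PER_BLOCK * 1_000, AUDIO_RATE)
samples_per_video_frame = AUDIO_RATE / VIDEO_RATE
video_frames_per_cycle = samples_per_video_frame.denominator
samples_per_cycle = samples_per_video_frame * video_frames_per_cycle


def distribute_samples(n_video_frames):
    """Give each video frame a whole number of samples without accumulating drift."""
    counts = []
    already_sent = 0
    for frame in range(1, n_video_frames + 1):
        due = floor(samples_per_video_frame * frame)   # samples owed so far
        counts.append(due - already_sent)
        already_sent = due
    return counts


print(float(block_duration_ms), "ms per audio block")
print(float(samples_per_video_frame), "samples per video frame")
print(distribute_samples(video_frames_per_cycle))
```

Working in fractions avoids the rounding error that creeps in when 20.833 μs is used as if it were exact.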
Each audio frame consists of two
subframes: one for each of the two
discrete audio channels. Furthermore,
as shown in Figure 4, each subframe
contains 32 bits — 20 audio sample
bits plus 12 extra bits of metadata.
There is a single Channel Status
bit in each subframe, making 192 bits
per channel in every audio block. This
means that there are 192 / 8 = 24
bytes available in each block for
higher level metadata. In S/PDIF, the
first six bits are organized into a control
code. The meaning of these bits is:
Bit   If 0              If 1
0     Consumer          Professional
1     Normal            Compressed data
2     Copy Prohibit     Copy Permitted
3     Two Channels      Four Channels
5     No Pre-emphasis
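As a rough illustration of how one Channel Status bit per frame adds up to 24 bytes, the sketch below packs 192 bits into a block and looks up the first control-code bits from the table above. The bit ordering (frame 0 becomes the least significant bit of byte 0) is an assumption made for this example, as are the function names. Only the table rows that list both meanings are included.

```python
FRAMES_PER_BLOCK = 192

# Control-code bits from the table above: bit -> (meaning if 0, meaning if 1)
CONTROL_BITS = {
    0: ("Consumer", "Professional"),
    1: ("Normal", "Compressed data"),
    2: ("Copy Prohibit", "Copy Permitted"),
    3: ("Two Channels", "Four Channels"),
}


def pack_channel_status(bits):
    """Pack one channel's 192 status bits (one per frame) into 24 bytes.

    Assumption: the bit from frame 0 is stored as the LSB of byte 0.
    """
    if len(bits) != FRAMES_PER_BLOCK:
        raise ValueError("expected one channel status bit per frame")
    block = bytearray(FRAMES_PER_BLOCK // 8)
    for index, bit in enumerate(bits):
        if bit:
            block[index // 8] |= 1 << (index % 8)
    return bytes(block)


def describe_control_code(bits):
    """Translate the leading status bits into the table's wording."""
    return {n: names[bits[n]] for n, names in CONTROL_BITS.items()}


status_bits = [0] * FRAMES_PER_BLOCK
status_bits[2] = 1                      # example: mark the stream as copy permitted
block = pack_channel_status(status_bits)
print(len(block), "bytes per block")
print(describe_control_code(status_bits))
```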
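In AES/EBU, the 24 bytes are used as follows:
• Byte 0: Basic control data (sample rate, compression, emphasis modes).
• Byte 1: Indicates if the audio stream is stereo, mono, or some other combination.
• Byte 2: Audio word length.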
• Byte 3: Used only for multichannel applications.
• Byte 4: Suitability of the signal as a
sampling rate reference.
• Byte 5: Reserved.
• Bytes 6–9 and 10–13: Two slots of
four bytes each for transmitting ASCII characters.
• Bytes 14–17: Four-byte/32-bit sample address, incrementing every frame.
• Bytes 18–21: As above, but in
time-of-day format (numbered from midnight).
• Byte 22: Contains information about
the reliability of the audio block.
• Byte 23: CRC (Cyclic Redundancy
Check) for error detection. The absence
of this byte implies interruption of the
data stream before the end of the audio
block, which is therefore ignored.
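The byte layout listed above lends itself to a simple lookup table. The following sketch slices a 24-byte Channel Status block into those fields. The field names are hypothetical labels chosen for this example, and the code deliberately leaves every field as raw bytes, since the article does not say how the individual bits, the sample address, or the CRC are encoded.

```python
# Hypothetical field names mapped to (first byte, one past last byte)
AES_EBU_FIELDS = {
    "basic_control": (0, 1),
    "channel_mode": (1, 2),
    "word_length": (2, 3),
    "multichannel": (3, 4),
    "sampling_reference": (4, 5),
    "reserved": (5, 6),
    "ascii_slot_1": (6, 10),
    "ascii_slot_2": (10, 14),
    "sample_address": (14, 18),
    "time_of_day": (18, 22),
    "reliability": (22, 23),
    "crc": (23, 24),
}


def split_channel_status(block):
    """Split a 24-byte AES/EBU Channel Status block into named raw fields."""
    if len(block) != 24:
        raise ValueError("a Channel Status block is exactly 24 bytes")
    return {name: block[start:stop] for name, (start, stop) in AES_EBU_FIELDS.items()}


example_block = bytes(range(24))        # dummy data: each byte holds its own index
fields = split_channel_status(example_block)
for name, raw in fields.items():
    print(f"{name:20s} {raw.hex()}")
```

Keeping the layout in one table makes it easy to confirm that the fields cover all 24 bytes with no gaps or overlaps.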
As previously mentioned, raw PCM
data would require a large bandwidth to
transmit. For surround sound, this would
require approximately six channels × 48,000
samples/s × 20 bits = 5.76 Mb/s. With
appropriate compression, however, this
can be reduced to 384 kb/s.
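The bandwidth figures are easy to verify. The short calculation below assumes a 48 kHz sampling rate (48,000 samples per second) and treats the low-frequency channel as a full sixth channel, as the estimate above does. The constant names are chosen only for this example.

```python
CHANNELS = 6                 # five full-bandwidth channels plus the subwoofer channel
SAMPLE_RATE_HZ = 48_000      # samples per second per channel
BITS_PER_SAMPLE = 20
COMPRESSED_BPS = 384_000     # compressed bit rate quoted in the text

raw_bps = CHANNELS * SAMPLE_RATE_HZ * BITS_PER_SAMPLE
compression_ratio = raw_bps / COMPRESSED_BPS

print(f"raw PCM: {raw_bps / 1e6:.2f} Mb/s")
print(f"compression ratio: {compression_ratio:.0f}:1")
```

Under these assumptions the raw stream works out to 5.76 Mb/s, so the compressed stream is about one-fifteenth of its size.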
Dolby Digital, officially known
as AC-3 (Adaptive Transform Coder 3),
is the compression scheme used to
transmit audio within the ATSC DTV
data stream. It can represent up to five
full-bandwidth (20 Hz–20 kHz)
channels of surround sound (Right
Front, Center, Left Front, Right Rear,
and Left Rear), along with one low-frequency channel (20 Hz–120 Hz) for
subwoofer driven effects. This is often
referred to as 5.1 surround sound.
A complete description of the