[M4IF Technotes] Leading frames in AAC streams
Ralph Neff
neff PacketVideo.COM
Fri Jun 6 13:01:01 EDT 2003
Hi all,
I'd like to ask about a problem we've been seeing which affects
synchronization between video and AAC audio. Some content
authors add 1 or 2 empty frames at the beginning of AAC
bitstreams, but do not compensate for these frames in the
A/V timeline. At higher sampling rates, this doesn't cause any
problems (since each frame has such a short duration, the extra
frames don't noticeably affect the sync). However, when you go
down to the lowest sampling rates, the effect is noticeable
(e.g. at 8 kHz, two 1024-sample frames create an unwanted 256 ms audio delay).
Some decoders seem to compensate for this -- i.e. they introduce
a time offset (effectively removing these empty 'leading frames' from
the timeline). Some decoders do not. The result is that files
authored with 8 kHz AAC audio seem to play in sync on some decoders,
and are noticeably out of sync on others.
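
For reference, the delay figure quoted above can be reproduced with a short calculation. This is only a sketch: it assumes 1024 samples per AAC frame (the usual AAC-LC frame length), and the function name is made up for illustration.

```python
# Assumption: 1024 PCM samples per AAC frame (typical for AAC-LC).
AAC_FRAME_SAMPLES = 1024


def leading_frame_delay_ms(sample_rate_hz, leading_frames,
                           frame_samples=AAC_FRAME_SAMPLES):
    """Delay, in milliseconds, added by empty frames at the start of a stream."""
    return 1000.0 * leading_frames * frame_samples / sample_rate_hz


# Compare the delay from two leading frames at several sampling rates.
for rate in (8000, 16000, 44100, 48000):
    print(f"{rate} Hz: {leading_frame_delay_ms(rate, 2):.1f} ms")
```

At 8 kHz the two frames cost 256 ms, while at 48 kHz the same two frames cost under 45 ms, which is why the problem mostly shows up at the low rates.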
What is the correct behavior? Our audio experts couldn't find any
special treatment of these leading frames in the MPEG-4 audio
or systems/file format specs. So it seems the right thing to do
is to respect the time line in the file format (i.e. render the audio
frames at their proper sample times, not adding any special
offsets). This means that if an offset is required for proper sync,
then it must be explicitly indicated in the file format (e.g. via an
editlist atom).
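
To make the editlist option concrete, here is a rough sketch of the values an authoring tool might put in a single edit list entry to skip the leading frames. The class and function names are hypothetical, the field names follow the usual edit list entry layout (segment duration in movie timescale units, media time in media timescale units), and 1024 samples per frame is again an assumption.

```python
from dataclasses import dataclass


@dataclass
class EditListEntry:
    """Hypothetical container for one edit list entry."""
    segment_duration: int  # length of the edit, in movie timescale units
    media_time: int        # start point in the media, in media timescale units
    media_rate: int = 1    # normal playback rate


def skip_leading_frames_edit(leading_frames, total_frames, sample_rate_hz,
                             media_timescale, movie_timescale,
                             frame_samples=1024):
    """Build an entry that starts presentation after the empty leading frames."""
    # Media time at which the first real audio frame begins.
    media_time = leading_frames * frame_samples * media_timescale // sample_rate_hz
    # Duration of the remaining audio, expressed in the movie timescale.
    kept_samples = (total_frames - leading_frames) * frame_samples
    segment_duration = kept_samples * movie_timescale // sample_rate_hz
    return EditListEntry(segment_duration, media_time)


# Example: 8 kHz audio, media timescale 8000, movie timescale 1000,
# 100 frames in the track, of which the first 2 are empty.
entry = skip_leading_frames_edit(2, 100, 8000, 8000, 1000)
print(entry)
```

With an entry like this in the file, a player that simply respects the file-format timeline would drop the 256 ms of leading audio without needing any decoder-specific offset.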
Has anyone run into this problem before? Is the above interpretation
correct, or is there something in the standard that specifies the
treatment of these leading frames? It seems that quite a few
implementations out there are doing this special treatment (i.e. always
adding the frames at authoring time and always compensating
for them at decode/render time, even though there is no editlist atom to
indicate the need for such adjustment).
-Ralph
Ralph Neff * Packetvideo * www.pv.com
neff pv.com * phone: 858-731-5408 * fax: 858-731-5311