[Mp4-tech] Re: Leading frames in AAC streams
Ralph Sperschneider
sps iis.fhg.de
Fri Sep 12 16:53:17 EDT 2003
Dear Ralph,
Thanks for pointing this issue out in detail. MPEG is aware of this ambiguity
and is working toward a proper solution. For the moment, the issue is being
investigated in more detail by the Systems and Audio subgroups. The Audio report
from the 65th MPEG meeting says the following:
"
Since the AAC decoder is not adequately specified in terms of processing of
stored state (i.e. the first half of the overlapped MDCT window), there is
uncertainty as to the correct value of the time stamp associated with decoded
AAC blocks.
The solution suggested by the group is that the encoder shall operate such that
a timestamp assigned to a compressed Audio block is associated with first sample
of the output block produced by decoder when processing that block.
Additionally, in the interest of facilitating deterministic (and highest
possible quality) editing of AAC bitstreams, MPEG should consider how to specify
the following:
- Normative signaling of pre-roll within an MPEG-4 coded stream
- Normative table of pre-roll times or blocks (e.g. for each audioObjectType)
"
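
As a rough illustration of the arithmetic behind the 256 ms figure in the
quoted message below, here is a minimal Python sketch. It is not part of the
MPEG report. It assumes 1024 samples per AAC frame (which is what the 256 ms
figure at 8 kHz implies), and the function names are hypothetical, chosen only
for this example.

```python
# Sketch only: assumes 1024 samples per AAC frame, which matches the
# "two frames = 256 ms at 8 kHz" figure in the quoted message.
AAC_FRAME_SAMPLES = 1024


def leading_frame_delay_ms(num_frames, sample_rate_hz,
                           frame_samples=AAC_FRAME_SAMPLES):
    """Duration in milliseconds of num_frames leading frames."""
    return 1000.0 * num_frames * frame_samples / sample_rate_hz


def presentation_time_s(frame_index, sample_rate_hz, leading_frames=0,
                        frame_samples=AAC_FRAME_SAMPLES):
    """Time in seconds of the first output sample of a frame, if the
    leading frames are removed from the timeline (hypothetical helper)."""
    return (frame_index - leading_frames) * frame_samples / sample_rate_hz


if __name__ == "__main__":
    # Two leading frames at a few common sampling rates.
    for rate in (8000, 16000, 44100, 48000):
        delay = leading_frame_delay_ms(2, rate)
        print(f"{rate:>6} Hz: {delay:7.2f} ms")
```

With two leading frames, the offset is 256 ms at 8 kHz but under 43 ms at
48 kHz, which is why the mismatch is mainly noticed at low sampling rates.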
Best regards,
Ralph
Ralph Neff wrote:
> Hi all,
>
> I'd like to ask about a problem we've been seeing which affects
> synchronization between video and AAC audio. Some content
> authors add 1 or 2 empty frames at the beginning of AAC
> bitstreams, but do not compensate for these frames in the
> A/V timeline. At higher sampling rates, this doesn't cause any
> problems (since each frame has such a short duration, the extra
> frames don't noticeably affect the sync). However, when you go
> down to the lowest sampling rates, the effect is noticeable
> (e.g. at 8 kHz, two frames create an unwanted 256 ms audio delay).
>
> Some decoders seem to compensate for this -- i.e. they introduce
> a time offset (effectively removing these empty 'leading frames' from
> the timeline). Some decoders do not. The result is that files
> authored with 8 kHz AAC audio seem to play in-sync on some decoders,
> and are a bit off on others.
>
> What is the correct behavior? Our audio experts couldn't find any
> special treatment of these leading frames in the MPEG-4 audio
> or systems/file format specs. So it seems the right thing to do
> is to respect the time line in the file format (i.e. render the audio
> frames at their proper sample times, not adding any special
> offsets). This means that if an offset is required for proper sync,
> then it must be explicitly indicated in the file format (e.g. via an
> editlist atom).
>
> Has anyone run into this problem before? Is the above interpretation
> correct, or is there something in the standard that specifies the
> treatment of these leading frames? It seems that quite a few
> implementations out there are doing this special treatment (i.e. always
> adding the frames at authoring time and always compensating
> for them at decode/render time, even though there is no editlist atom to
> indicate the need for such adjustment).
>
> -Ralph
>
> Ralph Neff * Packetvideo * www.pv.com
> neff pv.com * phone: 858-731-5408 * fax: 858-731-5311
--
Dipl.-Ing. Ralph Sperschneider | Phone: +49 9131 776 344
FhG IIS | Fax: +49 9131 776 398
Am Wolfsmantel 33 | mailto:sps iis.fhg.de
D 91058 Erlangen | http://www.iis.fhg.de/amm/