[Mp4-tech] Re: [Audio] a few questions with regard to MPEG-2 AAC

Fri Dec 15 16:59:39 EST 2006

Yueshi Shen wrote:
> Hi, Ralph
Hi Yueshi,
>  
> First thank you very much for your very helpful answer, and wish you 
> Merry Christmas.  Does it snow in this season at your place?  In 
> Australia, the temperature goes as high as 35C, and people are busy 
> extinguishing bush fires.
I am aware of this - although it always sounds a little bit strange, 
when I look out of the window where it rains at temperatures around 4°C 
(no snow yet).
>  
> I am still a little bit confused about the usage of adts_buffer_fullness:
>  
> 1) Initially, I think there is something in AAC like Mpeg-2 video's 
> "vbv_delay", which tells the decoder when to pull out an access unit 
> from the buffer.  The vbv_delay can be used as an alternative 
> to PTS/DTS (when they are not present as in Mpv files), as long as the 
> bit rate is constant.
Well, I am not a video expert - hence I will focus on the audio issues.
> However, it seems the adts_buffer_fullness is not used for this 
> purpose, because a) there's no such a "vbv_delay" encoded in AAC frame 
> header, b) the audio decoder will remove an access unit immediately 
> after the access unit is transmitted completely.  Am I correct, and is 
> that what you mean by "the buffer fullness is rather for ... than 
> output's timing"?
>  
> 2) The adts_buffer_fullness tells us the difference between current 
> frame size and average frame size (as well as the history of such a 
> difference).
To be precise: The difference between the values of adts_buffer_fullness 
of two consecutive frames tells the difference between current frame 
size and average frame size. Any further history is out of scope. 
Consequently, when the frame size is constant, the values of 
adts_buffer_fullness are always the same.
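
As a small illustration of this relation, here is a sketch only. It assumes 
the fullness values have already been converted to bits for a single 
channel, and that a frame longer than the average drains the bit reservoir 
by the excess. The function name and the sample values are made up for 
this example.

```python
def frame_lengths_from_fullness(fullness_bits, mean_frame_bits):
    """Recover frame lengths (in bits) from consecutive buffer fullness values.

    Assumed model: fullness[n] = fullness[n-1] + mean_frame_bits - frame_len[n],
    i.e. a frame longer than the average drains the bit reservoir.
    The length of the first frame cannot be recovered (it has no predecessor).
    """
    lengths = []
    for prev, cur in zip(fullness_bits, fullness_bits[1:]):
        lengths.append(mean_frame_bits - (cur - prev))
    return lengths


# Hypothetical fullness values in bits, with a mean frame length of 2048 bits.
fullness = [1000, 1000, 600, 900]
print(frame_lengths_from_fullness(fullness, 2048))  # [2048, 2448, 1748]
```

Equal consecutive values give exactly the mean frame length, which matches 
the constant frame size case.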
>   When constant bit rate, should it be used to predict 
> when transmission of an access unit will finish?  Again, is that what 
> you mean by "the buffer fullness is rather for decoder input buffer 
> control on a constant rate channel"?
It should be used to derive the time to start playback, i.e. the point 
from which, even in the worst case, the decoder input buffer will neither 
run dry (this may happen if you start playback too soon and the encoder 
delivers a rather long frame) nor overflow (this may happen if you start 
playback too late and the encoder delivers a couple of rather short 
frames). The latter issue might be somewhat academic, since the decoder 
input buffer of most implementations is larger than required. Anyhow, 
the delay increases when playback is started later than necessary.
An adts_buffer_fullness of 0 means that the bit reservoir is empty. In 
that case the encoder may only send frames that are equal to or shorter 
than the average in length. Hence, the decoder should start playback 
immediately.
An adts_buffer_fullness of 6144-<mean_frame_length> means that the bit 
reservoir is full. In that case the encoder may spend bits by sending 
frames that are longer than the average. Hence, the decoder needs to wait 
a certain time before it starts playback (until it has received a further 
<adts_buffer_fullness> bits), since otherwise it might not have received 
the next frame completely by the time it needs it.
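
To put a number on that waiting time, here is a minimal sketch. It assumes 
a constant rate channel and a fullness value already expressed in bits; 
the function name and the example bitrate are illustrative only.

```python
def startup_delay_seconds(buffer_fullness_bits, channel_bitrate_bps):
    """Time needed to receive a further buffer_fullness_bits on a constant rate channel."""
    if channel_bitrate_bps <= 0:
        raise ValueError("bitrate must be positive")
    return buffer_fullness_bits / channel_bitrate_bps


# Empty bit reservoir: no extra wait is needed.
print(startup_delay_seconds(0, 64000))     # 0.0
# Hypothetical fullness of 3200 bits on a 64 kbit/s channel.
print(startup_delay_seconds(3200, 64000))  # 0.05
```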
In general, it is all about writing and reading bits in the decoder 
input buffer. In the case of a constant bitrate channel, writing happens 
on a continuous basis. Reading happens burst-wise, where a burst is a 
frame. The frame length may vary, so some care has to be taken that the 
bits to be read are always present, and that the available buffer 
(decoder input buffer) is always capable of storing all the bits it 
receives.
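
The write/read picture can be tried out with a toy simulation. This is a 
sketch under simplifying assumptions, not a description of any real 
decoder: exactly mean_frame_bits arrive per frame period, the decoder 
removes one whole frame per frame period once start_after_bits have 
arrived, and all names and numbers are invented for the example.

```python
def simulate_decoder_buffer(frame_bits, mean_frame_bits, start_after_bits,
                            buffer_size_bits):
    """Return a list of (frame_index, event) problems; empty means a clean run."""
    problems = []
    total = sum(frame_bits)  # the channel cannot deliver more than was encoded
    consumed = 0             # bits already removed by the decoder
    for k, length in enumerate(frame_bits):
        # Continuous writing: bits that have arrived when frame k is due.
        received = min(start_after_bits + k * mean_frame_bits, total)
        occupancy = received - consumed  # bits waiting in the input buffer
        if occupancy > buffer_size_bits:
            problems.append((k, "overflow"))
        if occupancy < length:
            problems.append((k, "underrun"))  # frame k is not yet complete
        consumed += length  # burst-wise reading of one frame
    return problems


frames = [2000, 2600, 2400, 1200, 2040]  # average is 2048 bits
# Start once the first frame plus a further 1000 bits have arrived.
print(simulate_decoder_buffer(frames, 2048, 2000 + 1000, 6144))  # []
# Start right after the first frame: the long second frame is not ready.
print(simulate_decoder_buffer(frames, 2048, 2000, 6144)[0])      # (1, 'underrun')
```

In this model, waiting for the first frame plus the signalled fullness 
keeps the run clean for as long as the encoder's bit reservoir never goes 
below zero.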
>  
> 3) Also, may passing adts_buffer_fullness to decoder have another 
> purpose, which is to synchronise the encoder's and decoder's clocks?  
> If constant bit rate, as adts_buffer_fullness and frame_size are both 
> known to decoder, after the decoder removes an access unit, it can 
> compare its actual buffer fullness with the encoded 
> adts_buffer_fullness so as to figure out whether its clock is too slow 
> or fast.
Clock synchronization is not really the main intention behind 
transmitting adts_buffer_fullness. While it is possible, the transmitted 
value is not precise enough. Hence, synchronization in hardware (e.g. 
using the ISDN bit clock) is preferable.
>  
> 4) Lastly, what is the initial value of adts_buffer_fullness?
There is no such thing as an initial value of adts_buffer_fullness. If 
the encoder starts with a full bit reservoir, the buffer fullness will 
have its maximum value (6144-<mean_frame_length>) in the first frame. 
This is commonly the case, since it permits the encoder to spend more 
bits right from the start if necessary.
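
For a feeling of the numbers, the maximum value can be worked out as 
follows. This is a sketch for a single channel; it assumes 1024 samples 
per AAC frame and the 6144 bit buffer size used above, and the bitrate and 
sampling rate are just example figures.

```python
SAMPLES_PER_FRAME = 1024  # samples per AAC frame
INPUT_BUFFER_BITS = 6144  # buffer size used in the text above (single channel)


def mean_frame_bits(bitrate_bps, sample_rate_hz):
    """Average frame length in bits at a constant bitrate."""
    return bitrate_bps * SAMPLES_PER_FRAME / sample_rate_hz


def max_buffer_fullness_bits(bitrate_bps, sample_rate_hz):
    """Fullness of a completely full bit reservoir: 6144 - mean frame length."""
    return INPUT_BUFFER_BITS - mean_frame_bits(bitrate_bps, sample_rate_hz)


# Example figures: 96 kbit/s at 48 kHz.
print(mean_frame_bits(96000, 48000))           # 2048.0
print(max_buffer_fullness_bits(96000, 48000))  # 4096.0
```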
>  
> Thanks again.
You are welcome. I hope my remarks are of some value.
>  
> Sincerely
> Yueshi

Best regards,
Ralph
>  
> On 12/2/06, Ralph Sperschneider
> <ralph.sperschneider iis.fraunhofer.de> wrote:
>
>     Dear Yueshi,
>
>
>     Yueshi Shen wrote:
>     > 1) crc_check
>     >
>     > In page 42, the spec says "CRC error detection data generated as
>     > described in ISO/IEC 11172-3, subclause 2.4.3.1".
>     > However, in the corresponding section of the MPEG-1 Audio, apart from
>     > the CRC-check calculation algorithm, the protected fields are also
>     > defined (and they differ for Layer I, II, and III). So I am wondering
>     > what is the CRC-check protected range for MPEG-2 AAC?
>     >
>     You might have overlooked the information provided for the
>     adts_error_check():
>
>     adts_error_check() The following bits are protected and fed into the
>     CRC algorithm in order of their appearance:
>     • all bits of adts_fixed_header()
>     • all bits of adts_variable_header()
>     • first 192 bits of any
>       o single_channel_element()
>       o channel_pair_element()
>       o coupling_channel_element()
>       o lfe_channel_element()
>     • First 128 bits of the second individual_channel_stream() in the
>       channel_pair_element() must be protected.
>     • All information in any program_config_element() or
>       data_stream_element() must be protected.
>     For any element where the specified protection length of 128 or 192
>     bits exceeds its actual length, the element is zero padded to the
>     specified protection length for CRC calculation. The id_syn_ele bits
>     shall be excluded from CRC protection. If the length of a CPE is
>     shorter than 192 bits, zero data are appended to achieve the length of
>     192 bits. Furthermore, if the first ICS of the CPE ends at the Nth bit
>     (N<192), the first (192 - N) bits of the second ICS are protected
>     twice, each time in order of their appearance. For example, if the
>     second ICS starts at the 190th bit of CPE, the first 3 bits of the
>     second ICS are protected twice. Finally, if the length of the second
>     ICS is shorter than 128 bits, zero data are appended to achieve the
>     length of 128 bits.
>
>     > 2) adts_buffer_fullness
>     >
>     > In page 45, the mechanism of generating adts_buffer_fullness is
>     > described. I guess it's a similar idea as VBV buffer in MPEG video,
>     > so it's a control mechanism for output's timing.
>     >
>     The buffer fullness is rather for decoder input buffer control on a
>     constant rate channel than for output's timing. The output timing is
>     however affected indirectly, since the decoder has to start the audio
>     output such that under no circumstances an underrun or overrun of the
>     decoder input buffer will occur.
>     >
>     > 3) padding audio frames
>     >
>     > One of the fill_element()'s usages is to keep every audio frame
>     > same length.
>     >
>     No, this is not true. There is no reason to keep every audio frame at
>     the same length. Padding using fill_element()'s is usually only
>     performed on the encoder side when otherwise an underrun of the
>     decoder input buffer would occur.
>
>     Best regards,
>     Ralph
>
>
>     --
>     Dipl.-Ing. Ralph Sperschneider  | Phone: +49 9131 776 344
>     Fraunhofer IIS                  | Fax:   +49 9131 776 398
>     Am Wolfsmantel 33               | mailto:ralph.sperschneider iis.fraunhofer.de
>     D 91058 Erlangen                | http://www.iis.fraunhofer.de/amm/
>

-- 
Dipl.-Ing. Ralph Sperschneider  | Phone: +49 9131 776 344
Fraunhofer IIS                  | Fax:   +49 9131 776 398
Am Wolfsmantel 33               | mailto:ralph.sperschneider iis.fraunhofer.de
D 91058 Erlangen                | http://www.iis.fraunhofer.de/amm/