[Mp4-tech] [H.264] [Systems] Picture timing in absence of SEI messages
Ben Avison
ben.avison tematic.com
Thu Aug 11 20:43:03 ESTEDT 2005
Is there anywhere that defines in temporal terms the behaviour of the HRD
in the absence of picture timing SEI messages in the bitstream? I wonder
if this issue has been lost in the crack between the H.264 spec and the
MPEG-2 systems spec - it could have important consequences for
interoperability of H.264 streams encapsulated in program streams or
transport streams.
To elaborate: version 2 of the MPEG-2 systems spec defines DTS and PTS in
terms of parameters derived from picture timing SEI messages. This mechanism
allows the H.264 encoder to unambiguously inform the multiplexer of all the
information it needs to be able to schedule the bitstream within the
multiplex. However, this does not help when the H.264 stream does not
include picture timing SEI messages - and the majority of current H.264
encoders do not seem to do so.
In the absence of picture timing SEI messages, the only constraint upon
H.264 bitstreams appears to be that they be decodable according to the
bumping process. But compared to traditional codecs, this process can be
"lumpy": there can be times when decode cannot proceed (for example when
a frame is output but it is still marked as used for reference, and the
DPB is full but all other frames in the DPB are either also marked as used
for reference, or have a higher picture order count). And there are times
when multiple frames need to be decoded between the output of two frames
that are consecutive in output order (for example when a frame that follows
an IDR frame in decode order precedes it in output order).
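To illustrate the second kind of lumpiness, here is a rough Python sketch. It assumes one frame is output per frame period in increasing picture order count, ignores DPB capacity and reference marking entirely, and simply counts how many further frames must have been decoded before each output can happen. The function name and the example POC values are made up for illustration; nothing here is taken from the H.264 spec.

```python
def decodes_per_output(pocs_in_decode_order):
    """For each frame in output (POC) order, count how many additional
    frames must be decoded before that frame can be output.

    Assumes unique POC values within one sequence and ignores DPB limits.
    """
    position = {poc: i for i, poc in enumerate(pocs_in_decode_order)}
    decoded = 0  # number of frames decoded so far
    counts = []
    for poc in sorted(pocs_in_decode_order):
        needed = position[poc] + 1  # frames up to and including this one
        counts.append(max(0, needed - decoded))
        decoded = max(decoded, needed)
    return counts


# Hypothetical stream: the first frame (POC 4) is followed in decode
# order by two frames (POC 0 and 2) that precede it in output order.
print(decodes_per_output([4, 0, 2, 8, 6]))  # [2, 1, 0, 2, 0]
```

In this made-up example two frames have to be decoded before the first output, none before the third, and two again before the fourth, which is the uneven decode rate described above.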
I can see at least two ways that this "lumpiness" can be dealt with. One is
to assume that the decoder has about twice as many frame stores available as
is specified by the profile and level; this would allow decoding to proceed
when the DPB would otherwise have been full, and assuming that you had
reached the nominal DPB fullness level before starting output, should also
prevent the need ever to decode more than one frame during the output period
of one frame.
The other approach is to accept that the decode frame rate will be lumpy.
But this leaves an unanswered question of how far apart the DTS values of
the pictures should be when multiple frames need to be decoded within the
output period of one frame.
My gut feeling is that it would be nice to be able to assume the former
scenario, for the sake of smoothing out the data rates, for evening out
the processing load on decoders, and to make the calculation of DTS values
easier and less ambiguous. However, I suspect that this is unlikely to be
supported by the H.264 spec.
The decision about which behaviour the HRD is assumed to have impacts very
much on the scheduling of the bitstream within a multiplex, because the
multiplexer has to ensure that the CPB neither overflows nor underflows, and
that depends upon the time of removal of coded pictures from the CPB (which
is defined to be equivalent to the DTS). This is where the interoperability
issue I mentioned comes into play.
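To show how strongly the CPB behaviour depends on the chosen DTS values, here is a minimal Python sketch of a simplified buffer model. It assumes bits arrive at a constant rate starting at time zero, that each coded picture is removed instantaneously at its DTS, and that DTS values are expressed in 90 kHz ticks. The function name, the parameters and the example numbers are all hypothetical, and the real HRD has more detail than this (for instance, delivery does not simply continue into a full buffer).

```python
TICKS_PER_SECOND = 90_000  # clock rate used for PTS/DTS values


def check_cpb(sizes_bits, dts_ticks, bitrate_bps, cpb_size_bits):
    """Return a list of (picture_index, problem) for a simplified CPB.

    Bits arrive at a constant rate from time zero; each picture is
    removed in one go at its DTS. This is an assumption-laden sketch,
    not the HRD as specified.
    """
    total = sum(sizes_bits)
    removed = 0  # bits already taken out of the buffer
    problems = []
    for index, (size, dts) in enumerate(zip(sizes_bits, dts_ticks)):
        # Bits delivered by this DTS, capped at the size of the stream.
        arrived = min(bitrate_bps * dts // TICKS_PER_SECOND, total)
        fullness = arrived - removed
        if fullness > cpb_size_bits:
            problems.append((index, "overflow"))
        if size > fullness:
            problems.append((index, "underflow"))
        removed += size
    return problems


sizes = [36_000, 9_000, 9_000]  # hypothetical picture sizes in bits

# Evenly spaced DTS values (one picture per 25 Hz frame period).
print(check_cpb(sizes, [3600, 4500, 5400], 900_000, 40_000))  # []

# Same pictures, but the second is decoded soon after the first.
print(check_cpb(sizes, [3600, 3700, 5400], 900_000, 40_000))
# [(1, 'underflow')]
```

The same pictures at the same bit rate pass with evenly spaced DTS values and underflow when two of them are decoded close together, so a multiplexer and a decoder that assume different DTS spacing can disagree about whether a stream is valid.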
Can anyone offer me any advice on this issue?
Thanks,
Ben Avison
--
Ben Avison
Tematic Tel: +44 (0) 1728 727437
3 Signet Court Fax: +44 (0) 1728 727430
Cambridge, CB5 8LA, United Kingdom WWW: http://www.tematic.com/