[Mp4-tech] Regarding padding in MPEG-4 part 2
Gary Sullivan
garysull windows.microsoft.com
Fri Nov 24 17:02:36 EST 2006
Dzung Hoang et al,
Actually it's not as simple as saying that the standard said to do one thing and then was changed to say to do something else. In this case there was actually a lack of clarity and apparent conflict between what different parts of the document (and software) said.
For something like this, the intent is that the newer version should be interpreted as the proper actual *original* intent. This is not quite the same as saying that the new version *obsoletes* the old one.
This sort of thing is one reason that companies that implement standards should keep a close eye on what the standardization committee is doing (and considering doing) on an ongoing basis, and should voice their opinions as that work progresses. When we write a standard, we try to write it without errors, but such issues do unavoidably arise.
No, there is no version number in the syntax. I don't think it would make any difference if there were, since (due to the prior state of confusion) it was not clear what an older encoder would expect the decoder to be doing.
Of course, there is an understanding that some people may have made products prior to the corrigendum action in which the problem with the prior version of the standard might manifest itself in the product behavior. It may be advisable for encoders to consider this in their encoder design, such as by using certain encoding techniques that are designed to mitigate or avoid the problem (as I previously advised on this thread).
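As a rough illustration of the first mitigation mentioned earlier in this thread (keeping motion-compensated references away from the bottom and right edges), here is a minimal Python sketch. The function name, its parameters, the integer-sample displacement units, and the one-sample allowance for half-sample interpolation are all assumptions made for illustration; none of them come from the standard or the reference software. It checks only the visible part of a block, on the assumption that samples outside the nominal picture are never displayed.

```python
def reference_is_unambiguous(x, y, dx, dy, block_w, block_h,
                             width, height, half_pel=False):
    """Hypothetical encoder-side check (an illustration, not from the spec).

    (x, y) is the top-left corner of the block, (dx, dy) an integer-sample
    displacement, and (width, height) the nominal picture size, e.g. 170x120.
    Returns True when every reference sample used by the visible part of the
    block lies inside the nominal picture on the right and bottom sides.
    """
    # Only the part of the block inside the nominal picture is displayed.
    visible_w = min(block_w, width - x)
    visible_h = min(block_h, height - y)
    # Assumption: half-sample interpolation reads one extra sample
    # to the right and below.
    extra = 1 if half_pel else 0
    right = x + visible_w - 1 + dx + extra
    bottom = y + visible_h - 1 + dy + extra
    return right <= width - 1 and bottom <= height - 1


# Last macroblock of a 170x120 picture, zero displacement: safe.
print(reference_is_unambiguous(160, 112, 0, 0, 16, 16, 170, 120))
# Same block shifted one sample to the right: reaches the disputed area.
print(reference_is_unambiguous(160, 112, 1, 0, 16, 16, 170, 120))
```

The top and left edges need no such check in this sketch, because the two readings differ only in the values used for xdim and ydim, which affect the right and bottom limits.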
Best Regards,
Gary Sullivan
+> -----Original Message-----
+> From: mp4-tech-bounces lists.mpegif.org
+> [mailto:mp4-tech-bounces lists.mpegif.org] On Behalf Of Dzung Hoang
+> Sent: Friday, November 24, 2006 8:19 AM
+> To: mp4-tech lists.mpegif.org
+> Subject: RE: [Mp4-tech] Regarding padding in MPEG-4 part 2
+>
+> So this means that each interpretation is compliant with a different
+> version of the spec. Is there a version number in the bitstream that
+> can be used to properly decode bitstreams compliant with a specific
+> version?
+>
+> Should a newer version of a standard obsolete older versions and render
+> existing bitstreams non-compliant?
+>
+> Regards,
+> - Dzung Hoang
+>
+>
+> -----Original Message-----
+> From: mp4-tech-bounces lists.mpegif.org
+> [mailto:mp4-tech-bounces lists.mpegif.org] On Behalf Of
+> MPEGIF List Admins
+> Sent: Thursday, November 23, 2006 8:56 AM
+> To: mp4-tech lists.mpegif.org
+> Cc: 'Herbert.Thoma.'@lists1.magma.ca
+> Subject: FW: [Mp4-tech] Regarding padding in MPEG-4 part 2
+>
+> another forward of a bounced email.
+>
+> -----Original Message-----
+> From: Herbert Thoma [mailto:tma iis.fhg.de]
+> Sent: Thursday, 23 November 2006 12:59
+> To: Gary Sullivan
+> Cc: "Rickard Sjöberg (KI/EAB)"; mp4-tech lists.mpegif.org;
+> Tung yi-Shin;
+> Yi-Shin Tung; ohm ient.rwth-aachen.de
+> Subject: Re: [Mp4-tech] Regarding padding in MPEG-4 part 2
+>
+> Gary, Yi-Shin, Jens,
+>
+> Please look at the 3rd (2004) edition:
+>
+> The sentence
+> "Note that for rectangular VOP, a reference VOP is defined by
+> video_object_layer_width and video_object_layer_height."
+> in the 2nd edition was changed to
+> "Note that for a rectangular VOP, a reference VOP is defined by
+> video_object_layer_width and video_object_layer_height, extended to
+> a multiple of 16"
+> in the 3rd edition.
+>
+> I guess this should make you happy, Gary :-)
+>
+> Regards,
+> Herbert.
+>
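For readers following the numbers in this thread, "extended to a multiple of 16" can be sketched in Python as below. The helper name and the mb_size parameter are hypothetical, chosen only for illustration.

```python
def extend_to_multiple(size, mb_size=16):
    """Round size up to the next multiple of mb_size (16 for a macroblock)."""
    return (size + mb_size - 1) // mb_size * mb_size


# The example dimensions discussed in this thread.
width, height = 170, 120
print(extend_to_multiple(width), extend_to_multiple(height))  # prints: 176 128
```

With these values, columns 170 to 175 and rows 120 to 127 are the area that is decoded but lies outside the nominal picture.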
+> Gary Sullivan wrote:
+> > Copying Yi-Shin and Jens, who perhaps remember this topic,
+> >
+> > Actually the action I discussed may have happened longer ago than that.
+> > I looked in N6362, and although it touched on some relevant sections
+> > (I think including 7.6.3), I did not quite find an answer to the
+> > question there. I also looked in N3664 and didn't find it there either.
+> > I then looked in the text of the 2nd (2001) edition, and although I got
+> > a bit confused by what it says in various places (I haven't really
+> > studied the topic fully yet), I think I found it.
+> >
+> > What I think I see supports what I said on this thread. At the end of
+> > 7.6.4 in the 2nd edition it says the following:
+> >
+> > xref = MIN ( MAX (xcurr+dx, vhmcsr), xdim+vhmcsr-1 )
+> > yref = MIN ( MAX (ycurr+dy, vvmcsr), ydim+vvmcsr-1 )
+> >
+> > and
+> >
+> > "(ydim, xdim) are the dimensions of the bounding rectangle of the
+> > reference VOP"
+> >
+> > and
+> >
+> > "Note that for rectangular VOP, a reference VOP is defined by
+> > video_object_layer_width and video_object_layer_height."
+> >
+> > Now we know that video_object_layer_width and video_object_layer_height
+> > might not be multiples of 16.
+> >
+> > Based on those equations and quoted sentences, it sounds to me like
+> > padding is applied for any location beyond the rectangle having
+> > (width, height) = (video_object_layer_width, video_object_layer_height).
+> >
+> > This is not something that I am happy about, but I think it is what is
+> > currently written, and I think there was some historical corrigendum
+> > action that changed it to say that.
+> >
+> > There has been a long history of confusion over this issue. I am not
+> > entirely sure that I have read all relevant parts of the spec, but I
+> > don't see how the above quoted statements can be interpreted any other
+> > way.
+> >
+> > Best Regards,
+> >
+> > Gary Sullivan
+> >
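The practical difference between the two readings comes down to which values are used for xdim and ydim in the quoted equations. A minimal Python sketch follows, assuming vhmcsr = vvmcsr = 0 (an assumption made here for the rectangular case, not something stated in this thread) and using a hypothetical function name:

```python
def clamp_reference(xcurr, ycurr, dx, dy, xdim, ydim, vhmcsr=0, vvmcsr=0):
    """Apply the quoted clamping equations to one sample position."""
    xref = min(max(xcurr + dx, vhmcsr), xdim + vhmcsr - 1)
    yref = min(max(ycurr + dy, vvmcsr), ydim + vvmcsr - 1)
    return xref, yref


# A sample at column 168 displaced 4 samples to the right (target column 172).
print(clamp_reference(168, 50, 4, 0, xdim=170, ydim=120))  # nominal size
print(clamp_reference(168, 50, 4, 0, xdim=176, ydim=128))  # extended size
```

With xdim = 170 the reference is clamped to column 169, so the decoder reads a replicated edge sample. With xdim = 176 it reads column 172 as decoded. A decoder and encoder that disagree on this will drift apart whenever such a reference is used.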
+> >
+> > +> -----Original Message-----
+> > +> From: Gary Sullivan
+> > +> Sent: Thursday, November 23, 2006 2:20 AM
+> > +> To: 'Herbert Thoma'
+> > +> Cc: "Rickard Sjöberg (KI/EAB)"; mp4-tech lists.mpegif.org
+> > +> Subject: RE: [Mp4-tech] Regarding padding in MPEG-4 part 2
+> > +>
+> > +> Have you looked at N6362?
+> > +>
+> > +> Best Regards,
+> > +>
+> > +> -Gary Sullivan
+> > +>
+> > +> +> -----Original Message-----
+> > +> +> From: Herbert Thoma [mailto:herbert.thoma iis.fraunhofer.de]
+> > +> +> Sent: Thursday, November 23, 2006 2:11 AM
+> > +> +> To: Gary Sullivan
+> > +> +> Cc: "Rickard Sjöberg (KI/EAB)"; mp4-tech lists.mpegif.org
+> > +> +> Subject: Re: [Mp4-tech] Regarding padding in MPEG-4 part 2
+> > +> +>
+> > +> +> Gary, Rickard,
+> > +> +>
+> > +> +> I am pretty sure that padding from 128x176 is the correct
+> > +> +> interpretation (meaning that the pixels outside of 120x170 but
+> > +> +> inside of 128x176 shall be left as they were decoded).
+> > +> +>
+> > +> +> I remember the discussion in MPEG very well, because I originally
+> > +> +> implemented it the other way in my encoder and decoder and changed
+> > +> +> that after the corrigendum.
+> > +> +>
+> > +> +> I don't know if there are any conformance bitstreams available, but
+> > +> +> I attached a few frames of Foreman cropped to 170x120 and encoded
+> > +> +> with my encoder. (I cannot guarantee that there are actually motion
+> > +> +> vectors in there that test the problem, though.)
+> > +> +>
+> > +> +> Kind regards,
+> > +> +> Herbert.
+> > +> +>
+> > +> +> Gary Sullivan wrote:
+> > +> +> > Rickard et al,
+> > +> +> >
+> > +> +> > For a long time, I was pretty sure that I knew what the answer was,
+> > +> +> > and my interpretation agreed with yours. Certainly that is the way
+> > +> +> > the similar feature works in H.263 (although that fact is not
+> > +> +> > directly relevant to the question at hand, since "motion vectors
+> > +> +> > over picture boundaries" is a non-Baseline feature of H.263 Annex D,
+> > +> +> > while MPEG-4 part 2 only tries to be compatible with the Baseline).
+> > +> +> > I vaguely recall that at one time one implementation of the
+> > +> +> > reference software was doing it one way and the other was doing it
+> > +> +> > the other way, and I believe MPEG eventually approved a corrigendum
+> > +> +> > to Part 2 and a bug fix to the software to clarify it.
+> > +> +> > Unfortunately, I believe the clarification was according to the
+> > +> +> > other interpretation.
+> > +> +> >
+> > +> +> > There was a corrigendum finalized in 2004, which I think
+> > +> +> > corresponded to MPEG output document N6362. I believe this subject
+> > +> +> > was addressed in that corrigendum.
+> > +> +> >
+> > +> +> > If I were making an encoder, I might design it to avoid motion
+> > +> +> > vectors reaching beyond the bottom and right edges of the reference
+> > +> +> > pictures, to be sure that it would work with decoders that used
+> > +> +> > either interpretation.
+> > +> +> >
+> > +> +> > I guess another decent encoder approach would be to use padding in
+> > +> +> > the source for those areas before encoding too, so that the only
+> > +> +> > difference between the two interpretations would be the
+> > +> +> > quantization error. With that approach, a little drift might not
+> > +> +> > produce very bad artifacts.
+> > +> +> >
+> > +> +> > Best Regards,
+> > +> +> >
+> > +> +> > Gary Sullivan
+> > +> +> >
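The source-padding idea above could look roughly like this in Python. This is a sketch under assumptions: the frame is a list of rows of luma samples, the function name is hypothetical, and edge replication is used so that the extension area holds the same values a decoder clamping at the nominal edge would read.

```python
def pad_source(frame, mb_size=16):
    """Extend a frame to multiples of mb_size by replicating edge samples.

    frame is a list of rows, each row a list of sample values.
    """
    height = len(frame)
    width = len(frame[0])
    padded_w = (width + mb_size - 1) // mb_size * mb_size
    padded_h = (height + mb_size - 1) // mb_size * mb_size
    # Replicate the last column of every row to the right.
    rows = [row + [row[-1]] * (padded_w - width) for row in frame]
    # Replicate the last (already widened) row downwards.
    rows += [list(rows[-1]) for _ in range(padded_h - height)]
    return rows


# Tiny example with a 4-sample "macroblock" so the result is easy to read.
for row in pad_source([[1, 2, 3], [4, 5, 6]], mb_size=4):
    print(row)
```

Because the encoder then codes the extension area as close to the replicated values as quantization allows, the samples read under either interpretation should differ only by the coding error in that area.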
+> > +> +> > +> -----Original Message-----
+> > +> +> > +> From: mp4-tech-bounces lists.mpegif.org
+> > +> +> > +> [mailto:mp4-tech-bounces lists.mpegif.org] On Behalf Of
+> > +> +> > +> Rickard Sjöberg (KI/EAB)
+> > +> +> > +> Sent: Wednesday, November 22, 2006 5:36 AM
+> > +> +> > +> To: mp4-tech lists.mpegif.org
+> > +> +> > +> Subject: [Mp4-tech] Regarding padding in MPEG-4 part 2
+> > +> +> > +>
+> > +> +> > +>
+> > +> +> > +> Dear experts,
+> > +> +> > +>
+> > +> +> > +> Assume that a simple profile video stream with height and width of
+> > +> +> > +> 120 and 170 pixels respectively shall be decoded. The bounding
+> > +> +> > +> rectangle of the reference VOP is 128x176. Now my question is
+> > +> +> > +> whether you should pad outside of 120x170 or 128x176 when
+> > +> +> > +> referencing pixels for motion compensation?
+> > +> +> > +>
+> > +> +> > +> The relevant part of the standard is section 7.6.4, I believe
+> > +> +> > +> (I am looking at the 3rd edition, that's ISO/IEC 14496-2:2004, N5515):
+> > +> +> > +>
+> > +> +> > +> The coordinates of a reference sample in the reference VOP,
+> > +> +> > +> (yref, xref) is determined as follows:
+> > +> +> > +> xref = MIN ( MAX (xcurr+dx, vhmcsr), xdim+vhmcsr-1 )
+> > +> +> > +> yref = MIN ( MAX (ycurr+dy, vvmcsr), ydim+vvmcsr-1 )
+> > +> +> > +>
+> > +> +> > +> My interpretation of this is that padding should be done outside of
+> > +> +> > +> 128x176 (this means that the pixels outside of 120x170 but inside
+> > +> +> > +> of 128x176 shall be left as they were decoded). Is this correct?
+> > +> +> > +>
+> > +> +> > +> Is there any conformance bitstream that tests this behaviour?
+> > +> +> > +>
+> > +> +> > +> /
+> > +> +> > +> Rickard Sjoberg
+> > +> +> > +> Ericsson
+> > +> +> > +>
+> > +> +> > +> _______________________________________________
+> > +> +> > +> NOTE: Please use clear subject lines for your posts. Include
+> > +> +> > +> [audio], [video], [systems], [general] or another appropriate
+> > +> +> > +> identifier to indicate the type of question you have.
+> > +> +> > +>
+> > +> +> > +> Note: Conduct on the mailing list is subject to the Antitrust
+> > +> +> > +> guidelines found at
+> > +> +> > +> http://www.mpegif.org/public/documents/vault/mp-out-30042-Antitrust.php
+> > +> +> > +>
+> > +> +> >
+> > +> +>
+> > +> +> --
+> > +> +> Herbert Thoma
+> > +> +> Head of Video Group
+> > +> +> Multimedia Realtime Systems Department
+> > +> +> Fraunhofer IIS
+> > +> +> Am Wolfsmantel 33, 91058 Erlangen, Germany
+> > +> +> Phone: +49-9131-776-323
+> > +> +> Fax: +49-9131-776-399
+> > +> +> email: tma iis.fhg.de
+> > +> +> www: http://www.iis.fhg.de/
+> > +> +>
+> >
+>
+>