[Mp4-tech] Regarding padding in MPEG-4 part 2
Gary Sullivan
garysull windows.microsoft.com
Thu Nov 23 03:49:04 EST 2006
Copying Yi-Shin and Jens, who perhaps remember this topic.
Actually, the action I discussed may have happened longer ago than that. I looked in N6362, and although it touches on some relevant sections (including, I think, 7.6.3), I did not find an answer to the question there. I also looked in N3664 and did not find it there either. I then looked at the text of the 2nd (2001) edition. What it says in various places confused me a bit (I haven't studied the topic fully yet), but I think I found it.
What I think I see supports what I said on this thread. At the end of 7.6.4 in the 2nd edition it says the following:
xref = MIN ( MAX (xcurr+dx, vhmcsr), xdim+vhmcsr-1 )
yref = MIN ( MAX (ycurr+dy, vvmcsr), ydim+vvmcsr-1 )
and
"(ydim, xdim) are the dimensions of the bounding rectangle of the reference VOP"
and
"Note that for rectangular VOP, a reference VOP is defined by video_object_layer_width and video_object_layer_height."
Now we know that video_object_layer_width and video_object_layer_height might not be multiples of 16.
Based on those equations and quoted sentences, it sounds to me like padding is applied for any location beyond the rectangle having (width, height) = (video_object_layer_width, video_object_layer_height).
This is not something that I am happy about, but I think it is what is currently written and I think there was some historical corrigendum action that changed it to say that.
There has been a long history of confusion over this issue. I am not entirely sure that I have read all relevant parts of the spec, but I don't see how the above quoted statements can be interpreted any other way.
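As a rough illustration of the two readings (a sketch only, not text from the standard), the Python snippet below applies the clamp quoted from 7.6.4 to the 170x120 example from this thread: once with (xdim, ydim) taken from video_object_layer_width and video_object_layer_height, and once with the dimensions rounded up to a multiple of 16. The names clamp_ref and mb_aligned are made up for this example, vhmcsr and vvmcsr are assumed to be 0 for a rectangular VOP, and only full-sample positions are considered.

```python
def clamp_ref(curr, delta, start, dim):
    """One coordinate of the 7.6.4 clamp:
    ref = MIN(MAX(curr + delta, start), dim + start - 1)."""
    return min(max(curr + delta, start), dim + start - 1)


def mb_aligned(size, mb=16):
    """Round a dimension up to the next multiple of the macroblock size."""
    return (size + mb - 1) // mb * mb


# Example from the thread: width 170, height 120.
WIDTH, HEIGHT = 170, 120

# Assumption: for a rectangular VOP the reference starts at the origin.
vhmcsr = vvmcsr = 0

# A sample at (xcurr, ycurr) = (168, 118) with a full-sample
# motion vector (dx, dy) = (+6, +5) points at (174, 123).
xcurr, ycurr, dx, dy = 168, 118, 6, 5

# Reading A: xdim/ydim are video_object_layer_width/height.
ref_layer = (clamp_ref(xcurr, dx, vhmcsr, WIDTH),
             clamp_ref(ycurr, dy, vvmcsr, HEIGHT))

# Reading B: xdim/ydim are the macroblock-aligned dimensions.
ref_mb = (clamp_ref(xcurr, dx, vhmcsr, mb_aligned(WIDTH)),
          clamp_ref(ycurr, dy, vvmcsr, mb_aligned(HEIGHT)))

print("layer-size clamp:", ref_layer)
print("macroblock-aligned clamp:", ref_mb)
```

In this example the two readings fetch different reference samples: the first clamps to the last row and column inside 170x120, while the second reaches into the decoded band between 170x120 and 176x128. A decoder and an encoder that disagree on the reading would therefore predict from different samples for such a vector.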
Best Regards,
Gary Sullivan
+> -----Original Message-----
+> From: Gary Sullivan
+> Sent: Thursday, November 23, 2006 2:20 AM
+> To: 'Herbert Thoma'
+> Cc: "Rickard Sjöberg (KI/EAB)"; mp4-tech lists.mpegif.org
+> Subject: RE: [Mp4-tech] Regarding padding in MPEG-4 part 2
+>
+> Have you looked at N6362?
+>
+> Best Regards,
+>
+> -Gary Sullivan
+>
+> +> -----Original Message-----
+> +> From: Herbert Thoma [mailto:herbert.thoma iis.fraunhofer.de]
+> +> Sent: Thursday, November 23, 2006 2:11 AM
+> +> To: Gary Sullivan
+> +> Cc: "Rickard Sjöberg (KI/EAB)"; mp4-tech lists.mpegif.org
+> +> Subject: Re: [Mp4-tech] Regarding padding in MPEG-4 part 2
+> +>
+> +> Gary, Rickard,
+> +>
+> +> I am pretty sure that the padding from 128x176 is the correct
+> +> interpretation (meaning that the pixels outside of 120x170 but inside
+> +> of 128x176 shall be left as they were decoded).
+> +>
+> +> I remember the discussion in MPEG very well, because I originally
+> +> implemented it the other way in my encoder and decoder and changed
+> +> that after the corrigendum.
+> +>
+> +> I don't know if there are any conformance bitstreams available, but I
+> +> attached a few frames of foreman cropped to 170x120 and encoded with
+> +> my encoder. (I cannot guarantee that there are actually motion
+> +> vectors that test the problem in there, though.)
+> +>
+> +> Kind regards,
+> +> Herbert.
+> +>
+> +> Gary Sullivan wrote:
+> +> > Rickard et al,
+> +> >
+> +> > For a long time, I was pretty sure that I knew what the answer was,
+> +> > and my interpretation agreed with yours. Certainly that is the way
+> +> > the similar feature works in H.263 (although that fact is not
+> +> > directly relevant to the question at hand, since "motion vectors
+> +> > over picture boundaries" is a non-Baseline feature of H.263 Annex D,
+> +> > while MPEG-4 part 2 only tries to be compatible with the Baseline).
+> +> > I vaguely recall that at one time one implementation of the
+> +> > reference software was doing it one way and the other was doing it
+> +> > the other way, and I believe MPEG eventually approved a corrigendum
+> +> > to Part 2 and a bug fix to the software to clarify it.
+> +> > Unfortunately, I believe the clarification was according to the
+> +> > other interpretation.
+> +> >
+> +> > There was a corrigendum finalized in 2004, which I think
+> +> > corresponded to MPEG output document N6362. I believe this subject
+> +> > was addressed in that corrigendum.
+> +> >
+> +> > If I were making an encoder, I might design it to avoid motion
+> +> > vectors reaching beyond the bottom and right edges of the reference
+> +> > pictures, to be sure that it would work with decoders that used
+> +> > either interpretation.
+> +> >
+> +> > I guess another decent encoder approach would be to use padding in
+> +> > the source for those areas before encoding too, so that the only
+> +> > difference between the two interpretations would be the
+> +> > quantization error. With that approach, a little drift might not
+> +> > produce very bad artifacts.
+> +> >
+> +> > Best Regards,
+> +> >
+> +> > Gary Sullivan
+> +> >
+> +> > +> -----Original Message-----
+> +> > +> From: mp4-tech-bounces lists.mpegif.org
+> +> > +> [mailto:mp4-tech-bounces lists.mpegif.org] On Behalf Of
+> +> > +> Rickard Sjöberg (KI/EAB)
+> +> > +> Sent: Wednesday, November 22, 2006 5:36 AM
+> +> > +> To: mp4-tech lists.mpegif.org
+> +> > +> Subject: [Mp4-tech] Regarding padding in MPEG-4 part 2
+> +> > +>
+> +> > +>
+> +> > +> Dear experts,
+> +> > +>
+> +> > +> assume that a simple profile video stream with height and
+> +> > +> width of 120 and 170 pixels respectively shall be decoded.
+> +> > +> The bounding rectangle of the reference VOP is 128x176. Now
+> +> > +> my question is whether you should pad outside of 120x170 or
+> +> > +> 128x176 when referencing pixels for motion compensation?
+> +> > +>
+> +> > +> The relevant part of the standard is section 7.6.4, I believe
+> +> > +> (I am looking at the 3rd edition, that is ISO/IEC
+> +> > +> 14496-2:2004, N5515):
+> +> > +>
+> +> > +> The coordinates of a reference sample in the reference VOP,
+> +> > +> (yref, xref) is determined as follows :
+> +> > +> xref = MIN ( MAX (xcurr+dx, vhmcsr), xdim+vhmcsr-1 )
+> +> > +> yref = MIN ( MAX (ycurr+dy, vvmcsr), ydim+vvmcsr-1)
+> +> > +>
+> +> > +> My interpretation of this is that padding should be done
+> +> > +> outside of 128x176 (this means that the pixels outside of
+> +> > +> 120x170 but inside of 128x176 shall be left as they were
+> +> > +> decoded), is this correct?
+> +> > +>
+> +> > +> Is there any conformance bitstream that tests this behaviour?
+> +> > +>
+> +> > +> /
+> +> > +> Rickard Sjoberg
+> +> > +> Ericsson
+> +> > +>
+> +> > +> _______________________________________________
+> +> > +> NOTE: Please use clear subject lines for your posts. Include
+> +> > +> [audio], [video], [systems], [general] or another
+> +> > +> appropriate identifier to indicate the type of question
+> +> > +> you have.
+> +> > +>
+> +> > +> Note: Conduct on the mailing list is subject to the
+> +> > +> Antitrust guidelines found at
+> +> > +> http://www.mpegif.org/public/documents/vault/mp-out-30042-Antitrust.php
+> +> > +>
+> +> >
+> +>
+> +> --
+> +> Herbert Thoma
+> +> Head of Video Group
+> +> Multimedia Realtime Systems Department
+> +> Fraunhofer IIS
+> +> Am Wolfsmantel 33, 91058 Erlangen, Germany
+> +> Phone: +49-9131-776-323
+> +> Fax: +49-9131-776-399
+> +> email: tma iis.fhg.de
+> +> www: http://www.iis.fhg.de/
+> +>