[Mp4-tech] RE: Why spatial prediction in AVC performed in pixel domain?

Tue Sep 4 15:14:12 EDT 2007

Regarding my prior remark about Q15-E-17 and Q15-F-11 that "I believe he had been playing with such concepts before that as well", please refer to the prior documents Q15-C-23 of November 1997 and Q15-D-29 of April 1998.  In addition to proposing (4x4) spatial-domain intra prediction, they also proposed to enable 4x4 motion and to use an integer-based 4x4 transform in conjunction with an in-loop deblocking filter.  In Q15-D-29, these features were combined with multiple-reference picture use (5 reference pictures) as had been proposed in Q15-C-11 of November 1997.  From these contributions I believe one can already see the design of H.264/MPEG-4 Part 10 AVC really beginning to take shape (in hindsight).
Best Regards,
Gary Sullivan
________________________________
From: Shevach Riabtsev [mailto:sriabtsev broadcom.com]
Sent: Wednesday, August 29, 2007 7:35 AM
To: Gary Sullivan
Cc: mp4-tech lists.mpegif.org
Subject: RE: Why spatial prediction in AVC performed in pixel domain?
Thanks Gary
Actually you answered my query. Indeed, if one block is in an Inter MB and the other block belongs to an Intra MB, then no correlation between the blocks is expected in the frequency domain.
I missed this point.
Regards,  Shevach
Broadcom
________________________________
From: Gary Sullivan [mailto:garysull windows.microsoft.com]
Sent: Wednesday, August 29, 2007 4:21 PM
To: Shevach Riabtsev
Cc: mp4-tech lists.mpegif.org
Subject: RE: Why spatial prediction in AVC performed in pixel domain?
Shevach et al,
The Q15 documents should be found at http://ftp3.itu.int/av-arch/video-site.  Look in the folders for 1998.
The measurement you did sounds excessively simplistic.  You did not demonstrate an actual compression system with real syntax design and actual selection of which prediction mode will be used and test the compression performance in the rate-distortion sense.  Just measuring correlation is not enough.
I am pretty confident that we would not be using the spatial prediction scheme if the old frequency-domain prediction scheme, like the one found in H.263 Annex I and MPEG-4 Part 2, had worked better.
Note also that the frequency-domain prediction scheme has a problem if the neighbor is not intra coded.
Best Regards,
Gary Sullivan
________________________________
From: Shevach Riabtsev [mailto:sriabtsev broadcom.com]
Sent: Wednesday, August 29, 2007 4:52 AM
To: Gary Sullivan
Cc: mp4-tech lists.mpegif.org
Subject: RE: Why spatial prediction in AVC performed in pixel domain?
Gary
I compared correlation coefficients among neighboring 4x4 blocks in both the sample domain and the frequency domain (i.e. quantized coefficients).  It appears that the correlation in the frequency domain is stronger than in the pixel domain.
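A measurement of this kind can be sketched roughly as follows. This is a minimal illustration, not the code actually used for the comparison above: the helper names (`pearson`, `blocks_4x4`, `left_neighbour_correlation`) are hypothetical, only the left neighbor is considered, and the picture is assumed to be a list of rows of sample values. For the frequency-domain figure, each block in the grid would first be replaced by its quantized transform coefficients.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0  # correlation is undefined for constant data; report 0
    return sxy / math.sqrt(sxx * syy)

def blocks_4x4(image):
    """Split a picture (list of rows) into a grid of 4x4 blocks."""
    h, w = len(image), len(image[0])
    return [[[row[bx:bx + 4] for row in image[by:by + 4]]
             for bx in range(0, w - 3, 4)]
            for by in range(0, h - 3, 4)]

def left_neighbour_correlation(grid):
    """Correlate co-located values of each block and its left neighbor."""
    cur, left = [], []
    for block_row in grid:
        for b_left, b_cur in zip(block_row, block_row[1:]):
            left.extend(v for r in b_left for v in r)
            cur.extend(v for r in b_cur for v in r)
    return pearson(cur, left)
```

For example, `left_neighbour_correlation(blocks_4x4(image))` gives the sample-domain figure for one picture. Such a number says nothing by itself about rate-distortion performance, since it ignores mode selection and syntax cost.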
The only advantage of sample-domain spatial prediction I can see is the arithmetic precision. Indeed, in the pixel domain all predictors are 8 bits wide, while in the frequency domain the predictors are 12 bits wide.
Where can I obtain the Q15-E-17 and Q15-F-11 proposals? Could you send me a link to these documents?
Regards, Shevach
Broadcom
________________________________
From: Gary Sullivan [mailto:garysull windows.microsoft.com]
Sent: Monday, August 27, 2007 5:46 PM
To: Shevach Riabtsev; mp4-tech lists.mpegif.org
Subject: RE: Why spatial prediction in AVC performed in pixel domain?
Have you actually done any experiments to confirm your belief that the frequency domain works better on noisy material?
Of course, also, not all material is noisy...
Gisle Bjontegaard proposed the basic concept of the spatial prediction as part of his original proposals in 1998 for the H.26L project (Q15-E-17 in July and Q15-F-11 in November).  I believe he had been playing with such concepts before that as well.  I'm rather confident that he would not have proposed such a thing unless it was ordinarily beneficial to compression capability.
Best Regards,
Gary Sullivan
________________________________
From: mp4-tech-bounces lists.mpegif.org [mailto:mp4-tech-bounces lists.mpegif.org] On Behalf Of Shevach Riabtsev
Sent: Monday, August 27, 2007 4:51 AM
To: mp4-tech lists.mpegif.org
Subject: [Mp4-tech] Why spatial prediction in AVC performed in pixel domain?
Dear experts
The spatial prediction in AVC is executed in the pixel (sample) domain, while in MPEG2 and MPEG4 the spatial prediction (actually partial prediction) is performed in the frequency domain.
I think that on noisy material it is beneficial to perform the spatial prediction in the frequency domain, in the same manner as pixel-domain prediction.
Say, for each 4x4 block, DCT and quantization are performed, then the prediction direction is chosen and the residual between the quantized coefficients of the current 4x4 block and those of the neighboring (left and/or top) block is calculated.
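The scheme described above can be sketched as follows. This is a minimal sketch under stated assumptions, not an implementation from any standard: it uses a floating-point orthonormal DCT rather than an integer transform, a single uniform quantizer step (`step=8` is arbitrary), whole-block coefficient prediction from the left or top neighbor, and a sum-of-absolute-residuals cost to pick the direction. All function names are hypothetical.

```python
import math

N = 4  # block size

def dct_matrix(n=N):
    """Orthonormal DCT-II basis (floating-point stand-in for a codec transform)."""
    rows = []
    for k in range(n):
        scale = math.sqrt((1.0 if k == 0 else 2.0) / n)
        rows.append([scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                     for i in range(n)])
    return rows

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def dct2(block):
    """Separable 2-D DCT: C * block * C^T."""
    c = dct_matrix(len(block))
    ct = [list(col) for col in zip(*c)]
    return matmul(matmul(c, block), ct)

def quantize(coeffs, step):
    """Uniform quantization to integer levels."""
    return [[round(v / step) for v in row] for row in coeffs]

def residual(cur_q, pred_q):
    """Coefficient-by-coefficient difference between current and predictor."""
    return [[c - p for c, p in zip(rc, rp)] for rc, rp in zip(cur_q, pred_q)]

def cost(res):
    """Crude stand-in for bit cost: sum of absolute residual levels."""
    return sum(abs(v) for row in res for v in row)

def predict_block(cur, left=None, top=None, step=8):
    """Pick the prediction direction with the cheapest coefficient residual."""
    cur_q = quantize(dct2(cur), step)
    candidates = {"none": [[0] * N for _ in range(N)]}
    if left is not None:
        candidates["left"] = quantize(dct2(left), step)
    if top is not None:
        candidates["top"] = quantize(dct2(top), step)
    best = min(candidates, key=lambda d: cost(residual(cur_q, candidates[d])))
    return best, residual(cur_q, candidates[best])
```

Calling `predict_block(cur, left=left_block, top=top_block)` returns the chosen direction and the residual levels to be entropy coded. Note that the sketch assumes the neighbor's coefficients describe picture content, which holds only when the neighbor is itself intra coded.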
What was the reason for preferring the sample domain over the frequency domain for the spatial prediction?
 Regards,  Shevach
Broadcom