[Mp4-tech] Why only low frequeny is considered in DCT?

Thu Nov 23 02:15:32 EST 2006

Sagar et al,
It sounds like you have been reading some poorly-written tutorial(s)
about how DCT coding works.  I have seen some pretty bad descriptions of
the concepts of DCT coding.  (Hopefully, you won't respond by saying you
learned what you know by reading my papers on the subject. :-)  One
important thing to keep in mind is the difference between what we expect
will happen most of the time and what might possibly occur on some
specific set of worst-case input data.
Actually, when we use DCT coding (e.g., for ordinary JPEG 1 still-image
coding), we ordinarily *do* consider *all* frequency components.  We
transform each block of data and quantize the resulting frequency-domain
coefficients - that includes quantization of both the low-frequency and
high-frequency components.  All of them.
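
As a rough illustration of "transform each block and quantize all of the coefficients", here is a minimal sketch in plain Python. The 8x8 ramp block, the step size of 16, and the helper names (dct_1d, dct_2d, quantize) are assumptions made for this example; a real JPEG coder uses a per-frequency quantization table rather than one step size.

```python
import math

N = 8  # block size assumed for this sketch

def dct_1d(x):
    """Orthonormal DCT-II of a sequence."""
    n = len(x)
    out = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        s = sum(x[i] * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                for i in range(n))
        out.append(scale * s)
    return out

def dct_2d(block):
    """Separable 2-D DCT: transform every row, then every column."""
    rows = [dct_1d(r) for r in block]
    n = len(rows)
    cols = [dct_1d([rows[i][j] for i in range(n)]) for j in range(n)]
    # cols[j][u] holds vertical frequency u at horizontal frequency j
    return [[cols[j][u] for j in range(n)] for u in range(n)]

def quantize(coeffs, step):
    """Same uniform quantizer for every coefficient, low and high frequency alike."""
    return [[math.floor(c / step + 0.5) for c in row] for row in coeffs]

# Hypothetical smooth input: a gentle ramp in both directions
block = [[100 + 2 * i + 3 * j for j in range(N)] for i in range(N)]
coeffs = dct_2d(block)
levels = quantize(coeffs, step=16)

for row in levels:
    print(row)
```

Every one of the 64 coefficients goes through the quantizer; nothing in the code singles out the high frequencies for removal.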
Usually that quantization process involves the application of what is
known as a "mid-tread" scalar quantizer.  In other words, the quantizer
is structured such that one of the selectable output reconstruction
values is exactly equal to 0.  See, for example, the book by Jayant and
Noll for some discussion of such quantizers.
Actually you can still perform effective data compression even if you
use a "mid-rise" quantizer and it will still function just fine for high
bit rate coding.  But such a design would have difficulty operating at
low bit rates, since a mid-rise quantizer tends to have an output
entropy exceeding one bit per sample.
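
The difference between the two quantizer types can be sketched as follows. The Gaussian source, the step size, and the function names are assumptions chosen for illustration only; the entropy is a simple first-order empirical estimate.

```python
import math
import random
from collections import Counter

def mid_tread(x, step):
    """Reconstruction levels ..., -step, 0, +step, ...: zero is an output value."""
    return step * math.floor(x / step + 0.5)

def mid_rise(x, step):
    """Reconstruction levels ..., -step/2, +step/2, ...: zero is not an output value."""
    return step * (math.floor(x / step) + 0.5)

def entropy_bits(symbols):
    """First-order empirical entropy in bits per sample."""
    counts = Counter(symbols)
    total = len(symbols)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

random.seed(0)
# Hypothetical source: zero-mean samples that are small compared to the step size,
# which is roughly the situation at low bit rates
samples = [random.gauss(0.0, 1.0) for _ in range(20000)]
step = 8.0

h_tread = entropy_bits([mid_tread(x, step) for x in samples])
h_rise = entropy_bits([mid_rise(x, step) for x in samples])
print("mid-tread entropy (bits/sample):", h_tread)
print("mid-rise entropy (bits/sample): ", h_rise)
```

With a symmetric source like this one, the mid-rise quantizer still has to signal the sign of every sample, so its output entropy stays near one bit per sample no matter how coarse the step is, while the mid-tread output collapses toward zero bits.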
Anyhow, what happens is that for image and video coding at relatively
low bit rates, when we apply a mid-tread scalar quantizer to the
transform coefficients, we observe that many of the high-frequency
components usually end up with a quantized value of 0.
That doesn't mean that we force them to be zero.  It just means that
this is what tends to happen most of the time.
So when we use the pdf of the quantized transform coefficients to
perform entropy coding of their values, they compress rather well since
many of them tend to be equal to 0 most of the time.
This is a consequence of the autocorrelation properties of image and
video data.  Such data tends to contain more low-frequency content than
high-frequency content most of the time.  Another way to express that is
to say that the input data tends to be highly correlated.
But the coding technology design does not force that.  If you feed a
low-correlation input image to a good JPEG coder, it will still be able
to represent the image (although it might require more bits to code with
reasonable fidelity than a smoother image would).
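
To make the contrast concrete, the sketch below counts how many quantized DCT coefficients are nonzero for a smooth block and for a noise block. The two test blocks, the step size, and the use of the nonzero count as a crude stand-in for bit cost are all assumptions of this example, not part of any standard.

```python
import math
import random

N = 8  # block size assumed for this sketch

def dct_2d(block):
    """Orthonormal 2-D DCT-II computed directly from the definition."""
    def c(k):
        return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)
    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            s = 0.0
            for i in range(N):
                for j in range(N):
                    s += (block[i][j]
                          * math.cos(math.pi * (2 * i + 1) * u / (2 * N))
                          * math.cos(math.pi * (2 * j + 1) * v / (2 * N)))
            out[u][v] = c(u) * c(v) * s
    return out

def count_nonzero(block, step):
    """Number of coefficients a mid-tread quantizer does not map to 0."""
    coeffs = dct_2d(block)
    return sum(1 for row in coeffs for x in row
               if math.floor(abs(x) / step + 0.5) != 0)

random.seed(1)
# Highly correlated input: a gentle ramp
smooth = [[128 + 4 * i + 2 * j for j in range(N)] for i in range(N)]
# Low-correlation input: independent uniform noise
noisy = [[random.randint(0, 255) for _ in range(N)] for _ in range(N)]

step = 16
nz_smooth = count_nonzero(smooth, step)
nz_noisy = count_nonzero(noisy, step)
print("nonzero coefficients, smooth block:", nz_smooth)
print("nonzero coefficients, noisy block: ", nz_noisy)
```

Both blocks pass through exactly the same transform and quantizer. The noisy block simply leaves far more nonzero coefficients for the entropy coder to represent.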
If you want to really understand this stuff, study up on what a KLT is,
and perhaps read some things like the book by Jayant and Noll and
perhaps the paper by Huang and Schultheiss and perhaps the original
papers and books by Rao et al on the development of the DCT.  There are
probably also lots of other places where such things are described well.
Best Regards,
Gary Sullivan
________________________________
	From: mp4-tech-bounces lists.mpegif.org
[mailto:mp4-tech-bounces lists.mpegif.org] On Behalf Of sagar
	Sent: Wednesday, November 22, 2006 9:26 PM
	To: mp4-tech lists.mpegif.org
	Subject: [Mp4-tech] Why only low frequeny is considered in DCT?
	Hi Xperts,
	I wanted to know: when we do DCT (video processing), we only
consider the low-frequency elements. Why?
	DCT is used for taking out spatial redundancies, and DCT is
basically used to decorrelate energy in the image.
	And the other fact is that low frequencies contain high energy.
	If anybody could answer my question, I would be thankful.
	Warm Regards,
	Sagar