[Mp4-tech] Why only low frequency is considered in DCT?

Fri Nov 24 13:58:54 ESTEDT 2006

Hi Anup,
Thank you for the explanation you provided. I'd like to comment on some
of your basic reasons.
2. Why the DCT? 
A. The Karhunen-Loeve is one such transform which decorrelates
information without loss, transforming signal information from the
spatial to the frequency domain. But the Markov chain (since we are
working in discrete time) involved needs a lot of processing. Hence we
make use of the poor lossy cousin, the Type II DCT, in real world image
processing, mainly because of its energy compaction property. 
R: The DCT is not a lossy transform (unless you consider its usage with
integer arithmetic). The KLT is indeed the optimally decorrelating
transform; however, its basis functions depend on, and thus vary with, the
autocorrelation of the input signal. For a time-varying input signal,
you would have to send (some description for reconstruction of) the
basis functions along with the transform coefficients, which leads to
poor compression performance. 
The DCT is an approximation of the KLT in the sense that for some (often
encountered) sources, the decorrelation is nearly as good. Nevertheless,
both the forward and inverse DCT are completely lossless. Thus, it is a
good alternative to the KLT. Both transforms lead to energy compaction,
that is, the major part of the signal energy is concentrated in a few
transform coefficients.
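To make the two claims above concrete (the DCT round trip is lossless, and
the energy ends up in a few coefficients), here is a minimal Python sketch.
It assumes an orthonormal 1-D Type II DCT written out directly from its
definition; the names dct2 and idct2 and the 8-sample ramp are illustrative
choices of mine, not taken from any codec.

```python
import math

def dct2(x):
    """Orthonormal Type II DCT of a 1-D sequence."""
    n = len(x)
    out = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        s = sum(x[i] * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                for i in range(n))
        out.append(scale * s)
    return out

def idct2(c):
    """Inverse of dct2 (an orthonormal Type III DCT)."""
    n = len(c)
    out = []
    for i in range(n):
        s = 0.0
        for k in range(n):
            scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
            s += scale * c[k] * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
        out.append(s)
    return out

# Hypothetical 8-sample row of pixel values: a smooth brightness ramp.
samples = [52.0, 55.0, 61.0, 66.0, 70.0, 73.0, 77.0, 80.0]

coeffs = dct2(samples)
restored = idct2(coeffs)

# Largest round-trip error; only floating-point rounding remains.
max_error = max(abs(a - b) for a, b in zip(samples, restored))

# Fraction of the total energy carried by the first two coefficients.
total_energy = sum(c * c for c in coeffs)
low_energy_share = (coeffs[0] ** 2 + coeffs[1] ** 2) / total_energy

print("max round-trip error:", max_error)
print("energy share of first two coefficients:", low_energy_share)
```

In floating point the reconstruction error is at rounding level, and for this
smooth input nearly all of the energy sits in the DC and first AC
coefficient. Any loss in a real codec comes from the quantization applied
after the transform, not from the transform itself.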
3. Why are the lower frequency components more important? 
A. As described in the answer to the 1st question a transform should
ideally allow us to depict the signal information in the least possible
components. The DCT by its nature compacts the energy of the signal
into the lower frequency components. Since the lowest frequency is 0, the
DC component is of utmost importance. However the DCT being an
approximation to the KLT, we do get components along the other axes.
Hence even some non-zero frequency components could retain some signal
energy. Hence we perform a zig-zag scan to get the higher energy (low
frequency) components so as to preserve maximum information. 
R: The DCT has good energy compaction properties, as stated above. Since
(for image and video coding) it transforms a spatial source signal to
the frequency domain, it depicts an energy distribution in the frequency
domain. The first component represents the DC level (think of this as
the average brightness). The distribution of the energy over the DCT
coefficients depends entirely on the energy distribution within the
source signal. If the source signal has more energy at low spatial
frequencies, the DCT will have most of the energy concentrated in the
first coefficients. If the source signal has a lot of energy at high
spatial frequencies, the DCT will show high values for the higher
coefficients. Furthermore, by using the knowledge that high spatial
frequency information is less visible to the eye than low-frequency
information, we can quantize the high-frequency parts using fewer bits. 
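The two points in this answer can be sketched in a few lines of Python:
where the energy lands depends on the source, and a coarser quantizer step
at high frequencies leaves fewer bits to code. The test signals, the step
table (1 + 2k) and the helper names below are assumptions of mine for
illustration only; real codecs use their own quantization matrices.

```python
import math

def dct2(x):
    """Orthonormal Type II DCT of a 1-D sequence."""
    n = len(x)
    out = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        s = sum(x[i] * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                for i in range(n))
        out.append(scale * s)
    return out

def energy_split(coeffs):
    """Return (low, high): energy in the first and second half of coeffs."""
    half = len(coeffs) // 2
    low = sum(c * c for c in coeffs[:half])
    high = sum(c * c for c in coeffs[half:])
    return low, high

def quantize(coeffs, steps):
    """Uniform quantization: divide each coefficient by its step and round."""
    return [round(c / s) for c, s in zip(coeffs, steps)]

# Two hypothetical zero-mean sources: a slow ramp and a sample-to-sample
# alternation (the highest spatial frequency 8 samples can carry).
smooth = [-7.0, -5.0, -3.0, -1.0, 1.0, 3.0, 5.0, 7.0]
busy = [-7.0, 7.0, -7.0, 7.0, -7.0, 7.0, -7.0, 7.0]

smooth_low, smooth_high = energy_split(dct2(smooth))
busy_low, busy_high = energy_split(dct2(busy))
print("smooth: low", smooth_low, "high", smooth_high)
print("busy:   low", busy_low, "high", busy_high)

# Hypothetical step sizes that grow with the frequency index k.
steps = [1 + 2 * k for k in range(8)]

# Quantize coefficients of equal magnitude to isolate the effect of the step.
levels = quantize([40.0] * 8, steps)
bits = [abs(level).bit_length() for level in levels]  # magnitude bits per level
print("levels:", levels)
print("bits:  ", bits)
```

The ramp puts its energy in the first coefficients and the alternating
signal in the last ones, so the DCT itself does not favour low frequencies.
The preference for low frequencies enters through the step table, which maps
equally large coefficients to smaller integers at higher frequency indices.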
Regards,
Omar
------------------------------------------------------------ 
Dr. ir. O.A. Niamut 
TNO Information and Communication Technology 
Broadband and Voice Solutions (Wireline) 
Brassersplein 2
P.O. Box 5050
2600GB Delft
The Netherlands
Tel :       +31 15 285 72 18
Mobile : +31 6 519 162 42
Fax :      +31 15 28 631 66 
E-mail :  Omar.Niamut tno.nl <mailto:O.A.Niamut telecom.tno.nl>  
This e-mail and its contents are subject to the DISCLAIMER at
http://www.tno.nl/disclaimer/email.html 
------------------------------------------------------------