[Mp4-tech] Re: IPB frames, maybe someone can explain a bit more in detail?

Stefan Goor stefan.goor ucd.ie
Wed Oct 27 15:14:23 EDT 2004


Hi Amy,
Below are some answers / comments to your questions:
>1. How exactly did you deal with I/P/B frame parsing? Actually I am 
>dealing with a raw encoded stream without RTP encapsulation.
I, P and B VOPs can be identified by the occurrence of the byte-aligned 
VOP start code 0x000001B6 (the '0x01B6' suffix after the 0x0000 
prefix) in the MPEG-4 bitstream.  Other useful start codes are 
0x000001B0 for the start of a visual object sequence and 0x000001B3 for 
the start of a GOV (I think, I can't remember for certain).
When you encounter the VOP start code, the 2 bits immediately following 
it indicate whether the VOP is I, P or B:
00 -> I-VOP
01 -> P-VOP
10 -> B-VOP
11 -> S-VOP (used for sprite coding, but I don't think this is relevant 
for your question; I just included it for completeness).
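
As a rough illustration of the parsing described above, here is a 
minimal Python sketch that scans a raw elementary stream held in memory 
for VOP start codes and reads the 2 type bits that follow.  The names 
find_vops, VOP_START_CODE and VOP_TYPES are made up for this example, 
and it assumes the whole stream fits in a bytes object:

```python
# Hypothetical helper names; only the start code value and the 2-bit
# type table come from the description above.
VOP_START_CODE = b"\x00\x00\x01\xb6"
VOP_TYPES = {0b00: "I", 0b01: "P", 0b10: "B", 0b11: "S"}


def find_vops(data: bytes):
    """Return a list of (byte_offset, vop_type) for each VOP start code."""
    vops = []
    pos = data.find(VOP_START_CODE)
    while pos != -1:
        type_index = pos + len(VOP_START_CODE)
        if type_index >= len(data):
            break  # start code at the very end: the type bits are missing
        # The coding type is the top 2 bits of the byte after the start code.
        coding_type = data[type_index] >> 6
        vops.append((pos, VOP_TYPES[coding_type]))
        pos = data.find(VOP_START_CODE, type_index)
    return vops
```

For a stream on disk, something like 
find_vops(open("stream.m4v", "rb").read()) would give the offset and 
type of every VOP (the file name is only a placeholder).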
>2. When you were doing this experiment, didn't you consider the case 
>where header info (other than VOPs) gets lost? 
In my experiments, I did not consider the case where the header was 
lost.  In such cases the stream was cancelled and restarted.  However, 
there are techniques available to protect against the loss of headers, 
such as using FEC for header packets, duplicating the header within the 
stream, or transmitting the header out of band, e.g. via RTSP (which 
runs over TCP and so is reliable).
>3. What you did is just filter out errored frames even though only one 
>or two packets of data get lost? Does that cause a high frame error 
>rate? Any better idea to improve PSNR?
The experiments I conducted did not use the error resilience (ER) 
features of MPEG-4 because their performance depends on the particular 
(proprietary) implementation.  Without ER, the reference software 
decoder would frequently crash when incomplete frames were passed to 
it.  Therefore only frames for which all constituent packets had been 
received were passed to the decoder.  If even a single packet was lost, 
all the data for that frame was discarded and the next complete frame 
was passed to the decoder.
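
To make the filtering step concrete, below is a minimal Python sketch 
of the idea, not the code used in the experiments.  It assumes each 
received packet carries a sequence number, a timestamp shared by all 
packets of one frame, and a marker flag on the last packet of a frame 
(roughly what RTP provides).  Packet and complete_frames are 
hypothetical names, and sequence number wraparound is ignored:

```python
from dataclasses import dataclass


@dataclass
class Packet:
    seq: int        # assumed: increments by 1 for every packet sent
    timestamp: int  # assumed: identical for all packets of one frame
    marker: bool    # assumed: True on the last packet of a frame
    payload: bytes


def complete_frames(packets):
    """Return the payloads of frames for which every packet was received."""
    by_seq = {p.seq: p for p in packets}
    first_seq = min(by_seq) if by_seq else None
    groups = {}
    for p in sorted(packets, key=lambda p: p.seq):
        groups.setdefault(p.timestamp, []).append(p)

    frames = []
    for group in groups.values():
        seqs = [p.seq for p in group]
        # No gap inside the frame.
        contiguous = seqs == list(range(seqs[0], seqs[0] + len(seqs)))
        # The last packet of the frame arrived.
        ends_frame = group[-1].marker
        # The first packet arrived: the preceding packet closed a frame
        # (the very first received packet is assumed to start a frame).
        prev = by_seq.get(seqs[0] - 1)
        starts_frame = seqs[0] == first_seq or (prev is not None and prev.marker)
        if contiguous and ends_frame and starts_frame:
            frames.append(b"".join(p.payload for p in group))
    return frames
```

Any frame with a missing first, middle or last packet is dropped as a 
whole, and the decoder only ever sees complete frames.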
This did not have a significant impact on the number of frames lost / 
damaged because the bit rate constraints we used, i.e. >65kbps, meant 
that few frames were larger than the MTU we used, i.e. 1500 bytes for 
a LAN.  Typically only I-VOPs were segmented into multiple packets.
What I have mentioned above applies only to the simple packetisation 
approach discussed in RFC 3016; the other Multi-SL and Multiplexed 
approaches I studied were slightly different.  If you want further 
information about these approaches, let me know.
As for improving the PSNR, one approach might be to examine the effects 
of ER, but you may have to consider a number of implementations because 
competing ER approaches may exhibit contrasting results.
Hope this helps,
Best Regards,
Stefan

