[FFmpeg-devel] [PATCH 1/2] MxPEG decoder

Anatoly Nenashev anatoly.nenashev
Wed Nov 17 08:15:36 CET 2010


On 08.11.2010 13:40, Anatoly Nenashev wrote:
> On 07.11.2010 03:00, Michael Niedermayer wrote:
>> On Sun, Nov 07, 2010 at 01:51:47AM +0300, Anatoly Nenashev wrote:
>>> On 07.11.2010 01:25, Michael Niedermayer wrote:
>>>> On Sat, Nov 06, 2010 at 07:37:40AM +0300, Anatoly Nenashev wrote:
>>>>
>>>>> On 06.11.2010 06:01, Michael Niedermayer wrote:
>>>>>
>>>>>> On Thu, Nov 04, 2010 at 05:39:40PM +0300, Anatoly Nenashev wrote:
>>>>>>
>>>>>>
>>>>>>> On 04.11.2010 03:16, Michael Niedermayer wrote:
>>>>>>>
>>>>>>>
>>>>>>>> On Tue, Nov 02, 2010 at 01:58:21PM +0300, Anatoly Nenashev wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>>      int ff_mjpeg_decode_sof(MJpegDecodeContext *s)
>>>>>>>>>      {
>>>>>>>>>          int len, nb_components, i, width, height, pix_fmt_id;
>>>>>>>>> @@ -342,14 +376,12 @@ int 
>>>>>>>>> ff_mjpeg_decode_sof(MJpegDecodeContext *s)
>>>>>>>>>                  s->avctx->pix_fmt = PIX_FMT_GRAY16;
>>>>>>>>>          }
>>>>>>>>>
>>>>>>>>> -    if(s->picture.data[0])
>>>>>>>>> -        s->avctx->release_buffer(s->avctx,&s->picture);
>>>>>>>>> -
>>>>>>>>> -    s->picture.reference= 0;
>>>>>>>>> -    if(s->avctx->get_buffer(s->avctx,&s->picture)<     0){
>>>>>>>>> -        av_log(s->avctx, AV_LOG_ERROR, "get_buffer() failed\n");
>>>>>>>>> -        return -1;
>>>>>>>>> +    if (s->avctx->codec_id == CODEC_ID_MXPEG&&     
>>>>>>>>> !s->got_picture) {
>>>>>>>>> +        if (mxpeg_allocate_picture(s)<     0) return -1;
>>>>>>>>> +    } else {
>>>>>>>>> +        if (mjpeg_allocate_picture(s)<     0) return -1;
>>>>>>>>>          }
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>> looks like s->picture.reference will b wrong for the first pic
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>> No, s->picture.reference will be always 1 for MxPEG frames. But 
>>>>>>> I've
>>>>>>> reimplemented this code to be more clean. See attachment.
>>>>>>>
>>>>>>>
>>>>>>>> [...]
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>> +static int mxpeg_decode_mxm(MJpegDecodeContext *s, char *buf, 
>>>>>>>>> int len)
>>>>>>>>> +{
>>>>>>>>> +    int32_t  mb_width, mb_height, bitmask_size, i;
>>>>>>>>> +    uint32_t mb_count;
>>>>>>>>> +
>>>>>>>>> +    mb_width  = AV_RL16(&buf[4]);
>>>>>>>>> +    mb_height = AV_RL16(&buf[6]);
>>>>>>>>> +    mb_count = (uint32_t)mb_width*mb_height;
>>>>>>>>> +
>>>>>>>>> +    if (!mb_count || mb_count>     0x7FFFFFF8) {
>>>>>>>>> +        av_log(s->avctx, AV_LOG_ERROR, "MXM wrong macroblocks 
>>>>>>>>> count");
>>>>>>>>> +        return AVERROR(EINVAL);
>>>>>>>>> +    }
>>>>>>>>> +
>>>>>>>>> +    bitmask_size = (int32_t)(mb_count + 7)>>     3;
>>>>>>>>> +
>>>>>>>>> +    if (bitmask_size>     (len - 12)) {
>>>>>>>>> +        av_log(s->avctx, AV_LOG_ERROR, "MXM bitmask is not 
>>>>>>>>> complete");
>>>>>>>>> +        return AVERROR(EINVAL);
>>>>>>>>> +    }
>>>>>>>>> +
>>>>>>>>> +    av_freep(&s->mxctx.comment_buffer);
>>>>>>>>> +    s->mxctx.comment_buffer = buf;
>>>>>>>>> +    s->mxctx.mxm_bitmask = buf + 12;
>>>>>>>>> +
>>>>>>>>> +    if (!s->got_picture&&     s->picture.data[0]) {
>>>>>>>>> +        if (mxpeg_allocate_picture(s)<     0) return -1;
>>>>>>>>> +        s->got_picture = 1;
>>>>>>>>> +    }
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>> allocating pictures in the comment parsing code is not a good 
>>>>>>>> idea IMHO
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>> I don't agree.
>>>>>>> There are 3 types of frames in MxPEG stream:
>>>>>>> 1) clear JPEG which contains SOF, DHT, DQT, SOS  - may be the first
>>>>>>> frame in stream or (in new cameras firmware) may be periodically
>>>>>>> available.
>>>>>>> 2) Pseudo-I frame which contains SOF, DHT, DQT, MXM bitmask (in 
>>>>>>> COM),
>>>>>>> SOS - must be periodically available but doesn't need to contain 
>>>>>>> full
>>>>>>> parts of image.
>>>>>>> 3) P-frame which contains MXM bitmask, SOS and optional DHT, DQT 
>>>>>>> - just
>>>>>>> usual frame.
>>>>>>>
>>>>>>> So where I must allocate picture for the last frame type? I've 
>>>>>>> decided
>>>>>>> to do it in MXM parser.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>> also maybe it would be easier to use reget_buffer() ?
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>> No, I think reget_buffer is a bad idea. We don't need to copy all
>>>>>>> reference frame but only non-changed part of it.
>>>>>>> And we had talking about it in August. It was your idea to remove
>>>>>>> reget_buffer from this code to reject full image copying.
>>>>>>>
>>>>>>>
>>>>>> both methods have their advantages and disadvantages.
>>>>>> i dont care which is implemented but it should be done cleanly
>>>>>> allocating the frame in the comment parsing code is not ok.
>>>>>> Allocating a picture currently implicates that a SOF has been 
>>>>>> successfully
>>>>>> parsed as the later code checks for this.
>>>>>> Simply allocating a picture without successfully running the SOF 
>>>>>> parsing but
>>>>>> maybe having it run just to the middle where it failed could lead to
>>>>>> exploitable security issues
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>> As I have specified before there are frames which don't contain SOF
>>>>> marker. I've named them P-frames.
>>>>> If we have got Pseudo-I frame before then new picture for P-frame 
>>>>> will
>>>>> be allocated according to previously parsed params.
>>>>>
>>>> you dont know if the previous params are valid. Its possible that 
>>>> frame
>>>> 1 had a correct SOF and frame 2 failed in the middle of SOF parsing 
>>>> and so
>>>> you have a mix a values now frame 3 has no SOF and uses this mix, this
>>>> looks bad to me
>>>>
>>>> Iam not sure how to best solve it but i dont think we can just pick 
>>>> a random
>>>> solution and hope it wont be exploitable
>>>>
>>>>
>>> By the way, we have some information about picture parameters in 
>>> P-frames.
>>> MXM data contains width and height of picture in macroblocks. So we can
>>> check if them is the same as in previously parsed SOF.
>>> In this case SOS parser is a unique place in which there can be 
>>> problems
>>> with exploits. But we also can have same problems here with broken
>>> streams in usual JPEG decoder. So what's the difference? But if you
>> the difference is that if SOF fails or there is no SOF then 
>> got_picture=0
>> and later code will not continue
>>
>>
>
> I think I've found a solution for this issue. If input packet doesn't 
> contain SOF data then the new picture is allocated from 
> reference_picture which is initiated at decode_frame end. Thus 
> reference_picture is always good. For more details see attachment.

Ping.



More information about the ffmpeg-devel mailing list