[FFmpeg-devel] [PATCH] Allow mpjpeg demuxer to process MIME parts which do not include Content-Length header.

Alex Agranovsky alex at sighthound.com
Tue Nov 24 21:01:28 CET 2015



-- 
Alex Agranovsky
Sighthound, Inc
www.sighthound.com		

On November 24, 2015 at 12:36:30 PM, wm4 (nfxjfg at googlemail.com) wrote:

On Tue, 24 Nov 2015 11:39:07 -0500  
Alex Agranovsky <alex at sighthound.com> wrote:  

> On November 24, 2015 at 10:32:47 AM, wm4 (nfxjfg at googlemail.com) wrote:  
>  
> On Tue, 24 Nov 2015 00:16:06 -0500  
> Alex Agranovsky <alex at sighthound.com> wrote:  
>  
> > From a2a0b9e0da14b6e82aa783535ec1878c8ffbede0 Mon Sep 17 00:00:00 2001  
> > From: Alex Agranovsky <alex at sighthound.com>  
> > Date: Tue, 24 Nov 2015 00:06:14 -0500  
> > Subject: [PATCH 1/2] Allow mpjpeg demuxer to process MIME parts which do not  
> > include Content-Length header.  
>> > Fixes ticket 5023  
>> > Signed-off-by: Alex Agranovsky <alex at sighthound.com>  
> > ---  
> > libavformat/mpjpegdec.c | 176 ++++++++++++++++++++++++++++++++++++------------  
> > 1 file changed, 133 insertions(+), 43 deletions(-)  
>> > diff --git a/libavformat/mpjpegdec.c b/libavformat/mpjpegdec.c  
> > index 2749a48..e494f1a 100644  
> > --- a/libavformat/mpjpegdec.c  
> > +++ b/libavformat/mpjpegdec.c  
> > @@ -23,22 +23,16 @@  
> >    
> > #include "avformat.h"  
> > #include "internal.h"  
> > +#include "avio_internal.h"  
> >    
> > -static int get_line(AVIOContext *pb, char *line, int line_size)  
> > -{  
> > - int i = ff_get_line(pb, line, line_size);  
> >    
> > - if (i > 1 && line[i - 2] == '\r')  
> > - line[i - 2] = '\0';  
> >    
> > - if (pb->error)  
> > - return pb->error;  
> > -  
> > - if (pb->eof_reached)  
> > - return AVERROR_EOF;  
> > -  
> > - return 0;  
> > -}  
> > +/** Context for demuxing an MPJPEG stream. */  
>  
> This comment is really not needed.  
> Will be removed in follow up submission.  
>  
>  
>  
> > +typedef struct MPJPEGDemuxContext {  
> > + char* boundary;  
> > + char* searchstr;  
> > + int searchstr_len;  
> > +} MPJPEGDemuxContext;  
> >    
> >    
> > static void trim_right(char* p)  
> > @@ -46,13 +40,32 @@ static void trim_right(char* p)  
> > char *end;  
> > if (!p || !*p)  
> > return;  
> > - end = p + strlen(p) - 1;  
> > - while (end != p && av_isspace(*end)) {  
> > + int len = strlen(p);  
> > + if ( !len )  
> > + return;  
> > + end = p + len - 1;  
> > + while (end >= p && av_isspace(*end)) {  
>  
> Why this change? Note that strlen(p)>0 is already guaranteed by the  
> "!*p" check before.  
>  
>  
> Consider input of a single ‘\r’. It will never get trimmed with the old code.  

I don't really see how most of the changes fix this case. The only  
change that does anything is replacing the != operator with >= . Which  
is invalid as I see just now: it would set end to p-1, which AFAIK is  
undefined behavior.  


Setting end to p-1 would force us to exit the loop, no? We’re just comparing two pointer values, without dereferencing them.
I do see that length check is redundant with !*p, so removing that.

> > *end = '\0'; 
> > end--; 
> > } 
> > } 
> >   
> > +static int get_line(AVIOContext *pb, char *line, int line_size) 
> > +{ 
> > + ff_get_line(pb, line, line_size); 
> > + 
> > + if (pb->error) 
> > + return pb->error; 
> > + 
> > + if (pb->eof_reached) 
> > + return AVERROR_EOF; 
> > + 
> > + trim_right(line); 
> > + return 0; 
> > +} 
> > + 
> > + 
> > + 
> > static int split_tag_value(char **tag, char **value, char *line) 
> > { 
> > char *p = line; 
> > @@ -86,12 +99,24 @@ static int split_tag_value(char **tag, char **value, char *line) 
> > return 0; 
> > } 
> >   
> > -static int parse_multipart_header(AVIOContext *pb, void *log_ctx); 
> > +static int parse_multipart_header(AVIOContext *pb, 
> > + int* size, 
> > + const char* expected_boundary, 
> > + void *log_ctx); 
> > + 
> > +static int mpjpeg_read_close(AVFormatContext *s) 
> > +{ 
> > + MPJPEGDemuxContext *mpjpeg = s->priv_data; 
> > + av_freep(&mpjpeg->boundary); 
> > + av_freep(&mpjpeg->searchstr); 
> > + return 0; 
> > +} 
> >   
> > static int mpjpeg_read_probe(AVProbeData *p) 
> > { 
> > AVIOContext *pb; 
> > int ret = 0; 
> > + int size = 0; 
> >   
> > if (p->buf_size < 2 || p->buf[0] != '-' || p->buf[1] != '-') 
> > return 0; 
> > @@ -100,7 +125,7 @@ static int mpjpeg_read_probe(AVProbeData *p) 
> > if (!pb) 
> > return 0; 
> >   
> > - ret = (parse_multipart_header(pb, NULL)>0)?AVPROBE_SCORE_MAX:0; 
> > + ret = ( parse_multipart_header(pb, &size, "--", NULL) > 0 ) ? AVPROBE_SCORE_MAX : 0;  
> 
> A stray space got in. 
> Parameters to parse_multipart_header had changed.  

Yes, but the 2 additional spaces after the ( and before the ) are not 
consistent with the rest of the coding style. (Though I admit the 
FFmpeg code has some consistency problems, e.g. the line you changed 
was missing some spaces.) 

While these cosmetic issues are not that important, I think at least 
some effort should be put into not making it look worse. 
Agreed. It takes effort, though, to switch to a specific project’s style, so I apologize for having to go through that.  I’ll try to be more conscious of specific style nuances in the future.





> 
> 
> 
> 
> 
> >   
> > av_free(pb); 
> >   
> > @@ -110,17 +135,19 @@ static int mpjpeg_read_probe(AVProbeData *p) 
> > static int mpjpeg_read_header(AVFormatContext *s) 
> > { 
> > AVStream *st; 
> > - char boundary[70 + 2 + 1]; 
> > + char boundary[70 + 2 + 1] = {0}; 
> > int64_t pos = avio_tell(s->pb); 
> > int ret; 
> >   
> > + do { 
> > + ret = get_line(s->pb, boundary, sizeof(boundary)); 
> > + if (ret < 0) 
> > + return ret; 
> > + } while (!boundary[0]); 
> >   
> > - ret = get_line(s->pb, boundary, sizeof(boundary)); 
> > - if (ret < 0) 
> > - return ret; 
> > -  
> 
> Can you explain why it suddenly has to skip multiple lines? 
> MIME standard, as old as it is, is poorly implemented by many camera manufacturers. In the last few weeks, I have seen things ranging from not sending a proper boundary, to not including a CRLF after a body part, to including multiples. When we process the headers, it is assumed body part had been consumed, and we need to get to the next non-empty lines. It is solving the same problem as 18f9308e6a96bbeb034ee5213a6d41e0b6c2ae74, just the other manifestation of it. 
> 

OK. 

> 
> 
> 
> > - if (strncmp(boundary, "--", 2)) 
> > + if (strncmp(boundary, "--", 2)) { 
> > return AVERROR_INVALIDDATA; 
> > + }  
> 
> Another change that looks like a stray change. 
> Will remove in follow-up submission. 
> 
> 
> 
> >   
> > st = avformat_new_stream(s, NULL); 
> > if (!st) 
> > @@ -147,28 +174,44 @@ static int parse_content_length(const char *value) 
> > return val; 
> > } 
> >   
> > -static int parse_multipart_header(AVIOContext *pb, void *log_ctx) 
> > +static int parse_multipart_header(AVIOContext *pb, 
> > + int* size, 
> > + const char* expected_boundary, 
> > + void *log_ctx) 
> > { 
> > char line[128]; 
> > int found_content_type = 0; 
> > - int ret, size = -1; 
> > + int ret; 
> > + 
> > + *size = -1; 
> >   
> > // get the CRLF as empty string 
> > ret = get_line(pb, line, sizeof(line));  
> 
> > - if (ret < 0) 
> > + if (ret < 0) { 
> > return ret; 
> > + }  
> 
> Another stray change? 
> Will remove in follow-up submission. 
> 
> 
> 
> >   
> > /* some implementation do not provide the required 
> > * initial CRLF (see rfc1341 7.2.1) 
> > */ 
> > - if (!line[0]) { 
> > + while (!line[0]) { 
> > ret = get_line(pb, line, sizeof(line)); 
> > - if (ret < 0) 
> > + if (ret < 0) { 
> > return ret; 
> > + } 
> > }  
> 
> Again, why does it have to skip multiple lines now? The comment 
> indicates that skipping 1 line (instead of nothing) is a workaround for 
> some buggy servers. 
> 
> 
> 
> Please see above. 
> 
> >   
> > - if (strncmp(line, "--", 2)) 
> > + if ( strncmp(line, expected_boundary, strlen(expected_boundary) ) ) {  
> 
> (Maybe could use av_strstart. Also, weird spacing.) 
> Going to replace with av_strstart 
> 
> 
> 
> 
> 
> > + if (log_ctx) { 
> > + av_log(log_ctx, 
> > + AV_LOG_ERROR, 
> > + "Expected boundary '%s' not found, instead found a line of %lu bytes\n", 
> > + expected_boundary, 
> > + strlen(line) ); 
> > + } 
> > + 
> > return AVERROR_INVALIDDATA; 
> > + } 
> >   
> > while (!pb->eof_reached) { 
> > char *tag, *value; 
> > @@ -201,32 +244,77 @@ static int parse_multipart_header(AVIOContext *pb, void *log_ctx) 
> > } else 
> > found_content_type = 1; 
> > } else if (!av_strcasecmp(tag, "Content-Length")) { 
> > - size = parse_content_length(value); 
> > - if (size < 0) 
> > - return size; 
> > + *size = parse_content_length(value); 
> > } 
> > } 
> >   
> > - if (!found_content_type || size < 0) { 
> > - return AVERROR_INVALIDDATA; 
> > - } 
> > - 
> > - return size; 
> > + return (found_content_type) ? 0 : AVERROR_INVALIDDATA; 
> > } 
> >   
> > + 
> > static int mpjpeg_read_packet(AVFormatContext *s, AVPacket *pkt) 
> > { 
> > - int ret; 
> > - int size = parse_multipart_header(s->pb, s); 
> > + int size; 
> > + MPJPEGDemuxContext *mpjpeg = s->priv_data; 
> > + if ( mpjpeg->boundary == NULL ) { 
> > + mpjpeg->boundary = av_strdup("--"); 
> > + mpjpeg->searchstr = av_strdup("\r\n--"); 
> > + mpjpeg->searchstr_len = strlen( mpjpeg->searchstr );  
> 
> Missing checks for memory allocation failure. 
> 
> Also, this is the only place where these fields are set. And they're 
> set to constant strings. I see no reason why these fields exist, and 
> they certainly don't need to be strdup'ed? 
> 
> My understanding of this MIME part encoding is very limited, but I 
> thought you need to read the boundary as it appears in the header, and 
> use that to find the end of the data? Was this forgotten? 
> 
> 
> Let me elaborate on this.  
> 1) You are correct — boundary needs to be compared to that provided in HTTP response’s Content-Type.  
> However, current code doesn’t compare it (only using the ‘—‘ as the boundary check), so I have kept things as is for this patch. 
> 2) That functionality ready to be submitted as a separate patch, once this one is cleared. 

Oh I see. Please send the second patch as well then. (It's common to 
send patch series which depend on another at once.) 
Attaching 2nd patch.



> 3) As I’ve mentioned, some implementations fail to send correct boundaries, so even that future patch will have a compile-time 
> flag to enable it, as it will cause regression for some cases. 

It should try to deal with all cases at once, and if it's not possible, 
it should be a runtime option (look for AVOption use in other 
demuxers), not a compile time one. 


Thanks for the tip, implemented using AVOption.

> 4) strdup is done to be consistent with the future patch (where we have to alloc the memory) 
> 5) Sorry about the checks, those will be added in the follow up patch 
> 
> > + } 
> > + 
> > + int ret = parse_multipart_header(s->pb, 
> > + &size, 
> > + mpjpeg->boundary, 
> > + s);  
> 
> Declaration after statements; apparently we don't like this, and it's 
> preferred to put all declarations on top of each block. 
> 
> 
> Correcting. 
> 
> >   
> > - if (size < 0) 
> > - return size; 
> >   
> > - ret = av_get_packet(s->pb, pkt, size); 
> > if (ret < 0) 
> > return ret; 
> >   
> > - return 0; 
> > + if ( size > 0 ) { 
> > + /* size has been provided to us in MIME header */ 
> > + ret = av_get_packet(s->pb, pkt, size); 
> > + } else { 
> > + /* no size was given -- we read until the next boundary or end-of-file */ 
> > + 
> > + const int read_chunk = 2048; 
> > + av_init_packet(pkt); 
> > + pkt->data = NULL; 
> > + pkt->size = 0; 
> > + pkt->pos = avio_tell(s->pb); 
> > + 
> > + /* we may need to return as much as all we've read back to the buffer */ 
> > + ffio_ensure_seekback( s->pb, read_chunk ); 
> > + 
> > + 
> > + int remaining = 0, len; 
> > + 
> > + while ( ( ret = av_append_packet( s->pb, pkt, read_chunk - remaining ) ) >= 0 ) { 
> > + /* scan the new data */ 
> > + len = ret + remaining; 
> > + char* start = pkt->data + pkt->size - len; 
> > + do { 
> > + if ( !memcmp( start, mpjpeg->searchstr, mpjpeg->searchstr_len ) ) {  
> 
> How is it guaranteed that start is valid for at least searchstr_len 
> bytes? 
> 
> 
> with } while ( len >= mpjpeg->searchstr_len ); 
> 
> len decrement occurs for each start increment 
> 
> 
> > + // got the boundary! rewind the stream 
> > + avio_seek( s->pb, -(len-2), SEEK_CUR ); 
> > + pkt->size -= (len-2); 
> > + return pkt->size; 
> > + } 
> > + len--; 
> > + start++; 
> > + } while ( len >= mpjpeg->searchstr_len );  
> 
> Couldn't this use strstr? (Just a thought. Would assume that the 
> boundary itself ca not contain 0 bytes.) 
> 
> 
> We’re operation on binary data, so can’t use string functions. 
> 
> > + remaining = len; 
> > + } 
> > + 
> > + /* error or EOF occurred */ 
> > + if ( ret == AVERROR_EOF ) { 
> > + ret = ( pkt->size > 0 ) ? pkt->size : AVERROR_EOF;  
> 
> Unneeded ( ). 
> Correcting. 
> 
> 
> 
> > + } else { 
> > + av_packet_unref(pkt); 
> > + } 
> > + } 
> > + 
> > + return ret; 
> > }  
> 
> The spacing in this whole function is a bit weird and inconsistent with 
> the rest of the code, I think. 
> 
> 
> Correcting some (mostly spaces after ‘(' and before ‘)' … let me know if anything else has to be done) 
> 

I think tools/patcheck checks for some of these, but not sure. 

> >   
> > AVInputFormat ff_mpjpeg_demuxer = { 
> > @@ -234,7 +322,9 @@ AVInputFormat ff_mpjpeg_demuxer = { 
> > .long_name = NULL_IF_CONFIG_SMALL("MIME multipart JPEG"), 
> > .mime_type = "multipart/x-mixed-replace", 
> > .extensions = "mjpg", 
> > + .priv_data_size = sizeof(MPJPEGDemuxContext), 
> > .read_probe = mpjpeg_read_probe, 
> > .read_header = mpjpeg_read_header, 
> > .read_packet = mpjpeg_read_packet, 
> > + .read_close = mpjpeg_read_close 
> > };  
> 
> _______________________________________________ 
> ffmpeg-devel mailing list 
> ffmpeg-devel at ffmpeg.org 
> http://ffmpeg.org/mailman/listinfo/ffmpeg-devel 

-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: 0001-Allow-mpjpeg-demuxer-to-process-MIME-parts-which-do-.patch.v3.txt
URL: <http://ffmpeg.org/pipermail/ffmpeg-devel/attachments/20151124/0d428d9c/attachment.txt>
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: 0002-If-available-use-the-actual-boundary-in-HTTP-respons.patch.v3.txt
URL: <http://ffmpeg.org/pipermail/ffmpeg-devel/attachments/20151124/0d428d9c/attachment-0001.txt>


More information about the ffmpeg-devel mailing list