[FFmpeg-devel] [PATCH 11/23] lavc/ass_split: fix parsing utf8 scripts

John Stebbins jstebbins at jetheaddev.com
Mon Apr 6 21:27:08 EEST 2020


On Mon, 2020-04-06 at 20:08 +0200, Nicolas George wrote:
> John Stebbins (12020-04-06):
> > The [Script Info] section was skipped if starts with UTF8 BOM
> > ---
> >  libavcodec/ass_split.c | 3 +++
> >  1 file changed, 3 insertions(+)
> > 
> > diff --git a/libavcodec/ass_split.c b/libavcodec/ass_split.c
> > index 67da7c6d84..94c32667af 100644
> > --- a/libavcodec/ass_split.c
> > +++ b/libavcodec/ass_split.c
> > @@ -354,6 +354,9 @@ static int ass_split(ASSSplitContext *ctx, const char *buf)
> >      if (ctx->current_section >= 0)
> >          buf = ass_split_section(ctx, buf);
> >  
> > +    if(!memcmp(buf, "\xef\xbb\xbf", 3)) { // Skip UTF-8 BOM header
> > +        buf += 3;
> > +    }
> 
> This doesn't look correct: BOM should be skipped only at the very
> beginning of the file. And the braces could be skipped.
> 
> >      while (buf && *buf) {
> >          if (sscanf(buf, "[%15[0-9A-Za-z+ ]]%c", section, &c) == 2) {
> >              buf += strcspn(buf, "\n");
> 
> 

Oh, whoops, I missed that ass_split() gets called for a number of things.
I believe the BOM check belongs at the beginning of ff_ass_split() instead?

The sample where I ran into this has a BOM at the beginning of the mkv
private data for the track.

I'll remove the braces...
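
A minimal sketch of the BOM skip discussed above, assuming it is applied
once to the start of the buffer handed to ff_ass_split(). skip_utf8_bom()
is a hypothetical helper name, not an existing libavcodec function, and
strncmp() is used here instead of the patch's memcmp() so that a buffer
shorter than three bytes is not read past its terminating NUL:

```c
#include <string.h>

/*
 * Hypothetical helper (not part of libavcodec): return buf advanced past a
 * leading UTF-8 BOM (bytes EF BB BF), or buf unchanged if none is present.
 */
static const char *skip_utf8_bom(const char *buf)
{
    /* strncmp() stops at the first NUL, so short strings are safe */
    if (buf && !strncmp(buf, "\xef\xbb\xbf", 3))
        buf += 3;
    return buf;
}
```

A caller would do buf = skip_utf8_bom(buf); once, before any section
parsing starts, rather than on every ass_split() call.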


