[FFmpeg-devel] [PATCH] avcodec/webvttdec: Unescape HTML entities
Ricardo
wiiaboo at gmail.com
Thu Oct 8 23:50:49 CEST 2015
That would probably be considered a broken WebVTT file, since "&" needs to
be encoded as "&amp;".
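
For illustration, here is a minimal standalone sketch of table-driven
entity unescaping. It is not the patch itself: the names entity_replace
and unescape_entities are hypothetical, it writes to a plain char buffer
instead of an AVBPrint, and it assumes a lenient policy where a bare "&"
that starts no known entity is copied through unchanged.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the entity entries of webvtt_tag_replace[]. */
static const struct {
    const char *from;
    const char *to;
} entity_replace[] = {
    {"&gt;", ">"}, {"&lt;", "<"},
    {"&amp;", "&"}, {"&nbsp;", " "},
};

/* Copy src into dst (at most dstsize - 1 bytes plus the terminator),
 * decoding the entities listed above. Assumption: a '&' that does not
 * start a known entity is kept as a literal character. */
static void unescape_entities(char *dst, size_t dstsize, const char *src)
{
    const size_t count = sizeof(entity_replace) / sizeof(entity_replace[0]);
    size_t n = 0;

    if (!dstsize)
        return;

    while (*src && n + 1 < dstsize) {
        size_t i;

        for (i = 0; i < count; i++) {
            size_t len = strlen(entity_replace[i].from);

            if (!strncmp(src, entity_replace[i].from, len)) {
                size_t tolen = strlen(entity_replace[i].to);

                if (n + tolen >= dstsize)   /* truncate rather than overflow */
                    tolen = dstsize - 1 - n;
                memcpy(dst + n, entity_replace[i].to, tolen);
                n   += tolen;
                src += len;
                break;
            }
        }
        if (i == count)                     /* no entity matched here */
            dst[n++] = *src++;
    }
    dst[n] = '\0';
}
```

With this policy, "Ben &amp; Jerry" decodes to "Ben & Jerry", and the
unescaped "Hello Ben&Jerry" passes through unchanged. In the quoted patch,
by contrast, an unmatched '&' sets skip = 1, which is the case the review
comment below is about.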
On 8 October 2015 at 20:46, Clément Bœsch <u at pkh.me> wrote:
> On Thu, Oct 08, 2015 at 05:20:52PM +0100, Ricardo Constantino wrote:
> > Also fixes adjacent tags not being parsed correctly.
> >
> > Signed-off-by: Ricardo Constantino <wiiaboo at gmail.com>
> > ---
> > libavcodec/webvttdec.c | 13 +++++++++++--
> > 1 file changed, 11 insertions(+), 2 deletions(-)
> >
> > diff --git a/libavcodec/webvttdec.c b/libavcodec/webvttdec.c
> > index 1284a17..dec4105 100644
> > --- a/libavcodec/webvttdec.c
> > +++ b/libavcodec/webvttdec.c
> > @@ -37,11 +37,14 @@ static const struct {
> > {"<b>", "{\\b1}"}, {"</b>", "{\\b0}"},
> > {"<u>", "{\\u1}"}, {"</u>", "{\\u0}"},
> > {"{", "\\{"}, {"}", "\\}"}, // escape to avoid ASS markup conflicts
> > + {"&gt;", ">"}, {"&lt;", "<"},
> > + {"&lrm;", ""}, {"&rlm;", ""}, // FIXME: properly honor bidi marks
> > + {"&amp;", "&"}, {"&nbsp;", " "},
> > };
> >
> > static int webvtt_event_to_ass(AVBPrint *buf, const char *p)
> > {
> > - int i, skip = 0;
> > + int i, skip, again = 0;
> >
> > while (*p) {
> >
> > @@ -51,13 +54,19 @@ static int webvtt_event_to_ass(AVBPrint *buf, const char *p)
> > if (!strncmp(p, from, len)) {
> > av_bprintf(buf, "%s", webvtt_tag_replace[i].to);
> > p += len;
> > + again = 1;
> > break;
> > }
> > }
> > if (!*p)
> > break;
> > + if (again) {
> > + again = 0;
> > + skip = 0;
> > + continue;
> > + }
> >
> > - if (*p == '<')
> > + if (*p == '<' || *p == '&')
> > skip = 1;
> > else if (*p == '>')
>
> I think you need to make the ';' stop skipping. Otherwise my guess is that
> something like "Hello Ben&Jerry" is going to eat Jerry.
>
> [...]
>
> --
> Clément B.
>