[FFmpeg-devel] [PATCH 1/4] avcodec/strdec: factor out HTML parsing code

Yayoi Ukai yayoi.ukai at gmail.com
Fri Aug 28 08:01:04 CEST 2015


On Sat, Aug 15, 2015 at 11:08 AM, Clément Bœsch <u at pkh.me> wrote:
> On Sat, Aug 08, 2015 at 12:52:04PM -0700, Yayoi Ukai wrote:
> [...]
>> >> -    while (dst->len >= 2 && !strncmp(&dst->str[dst->len - 2], "\\N", 2))
>> >> -        dst->len -= 2;
>> >> -    dst->str[dst->len] = 0;
>> >> -    rstrip_spaces_buf(dst);
>> >
>> > why did you completely remove this chunk?
>>
>> It appeared to me that it didn't do anything even in original code.
>> But I can put it back.
>> (It didn't make any difference in fate test or other simple test
>> whether I commented it out or not.)
>>
>
> This commit is supposed to be factoring out the code, not do any
> functional change. If you want to do such changes it belongs in a separate
> commit (but you probably don't want to because it's likely wrong).
>
> Anyway, I had to have a look to the diff again to check if you hadn't
> added more unwanted changes. And unfortunately, you did. So first of all,
> this following diff I extracted belongs in a separated commit (typically
> squashed in the one where you make SAMI use it, or eventually just
> before).
>
> Here is a review of that part assuming it is extracted/moved:
>
> diff --git a/libavcodec/htmlsubtitles.c b/libavcodec/htmlsubtitles.c
> index a138955..b2f2273 100644
> --- a/libavcodec/htmlsubtitles.c
> +++ b/libavcodec/htmlsubtitles.c
> @@ -71,6 +71,21 @@ void ff_htmlmarkup_to_ass(AVCodecContext *avctx, AVBPrint *dst, const char *in)
>                  end = 1;
>                  break;
>              }
> +
> +            /* check if it is end of the paragraph or not*/
> +            in++;
> +            count = 1;
> +            while(*in == ' ') {
> +                in++;
> +                count++;
> +            }
> +            if (*in == '\0' || *in == '\n'){
> +                in = in - count;
> +                break;
> +            }
> +            in = in - count;
> +
> +            /*if not the end of the paragraph, add line break */
>
> Why? What are you trying to do here that isn't already handled by the code
> already?

Yes removed.

>
> There is no concept of "paragraph" in SubRip markup, so it probably
> doesn't belong here, assuming it's necessary.
>
>              rstrip_spaces_buf(dst);
>              av_bprintf(dst, "\\N");
>              line_start = 1;
> @@ -90,6 +105,15 @@ void ff_htmlmarkup_to_ass(AVCodecContext *avctx, AVBPrint *dst, const char *in)
>                  av_bprint_chars(dst, *in, 1);
>              break;
>          case '<':
> +            if (!av_strncasecmp(in, "<BR", 3)){
>
> here and below, check your style
>
> +                av_bprintf(dst, "\\N");
> +                len = 3;
> +                while (in[len] != '>' && (av_isspace(in[len]) || in[len] == '/')){
>
> if in[len] is a space or a '/', it's obviously different than '>', so the
> condition is redundant.
>
> +                        len++;
>
> wrong indent
>
> +                }
>
> +                in += len + 1;
>
> this +1 is very dangerous, there is a risk of overread.

Well, I was not sure what it could be replaced for checking the closed
'>' for the <br    > tag..
So this one stayed... Please let me know if you have any suggestion.

Thank you!


>
> +            }
> +
>              tag_close = in[1] == '/';
>              len = 0;
>              if (sscanf(in+tag_close+1, "%127[^>]>%n", buffer, &len) >= 1 && len > 0) {
>
> [...]
>
> --
> Clément B.
>
> _______________________________________________
> ffmpeg-devel mailing list
> ffmpeg-devel at ffmpeg.org
> http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
>


More information about the ffmpeg-devel mailing list