[FFmpeg-devel] [PATCH] Fix quadratic memory use in ff_h2645_extract_rbsp() when multiple NALUs exist in packet.

Niki Bowe nbowe at google.com
Tue Oct 24 02:43:55 EEST 2017


On Thu, Oct 19, 2017 at 3:39 PM, Carl Eugen Hoyos <ceffmpeg at gmail.com>
wrote:

> 2017-10-19 20:46 GMT+02:00 Nikolas Bowe <nbowe-at-google.com at ffmpeg.org>:
> > Found via fuzzing.
> > /tmp/poc is a 1 MB mpegts file generated via fuzzing, where 1 packet has many NALUs
> > Before this change:
> >   $ /usr/bin/time -f "\t%M Max Resident Set Size (Kb)"  ./ffprobe /tmp/poc 2>&1 | tail -n 1
> >         2158192 Max Resident Set Size (Kb)
> > After this change:
> >   $ /usr/bin/time -f "\t%M Max Resident Set Size (Kb)"  ./ffprobe /tmp/poc 2>&1 | tail -n 1
> >         1046812 Max Resident Set Size (Kb)
>
> This does not look like a fix for a "quadratic" memory consumption or
> do I misunderstand?
>

Before this patch, the rbsp_buffer for each NALU was sized from the start of
that NALU to the end of the packet, not to the end of the NALU.
Summed over all the NALUs in the packet, the total memory allocated for the
rbsp_buffers is N + (N+x1) + (N+x2) + ... (counting backwards from the last
NALU, where x1 < x2 < ... are the extra bytes covered by each earlier NALU's
buffer).
This is quadratic in the number of NALUs in the packet.
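
The growth described above can be illustrated with a simplified model. This is
only a sketch, not FFmpeg code: the function names and the 500-byte NALU size
are made up for illustration, start codes and padding are ignored, and the
"after" case assumes each buffer is sized to its own NALU, as the explanation
above implies.

```python
# Simplified model of rbsp_buffer sizing; not FFmpeg code.
# Assumption: a packet is just a list of NALU sizes in bytes,
# with no start codes or padding.

def total_before_patch(nalu_sizes):
    """Each NALU's buffer spans from its start to the end of the packet."""
    total = 0
    remaining = sum(nalu_sizes)
    for size in nalu_sizes:
        total += remaining   # buffer covers this NALU and everything after it
        remaining -= size
    return total

def total_after_patch(nalu_sizes):
    """Each NALU's buffer spans only the NALU itself."""
    return sum(nalu_sizes)

if __name__ == "__main__":
    for n in (10, 100, 2000):
        sizes = [500] * n    # n small NALUs of 500 bytes each (made-up size)
        print(n, total_before_patch(sizes), total_after_patch(sizes))
```

With n equal-sized NALUs the "before" total is size * n * (n + 1) / 2, while
the "after" total is size * n.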

The fuzzer's proof of concept file is a bit extreme. It has over 2000 small
NALUs in one packet.
An easier way to trigger this would be to put some small NALUs (perhaps
SEI) in front of large IDRs. Each small NALU added to the front adds roughly
another IDR-sized allocation, so the first one about doubles the total memory
allocated for rbsp_buffers for that packet.
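
A rough sketch of how the total grows when small NALUs are placed in front of
one large IDR, under the same simplified view of the pre-patch sizing. The
sizes and names here are hypothetical, and this is not FFmpeg code.

```python
# Simplified model; not FFmpeg code. Sizes below are made up for illustration.

def rbsp_total_to_packet_end(nalu_sizes):
    """Sum of buffer sizes when each buffer runs from NALU start to packet end."""
    total, remaining = 0, sum(nalu_sizes)
    for size in nalu_sizes:
        total += remaining
        remaining -= size
    return total

IDR_BYTES = 1_000_000   # hypothetical large IDR slice
SEI_BYTES = 100         # hypothetical small SEI NALU

if __name__ == "__main__":
    for k in range(4):
        packet = [SEI_BYTES] * k + [IDR_BYTES]   # k small NALUs, then the IDR
        total = rbsp_total_to_packet_end(packet)
        print(k, total, round(total / IDR_BYTES, 2))
```

In this model every extra small NALU costs about one more IDR-sized buffer.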


> Does the patch have a measurable speed impact?
>
>
Is there a standard set of benchmarks I can run?

For typical videos the speed impact is small, because the NALUs fit in cache,
but for videos with many large NALUs there can be some slowdown.

Here is the decode time for some typical video and some extreme cases,
taking the result with the best user+system time of 3 runs.
Measured via
for i in {1..3}; do /usr/bin/time \
  -f "\t%U User CPU seconds\n\t%S System CPU seconds\n\t%M Max Resident Set Size (Kb)" \
  ./ffmpeg -loglevel quiet -nostats -threads 1 -i "$file" -f null - ; done

Tears of Steel HD
Somewhat typical HD short movie. 728 MB, 8314kb/s
  no patch:
113.69 User CPU seconds
0.60 System CPU seconds
44784 Max Resident Set Size (Kb)
  patch:
112.52 User CPU seconds
0.40 System CPU seconds
44780 Max Resident Set Size (Kb)
  About 1% faster with the patch, going by the numbers above.

Tears of Steel 4k
Somewhat typical high-ish bitrate 4k movie. 73244 kb/s.
  no patch:
682.70 User CPU seconds
2.99 System CPU seconds
104420 Max Resident Set Size (Kb)
  patch:
716.06 User CPU seconds
4.08 System CPU seconds
103632 Max Resident Set Size (Kb)
  5% slower.

random 50 Mbps i-only video I had lying around
  no patch:
421.33 User CPU seconds
1.21 System CPU seconds
28284 Max Resident Set Size (Kb)
  patch:
423.00 User CPU seconds
1.98 System CPU seconds
27300 Max Resident Set Size (Kb)
   <1% slower

foo_200M.ts
10 seconds of /dev/urandom at 4k, encoded at 200Mbps using 2 pass, 2 second
GOP (it actually used P frames even though it's encoding random noise).
This is
  no patch:
11.52 User CPU seconds
0.19 System CPU seconds
68668 Max Resident Set Size (Kb)
  patch:
11.92 User CPU seconds
0.15 System CPU seconds
68656 Max Resident Set Size (Kb)
  3% slower

large_nals.ts
This is an extreme case. I tried to come up with a very bad case for
slowdown. Every packet has very large VCL NALUs.
Generated via
./ffmpeg -f rawvideo -video_size 3840x2160 -pixel_format yuv420p \
  -framerate 30 -i /dev/urandom -t 5 -c:v libx264 -preset ultrafast \
  -crf 0 -g 1 -y large_nals.ts
Each packet in large_nals.ts is close to 20 MB. There are 5 NALUs per packet:
9 (AUD) 7 (SPS) 8 (PPS) 6 (SEI) 5 (IDR slice).
This is of course unrealistically large at over 4 Gbps, but it should be a
decent worst case example.
  no patch:
46.76 User CPU seconds
3.10 System CPU seconds
199148 Max Resident Set Size (Kb)
  patch:
54.42 User CPU seconds
1.79 System CPU seconds
156720 Max Resident Set Size (Kb)
  13% slower when there are very large NALUs (also 21% less memory usage)
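
The percentages can be recomputed from the user+system times listed above. The
following is a small sketch of that arithmetic; the helper names are made up
for illustration and the figures are copied from the measurements in this
message.

```python
# Figures copied from the measurements above, as
# (user CPU seconds, system CPU seconds) for "no patch" and "patch".
TIMES = {
    "Tears of Steel HD": ((113.69, 0.60), (112.52, 0.40)),
    "Tears of Steel 4k": ((682.70, 2.99), (716.06, 4.08)),
    "50 Mbps i-only": ((421.33, 1.21), (423.00, 1.98)),
    "foo_200M.ts": ((11.52, 0.19), (11.92, 0.15)),
    "large_nals.ts": ((46.76, 3.10), (54.42, 1.79)),
}

def slowdown_percent(no_patch, patch):
    """Relative change in user+system time; negative means the patched run was faster."""
    before = sum(no_patch)
    after = sum(patch)
    return 100.0 * (after - before) / before

def memory_saving_percent(rss_no_patch_kb, rss_patch_kb):
    """Relative reduction in max resident set size."""
    return 100.0 * (rss_no_patch_kb - rss_patch_kb) / rss_no_patch_kb

if __name__ == "__main__":
    for name, (no_patch, patch) in TIMES.items():
        print(f"{name}: {slowdown_percent(no_patch, patch):+.1f}%")
    print(f"large_nals.ts RSS saving: {memory_saving_percent(199148, 156720):.1f}%")
```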



> Carl Eugen
> _______________________________________________
> ffmpeg-devel mailing list
> ffmpeg-devel at ffmpeg.org
> http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
>



-- 

Nikolas Bowe |  SWE |  nbowe at google.com |  408-565-5137

