[FFmpeg-devel] Tracking down frame corrupting race conditions on Windows.

Dale Curtis dalecurtis at chromium.org
Fri Apr 20 22:49:13 CEST 2012


All,

We've been running into an issue where the decoder (avcodec_decode_video2) is
returning corrupted frames on Win32; some images showing the corruption:

http://code.google.com/p/chromium/issues/detail?id=120396#c13

The problem is most easily detected when run under thread sanitizer (TSAN)* with
a hash test.  I've created a sample program which simply decodes and hashes
each frame (using the av_sha_* interfaces) to illustrate this:

Without TSAN and 2 threads:
> hash-test.exe bear0.mp4 2
000: a1d99a1a3f6ec3ffbad916ad34e14a69b011ece10505930fea07f5a5cd988a7c
<...>
081: d857354e53a1161e1be083e95e01ad7a12920c5307c37d012c8fa1d3fb784fc2
36be212f0080fe4e74a2187cad8f393fafecf5760f3e88e273fc59d683582ddf

With TSAN and 2 threads:
> tsan\tsan.bat --log-file=tsan.txt -- hash-test.exe bear0.mp4 2
000: e5a0e89c28c8ade300ae572508aeab2f06b59e0da30a20abde926da4003dc680
<...>
081: 85582db265d1f2ec90d32f5642075af81397c0de2d606defb3d704eeca261253
1c746db9b787908287662a302f74de852a682fcb3539abbc204a41671ae66161

With TSAN and 1 thread:
> tsan\tsan.bat --log-file=tsan.txt -- hash-test.exe bear0.mp4 1
000: a1d99a1a3f6ec3ffbad916ad34e14a69b011ece10505930fea07f5a5cd988a7c
<...>
081: d857354e53a1161e1be083e95e01ad7a12920c5307c37d012c8fa1d3fb784fc2
36be212f0080fe4e74a2187cad8f393fafecf5760f3e88e273fc59d683582ddf

Our assumption is that the output from threads == 1 should be indistinguishable
from output with threads > 1.  Some general notes:

   - The issue occurs with at least h264, vp8, and theora files.
   - We have seen no issues under Linux or Mac, which is not to say there are no
     problems.  The Windows version of TSAN runs much slower than the Linux
     or Mac variants and thus may be more susceptible to the issue.
   - Switching to pthreads instead of using w32threads on Windows does not fix
     the problem.
   - Given that the hash under TSAN changes with the number of threads, we
     suspect the problem is a race condition.
   - It's possible TSAN is breaking something fundamental; however, past
     experience with the tool has shown the worst case to be false positives,
     never changes to the behavior of the running program.
   - FATE will fail its hash tests on almost every single test if TSAN is set as
     the target_exec on Windows.

To reproduce the results, the fastest way is to grab the test bundle from here:

   http://commondatastorage.googleapis.com/dalecurtis-shared/hash-test.zip

The bundle includes a precompiled Win32 hash-test executable, hash-test source
code, and TSAN binaries as well as h264, theora, and vp8 test cases.  From there
you just need to add the FFmpeg or LibAV DLL files from the prebuilt servers:

   FFmpeg: http://ffmpeg.zeranoe.com/builds/win32/shared/
   LibAV: http://win32.libav.org/

Then to run the test you just run:

   hash-test.exe <file> <threads>
   tsan\tsan.bat --log-file=tsan.txt -- hash-test.exe <file> <threads>

I highly recommend using --log-file with TSAN; otherwise it will generate a
tremendous number of warnings due to the code's assumption of atomic integers
as well as other, more complicated threading patterns.  However, it's possible
the real issue is lurking in one of those warnings.

If you want to build the test yourself, you'll need to set up MinGW and build
the shared version of FFmpeg/LibAV.  The test can then be compiled from the
MinGW shell with:

   gcc -o hash-test.exe hash-test.c -I. -std=c99 avformat-54.dll avutil-51.dll \
       avcodec-54.dll

acolwell and I have spent a lot of time trying to track down the source of this
corruption on Windows without much success.  We're hoping the community might
have some better ideas on where to look.  Thanks in advance for any assistance!

- dale

*Thread Sanitizer: http://code.google.com/p/data-race-test/wiki/ThreadSanitizer

