[FFmpeg-devel] [PATCH] avutil/crc: always use precalculated CRC tables for known polynomials

Muhammad Faiz mfcc64 at gmail.com
Thu Oct 26 01:54:01 EEST 2017


On Tue, Oct 24, 2017 at 6:05 PM, Michael Niedermayer
<michael at niedermayer.cc> wrote:
> On Mon, Oct 23, 2017 at 10:36:21PM -0300, James Almer wrote:
>> On 10/23/2017 10:24 PM, Michael Niedermayer wrote:
>> > On Tue, Oct 24, 2017 at 01:39:03AM +0100, Mark Thompson wrote:
>> >> On 24/10/17 00:52, Michael Niedermayer wrote:
>> >>> Hi
>> >>>
>> >>> On Mon, Oct 23, 2017 at 04:43:19PM +0100, Mark Thompson wrote:
>> >>>> On 23/10/17 03:56, Michael Niedermayer wrote:
>> >>>>> the initialization should be thread safe as it never writes a different
>> >>>>> value in the same spot
>> >>>>
>> >>>> This is not true; please be very careful with assumptions like this.
>> >>>>
>> >>>> The C standard calls this a data race and it is undefined behaviour.
>> >>>
>> >>> Thats interresting, we definitly do not want undefined behavior
>> >>> can you quote the relevant parts of the C standard
>> >>> which make writing the same value, a data race ?
>> >>>
>> >>> I was not aware of this if its true.
>> >>> Also i dont immedeatly see this in the "latest" draft i am looking at
>> >>> atm
>> >>>
>> >>> What i see is this:
>> >>> 4 Two expression evaluations conflict if one of them modifies a memory location and the
>> >>>   other one reads or modifies the same memory location.
>> >>> ...
>> >>> 25 The execution of a program contains a data race if it contains two conflicting actions in
>> >>>    different threads, at least one of which is not atomic, and neither happens before the
>> >>>    other. Any such data race results in undefined behavior.
>> >>>
>> >>> This seems carefully worded to not include writing the same value.
>> >>> As "modification" implies change which does not occur when writing the
>> >>> same value.
>> >>
>> >> All writes are modifications.
>> >>
>> >> See section 3.1 note 2:
>> >> """
>> >> 3 NOTE 2   "Modify"€™ includes the case where the new value being stored is the same as the previous value.
>> >> """
>> >
>> > That resolves the difference between our interpretation
>> >
>> > thanks
>>
>> Should i push this, or does someone want to give ff_thread_once() a try?
>
> If ff_thread_once() is used outside init code, it should be benchmarked
> as it may be a bad idea speed wise.
> During init speed doesnt matter so ff_thread_once() there would be fine
> but would require more changes to move it to init.

Using my patch:
Benchmark: make fate-lavf-png at pngenc.c:png_write_chunk
old:
 150640 decicycles in av_crc_get_table:first, 1 runs,      0 skips
   1740 decicycles in av_crc_get_table,       1 runs,      0 skips
   1390 decicycles in av_crc_get_table,       2 runs,      0 skips
    945 decicycles in av_crc_get_table,       4 runs,      0 skips
    720 decicycles in av_crc_get_table,       8 runs,      0 skips
    592 decicycles in av_crc_get_table,      16 runs,      0 skips
    536 decicycles in av_crc_get_table,      32 runs,      0 skips
    510 decicycles in av_crc_get_table,      64 runs,      0 skips
    497 decicycles in av_crc_get_table,     128 runs,      0 skips
    494 decicycles in av_crc_get_table,     256 runs,      0 skips
    479 decicycles in av_crc_get_table,     512 runs,      0 skips
    454 decicycles in av_crc_get_table,    1024 runs,      0 skips
    434 decicycles in av_crc_get_table,    2048 runs,      0 skips


new:
 235060 decicycles in av_crc_get_table:first, 1 runs,      0 skips
   2900 decicycles in av_crc_get_table,       1 runs,      0 skips
   2560 decicycles in av_crc_get_table,       2 runs,      0 skips
   1825 decicycles in av_crc_get_table,       4 runs,      0 skips
   1200 decicycles in av_crc_get_table,       8 runs,      0 skips
    902 decicycles in av_crc_get_table,      16 runs,      0 skips
    719 decicycles in av_crc_get_table,      32 runs,      0 skips
    628 decicycles in av_crc_get_table,      64 runs,      0 skips
    576 decicycles in av_crc_get_table,     128 runs,      0 skips
    550 decicycles in av_crc_get_table,     256 runs,      0 skips
    532 decicycles in av_crc_get_table,     512 runs,      0 skips
    499 decicycles in av_crc_get_table,    1024 runs,      0 skips
    470 decicycles in av_crc_get_table,    2048 runs,      0 skips

Thank's.


More information about the ffmpeg-devel mailing list