[FFmpeg-devel] [PATCH] avutil/crc: always use precalculated CRC tables for known polynomials
Muhammad Faiz
mfcc64 at gmail.com
Thu Oct 26 01:54:01 EEST 2017
On Tue, Oct 24, 2017 at 6:05 PM, Michael Niedermayer
<michael at niedermayer.cc> wrote:
> On Mon, Oct 23, 2017 at 10:36:21PM -0300, James Almer wrote:
>> On 10/23/2017 10:24 PM, Michael Niedermayer wrote:
>> > On Tue, Oct 24, 2017 at 01:39:03AM +0100, Mark Thompson wrote:
>> >> On 24/10/17 00:52, Michael Niedermayer wrote:
>> >>> Hi
>> >>>
>> >>> On Mon, Oct 23, 2017 at 04:43:19PM +0100, Mark Thompson wrote:
>> >>>> On 23/10/17 03:56, Michael Niedermayer wrote:
>> >>>>> the initialization should be thread safe as it never writes a different
>> >>>>> value in the same spot
>> >>>>
>> >>>> This is not true; please be very careful with assumptions like this.
>> >>>>
>> >>>> The C standard calls this a data race and it is undefined behaviour.
>> >>>
>> >>> Thats interresting, we definitly do not want undefined behavior
>> >>> can you quote the relevant parts of the C standard
>> >>> which make writing the same value, a data race ?
>> >>>
>> >>> I was not aware of this if its true.
>> >>> Also i dont immedeatly see this in the "latest" draft i am looking at
>> >>> atm
>> >>>
>> >>> What i see is this:
>> >>> 4 Two expression evaluations conflict if one of them modifies a memory location and the
>> >>> other one reads or modifies the same memory location.
>> >>> ...
>> >>> 25 The execution of a program contains a data race if it contains two conflicting actions in
>> >>> different threads, at least one of which is not atomic, and neither happens before the
>> >>> other. Any such data race results in undefined behavior.
>> >>>
>> >>> This seems carefully worded to not include writing the same value.
>> >>> As "modification" implies change which does not occur when writing the
>> >>> same value.
>> >>
>> >> All writes are modifications.
>> >>
>> >> See section 3.1 note 2:
>> >> """
>> >> 3 NOTE 2 "Modify"€™ includes the case where the new value being stored is the same as the previous value.
>> >> """
>> >
>> > That resolves the difference between our interpretation
>> >
>> > thanks
>>
>> Should i push this, or does someone want to give ff_thread_once() a try?
>
> If ff_thread_once() is used outside init code, it should be benchmarked
> as it may be a bad idea speed wise.
> During init speed doesnt matter so ff_thread_once() there would be fine
> but would require more changes to move it to init.
Using my patch:
Benchmark: make fate-lavf-png at pngenc.c:png_write_chunk
old:
150640 decicycles in av_crc_get_table:first, 1 runs, 0 skips
1740 decicycles in av_crc_get_table, 1 runs, 0 skips
1390 decicycles in av_crc_get_table, 2 runs, 0 skips
945 decicycles in av_crc_get_table, 4 runs, 0 skips
720 decicycles in av_crc_get_table, 8 runs, 0 skips
592 decicycles in av_crc_get_table, 16 runs, 0 skips
536 decicycles in av_crc_get_table, 32 runs, 0 skips
510 decicycles in av_crc_get_table, 64 runs, 0 skips
497 decicycles in av_crc_get_table, 128 runs, 0 skips
494 decicycles in av_crc_get_table, 256 runs, 0 skips
479 decicycles in av_crc_get_table, 512 runs, 0 skips
454 decicycles in av_crc_get_table, 1024 runs, 0 skips
434 decicycles in av_crc_get_table, 2048 runs, 0 skips
new:
235060 decicycles in av_crc_get_table:first, 1 runs, 0 skips
2900 decicycles in av_crc_get_table, 1 runs, 0 skips
2560 decicycles in av_crc_get_table, 2 runs, 0 skips
1825 decicycles in av_crc_get_table, 4 runs, 0 skips
1200 decicycles in av_crc_get_table, 8 runs, 0 skips
902 decicycles in av_crc_get_table, 16 runs, 0 skips
719 decicycles in av_crc_get_table, 32 runs, 0 skips
628 decicycles in av_crc_get_table, 64 runs, 0 skips
576 decicycles in av_crc_get_table, 128 runs, 0 skips
550 decicycles in av_crc_get_table, 256 runs, 0 skips
532 decicycles in av_crc_get_table, 512 runs, 0 skips
499 decicycles in av_crc_get_table, 1024 runs, 0 skips
470 decicycles in av_crc_get_table, 2048 runs, 0 skips
Thank's.
More information about the ffmpeg-devel
mailing list