[FFmpeg-devel] [PATCH 09/18] avcodec/hevcdec: Add stat_coeffs to HEVCABACState
Andreas Rheinhardt
andreas.rheinhardt at outlook.com
Sat Jul 2 13:40:20 EEST 2022
Anton Khirnov:
> Quoting Andreas Rheinhardt (2022-07-01 00:29:40)
>> The HEVC decoder has both HEVCContext and HEVCLocalContext
>> structures. The latter is supposed to be the structure
>> containing the per-slicethread state.
>>
>> Yet that is not how it is handled in practice: Each HEVCLocalContext
>> has a unique HEVCContext allocated for it and each of these
>> coincides with the main HEVCContext except in exactly one field:
>> The corresponding HEVCLocalContext.
>> This makes it possible to pass the HEVCContext everywhere where
>> logically a HEVCLocalContext should be used.
>>
>> This led to confusion in the first version of what eventually became
>> commit c8bc0f66a875bc3708d8dc11b757f2198606ffd7:
>> Before said commit, the initialization of the Rice parameter derivation
>> state was incorrect; the fix for single-threaded as well as
>> frame-threaded decoding was to add backup stats to HEVCContext
>> that are used when the cabac state is updated*, see
>> https://ffmpeg.org/pipermail/ffmpeg-devel/2020-August/268861.html
>> Yet due to what has been said above, this does not work for
>> slice-threading, because the each HEVCLocalContext has its own
>> HEVCContext, so the Rice parameter state would not be transferred
>> between threads.
>>
>> This is fixed in c8bc0f66a875bc3708d8dc11b757f2198606ffd7
>> by a hack: It rederives what the previous thread was and accesses
>> the corresponding HEVCContext.
>>
>> Fix this by treating the Rice parameter state the same way
>> the ordinary CABAC parameters are shared between threads:
>> Make them part of the same struct that is shared between
>> slice threads. This does not cause races, because
>> the parts of the code that access these Rice parameters
>> are a subset of the parts of code that access the CABAC parameters.
>>
>> *: And if the persistent_rice_adaptation_enabled_flag is set.
>>
>> Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt at outlook.com>
>> ---
>> libavcodec/hevc_cabac.c | 17 ++++++++---------
>> libavcodec/hevcdec.c | 10 +++++-----
>> libavcodec/hevcdec.h | 10 +++++++---
>> 3 files changed, 20 insertions(+), 17 deletions(-)
>>
>> diff --git a/libavcodec/hevc_cabac.c b/libavcodec/hevc_cabac.c
>> index a194f8a02a..985c97ef2a 100644
>> --- a/libavcodec/hevc_cabac.c
>> +++ b/libavcodec/hevc_cabac.c
>> @@ -453,19 +453,18 @@ void ff_hevc_save_states(HEVCContext *s, int ctb_addr_ts)
>> (ctb_addr_ts % s->ps.sps->ctb_width == 2 ||
>> (s->ps.sps->ctb_width == 2 &&
>> ctb_addr_ts % s->ps.sps->ctb_width == 0))) {
>> - memcpy(s->cabac_state, s->HEVClc->cabac_state, HEVC_CONTEXTS);
>> + memcpy(s->cabac->state, s->HEVClc->cabac_state, HEVC_CONTEXTS);
>
> So if I'm reading this right, this copies the per-slice-context state
> into the decoder-global state. And it's done from slice threads with no
> locks. So how is this not racy?
>
a) I am not claiming that this is not racy; I am merely claiming that it
does not introduce a new race, because the parts of the code that access
these Rice parameters are a subset of the parts of code that access the
CABAC parameters and tsan has never shown a race for me when updating
cabac state or the rice parameters in general.
b) (i) I readily admit that HEVC is not my forte*. WPP is supposed to be
as follows: Given that HEVC needs the upper right and upper block/ctu
for prediction, each row can only start decoding after the first two
ctus of the row above it have been decoded. And then the cabac state of
the row below is initialized from the cabac state of the row above after
decoding its first two ctus. Therefore only one cabac state needs to be
cached at any given time.
(ii) You can see this in ff_hevc_save_states, where it is only saving
the state after the second row (this presumes that the initial
ctb_addr_ts in hls_decode_entry_wpp is a multiple of ctb_width which
seems to be the case when tiles are disabled; I don't know what happens
when both tiles and wpp are enabled. According to
https://github.com/ultravideo/kvazaar/issues/201#issuecomment-391329526
this is not even allowed in any currently legal profile, but I don't
think our decoder checks for that).
(iii) After having saved the state, it is signalling that it is done
with this ctu via ff_thread_report_progress2. The next thread waits for
this event via ff_thread_await_progress2 and initializes its cabac state
(if necessary). So there is your synchronization.
(iv) Looking at ff_hevc_cabac_init, one can see that the first branch is
always true when run from the first job; whereas ctb_addr_ts %
s->ps.sps->ctb_width == 0 is true when tiles are disabled and when one
is decoding the first ctu of a row (I don't know what happens in case
tiles are enabled; probably mayhem. It seems kvazaar can produce such
files, see above link.).
c) The current state of affairs is btw weird: Given that the secondary
HEVCContexts are overwritten by the main HEVCContext in
hls_slice_data_wpp, the rice state that every HEVCContext starts with is
given by the state of the first HEVCContext. And which row (of the last
picture) this corresponds to depends upon the number of slice threads in
use. This might cause problems if dependent_slice_segment_flag is enabled.
d) See also the comment to patch #2.
- Andreas
*: I only wanted to share the common SEI parts of H.264 and HEVC due to
softworkz's horrible way of sharing it.
More information about the ffmpeg-devel
mailing list