[FFmpeg-devel] Once again: Multithreaded H.264 decoding with ffmpeg?

Jason Garrett-Glaser darkshikari
Fri May 30 07:52:29 CEST 2008


>> I have been looking into the h264 code and every piece of H.264
>> documentation I could get my hands on, and I have the impression that
>> some of the decoding steps (namely residual decoding and deblocking)
>> could be parallelized quite well. But I have no idea how much time the
>> individual decoding steps take. Does anyone happen to have some
>> numbers, or a hint on how to measure this myself?

[Profile courtesy of Loren Merritt]

ffh264 svn-r11870 (2008-02-04)
CPU: Core 2, speed 2400.75 MHz (estimated)
Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a
unit mask of 0x00 (Unhalted core cycles) count 100000
samples  %        symbol name
168093    9.2010  decode_mb_cabac
165494    9.0587  decode_cabac_residual
133817    7.3248  fill_caches
115161    6.3036  hl_decode_mb_simple
111744    6.1166  h264_#_loop_filter_luma_mmx2
101511    5.5565  put_h264_chroma_mc8_mmx
88055     4.8199  h264_#_loop_filter_chroma_mmx2
72618     3.9749  filter_mb_fast
70919     3.8819  get_cabac_noinline
67392     3.6889  put_h264_qpel8_h_lowpass_l2_mmx2
66962     3.6653  put_h264_qpel8or16_v_lowpass_mmx2
64123     3.5100  filter_mb_edge#
53187     2.9113  put_h264_qpel16_mc##_mmx2
52559     2.8770  h264_loop_filter_strength_mmx2
47997     2.6272  decode_cabac_mb_mvd
42554     2.3293  decode_mb_skip
39814     2.1793  mc_dir_part
39220     2.1468  hl_motion
35509     1.9437  clear_blocks_mmx
33781     1.8491  prefetch_mmx2
32510     1.7795  put_h264_chroma_mc4_mmx
32389     1.7729  put_h264_qpel8or16_hv_lowpass_mmx2
25840     1.4144  pred_direct_motion
23767     1.3009  put_h264_qpel8or16_vh_lowpass_mmx2
14891     0.8151  ff_h264_idct8_add_sse2
14235     0.7792  decode_slice
12522     0.6854  put_h264_qpel8_h_lowpass_mmx2
10993     0.6014  put_h264_qpel8_mc##_mmx2
9992      0.5469  decode_cabac_mb_skip
9958      0.5451  avg_h264_qpel8_h_lowpass_l2_mmx2
8960      0.4905  pred8x8l_#
6585      0.3604  filter_mb
6516      0.3567  ff_h264_idct_dc_add_mmx2
6290      0.3443  draw_edges_mmx
6102      0.3340  mc_part
4565      0.2499  put_pixels8_l2_shift5_mmx2
4108      0.2249  ff_h264_biweight_#x#_mmx2
3413      0.1868  ff_h264_idct8_dc_add_mmx2
3253      0.1781  pred8x8c_#
3159      0.1729  ff_h264_idct_add_mmx
2809      0.1538  avg_h264_qpel8_h_lowpass_mmx2
2210      0.1210  avg_h264_qpel8or16_v_lowpass_mmx2
2179      0.1193  avg_h264_qpel8or16_hv_lowpass_mmx2
1667      0.0912  decode_nal_units
1552      0.0851  pred4x4_#
891       0.0488  decode_cabac_intra_mb_type
761       0.0417  avg_pixels8_l2_shift5_mmx2
528       0.0289  pred16x16_#
376       0.0206  decode_slice_header
304       0.0166  ff_emulated_edge_mc
239       0.0131  h264_luma_dc_dequant_idct_c
224       0.0123  decode_frame
165       0.0090  MPV_frame_start
133       0.0073  ff_draw_horiz_band
124       0.0068  video_read_frame
119       0.0065  fill_default_ref_list
108       0.0059  handle_block
102       0.0056  fast_memcpy
97        0.0053  decode_ref_pic_list_reordering
84        0.0046  ff_init_cabac_states
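
To turn the per-symbol listing into a per-stage picture, the rows can be bucketed by decoding stage. The following is a minimal sketch, not a definitive breakdown: it uses only the sixteen hottest rows from the listing above, and the substring rules that map symbol names to stages (as well as the names `stage_of` and `stage_totals`) are assumptions made for illustration, not something defined by the profile or by FFmpeg.

```python
# Rough per-stage totals from the oprofile listing above.
# ASSUMPTION: the stage mapping is a guess based on symbol names only.
from collections import defaultdict

# Top sixteen rows of the listing, copied verbatim: samples, percent, symbol.
PROFILE = """\
168093    9.2010  decode_mb_cabac
165494    9.0587  decode_cabac_residual
133817    7.3248  fill_caches
115161    6.3036  hl_decode_mb_simple
111744    6.1166  h264_#_loop_filter_luma_mmx2
101511    5.5565  put_h264_chroma_mc8_mmx
88055     4.8199  h264_#_loop_filter_chroma_mmx2
72618     3.9749  filter_mb_fast
70919     3.8819  get_cabac_noinline
67392     3.6889  put_h264_qpel8_h_lowpass_l2_mmx2
66962     3.6653  put_h264_qpel8or16_v_lowpass_mmx2
64123     3.5100  filter_mb_edge#
53187     2.9113  put_h264_qpel16_mc##_mmx2
52559     2.8770  h264_loop_filter_strength_mmx2
47997     2.6272  decode_cabac_mb_mvd
42554     2.3293  decode_mb_skip
"""


def stage_of(symbol):
    """Map a symbol name to a coarse decoding stage (hypothetical rules)."""
    if "cabac" in symbol:
        return "cabac"
    if "loop_filter" in symbol or symbol.startswith("filter_mb"):
        return "deblock"
    if "qpel" in symbol or "chroma_mc" in symbol:
        return "motion_comp"
    return "other"


def stage_totals(text):
    """Sum the percent column per stage for every non-empty profile row."""
    totals = defaultdict(float)
    for line in text.splitlines():
        if not line.strip():
            continue
        _samples, percent, symbol = line.split(None, 2)
        totals[stage_of(symbol.strip())] += float(percent)
    return dict(totals)


totals = stage_totals(PROFILE)
for stage, percent in sorted(totals.items(), key=lambda kv: -kv[1]):
    print(f"{stage:12s} {percent:7.4f}%")
```

Pasting the full listing into `PROFILE` extends the totals to every row; symbols that match none of the rules land in `other`, so the rules need refining before the numbers are used to decide what to parallelize.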

Dark Shikari



