[FFmpeg-devel] libavcodec/lossless_videodsp : add add_bytes AVX2

Martin Vignali martin.vignali at gmail.com
Sat Oct 28 15:14:00 EEST 2017


2017-10-25 22:54 GMT+02:00 Paul B Mahol <onemda at gmail.com>:

> On 10/25/17, Martin Vignali <martin.vignali at gmail.com> wrote:
> > 2017-10-25 22:08 GMT+02:00 Paul B Mahol <onemda at gmail.com>:
> >
> >> On 10/25/17, Martin Vignali <martin.vignali at gmail.com> wrote:
> >> > 2017-10-25 21:53 GMT+02:00 Paul B Mahol <onemda at gmail.com>:
> >> >
> >> >> On 10/25/17, Martin Vignali <martin.vignali at gmail.com> wrote:
> >> >> > 2017-10-25 9:43 GMT+02:00 Paul B Mahol <onemda at gmail.com>:
> >> >> >
> >> >> >> On 10/21/17, Martin Vignali <martin.vignali at gmail.com> wrote:
> >> >> >> > Hello,
> >> >> >> >
> >> >> >> > Attached are patches to add an AVX2 version of add_bytes.
> >> >> >> >
> >> >> >> > 0001-libavcodec-lossless_videodsp-add-add_bytes-avx2-vers :
> >> >> >> > add AVX2 version
> >> >> >> >
> >> >> >> > Passes the FATE tests for me (OS 10.12, x86_64).
> >> >> >> >
> >> >> >> > checkasm results (Kaby Lake; run 10 times, fastest run kept):
> >> >> >> > checkasm: all 2 tests passed
> >> >> >> > add_bytes_c: 108.7
> >> >> >> > add_bytes_sse2: 26.5
> >> >> >> > add_bytes_avx2: 15.5
> >> >> >> >
> >> >> >> >
> >> >> >> > 0002-libavcodec-lossless_video_dsp-cosmetic-add-better-se:
> >> >> >> > cosmetic only.
> >> >> >> > The reference C function declarations are not consistent
> >> >> >> > between the asm files, and I think a clearer separator
> >> >> >> > between functions makes each file easier to read.
> >> >> >> >
> >> >> >> > Also adds the C declaration of add_bytes as a comment.
> >> >> >> >
> >> >> >> >
> >> >> >> > Martin
> >> >> >> >
> >> >> >>
> >> >> >> Are you sure 32-byte alignment is actually enforced?
> >> >> >>
> >> >> >>
> >> >> > Hello,
> >> >> >
> >> >> > I think the data used by add_bytes is always aligned,
> >> >> > because dst and src are the start of a line of an AVFrame.
> >> >>
> >> >> Yes, but try a width that's not a multiple of 32.
> >> >>
> >> > Sorry, I'm not sure I understand.
> >> > According to the doc, AVFrame->linesize is a multiple of the
> >> > maximum alignment,
> >> >
> >> > and in the asm, the loop keeps running as long as val < width.
> >> >
> >> > Can you point me to the part where you think it's not OK?
> >>
> >> I dunno. You should test it with widths not divisible by 32.
> >>
> >
> > Tested with the FATE sample vsynth3-huffyuvbgra.avi (34x34):
> > ./ffmpeg -i ./tests/data/fate/vsynth3-huffyuvbgra.avi -f framecrc -
> >
> > This generates the same CRC as
> > ./ffmpeg -i ./tests/data/fate/vsynth3-huffyuvbgra.avi -f framecrc -
> > -cpuflags 0
> >
> >
> >>
> >> also try encoding cropped video.
> >>
> >
> > Are you sure that encoding cropped video involves the decoding DSP
> > function?
> >
> > These patches only deal with the decoding function.
> >
> >
> > And the encoding functions of huffyuvenc (in the "huffyuv add
> > add/diff_bytes16 AVX2" discussion) and losslessencdsp (not done yet)
> > have a test for the alignment of dst and src.
> >
> >
> > Martin
> > _______________________________________________
> > ffmpeg-devel mailing list
> > ffmpeg-devel at ffmpeg.org
> > http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
> >
>
>
> ok then
>

Ping: can this be applied?
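
To make the width question from the thread concrete, here is a minimal C
sketch. It is not the patch and not the actual asm. It assumes add_bytes
computes dst[i] += src[i] byte-wise (modulo 256), and it models a SIMD loop
as a plain loop that always handles whole 32-byte blocks. The names
add_bytes_ref, add_bytes_block32, compare_width and LINESIZE are
hypothetical, chosen only for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical padded line size, standing in for AVFrame->linesize. */
enum { LINESIZE = 64 };

/* Assumed reference behaviour: dst[i] += src[i] (mod 256) for i in [0, w). */
static void add_bytes_ref(uint8_t *dst, const uint8_t *src, ptrdiff_t w)
{
    for (ptrdiff_t i = 0; i < w; i++)
        dst[i] += src[i];
}

/*
 * Hypothetical stand-in for a 32-byte-wide SIMD loop: it always processes
 * whole blocks, so it touches up to 31 bytes past w. That is only safe when
 * the buffers are padded to a multiple of 32 bytes.
 */
static void add_bytes_block32(uint8_t *dst, const uint8_t *src, ptrdiff_t w)
{
    for (ptrdiff_t i = 0; i < w; i += 32)
        for (int j = 0; j < 32; j++)
            dst[i + j] += src[i + j];
}

/* Returns 1 if both versions agree on the first w bytes, 0 otherwise. */
int compare_width(ptrdiff_t w)
{
    uint8_t a[LINESIZE], b[LINESIZE], src[LINESIZE];

    /* The block version rounds w up to a multiple of 32, so stay in bounds. */
    assert(w >= 0 && w <= LINESIZE);

    /* Fill with an arbitrary deterministic pattern that exercises wraparound. */
    for (int i = 0; i < LINESIZE; i++) {
        a[i] = b[i] = (uint8_t)(i * 7 + 3);
        src[i] = (uint8_t)(255 - i * 5);
    }

    add_bytes_ref(a, src, w);
    add_bytes_block32(b, src, w);

    return memcmp(a, b, (size_t)w) == 0;
}
```

Calling compare_width(34) mirrors the 34x34 sample mentioned above: the
block version writes past byte 34, but only into padding, so the first 34
bytes match the reference. Without padding up to the block size, the same
loop would read and write out of bounds.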

