[FFmpeg-devel] [PATCH] dsputil: add bswap16_buf()
Måns Rullgård mans at mansr.com
Mon Jun 14 14:15:47 CEST 2010
Michael Niedermayer <michaelni at gmx.at> writes:
> On Sun, Jun 13, 2010 at 05:59:20PM +0100, Mans Rullgard wrote:
>> ---
>> libavcodec/dsputil.c | 7 +++++++
>> libavcodec/dsputil.h | 1 +
>> 2 files changed, 8 insertions(+), 0 deletions(-)
>>
>> diff --git a/libavcodec/dsputil.c b/libavcodec/dsputil.c
>> index 0701324..1ecd73f 100644
>> --- a/libavcodec/dsputil.c
>> +++ b/libavcodec/dsputil.c
>> @@ -260,6 +260,12 @@ static void bswap_buf(uint32_t *dst, const uint32_t *src, int w){
>> }
>> }
>>
>> +static void bswap16_buf(uint16_t *dst, const uint16_t *src, int len)
>> +{
>> + while (len--)
>> + *dst++ = bswap_16(*src++);
>> +}
>
> on 64bit arch this is likely faster:
>
> uint64_t u= ((uint64_t*)src)[i];
> ((uint64_t*)dst)[i]= ((u>>8)&0x...) + ((u<<8)&0x...)

That depends entirely on the specifics of the machine, but point taken.
It would also in general require 8-byte alignment, and I am not sure
that can be guaranteed.
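
To make the suggestion concrete, here is a rough sketch of the
64-bit-at-a-time variant. It is an illustration only, not part of the
patch: the name bswap16_buf_swar is hypothetical, the two mask constants
are my own filling-in of the "0x..." in the quoted snippet, and memcpy()
is used for the wide loads and stores so that the sketch does not depend
on dst/src being 8-byte aligned. Whether it is actually faster than the
simple loop would have to be measured per machine.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper, not from the patch: swap the two bytes of every
 * 16-bit element, handling four elements per 64-bit word. */
static void bswap16_buf_swar(uint16_t *dst, const uint16_t *src, int len)
{
    int i = 0;

    /* Main loop: four 16-bit lanes at a time. memcpy() avoids any
     * alignment or aliasing assumption on src and dst. */
    for (; i + 4 <= len; i += 4) {
        uint64_t u;
        memcpy(&u, src + i, sizeof(u));
        /* Masks are an assumption: they keep each byte inside its own
         * 16-bit lane after the shift. */
        u = ((u >> 8) & 0x00FF00FF00FF00FFULL) |
            ((u << 8) & 0xFF00FF00FF00FF00ULL);
        memcpy(dst + i, &u, sizeof(u));
    }

    /* Tail: remaining 0..3 elements, one at a time. */
    for (; i < len; i++)
        dst[i] = (uint16_t)((src[i] >> 8) | (src[i] << 8));
}
```

The lane-wise shift and mask works on both little- and big-endian
machines, since each 16-bit group of the 64-bit word corresponds to one
16-bit element in memory either way.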

The purpose of having this in dsputil is to allow optimised
implementations for specific machines. A 32-bit block bswap function
already exists, and a patch posted to the list was doing a 16-bit block
swap. If one makes sense, so does the other.

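
As a minimal sketch of what "allow optimised implementations" means
here: the context holds a function pointer, the generic init installs
the C version, and a machine-specific init may overwrite it. All names
below (MiniDSPContext, mini_dsp_init, bswap16_buf_fast, have_fast_impl)
are hypothetical stand-ins and not the actual dsputil code.

```c
#include <stdint.h>

/* Hypothetical, cut-down context holding just the one function pointer. */
typedef struct MiniDSPContext {
    void (*bswap16_buf)(uint16_t *dst, const uint16_t *src, int len);
} MiniDSPContext;

/* Portable reference version: swap the bytes of each 16-bit element. */
static void bswap16_buf_c(uint16_t *dst, const uint16_t *src, int len)
{
    for (int i = 0; i < len; i++)
        dst[i] = (uint16_t)((src[i] >> 8) | (src[i] << 8));
}

/* Placeholder for a machine-specific version. A real one would use
 * SIMD or similar; this stand-in simply defers to the C code. */
static void bswap16_buf_fast(uint16_t *dst, const uint16_t *src, int len)
{
    bswap16_buf_c(dst, src, len);
}

/* Install the C default, then let the "optimised" version override it
 * when the (assumed) capability flag is set. */
static void mini_dsp_init(MiniDSPContext *c, int have_fast_impl)
{
    c->bswap16_buf = bswap16_buf_c;
    if (have_fast_impl)
        c->bswap16_buf = bswap16_buf_fast;
}
```

Callers only ever go through c->bswap16_buf(), so they do not need to
know which version was selected.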
>> static int sse4_c(void *v, uint8_t * pix1, uint8_t * pix2, int line_size, int h)
>> {
>> int s, i;
>> @@ -4455,6 +4461,7 @@ av_cold void dsputil_init(DSPContext* c, AVCodecContext *avctx)
>> c->add_hfyu_left_prediction = add_hfyu_left_prediction_c;
>> c->add_hfyu_left_prediction_bgr32 = add_hfyu_left_prediction_bgr32_c;
>> c->bswap_buf= bswap_buf;
>> + c->bswap16_buf = bswap16_buf;
>> #if CONFIG_PNG_DECODER
>> c->add_png_paeth_prediction= ff_add_png_paeth_prediction;
>> #endif
>> diff --git a/libavcodec/dsputil.h b/libavcodec/dsputil.h
>> index fd2d07f..f3926cd 100644
>> --- a/libavcodec/dsputil.h
>> +++ b/libavcodec/dsputil.h
>> @@ -365,6 +365,7 @@ typedef struct DSPContext {
>> /* this might write to dst[w] */
>> void (*add_png_paeth_prediction)(uint8_t *dst, uint8_t *src, uint8_t *top, int w, int bpp);
>> void (*bswap_buf)(uint32_t *dst, const uint32_t *src, int w);
>
>> + void (*bswap16_buf)(uint16_t *dst, const uint16_t *src, int len);
>
> how much alignment can we assume to have for dst/src? and len?

I don't know, as I haven't located every place that could use this
function.

--
Måns Rullgård
mans at mansr.com