Mirror of https://github.com/nyanmisaka/ffmpeg-rockchip.git (synced 2026-01-24 07:31:22 +01:00)

Up until now, libswscale/output.c used a macro to write an output pixel
which involved a call to av_pix_fmt_desc_get() to find out whether
the input pixel format is BE or LE despite this being known at
compile-time (there are templates per pixfmt).

Even worse, these calls are made in a loop, so that e.g. there are
eight calls to av_pix_fmt_desc_get() for every pixel processed in
yuv2rgba64_X_c_template() for 64bit RGB formats.

This commit modifies these macros to ensure that isBE() is evaluated
at compile-time. This saved 41184B of .text for me (GCC 11.2, -O3).

Of course, it also improved performance. E.g.

    ffmpeg_g -f lavfi -i testsrc2,format=yuva420p -pix_fmt rgba64le \
        -threads 1 -t 1:00 -f null -

(which uses yuv2rgba64le_X_c, which is an invocation of
yuv2rgba64_X_c_template() mentioned above), performance improved from
95589 to 41387 decicycles for one call to yuv2packedX; for the be variant
the numbers went down from 76087 to 43024 decicycles.

Reviewed-by: Anton Khirnov <anton@khirnov.net>
Reviewed-by: Paul B Mahol <onemda@gmail.com>
Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>

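The idea described in the commit message can be sketched in a few lines of C. This is only an illustration under stated assumptions, not the actual code from libswscale/output.c: every name prefixed with `sketch_` or `SKETCH_` is hypothetical, the runtime lookup merely stands in for the av_pix_fmt_desc_get()-based check, and the call counter exists only to make the per-sample cost visible. The point is that a per-format wrapper can pass the endianness to an inlined template as a literal constant, so the compiler is free to resolve the branch at compile time instead of performing a lookup for every sample.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical format IDs; these are NOT the real AVPixelFormat values. */
enum SketchPixFmt { SKETCH_RGBA64LE, SKETCH_RGBA64BE };

/* Counts runtime lookups so the cost of the old pattern can be observed. */
static unsigned sketch_lookup_calls;

/* Stand-in for a descriptor-based runtime endianness check. */
static int sketch_is_be_runtime(enum SketchPixFmt fmt)
{
    sketch_lookup_calls++;
    return fmt == SKETCH_RGBA64BE;
}

/* Write one 16-bit sample in the requested byte order. */
static inline void sketch_write16(uint8_t *dst, unsigned val, int is_be)
{
    assert(dst != NULL);
    if (is_be) {
        dst[0] = (uint8_t)(val >> 8);
        dst[1] = (uint8_t)(val & 0xff);
    } else {
        dst[0] = (uint8_t)(val & 0xff);
        dst[1] = (uint8_t)(val >> 8);
    }
}

/* Old pattern: the endianness is looked up again for every sample. */
static void sketch_write_row_runtime(uint8_t *dst, const uint16_t *src,
                                     size_t n, enum SketchPixFmt fmt)
{
    for (size_t i = 0; i < n; i++)
        sketch_write16(dst + 2 * i, src[i], sketch_is_be_runtime(fmt));
}

/* New pattern: is_be is a literal at each call site below, so after
 * inlining the compiler can drop the unused branch entirely. */
static inline void sketch_write_row_template(uint8_t *dst,
                                             const uint16_t *src,
                                             size_t n, int is_be)
{
    for (size_t i = 0; i < n; i++)
        sketch_write16(dst + 2 * i, src[i], is_be);
}

/* One thin wrapper per pixel format, each with a constant endianness. */
static void sketch_write_row_le(uint8_t *dst, const uint16_t *src, size_t n)
{
    sketch_write_row_template(dst, src, n, 0);
}

static void sketch_write_row_be(uint8_t *dst, const uint16_t *src, size_t n)
{
    sketch_write_row_template(dst, src, n, 1);
}
```

Both patterns produce identical bytes; the difference is that the per-format wrappers never perform a lookup inside the loop. Whether the branch is actually eliminated depends on the compiler inlining the template, which is an assumption of this sketch.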
||
|---|---|---|
| aarch64 | ||
| arm | ||
| loongarch | ||
| ppc | ||
| tests | ||
| x86 | ||
| alphablend.c | ||
| bayer_template.c | ||
| gamma.c | ||
| half2float.c | ||
| hscale.c | ||
| hscale_fast_bilinear.c | ||
| input.c | ||
| libswscale.v | ||
| log2_tab.c | ||
| Makefile | ||
| options.c | ||
| output.c | ||
| rgb2rgb.c | ||
| rgb2rgb.h | ||
| rgb2rgb_template.c | ||
| slice.c | ||
| swscale.c | ||
| swscale.h | ||
| swscale_internal.h | ||
| swscale_unscaled.c | ||
| swscaleres.rc | ||
| utils.c | ||
| version.c | ||
| version.h | ||
| version_major.h | ||
| vscale.c | ||
| yuv2rgb.c | ||