-Ofast is basically -O3 + -ffast-math.
But according to my measurements (x64 gcc and arm64 clang), it doesn't make
any difference when tested with `tools/convert benchmark`. Actually,
currently almost no conversion uses floating-point math except for
vc_copylineUYVYtoRGBA, for which -ffast-math didn't have any
performance impact.
Removed all ad hoc optimizations:
- -Ofast - removed by context (pixfmt_conv compiles with it), _but_ see
previous commit - even in pixfmt_conv it was not actually used
- removed ALWAYS_INLINE + OPTIMIZED_FOR - from measurements they don't
seem to make any difference
Since the decompression is synchronous, decompression and conversion
cannot be interleaved, so it is very useful to perform the conversion
in multiple threads.
For 4K, the conversions (to R12L or R10k) may take around 10 ms when
run in a single thread, which makes it a bit hard for 60 FPS video (the
decompression must then take at most 6 ms).
To enable e.g. 2160p60 video, make the decoders run in parallel.
For now, only the conversions using the generic decode_planar_func_t are
parallelized. Eventually all conversions should use this API.
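
The row-splitting idea can be sketched as below. This is only an illustration, not the actual UltraGrid code: `convert_rows_fn`, `convert_parallel`, `MAX_CONV_THREADS` and the toy `gray_to_rgba_rows` conversion are hypothetical names, and the real decode_planar_func_t signature is not shown in this log. The frame is cut into contiguous slices of rows and each slice is converted by its own POSIX thread.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

enum { MAX_CONV_THREADS = 16 }; /* hypothetical upper bound */

/* Hypothetical per-slice signature, NOT the real decode_planar_func_t. */
typedef void (*convert_rows_fn)(uint8_t *dst, size_t dst_pitch,
                                const uint8_t *src, size_t src_pitch,
                                int width, int rows);

struct slice_job {
        convert_rows_fn fn;
        uint8_t *dst;
        const uint8_t *src;
        size_t dst_pitch, src_pitch;
        int width, rows;
};

static void *slice_worker(void *arg)
{
        struct slice_job *j = arg;
        j->fn(j->dst, j->dst_pitch, j->src, j->src_pitch, j->width, j->rows);
        return NULL;
}

/* Convert `height` rows using up to `nthreads` threads, one slice each. */
static void convert_parallel(convert_rows_fn fn, uint8_t *dst, size_t dst_pitch,
                             const uint8_t *src, size_t src_pitch,
                             int width, int height, int nthreads)
{
        assert(fn != NULL);
        if (width <= 0 || height <= 0) return;
        if (nthreads < 1) nthreads = 1;
        if (nthreads > MAX_CONV_THREADS) nthreads = MAX_CONV_THREADS;
        if (nthreads > height) nthreads = height;

        pthread_t tid[MAX_CONV_THREADS];
        int spawned[MAX_CONV_THREADS] = { 0 };
        struct slice_job job[MAX_CONV_THREADS];
        int first_row = 0;
        for (int i = 0; i < nthreads; i++) {
                /* the first (height % nthreads) slices get one extra row */
                int rows = height / nthreads + (i < height % nthreads ? 1 : 0);
                job[i] = (struct slice_job){
                        .fn = fn,
                        .dst = dst + first_row * dst_pitch,
                        .src = src + first_row * src_pitch,
                        .dst_pitch = dst_pitch, .src_pitch = src_pitch,
                        .width = width, .rows = rows,
                };
                spawned[i] = pthread_create(&tid[i], NULL, slice_worker,
                                            &job[i]) == 0;
                if (!spawned[i]) slice_worker(&job[i]); /* fall back: run inline */
                first_row += rows;
        }
        for (int i = 0; i < nthreads; i++) {
                if (spawned[i]) pthread_join(tid[i], NULL);
        }
}

/* Toy conversion used only for illustration: 8-bit gray to RGBA. */
static void gray_to_rgba_rows(uint8_t *dst, size_t dst_pitch,
                              const uint8_t *src, size_t src_pitch,
                              int width, int rows)
{
        for (int y = 0; y < rows; y++) {
                const uint8_t *s = src + y * src_pitch;
                uint8_t *d = dst + y * dst_pitch;
                for (int x = 0; x < width; x++) {
                        d[4 * x] = d[4 * x + 1] = d[4 * x + 2] = s[x];
                        d[4 * x + 3] = 255;
                }
        }
}
```

Because every slice writes to a disjoint range of output rows, the workers need no locking; the only synchronization is the final join.
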
Start the rewrite with coefficients that are not hard-coded in the macro.
For the beginning, the new implementation is used in pixfmt_conv.o. According
to the performance evaluation, it has no impact on performance
(`tools/convert benchmark`).
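
As a rough sketch of what "coefficients not hard-coded in the macro" can look like: the coefficients are derived once from the luma weights Kr and Kb and passed to the conversion as data. The names `rgb_to_yuv_coeffs`, `make_coeffs`, `rgb_to_yuv` and `COEFF_SHIFT` are hypothetical and not taken from UltraGrid; the sketch assumes 8-bit limited-range output (Y in 16..235, Cb/Cr in 16..240) and 16-bit fixed-point integer arithmetic.

```c
#include <assert.h>
#include <stdint.h>

enum { COEFF_SHIFT = 16 }; /* fixed-point fraction bits (assumption) */

struct rgb_to_yuv_coeffs {
        int32_t y_r, y_g, y_b;
        int32_t cb_r, cb_g, cb_b;
        int32_t cr_r, cr_g, cr_b;
};

/* Round a real coefficient to fixed point without needing libm. */
static int32_t to_fixed(double v)
{
        return (int32_t)(v * (1 << COEFF_SHIFT) + (v < 0 ? -0.5 : 0.5));
}

/* Derive all coefficients from Kr and Kb (e.g. 0.2126, 0.0722 for BT.709). */
static struct rgb_to_yuv_coeffs make_coeffs(double kr, double kb)
{
        assert(kr > 0 && kb > 0 && kr + kb < 1);
        const double kg = 1.0 - kr - kb;
        const double ys = 219.0 / 255.0; /* luma range scaling */
        const double cs = 224.0 / 255.0; /* chroma range scaling */
        struct rgb_to_yuv_coeffs c = {
                .y_r = to_fixed(kr * ys),
                .y_g = to_fixed(kg * ys),
                .y_b = to_fixed(kb * ys),
                .cb_r = to_fixed(-kr / (2 * (1 - kb)) * cs),
                .cb_g = to_fixed(-kg / (2 * (1 - kb)) * cs),
                .cb_b = to_fixed(0.5 * cs),
                .cr_r = to_fixed(0.5 * cs),
                .cr_g = to_fixed(-kg / (2 * (1 - kr)) * cs),
                .cr_b = to_fixed(-kb / (2 * (1 - kr)) * cs),
        };
        return c;
}

/* Convert one 8-bit RGB pixel; offsets are added before the shift so the
 * shifted value is never negative. */
static void rgb_to_yuv(const struct rgb_to_yuv_coeffs *c,
                       uint8_t r, uint8_t g, uint8_t b, uint8_t yuv[3])
{
        const int32_t half = 1 << (COEFF_SHIFT - 1);
        yuv[0] = (uint8_t)(((16 << COEFF_SHIFT) + c->y_r * r + c->y_g * g +
                            c->y_b * b + half) >> COEFF_SHIFT);
        yuv[1] = (uint8_t)(((128 << COEFF_SHIFT) + c->cb_r * r + c->cb_g * g +
                            c->cb_b * b + half) >> COEFF_SHIFT);
        yuv[2] = (uint8_t)(((128 << COEFF_SHIFT) + c->cr_r * r + c->cr_g * g +
                            c->cr_b * b + half) >> COEFF_SHIFT);
}
```

With this layout, switching the color matrix means calling `make_coeffs()` with different Kr/Kb values rather than editing a macro, and the per-pixel work stays in integer arithmetic.
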
UltraGrid itself is built with -Ofast, so use the same for the benchmark.
It won't break anything for the rest of the objects, so make it the default
instead of writing a custom rule for pixfmt_conv.o.
The .o files are no longer directly in the root of tools/ but in the
respective subdirectories, mirroring the sources in src/. Use the find
command instead of a wildcard because the objects may be deeper in src/.
The conversions alone have grown to a significant size, so it is better to
split the file in two, keeping the general video codec utility functions
in one file and the conversions in the other.
Build the UG object files in the tools/ subdirectory as well if make is
called there (because color_out.o is built differently for `convert` and
for `uv`, so the two must not be used interchangeably).
- also support out-of-tree builds when SRCDIR is passed, e.g.:
mkdir build && cd build
make -f ../tools/Makefile SRCDIR=.. convert
+ decklink_temperature to gitignore
- moved macros to utils/macros.h (not config_common.h, which is not going
to be included) and included macros.h in config_common.h
instead (later it should be removed)
- avoid the dependency of color_out.o on host.o if built outside UG (easiest
for now)
- compile tools with '-g' (obviously for better debuggability)