A lot of this is inherited from FFmpeg’s color conversion from YUV>RGB48>YUV. Color issues are compounded when the source is not tagged correctly with range / space / transfer and is interpreted as [2] Unknown by FFmpeg/TVAI. And finally there is the AI synthesis itself.
So how should the software deal with range / space / primaries / characteristics?
I would like to see TVAI allow the user to select “treat input as [ BT.601 | 709 ]” on the input, and also allow the user to specify [ BT.601 | 709 | 2020 | RGB | sRGB ] for the output, producing correctly converted and tagged output for all of { range, space, primaries, transfer characteristics }, as per the industry-standard specifications. As TVAI is a professional tool, it should allow the user to explicitly state the color characteristics of the input and select the desired characteristics of the output. Video frames and the output stream should both be accurately tagged. There should be templates for the main professional, broadcast and production standards of [ BT.601/525 | BT.601/625 | 709 | 2020 | RGB | sRGB ].
Incorrect tagging of full/limited color range in source content is very common, especially when dealing with digitally created content (or amateur content), and can lead to stretching, compressing or clipping of the colors within the scale (crushed blacks etc). This isn’t Topaz’s fault - it is sources that are untagged as [2] Unknown. The solution is to offer the user template overrides based on the professional color standards. Some software (like zscale & MPV) makes assumptions about the color characteristics based on the resolution, but in the world of AI upscaling the resolution may no longer be a useful hint. Color characteristic templates for input and output would be preferable.
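There is no such template UI in TVAI today, but here is a rough sketch of what an equivalent “template” looks like in plain (vanilla) FFmpeg terms. The setparams filter and output-side tagging options below are standard FFmpeg features, not TVAI options, and the filenames are placeholders - treat this purely as an illustration of the idea:

$ ffmpeg -hide_banner -i untagged_source.mp4 -vf "setparams=range=tv:colorspace=smpte170m:color_primaries=smpte170m:color_trc=smpte170m" -colorspace:v smpte170m -color_primaries:v smpte170m -color_trc:v smpte170m -color_range:v tv -c:v libx264 tagged_output.mp4

setparams tells every downstream filter “treat these frames as limited-range BT.601/525”, and the output-side -colorspace / -color_primaries / -color_trc / -color_range options tag the encoded stream to match. An actual conversion between standards (rather than a re-labelling) would additionally need something like the colorspace or zscale filter.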
There are multiple places where FFmpeg reads or writes color tags - and one area where Topaz could help is confirming what the color range / space / transfer / primaries are when the video leaves the tvai_up filter. Are the colors RGB, sRGB, Adobe RGB, IEC 61966-2 or BT.709? Which color characteristics are always passed through from the source, and which are always changed within the tvai_up filter? Is the tvai filter outputting full range?
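One way to answer that empirically, rather than waiting for documentation, is to wrap the tvai_up filter with showinfo (reasonably recent FFmpeg builds print the per-frame color_range and color_space) and/or probe the rendered file afterwards. The tvai_up arguments below are placeholders - copy the exact filter string that Video AI logs for your own export - and upscaled_output.mov stands for whatever file your export produced:

$ ffmpeg-topaz -hide_banner -i input.mp4 -vf "showinfo,tvai_up=model=prob-3:scale=2,showinfo" -f null -

$ ffprobe -hide_banner -select_streams v:0 -show_entries stream=pix_fmt,color_range,color_space,color_primaries,color_transfer upscaled_output.mov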
That’s the tagging part done. If containers, video frames and metadata describe the color characteristics correctly, the downstream process or device will be in a better position to render it as intended.
Yikes! My content is YUV. TVAI only thinks in RGB.
Unfortunately, TVAI does not have separate models for YUV and RGB workflows. The conversion from YUV > BGR > YUV can be imperfect for some pixel formats and bit depths, and oversampling to 16-bit BGR48 and subsequently downsampling back to 8-bit YUV will, by its nature, introduce sampling errors.
Test case…
I’m going to try to use an objective example of the very simplest content I can come up with… a limited-range black frame of YUV 16,16,16, programmatically generated using FFmpeg’s geq filter. Only a single video frame is needed for this test. We’ll use YUV444 to avoid any inaccuracies from YUV420p chroma subsampling…
$ ffmpeg-topaz -hide_banner -color_range 'tv' -colorspace:v 'smpte170m' -color_primaries:v 'smpte170m' -color_trc:v 'smpte170m' -f 'lavfi' -i nullsrc=size='ntsc':rate='ntsc',format=pix_fmts='yuv444p',trim=start_frame=0:end_frame=1,geq=lum_expr=16:cb_expr=16:cr_expr=16 -vf showinfo,signalstats,metadata=mode='print' -f 'null' -
We measure the output using the signalstats filter. In this case, the output from the signalstats filter is YUV 16,16,16. Great! As expected.
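For anyone repeating this: signalstats exports its measurements as per-frame metadata, and metadata=mode='print' dumps those key/value pairs into the log. The keys to read off are the documented lavfi.signalstats.* averages (the numbers below just illustrate the expected result for this frame):

lavfi.signalstats.YAVG=16
lavfi.signalstats.UAVG=16
lavfi.signalstats.VAVG=16

It is also worth glancing at the YMIN/YMAX (and U/V equivalent) keys to see whether anything is being clipped rather than merely shifted.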
Now let’s do the same test, going from YUV > BGR48 > YUV, which is what happens when the TVAI filter forces BGR48 for model processing. We won’t even need to include the TVAI filter in this example; we’ll just use the underlying FFmpeg to force the color conversion to BGR48 that TVAI would have done…
$ ffmpeg-topaz -hide_banner -color_range 'tv' -colorspace:v 'smpte170m' -color_primaries:v 'smpte170m' -color_trc:v 'smpte170m' -f 'lavfi' -i nullsrc=size='ntsc':rate='ntsc',format=pix_fmts='yuv444p',trim=start_frame=0:end_frame=1,geq=lum_expr=16:cb_expr=16:cr_expr=16 -vf showinfo,format=pix_fmts='bgr48',format=pix_fmts='yuv444p',signalstats,metadata=mode='print' -f 'null' -
The output is now YUV 84.0156, 88.7656, 78.3281. It is probably impossible to spot this by eye, but you have already incurred an objectively measured color shift from 16,16,16 to roughly 80,80,80 - just on a simple black frame.
You can do the same tests at various colorpoints - YUV 32,128,128 etc.
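For example, the YUV 32,128,128 variant of the round-trip test is just a change to the three geq expressions (I have deliberately not quoted a result here - run it and compare the signalstats averages against 32,128,128 yourself):

$ ffmpeg-topaz -hide_banner -color_range 'tv' -colorspace:v 'smpte170m' -color_primaries:v 'smpte170m' -color_trc:v 'smpte170m' -f 'lavfi' -i nullsrc=size='ntsc':rate='ntsc',format=pix_fmts='yuv444p',trim=start_frame=0:end_frame=1,geq=lum_expr=32:cb_expr=128:cr_expr=128 -vf showinfo,format=pix_fmts='bgr48',format=pix_fmts='yuv444p',signalstats,metadata=mode='print' -f 'null' -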
So, irrespective of what TVAI’s filter itself is doing, the fact that Topaz operates in the BGR48 domain means the result will most likely be mathematically imperfect when dealing with YUV sources and YUV output, because everything goes through a YUV>BGR48>YUV conversion.
I have not done the same tests using an RGB source, since I only work in YUV.
I have no idea whether it would be practical for TVAI’s models to offer an alternative that operates within the YUV domain rather than RGB48, but as long as it is going from YUV > RGB48 > YUV, there will always be some mathematical imperfection after oversampling and then subsequently subsampling.
So how does the community help Topaz come up with a color accurate workflow?
Firstly, general subjective opinions are useless. “My color is shifting” is a common cry, but is kinda useless as an agent for change. Furthermore, posting screenshots and captures in a browser is also useless, since some browsers themselves are not color accurate, not everyone sees the same colors on their system, and most users have not calibrated their monitor/TV with SMPTE bars or a calibration disc. The community needs to use objective measures to help Topaz.
Those who can generate programmatic examples (like the FFmpeg commands above that demonstrate the YUV>BGR48>YUV issue) could produce some test cases so that Topaz can add them to their system / regression test suite.
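As a starting point, the round-trip command above boils down to a single number that a test suite could assert against - pipe the log through grep and you get just the average luma back (a straight YUV-only run of the same source returns 16; the BGR48 round trip does not):

$ ffmpeg-topaz -hide_banner -color_range 'tv' -colorspace:v 'smpte170m' -color_primaries:v 'smpte170m' -color_trc:v 'smpte170m' -f 'lavfi' -i nullsrc=size='ntsc':rate='ntsc',format=pix_fmts='yuv444p',trim=start_frame=0:end_frame=1,geq=lum_expr=16:cb_expr=16:cr_expr=16 -vf format=pix_fmts='bgr48',format=pix_fmts='yuv444p',signalstats,metadata=mode='print' -f 'null' - 2>&1 | grep -o 'lavfi.signalstats.YAVG=[0-9.]*'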
Don’t always assume it is the TVAI model that is to blame - it could be FFmpeg or the codec. If you speak FFmpeg and are seeing color-shift, remove the TVAI filter from the command line. If you are still getting color-shift, it is not the model. My example above shows that you will get some level of color-shift from the YUV > RGB48 > YUV conversion in FFmpeg before you even add the TVAI filter or model.
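In practice that isolation is an A/B test on the same clip: one pass with only the BGR48 round trip in the chain, one pass with the TVAI filter, and then compare the signalstats numbers. Roughly along these lines - clip.mp4 is a placeholder, and the tvai_up arguments should be whatever filter string your own export uses:

$ ffmpeg-topaz -hide_banner -i clip.mp4 -vf "format=bgr48,format=yuv420p,signalstats,metadata=mode=print" -f null -

$ ffmpeg-topaz -hide_banner -i clip.mp4 -vf "tvai_up=model=prob-3:scale=1,format=yuv420p,signalstats,metadata=mode=print" -f null -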
If you have public domain test patterns and calibration content that demonstrate color-shift after conversion, upload and share them. Professionals typically include leaders (aka bars and tone) on content. The SMPTE HD test cards are great. The movie industry has historically used the rather archaically named China Girl (the photo industry’s equivalent being the Kodak Shirley cards).
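FFmpeg can also generate shareable bars itself, which removes any argument about where a test pattern came from. A rough sketch (the codec and pixel format are only suggestions - the point is to use something lossless and to tag it, so the pattern arrives with known characteristics):

$ ffmpeg -hide_banner -f lavfi -i smptehdbars=size=1920x1080:rate=ntsc -t 5 -pix_fmt yuv422p10le -c:v ffv1 -colorspace:v bt709 -color_primaries:v bt709 -color_trc:v bt709 -color_range:v tv smpte_hd_bars.mkv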
Some operating systems have a Digital Color Meter utility that displays the RGB value under the mouse pointer. When combined with a color-accurate player (like QuickTime on macOS), a color meter can be used to objectively measure any color shift.
Some professionals may have X-Rite probes and colorimeters that can be used to test the end-to-end workflow.
And then, even once it is measured, there needs to be some consensus from the community about what level of color-shift is acceptable from either the workflow or the AI model.
There’s a lot here to digest - and before we all start giving Topaz a hard time about the models themselves, the fundamentals are: interpreting the color characteristics tags (range, space, primaries, transfer) of the source correctly; allowing the user to override the characteristics of the source; ensuring the RGB frames TVAI outputs are tagged accurately; and appreciating that most professional video content is typically distributed in YUV.