The new VEAI Stabilization performs “full-frame” video stabilization. That is, it smooths out shaky camera motion without cropping the frame: it synthesizes the missing parts of stabilized frames to preserve the original resolution of the input video.
Some tips and tricks:
1. Finding the optimum smoothing amount: It may take a few iterations to find the smoothing amount that works best for a particular video. Too little smoothing may not reduce the shakiness, while too much may introduce artifacts around the edges (due to the generation of missing parts). Start with the default value of 6 and go higher or lower depending on your needs and the output quality. In general, you should not need to go lower than 2 or higher than 12.
2. Black border(s) on the edge(s): If the video has black borders, remove them to get better stabilization results.
To crop the video starting at top-left corner point (x,y) to an output of width out_width and height out_height, use this filter: crop=out_width:out_height:x:y
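As a concrete sketch of the crop filter above, here is a hypothetical example assuming a 1920×1080 source with 40-pixel black bars at the top and bottom (the file names and bar size are made up for illustration):

```shell
# Hypothetical source: 1920x1080 with 40-pixel black bars at top and bottom.
in_h=1080; bar=40
out_h=$((in_h - 2 * bar))   # 1000 pixels of picture remain after removing both bars
ffmpeg -i input_with_borders.mp4 -vf "crop=1920:${out_h}:0:${bar}" input_cropped.mp4
```

The last two crop parameters are the x and y offsets of the crop window, so y is set to the height of the top bar.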
3. Portrait video: If the video is portrait, it is recommended to rotate it to landscape, perform the stabilization, and rotate the output video back to portrait mode.
To rotate the input video 90 degrees clockwise: ffmpeg -i input_portrait_video.mp4 -vf "transpose=1" -qmin 1 -qmax 1 input_landscape_video.mp4
To rotate the output video 90 degrees counter-clockwise: ffmpeg -i output_landscape_video.mp4 -vf "transpose=2" -qmin 1 -qmax 1 output_portrait_video.mp4
4. High-resolution and/or long video: During experimentation, try the stabilization on a smaller-resolution version of the video and compare the results for different smoothing amounts. I would recommend using 512 as the output video height during the trials.
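A quick way to produce such a trial clip with ffmpeg (file names are placeholders):

```shell
# Downscale to a 512-pixel-high trial clip. The -2 lets ffmpeg pick a width that
# preserves the aspect ratio while staying divisible by 2, as most encoders require.
trial_filter="scale=-2:512"
ffmpeg -i input_video.mp4 -vf "$trial_filter" trial_video.mp4
```

Once you have settled on a smoothing amount with the small clip, run the full-resolution video through stabilization only once.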
Please share videos that do not work: upload here.
Limitations:
The current method doesn’t support 360-degree videos.
It also doesn’t yet compensate for rolling-shutter effects or motion blur. These artifacts may become more noticeable once the video is stabilized.
We are working on these limitations and continuously improving the stabilization models and methods.
Please give it a try and share your opinions. Your feedback is important to continuously improve the app. Thank you!
I’ve been finding watermarks to be quite troublesome, as they tend to shake in response to the stabilization. Is watermark removal/detection/replacement in the works?
I also have a larger write-up I’ve been working on regarding some performance optimizations that can be made, but for now I am wondering what is the benefit to phase 2 (creating the 2x .png with OpenCV) using .png specifically?
Could lossless video be piped in instead? Both encode and decode speeds for .png are a pretty significant bottleneck, as the work has to be done single-threaded on the CPU. WEBP could be a good alternative as well, because it allows multi-threaded encode and decode.
I need to do more analysis, but I believe phase 2 is mostly bottlenecked by IO of the initial frame buffer from disk. I don’t believe these 300 MB initial frames contain any data to start with (it is just reserved space), but reading them into memory still requires actually reading them off the hard drive. Is there any reason you can’t just allocate a buffer of the size you need in memory directly, without reading from a file?
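To illustrate the piping idea from above: a hypothetical sketch where ffmpeg streams raw frames straight into the next stage, so no intermediate image files touch the disk ("./stabilizer" is a made-up stand-in for the consuming process, not part of the actual product):

```shell
# Hypothetical: stream raw frames over a pipe instead of writing a PNG per frame.
# rawvideo has zero encode/decode cost; the consumer must know the frame geometry.
pix_fmt=rgb24   # 3 bytes per pixel, uncompressed
ffmpeg -i input_video.mp4 -f rawvideo -pix_fmt "$pix_fmt" pipe:1 | ./stabilizer --width 1920 --height 1080
```

The trade-off is exactly the one mentioned later in the thread: raw frames are large, so the bottleneck moves from the codec to pipe/disk bandwidth.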
Thanks for your wonderful suggestions. There is no watermark-related processing at this moment. But that’s a good point. Watermark might degrade the stabilization quality. We will look into it.
And yes, the 2nd pass (or phase 2) has some room for optimization. We have been focusing on quality and robustness so far, but as our AI models mature, we will shift our focus to optimization.
We are considering replacing PNG with something else to speed up encoding/decoding and reduce the disk-space requirement. Also, our current method makes it complicated to use a video instead of an image sequence, or to use only memory instead of reading/writing files. As we continuously improve and polish our method, we keep these bottlenecks in mind and aim to minimize them in future iterations.
Even if you end up sticking with .png, you can tweak the compressor to get some nice speedups. I’m not at my PC atm so I don’t have specifics, but in Phase 0 I believe I was able to give ffmpeg -compress_level 4 for .png, which yields about a ~20% overall speedup (for that phase) at the cost of ~5% larger .png files. This is generally the case when using .png or zlib for any image. The effort is rated on a scale of 0-9, with 6 being the default. The real-world speed improvements do not tend to be linear, however: effort level 3 is pretty much the same as effort level 4 except in specific edge cases.
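For concreteness, a hypothetical frame-dump step with a lighter zlib effort might look like this (file names are placeholders; I am assuming ffmpeg's generic -compression_level codec option here, which the PNG encoder reads as the zlib level, so the exact flag spelling may differ from the one quoted above):

```shell
# Write the PNG sequence at zlib effort 4 instead of the default 6,
# trading slightly larger files for a faster single-threaded encode.
level=4
mkdir -p frames
ffmpeg -i input_video.mp4 -compression_level "$level" frames/frame_%06d.png
```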
If you end up using .jxl or .webp, you have similar options at your disposal while still being mathematically lossless. Uncompressed, of course, will just give you a different bottleneck on your disk/network IO.
I recommend playing around with it, you can get a lot of free performance that could end up being a 1 line code change depending on your setup.
Compiler flags are another great and easy way to get some free speed. I saw that you guys are using Conan; if you are building your dependencies rather than just pulling the prebuilts, you can pass flags to GCC etc. in your Conan profile. Then you can play around with compiling libpng etc. with -O3.
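A rough sketch of what that might look like in a Conan (1.x-style) profile; the settings values are placeholders for whatever your build actually uses, and the flags only take effect for dependencies built from source (e.g. with conan install --build=missing), not for prebuilt binaries:

```ini
[settings]
os=Linux
compiler=gcc
build_type=Release

[env]
# Hypothetical: pass optimization flags to dependency builds such as libpng.
CFLAGS=-O3
CXXFLAGS=-O3
```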
I started playing around with this a bit. I can’t wait for rolling-shutter support; for me that is currently the biggest limitation. The videos I’m trying to smooth all show very noticeable rolling-shutter artifacts (warping) when smoothed.
Currently I’d say the quality of the smoothing (for my old videos) is on par with vid.stab in ffmpeg. iMovie on Mac is actually far better because it compensates for it.
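For anyone who wants to reproduce the vid.stab baseline mentioned above, the standard two-pass invocation looks like this (assuming an ffmpeg build with libvidstab enabled; file names and the smoothing value are illustrative):

```shell
# Pass 1 analyzes camera motion and writes the transform file;
# pass 2 applies the smoothed transforms to produce the stabilized video.
trf=transforms.trf
ffmpeg -i shaky.mp4 -vf "vidstabdetect=result=${trf}" -f null -
ffmpeg -i shaky.mp4 -vf "vidstabtransform=input=${trf}:smoothing=30" stabilized_vidstab.mp4
```

Note that vidstabtransform crops/zooms by default, so it is not a full-frame method like the one discussed in this thread.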
Thanks for the feedback. We are working on the rolling shutter effect. It would be very helpful if you could share some sample videos of yours (submit to dropbox) so that we can test our methods.
Sure thing! I uploaded a shaky video along with it smoothed a few different ways to see the rolling shutter warping. I’ll upload a few more in the next few days if it’s helpful