Please bring this fast, near-flawless model (HYPIR) to Video AI, Photo AI, and Gigapixel AI.

official site:
https://hypir.xpixel.group

github:

some demos:

papers:

Chinese introduction (some demos):
https://www.siat.ac.cn/siatxww/kyjz/202507/t20250729_7897872.html


I used Grok to translate the Chinese introduction into English:

The field of image restoration has long faced a dilemma: pursuing high quality often comes at the cost of lengthy processing times, while prioritizing speed sacrifices fine details. How can an old photo be restored both quickly and effectively?

On July 28, the team led by Researcher Dong Chao from the Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, unveiled a large-scale image restoration model named HYPIR. This model not only outperforms existing image restoration techniques by being dozens of times faster but also excels in high-resolution output, text fidelity, comprehension, and user control flexibility. It offers a more efficient solution for practical applications in image restoration, opening new possibilities for cultural preservation, film restoration, and other fields.

Breaking Through Traditional Technical Bottlenecks for More Efficient Image Restoration

Traditional methods, particularly those based on pretrained diffusion models, have significantly improved image restoration outcomes but are hindered by high computational complexity, slow inference speeds, substantial training resource demands, and limited controllability of results. These issues have been bottlenecks restricting the development of image restoration technology.

Last year, Dong Chao’s team introduced SUPIR, an intelligent image enhancement model capable of restoring low-quality images to near-original high-definition quality, effectively addressing various types of degradation. The new HYPIR model, an upgraded version, abandons the iterative diffusion approach in favor of single-step adversarial generative training. This change makes the algorithm several times faster, while an updated text-to-image base model further improves quality, achieving 8K-level detail generation and surpassing SUPIR in stability and controllability.
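The speed claim above comes down to the number of network forward passes per image. A toy sketch of that difference (my own illustration, not HYPIR's actual code; `denoise_step` is a hypothetical stand-in for one network forward pass, which dominates inference cost):

```python
# Toy illustration (NOT HYPIR's code): a one-step adversarially trained
# restorer runs the network once, while iterative diffusion restoration
# runs it once per sampling step.

calls = {"n": 0}

def denoise_step(x):
    calls["n"] += 1   # count forward passes (the dominant cost)
    return x          # a real network would return a cleaner image here

def diffusion_restore(x, steps=50):
    # Conventional diffusion restoration: many sequential forward passes.
    for _ in range(steps):
        x = denoise_step(x)
    return x

def one_step_restore(x):
    # HYPIR-style single-step restoration: one forward pass total.
    return denoise_step(x)

diffusion_restore("low-quality image", steps=50)
diffusion_passes = calls["n"]

calls["n"] = 0
one_step_restore("low-quality image")
single_pass = calls["n"]
```

With a typical 50-step sampler, the iterative route costs 50 forward passes against 1 for the single-step model, which is where the order-of-magnitude speedup comes from.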

“Previous image restoration methods often involved diffusion model distillation, ControlNet adapters, or multi-step inference processes. HYPIR, however, eliminates the need for these steps, making it simpler to use. It achieves an order-of-magnitude improvement in training and inference speed compared to traditional methods, with superior performance,” Dong explained. HYPIR’s innovations include initializing the restoration network with a pretrained diffusion model and providing a theoretical explanation for the profound principles behind this simple approach.

Experimental data shows that on a single graphics card (GPU), HYPIR can restore a 1024x1024 resolution image in just 1.7 seconds. Compared to existing methods, HYPIR delivers superior image restoration quality and is adaptable to pretrained diffusion models of various sizes, offering flexibility for different application scenarios.

Outstanding Performance and Broad Application Prospects

In practical applications, HYPIR demonstrates exceptional performance in high-resolution image restoration, text fidelity, comprehension, and user control flexibility.

For instance, in old photo restoration, the team used HYPIR to restore images from classic Chinese and international films and TV series, bringing blurred visuals back to life with clear details and providing technical support for preserving cultural memories. In high-resolution restoration, HYPIR overcomes the traditional trade-off between speed and quality, efficiently generating 8K-resolution images.

In terms of text fidelity, traditional diffusion-based methods often produce blurry or distorted text, lacking precision. HYPIR, however, ensures high-fidelity and clear text restoration, accurately reproducing everything from simple labels to complex documents, making text in images sharp and readable.

Notably, HYPIR also boasts impressive natural language understanding capabilities, accurately interpreting user instructions and reflecting their intentions during the restoration process. Users can flexibly adjust the balance between generation and restoration or fine-tune the level of detail to achieve results tailored to their preferences. This user-friendly design makes HYPIR suitable for both professional and general users.

HYPIR not only showcases innovation in image restoration technology but also reflects an understanding of real-world application needs. By breaking free from conventional thinking, it provides practical solutions for cultural preservation, film restoration, and high-resolution image generation, injecting new vitality into the field.

Dong Chao’s team has long been dedicated to fundamental vision technology research, achieving multiple breakthroughs in image processing and publishing the AI monograph The Beauty of Low-Level Vision. For Dong, research requires “three hearts”: curiosity, integrity, and a commitment to the greater good. “True scientific achievements must respect factual truth and withstand rigorous scrutiny,” he said. The open-source code and model for HYPIR have been uploaded to GitHub (HYPIR homepage: https://hypir.xpixel.group) and successfully deployed on the Mingxi Technology platform. In collaboration with the Shenzhen Nanshan District Archives, the team has restored some archived photos, with plans to further advance HYPIR’s industrialization, allowing the public to experience the charm of this technology firsthand.