Hierarchical Neural Operator Transformer with Learnable Frequency-aware Loss Prior for Arbitrary-scale Super-resolution

¹Brookhaven National Laboratory
²Texas A&M University
The International Conference on Machine Learning (ICML) 2024


Abstract

In this work, we present an arbitrary-scale super-resolution (SR) method to enhance the resolution of scientific data, which often involves complex challenges such as continuity, multi-scale physics, and the intricacies of high-frequency signals. Grounded in operator learning, the proposed method is resolution-invariant. The core of our model is a hierarchical neural operator that leverages a Galerkin-type self-attention mechanism, enabling efficient learning of mappings between function spaces. Sinc filters are used to facilitate the information transfer across different levels in the hierarchy, thereby ensuring representation equivalence in the proposed neural operator. Additionally, we introduce a learnable prior structure that is derived from the spectral resizing of the input data. This loss prior is model-agnostic and is designed to dynamically adjust the weighting of pixel contributions, thereby balancing gradients effectively across the model. We conduct extensive experiments on diverse datasets from different domains and demonstrate consistent improvements compared to strong baselines, which consist of various state-of-the-art SR methods.
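The loss prior described above can be illustrated with a minimal sketch. Here we assume the prior reduces to a per-pixel weight map that emphasizes high-frequency content, obtained by subtracting a low-pass version of the target; the function name and the pooling-based low-pass filter are illustrative choices, not the paper's actual learnable formulation.

```python
import torch
import torch.nn.functional as F

def frequency_weighted_l1(pred, target, alpha=1.0):
    """L1 loss reweighted toward high-frequency pixels (illustrative).

    A low-pass version of the target is built by average pooling and
    bilinear upsampling; the residual highlights edges and fine detail,
    which then receive larger per-pixel weights.
    """
    low = F.interpolate(F.avg_pool2d(target, 4), size=target.shape[-2:],
                        mode="bilinear", align_corners=False)
    hf = (target - low).abs()                      # high-frequency residual
    w = 1.0 + alpha * hf / (hf.mean() + 1e-8)      # upweight detailed regions
    return (w * (pred - target).abs()).mean()

target = torch.rand(1, 3, 32, 32)
pred = torch.rand(1, 3, 32, 32)
loss = frequency_weighted_l1(pred, target)
```

Because the weight map depends only on the target, such a prior is model-agnostic: it can be attached to any SR backbone's pixel loss without architectural changes.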

Introduction

Super-resolution (SR) plays a pivotal role in low-level vision tasks. The primary objective of SR is to transform blurred, fuzzy, low-resolution images into clear, high-resolution images with enhanced visual perception. In recent years, deep learning has significantly advanced SR and has demonstrated promising performance in diverse domains beyond computer vision, including but not limited to medical imaging, climate modeling, and remote sensing. Nevertheless, existing deep learning-based SR methods are often limited to a fixed scale (e.g., ×2, ×3, ×4). The emergence of implicit neural representation (INR) in computer vision allows for continuous representation of complex 2D/3D objects and scenes. This development introduces opportunities for arbitrary-scale SR.
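The INR idea behind arbitrary-scale SR can be sketched in a few lines: a network maps continuous coordinates to pixel values, so the same trained weights can be queried on a grid of any resolution. The toy MLP below is purely illustrative and is not HiNOTE's architecture.

```python
import torch
import torch.nn as nn

# Toy implicit neural representation: (x, y) coordinate -> RGB value.
inr = nn.Sequential(nn.Linear(2, 64), nn.ReLU(),
                    nn.Linear(64, 64), nn.ReLU(),
                    nn.Linear(64, 3))

def render(net, h, w):
    """Query the INR on an h-by-w grid of coordinates in [-1, 1]^2."""
    ys = torch.linspace(-1, 1, h)
    xs = torch.linspace(-1, 1, w)
    grid = torch.stack(torch.meshgrid(ys, xs, indexing="ij"), dim=-1)
    return net(grid.reshape(-1, 2)).reshape(h, w, 3)

out_small = render(inr, 32, 32)    # (32, 32, 3)
out_large = render(inr, 77, 129)   # (77, 129, 3): any scale, same weights
```

Since the coordinate grid, not the network, fixes the output resolution, a single model supports non-integer and anisotropic upsampling factors.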

Model

To achieve arbitrary-scale SR, we introduce the Hierarchical Neural Operator TransformEr (HiNOTE), featuring a hybrid upsampling-based encoder, a parameter-free sampler, and a new hierarchical neural operator transformer-based decoder.
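The Galerkin-type self-attention mentioned in the abstract replaces the softmax with layer normalization on the keys and values, so the key-value product can be computed first and the cost drops from quadratic to linear in the number of points. The module below is a minimal sketch of that mechanism in the style of Cao's Galerkin transformer; the class and layer names are illustrative, not HiNOTE's actual code.

```python
import torch
import torch.nn as nn

class GalerkinAttention(nn.Module):
    """Softmax-free, linear-complexity attention (illustrative sketch).

    K and V are layer-normalized, then the (d x d) matrix K^T V is
    formed first, giving O(n d^2) cost instead of O(n^2 d).
    """
    def __init__(self, dim):
        super().__init__()
        self.q = nn.Linear(dim, dim)
        self.k = nn.Linear(dim, dim)
        self.v = nn.Linear(dim, dim)
        self.norm_k = nn.LayerNorm(dim)
        self.norm_v = nn.LayerNorm(dim)

    def forward(self, x):                      # x: (batch, n_points, dim)
        q = self.q(x)
        k = self.norm_k(self.k(x))
        v = self.norm_v(self.v(x))
        n = x.shape[1]
        context = k.transpose(-2, -1) @ v / n  # (batch, dim, dim)
        return q @ context                     # (batch, n_points, dim)

x = torch.randn(2, 1024, 64)
attn = GalerkinAttention(64)
y = attn(x)                                    # shape (2, 1024, 64)
```

Because the attention acts on point-wise feature vectors rather than a fixed pixel grid, the same layer applies at any sampling density, which is what makes the decoder resolution-invariant in spirit.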

Quantitative Results

We present extensive experiments to validate the efficacy of our method across multiple challenging datasets. We benchmark the proposed HiNOTE against state-of-the-art methods designed for both single-scale and arbitrary-scale tasks.

Qualitative Results

We provide qualitative comparisons between HiNOTE and other leading arbitrary-scale SR methods.

BibTeX

@inproceedings{luo2024hierarchical,
  title={Hierarchical Neural Operator Transformer with Learnable Frequency-aware Loss Prior for Arbitrary-scale Super-resolution},
  author={Xihaier Luo and Xiaoning Qian and Byung-Jun Yoon},
  booktitle={Forty-first International Conference on Machine Learning},
  year={2024},
  url={https://openreview.net/forum?id=LhAuVPWq6q}
}