Publications

Distribution Alignment Optimization through Neural Collapse for Long-tailed Classification

Published in ICML, 2024

A deep neural network trained well on a balanced dataset usually exhibits the Neural Collapse (NC) phenomenon, an informative indicator that the model achieves good performance. However, NC is usually hard to achieve for a model trained on long-tailed datasets, which degrades performance on test data. This work aims to induce the NC phenomenon in imbalanced learning from the perspective of distribution matching. By enforcing the distribution of last-layer representations to align with the ideal distribution of the equiangular tight frame (ETF) structure, we develop a Distribution Alignment Optimization (DisA) loss. Acting as a plug-and-play method, it can be combined with most existing long-tailed methods, and we further instantiate it for the cases of a fixed classifier and a learned classifier. Extensive experiments show the effectiveness of DisA, providing a promising solution to the imbalance issue.
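
As a rough illustration of the ETF structure that the abstract refers to, the sketch below builds a simplex ETF and checks its geometry. It is not the DisA loss itself. It assumes the feature dimension equals the number of classes (so no rotation matrix is needed), and the names `simplex_etf` and `cosine` are hypothetical helpers introduced here.

```python
import math

def simplex_etf(num_classes):
    """Return K unit-norm class prototypes (rows) forming a simplex ETF in R^K."""
    k = num_classes
    scale = math.sqrt(k / (k - 1))
    # M = sqrt(K / (K - 1)) * (I - (1 / K) * 1 1^T); row i is the prototype of class i.
    # Assumption: feature dimension == K, so the usual rotation matrix is the identity.
    return [[scale * ((1.0 if i == j else 0.0) - 1.0 / k) for j in range(k)]
            for i in range(k)]

def cosine(u, v):
    """Cosine similarity between two vectors given as lists."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

prototypes = simplex_etf(4)
# Every pair of distinct prototypes has the same cosine, -1 / (K - 1).
pairwise = [cosine(prototypes[i], prototypes[j])
            for i in range(4) for j in range(i + 1, 4)]
norms = [math.sqrt(sum(x * x for x in row)) for row in prototypes]
```

In this construction all prototypes have unit norm and are maximally and equally separated, which is the target geometry that neural collapse describes for class means.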

Download Paper

FP-Net: frequency-perception network with adversarial training for image manipulation localization

Published in Multimedia Tools and Applications, 2024

Mining the forged regions of digitally tampered images is one of the key research tasks in visual recognition. Although many algorithms address image manipulation localization, most approaches focus only on semantic information in the spatial domain and ignore the frequency inconsistency between authentic and tampered regions. In addition, the generality and robustness of the models are severely affected by the different noise distributions of the training and test sets. To address these issues, we propose a frequency-perception network with adversarial training for image manipulation localization. Our method not only captures representation information for boundary artifact identification in the spatial domain but also separates low- and high-frequency information in the frequency domain to acquire tampering cues. Specifically, the frequency separation sensing module enriches the local sensing range by separating multi-scale frequency-domain features. It accurately identifies high-frequency noise features in the manipulated region and distinguishes low-frequency information. The global frequency attention module uses multiple sampling and convolution operations to interactively learn multi-scale feature information and integrate dual-domain frequency content to identify the physical locations of tampering. Adversarial training is employed to construct hard adversarial training samples based on adversarial attacks, which avoids interference from unevenly distributed redundant noise information. Extensive experimental results show that our proposed method performs significantly better than mainstream approaches on five commonly used standard datasets.
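
The idea of separating low- and high-frequency information can be sketched on a toy signal. The example below is only an illustration under stated assumptions, not the paper's frequency separation sensing module: it uses a plain 1-D discrete Fourier transform on a single pixel row instead of a learned multi-scale 2-D module, and `dft`, `idft`, `split_frequencies`, and the `cutoff` parameter are hypothetical names.

```python
import cmath

def dft(signal):
    """Naive O(n^2) discrete Fourier transform of a real or complex sequence."""
    n = len(signal)
    return [sum(signal[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(spectrum):
    """Inverse DFT, returning only the real part of each sample."""
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n)
                for k in range(n)).real / n
            for t in range(n)]

def split_frequencies(signal, cutoff):
    """Split a real 1-D signal into low- and high-frequency parts that sum back to it."""
    n = len(signal)
    spectrum = dft(signal)
    # Bins k and n - k carry the same physical frequency, min(k, n - k).
    low_spec = [c if min(k, n - k) <= cutoff else 0 for k, c in enumerate(spectrum)]
    high_spec = [c - l for c, l in zip(spectrum, low_spec)]
    return idft(low_spec), idft(high_spec)

# Hypothetical pixel row: flat background with one sharp spike.
row = [10, 10, 10, 10, 90, 10, 10, 10]
low, high = split_frequencies(row, cutoff=1)

# A perfectly flat row has no high-frequency content.
flat_low, flat_high = split_frequencies([5] * 8, cutoff=1)
```

The sharp spike contributes most of the high-frequency part, which is the kind of cue that abrupt, tampering-related discontinuities tend to leave, while smooth content stays in the low-frequency part.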

Enhancing Minority Classes by Mixing: An Adaptative Optimal Transport Approach for Long-tailed Classification

Published in NeurIPS, 2023

Real-world data often suffer from severe class imbalance, where several majority classes have a significantly larger presence in the training set than minority classes. One effective solution is to use mixup-based methods to generate synthetic samples that enhance the presence of minority classes. Previous approaches mix background images from the majority classes and foreground images from the minority classes at random, which ignores sample-level semantic similarity and can yield less reasonable or less useful images. In this work, we propose an adaptive image-mixing method based on optimal transport (OT) that incorporates both class-level and sample-level information and can generate semantically reasonable and meaningful mixed images for minority classes. Owing to its flexibility, our method can be combined with existing long-tailed classification methods to enhance their performance, and it can also serve as a general data augmentation method for balanced datasets. Extensive experiments indicate that our method is effective on long-tailed classification tasks.
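
To make the role of optimal transport more concrete, the sketch below pairs minority samples with majority samples by solving an entropy-regularized OT problem with Sinkhorn iterations. This is a generic OT solver chosen for illustration, not necessarily the formulation used in the paper. The 1-D "features", the uniform marginals, the squared-distance cost, and the `reg` value are all assumptions made here.

```python
import math

def sinkhorn(cost, row_mass, col_mass, reg=0.5, num_iters=200):
    """Entropy-regularized OT plan between two discrete distributions."""
    kernel = [[math.exp(-c / reg) for c in row] for row in cost]
    u = [1.0] * len(row_mass)
    v = [1.0] * len(col_mass)
    for _ in range(num_iters):
        # Alternately rescale rows and columns to match the target marginals.
        u = [a / sum(k * vj for k, vj in zip(row, v))
             for a, row in zip(row_mass, kernel)]
        v = [b / sum(kernel[i][j] * u[i] for i in range(len(u)))
             for j, b in enumerate(col_mass)]
    return [[u[i] * kernel[i][j] * v[j] for j in range(len(v))]
            for i in range(len(u))]

# Hypothetical 1-D features: 2 minority (foreground) and 3 majority (background) samples.
minority = [0.1, 0.9]
majority = [0.0, 0.5, 1.0]
cost = [[(m - b) ** 2 for b in majority] for m in minority]
plan = sinkhorn(cost, [0.5, 0.5], [1 / 3, 1 / 3, 1 / 3])

# For each minority sample, pick the majority sample that receives the most mass.
partners = [max(range(len(majority)), key=lambda j: row[j]) for row in plan]
```

Compared with random pairing, the transport plan favors background candidates that are close to each foreground sample in feature space while still spreading mass over all majority samples.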

Download Paper

MB-Net: multiscale boundary interaction learning for image manipulation localization

Published in Journal of Electronic Imaging, 2023

Image editing techniques can modify the content of images indiscriminately, which poses a grave threat to the security of society. Hence, localizing the manipulated regions of images is essential. A serious challenge for image manipulation detection is the lack of strategies for perceiving global features and refining edges. In this paper, we present a multiscale boundary interaction learning network for image manipulation localization to solve both problems. This network contains an adjacent-scale mutual module that enriches the global perception domain by interactively learning adjacent-scale features. It avoids the heavy noise interference caused by directly fusing features from all scales. To effectively suppress semantic content segmentation, the boundary pixel disparity module computes interpixel differences at specific angles to enhance the recognition of boundary artifacts between tampered and real regions. The fusion attention module combines scale and edge information, integrating spatial and channel correlations in a compatible way. Extensive experimental results indicate that our proposed method is significantly superior to current state-of-the-art methods on public standard datasets.
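
The notion of interpixel differences at specific angles can be illustrated with a small sketch. It is not the paper's boundary pixel disparity module: the angle set (0, 45, 90, and 135 degrees), the absolute-difference measure, the zero value at image borders, and the name `directional_differences` are assumptions made for this example.

```python
# Neighbour offsets (row, col) for 0, 45, 90 and 135 degrees (assumed angle set).
DIRECTIONS = {0: (0, 1), 45: (-1, 1), 90: (-1, 0), 135: (-1, -1)}

def directional_differences(image):
    """Map each angle to a grid of |pixel - neighbour|; missing neighbours give 0."""
    height, width = len(image), len(image[0])
    result = {}
    for angle, (dr, dc) in DIRECTIONS.items():
        grid = [[0] * width for _ in range(height)]
        for r in range(height):
            for c in range(width):
                nr, nc = r + dr, c + dc
                if 0 <= nr < height and 0 <= nc < width:
                    grid[r][c] = abs(image[r][c] - image[nr][nc])
        result[angle] = grid
    return result

# Hypothetical 4x4 grayscale patch: dark left half, bright right half (a vertical edge).
patch = [[10, 10, 200, 200] for _ in range(4)]
diffs = directional_differences(patch)
```

On this patch the horizontal (0 degree) differences respond strongly at the edge column, while the vertical (90 degree) differences stay at zero, which shows how direction-specific differences emphasize boundaries rather than region content.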