Article Abstract
HU Zhentao(胡振涛)*, HU Chonghao*, YANG Haoran*, SHUAI Weiwei**. [J]. High Technology Letters, 2024, 30(1): 23-30
Unsupervised multi-modal image translation based on the squeeze-and-excitation mechanism and feature attention module
  
DOI: 10.3772/j.issn.1006-6748.2024.01.003
Keywords: multi-modal image translation, generative adversarial network (GAN), squeeze-and-excitation (SE) mechanism, feature attention (FA) module
Authors and affiliations:
HU Zhentao(胡振涛)*, HU Chonghao*, YANG Haoran*, SHUAI Weiwei**
* School of Artificial Intelligence, Henan University, Zhengzhou 450046, P. R. China
** 95795 Troops of the PLA, Guilin 541003, P. R. China
Abstract:
      Unsupervised multi-modal image translation is an emerging domain of computer vision whose goal is to transform an image from a source domain into many diverse styles in a target domain. However, most advanced approaches model the different domain mappings with a multi-generator mechanism, which makes neural network training inefficient and causes mode collapse, limiting the diversity of the generated images. To address this issue, this paper introduces a multi-modal unsupervised image translation framework that performs multi-modal translation with a single generator. Specifically, a domain code is first introduced to explicitly control the different generation tasks. Secondly, the squeeze-and-excitation (SE) mechanism and a feature attention (FA) module are incorporated. Finally, the model integrates multiple optimization objectives to ensure efficient multi-modal translation. Qualitative and quantitative experiments on several unpaired benchmark image translation datasets demonstrate the advantages of the proposed method over existing techniques. Overall, the experimental results show that the proposed method is versatile and scalable.
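The squeeze-and-excitation mechanism named in the abstract can be illustrated with a minimal sketch: squeeze each channel to a scalar by global average pooling, pass the scalars through a small two-layer gate (FC, ReLU, FC, sigmoid), and rescale each channel by its gate. The pure-Python layout and toy weights below are illustrative assumptions, not the paper's implementation:

```python
import math

def se_block(x, w1, b1, w2, b2):
    """Toy squeeze-and-excitation block on a feature map stored as
    nested lists: x[channel][row][col]."""
    # Squeeze: global average pooling, one scalar per channel.
    z = [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in x]
    # Excitation: FC -> ReLU -> FC -> sigmoid yields a gate per channel.
    h = [max(0.0, sum(w * zj for w, zj in zip(row, z)) + b)
         for row, b in zip(w1, b1)]
    s = [1.0 / (1.0 + math.exp(-(sum(w * hj for w, hj in zip(row, h)) + b)))
         for row, b in zip(w2, b2)]
    # Scale: reweight every spatial position of each channel by its gate.
    return [[[v * s[c] for v in row] for row in ch] for c, ch in enumerate(x)]

# Two 2x2 channels with means 1.0 and 2.0.
x = [[[1.0, 1.0], [1.0, 1.0]], [[2.0, 2.0], [2.0, 2.0]]]
w1, b1 = [[1.0, 1.0]], [0.0]          # squeeze 2 channels to 1 hidden unit
w2, b2 = [[0.0], [0.0]], [0.0, 0.0]   # zero weights -> sigmoid(0) = 0.5 gates
y = se_block(x, w1, b1, w2, b2)       # every channel scaled by 0.5
```

With the zero excitation weights above, both gates are sigmoid(0) = 0.5, so each channel is uniformly halved; learned weights would instead let the network emphasize informative channels and suppress the rest.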