Cross-modal hash retrieval of medical images based on Transformer semantic alignment.
10.7507/1001-5515.202407034
- Author:
Qianlin WU 1; Lun TANG 1; Qinghai LIU 1; Liming XU 2; Qianbin CHEN 1
Author Information
1. School of Communications and Information Engineering, Chongqing University of Posts and Telecommunications, Chongqing 400065, P. R. China.
2. Sichuan Artificial Intelligence Research Institute, Yibin, Sichuan 644005, P. R. China.
- Publication Type:Journal Article
- Keywords:
Cross-modal hash;
Segmented training;
Semantic alignment;
Transformer
- MeSH:
Algorithms;
Semantics;
Humans;
Ultrasonography;
Information Storage and Retrieval/methods*;
Image Processing, Computer-Assisted/methods*
- From:
Journal of Biomedical Engineering
2025;42(1):156-163
- Country:China
- Language:Chinese
- Abstract:
Medical cross-modal retrieval aims to perform semantic similarity search across different modalities of medical cases, for example quickly locating relevant ultrasound images from an ultrasound report, or retrieving matching reports from an ultrasound image. However, existing medical cross-modal hash retrieval methods face significant challenges, including semantic and visual differences between modalities and the limited scalability of hash algorithms on large-scale data. To address these challenges, this paper proposes a Medical image Semantic Alignment Cross-modal Hashing based on Transformer (MSACH). The algorithm adopts a segmented training strategy that combines modality feature extraction with hash function learning, effectively extracting low-dimensional features that retain important semantic information. A Transformer encoder is used for cross-modal semantic learning. By introducing manifold similarity constraints, balance constraints, and a linear classification network constraint, the algorithm enhances the discriminability of the hash codes. Experimental results show that MSACH improves the mean average precision (MAP) by 11.8% and 12.8% on two datasets compared with traditional methods. The algorithm performs well in improving retrieval accuracy and handling large-scale medical data, showing promising potential for practical applications.
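
To make the retrieval and evaluation steps named in the abstract concrete, the following is a minimal sketch of generic hash-based retrieval scored with MAP. It is not the MSACH implementation: the function names (to_hash_codes, hamming_distance, mean_average_precision), the sign-based binarization, and the single-label relevance rule (a database item is relevant when its label equals the query label) are all assumptions made for illustration, and the feature extraction, Transformer encoder, and training constraints are left out entirely.

```python
import numpy as np


def to_hash_codes(features):
    """Binarize real-valued features into {-1, +1} codes by sign (assumed scheme)."""
    return np.where(np.asarray(features) >= 0, 1, -1).astype(np.int8)


def hamming_distance(query_codes, db_codes):
    """Pairwise Hamming distance for {-1, +1} codes of length k.

    For such codes, distance = (k - <q, d>) / 2.
    """
    q = np.asarray(query_codes, dtype=np.int32)
    d = np.asarray(db_codes, dtype=np.int32)
    k = q.shape[1]
    return (k - q @ d.T) // 2


def mean_average_precision(query_codes, db_codes, query_labels, db_labels):
    """MAP over all queries, ranking the database by Hamming distance.

    Relevance is assumed to mean "same class label"; queries with no
    relevant database item are skipped.
    """
    dist = hamming_distance(query_codes, db_codes)
    db_labels = np.asarray(db_labels)
    average_precisions = []
    for i, query_label in enumerate(query_labels):
        # Stable sort keeps database order among equal distances.
        order = np.argsort(dist[i], kind="stable")
        relevant = (db_labels[order] == query_label).astype(np.float64)
        n_relevant = relevant.sum()
        if n_relevant == 0:
            continue
        precision_at_rank = np.cumsum(relevant) / np.arange(1, len(relevant) + 1)
        average_precisions.append((precision_at_rank * relevant).sum() / n_relevant)
    return float(np.mean(average_precisions)) if average_precisions else 0.0


# Toy example: codes for one modality (e.g. report) query another (e.g. image).
db_codes = np.array([[1, 1, 1, 1], [1, 1, 1, -1], [-1, -1, -1, -1]], dtype=np.int8)
db_labels = np.array([0, 1, 0])
query_codes = np.array([[1, 1, 1, 1]], dtype=np.int8)
query_labels = np.array([0])
map_score = mean_average_precision(query_codes, db_codes, query_labels, db_labels)
```

In this toy case the ranked relevance list is relevant, not relevant, relevant, so the average precision is (1 + 2/3) / 2. Comparing binary codes by Hamming distance is the reason hashing is attractive for large collections: ranking needs only integer operations on short codes rather than distance computations on high-dimensional features.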