Construction and value of a vestibular function calibration test recognition model based on dual-stream ViT and ConvNeXt architecture

Xu LUO; Peixia WU; Weiming HAO; Yinhong QU; Han CHEN

Vernacular Title: 基于双流ViT＋ConvNeXt架构的前庭功能校准识别模型构建与价值
Author: Xu LUO ¹ ; Peixia WU ² ; Weiming HAO ² ; Yinhong QU ¹ ; Han CHEN ¹
Author Information

1. Shanghai ZEHNIT Medical Technology Co., Ltd., Shanghai 201318, China.
2. Vertigo and Balance Function Disorders Clinical Center, EYE & ENT Hospital of Fudan University, Shanghai 200031, China.
Publication Type: Monographic report: Multi-disciplinary diagnosis and treatment of vertigo
Keywords: vestibular function; calibration test; videonystagmography; deep learning model
From: Chinese Journal of Clinical Medicine 2025;32(2):207-211
Country: China
Language: Chinese
Abstract: Objective To improve the efficiency and accuracy of assessing videonystagmography calibration test results, and to enable effective recognition of saccadic undershoot waveforms, by developing a deep learning model based on a dual-stream architecture. Methods A vestibular function calibration test recognition model with cross-modal feature fusion was constructed by integrating a vision transformer (ViT) with a modified ConvNeXt convolutional network. The model took trajectory images and spatial distribution maps as inputs and used a multi-task learning framework both to classify calibration data and to directly evaluate undershoot waveforms. Results The model performed strongly in assessing calibration compliance: accuracy, sensitivity, and specificity for the left side, middle, and right side all exceeded 90%, and all AUC values exceeded 0.99. The best values were 97.66% accuracy (middle), 98.98% sensitivity (middle), 96.87% specificity (right side), and an AUC of 0.997 (right side). In undershoot waveform recognition, the model also showed promising performance, with 87.50% accuracy, 89.66% sensitivity, 85.71% specificity, an F1 score of 86.67%, and an AUC of 0.931. Conclusions The proposed method significantly improves the efficiency and accuracy of calibration test assessment and provides a novel approach to undershoot waveform recognition.
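
For readers less familiar with the reported metrics, the following minimal Python sketch shows how accuracy, sensitivity, specificity, and F1 score are derived from a binary confusion matrix. The abstract does not report the underlying counts, so the counts used here (26 true positives, 3 false negatives, 30 true negatives, 5 false positives) are hypothetical; they were chosen only because they happen to reproduce the undershoot-recognition percentages quoted above, and they should not be read as the authors' data. The names `ConfusionCounts` and `binary_metrics` are likewise illustrative.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts for a binary task (positive class = undershoot waveform)."""
    tp: int  # positives correctly recognized
    fn: int  # positives missed
    tn: int  # negatives correctly rejected
    fp: int  # negatives wrongly flagged


def binary_metrics(c: ConfusionCounts) -> dict[str, float]:
    """Standard definitions; assumes both classes are present (no zero denominators)."""
    total = c.tp + c.fn + c.tn + c.fp
    return {
        "accuracy": (c.tp + c.tn) / total,
        "sensitivity": c.tp / (c.tp + c.fn),       # recall on the positive class
        "specificity": c.tn / (c.tn + c.fp),       # recall on the negative class
        "f1": 2 * c.tp / (2 * c.tp + c.fp + c.fn)  # harmonic mean of precision and recall
    }


# Hypothetical counts (NOT from the paper), chosen to match the quoted percentages.
example = ConfusionCounts(tp=26, fn=3, tn=30, fp=5)
metrics = binary_metrics(example)

for name, value in metrics.items():
    print(f"{name}: {value:.2%}")
```

With these assumed counts, the script prints 87.50% accuracy, 89.66% sensitivity, 85.71% specificity, and an F1 score of 86.67%. AUC cannot be computed from a single confusion matrix because it depends on the model's scores across all decision thresholds.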