Deformable image registration plays a crucial role in medical image analysis. Although various advanced registration models have been proposed, achieving accurate and efficient deformable registration remains challenging. Motivated by the recent strong performance of Mamba in computer vision, we introduce a novel model called MCRDP-Net. MCRDP-Net adopts a dual-stream network architecture that combines Mamba blocks and convolutional blocks to simultaneously extract global and local information from the fixed and moving images. In the decoding stage, a pyramid network structure is used to obtain high-resolution deformation fields, enabling efficient and precise registration. The effectiveness of MCRDP-Net was validated on two public brain registration datasets, OASIS and IXI. Experimental results demonstrated clear advantages of MCRDP-Net in medical image registration: the Dice similarity coefficient (DSC), 95% Hausdorff distance (HD95), and average surface distance (ASD) reached 0.815, 8.123, and 0.521 on the OASIS dataset and 0.773, 7.786, and 0.871 on the IXI dataset, respectively. In summary, MCRDP-Net improves both the accuracy and the efficiency of deformable image registration, indicating its potential in medical image analysis and providing support for subsequent medical research and applications.
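As a minimal sketch of how a predicted deformation field is used, the following pure-Python example warps a small 2D moving image with a dense displacement field using bilinear interpolation. It is not the MCRDP-Net implementation: the function name `warp_image`, the border-clamping rule, and the toy 3x3 image are assumptions made for illustration, and practical registration pipelines operate on 3D volumes with tensor libraries.

```python
import math


def warp_image(moving, disp_y, disp_x):
    """Resample `moving` at (y + disp_y, x + disp_x) with bilinear interpolation.

    All three arguments are lists of rows with the same height and width.
    Sampling locations outside the image are clamped to the border
    (an assumption of this sketch).
    """
    h, w = len(moving), len(moving[0])
    warped = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Location in the moving image that maps to output pixel (y, x).
            sy = min(max(y + disp_y[y][x], 0.0), h - 1.0)
            sx = min(max(x + disp_x[y][x], 0.0), w - 1.0)
            # Four neighbouring grid points around the sampling location.
            y0, x0 = int(math.floor(sy)), int(math.floor(sx))
            y1, x1 = min(y0 + 1, h - 1), min(x0 + 1, w - 1)
            wy, wx = sy - y0, sx - x0
            # Interpolate along x on both rows, then along y.
            top = (1 - wx) * moving[y0][x0] + wx * moving[y0][x1]
            bottom = (1 - wx) * moving[y1][x0] + wx * moving[y1][x1]
            warped[y][x] = (1 - wy) * top + wy * bottom
    return warped


moving = [[0.0, 1.0, 2.0],
          [3.0, 4.0, 5.0],
          [6.0, 7.0, 8.0]]
zero = [[0.0] * 3 for _ in range(3)]
shift_x = [[1.0] * 3 for _ in range(3)]

identity = warp_image(moving, zero, zero)    # zero displacement leaves the image unchanged
shifted = warp_image(moving, zero, shift_x)  # every pixel samples its right-hand neighbour
```

A zero displacement field reproduces the moving image, while a constant displacement of one pixel along x makes each output pixel take the value of its right-hand neighbour, with the last column repeated because of border clamping.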
Pulmonary nodules in computed tomography (CT) images exhibit complex morphology and blurred boundaries, and existing segmentation methods still fall short in modelling cross-level dependencies among multi-scale features, which limits their performance in pulmonary nodule segmentation. To address these challenges, this paper proposes a semantic segmentation method for pulmonary nodules based on multi-scale feature interaction and cross-level coordinate attention (MFI-CLCA). The U-shaped network incorporates three architectures: a convolutional neural network (CNN), a Transformer, and Mamba. In the encoding phase, the CNN and Mamba learning paradigms are combined to capture both global and local information in the input data. The convolutional component extracts complex boundary features of the target by combining multi-scale convolutional operations with adaptive fusion operations. Global and local multi-head attention mechanisms are introduced in the bottleneck layer and the decoding phase, respectively, to model hierarchical feature dependencies. The skip connections incorporate a multi-level coordinate attention module that adaptively focuses on the information being passed through. Experimental results on the Lung Image Database Consortium (LIDC) dataset show that the approach achieves a Dice score of 90.52% and a sensitivity of 91.93%, outperforming existing state-of-the-art methods and validating its effectiveness for lung nodule segmentation.
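The two reported segmentation metrics can be illustrated with a short pure-Python sketch that computes the Dice score and the sensitivity from a predicted and a reference binary mask. The function name `dice_and_sensitivity`, the toy masks, and the convention of returning 1.0 when a denominator is zero are assumptions of this sketch, not details taken from the paper.

```python
def dice_and_sensitivity(pred, target):
    """Return (Dice, sensitivity) for two binary masks given as lists of rows.

    Dice        = 2*TP / (2*TP + FP + FN)
    Sensitivity = TP / (TP + FN)
    """
    tp = fp = fn = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p and t:
                tp += 1          # nodule pixel correctly predicted
            elif p and not t:
                fp += 1          # background predicted as nodule
            elif t and not p:
                fn += 1          # nodule pixel missed
    dice_den = 2 * tp + fp + fn
    sens_den = tp + fn
    # Assumed convention: an empty denominator counts as a perfect score.
    dice = 2 * tp / dice_den if dice_den else 1.0
    sensitivity = tp / sens_den if sens_den else 1.0
    return dice, sensitivity


pred = [[1, 1, 0],
        [0, 1, 0]]
target = [[1, 0, 0],
          [1, 1, 1]]

# Here TP = 2, FP = 1, FN = 2, so Dice = 4/7 and sensitivity = 2/4.
dice, sensitivity = dice_and_sensitivity(pred, target)
```

Dice penalises both false positives and false negatives, whereas sensitivity only reflects missed nodule pixels, which is why the two values can differ on the same prediction.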
To address the challenges of spatiotemporal feature heterogeneity, insufficient utilization of frequency-band information, and weak cross-subject generalization in electroencephalogram (EEG)-based emotion recognition, this paper proposes a hierarchical spatiotemporal feature learning architecture based on state space models, named spatio-temporal Mamba (ST-Mamba). First, the proposed conv-spatio-temporal (CST) dual-branch collaborative module integrates the local feature extraction capability of convolutional neural networks (CNNs) with the global modeling ability of state space models. Through adaptive weighted fusion, it mitigates the inadequate modeling of inter-channel relationships in EEG signals. Second, the multi-band spatio-temporal feature pyramid (MBSTP) module adaptively weights features from different frequency bands via a frequency-band attention mechanism, while capturing spatial topological dependencies across brain regions through a hierarchical fusion strategy. In addition, a data augmentation framework improves the model’s cross-subject generalization by applying augmentations in the frequency, temporal, and spatial domains. The proposed model achieves average accuracies of 95.56% and 84.47% on the Shanghai Jiao Tong University emotion EEG dataset (SEED), version III (SEED-III) and version IV (SEED-IV), respectively. Experiments demonstrate that the state space model effectively alleviates the over-smoothing issue in deep networks, offering a novel solution to the spatiotemporal heterogeneity and cross-subject generalization challenges in EEG-based emotion recognition.
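The idea of adaptively weighting frequency bands can be sketched as a softmax over per-band scores followed by a weighted sum of the band feature vectors. This is only an illustration of attention-style weighting in general, not the MBSTP module itself: the function names, the hand-picked scores (which a trained model would produce from learned parameters), and the two-dimensional toy features are all assumptions.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def band_attention_fusion(band_features, band_scores):
    """Fuse per-band feature vectors using softmax attention weights.

    band_features: one feature vector (list of floats) per frequency band.
    band_scores:   one scalar relevance score per band (hypothetical here;
                   in a trained network these would be learned).
    Returns (weights, fused_vector).
    """
    weights = softmax(band_scores)
    dim = len(band_features[0])
    fused = [0.0] * dim
    for weight, features in zip(weights, band_features):
        for i in range(dim):
            fused[i] += weight * features[i]
    return weights, fused


# Two toy bands with orthogonal features; the second band scores higher.
features = [[1.0, 0.0],
            [0.0, 1.0]]
scores = [0.0, math.log(3.0)]

# The softmax of (0, ln 3) is approximately (0.25, 0.75).
weights, fused = band_attention_fusion(features, scores)
```

Because the weights sum to one, the fused vector is a convex combination of the band features, so bands with higher scores contribute proportionally more.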