Non-rigid registration plays an important role in medical image analysis. U-Net is a hot research topic in medical image analysis and is widely used in medical image registration. However, existing registration models based on U-Net and its variants lack sufficient learning ability when dealing with complex deformations and do not fully utilize multi-scale contextual information, resulting in insufficient registration accuracy. To address this issue, a non-rigid registration algorithm for X-ray images based on deformable convolution and a multi-scale feature focusing module was proposed. First, residual deformable convolution replaced the standard convolution of the original U-Net to enhance the ability of the registration network to express image geometric deformations. Then, strided convolution replaced the pooling operation in downsampling to alleviate the feature loss caused by repeated pooling. In addition, a multi-scale feature focusing module was introduced into the bridging layer of the encoder-decoder structure to improve the ability of the network model to integrate global contextual information. Theoretical analysis and experimental results both showed that the proposed registration algorithm could focus on multi-scale contextual information, handle medical images with complex deformations, and improve registration accuracy. It is suitable for non-rigid registration of chest X-ray images.
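To make the two replacement operations concrete, the following is a minimal PyTorch sketch of a residual deformable convolution block followed by strided-convolution downsampling; the module names, channel sizes, and offset parameterization are illustrative assumptions rather than the paper's exact configuration.

```python
# Residual deformable convolution block (sketch); torchvision's DeformConv2d
# samples the input at per-position offsets predicted by a plain convolution.
import torch
import torch.nn as nn
from torchvision.ops import DeformConv2d

class ResidualDeformBlock(nn.Module):
    def __init__(self, channels, k=3):
        super().__init__()
        # Two offset values (x, y) per kernel sampling point.
        self.offset = nn.Conv2d(channels, 2 * k * k, k, padding=k // 2)
        self.deform = DeformConv2d(channels, channels, k, padding=k // 2)
        self.bn = nn.BatchNorm2d(channels)
        self.act = nn.ReLU(inplace=True)

    def forward(self, x):
        out = self.bn(self.deform(x, self.offset(x)))
        return self.act(x + out)  # residual connection

down = nn.Conv2d(32, 64, 3, stride=2, padding=1)  # strided conv instead of pooling
x = torch.randn(1, 32, 64, 64)
print(down(ResidualDeformBlock(32)(x)).shape)     # torch.Size([1, 64, 32, 32])
```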
The deep learning-based automatic detection of epileptic electroencephalogram (EEG) signals, which avoids subjective human influence, has attracted much attention, and its effectiveness mainly depends on the deep neural network model. In this paper, an attention-based multi-scale residual network (AMSRN) was proposed in consideration of the multiscale, spatio-temporal characteristics of epileptic EEG and the information flow among channels, and it was combined with multiscale principal component analysis (MSPCA) to realize automatic epilepsy detection. Firstly, MSPCA was used for noise reduction and feature enhancement of the original epileptic EEG. Then, we designed the structure and parameters of AMSRN, in which the attention module (AM), multiscale convolutional module (MCM), spatio-temporal feature extraction module (STFEM) and classification module (CM) were applied successively for attention-weighted signal re-expression as well as the extraction, fusion and classification of multiscale and spatio-temporal features. On the Children's Hospital Boston-Massachusetts Institute of Technology (CHB-MIT) public dataset, the AMSRN model achieved good results in sensitivity (98.56%), F1 score (98.35%), accuracy (98.41%) and precision (98.43%). The results show that AMSRN can make good use of the brain network information flow caused by seizures to enhance the differences among channels, and can effectively capture the multiscale and spatio-temporal features of EEG to improve the performance of epilepsy detection.
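As a rough illustration of how an attention stage and a multiscale convolutional stage might be chained on raw EEG, here is a hedged PyTorch sketch; the kernel sizes, channel counts, and the squeeze-style channel attention are assumptions, not the exact AM/MCM design.

```python
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    """Re-weights EEG channels before feature extraction (AM-like stage)."""
    def __init__(self, ch, r=4):
        super().__init__()
        self.fc = nn.Sequential(nn.Linear(ch, ch // r), nn.ReLU(),
                                nn.Linear(ch // r, ch), nn.Sigmoid())

    def forward(self, x):                 # x: (batch, channels, time)
        w = self.fc(x.mean(dim=-1))       # squeeze over the time axis
        return x * w.unsqueeze(-1)

class MultiScaleConv(nn.Module):
    """Parallel 1-D convolutions with different kernels (MCM-like stage)."""
    def __init__(self, in_ch, out_ch, kernels=(3, 5, 7)):
        super().__init__()
        self.branches = nn.ModuleList(
            nn.Conv1d(in_ch, out_ch, k, padding=k // 2) for k in kernels)

    def forward(self, x):
        return torch.cat([b(x) for b in self.branches], dim=1)

eeg = torch.randn(8, 23, 1024)            # 23 CHB-MIT channels, 4 s at 256 Hz
feats = MultiScaleConv(23, 16)(ChannelAttention(23)(eeg))
print(feats.shape)                        # torch.Size([8, 48, 1024])
```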
During long-term electrocardiogram (ECG) monitoring, various types of noise inevitably become mixed with the signal, potentially hindering doctors' ability to accurately assess and interpret patient data. Therefore, evaluating the quality of ECG signals before conducting analysis and diagnosis is crucial. This paper addresses the limitations of existing ECG signal quality assessment methods, particularly their insufficient attention to multi-scale correlations among the 12 leads. We propose a novel ECG signal quality assessment method that integrates a convolutional neural network (CNN) with a squeeze-and-excitation residual network (SE-ResNet). This approach not only captures both local and global features of the ECG time series but also emphasizes the spatial correlation among ECG signals. Testing on a public dataset demonstrated that our method achieved an accuracy of 99.5%, sensitivity of 98.5%, and specificity of 99.6%. Compared with other methods, our technique significantly enhances the accuracy of ECG signal quality assessment by leveraging inter-lead correlation information, which is expected to advance the development of intelligent ECG monitoring and diagnostic technology.
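The squeeze-and-excitation residual unit at the core of an SE-ResNet can be sketched for 1-D, 12-lead input as follows; the kernel lengths and channel width are assumptions for illustration only, not the paper's reported architecture.

```python
import torch
import torch.nn as nn

class SEBlock(nn.Module):
    """Squeeze-and-excitation: global average pooling yields per-channel weights."""
    def __init__(self, ch, r=8):
        super().__init__()
        self.fc = nn.Sequential(nn.Linear(ch, ch // r), nn.ReLU(),
                                nn.Linear(ch // r, ch), nn.Sigmoid())

    def forward(self, x):                 # x: (batch, channels, time)
        return x * self.fc(x.mean(-1)).unsqueeze(-1)

class SEResBlock(nn.Module):
    def __init__(self, ch):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv1d(ch, ch, 7, padding=3), nn.BatchNorm1d(ch), nn.ReLU(),
            nn.Conv1d(ch, ch, 7, padding=3), nn.BatchNorm1d(ch))
        self.se = SEBlock(ch)

    def forward(self, x):
        return torch.relu(x + self.se(self.conv(x)))  # recalibrated residual

ecg = torch.randn(4, 12, 5000)            # 12 leads, 10 s at 500 Hz
stem = nn.Conv1d(12, 32, 15, padding=7)   # mixes leads into feature channels
print(SEResBlock(32)(stem(ecg)).shape)    # torch.Size([4, 32, 5000])
```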
To realize the quantitative assessment of muscle strength in hand function rehabilitation and thus formulate scientific and effective rehabilitation training strategies, this paper constructs a muscle strength prediction model combining a multi-scale convolutional neural network (MSCNN), a convolutional block attention module (CBAM) and a bidirectional long short-term memory network (BiLSTM) to fully explore the spatial and temporal features of the data while suppressing useless features, thereby improving the accuracy of muscle strength prediction. To verify the effectiveness of the proposed model, it is compared with traditional models such as support vector machine (SVM), random forest (RF), convolutional neural network (CNN), CNN-squeeze excitation network (CNN-SENet), MSCNN-CBAM and MSCNN-BiLSTM, and the muscle strength prediction performance of each model is investigated as the applied hand force changes from 40% to 60% of the maximum voluntary contraction (MVC) force. The results show that as the applied hand force increases, the performance of the muscle strength prediction models degrades. An ablation experiment is then used to analyze the contribution of each module to the prediction result, and it is found that the CBAM module plays a key role in the model. Therefore, the proposed model can effectively improve the accuracy of muscle strength prediction and deepen the understanding of the characteristics and laws of hand muscle activity, providing assistance for further exploring the mechanism of hand function.
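A minimal sketch of the MSCNN-CBAM-BiLSTM pipeline shape is given below; the sEMG channel count, kernel sizes, hidden sizes, and the simplified 1-D CBAM are all illustrative assumptions.

```python
import torch
import torch.nn as nn

class CBAM1d(nn.Module):
    """Channel attention followed by temporal attention (1-D CBAM variant)."""
    def __init__(self, ch, r=4):
        super().__init__()
        self.ca = nn.Sequential(nn.Linear(ch, ch // r), nn.ReLU(),
                                nn.Linear(ch // r, ch), nn.Sigmoid())
        self.sa = nn.Sequential(nn.Conv1d(2, 1, 7, padding=3), nn.Sigmoid())

    def forward(self, x):                           # (batch, ch, time)
        x = x * self.ca(x.mean(-1)).unsqueeze(-1)   # channel re-weighting
        pooled = torch.cat([x.mean(1, keepdim=True),
                            x.amax(1, keepdim=True)], dim=1)
        return x * self.sa(pooled)                  # temporal re-weighting

class MuscleForceNet(nn.Module):
    def __init__(self, in_ch=4, width=16):          # 4 sEMG channels assumed
        super().__init__()
        self.ms = nn.ModuleList(nn.Conv1d(in_ch, width, k, padding=k // 2)
                                for k in (3, 7, 15))  # multi-scale branches
        self.cbam = CBAM1d(3 * width)
        self.lstm = nn.LSTM(3 * width, 32, bidirectional=True, batch_first=True)
        self.head = nn.Linear(64, 1)                # predicted force (%MVC)

    def forward(self, x):
        x = self.cbam(torch.cat([b(x) for b in self.ms], dim=1))
        h, _ = self.lstm(x.transpose(1, 2))
        return self.head(h[:, -1])                  # regress from last step

print(MuscleForceNet()(torch.randn(2, 4, 500)).shape)  # torch.Size([2, 1])
```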
Photoplethysmography (PPG) is often affected by interference, which can lead to incorrect estimation of physiological information. Therefore, performing a quality assessment before extracting physiological information is crucial. This paper proposed a new PPG signal quality assessment method that fuses multi-class features with multi-scale series information to address the problems that traditional machine learning methods have low accuracy and deep learning methods require a large number of training samples. Multi-class features were extracted to reduce the dependence on sample size, and multi-scale series information was extracted by a multi-scale convolutional neural network and a bidirectional long short-term memory network to improve accuracy. The proposed method obtained the highest accuracy of 94.21% and showed the best performance in all of the sensitivity, specificity, precision, and F1-score metrics when compared with 6 quality assessment methods on 14 700 samples from 7 experiments. This paper provides a new method for quality assessment and quality information mining of PPG signals with small sample sizes, which is expected to be used for accurate extraction and monitoring of clinical and daily PPG physiological information.
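The fusion idea, concatenating a handcrafted feature vector with deep multi-scale series features before classification, can be sketched as follows; the number of handcrafted features, kernel sizes, and signal length are placeholder assumptions.

```python
import torch
import torch.nn as nn

class PPGQualityNet(nn.Module):
    """Fuses handcrafted features with multi-scale CNN + BiLSTM features."""
    def __init__(self, n_handcrafted=12, width=8):
        super().__init__()
        self.ms = nn.ModuleList(nn.Conv1d(1, width, k, padding=k // 2)
                                for k in (5, 11, 21))   # multi-scale branches
        self.lstm = nn.LSTM(3 * width, 16, bidirectional=True, batch_first=True)
        self.cls = nn.Linear(32 + n_handcrafted, 2)     # good / bad quality

    def forward(self, sig, feats):        # sig: (B, 1, T), feats: (B, 12)
        x = torch.cat([b(sig) for b in self.ms], dim=1)
        h, _ = self.lstm(x.transpose(1, 2))
        # Late fusion: deep series summary + handcrafted feature vector.
        return self.cls(torch.cat([h[:, -1], feats], dim=1))

logits = PPGQualityNet()(torch.randn(4, 1, 750), torch.randn(4, 12))
print(logits.shape)                       # torch.Size([4, 2])
```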
Medical studies have found that tumor mutation burden (TMB) is positively correlated with the efficacy of immunotherapy for non-small cell lung cancer (NSCLC), and the TMB value can be used to predict the efficacy of targeted therapy and chemotherapy. However, the calculation of the TMB value mainly depends on whole exome sequencing (WES), which is usually time-consuming and expensive. To deal with this problem, this paper studies the correlation between TMB and slice images by taking advantage of the digital pathological slices commonly used in clinic, and then predicts the patient's TMB level accordingly. This paper proposes a deep learning model (RCA-MSAG) based on a residual coordinate attention (RCA) structure combined with a multi-scale attention guidance (MSAG) module. The model takes ResNet-50 as the base model and integrates coordinate attention (CA) into the bottleneck module to capture direction-aware and position-sensitive information, which enables the model to locate and identify the positions of interest more accurately. The MSAG module is then embedded into the network, which enables the model to extract the deep features of lung cancer pathological sections and the interactive information between channels. The Cancer Genome Atlas (TCGA) open dataset is adopted in the experiments, consisting of 200 pathological sections of lung adenocarcinoma, including 80 samples with high TMB values, 77 with medium TMB values and 43 with low TMB values. Experimental results demonstrate that the accuracy, precision, recall and F1 score of the proposed model are 96.2%, 96.4%, 96.2% and 96.3%, respectively, which are superior to existing mainstream deep learning models. The proposed model can promote clinical auxiliary diagnosis and has certain theoretical guiding significance for TMB prediction.
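Coordinate attention, the CA unit integrated into the bottleneck here, factorizes global pooling into two directional pools so that the attention weights retain positional information. A hedged sketch in the spirit of the published coordinate attention design follows; the reduction ratio and channel sizes are assumptions.

```python
import torch
import torch.nn as nn

class CoordAttention(nn.Module):
    """Coordinate attention: pooling along height and width separately keeps
    direction-aware, position-sensitive information (sketch, not RCA-MSAG)."""
    def __init__(self, ch, r=8):
        super().__init__()
        mid = max(ch // r, 8)
        self.shared = nn.Sequential(nn.Conv2d(ch, mid, 1),
                                    nn.BatchNorm2d(mid), nn.ReLU())
        self.to_h = nn.Conv2d(mid, ch, 1)
        self.to_w = nn.Conv2d(mid, ch, 1)

    def forward(self, x):                            # (B, C, H, W)
        b, c, h, w = x.shape
        ph = x.mean(dim=3, keepdim=True)              # (B, C, H, 1): pool over W
        pw = x.mean(dim=2, keepdim=True).permute(0, 1, 3, 2)  # (B, C, W, 1)
        y = self.shared(torch.cat([ph, pw], dim=2))   # joint encoding
        yh, yw = y.split([h, w], dim=2)
        ah = torch.sigmoid(self.to_h(yh))                      # (B, C, H, 1)
        aw = torch.sigmoid(self.to_w(yw)).permute(0, 1, 3, 2)  # (B, C, 1, W)
        return x * ah * aw

print(CoordAttention(64)(torch.randn(1, 64, 56, 56)).shape)  # (1, 64, 56, 56)
```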
To address issues such as the loss of detailed information, blurred target boundaries, and unclear structural hierarchy in medical image fusion, this paper proposes an adaptive feature medical image fusion network based on a full-scale diffusion model. First, a region-level feature map is generated using a kernel-based saliency map to enhance local features and boundary details. Then, a full-scale diffusion feature extraction network is employed for global feature extraction, alongside a multi-scale denoising U-shaped network designed to fully capture cross-layer information. A multi-scale feature integration module is introduced to reinforce the texture details and structural information extracted by the encoder. Finally, an adaptive fusion scheme is applied to progressively fuse region-level features, global features, and source images layer by layer, enhancing the preservation of detail information. To validate the effectiveness of the proposed method, the model is evaluated on the publicly available Harvard dataset and an abdominal dataset. Compared with nine other representative image fusion methods, the proposed approach achieved improvements across seven evaluation metrics. The results demonstrate that the proposed method effectively extracts both global and local features of medical images, enhances texture details and target boundary clarity, and generates fusion images with high contrast and rich information, providing more reliable support for subsequent clinical diagnosis.
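One way to realize the adaptive fusion of the three streams is a learned per-pixel soft weighting; the sketch below is an assumption about the general mechanism, not the paper's specific scheme, and the module name and channel count are hypothetical.

```python
import torch
import torch.nn as nn

class AdaptiveFusion(nn.Module):
    """Learns per-pixel softmax weights to blend region-level features,
    global features, and source-image features at one decoder layer."""
    def __init__(self, ch):
        super().__init__()
        self.weight = nn.Sequential(nn.Conv2d(3 * ch, 3, 3, padding=1),
                                    nn.Softmax(dim=1))  # 3 maps sum to 1

    def forward(self, region, global_feat, source):
        w = self.weight(torch.cat([region, global_feat, source], dim=1))
        return (w[:, 0:1] * region + w[:, 1:2] * global_feat
                + w[:, 2:3] * source)

fuse = AdaptiveFusion(16)
out = fuse(torch.randn(1, 16, 128, 128), torch.randn(1, 16, 128, 128),
           torch.randn(1, 16, 128, 128))
print(out.shape)   # torch.Size([1, 16, 128, 128])
```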
Sleep stage classification is essential for clinical disease diagnosis and sleep quality assessment. Most existing methods for sleep stage classification are based on single-channel or single-modal signals and extract features using a single-branch deep convolutional network, which not only hinders the capture of diverse sleep-related features and increases the computational cost, but also affects the accuracy of sleep stage classification. To solve this problem, this paper proposes an end-to-end multi-modal physiological time-frequency feature extraction network (MTFF-Net) for accurate sleep stage classification. First, multi-modal physiological signals containing electroencephalogram (EEG), electrocardiogram (ECG), electrooculogram (EOG) and electromyogram (EMG) are converted into two-dimensional time-frequency images using the short-time Fourier transform (STFT). Then, a time-frequency feature extraction network combining a multi-scale EEG compact convolution network (Ms-EEGNet) and a bidirectional gated recurrent unit (Bi-GRU) network is used to obtain multi-scale spectral features related to sleep feature waveforms and time-series features related to sleep stage transitions. Following the American Academy of Sleep Medicine (AASM) sleep stage classification criterion, the model achieved 84.3% accuracy in the five-class task on the third subgroup of the Institute of Systems and Robotics of the University of Coimbra Sleep Dataset (ISRUC-S3), with an 83.1% macro F1 score and a 79.8% Cohen's Kappa coefficient. The experimental results show that the proposed model achieves higher classification accuracy and promotes the application of deep learning algorithms in assisting clinical decision-making.
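The STFT conversion step can be illustrated directly with torch.stft; the sampling rate, window length, and hop size below are assumptions chosen only to show the signal-to-image transformation.

```python
import torch

def to_tf_image(sig, n_fft=256, hop=32):
    """Convert a 1-D physiological signal into a log-magnitude
    time-frequency image via the short-time Fourier transform."""
    spec = torch.stft(sig, n_fft=n_fft, hop_length=hop,
                      window=torch.hann_window(n_fft), return_complex=True)
    return torch.log1p(spec.abs())        # (freq_bins, frames)

# One 30 s epoch per modality (e.g., sampled at 200 Hz), stacked as channels.
epoch = {m: torch.randn(6000) for m in ("EEG", "ECG", "EOG", "EMG")}
image = torch.stack([to_tf_image(s) for s in epoch.values()])
print(image.shape)                        # torch.Size([4, 129, 188])
```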
Convolutional neural networks (CNNs) are renowned for their excellent representation learning capabilities and have become a mainstream model for motor imagery-based electroencephalogram (MI-EEG) signal classification. However, MI-EEG exhibits strong inter-individual variability, which may lead to a decline in classification performance. To address this issue, this paper proposes a classification model based on a dynamic multi-scale CNN and multi-head temporal attention (DMSCMHTA). The model first applies multi-band filtering to the raw MI-EEG signals and feeds the results into the feature extraction module. It then uses a dynamic multi-scale CNN to capture temporal features while adjusting attention weights, followed by spatial convolution to extract spatiotemporal feature sequences. Next, the model further refines temporal correlations through temporal dimensionality reduction and a multi-head attention mechanism to generate more discriminative features. Finally, MI classification is completed under the joint supervision of cross-entropy loss and center loss. Experiments show that the proposed model achieves average accuracies of 80.32% and 90.81% on BCI Competition IV datasets 2a and 2b, respectively. The results indicate that DMSCMHTA can adaptively extract personalized spatiotemporal features and outperforms current mainstream methods.
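The multi-head temporal attention stage can be expressed with PyTorch's built-in multi-head attention applied as self-attention over the reduced temporal feature sequence; the batch size, sequence length, feature dimension, and head count below are illustrative assumptions.

```python
import torch
import torch.nn as nn

# Multi-head self-attention over the spatiotemporal feature sequence:
# each time step attends to every other step, refining temporal correlations.
d_model, heads = 64, 4
attn = nn.MultiheadAttention(d_model, heads, batch_first=True)
seq = torch.randn(8, 20, d_model)     # (batch, reduced time steps, features)
out, weights = attn(seq, seq, seq)    # self-attention: query = key = value
print(out.shape, weights.shape)       # (8, 20, 64) (8, 20, 20)
```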
Glioma is a primary brain tumor with a high incidence rate. High-grade gliomas (HGG) are the most malignant gliomas and carry the lowest survival rates. Surgical resection and postoperative adjuvant chemoradiotherapy are often used in clinical treatment, so accurate segmentation of tumor-related areas is of great significance for patient treatment. To improve the segmentation accuracy of HGG, this paper proposes a multi-modal glioma semantic segmentation network with multi-scale feature extraction and a multi-attention fusion mechanism. The main contributions are as follows: (1) multi-scale residual structures were used to extract features from multi-modal glioma magnetic resonance imaging (MRI); (2) two types of attention modules were used to aggregate features in the channel and spatial dimensions; (3) to improve the segmentation performance of the whole network, a branch classifier was constructed using an ensemble learning strategy to adjust and correct the classification results of the backbone classifier. The experimental results showed that the Dice coefficients of the proposed segmentation method were 0.9097, 0.8773 and 0.8396 for the whole tumor, tumor core and enhancing tumor, respectively, and the segmentation results had good boundary continuity in the three-dimensional direction. Therefore, the proposed semantic segmentation network has good segmentation performance for high-grade glioma lesions.
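The two attention types in contribution (2) can be sketched for 3-D MRI features as a channel stage followed by a spatial stage; this is a generic dual-attention sketch under assumed channel counts and volume sizes, not the paper's exact modules.

```python
import torch
import torch.nn as nn

class DualAttention3d(nn.Module):
    """Channel attention, then spatial attention, on 3-D feature volumes."""
    def __init__(self, ch, r=4):
        super().__init__()
        self.ca = nn.Sequential(nn.Linear(ch, ch // r), nn.ReLU(),
                                nn.Linear(ch // r, ch), nn.Sigmoid())
        self.sa = nn.Sequential(nn.Conv3d(2, 1, 7, padding=3), nn.Sigmoid())

    def forward(self, x):                          # (B, C, D, H, W)
        w = self.ca(x.mean(dim=(2, 3, 4)))         # per-channel weights
        x = x * w[:, :, None, None, None]
        pooled = torch.cat([x.mean(1, keepdim=True),
                            x.amax(1, keepdim=True)], dim=1)
        return x * self.sa(pooled)                 # per-voxel weights

x = torch.randn(1, 16, 32, 32, 32)   # features from 4 MRI modalities (assumed)
print(DualAttention3d(16)(x).shape)  # torch.Size([1, 16, 32, 32, 32])
```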