Colorectal cancer (CRC) is a common malignant tumor that seriously threatens human health, and its indistinct boundaries make accurate identification particularly challenging. With the widespread adoption of convolutional neural networks (CNNs) in image processing, using CNNs for automatic classification and segmentation holds great potential for improving the efficiency of CRC recognition and reducing treatment costs. This paper examines the need for CNNs in the clinical diagnosis of CRC. It provides a detailed overview of research advances in CNNs and their improved variants for CRC classification and segmentation. Furthermore, this work summarizes common ideas and methods for optimizing network performance, and discusses the challenges CNNs face and future development trends in CRC classification and segmentation, with the aim of promoting their use in clinical diagnosis.
High-resolution (HR) magnetic resonance imaging (MRI) or computed tomography (CT) images provide clearer anatomical details of the human body, which facilitates early diagnosis of disease. However, owing to limitations of the imaging system, the imaging environment, and human factors, clear high-resolution images are difficult to obtain. In this paper, we propose a novel medical image super-resolution (SR) reconstruction method based on a multi-scale information distillation (MSID) network in the non-subsampled shearlet transform (NSST) domain, termed the NSST-MSID network. We first propose an MSID network, consisting mainly of a series of stacked MSID blocks, to fully exploit image features and effectively restore low-resolution (LR) images to HR images. Moreover, most previous methods predict HR images in the spatial domain, producing over-smoothed outputs that lose texture details. We therefore cast the medical image SR task as the prediction of NSST coefficients, which enables the MSID network to preserve richer structural details than prediction in the spatial domain. Finally, experimental results on our constructed medical image datasets demonstrate that the proposed method achieves better peak signal-to-noise ratio (PSNR), structural similarity (SSIM), and root mean square error (RMSE) values than other state-of-the-art methods, and better preserves global topological structure and local texture detail, thereby achieving good medical image reconstruction.
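As a rough illustration of two of the reported evaluation metrics, the following minimal Python sketch computes RMSE and PSNR for 8-bit images; SSIM is more involved, and a standard implementation is available as `skimage.metrics.structural_similarity`. The function names and the 8-bit (max value 255) assumption are ours, not from the paper.

```python
import numpy as np

def rmse(ref, img):
    """Root mean square error between a reference HR image and a reconstruction."""
    ref = ref.astype(np.float64)
    img = img.astype(np.float64)
    return np.sqrt(np.mean((ref - img) ** 2))

def psnr(ref, img, max_val=255.0):
    """Peak signal-to-noise ratio: 20*log10(max_val / RMSE), in dB."""
    e = rmse(ref, img)
    if e == 0:
        return float("inf")  # identical images
    return 20.0 * np.log10(max_val / e)
```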
To address the loss of single-scale information and the large number of model parameters during sampling in U-Net and its variants for medical image segmentation, this paper proposes a multi-scale medical image segmentation method based on pixel encoding and spatial attention. First, by redesigning the input strategy of the Transformer structure, a pixel encoding module is introduced so that the model can extract global semantic information from multi-scale image features and obtain richer feature information; deformable convolutions are also incorporated into the Transformer module to accelerate convergence and improve performance. Second, a spatial attention module with residual connections is introduced so that the model focuses on the foreground information of the fused feature maps. Finally, guided by ablation experiments, the network is made lightweight to improve segmentation accuracy and speed up convergence. The proposed algorithm achieves satisfactory results on the Synapse dataset, an official public multi-organ segmentation dataset provided by the International Conference on Medical Image Computing and Computer Assisted Intervention (MICCAI), with a Dice similarity coefficient (DSC) of 77.65 and a 95% Hausdorff distance (HD95) of 18.34. The experimental results demonstrate that the proposed algorithm improves multi-organ segmentation performance, helps fill the gap in multi-scale medical image segmentation algorithms, and can assist professional physicians in diagnosis.
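The abstract does not specify the exact form of the spatial attention module with residual connections; one common formulation is the CBAM-style spatial attention below, shown as a minimal PyTorch sketch under that assumption (the class name and kernel size are ours).

```python
import torch
import torch.nn as nn

class ResidualSpatialAttention(nn.Module):
    """Spatial attention with a residual connection (illustrative sketch).

    Pools channel-wise statistics, predicts a per-pixel attention map,
    and reweights the input features; the residual branch preserves the
    original features so the attention only has to learn a correction.
    """
    def __init__(self, kernel_size=7):
        super().__init__()
        self.conv = nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2)

    def forward(self, x):
        avg_pool = x.mean(dim=1, keepdim=True)        # (B, 1, H, W)
        max_pool = x.max(dim=1, keepdim=True).values  # (B, 1, H, W)
        attn = torch.sigmoid(self.conv(torch.cat([avg_pool, max_pool], dim=1)))
        return x + x * attn  # residual connection keeps foreground emphasis non-destructive
```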
In computer-aided medical diagnosis, labeled medical image data are expensive to obtain, while the demand for model interpretability is high; yet most current deep learning models require large amounts of data and lack interpretability. To address these challenges, this paper proposes a novel data augmentation method for medical image segmentation. Its uniqueness and advantage lie in using gradient-weighted class activation mapping (Grad-CAM) to extract data-efficient features, which are then fused with the original image. A new channel weight feature extractor is then constructed to learn the weights between different channels. This approach achieves non-destructive data augmentation, improving the model's performance, data efficiency, and interpretability. When applied to the Hyper-Kvasir dataset, the method improved both the intersection over union (IoU) and Dice of U-Net; on the ISIC-Archive dataset, it likewise improved the IoU and Dice of DeepLabV3+. Furthermore, even when the training data are reduced to 70%, the proposed method still achieves 95% of the performance obtained with the entire dataset, indicating good data efficiency. Moreover, the data-efficient features used in the method carry built-in interpretable information, which enhances model interpretability. The method is highly general and plug-and-play: it applies to various segmentation methods without modifying the network structure, so it is easy to integrate into existing medical image segmentation pipelines, facilitating future research and applications.
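The paper's exact channel weight feature extractor is not detailed in the abstract; a plausible minimal form is squeeze-and-excitation style channel reweighting applied to the fused (image + Grad-CAM) input, sketched below under that assumption (class name and reduction ratio are ours).

```python
import torch
import torch.nn as nn

class ChannelWeightExtractor(nn.Module):
    """SE-style channel weighting (illustrative sketch).

    Learns one weight per channel from globally pooled statistics and
    rescales the channels of the fused image/Grad-CAM feature tensor.
    """
    def __init__(self, channels, reduction=4):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x):
        b, c, _, _ = x.shape
        w = self.fc(x.mean(dim=(2, 3)))  # global average pool -> (B, C)
        return x * w.view(b, c, 1, 1)    # per-channel reweighting
```

Because such a module only rescales channels, it can be prepended to an existing U-Net or DeepLabV3+ without changing the backbone, which is consistent with the plug-and-play claim.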
Computer-aided diagnosis (CAD) systems play a very important role in modern medical diagnosis and treatment, but their performance is limited by the training samples. Training samples are constrained by factors such as imaging cost, labeling cost, and patient privacy, resulting in insufficient diversity of training images and difficulty in obtaining data. How to augment existing medical image datasets efficiently and cost-effectively has therefore become a research hotspot. This paper reviews research progress on medical image dataset expansion methods based on the domestic and international literature. First, expansion methods based on geometric transformations and on generative adversarial networks (GANs) are compared and analyzed; improvements to GAN-based augmentation methods are then highlighted. Finally, pressing problems in the field of medical image dataset expansion are discussed and future development trends are considered.
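For the geometric-transformation family the review compares, a typical pipeline looks like the torchvision sketch below; the specific transforms and parameter values are illustrative choices of ours, not prescriptions from the review.

```python
import torchvision.transforms as T

# Common geometric augmentations for medical images (illustrative values;
# ranges should be kept small enough to preserve anatomical plausibility).
geometric_aug = T.Compose([
    T.RandomHorizontalFlip(p=0.5),
    T.RandomRotation(degrees=10),
    T.RandomAffine(degrees=0, translate=(0.05, 0.05), scale=(0.9, 1.1)),
])
```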
This article combines deep learning with image analysis technology and proposes an effective classification method for distal radius fracture types. First, an extended U-Net three-layer cascaded segmentation network was used to accurately segment the joint surface and non-joint surface regions that are most important for identifying fractures. The joint surface and non-joint surface images were then trained and classified separately to distinguish fractures. Finally, the normal or type A/B/C fracture classification was determined comprehensively from the classification results of the two images. The accuracy rates for normal, type A, type B, and type C fractures on the test set were 0.99, 0.92, 0.91, and 0.82, respectively, while the average recognition accuracy rates of orthopedic medical experts were 0.98, 0.90, 0.87, and 0.81, respectively. The proposed automatic recognition method is thus generally better than the experts and can be used for preliminary auxiliary diagnosis of distal radius fractures in scenarios without expert participation.
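The abstract does not state how the two per-region classification results are combined; one simple, hypothetical fusion rule is a weighted average of the two classifiers' class probabilities, sketched below (function name, weight, and class ordering are ours).

```python
import numpy as np

def fuse_predictions(p_joint, p_nonjoint, w=0.5):
    """Fuse class probabilities from the joint-surface and non-joint-surface
    classifiers by weighted averaging (a hypothetical rule; the paper's exact
    decision logic is not specified in the abstract).

    p_joint, p_nonjoint: arrays of shape (4,) over [normal, A, B, C].
    """
    p = w * np.asarray(p_joint) + (1 - w) * np.asarray(p_nonjoint)
    return ["normal", "A", "B", "C"][int(np.argmax(p))]
```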
To address missing important features, inconspicuous details, and unclear textures in multimodal medical image fusion, this paper proposes a method for fusing computed tomography (CT) and magnetic resonance imaging (MRI) images using a generative adversarial network (GAN) and a convolutional neural network (CNN) under image enhancement. The generator targets the high-frequency feature images, and dual discriminators target the fused images after the inverse transform; the high-frequency feature images are then fused by the trained GAN model, while the low-frequency feature images are fused by a CNN pre-trained model based on transfer learning. Experimental results show that, compared with current advanced fusion algorithms, the proposed method yields richer texture details and clearer contour edge information in subjective evaluation. In the objective evaluation, the edge-based fusion metric QAB/F, information entropy (IE), spatial frequency (SF), structural similarity (SSIM), mutual information (MI), and visual information fidelity for fusion (VIFF) were 2.0%, 6.3%, 7.0%, 5.5%, 9.0%, and 3.3% higher than the best comparison results, respectively. The fused images can be effectively applied to medical diagnosis to further improve diagnostic efficiency.
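Two of the objective metrics above have compact standard definitions; the sketch below computes information entropy and spatial frequency for an 8-bit grayscale image (function names are ours, the formulas are the conventional ones).

```python
import numpy as np

def information_entropy(img):
    """Shannon entropy (bits) of an 8-bit image's gray-level histogram."""
    hist, _ = np.histogram(img, bins=256, range=(0, 256))
    p = hist / hist.sum()
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

def spatial_frequency(img):
    """Spatial frequency: sqrt(RF^2 + CF^2) from row/column differences."""
    img = img.astype(np.float64)
    rf = np.sqrt(np.mean(np.diff(img, axis=1) ** 2))  # row frequency
    cf = np.sqrt(np.mean(np.diff(img, axis=0) ** 2))  # column frequency
    return np.sqrt(rf ** 2 + cf ** 2)
```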
To address the challenges of current brain midline segmentation techniques, such as insufficient accuracy and poor segmentation continuity, this paper proposes a deep learning network model based on a two-stage framework. In the first stage, the model exploits prior knowledge that adjacent brain midline slices have consistent features under both normal and pathological conditions: associated midline slices are selected through slice similarity analysis, and a novel feature weighting strategy collaboratively fuses the overall change characteristics and spatial information of these associated slices, thereby enhancing the feature representation of the brain midline in the intracranial region. In the second stage, an optimal path search strategy for the brain midline is employed based on the network's output probability map, which effectively addresses the problem of discontinuous midline segmentation. The proposed method achieved satisfactory results on the CQ500 dataset provided by the Centre for Advanced Research in Imaging, Neurosciences and Genomics, New Delhi, India: the Dice similarity coefficient (DSC), Hausdorff distance (HD), average symmetric surface distance (ASSD), and normalized surface Dice (NSD) were 67.38 ± 10.49, 24.22 ± 24.84, 1.33 ± 1.83, and 0.82 ± 0.09, respectively. The experimental results demonstrate that the proposed method can make full use of prior knowledge in medical images to achieve accurate segmentation of the brain midline, providing valuable assistance for subsequent identification of the brain midline by clinicians.
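An optimal path search over a probability map is commonly implemented as seam-carving-style dynamic programming: pick one column per image row so that the summed probability is maximal while adjacent rows stay within a small column offset, which enforces continuity by construction. The sketch below illustrates that idea and is our assumption, not the paper's exact search strategy.

```python
import numpy as np

def optimal_midline_path(prob, step=1):
    """Extract a continuous midline as a maximum-probability vertical path.

    prob: (H, W) network probability map. One column is chosen per row;
    adjacent rows may differ by at most `step` columns (continuity).
    Returns a list of column indices, top row to bottom row.
    """
    H, W = prob.shape
    score = np.full((H, W), -np.inf)
    parent = np.zeros((H, W), dtype=int)
    score[0] = prob[0]
    for i in range(1, H):
        for j in range(W):
            lo, hi = max(0, j - step), min(W, j + step + 1)
            k = lo + int(np.argmax(score[i - 1, lo:hi]))  # best predecessor
            score[i, j] = prob[i, j] + score[i - 1, k]
            parent[i, j] = k
    # Backtrack from the best final-row column.
    path = [int(np.argmax(score[-1]))]
    for i in range(H - 1, 0, -1):
        path.append(parent[i, path[-1]])
    return path[::-1]
```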
To address the low accuracy and large boundary-distance discrepancy in anterior cruciate ligament (ACL) image segmentation of the knee joint, this paper proposes an ACL segmentation model that fuses dilated convolutions with a residual hybrid attention U-shaped network (DRH-UNet). The model builds upon the U-shaped network (U-Net) by incorporating dilated convolutions to expand the receptive field, enabling a better understanding of contextual relationships within the image. In addition, a residual hybrid attention block is designed in the skip connections to enhance the expression of critical features in key regions and to reduce the semantic gap, thereby improving the representation capability for the ACL area. This study constructs an enhanced annotated ACL dataset based on the publicly available Magnetic Resonance Imaging Network (MRNet) dataset. Validated on this dataset, the DRH-UNet model achieves a Dice similarity coefficient (DSC) of (88.01 ± 1.57)% and a Hausdorff distance (HD) of 5.16 ± 0.85, outperforming other ACL segmentation methods. The proposed approach further improves ACL segmentation accuracy, providing valuable assistance for subsequent clinical diagnosis by physicians.
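To make the receptive-field claim concrete: stacking 3×3 convolutions with dilation rates 1, 2, and 4 grows the receptive field to 15×15 pixels without any pooling. The block below is a minimal sketch of that pattern (class name, rates, and normalization choice are ours, not the paper's exact design).

```python
import torch
import torch.nn as nn

class DilatedConvBlock(nn.Module):
    """Stacked 3x3 dilated convolutions (illustrative sketch).

    Dilation rates 1, 2, 4 expand the receptive field to 15x15 while
    keeping spatial resolution, capturing more context around the ACL.
    """
    def __init__(self, channels):
        super().__init__()
        self.body = nn.Sequential(*[
            nn.Sequential(
                nn.Conv2d(channels, channels, 3, padding=d, dilation=d),
                nn.BatchNorm2d(channels),
                nn.ReLU(inplace=True),
            )
            for d in (1, 2, 4)
        ])

    def forward(self, x):
        return self.body(x)
```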
Retinopathy of prematurity (ROP) is a major cause of vision loss and blindness among premature infants. Timely screening, diagnosis, and intervention can effectively prevent the deterioration of ROP. However, ROP diagnosis faces several challenges globally, including high subjectivity, low screening efficiency, regional disparities in screening coverage, and a severe shortage of pediatric ophthalmologists. Applying artificial intelligence (AI), whether as an assistive tool or as a fully automated method for ROP diagnosis, can improve the efficiency and objectivity of diagnosis, expand screening coverage, and enable automated screening with quantified diagnostic results. In a global environment that emphasizes the development and application of medical imaging AI, developing more accurate diagnostic networks, exploring more effective AI-assisted diagnosis methods, and enhancing the interpretability of AI-assisted diagnosis can accelerate the maturation of ROP-related AI policies and the deployment of AI products, thereby advancing ROP diagnosis and treatment.