Sahu, Adyasha (2024) Development of Efficient Machine Learning and Deep Learning-based Techniques for Detection of Breast Cancer with Small Datasets. PhD thesis.
PDF (Restricted up to 08/09/2027), restricted to Repository staff only, 25 MB
Abstract
Breast cancer is a serious health issue among women across the globe, affecting millions of lives every year. It is broadly defined as the irregular and unusual growth of breast tissue that results in the formation of a tumor-like mass. The growth rate of the tumor determines whether it is benign or malignant. Mammography and ultrasound are the most frequently used imaging modalities for breast cancer screening, since both are cost-effective, less harmful, and easily available. Early and accurate detection of the disease plays a vital role in providing a proper treatment plan and saving lives, which highlights the need for more accurate techniques for timely detection. In response, machine learning and deep learning have led to substantial innovation in breast cancer detection, and Computer-Aided Diagnosis (CAD)-based automatic breast cancer diagnosis has emerged as a critical research field in medical image analysis. This dissertation therefore focuses on developing efficient, high-performing frameworks for identifying breast cancer from mammography and ultrasound images of the breast. Currently, breast cancer diagnosis faces challenges in achieving high accuracy and efficiency, particularly when dealing with different datasets and limited data. Conventional approaches may struggle to extract specific information from medical imaging data, which limits early detection and classification performance. Computational complexity also remains an issue. There is therefore a strong need to develop and optimize machine learning and deep learning-based approaches that address these problems while improving accuracy, adaptability to varied datasets, and computational efficiency.
To this end, eight breast cancer detection schemes for mammogram and ultrasound breast images are proposed: one machine learning-based scheme, five deep learning-based schemes, and two schemes that combine sparse learning with transfer learning. In Chapter 3, a machine learning-based hybrid framework for automatic breast cancer detection is presented that is capable of recognizing abnormalities and malignancy in both mammographic and ultrasonic datasets. The method combines several techniques to improve the accuracy and efficiency of detection. Pre-processing begins with the Laplacian of Gaussian-based modified high boosting filter (LoGMHBF), which improves image visibility. For feature extraction, Local Binary Pattern (LBP) and Discrete Wavelet Transform (DWT) features capture texture and frequency information from the pre-processed images. Principal Component Analysis (PCA) then reduces dimensionality while preserving the important variance in the high-dimensional feature space, which improves classification performance and computational efficiency. The classification stage employs a hybrid model that combines a Support Vector Machine (SVM) and a Random Forest (RF), optimized using a probability-based weight factor and a threshold value. This complete approach aims at both computational efficiency and diagnostic accuracy. In contrast to deep learning, which uses end-to-end networks, the overall performance of this system depends significantly on the outcome of each stage. Chapter 4 presents five deep learning-based strategies for early and accurate detection of breast cancer.
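As an illustration of the LBP texture descriptor named in the Chapter 3 pipeline, the following is a minimal sketch of the basic 3x3 operator. It is not the thesis implementation: the function names `lbp_code` and `lbp_histogram`, the clockwise neighbour ordering, and the `>=` comparison are assumptions chosen for the example, and the abstract does not state which LBP variant is used.

```python
# Minimal sketch of the basic 3x3 Local Binary Pattern (LBP) operator.
# Helper names, neighbour ordering, and the >= rule are illustrative assumptions.

def lbp_code(patch):
    """Return the 8-bit LBP code of the centre pixel of a 3x3 patch."""
    center = patch[1][1]
    # Neighbours visited clockwise, starting at the top-left corner.
    offsets = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]
    code = 0
    for bit, (r, c) in enumerate(offsets):
        # A neighbour at least as bright as the centre sets its bit.
        if patch[r][c] >= center:
            code |= 1 << bit
    return code


def lbp_histogram(image):
    """Return a 256-bin histogram of LBP codes over the interior pixels."""
    hist = [0] * 256
    for r in range(1, len(image) - 1):
        for c in range(1, len(image[0]) - 1):
            patch = [row[c - 1:c + 2] for row in image[r - 1:r + 2]]
            hist[lbp_code(patch)] += 1
    return hist


patch = [[6, 5, 2],
         [7, 6, 1],
         [9, 8, 7]]
print(lbp_code(patch))  # 241
```

The histogram of codes, rather than the raw codes, is what would typically be passed on as a texture feature vector before a dimensionality reduction step such as PCA.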
Deep learning, especially transfer learning, is now widely recognized as a feasible way to address the difficulty of limited datasets in the medical domain, thereby lowering training errors. The first technique in this chapter is an EfficientNetB0-based breast cancer detection scheme that performs well even with limited datasets. Its adaptive scaling of depth, width, and resolution achieves efficient detection by balancing classification accuracy and computational cost. Using LoGMHBF in pre-processing improves performance further. The second strategy proposes a hybrid ShuffleNet-Random Forest framework to improve detection performance. It combines ShuffleNet, a lightweight deep CNN that extracts significant deep features, with a Random Forest classifier to take advantage of the strengths of both. Channel shuffling and group convolution speed up processing while maintaining discriminative features, and RF replaces the softmax layer to boost classification performance further. The third strategy is a deep ensemble breast cancer detection system that combines three popular transfer learning networks: AlexNet, ResNet, and MobileNetV2. The ensemble aims for better diagnostic performance by exploiting the distinctive features of each architecture. AlexNet contributes Rectified Linear Unit (ReLU) activation functions and overlapping pooling, which improve the network’s ability to recognize complex spatial patterns and nonlinear correlations in the breast image. To address the vanishing gradient problem, ResNet incorporates skip connections that enhance the model’s ability to extract hierarchical features and facilitate the training of deeper networks. Incorporating residual learning also results in a faster system.
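The channel shuffling mentioned above for ShuffleNet can be sketched on a plain list of channel labels. This is an illustrative sketch only: the function name `channel_shuffle` is hypothetical, and a real network would apply the same reshape, transpose, and flatten steps to the channel axis of a feature tensor.

```python
# Minimal sketch of ShuffleNet-style channel shuffling on a list of channel labels.
# The function name is a hypothetical helper, not taken from the thesis.

def channel_shuffle(channels, groups):
    """Interleave channels so each group receives channels from every other group."""
    n = len(channels)
    if n % groups != 0:
        raise ValueError("number of channels must be divisible by groups")
    per_group = n // groups
    # Step 1: view the flat channel list as a (groups x per_group) grid.
    grid = [channels[g * per_group:(g + 1) * per_group] for g in range(groups)]
    # Step 2: transpose the grid and flatten it back to a single list.
    return [grid[g][i] for i in range(per_group) for g in range(groups)]


shuffled = channel_shuffle(["a0", "a1", "a2", "b0", "b1", "b2"], groups=2)
print(shuffled)  # ['a0', 'b0', 'a1', 'b1', 'a2', 'b2']
```

After the shuffle, the next group convolution sees channels that originated in different groups, which is how information crosses group boundaries at no extra parameter cost.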
Furthermore, MobileNetV2 improves computational efficiency without compromising accuracy by introducing depth-wise separable convolutions and an inverted residual bottleneck structure. The deep ensemble achieves strong results by using the strengths of these three networks. Although the deep ensemble classifier gives good results, the fourth strategy boosts computational efficiency further with a deep hybrid framework that combines ShuffleNet and ResNet18. A probability-based weight factor (w2) and a threshold value improve performance considerably. The experimentally selected optimum threshold makes the system faster and more accurate, because the second classifier is invoked only when the weight factor w2 exceeds the threshold. The fifth strategy introduces a modified Relation and Margin Network (MReMarNet) to identify breast cancer more efficiently. This model focuses on increasing intraclass compactness and interclass separability, which is very useful for small-sample datasets. The relation unit loss, computed by a relation unit (RU), and the cross-entropy loss, computed by a fully connected (FC) unit, support feature learning and decision boundary-based classification, respectively. These coupled benefits make the system more efficient. Although all five schemes in Chapter 4 perform notably well even with small data, two sparse-based deep learning schemes are presented in Chapter 5 to reduce computational time. The first provides three alternate deep layer cascade breast cancer detection models (ADLCBCDMs): ADLCBCDM-1, ADLCBCDM-2, and ADLCBCDM-3. In ADLCBCDM-1, projection to discriminative subclasses (PDS) is used as pre-processing, whereas in ADLCBCDM-2, the Laplacian of Gaussian-based modified high boosting filter (LoGMHBF) is applied along with PDS before the alternate deep layer cascade model (ADLCM) to boost system performance.
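The efficiency gain attributed above to MobileNetV2's depth-wise separable convolutions can be made concrete by counting weights. The sketch below is a back-of-the-envelope comparison under stated assumptions: biases are ignored, and the kernel size and channel counts are illustrative values, not figures from the thesis.

```python
# Weight counts for a standard convolution versus a depth-wise separable one.
# Biases are ignored; the layer sizes below are illustrative assumptions.

def standard_conv_params(k, c_in, c_out):
    """A k x k standard convolution mixes space and channels in one step."""
    return k * k * c_in * c_out


def depthwise_separable_params(k, c_in, c_out):
    """One k x k filter per input channel, followed by a 1 x 1 pointwise mix."""
    depthwise = k * k * c_in
    pointwise = c_in * c_out
    return depthwise + pointwise


k, c_in, c_out = 3, 32, 64
std = standard_conv_params(k, c_in, c_out)
sep = depthwise_separable_params(k, c_in, c_out)
print(std, sep, round(std / sep, 2))  # 18432 2336 7.89
```

For this example layer the separable form needs roughly one eighth of the weights, which is the kind of saving that makes such networks attractive when data and compute are limited.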
ADLCBCDM-3 is an efficient hybrid deep layer cascade representation-based breast cancer detection method obtained by hybridizing ADLCBCDM-2 and ADLCBCDM-1. By using a class-discriminant softmax vector representation at the interface, it cascades sparse and collaborative representations and so combines the benefits of both. It also enhances hierarchical learning capability by extending traditional shallow sparse representation to an efficient multi-layer learning approach. The second technique is a Convolutional Neural Network with Structured Analysis Dictionary Learning (SADL)-based feature selection for breast cancer detection. A deep learning network extracts deep features in three parallel paths: from the original image, from the image produced by the Canny edge detector, and from the image produced by LoGMHBF. SADL is then used to select more class-discriminative features, which improves classification performance. Finally, an SVM performs the classification by optimizing the placement of the separating hyperplane. The combined benefits of transfer learning-based feature extraction, sparse learning-based feature selection, and machine learning-based classification thus improve overall performance. The experimental findings indicate that the proposed MReMarNet-based scheme yields the best classification performance among all proposed schemes across all three datasets (mini-DDSM, BUSI, and BUS2), owing to good intraclass compactness from the relation unit and good interclass separability from the fully connected unit, which together improve feature discrimination capability.
Item Type: | Thesis (PhD) |
---|---|
Uncontrolled Keywords: | Breast Cancer; Classification; Mammogram; Ultrasound image; Machine Learning; Deep Learning; Sparse Learning. |
Subjects: | Engineering and Technology > Electronics and Communication Engineering > Sensor Networks; Engineering and Technology > Electronics and Communication Engineering > Intelligent Instrumentation; Engineering and Technology > Electronics and Communication Engineering > Image Processing |
Divisions: | Engineering and Technology > Department of Electronics and Communication Engineering |
ID Code: | 10800 |
Deposited By: | IR Staff BPCL |
Deposited On: | 22 Sep 2025 15:30 |
Last Modified: | 22 Sep 2025 15:30 |
Supervisor(s): | Meher, Sukadev |