Faculty Profile

Fatemeh Daneshfar
Last updated: 2025-08-07


Faculty of Engineering / Department of IT and Computer Engineering


M.Sc. Theses

  1. Text-to-image synthesis using generative adversarial networks based on an attention mechanism
    2024
    Text-to-image synthesis, a fundamental task in generative artificial intelligence, seeks to produce realistic images that match natural language descriptions. This study proposes a dual-attention generative adversarial network, DA2GAN, that uses an innovative multi-stage architecture to improve visual realism, semantic coherence, and fine-grained detail generation. The model operates in two distinct phases: a self-directed phase that generates low-resolution drafts, and an alignment phase that refines these drafts into high-resolution outputs. The dual-attention technique allows DA2GAN to focus on both textual and visual features, ensuring accurate representation of descriptive elements and producing coherent, high-quality images. We evaluate DA2GAN on the CUB and Oxford-102 datasets and report competitive results on well-known metrics such as IS, FID, and RP. The proposed model outperforms previous models in both qualitative and quantitative comparisons; the qualitative comparisons further emphasize its capacity to produce clearer, more realistic images that agree more closely with the text descriptions. These innovations establish DA2GAN as an advanced framework for text-to-image synthesis with applications in creative content development, automated design, and image enhancement.
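
A minimal PyTorch sketch of what a dual-attention block of this kind might look like; the module name `DualAttention`, the dimensions, and the additive fusion are illustrative assumptions, not DA2GAN's exact published design:

```python
# Illustrative dual-attention block: image regions attend over caption words
# (textual attention) and over each other (visual self-attention).
import torch
import torch.nn as nn
import torch.nn.functional as F

class DualAttention(nn.Module):
    def __init__(self, dim: int = 256):
        super().__init__()
        self.text_proj = nn.Linear(dim, dim)  # project word embeddings
        self.self_attn = nn.MultiheadAttention(dim, num_heads=4, batch_first=True)

    def forward(self, regions: torch.Tensor, words: torch.Tensor) -> torch.Tensor:
        # regions: (B, R, dim) image-region features from the draft stage
        # words:   (B, W, dim) word embeddings of the caption
        # Textual attention: each region gathers the words most relevant to it.
        scores = regions @ self.text_proj(words).transpose(1, 2)  # (B, R, W)
        text_ctx = F.softmax(scores, dim=-1) @ words              # (B, R, dim)
        # Visual self-attention: regions refine each other globally.
        vis_ctx, _ = self.self_attn(regions, regions, regions)
        return regions + text_ctx + vis_ctx  # fused features for the refiner

if __name__ == "__main__":
    block = DualAttention(dim=256)
    fused = block(torch.randn(2, 64, 256), torch.randn(2, 18, 256))
    print(fused.shape)  # torch.Size([2, 64, 256])
```
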
  2. Semi-Supervised Dust-to-Clean Image Translation Using Regression Minimization and Consistency Regularization
    2024
    The efficacy of outdoor vision systems in recording images is often compromised by atmospheric elements like dust, leading to challenges in subsequent processing. Dusty images commonly suffer from issues such as reduced contrast, decreased visibility, and color distortion. These issues significantly degrade the quality of the captured images, making them less useful for applications that rely on clear and accurate visual data. Consequently, the elimination of dust, known as dedusting, is crucial as a pre-processing step in many computer vision applications. However, achieving effective dedusting is not straightforward. A significant challenge faced by learning-based dedusting methods is the lack of paired training data. Paired training data, which consists of corresponding dusty and clean image pairs, is essential for supervised learning algorithms to learn the mapping from dusty to clean images. Unfortunately, acquiring such data is often impractical or impossible in real-world scenarios, which can severely impact model performance. To address this challenge, we propose a novel semi-supervised approach for dust-to-clean image translation, termed DR-Net, which emphasizes regression minimization and consistency regularization to improve dusty image quality. In regression minimization, we ensure the preservation of the structural integrity of dedusted images by training our model on a limited set of synthetic dusty images in a supervised framework. Furthermore, we employ consistency regularization to ensure that our model produces dust-free images with distributions similar to those of real-world clean images and maintains adherence to the statistical characteristics of the dark channel of clean images in an unsupervised framework. Experimental results underscore the effectiveness of our method in yielding high-quality outcomes. Our approach surpasses the performance of existing methods, demonstrating superior capability in enhancing various computer vision tasks such as object detection, recognition, and tracking in dusty environments. The improvement in image quality not only facilitates better human interpretation but also significantly boosts the performance of automated systems that rely on clear visual data.
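
A hedged sketch of the two training signals this abstract describes, assuming an L1 regression term on synthetic pairs and a dark-channel consistency term on real images; the patch size and loss weight `lam` are illustrative, not DR-Net's actual values:

```python
# Supervised regression on paired synthetic data plus an unsupervised
# dark-channel consistency term on real dedusted images.
import torch
import torch.nn.functional as F

def dark_channel(img: torch.Tensor, patch: int = 15) -> torch.Tensor:
    # img: (B, 3, H, W) in [0, 1]; dark channel = local min over channels and patch
    min_c = img.min(dim=1, keepdim=True).values          # (B, 1, H, W)
    return -F.max_pool2d(-min_c, patch, stride=1, padding=patch // 2)

def dr_net_loss(pred_paired, clean_gt, pred_real, lam: float = 0.1):
    # Regression minimization: preserve structure on synthetic dusty/clean pairs.
    l_reg = F.l1_loss(pred_paired, clean_gt)
    # Consistency regularization: dedusted real images should have a dark
    # channel close to zero, a well-known statistic of clean outdoor images.
    l_dark = dark_channel(pred_real).abs().mean()
    return l_reg + lam * l_dark

if __name__ == "__main__":
    out = dr_net_loss(torch.rand(2, 3, 64, 64), torch.rand(2, 3, 64, 64),
                      torch.rand(2, 3, 64, 64))
    print(out.item())
```
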
  3. Designing and Collecting a Corpus and Syntactic Parser for Central Kurdish Language
    2024
    Central Kurdish, widely spoken in Iraq and Iran, lacks sufficient NLP resources. This study addresses this gap by developing the first comprehensive syntactically annotated corpus, advancing Kurdish language technologies and computational linguistics research. The creation of this Central Kurdish Corpus significantly contributes to the field of Kurdish NLP: such resources enable machine translation, information extraction, sentiment analysis, grammar checking, text summarization, and more, and offer potential for low-resource language processing. This work employs a systematic, multi-stage methodology. First, a diverse corpus of 3,000 carefully curated sentences is manually annotated with fine-grained POS tags, using a custom tagset of 74 tags that captures intricate grammatical distinctions in Kurdish. The corpus is then syntactically annotated based on a CFG meticulously designed for Central Kurdish, encompassing 249 production rules. The corpus spans various domains, ensuring extensive coverage of syntactic phenomena. For parsing, the study implements a deterministic rule-based dynamic programming algorithm using top-down chart parsing, which leverages the developed CFG rules. This approach demonstrates robustness in handling the intricacies of Central Kurdish morphology and flexible word order. Subsequently, the research explores the application of fine-tuned cutting-edge LLMs, specifically GPT-3.5, to constituency parsing tasks; the LLMs are fine-tuned on the annotated corpus to augment parsing performance, particularly for complex and ambiguous syntactic structures. The POS tagging and rule-based parsing approaches are manually evaluated using the PARSEVAL framework on a set of 150 sentences, revealing a POS tagging accuracy of 98.7% and a parsing accuracy of 98%, as verified through expert review and inter-annotator agreement. The LLM-based method is assessed with the EVALB tool, an implementation of the PARSEVAL evaluation scheme and a standard metric for constituency parsing: 84.92% of sentences were parsed with a complete match, and the overall bracketing F-measure reached 96.41%.
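
The rule-based parser combines a hand-written CFG with top-down chart parsing, a combination NLTK implements directly; the toy grammar below merely stands in for the thesis's 249-rule Central Kurdish CFG and 74-tag tagset:

```python
# Top-down chart parsing over a context-free grammar with NLTK.
import nltk
from nltk.parse.chart import TopDownChartParser

grammar = nltk.CFG.fromstring("""
    S  -> NP VP
    NP -> DET N | N
    VP -> V NP | V
    DET -> 'the'
    N  -> 'parser' | 'sentence'
    V  -> 'analyzes'
""")

parser = TopDownChartParser(grammar)
for tree in parser.parse("the parser analyzes the sentence".split()):
    tree.pretty_print()  # prints the constituency tree for the toy sentence
```
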
  4. Using Deep Learning Transformers for Semantic Similarity in a Sorani Kurdish Question Answering System
    2024
    This research focuses on developing a question-answering system for the Sorani Kurdish language using advanced deep-learning models such as BERT, GPT, and T5. The main objective of this system is to provide accurate and relevant answers to user queries while considering the limitations in processing low-resource languages. A dataset containing 1,000 pairs of questions and answers in Sorani Kurdish was used to evaluate the models. This data was loaded and preprocessed for training and evaluation of the transformer models. The performance of the models was assessed using common metrics such as accuracy, precision, recall, and F1 score. The evaluation results indicate that the BERT model achieved the best performance among the models, with an accuracy of 0.98 and high precision and recall scores. The T5 model ranked second with an accuracy of 0.86 and an F1 score of 0.83, while the GPT model performed significantly weaker and required further optimizations. These findings suggest that transformer models, especially BERT and T5, are more suitable for processing low-resource languages.
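
A minimal sketch of the retrieval view of such a system: embed stored questions with a transformer encoder and return the answer whose question is semantically closest to the query. The multilingual checkpoint and the two toy QA pairs are stand-ins; the thesis fine-tunes BERT, GPT, and T5 on its own 1,000 Sorani Kurdish pairs:

```python
# Semantic-similarity question answering via transformer sentence embeddings.
from sentence_transformers import SentenceTransformer, util

qa_pairs = [  # illustrative placeholder data, not the thesis dataset
    ("What is the capital of Iraq?", "Baghdad"),
    ("Which river runs through Baghdad?", "The Tigris"),
]

model = SentenceTransformer("paraphrase-multilingual-MiniLM-L12-v2")
q_emb = model.encode([q for q, _ in qa_pairs], convert_to_tensor=True)

def answer(query: str) -> str:
    # Return the answer whose stored question is most similar to the query.
    scores = util.cos_sim(model.encode(query, convert_to_tensor=True), q_emb)
    return qa_pairs[int(scores.argmax())][1]

print(answer("What city is the capital of Iraq?"))
```
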
  5. Improving Liver Disease Detection Using Oversampling and Network Analysis
    2024
    Liver diseases represent a significant global health challenge, impacting millions of individuals and leading to morbidity and mortality due to their often asymptomatic nature. The early detection and accurate diagnosis of liver disorders are critical for effective treatment and management, making it imperative to leverage advanced technologies such as machine learning. As healthcare systems increasingly rely on data-driven solutions, employing robust predictive models for liver disease can transform clinical practices, improve patient outcomes, and reduce the burden on healthcare providers. This thesis presents an investigation into the application of machine learning techniques for the detection of liver diseases using the Indian Liver Patient Records dataset, which includes clinical data from 579 patients. The study meticulously preprocesses the data by addressing class imbalance through the ADASYN algorithm, encoding categorical variables with LabelEncoder, and calculating feature correlations using the Spearman method. A graph-based approach was adopted to extract insights from patient features, enabling the creation of enriched data representations that were subsequently used to train various machine learning classifiers, including HistGradientBoostingClassifier, RandomForestClassifier, and AdaBoostClassifier. The findings of this research reveal substantial improvements in predictive accuracy, with the HistGradientBoostingClassifier achieving an impressive accuracy of 98.49%. The model outperformed existing methodologies, demonstrating the effectiveness of advanced feature extraction techniques and robust data preprocessing strategies in enhancing the reliability of predictions for liver disease diagnosis. This study not only highlights the expanding role of machine learning in healthcare but also serves as a validation of the potential benefits of data-driven approaches in disease management. Despite the promising results, several limitations are acknowledged in this research. The reliance on a specific dataset may restrict the generalizability of the findings, and the methodologies employed may require validation on diverse datasets to confirm their effectiveness across different populations. Additionally, there is a need for further exploration of deep learning techniques and the integration of multimodal data sources to improve diagnostic accuracy. Future research should aim to address these limitations while continuing to expand the understanding and application of machine learning within the realm of liver disease detection and beyond.
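
A runnable sketch of the core preprocessing-plus-classifier pipeline named above (ADASYN balancing followed by HistGradientBoostingClassifier), on synthetic stand-in data; the graph-based feature enrichment and the actual Indian Liver Patient Records dataset are omitted here:

```python
# ADASYN oversampling on the training split, then gradient boosting.
from imblearn.over_sampling import ADASYN
from sklearn.datasets import make_classification
from sklearn.ensemble import HistGradientBoostingClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

X, y = make_classification(n_samples=579, n_features=10, weights=[0.72],
                           random_state=0)  # imbalanced, like the liver data
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

X_bal, y_bal = ADASYN(random_state=0).fit_resample(X_tr, y_tr)  # balance classes
clf = HistGradientBoostingClassifier(random_state=0).fit(X_bal, y_bal)
print(f"accuracy: {accuracy_score(y_te, clf.predict(X_te)):.3f}")
```
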
  6. Multi-objective Manifold Representation for Opinion Mining
    2024
    Sentiment analysis is an essential task in numerous domains, necessitating effective dimensionality reduction and feature extraction techniques. This study introduces Multi-Objective Manifold Representation for Opinion Mining (MOMR), a novel approach that combines deep global and local manifold feature extraction to reduce dimensions while efficiently capturing intricate data patterns. Additionally, incorporating a self-attention mechanism further enhances MOMR's capability to focus on relevant parts of the text, resulting in improved performance in sentiment analysis tasks. MOMR was evaluated against established techniques such as Long Short-Term Memory (LSTM), Naive Bayes (NB), Support Vector Machines (SVM), Recurrent Neural Networks (RNN), and Convolutional Neural Networks (CNN), as well as recent state-of-the-art models across multiple datasets including IMDB, Fake News, Twitter, and Yelp. Our comparative analysis underscores MOMR's efficacy in sentiment analysis tasks across diverse datasets, highlighting its potential and applicability in real-world sentiment analysis applications. On the IMDB dataset, MOMR achieved an accuracy of 99.7% and an F1 score of 99.6%, outperforming other methods such as LSTM, NB, SMSR, and various SVM and CNN models. For the Twitter dataset, MOMR attained an accuracy of 88.0% and an F1 score of 88.0%, surpassing other models, including LSTM, CNN, BiLSTM, Bi-GRU, NB, and RNN. In the Fake News dataset, MOMR demonstrated superior performance with an accuracy of 97.0% and an F1 score of 97.6%, compared to techniques like RF, RNN, BiLSTM+CNN, and NB. For the Yelp dataset, MOMR achieved an accuracy of 80.0% and an F1 score of 80.0%, proving its effectiveness alongside other models such as Bidirectional Encoder Representations from Transformers (BERT), aspect-sentence graph convolutional neural network (ASGCN), Multi-layer Neural Network, LSTM, and bidirectional recurrent convolutional neural network attention (BRCAN).
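
The self-attention component can be sketched independently of the manifold objectives: a layer that scores each token's relevance before pooling a document vector. The dimensions and the name `AttentivePooling` are illustrative, not MOMR's actual architecture:

```python
# Attention-weighted pooling: score each token, then take the weighted sum.
import torch
import torch.nn as nn
import torch.nn.functional as F

class AttentivePooling(nn.Module):
    def __init__(self, dim: int = 128):
        super().__init__()
        self.score = nn.Linear(dim, 1)  # one relevance score per token

    def forward(self, tokens: torch.Tensor) -> torch.Tensor:
        # tokens: (B, T, dim) word features after dimensionality reduction
        weights = F.softmax(self.score(tokens), dim=1)  # (B, T, 1)
        return (weights * tokens).sum(dim=1)            # (B, dim) document vector

if __name__ == "__main__":
    doc = AttentivePooling()(torch.randn(4, 50, 128))
    print(doc.shape)  # torch.Size([4, 128]) -> fed to a sentiment classifier
```
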
  7. Self-representation factorization to learn generalizable representation
    2024
    Nonnegative Matrix Factorization (NMF), as a group representation learning model, produces part-based representations with interpretable features and can be applied to various problems, such as text clustering. The findings indicate that the NMF model with Kullback-Leibler divergence (NMFk) and NMF with β divergence (β-NMF) exhibit promising performance in the task of data clustering. However, existing NMF-based data clustering methods are defined within a latent decoder model, lacking a verification mechanism. Recently, self-representation techniques have been applied to a wide range of tasks, empowering models to autonomously learn and verify representations that faithfully reflect the intricacies and nuances inherent in their input data. In this research, we propose two self-representation factorization models for data clustering that incorporate semantic information and graph regularization into their learning processes, respectively. The Semantic-aware Encoder-Decoder NMF model based on Kullback-Leibler divergence (SEDNMFk) and the Encoder-Decoder NMF with β divergence integrate encoder and decoder factorizations into a unified cost function in which the two factorizations mutually verify and refine each other, resulting in the formation of more distinct clusters. We present efficient and effective optimization algorithms based on multiplicative update rules to solve the two proposed unified models. Experimental results on ten well-known datasets show that the proposed models outperform other state-of-the-art data clustering methods.
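
For context, these are the classical multiplicative update rules for NMF under Kullback-Leibler divergence (Lee & Seung) that such models build on; this is the textbook baseline, not the thesis's SEDNMFk:

```python
# Multiplicative updates for KL-divergence NMF: X ~ W @ H with W, H >= 0.
import numpy as np

def nmf_kl(X: np.ndarray, k: int, iters: int = 200, eps: float = 1e-9):
    m, n = X.shape
    rng = np.random.default_rng(0)
    W, H = rng.random((m, k)), rng.random((k, n))
    for _ in range(iters):
        # W <- W * ((X / WH) H^T) / (1 H^T);  H <- H * (W^T (X / WH)) / (W^T 1)
        W *= ((X / (W @ H + eps)) @ H.T) / (H.sum(axis=1) + eps)
        H *= (W.T @ (X / (W @ H + eps))) / (W.sum(axis=0)[:, None] + eps)
    return W, H

X = np.abs(np.random.default_rng(1).random((20, 30)))
W, H = nmf_kl(X, k=5)
print(np.linalg.norm(X - W @ H))  # reconstruction error shrinks over iterations
```
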
  8. Hybrid Deep Learning Approach: CNN-ViT Fusion for Breast Cancer Diagnosis in Ultrasound Images
    2024
    Breast cancer represents one of the leading cancer diagnoses in women around the world. Early detection and accurate classification of breast cancer from medical images are crucial, as they enable timely treatment, which can significantly improve patient outcomes. Ultrasound imaging is a popular diagnostic method in radiology for evaluating breast health. Over the past ten years, deep learning approaches, especially Convolutional Neural Networks (CNNs), have been used to develop comprehensive systems for recognizing image patterns. More recently, the Vision Transformer (ViT) has gained attention as a novel deep learning architecture, largely because of its self-attention mechanisms, which have greatly improved the field of image processing. These models have exhibited strong performance across a wide range of image-related applications. Computer-Aided Diagnosis (CAD) systems in the medical field have increasingly adopted deep learning methodologies, recognized for their superior ability to extract essential features from medical images. This study proposes a hybrid deep learning approach that integrates CNNs with ViTs to enhance breast cancer diagnosis in ultrasound images. This method capitalizes on the beneficial attributes of CNNs and ViTs to boost the accuracy of breast cancer diagnosis. By combining the powerful local feature extraction ability of CNNs with the ViT's focus on long-range dependencies and global features, the hybrid network, integrating multiple vision architectures, optimizes the utilization of information, enabling a more thorough and nuanced interpretation of medical imaging data. The methodology was assessed using two publicly accessible datasets, revealing superior performance compared to current state-of-the-art techniques. This indicates that our method has the potential to generalize across various datasets. The high accuracy achieved by this hybrid deep learning model suggests that it can play a significant role in improving breast cancer diagnosis.
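
A toy fusion model in the spirit of the description: a convolutional stem for local features, a transformer encoder for global relations among the resulting patches, and concatenation of the two views. Layer sizes and the single-channel input are assumptions, not the thesis architecture:

```python
# CNN-ViT-style fusion: local (CNN) and global (transformer) views combined.
import torch
import torch.nn as nn

class CnnVitFusion(nn.Module):
    def __init__(self, n_classes: int = 2, dim: int = 64):
        super().__init__()
        self.cnn = nn.Sequential(                       # local feature extractor
            nn.Conv2d(1, dim, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(dim, dim, 3, stride=2, padding=1), nn.ReLU(),
        )
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=4, batch_first=True)
        self.vit = nn.TransformerEncoder(layer, num_layers=2)  # global relations
        self.head = nn.Linear(2 * dim, n_classes)

    def forward(self, x):                               # x: (B, 1, H, W) ultrasound
        f = self.cnn(x)                                 # (B, dim, H/4, W/4)
        patches = f.flatten(2).transpose(1, 2)          # (B, P, dim) patch tokens
        g = self.vit(patches).mean(dim=1)               # global (ViT-style) view
        l = f.mean(dim=(2, 3))                          # local (CNN) view
        return self.head(torch.cat([l, g], dim=1))      # fused prediction

print(CnnVitFusion()(torch.randn(2, 1, 64, 64)).shape)  # torch.Size([2, 2])
```
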
  9. Automatic Colorectal Cancer Detection Using Machine Learning and Deep Learning Based on Feature Selection
    2024
    Colorectal cancer (CRC), accounting for 10% of global cancer cases and being the third most prevalent type, is expected to see a significant increase in the coming years. This surge underscores the need for precise diagnostics. Effective treatment relies on accurate histopathological analysis of hematoxylin and eosin (H&E) stained biopsies, which is critical for recommending minimally invasive treatments. However, manual evaluations of these biopsies are labor-intensive and error-prone due to staining variations and inconsistencies, complicating the tasks of pathologists. To address these challenges, advanced automated image analysis, including deep learning with convolutional neural networks (CNNs) and machine learning (ML) techniques, has significantly enhanced computer-aided diagnosis systems. Consequently, this thesis proposes a composite model that combines deep learning and machine learning to improve colorectal cancer diagnosis accuracy. Specifically, the model aims to increase diagnostic precision, reduce complexity and computing demands, and effectively prevent overfitting for reliable performance. The proposed cascaded design includes feature extraction using MobileNetV2 and DenseNet121 via transfer learning (TL), data distribution balancing in the Extended Bioimaging Histopathological Image Segmentation (EBHI-Seg) dataset using the Synthetic Minority Over-sampling Technique (SMOTE), key feature selection using a Chi-square test, classification by machine learning algorithms, and improvement of classification accuracy through hyperparameter tuning. Finally, the results evaluated on the available EBHI-Seg dataset achieve 97.28% accuracy, 97.29% precision, 97.27% recall, 96.27% F1-score, and 99.4% area under the curve (AUC), demonstrating that the suggested model is superior to other methods already in use.
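
A runnable sketch of the machine-learning half of this cascade (SMOTE balancing, Chi-square feature selection, hyperparameter tuning) on stand-in data; the MobileNetV2/DenseNet121 deep features are replaced by synthetic nonnegative vectors, since Chi-square scoring requires nonnegative input:

```python
# SMOTE -> Chi-square feature selection -> grid-searched classifier.
import numpy as np
from imblearn.over_sampling import SMOTE
from sklearn.feature_selection import SelectKBest, chi2
from sklearn.svm import SVC
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.metrics import classification_report

rng = np.random.default_rng(0)
X = rng.random((600, 256))                 # stand-in for pooled CNN features
y = (X[:, :8].sum(axis=1) + 0.3 * rng.random(600) > 5.0).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)
X_bal, y_bal = SMOTE(random_state=0).fit_resample(X_tr, y_tr)  # balance classes

selector = SelectKBest(chi2, k=64).fit(X_bal, y_bal)           # keep top features
grid = GridSearchCV(SVC(), {"C": [1, 10], "gamma": ["scale", 0.1]}, cv=3)
grid.fit(selector.transform(X_bal), y_bal)                     # hyperparameter tuning
print(classification_report(y_te, grid.predict(selector.transform(X_te))))
```
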
  10. An attention-based multimodal deep learning model for image captioning
    2024
    Our brain is capable of describing and categorizing the images that appear before us. But how can a computer process an image and identify it with an appropriate and accurate description? This seemed unattainable a few years ago, but with the advancement of machine vision algorithms and deep learning, as well as the availability of datasets and suitable artificial intelligence models, creating an appropriate description generator for an image has become easier. Image captioning is also a growing industry worldwide. The process of generating image captions converts a series of pixels into a series of words; image captioning can thus be seen as an end-to-end, sequence-to-sequence task that requires processing both words and images. In this thesis, first, an explanation of image captioning and its applications in various fields is presented; then, the evolution of image captioning methods is examined, and the various methods proposed over time are comprehensively reviewed. This coherent classification helps us gain a deeper understanding of the techniques available for image captioning. Recent articles in the field are also reviewed, and based on the results of this review, the necessity of continuing research in image captioning is emphasized: such research can lead to significant improvements in existing methods and to the discovery of newer, more advanced ones. This thesis uses an attention-based encoder-decoder method. Unlike previous methods where attention was applied to only one of the modalities, here the attention mechanism is applied to both the image and the text, a new idea in this field, and the final caption is generated word by word. The Flickr8k dataset is used, and the evaluation metrics are BLEU-1 through BLEU-4, ROUGE, and METEOR, with scores of 51, 49, 48, 44, 52, and 37.5 respectively. These results indicate an improvement over previous methods.
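
One decoding step of such a dual-attention captioner might look like the following PyTorch sketch, with attention over both image regions and previously generated words; the shapes, vocabulary size, and additive fusion are illustrative assumptions:

```python
# One word-by-word decoding step with attention over image and text.
import torch
import torch.nn as nn

class DecodeStep(nn.Module):
    def __init__(self, dim: int = 256, vocab: int = 5000):
        super().__init__()
        self.img_attn = nn.MultiheadAttention(dim, 4, batch_first=True)
        self.txt_attn = nn.MultiheadAttention(dim, 4, batch_first=True)
        self.out = nn.Linear(dim, vocab)

    def forward(self, h, regions, prev_words):
        # h: (B, 1, dim) decoder state; regions: (B, R, dim); prev_words: (B, T, dim)
        img_ctx, _ = self.img_attn(h, regions, regions)        # attend to the image
        txt_ctx, _ = self.txt_attn(h, prev_words, prev_words)  # attend to the text
        return self.out((h + img_ctx + txt_ctx).squeeze(1))    # next-word logits

step = DecodeStep()
logits = step(torch.randn(2, 1, 256), torch.randn(2, 36, 256), torch.randn(2, 5, 256))
print(logits.shape)  # torch.Size([2, 5000]); argmax gives the next word id
```
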
  11. A trust-based multi-objective approach to improve the performance of recommender systems
    2024
    Recommender systems have become an integral and vital part of various online businesses, supporting better user experience, customer growth, and revenue. The accuracy and diversity of recommendations are important metrics for evaluating the performance of a recommender system, and many strategies have been developed in the existing literature to strike a balance between them. However, these methods often apply a one-size-fits-all trade-off without considering the specific recommendation situation of each user, resulting in improvements only in individual diversity or overall diversity.[1] A recommender system searches for items or information useful to the user based on the user's past behavior and the characteristics of the items; its goal is to help users find relevant information based on their preferences rather than searching through vast amounts of information with search engines. A recommender system should also be able to provide highly specific or personalized items to the user, which is often measured by the diversity metric, and its performance can be evaluated along several dimensions, such as the accuracy of recommendations for each user and the diversity of recommendations across different users. Diversity is one of the leading topics in recommender system research: it serves the dual purpose of increasing user satisfaction with the recommendations provided and reducing the overfitting problem, thereby enhancing the quality of the user experience. In this thesis, we run the Ant Colony Optimization (ACO) search algorithm on a graph of users to generate a recommendation list with greater diversity and acceptable accuracy for each user. The proposed method addresses the trade-off between accuracy and diversity from different perspectives and presents a new metaheuristic-based solution for developing ranking methods that improve diversity without excessively reducing accuracy. The approach consists of four steps, sketched in code below. In the first step, for each target user, an initial filter is applied to find users similar to the target user. In the second step, a graph of the users selected in the previous step is formed, on which the ant colony optimization search algorithm is executed. In the third step, the p users that appear on the paths with the highest pheromone values are selected as the recommendation source. In the last step, a list of diverse items is suggested to the user. To evaluate accuracy, diversity, and novelty, several experiments were conducted on three real datasets, and the performance of the method was also measured on different groups of users.
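
A compact sketch of the four steps on a toy user-similarity graph: ants walk the filtered user graph, pheromone accumulates on useful edges, and the p highest-pheromone neighbours become the recommendation source. All constants and the random similarity matrix are illustrative:

```python
# Ant Colony Optimization over a user-similarity graph (toy version).
import numpy as np

rng = np.random.default_rng(0)
sim = rng.random((8, 8)); np.fill_diagonal(sim, 0)    # step 1: similar users (toy)
pher = np.ones_like(sim)                              # pheromone on edges

for _ in range(50):                                   # step 2: ACO on the graph
    node, visited = 0, [0]                            # each ant starts at the target user
    for _ in range(3):
        prob = pher[node] * sim[node]
        prob[visited] = 0                             # never revisit a user
        node = int(rng.choice(len(prob), p=prob / prob.sum()))
        visited.append(node)
    reward = sim[visited[:-1], visited[1:]].sum()     # better paths deposit more
    pher *= 0.95                                      # pheromone evaporation
    pher[visited[:-1], visited[1:]] += reward

p = 3                                                 # step 3: top-p users by pheromone
neighbours = np.argsort(pher[0])[::-1][:p]
print("recommend items from users:", neighbours)      # step 4: diversify their items
```
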
  12. Kurdish Text Classification by an Optimization Algorithm
    2023
    Today, with the ever-increasing amount of information and the wide range of topics, text classification is one of the challenges of artificial intelligence. Text classification is a branch of natural language processing in which texts are placed into categories or groups. It has received much attention recently and has many applications, among the most important of which are document classification, information retrieval, question answering, and polarity detection. Kurdish belongs to the Indo-Iranian branch of the Indo-European languages and is spoken by more than 30 million people in Western Asia, mainly in Iraq, Turkey, Iran, Syria, Armenia, and Azerbaijan. It has various dialects, its own grammar system, and a rich vocabulary. Most text classification systems can be summarized into four steps: feature extraction, dimensionality reduction, classifier selection, and evaluation. First, features are extracted from a text (using word encoding) in different ways. Since many of the extracted features are redundant and irrelevant, they can cause errors in the classifier, so selecting the more important features is a fundamental problem in text classification, and it plays a significant role in increasing classification accuracy. At this stage, we try to select the best features using machine learning methods, applied to a Kurdish text dataset. Among the machine learning methods for optimization problems is the use of meta-heuristic algorithms. Many meta-heuristic algorithms have been introduced to date, each inspired by nature; these algorithms make few assumptions about a problem and can search very large spaces of candidate solutions. The laying hen algorithm is one of the best meta-heuristic algorithms for solving optimization problems in continuous space. Using this algorithm, the features extracted from the text are selected so as to increase the accuracy of the classifier. For this purpose, an adapted version of the algorithm is first presented for discrete space, and then all feature-selection states are placed in the sample space. By traversing the sample space and evaluating the states point by point, the algorithm moves from point to point; the main challenges are choosing a good starting point and choosing the right range of change for each point. This research presents a new method that improves feature selection for text classification, and by applying it to Kurdish, a low-resource language in natural language processing, it enriches the available work on the language. The results, obtained on a small scale due to limited processing resources, show a one percent improvement in classifier accuracy, which demonstrates the efficiency of the presented approach and opens a new door for researchers.
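
The feature-selection loop can be sketched with any binary metaheuristic: candidates are 0/1 masks over features and fitness is cross-validated classifier accuracy. A simple stochastic local search and the digits dataset stand in here for the discrete laying-hen algorithm and the Kurdish text dataset:

```python
# Binary feature-selection search: flip features in/out, keep improvements.
import numpy as np
from sklearn.datasets import load_digits  # stand-in for the Kurdish text features
from sklearn.naive_bayes import MultinomialNB
from sklearn.model_selection import cross_val_score

X, y = load_digits(return_X_y=True)
rng = np.random.default_rng(0)

def fitness(mask: np.ndarray) -> float:
    if not mask.any():
        return 0.0  # an empty feature set cannot classify anything
    return cross_val_score(MultinomialNB(), X[:, mask], y, cv=3).mean()

mask = rng.random(X.shape[1]) < 0.5           # random starting point
best = fitness(mask)
for _ in range(30):                           # point-to-point search of the space
    cand = mask.copy()
    cand[rng.integers(X.shape[1])] ^= True    # flip one feature in or out
    if (score := fitness(cand)) >= best:
        mask, best = cand, score
print(f"{mask.sum()} features kept, CV accuracy {best:.3f}")
```
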
  13. Automatic generation control using multi-agent systems
    2009
    In this dissertation, intelligent controllers are used that incorporate control performance standards into their structure, so that these standards are followed in addition to providing proper load-frequency control. The results showed that by applying performance standards in the controller structure, the controller better met control objectives such as reduced settling time and overshoot. In addition to classical algorithms, a controller based on multi-agent systems was used, taking performance standards into account both to reduce wear on governor equipment and to follow NERC performance standards and thereby increase reliability. The results show that controllers that follow NERC standards perform better and their frequency response improves.
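
A toy single-area load-frequency control loop illustrating the stated objectives: an integral controller restores frequency after a load step, and settling time and overshoot are measured as simple stand-ins for NERC-style performance criteria. All plant constants are illustrative:

```python
# Discrete-time single-area load-frequency control with an integral controller.
import numpy as np

dt, T = 0.1, 60.0
M, D, Ki = 10.0, 1.0, 0.5                 # inertia, damping, integral gain (toy)
f_dev, integral, d_load = 0.0, 0.0, 0.1   # 0.1 pu load step at t = 0
trace = []
for _ in np.arange(0, T, dt):
    p_ctrl = -Ki * integral               # supplementary (AGC) control action
    f_dot = (p_ctrl - d_load - D * f_dev) / M
    f_dev += f_dot * dt                   # frequency-deviation dynamics
    integral += f_dev * dt                # area control error, integrated
    trace.append(f_dev)

trace = np.array(trace)
overshoot = trace.max()                   # positive swing past nominal frequency
settled = np.nonzero(np.abs(trace) > 0.005)[0]
print(f"overshoot {overshoot:.4f} pu, settling time {settled[-1] * dt:.1f} s")
```
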