Associate Professor

Last updated: 1405/03/31

Fatemeh Daneshfar

Engineering / Computer Engineering and Information Technology

Master's Theses

Adaptive Energy Aware Evidential Deep Learning Model
1405
Evidential Deep Learning (EDL) enables single-pass uncertainty estimation by predicting Dirichlet evidence, which is attractive for practical deployment where multi-pass inference is costly. However, many EDL approaches can remain overconfident under distribution shift, exhibit imperfect calibration, and provide limited expressiveness for multi-modal epistemic uncertainty, especially near decision boundaries or when inputs deviate subtly from the training support. This thesis proposes Gated Evidential Mixtures (GEM), a family of single-pass uncertainty-aware classifiers that explicitly couples predictive confidence to representation-space support. The central idea is to learn an internal energy-like support signal end-to-end and use it to gate evidential outputs, encouraging strong evidence for well-supported in-distribution (ID) inputs while suppressing unjustified evidence for weak-support and OOD-like inputs. The framework is developed incrementally to enable controlled ablation and clarify the contribution of each component. First, GEM-CORE learns a feature-level energy signal and maps it to a bounded integration gate that smoothly modulates evidential strength as support decreases. Second, to represent epistemic multi-modality without multi-pass ensembling, GEM-MIX introduces a lightweight mixture of evidential heads with learned routing weights, preserving single-pass inference while improving uncertainty expressiveness. Third, GEM-FI stabilizes mixture allocations using a Fisher-information–informed regularization/modulation mechanism, mitigating expert (head) collapse and improving uncertainty behavior in sensitive regions such as near decision boundaries. The proposed approach is evaluated across image classification and out-of-distribution (OOD) detection benchmarks under far-OOD and near-OOD shifts, as well as corruption-based distribution shifts.
The results indicate that GEM maintains competitive in-distribution (ID) accuracy while substantially improving confidence reliability and ID/OOD separability. In particular, support-aware gating improves calibration by reducing overconfident errors under low support, mixture modeling enhances epistemic separation, and Fisher-informed stabilization yields more robust routing and more consistent uncertainty quality. Overall, GEM provides a practical single-pass framework for deployment-oriented uncertainty estimation, achieving consistent gains over strong EDL baselines on metrics including accuracy, Brier score, AUROC, and AUPR.
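The gating idea in this abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: the sigmoid form of the gate, the sign convention (low energy means well-supported input), and the names `gated_dirichlet`, `threshold`, and `temperature` are all assumptions made for illustration. Only the Dirichlet parameterisation (alpha = evidence + 1, uncertainty = K / sum of alpha) is the standard EDL formulation.

```python
import math

def gated_dirichlet(evidence, energy, threshold=0.0, temperature=1.0):
    """Scale non-negative class evidence by a support gate and return
    (expected class probabilities, vacuity-style uncertainty).

    Assumption: low energy means the input is well supported."""
    # Clamp the logit so math.exp cannot overflow for extreme energies.
    z = max(-50.0, min(50.0, (energy - threshold) / temperature))
    gate = 1.0 / (1.0 + math.exp(z))          # bounded in (0, 1)
    # Standard EDL parameterisation, applied to the gated evidence.
    alpha = [gate * e + 1.0 for e in evidence]
    strength = sum(alpha)
    probs = [a / strength for a in alpha]
    uncertainty = len(alpha) / strength        # K / S, tends to 1 as evidence vanishes
    return probs, uncertainty

# The same raw evidence yields low uncertainty under strong support
# and high uncertainty under weak support.
probs_id, u_id = gated_dirichlet([20.0, 1.0, 0.5], energy=-5.0)
probs_ood, u_ood = gated_dirichlet([20.0, 1.0, 0.5], energy=5.0)
```

With a gate of this kind the network cannot express strong evidence for an input whose support signal is weak, which is the behavior the abstract attributes to GEM-CORE.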
A New Method for Liver Disease Diagnosis Using Community Detection
1404
The liver, one of the body's most vital organs, plays a key role in metabolism, detoxification, immunity, and the regulation of biological compounds. Early diagnosis of liver diseases is highly important because these diseases progress rapidly and directly affect patients' quality of life. In recent years, machine learning methods have attracted the attention of many researchers as an effective way to improve the accuracy of medical diagnosis. Numerous studies have applied machine learning algorithms such as random forest, support vector machine, nearest neighbor, multilayer perceptron, and ensemble models such as gradient boosting to liver patient data, mainly the Indian Liver Patient Dataset. These studies have used data preprocessing, feature selection, hyperparameter tuning, and dimensionality reduction to improve the performance of classification models. Some have focused on combining different algorithms or optimizing parameters, while others have drawn on statistical and image-based approaches to extract effective features. However, most of these studies have paid insufficient attention to the hidden structures and complex relationships among patients, which can carry valuable information for improving diagnostic accuracy. This research presents a novel method based on network analysis with the aim of improving the classification accuracy of liver diseases. First, a similarity graph among patients was built from Euclidean distance, with nodes representing patients and edges representing the degree of similarity between them. Community detection algorithms, including Louvain, Infomap, label propagation, random walks, and Leiden, were then applied to identify natural groups of patients. Features derived from the community structures were extracted in binary form and added to the original data as complementary features. The final evaluation, using leave-one-out cross-validation and a set of machine learning algorithms, showed that adding these graph-based features to the clinical data led to a significant improvement in classification performance.
This improvement was observed in metrics such as accuracy, recall, F1 score, and the Matthews correlation coefficient, particularly in models such as bagging and gradient boosting. The results confirm that structural features extracted from similarity networks contain hidden information that can play an effective role in improving the predictive power of intelligent liver disease diagnosis systems.
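The pipeline described in this abstract (similarity graph, community detection, binary community features) can be sketched as follows. This is an illustration under stated assumptions, not the thesis's code: the distance cutoff `radius`, the deterministic tie-breaking rule, and the one-hot reading of "binary features" are assumptions, and only label propagation is shown out of the five algorithms the abstract names.

```python
import math

def similarity_graph(points, radius):
    """Connect two patients when their Euclidean distance is at most `radius`
    (the cutoff rule is an assumption made for this sketch)."""
    adj = {i: set() for i in range(len(points))}
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            if math.dist(points[i], points[j]) <= radius:
                adj[i].add(j)
                adj[j].add(i)
    return adj

def label_propagation(adj, max_rounds=100):
    """Asynchronous label propagation: each node adopts the most frequent
    label among its neighbours. Ties keep the current label if possible,
    otherwise take the smallest label, so the result is deterministic."""
    labels = {v: v for v in adj}
    for _ in range(max_rounds):
        changed = False
        for v in sorted(adj):
            if not adj[v]:
                continue  # isolated node keeps its own label
            counts = {}
            for u in adj[v]:
                counts[labels[u]] = counts.get(labels[u], 0) + 1
            best = max(counts.values())
            candidates = [lab for lab, c in counts.items() if c == best]
            new = labels[v] if labels[v] in candidates else min(candidates)
            if new != labels[v]:
                labels[v] = new
                changed = True
        if not changed:
            break
    return labels

def community_indicators(labels):
    """One binary column per detected community (one-hot membership)."""
    communities = sorted(set(labels.values()))
    return [[1 if labels[v] == c else 0 for c in communities]
            for v in sorted(labels)]

# Two well-separated groups of three "patients" each.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
labels = label_propagation(similarity_graph(points, radius=1.5))
features = community_indicators(labels)
```

The resulting indicator columns would be appended to each patient's clinical features before training a classifier.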
Explainable Multi-Class Classification of Student Performance through Ensemble Machine Learning and Graph-Based Feature Engineering
1404
Predicting student performance in online learning environments is pivotal for enabling timely interventions and personalized educational strategies, yet challenges such as class imbalance and lack of model transparency often limit practical adoption. This thesis proposes a novel machine learning framework for multi-class prediction of student outcomes (Fail, Pass, Distinction, Withdrawn) using the Open University Learning Analytics Dataset (OULAD) for the AAA module, comprising 712 unique student records with 18 traditional features (e.g., demographic, academic, behavioral) and six graph-based features (e.g., degree centrality, clustering coefficient). By integrating advanced feature engineering, ensemble learning, and explainable AI, the framework delivers high predictive accuracy and interpretable insights, addressing shortcomings in traditional predictive approaches. The methodology leverages a Gower distance-based graph construction to generate relational features, capturing complex student interaction patterns within the OULAD dataset. Class weighting was applied to address the class imbalance (469 Pass, 116 Withdrawn, 84 Fail, 43 Distinction), enhancing predictions for minority classes such as Distinction and Fail. A Voting Classifier, combining Random Forest, Gradient Boosting, AdaBoost, XGBoost, and CatBoost, was evaluated through 5-fold cross-validation. Local Interpretable Model-agnostic Explanations (LIME) ensured transparency by identifying key predictors driving outcome classifications. The framework achieved robust performance, with the Voting Classifier yielding an accuracy of 82.02%, precision of 81.31%, recall of 82.02%, F1-score of 80.88%, and AUC of 92.77%, demonstrating approximately 5.9% improvement in F1-score over recent studies. 
LIME explanations provided actionable insights, enabling educators to understand student-specific factors and tailor interventions, such as increasing virtual learning environment (VLE) engagement for at-risk students. The framework’s multi-class classification and interpretability mark significant advancements, supporting personalized education in online learning environments. This research advances educational data mining by integrating graph-based feature engineering, ensemble learning, and explainability, setting a new benchmark for student performance prediction. Limitations include the moderate computational complexity of the Voting Classifier and reliance on static features, which may overlook temporal dynamics in student behavior. Future work will explore longitudinal data to model performance trajectories, incorporate Graph Neural Networks (GNNs) for enhanced relational modeling, and validate the framework on diverse datasets to improve generalizability. These advancements will further strengthen the framework’s potential to deliver scalable, interpretable solutions for optimizing student outcomes in online learning.
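The Gower distance mentioned in this abstract handles mixed numeric and categorical records. The sketch below shows the usual definition; the feature names and values in the example are hypothetical and are not taken from the thesis, and how distances are turned into graph edges is not specified in the abstract.

```python
def gower_distance(a, b, numeric_ranges):
    """Gower distance between two records stored as dicts with the same keys.

    `numeric_ranges` maps each numeric feature to its (max - min) over the
    data set; every other feature is treated as categorical."""
    total = 0.0
    for key in a:
        if key in numeric_ranges:
            rng = numeric_ranges[key]
            # Range-normalised absolute difference, in [0, 1].
            total += abs(a[key] - b[key]) / rng if rng > 0 else 0.0
        else:
            # Simple matching: 0 if equal, 1 otherwise.
            total += 0.0 if a[key] == b[key] else 1.0
    return total / len(a)

# Hypothetical student records (feature names are illustrative only).
s1 = {"region": "Wales", "credits": 60, "clicks": 100}
s2 = {"region": "Scotland", "credits": 120, "clicks": 100}
ranges = {"credits": 120, "clicks": 400}
d = gower_distance(s1, s2, ranges)   # (1 + 60/120 + 0) / 3
```

A student graph could then link pairs whose distance falls below a chosen cutoff, from which node-level features such as degree centrality and clustering coefficient are computed.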
Text-to-Image Synthesis Using Generative Adversarial Networks Based on an Attention Mechanism
1403
Text-to-image synthesis, a fundamental capability in generative artificial intelligence, seeks to generate realistic images that match natural-language descriptions. This study proposes a dual-attention generative adversarial network that uses an innovative multi-stage architecture to strengthen visual realism, semantic consistency, and the generation of fine-grained detail. The GAN2DA model has two distinct phases: a self-guided phase that produces low-resolution drafts, and an alignment phase that upscales these drafts into high-resolution outputs. The dual-attention technique allows GAN2DA to focus on both textual and visual features, ensuring that descriptive elements are depicted accurately and producing coherent, high-quality images. We evaluate GAN2DA on the CUB and Oxford-102 datasets and obtain competitive results on well-known metrics such as IS, FID, and RP. The proposed model outperforms previously presented models in both qualitative and quantitative comparisons. The qualitative comparisons further highlight the model's capacity to generate sharper, more realistic images that are better aligned with the textual descriptions. The innovations demonstrated in GAN2DA show its potential as an advanced framework for text-to-image synthesis that can be applied to creative content development, automated design, and image resolution enhancement.
Designing and Collecting a Corpus and Syntactic Parser for Central Kurdish Language
1403
Central Kurdish, widely spoken in Iraq and Iran, lacks sufficient NLP resources. This study addresses this gap by developing the first comprehensive syntactically annotated corpus, advancing Kurdish language technologies and computational linguistics research. The creation of this Central Kurdish Corpus contributes significantly to the field of Kurdish NLP. These resources enable machine translation, information extraction, sentiment analysis, grammar checking, text summarization, and similar applications, and offer potential for low-resource language processing. This work employs a systematic, multi-stage methodology. First, a diverse corpus of 3,000 carefully curated sentences is manually annotated with fine-grained POS tags, utilizing a custom tagset of 74 tags that captures intricate grammatical distinctions in Kurdish. The corpus is then syntactically annotated based on a CFG meticulously designed for Central Kurdish, encompassing 249 production rules. The corpus spans various domains, ensuring extensive coverage of syntactic phenomena. For parsing, the study implements a deterministic rule-based dynamic programming algorithm using top-down chart parsing, which leverages the developed CFG rules. This approach demonstrates robustness in handling the intricacies of Central Kurdish morphology and flexible word order. Subsequently, the research explores the application of fine-tuned cutting-edge LLMs, specifically GPT-3.5, to constituency parsing tasks. The LLMs are fine-tuned on the annotated corpus to augment parsing performance, particularly for complex and ambiguous syntactic structures. The POS tagging and rule-based parsing approaches are then evaluated manually using the PARSEVAL framework. This manual evaluation reveals a POS tagging accuracy of 98.7% and a parsing accuracy of 98% for the rule-based approach on a set of 150 sentences, as verified through expert review and inter-annotator agreement.
The LLM-based method is assessed with the EVALB tool, an implementation of the PARSEVAL evaluation scheme and a standard measure for constituency parsing. With this method, 84.92% of sentences were parsed with a complete match, and the overall bracketing F-measure reached 96.41%.
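The two figures reported for the LLM-based parser, complete match and bracketing F-measure, follow the PARSEVAL scheme of comparing labeled constituent spans. The sketch below shows the per-sentence computation only. The labels `S`, `NP`, `VP`, and `PP` are illustrative and are not drawn from the thesis's 74-tag tagset, and tool-specific details such as punctuation handling and corpus-level aggregation are omitted.

```python
from collections import Counter

def bracket_scores(gold, predicted):
    """Labeled bracketing precision, recall and F1 for one sentence.
    Each tree is a list of (label, start, end) constituents."""
    gold_counts, pred_counts = Counter(gold), Counter(predicted)
    # Multiset intersection: a bracket is matched at most as often as it
    # occurs in both trees.
    matched = sum((gold_counts & pred_counts).values())
    precision = matched / len(predicted) if predicted else 0.0
    recall = matched / len(gold) if gold else 0.0
    f1 = 2 * precision * recall / (precision + recall) if matched else 0.0
    return precision, recall, f1

def complete_match(gold, predicted):
    """True when the predicted brackets equal the gold brackets exactly."""
    return Counter(gold) == Counter(predicted)

gold = [("S", 0, 5), ("NP", 0, 2), ("VP", 2, 5), ("NP", 3, 5)]
pred = [("S", 0, 5), ("NP", 0, 2), ("VP", 2, 5), ("PP", 3, 5)]
p, r, f = bracket_scores(gold, pred)   # 3 of 4 brackets agree
```

One mislabeled span is enough to lose the complete match while leaving the F-measure high, which is why the complete-match rate is typically the lower of the two numbers.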
Semi-Supervised Dust-to-Clean Image Translation Using Regression Minimization and Consistency Regularization
1403
The efficacy of outdoor vision systems in recording images is often compromised by atmospheric elements like dust, leading to challenges in subsequent processing. Dusty images commonly suffer from issues such as reduced contrast, decreased visibility, and color distortion. These issues significantly degrade the quality of the captured images, making them less useful for applications that rely on clear and accurate visual data. Consequently, the elimination of dust, known as dedusting, is crucial as a pre-processing step in many computer vision applications. However, achieving effective dedusting is not straightforward. A significant challenge faced by learning-based dedusting methods is the lack of paired training data. Paired training data, which consists of corresponding dusty and clean image pairs, is essential for supervised learning algorithms to learn the mapping from dusty to clean images. Unfortunately, acquiring such data is often impractical or impossible in real-world scenarios, which can severely impact model performance. To address this challenge, we propose a novel semi-supervised approach for dust-to-clean image translation, termed DR-Net, which emphasizes regression minimization and consistency regularization to improve dusty image quality. In regression minimization, we preserve the structural integrity of dedusted images by training our model on a limited set of synthetic dusty images in a supervised framework. Furthermore, in an unsupervised framework, we employ consistency regularization so that the model produces dust-free images whose distribution is similar to that of real-world clean images and whose dark channel adheres to the statistical characteristics of the dark channel of clean images. Experimental results underscore the effectiveness of our method in yielding high-quality outcomes.
Our approach surpasses the performance of existing methods, demonstrating superior capability in enhancing various computer vision tasks such as object detection, recognition, and tracking in dusty environments. The improvement in image quality not only facilitates better human interpretation but also significantly boosts the performance of automated systems that rely on clear visual data.
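The dark channel referred to in this abstract is the per-pixel minimum over color channels followed by a minimum over a local patch. In clean outdoor images it tends to be close to zero, while haze or dust raises it. The sketch below computes it on a small image stored as nested lists. The patch size and the example pixel values are assumptions, and how DR-Net turns this statistic into a loss term is not described in the abstract.

```python
def dark_channel(image, patch=3):
    """Dark channel of an RGB image given as rows of (r, g, b) tuples in [0, 1].
    `patch` is the side length of the square neighbourhood (odd)."""
    height, width = len(image), len(image[0])
    # Step 1: minimum over the three colour channels at every pixel.
    channel_min = [[min(px) for px in row] for row in image]
    # Step 2: minimum over a patch centred on every pixel (clipped at borders).
    r = patch // 2
    out = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            out[y][x] = min(
                channel_min[yy][xx]
                for yy in range(max(0, y - r), min(height, y + r + 1))
                for xx in range(max(0, x - r), min(width, x + r + 1))
            )
    return out

# A clean patch with saturated colours versus a uniformly dusty patch.
clean = [[(0.9, 0.1, 0.2), (0.8, 0.7, 0.05)],
         [(0.3, 0.6, 0.9), (0.5, 0.5, 0.5)]]
dusty = [[(0.8, 0.75, 0.7)] * 2 for _ in range(2)]
clean_dark = dark_channel(clean)
dusty_dark = dark_channel(dusty)
```

Comparing the two outputs shows the gap a dark-channel consistency term would try to close: the dusty image's dark channel sits well above that of the clean one.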
Using Deep Learning Transformers for Semantic Similarity in a Sorani Kurdish Question Answering System
1403
This research focuses on developing a question-answering system for the Sorani Kurdish language using advanced deep-learning models such as BERT, GPT, and T5. The main objective of this system is to provide accurate and relevant answers to user queries while considering the limitations in processing low-resource languages. A dataset containing 1,000 pairs of questions and answers in Sorani Kurdish was used to evaluate the models. This data was loaded and preprocessed for training and evaluation of the transformer models. The performance of the models was assessed using common metrics such as accuracy, precision, recall, and F1 score. The evaluation results indicate that the BERT model achieved the best performance among the models, with an accuracy of 0.98 and high precision and recall scores. The T5 model ranked second with an accuracy of 0.86 and an F1 score of 0.83, while the GPT model performed significantly weaker and required further optimizations. These findings suggest that transformer models, especially BERT and T5, are more suitable for processing low-resource languages.
Improving Liver Disease Detection Using Oversampling and Network Analysis
1403
Liver diseases represent a significant global health challenge, impacting millions of individuals and leading to morbidity and mortality due to their often asymptomatic nature. The early detection and accurate diagnosis of liver disorders are critical for effective treatment and management, making it imperative to leverage advanced technologies such as machine learning. As healthcare systems increasingly rely on data-driven solutions, employing robust predictive models for liver disease can transform clinical practices, improve patient outcomes, and reduce the burden on healthcare providers. This thesis presents an investigation into the application of machine learning techniques for the detection of liver diseases using the Indian Liver Patient Records dataset, which includes clinical data from 579 patients. The study meticulously preprocesses the data by addressing class imbalance through the ADASYN algorithm, encoding categorical variables with LabelEncoder, and calculating feature correlations using the Spearman method. A graph-based approach was adopted to extract insights from patient features, enabling the creation of enriched data representations that were subsequently used to train various machine learning classifiers, including HistGradientBoostingClassifier, RandomForestClassifier, and AdaBoostClassifier. The findings of this research reveal substantial improvements in predictive accuracy, with the HistGradientBoostingClassifier achieving an impressive accuracy of 98.49%. The model outperformed existing methodologies, demonstrating the effectiveness of advanced feature extraction techniques and robust data preprocessing strategies in enhancing the reliability of predictions for liver disease diagnosis. This study not only highlights the expanding role of machine learning in healthcare but also serves as a validation of the potential benefits of data-driven approaches in disease management. 
Despite the promising results, several limitations are acknowledged in this research. The reliance on a specific dataset may restrict the generalizability of the findings, and the methodologies employed may require validation on diverse datasets to confirm their effectiveness across different populations. Additionally, there is a need for further exploration of deep learning techniques and the integration of multimodal data sources to improve diagnostic accuracy. Future research should aim to address these limitations while continuing to expand the understanding and application of machine learning within the realm of liver disease detection and beyond.
Multi-objective Manifold Representation for Opinion Mining
1403
Sentiment analysis is an essential task in numerous domains, necessitating effective dimensionality reduction and feature extraction techniques. This study introduces Multi-Objective Manifold Representation for Opinion Mining (MOMR), a novel approach that combines deep global and local manifold feature extraction to reduce dimensions while capturing intricate data patterns efficiently. Additionally, incorporating a self-attention mechanism further enhances MOMR's capability to focus on relevant parts of the text, resulting in improved performance in sentiment analysis tasks. MOMR was evaluated against established techniques such as Long Short-Term Memory (LSTM), Naive Bayes (NB), Support Vector Machines (SVM), Recurrent Neural Networks (RNN), and Convolutional Neural Networks (CNN), as well as recent state-of-the-art models, across multiple datasets including IMDB, Fake News, Twitter, and Yelp. Our comparative analysis underscores MOMR's efficacy across these diverse datasets, highlighting its potential and applicability in real-world sentiment analysis applications. On the IMDB dataset, MOMR achieved an accuracy of 99.7% and an F1 score of 99.6%, outperforming other methods such as LSTM, NB, SMSR, and various SVM and CNN models. For the Twitter dataset, MOMR attained an accuracy of 88.0% and an F1 score of 88.0%, surpassing other models, including LSTM, CNN, BiLSTM, Bi-GRU, NB, and RNN. In the Fake News dataset, MOMR demonstrated superior performance with an accuracy of 97.0% and an F1 score of 97.6%, compared to techniques like RF, RNN, BiLSTM+CNN, and NB. For the Yelp dataset, MOMR achieved an accuracy of 80.0% and an F1 score of 80.0%, proving its effectiveness alongside other models such as Bidirectional Encoder Representations from Transformers (BERT), aspect-sentence graph convolutional neural network (ASGCN), Multi-layer Neural Network, LSTM, and bidirectional recurrent convolutional neural network attention (BRCAN).
Self-Representative Factorization for Generalizable Representation Learning
1403
Non-negative matrix factorization (NMF), as a group representation learning model, produces a parts-based representation with interpretable features and can be applied to various problems such as data clustering. Findings show that NMF models with Kullback-Leibler divergence and β-divergence perform promisingly in text clustering and on various data types. However, existing NMF-based text clustering methods rely on a decoder-only model and lack a defined verification framework. Recently, self-representation methods have been applied to a wide range of tasks, enabling models to independently learn and verify representations that fully reflect the complexities and inherent differences in their input data. In this research, we propose two self-representative matrix factorization methods for data clustering, which incorporate semantic information and a graph regularizer, respectively, into the learning process. The semantic encoder-decoder non-negative matrix factorization model based on Kullback-Leibler divergence (SEDNMFk) and the encoder-decoder matrix factorization based on β-divergence (REDNMF-β) integrate the encoder and decoder factorizations into a unified cost function in which the two mutually verify and refine each other, resulting in more distinct clusters. We present efficient and effective optimization algorithms based on multiplicative update rules to solve the unified models of the proposed methods. Experimental results on ten well-known datasets show that the proposed methods outperform other state-of-the-art clustering methods.
Automatic Colorectal Cancer Detection Using Machine Learning and Deep Learning Based on Feature Selection
1403
Colorectal cancer (CRC), accounting for 10% of global cancer cases and being the third most prevalent type, is expected to see a significant increase in the coming years. This surge underscores the need for precise diagnostics. Effective treatment relies on accurate histopathological analysis of hematoxylin and eosin (H&E) stained biopsies, which is critical for recommending minimally invasive treatments. However, manual evaluations of these biopsies are labor-intensive and error-prone due to staining variations and inconsistencies, complicating the tasks of pathologists. To address these challenges, advanced automated image analysis, including deep learning with convolutional neural networks (CNNs) and machine learning (ML) techniques, has significantly enhanced computer-aided diagnosis systems. Consequently, this paper proposes a composite model that combines deep learning and machine learning to improve colorectal cancer diagnosis accuracy. Specifically, the model aims to increase diagnostic precision, reduce complexity and computing demands, and effectively prevent overfitting for reliable performance. The proposed cascaded design includes feature extraction using MobileNetV2 and DenseNet121 via transfer learning (TL), data distribution balancing in the Extended Bioimaging Histopathological Image Segmentation (EBHI-Seg) dataset using the Synthetic Minority Over-sampling Technique (SMOTE), key feature selection using a Chi-square test, classification by machine learning algorithms, and improving classification accuracy through hyperparameter tuning. Finally, the results evaluated on the available EBHI-Seg dataset achieve 97.28% accuracy, 97.29% precision, 97.27% recall, 96.27% F1-score, and 99.4% area under the curve (AUC), demonstrating that the suggested model is superior to other methods already in use.
Hybrid Deep Learning Approach: CNN-ViT Fusion for Breast Cancer Diagnosis in Ultrasound Images
1403
Breast cancer represents one of the leading cancer diagnoses in women around the world. Early detection and accurate classification of breast cancer from medical images are crucial, as they enable timely treatment, which can significantly improve patient outcomes. Ultrasound imaging is a popular diagnostic method in radiology for evaluating breast health. Over the past ten years, deep learning approaches, especially Convolutional Neural Networks (CNNs), have been used to develop comprehensive systems for recognizing image patterns. More recently, the Vision Transformer (ViT) has gained attention as a novel deep learning architecture, largely because of its self-attention mechanisms, which have greatly improved the field of image processing. These models have exhibited strong performance across a wide range of image-related applications. Computer-Aided Diagnosis (CAD) systems in the medical field have increasingly adopted deep learning methodologies, recognized for their superior ability to extract essential features from medical images. This study proposes a hybrid deep learning approach that integrates CNNs with ViTs to enhance breast cancer diagnosis in ultrasound images. This method capitalizes on the beneficial attributes of CNNs and ViTs to boost the accuracy of breast cancer diagnosis. By combining the powerful local feature extraction ability of CNNs with the ViTs' focus on long-range dependencies and global features, the hybrid network optimizes the utilization of information, enabling a more thorough and nuanced interpretation of medical imaging data. The methodology was assessed using two publicly accessible datasets, revealing superior performance compared to current state-of-the-art techniques. This indicates that our method has the potential to generalize across various datasets.
The high accuracy achieved by this hybrid deep learning model suggests that it can play a significant role in improving breast cancer diagnosis.
An Attention-Based Multimodal Deep Learning Method for Image Captioning
1403
Our brains can describe or categorize the images that appear before us. But how can a computer process an image and identify it with a suitable, accurate caption? A few years ago this seemed out of reach, but with advances in computer vision and deep learning algorithms, together with the availability of suitable datasets and AI models, building a caption generator for an image has become easier. Image captioning is also a growing industry worldwide. The captioning process converts an image, given as a series of pixels, into a series of words, and can be viewed as an end-to-end sequence-to-sequence problem. To achieve this, both words and images must be processed. In this thesis, image captioning and its applications in various fields are first explained, and then the evolution of image captioning methods is reviewed. The various methods proposed over time are examined comprehensively. This coherent categorization helps us gain a deeper understanding of existing image captioning techniques. Recent papers in the field are also reviewed, and based on the results of that review, the need for continued research in image captioning is emphasized. Such research can lead to important improvements in existing methods as well as the discovery of new, more advanced ones. This thesis uses an attention-based encoder-decoder method in which, unlike previous methods that applied attention to only one component, the attention mechanism is applied to both the image and the text, which is a new idea in this field. The final caption is generated word by word. The FLICKR8K dataset is used, and the evaluation metrics are BLEU (1, 2, 3, 4), ROUGE, and METEOR, with results of 51, 49, 48, 44, 52, and 37.5, respectively. These results indicate an improvement over previous methods.
A Trust-Based Multi-Objective Approach for Improving the Performance of Recommender Systems
1402
Recommender systems have become an integral and vital part of many online businesses for achieving a better user experience and growth in customers and revenue. The accuracy and diversity of recommendations are important criteria for evaluating recommender system performance. Many different strategies have been developed in the literature to balance accuracy and diversity. However, these methods often focus on a one-size-fits-all trade-off strategy that does not consider each user's specific recommendation situation, which leads to improvement in only individual diversity or aggregate diversity.[1] A recommender system searches for items or information that will be useful to the user based on the user's previous behavior and the features of the items. Recommender systems should also be able to offer the user very specific or personalized items, which is often measured by the diversity criterion. The performance of recommender systems can be evaluated along several dimensions, such as the accuracy of recommendations for each user and the diversity of recommendations across different users. The goal of recommender systems is to help users find relevant information based on their preferences instead of searching through vast amounts of information with search engines. One of the leading topics in recommender systems research is diversity, which is not only a way to address the overfitting problem but also an approach to improving the quality of the user's experience with the recommender system. The importance of diversity lies in its dual purpose: increasing user satisfaction with the recommendations provided and reducing the overfitting problem. In this thesis, by running the ant colony optimization (ACO) search algorithm on a graph of users, we create for each user a list of recommendations with greater diversity and acceptable accuracy. The proposed method addresses the accuracy-diversity trade-off in recommender systems from different perspectives and uses metaheuristic algorithms to provide a new solution that serves our goal of developing new ranking methods that improve diversity without an excessive loss of accuracy. The proposed approach consists of four stages.
In the first stage, for each target user, users are initially filtered to find those similar to the target user. In the second stage, a graph is built from the users selected (filtered) in the previous stage so that the ant colony optimization search algorithm can be run on it. In the third stage, the p users that lie on the paths with the highest pheromone levels are selected as the recommended list. In the final stage, a list of diverse items is recommended to users. Each of these stages is explained in detail in this chapter. To evaluate accuracy, diversity, and novelty, several experiments were carried out on three real-world datasets, and the method's performance was also measured on different groups of users.
Kurdish Text Classification Using an Optimization Algorithm
1402
Today, with the ever-growing volume of information and breadth of topics, text classification is one of the challenges of artificial intelligence. Text classification is a branch of natural language processing in which texts are assigned to categories or groups. It has recently received considerable attention and has many applications, the most important of which include document categorization, information retrieval, question answering, and sentiment polarity detection. Kurdish is a branch of the Indo-Iranian group of Indo-European languages, spoken by more than 30 million people in Western Asia, mainly in Iraq, Turkey, Iran, Syria, Armenia, and Azerbaijan. Kurdish has a variety of dialects and its own rich grammar and vocabulary. Most text classification systems can be summarized in four stages: feature extraction, dimensionality reduction, classifier selection, and evaluation. First, features are extracted from a text (by encoding words) in various ways. Because many of the extracted features are redundant or irrelevant, they can cause classifier errors. Selecting the more important features is therefore a fundamental problem in text classification, and doing so plays a significant role in increasing classification accuracy. At this stage we use machine learning methods to select the best features, working on a Kurdish text dataset. Among the machine learning methods for optimization problems are metaheuristic algorithms. Many metaheuristic algorithms have been introduced to date, each inspired by nature. These algorithms make few assumptions about a problem and can search very large spaces of candidate solutions. The laying chicken algorithm is one of the best metaheuristic algorithms for solving optimization problems in continuous space. Using the laying chicken metaheuristic, the features extracted from the text are selected so that classifier accuracy increases.
To this end, an improved version of this algorithm for discrete space is first presented, and then all feature-selection states are placed in the sample space. By traversing the sample space and evaluating the states point by point, the algorithm moves from one point to a better one. The main challenge is choosing a good starting point and the right range of change for each point. In this research we arrive at a new method that is among the best for improving feature selection in text classification. Moreover, by implementing this method for Kurdish (which is considered a low-resource language in natural language processing), we have enriched our research. At small scale (given limited computing resources), the results show a one percent improvement in classifier accuracy, which indicates the effectiveness of the proposed approach and opens a new door for researchers.
Automatic Generation Control Using Multi-Agent Systems
1388
In this thesis, intelligent controllers are used whose structure incorporates control performance standards so that, in addition to providing proper load-frequency control, they comply with these standards. The results showed that incorporating performance standards in the controller structure improves the controller's performance in meeting control objectives, including reducing settling time and overshoot. In addition to classical algorithms, a controller based on multi-agent systems that takes the performance standards into account was used, both to reduce wear on governor equipment and to comply with NERC performance standards and thereby increase reliability. The results show that controllers that comply with NERC standards perform better and have an improved frequency response.