Quantitative Analysis of Explainable AI (XAI) Visualizations

Our study proposes a quantitative approach for evaluating model explanations in XAI, enhancing transparency and trust in AI systems.

Explainable Artificial Intelligence (XAI) is a field that aims to make AI models transparent and understandable. This is particularly important in areas like agriculture, healthcare, ecology and environmental science, where AI decisions can have a significant impact on safety and ethics. By explaining how complex models, such as deep learning networks, make decisions, XAI helps build user trust and ensures that model decisions align with human expectations.
In image classification, XAI visualizations are extremely valuable. These tools, such as saliency maps and heatmaps, show which parts of an image the model focuses on to make its decision. For example, when identifying bird species, the model might highlight features such as feathers or beaks. This ensures that decisions are based on relevant features rather than irrelevant details.
Comparing XAI visualizations from different models provides insights beyond accuracy alone. While accuracy tells us how often a model gets predictions right, visualizations reveal why the model makes those predictions. Understanding the decision-making process is critical for selecting the best model and ensuring it aligns with human reasoning. Examining these visualizations helps identify models that consistently focus on important features, making them more reliable and less prone to errors or bias.
Qualitative analysis of XAI visualizations has limitations, including subjective interpretation, inconsistency, and poor scalability. Quantitative analysis offers a more objective and systematic alternative: it can be automated, making it possible to evaluate large datasets and complex models, and it provides a transparent, reproducible evaluation of XAI visualizations that supports the development of more effective methods for explaining complex models. This, in turn, leads to more transparent and trustworthy AI systems, which is essential for real-world applications.
Our study introduces a quantitative approach to evaluating XAI models. This involves a three-step process: first, measuring the accuracy of the model; second, assessing whether the model correctly identifies key features influencing its decisions; and finally, combining these findings to evaluate the model's overall performance and explanation quality.
Our study is the first to use a fully quantitative approach for evaluating XAI explanations. This method enhances trust and understanding of models, ensuring they are not only accurate but also understandable and aligned with human reasoning. This clarity is vital for successful real-world application, allowing AI systems to be more widely accepted and effectively used.
Purpose
We developed MATLAB functions to provide a quantitative evaluation framework for XAI methods, focusing in particular on the classification performance of deep learning models and the accuracy of their feature selection. These functions enable researchers to quantitatively evaluate visual explanations generated by the XAI technique LIME (Local Interpretable Model-agnostic Explanations) without requiring extensive coding expertise. The toolbox provides an intuitive interface for performing robust performance evaluations, ensuring consistency, scalability, and reproducibility of results, and includes customizable evaluation metrics, automated analysis features, and comprehensive tutorials to guide users in using its full capabilities for reliable XAI assessments.
Features
Here are the features provided by each function:
1. Train and save the pre-trained model: Train the model using specified settings, evaluate its performance using various performance evaluation metrics, and save the trained model for future use (Train_pre-trained_model.m).
2. LIME Feature Extraction: Extracts and visualizes the most significant features in images, generating both masked and binary masked images that highlight the top n significant features (lime_extract_features.m).
3. Quantitative Analysis: Calculates various quantitative metrics between a binarized image and its corresponding ground truth image (quantitative_evaluation_metrics.m).
4. Overfitting Ratio Calculation: Calculates the overfitting ratio between a target ROI image and an identified ROI image (overfitting_ratio.m).
Instructions for Training and Evaluating a Deep Learning Model
Step 1: Train and Save the Model
Prepare the Data:
  1. Load images from your dataset (e.g., a collection of plant disease images).
  2. Split the data into training and validation sets based on your desired ratio (e.g., 70% training, 30% validation).
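For example, a minimal MATLAB sketch of this step, assuming the images are stored in one subfolder per class (the folder name and the 70/30 split are illustrative):
  % Load images from a folder hierarchy where each subfolder is one class
  imds = imageDatastore('plantDataset', ...
      'IncludeSubfolders', true, 'LabelSource', 'foldernames');
  % Split into 70% training and 30% validation, stratified by label
  [imdsTrain, imdsVal] = splitEachLabel(imds, 0.7, 'randomized');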
Design the Model:
  1. Use a pretrained model as a starting point (e.g., VGG16, ResNet).
  2. Modify the model to fit your specific task by adding or adjusting layers as needed for your classification categories (e.g., adding layers to classify different plant diseases).
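As an illustration, a sketch using ResNet-18 (the layer names below are specific to ResNet-18; check them with analyzeNetwork when using a different pretrained model):
  net = resnet18;                     % requires the ResNet-18 support package
  lgraph = layerGraph(net);
  numClasses = numel(categories(imdsTrain.Labels));
  % Replace the final fully connected and classification layers
  lgraph = replaceLayer(lgraph, 'fc1000', ...
      fullyConnectedLayer(numClasses, 'Name', 'new_fc'));
  lgraph = replaceLayer(lgraph, 'ClassificationLayer_predictions', ...
      classificationLayer('Name', 'new_output'));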
Enhance Model Robustness:
Apply data augmentation techniques, such as image flipping or translating (e.g., random horizontal flips, rotation), to increase dataset size and improve model generalization.
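A sketch of typical augmentation settings (the specific ranges are illustrative):
  inputSize = net.Layers(1).InputSize;  % e.g., [224 224 3] for ResNet-18
  aug = imageDataAugmenter( ...
      'RandXReflection', true, ...      % random horizontal flips
      'RandRotation', [-15 15], ...     % random rotation in degrees
      'RandXTranslation', [-10 10], ...
      'RandYTranslation', [-10 10]);
  % Augment the training set; only resize the validation set
  augTrain = augmentedImageDatastore(inputSize(1:2), imdsTrain, ...
      'DataAugmentation', aug);
  augVal = augmentedImageDatastore(inputSize(1:2), imdsVal);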
Train the Model:
  1. Utilize an appropriate optimizer and configure training settings such as batch size, learning rate, and number of epochs (e.g., Adam optimizer, batch size of 32, learning rate of 0.001, and 25 epochs).
  2. Monitor validation performance to ensure the model is learning effectively.
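A sketch of the corresponding training call, using the example settings above:
  options = trainingOptions('adam', ...
      'MiniBatchSize', 32, ...
      'InitialLearnRate', 1e-3, ...
      'MaxEpochs', 25, ...
      'ValidationData', augVal, ...     % monitor validation performance
      'Shuffle', 'every-epoch', ...
      'Plots', 'training-progress', ...
      'Verbose', false);
  trainedNet = trainNetwork(augTrain, lgraph, options);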
Save the Model:
After training, save your trained model for future use in the desired format (e.g., saving as a .mat file).
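For example (the file name is arbitrary):
  save('trainedNet.mat', 'trainedNet');   % stores the trained network in a .mat file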
Evaluate Performance:
Compute various performance metrics like accuracy, precision, and recall to understand model effectiveness (e.g., accuracy 92%, precision 0.90, recall 0.88).
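A sketch of how these metrics can be computed on the validation set; per-class precision and recall are read off the confusion matrix:
  YPred = classify(trainedNet, augVal);
  YTrue = imdsVal.Labels;
  accuracy = mean(YPred == YTrue);
  C = confusionmat(YTrue, YPred);     % rows: true class, columns: predicted class
  precision = diag(C) ./ sum(C, 1)';  % per-class precision
  recall    = diag(C) ./ sum(C, 2);   % per-class recall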
Step 2: Extract and Visualize Features with an Interpretation Tool
Load the Trained Model:
Access the saved model file (e.g., load the .mat file).
Apply an Interpretation Tool:
Use the XAI visualization technique LIME to identify the key features influencing the model's decisions, as sketched below.
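A minimal sketch of the underlying idea using MATLAB's built-in imageLIME function (available since R2020b); the image file name is illustrative, and this is not necessarily how lime_extract_features.m is implemented:
  load('trainedNet.mat', 'trainedNet');
  img = imread('diseased_leaf.jpg');    % illustrative file name
  img = imresize(img, trainedNet.Layers(1).InputSize(1:2));
  label = classify(trainedNet, img);
  % Per-pixel importance map, plus the feature (superpixel) map and its scores
  [scoreMap, featureMap, featureImportance] = imageLIME(trainedNet, img, label, ...
      'NumFeatures', 64, 'NumSamples', 2048);
  imshow(img); hold on
  imagesc(scoreMap, 'AlphaData', 0.5); colormap jet   % overlay the explanation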
Visualize Features:
  1. Highlight important features in the input data (e.g., highlight key parts of the images showing disease symptoms).
  2. Convert visualizations to a format that emphasizes significance for better understanding (e.g., create a binary mask to show highlighted features).
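One possible way to derive a binary mask and a masked image from the LIME output above, keeping only the top n features (n = 5 is arbitrary):
  n = 5;                                       % number of top features to keep
  [~, topIdx] = maxk(featureImportance, n);
  binaryMask = ismember(featureMap, topIdx);   % true inside the top-n features
  % Masked color image: zero out everything outside the selected features
  maskedImg = img .* uint8(repmat(binaryMask, [1 1 size(img, 3)]));
  figure, imshow(binaryMask)
  figure, imshow(maskedImg)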
Assess Model Behavior:
Analyze identified features to gain insights into how the model makes predictions (e.g., understanding which features contribute to identifying a specific disease).
Step 3: Perform Quantitative Analysis
Compare Features with Ground Truth:
Use binary or masked images to compare identified features to actual reference data (e.g., compare LIME-generated features to annotated disease segments).
Employ Quantitative Metrics:
Utilize metrics like overlap coefficients, precision, and recall to assess feature matching (e.g., Intersection over Union of 0.85).
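A sketch of this comparison, assuming the LIME binary mask and the annotated ground-truth mask are available as image files (the file names are illustrative, and this is not necessarily the exact metric set computed by quantitative_evaluation_metrics.m):
  pred = imbinarize(im2gray(imread('lime_binary_mask.png')));
  gt   = imbinarize(im2gray(imread('ground_truth_mask.png')));
  tp = nnz(pred & gt);     % pixels correctly identified as relevant
  fp = nnz(pred & ~gt);    % identified pixels not in the ground truth
  fn = nnz(~pred & gt);    % ground-truth pixels that were missed
  iou       = tp / (tp + fp + fn);       % Intersection over Union
  diceCoeff = 2*tp / (2*tp + fp + fn);   % Dice coefficient
  precision = tp / (tp + fp);
  recall    = tp / (tp + fn);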
Evaluate Model Performance:
Analyze how well the model’s selected features correspond to ground truth to gauge accuracy (e.g., high correspondence indicates accurate feature recognition).
Step 4: Calculate Overfitting Ratio
Define Overfitting:
Recognize potential model limitations, such as reliance on irrelevant features, despite achieving high accuracy on both training and test sets (e.g., training accuracy of 98% and testing accuracy of 95%, but the model focuses on background elements instead of relevant object features).
Quantify Overfitting:
  1. Calculate the ratio comparing irrelevant focus areas to the actual target areas in your data (e.g., measure how much the model focuses on non-relevant parts of the plant images).
  2. Use this measure to understand the extent of the model's overfitting (e.g., a high overfitting ratio could indicate over-reliance on non-diseased image areas).
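The exact definition used by overfitting_ratio.m is given in that function; one plausible formulation of the idea described above, relating the identified focus that falls outside the target region to the target area (file names are illustrative), is:
  target     = imbinarize(im2gray(imread('target_roi.png')));      % annotated ROI
  identified = imbinarize(im2gray(imread('identified_roi.png')));  % ROI highlighted by LIME
  irrelevant = identified & ~target;           % focus outside the target region
  overfittingRatio = nnz(irrelevant) / nnz(target);
  % A larger ratio suggests heavier reliance on regions outside the relevant area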
Screenshots:
LIME Feature Extraction:
The lime_extract_features.m function can be used to extract and visualize the most significant features in images, generating both masked and binary masked images that highlight the top n significant features.
Screenshot of the UI before selecting the options in the LIME explanations panel:
Screenshot of the UI after selecting the options and submission in the LIME explanations panel:
Quantitative Analysis:
The quantitative_evaluation_metrics.m function calculates various quantitative metrics between a binarized image and its corresponding ground truth image.
Screenshot of the UI before selecting the images in the quantitative analysis panel:
Screenshot of the UI displaying the results after selecting the images in the quantitative analysis panel:
Overfitting Ratio Calculation:
The overfitting_ratio.m function calculates the overfitting ratio between a target ROI image and an identified ROI image.
Screenshot of the UI before selecting the images for the calculation of the overfitting ratio panel:
Screenshot of the UI displaying the results after selecting the images for the calculation of the overfitting ratio panel:

Cite As

Simhadri, C. G., & Kondaveeti, H. K. (2024). Quantitative Analysis of Explainable AI (XAI) Visualizations (https://www.mathworks.com/matlabcentral/fileexchange/<...>), MATLAB Central File Exchange. Retrieved September 26, 2024.

MATLAB Release Compatibility
Created with R2021a
Compatible with R2021a to R2024b
Platform Compatibility
Windows macOS Linux

Version History
1.0.1: Train and save the pre-trained model: Train the model using specified settings, evaluate its performance using various performance evaluation metrics, and save the trained model for future use (Train_pre-trained_model.m).
1.0.0