Amazon AI Fairness and Explainability with Amazon SageMaker Clarify

Introduction

In the rapidly evolving domain of machine learning, ensuring fairness and explainability in model predictions has become crucial. With Amazon SageMaker Clarify, these critical aspects are not just an afterthought but integral components of the model development and deployment process. This article delves into the world of SageMaker Clarify, offering a comprehensive guide to its capabilities and practical applications.

We commence our journey with a high-level understanding of what SageMaker Clarify is and its importance in the day-to-day tasks of machine learning modeling. Our exploration is anchored in a hands-on example, utilizing a specially crafted dataset that simulates loan approval scenarios in the Philippines. This dataset, designed to exhibit certain biases, serves as a perfect canvas to demonstrate the prowess of SageMaker Clarify in identifying and addressing fairness issues in machine learning models.

As we navigate through the intricate paths of machine learning model development, we’ll be using AWS’s Python SDK, closely following the documentation with some adaptations to suit our unique dataset. Our focus will be on a range of critical topics, from the prerequisites of using SageMaker Clarify to the training of an XGBoost model. We’ll then delve into how SageMaker Clarify helps in detecting bias in the model predictions and explains these predictions in a transparent and understandable manner.

Join us as we embark on this enlightening journey to master SageMaker Clarify, and arm ourselves with the knowledge and tools to build not only effective but also fair and explainable machine learning models.

What is SageMaker Clarify?

Amazon SageMaker Clarify is a powerful tool designed to bring transparency and fairness into the realm of machine learning. In a world where AI-driven decisions increasingly impact every aspect of our lives, SageMaker Clarify stands as a beacon of accountability and understanding. It serves as a crucial component in the Amazon SageMaker suite, ensuring that machine learning models are not only efficient but also equitable and interpretable.

Core Functions

  1. Bias Detection and Mitigation: SageMaker Clarify addresses a fundamental concern in machine learning – bias. It provides tools to detect and quantify biases that might exist in your data and models. This feature is vital, especially when dealing with sensitive attributes like gender, ethnicity, or age. By analyzing these attributes, SageMaker Clarify helps in identifying potential biases that could skew decision-making processes, ensuring that models treat all individuals fairly.
  2. Model Explainability: Understanding why a model makes a certain prediction can be as crucial as the prediction itself. SageMaker Clarify offers insights into the “why” and “how” of model decisions. This transparency is invaluable, particularly in scenarios where explanations are required for compliance or to build trust with end-users. It breaks down the prediction outcomes, providing a clear understanding of the contributing factors, thus demystifying the often opaque nature of machine learning algorithms.

Integrating with Your Machine Learning Workflow

SageMaker Clarify seamlessly integrates into your existing AWS machine learning workflow. Whether you’re starting from scratch or have a pre-existing model, Clarify can be incorporated at various stages – from the initial data preparation phase to post-deployment. This flexibility allows for continuous monitoring and improvement of your models, ensuring they remain fair and understandable throughout their lifecycle.

Why SageMaker Clarify Matters

In our case study, we’ll be using an artificial dataset simulating loan approvals in the Philippines. This dataset, purposefully designed to exhibit biases, is an ideal testbed for demonstrating the capabilities of SageMaker Clarify. Through this example, we will witness firsthand how Clarify detects biases in the dataset and in the machine learning model. This practical application not only underscores the importance of fairness in AI but also showcases the ease with which SageMaker Clarify can be integrated into everyday machine learning tasks.

In conclusion, SageMaker Clarify is not just a tool; it’s a commitment to responsible AI. By ensuring fairness and explainability, it empowers developers and businesses to create machine learning models that are not only high-performing but also equitable and transparent, fostering trust and reliability in AI-driven decisions.

Prerequisites and Data

Importing Libraries

The first step in our journey involves setting up the Python environment with the necessary libraries. This setup ensures that all tools required for data manipulation, machine learning, and interaction with AWS services are readily available. The following libraries form the foundation of our work:

  • pandas and numpy for data manipulation and numerical operations.
  • os and boto3 for operating system and AWS SDK operations.
  • datetime for handling date and time data.
  • SageMaker-specific imports like session and get_execution_role for managing SageMaker sessions and roles.
  • train_test_split from scikit-learn for splitting the dataset into training and testing sets.

import pandas as pd
import numpy as np
import os
import boto3
from datetime import datetime
from sagemaker import session, get_execution_role
from sklearn.model_selection import train_test_split

Initializing Configurations

Setting up the SageMaker session and defining the role is crucial for integrating our local environment with AWS services. This step allows us to interact seamlessly with SageMaker and other AWS services throughout our project.

# Initialize SageMaker session
sagemaker_session = session.Session()

region = sagemaker_session.boto_region_name
print(f"Region: {region}")

# Define role based on your environment:
# role = "arn:aws:iam::123123123:role/service-role/AmazonSageMaker-ExecutionRole-123123123"
# or, if using SageMaker Studio:
role = get_execution_role()
print(f"Role: {role}")

Downloading the Data

We’ll be using a pre-prepared dataset that represents loan applications in the Philippines. This dataset is specifically designed to showcase potential biases and will serve as the foundation for our analysis with SageMaker Clarify. You can download this dataset via this link.
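
If you are following along locally, loading the dataset is a single pandas call. A minimal sketch, assuming the file was saved as loan_data.csv (adjust the filename to match your download):

# Load the loan dataset into a DataFrame (the filename here is an assumption)
df = pd.read_csv("loan_data.csv")
print(df.shape)
df.head()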

Preprocessing

Preprocessing involves normalizing numerical features and encoding categorical ones, preparing the dataset for machine learning models.

Scaling the numerical features:

# Scale the numerical features
from sklearn.preprocessing import StandardScaler
numerical_features = ["monthly_income", "credit_score", "employment_years", "age", "debt_to_income", "other_obligations"]
scaler = StandardScaler()
scaled_features = scaler.fit_transform(df[numerical_features])
scaled_features_df = pd.DataFrame(scaled_features, index=df.index, columns=numerical_features)
df = df.drop(columns=numerical_features, axis=1)
df = pd.concat([df, scaled_features_df], axis=1)

Splitting the dataset:

training_data, testing_data = train_test_split(df, test_size=0.2, random_state=0)

Encoding categorical columns:

from sklearn import preprocessing

def number_encode_features(df):
    result = df.copy()
    encoders = {}
    for column in result.columns:
        if result.dtypes[column] == object:
            encoders[column] = preprocessing.LabelEncoder()
            result[column] = encoders[column].fit_transform(result[column].fillna("None"))
    return result, encoders

training_data = pd.concat([training_data["loan_approved"], training_data.drop(["loan_approved"], axis=1)], axis=1)
training_data, _ = number_encode_features(training_data)
training_data.to_csv("train_data.csv", index=False, header=False)

testing_data, _ = number_encode_features(testing_data)
test_features = testing_data.drop(["loan_approved"], axis=1)
test_target = testing_data["loan_approved"]
test_features.to_csv("test_features.csv", index=False, header=False)

Data Definition

A thorough understanding of our dataset is critical for identifying and addressing potential biases. It includes:

  • Monthly Income: A numerical feature representing the applicant’s income, a key factor in loan decisions.
  • Credit Score: Indicates the creditworthiness of an applicant, crucial for loan approvals.
  • Employment Years: Represents the duration of employment, potentially influencing loan decisions.
  • Debt-to-Income Ratio and Other Obligations: Assess financial stability and repayment capacity.
  • Gender: A sensitive attribute that could be a basis for gender bias in loan decisions.
  • Ethnicity: Reflects the diverse cultural backgrounds in the Philippines, a potential ground for ethnic bias.
  • Age: Ranges from 18 to 70, and could influence decisions, leading to age discrimination.

The target variable is the loan approval status, which we will analyze for bias using SageMaker Clarify. By understanding these features, we can better comprehend how a model might develop biases and work proactively towards creating a more equitable machine learning solution.
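
Before involving Clarify, a quick informal check can already hint at raw disparities in the labels. A minimal sketch, assuming the raw (un-encoded) gender and loan_approved columns are still present in df:

# Share of approved/denied applications per gender group
print(df.groupby("gender")["loan_approved"].value_counts(normalize=True))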

Model Training

In this section, we will go through the process of training an XGBoost model using our prepared dataset.

Putting Data into S3

Before training, we need to upload our dataset to Amazon S3, AWS’s scalable storage service. This process ensures that our data is accessible to the SageMaker training job.

from sagemaker.s3 import S3Uploader
from sagemaker.inputs import TrainingInput

bucket = "your-s3-bucket-name"
prefix = "sagemaker-clarify-article/philippines-loan"

# Upload training and testing data to S3
train_uri = S3Uploader.upload("train_data.csv", f"s3://{bucket}/{prefix}")
train_input = TrainingInput(train_uri, content_type="csv")
test_uri = S3Uploader.upload("test_features.csv", f"s3://{bucket}/{prefix}")

Training an XGBoost Model

XGBoost is a popular and efficient open-source implementation of gradient-boosted trees, renowned for its performance and speed. In this step, we’ll configure and initiate the training of an XGBoost model on our dataset.

from sagemaker.image_uris import retrieve
from sagemaker.estimator import Estimator

# Retrieve the XGBoost image
xgboost_image_uri = retrieve("xgboost", region, version="1.5-1")

# Configure the XGBoost model
xgb = Estimator(
    xgboost_image_uri,
    role,
    instance_count=1,
    instance_type="ml.m5.xlarge",
    disable_profiler=True,
    sagemaker_session=sagemaker_session,
)

# Set hyperparameters for XGBoost
xgb.set_hyperparameters(
    max_depth=5,
    eta=0.2,
    gamma=4,
    min_child_weight=6,
    subsample=0.8,
    objective="binary:logistic",
    num_round=800,
)

# Start the training job
xgb.fit({"train": train_input}, logs=False)

Create a SageMaker Model

Once the training is complete, the next step is to create a SageMaker model. This model will be used for making predictions and will also be the subject of our fairness and explainability analysis with SageMaker Clarify.

model_name = "DEMO-clarify-model-{}".format(datetime.now().strftime("%d-%m-%Y-%H-%M-%S"))

# Create a SageMaker model
model = xgb.create_model(name=model_name)
container_def = model.prepare_container_def()
sagemaker_session.create_model(model_name, role, container_def)

In this section, we have successfully uploaded our data to S3, trained an XGBoost model, and created a SageMaker model. These steps lay the groundwork for the subsequent stages where we will use SageMaker Clarify to detect bias and explain predictions made by our model.

Amazon SageMaker Clarify

Detecting Bias

Detecting and addressing bias is a pivotal aspect of responsible AI practices. In this section, we explore how Amazon SageMaker Clarify helps in identifying and mitigating biases in machine learning models.

Understanding Bias in Machine Learning

Bias in machine learning refers to the unfair and prejudicial treatment of certain groups based on their characteristics, like gender or ethnicity. This unfair treatment often stems from the data the model is trained on or the way the model processes data. Biases can significantly impact individuals and communities, leading to skewed and unjust outcomes. Therefore, it’s crucial to detect and mitigate these biases to ensure fairness and equity in AI-driven decisions.

SageMaker Clarify for Bias Detection

SageMaker Clarify provides tools to detect both pre-training and post-training biases using a variety of metrics. Pre-training bias arises from the training data itself, while post-training bias may develop during the model’s learning process.

Initializing Clarify

To start with, we initialize a SageMakerClarifyProcessor, which will compute the bias metrics and model explanations:

from sagemaker import clarify

clarify_processor = clarify.SageMakerClarifyProcessor(
    role=role, 
    instance_count=1, 
    instance_type="ml.m5.xlarge", 
    sagemaker_session=sagemaker_session
)

DataConfig: Setting Up Data for Bias Analysis

DataConfig informs SageMaker Clarify about the data used for the bias analysis:

bias_report_output_path = f"s3://{bucket}/{prefix}/clarify-bias"
bias_data_config = clarify.DataConfig(
    s3_data_input_path=train_uri,
    s3_output_path=bias_report_output_path,
    label="loan_approved",
    headers=training_data.columns.to_list(),
    dataset_type="text/csv",
)

This configuration specifies the S3 paths for input data and output reports, the target label, column headers, and the dataset type.

ModelConfig and ModelPredictedLabelConfig: Configuring the Model

ModelConfig defines the trained model details:

model_config = clarify.ModelConfig(
    model_name=model_name,
    instance_type="ml.m5.xlarge",
    instance_count=1,
    accept_type="text/csv",
    content_type="text/csv",
)

ModelPredictedLabelConfig sets up how SageMaker Clarify interprets the model’s predictions. Here, a prediction is treated as positive (loan approved) only when the predicted probability exceeds 0.8:

predictions_config = clarify.ModelPredictedLabelConfig(probability_threshold=0.8)

BiasConfig: Specifying Bias Parameters

BiasConfig is used to specify parameters for bias detection:

bias_config = clarify.BiasConfig(
    label_values_or_threshold=[1], 
    facet_name="gender", 
    facet_values_or_threshold=[0], 
    group_name="age"
)

In our example, we focus on gender as the sensitive attribute and age as the subgroup for measuring bias.

Pre-training vs Post-training Bias

In our scenario, pre-training bias would relate to any inherent biases in the dataset, such as disproportionate representation of certain genders or ethnicities. Post-training bias would concern biases that the model may develop as it learns from this data, potentially exacerbating or creating new biases.

Running Bias Report Processing

Finally, we run the bias analysis using SageMaker Clarify:

clarify_processor.run_bias(
    data_config=bias_data_config,
    bias_config=bias_config,
    model_config=model_config,
    model_predicted_label_config=predictions_config,
    pre_training_methods="all",
    post_training_methods="all",
)

This process comprehensively examines both pre-training and post-training biases, offering insights into areas where the model might be exhibiting unfair biases. By addressing these biases, we can work towards more fair and equitable AI systems.

Viewing the Bias Report

Accessing the Report

After running the SageMaker Clarify analysis, you can view the results of the bias report. If you are following the demo locally, you can access the report by navigating to the output of the following command:

print(bias_report_output_path)

You can then download the report from this path and view it. If you are following the demo using SageMaker Studio, the results can be viewed directly in the “Experiments” tab.
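
You can also pull the report files down programmatically. A minimal sketch, assuming the default filenames Clarify writes to the output path (these may vary by SDK version):

from sagemaker.s3 import S3Downloader

# Download the generated HTML bias report to the current directory
S3Downloader.download(f"{bias_report_output_path}/report.html", ".")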

Report Overview

The Amazon SageMaker Clarify bias report is comprehensive and structured into several sections:

  • Analysis Configuration: This section details the configuration used for the bias analysis, including the outcome label column, the facet (the attribute of interest for bias analysis), and an optional group variable.
  • High-Level Model Performance: This part of the report provides metrics on the model’s performance, such as accuracy, true positive rate (recall), and false positive rate.
  • Pre-training Bias Metrics: These metrics measure imbalances in the representation of the facet values (e.g., gender) in the training data. Metrics like Conditional Demographic Disparity in Labels (CDDL), Class Imbalance (CI), and Difference in Proportions of Labels (DPL) offer insight into how balanced or imbalanced the training data is with respect to the facet (see the short sketch after this list).
  • Post-training Bias Metrics: This section measures imbalances in the model’s predictions across different inputs. Metrics like Accuracy Difference (AD), Conditional Demographic Disparity in Predicted Labels (CDDPL), and Disparate Impact (DI) help determine whether the model’s predictions are fair across the groups defined by the facet (e.g., gender).
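
To build intuition for what these metrics capture, here is a rough sketch of one pre-training metric, Class Imbalance (CI), computed directly with pandas on the encoded training data. This is an illustration of the formula, not a replacement for the Clarify report:

# Class Imbalance (CI) for a binary "gender" facet: (n_a - n_d) / (n_a + n_d),
# where n_a and n_d are the sizes of the larger and smaller groups.
counts = training_data["gender"].value_counts()
n_a, n_d = counts.max(), counts.min()
ci = (n_a - n_d) / (n_a + n_d)
print(f"Class Imbalance (CI): {ci:.3f}")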


Each of these sections provides valuable insights into different aspects of bias in the machine learning model, allowing for a comprehensive understanding of where biases might exist and how they manifest in both the data and the model’s predictions.

You can check the whole bias report in this link.

Explaining Predictions with Kernel SHAP


In the realm of machine learning, especially in applications with significant social impacts like loan approvals, understanding the ‘why’ behind a model’s decision is as important as the decision itself. Amazon SageMaker Clarify employs Kernel SHAP (SHapley Additive exPlanations) to elucidate the contribution of each input feature to the final decision. This method, grounded in cooperative game theory, offers a way to interpret complex model predictions by assigning each feature an importance value for a particular prediction.
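
At its core, SHAP expresses a model prediction as an additive sum of per-feature contributions. For reference, the standard additive form is:

f(x) \approx \phi_0 + \sum_{i=1}^{M} \phi_i

where \phi_0 is the baseline (expected) prediction, M is the number of features, and \phi_i is the contribution of feature i. Kernel SHAP estimates the \phi_i values by evaluating the model on perturbed copies of the input.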

For running the run_explainability API call, SageMaker Clarify requires configurations similar to those used for bias detection, including DataConfig and ModelConfig. Additionally, SHAPConfig is introduced specifically for the Kernel SHAP algorithm.

In our demonstration, we configure SHAPConfig with the following parameters:

  • baseline: The Kernel SHAP algorithm requires a baseline (also called a background dataset) as a reference point. This baseline can either be provided explicitly or computed automatically from the data using methods like K-means. In our case, we provide it explicitly, using the mean of the training features as the baseline. The choice of baseline matters because SHAP values measure each feature’s contribution relative to it.
  • num_samples: The number of synthetic data samples used to compute SHAP values. This choice trades off computational cost against the fidelity of the explanations.
  • agg_method: The method used to aggregate global SHAP values. We use “mean_abs”, which takes the mean of the absolute SHAP values across all instances, giving a measure of each feature’s overall impact.
  • save_local_shap_values: When set to True, the local (per-instance) SHAP values are saved to the output location, allowing detailed inspection of feature contributions for individual predictions.

Explainability Report Configuration

explainability_output_path = f"s3://{bucket}/{prefix}/clarify-explainability"
explainability_data_config = clarify.DataConfig(
    s3_data_input_path=train_uri,
    s3_output_path=explainability_output_path,
    label="loan_approved",
    headers=training_data.columns.to_list(),
    dataset_type="text/csv",
)

# Use the mean of the training features as the baseline (iloc[1:] skips the label column, which comes first)
baseline = [training_data.mean().iloc[1:].values.tolist()]
shap_config = clarify.SHAPConfig(
    baseline=baseline,
    num_samples=15,
    agg_method="mean_abs",
    save_local_shap_values=True,
)

Running Explainability Report Processing

The actual execution of the explainability analysis involves running the run_explainability method, which takes about 10-15 minutes:

clarify_processor.run_explainability(
    data_config=explainability_data_config,
    model_config=model_config,
    explainability_config=shap_config,
)

Viewing the Explainability Report

The Explainability Report generated by SageMaker Clarify offers an in-depth look at how different features influenced the model’s predictions. The report includes:

  1. Model Explanations: This section provides SHAP explanations for individual labels, detailing the contribution of each of the 8 features in the model.
  2. Visualization of SHAP Values: The report includes a chart where each point represents an individual instance. The x-axis indicates the SHAP value for a specific instance and feature, while the red-blue color scale denotes the feature value itself, with red indicating higher values and blue indicating lower values. A sketch of how to inspect the saved local SHAP values follows below.
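
Because we set save_local_shap_values=True, the per-instance SHAP values are also written to the output location. A minimal sketch for inspecting them, assuming the default Clarify output layout (the exact file path may vary by version):

from sagemaker.s3 import S3Downloader

# Download and preview the saved local (per-instance) SHAP values
S3Downloader.download(f"{explainability_output_path}/explanations_shap/out.csv", ".")
local_shap = pd.read_csv("out.csv")
print(local_shap.head())  # one row of per-feature SHAP contributions per instance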


This detailed breakdown enables a deeper understanding of the model’s decision-making process, highlighting the factors that are most influential in predictions. Such transparency is crucial not only for regulatory compliance but also for building trust in machine learning systems among users and stakeholders.

You can check the whole explainability report in this link.

Wrapping Up

Embracing Fairness and Explainability in Machine Learning

As we conclude our exploration of Amazon SageMaker Clarify, it’s clear that this tool is pivotal in fostering fairness and transparency in machine learning models. Through our journey, from setting up our environment to training an XGBoost model and using SageMaker Clarify, we’ve seen firsthand the impact and necessity of these tools in contemporary machine learning practices.

Key Takeaways

  1. Detecting Bias: We learned how SageMaker Clarify aids in detecting both pre-training and post-training biases. By analyzing our loan approval dataset, Clarify illuminated biases that could lead to unfair treatment of individuals based on sensitive attributes like gender, ethnicity, or age.
  2. Explaining Predictions: With Kernel SHAP, SageMaker Clarify provided valuable insights into the contribution of each feature to the model’s predictions. This level of explainability is not just a technical requirement but a step towards ethical AI, ensuring that stakeholders understand and trust the decisions made by the models.
  3. Practical Application: The use of an artificial dataset reflecting real-world scenarios demonstrated the practical application of these concepts. This hands-on example illustrated how seemingly neutral models could unintentionally perpetuate biases if not carefully examined and corrected.

Moving Forward

As machine learning continues to evolve and integrate more deeply into various sectors, the importance of tools like SageMaker Clarify cannot be overstated. They are essential for building models that not only perform well but also align with our ethical standards and societal values. The journey towards responsible AI is ongoing, and SageMaker Clarify is a powerful ally in this endeavor.

Final Thoughts

We encourage practitioners in the field of machine learning and data science to leverage SageMaker Clarify in their projects. By doing so, we can collectively work towards more equitable and transparent AI systems. Remember, the goal is not just to create intelligent machines but to ensure that these machines make decisions that are fair, understandable, and accountable.

Resources:

https://docs.aws.amazon.com/en_us/sagemaker/latest/dg/clarify-model-explainability.html

https://sagemaker-examples.readthedocs.io/en/latest/sagemaker-clarify/fairness_and_explainability/fairness_and_explainability.html#Prerequisites-and-Data

https://sagemaker.readthedocs.io/en/stable/api/training/processing.html#sagemaker-clarify

Written by: John Patrick Laurel

Pats is the Head of Data Science at a European short-stay real estate business group. He boasts a diverse skill set in the realm of data and AI, encompassing Machine Learning Engineering, Data Engineering, and Analytics. Additionally, he serves as a Data Science Mentor at Eskwelabs. Outside of work, he enjoys taking long walks and reading.
