
​MARKET BASKET ANALYSIS​
Cross-Sell & Bundle Optimisation

​

Project Overview

Understanding how customers purchase products together is one of the most effective ways to increase revenue without increasing traffic. This project applies market basket analysis to e-commerce transaction data to uncover product relationships, cross-sell opportunities, and high-value product bundles designed to increase Average Order Value (AOV).

The analysis follows an end-to-end analytics workflow using Python, SQL, and Tableau, mirroring how basket analysis and recommendation logic are implemented in real commercial and retail analytics environments.

​

Business Problem

E-commerce businesses often treat transactions as isolated purchases, missing opportunities to:

​

  • Recommend complementary products
     

  • Create data-driven product bundles
     

  • Optimise merchandising and checkout experiences
     

The key challenge addressed in this project was:

Which products are frequently purchased together, and how can these patterns be used to drive cross-sell recommendations and bundled offers that increase Average Order Value?

​

Data & Scope

The analysis is based on a large transactional e-commerce dataset containing hundreds of thousands of order line items, where each row represents a product purchased within an invoice.

To ensure analytical accuracy and business relevance:

​

  • Cancelled invoices and returns were excluded
     

  • Non-merchandise transactions (postage, manual adjustments, charges) were removed
     

  • Only completed customer purchases were analysed
     

  • Basket size was defined as the number of distinct products per order
     

This created a clean foundation for reliable basket-level and product-association analysis.
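The scope rules above can be illustrated with a minimal sketch. The field names, the "C" prefix used to mark cancelled invoices, and the non-merchandise codes are all assumptions made for illustration; the source dataset's actual schema is not described here.

```python
# Hypothetical line items: field names and marker values are assumptions,
# not the project's actual schema.
rows = [
    {"invoice": "1001", "stock_code": "A1", "quantity": 2, "unit_price": 3.50},
    {"invoice": "1001", "stock_code": "B2", "quantity": 1, "unit_price": 10.00},
    {"invoice": "1001", "stock_code": "A1", "quantity": 1, "unit_price": 3.50},
    {"invoice": "1001", "stock_code": "POST", "quantity": 1, "unit_price": 4.95},
    {"invoice": "C1002", "stock_code": "A1", "quantity": -2, "unit_price": 3.50},
    {"invoice": "1003", "stock_code": "C3", "quantity": -1, "unit_price": 7.25},
    {"invoice": "1003", "stock_code": "B2", "quantity": 4, "unit_price": 10.00},
]

# Assumed codes for postage, manual adjustments, and charges.
NON_MERCHANDISE = {"POST", "MANUAL", "FEE"}


def is_completed_purchase(row):
    """Keep only merchandise lines from completed (non-cancelled) orders."""
    if row["invoice"].startswith("C"):  # assumed cancellation marker
        return False
    if row["quantity"] <= 0:  # negative or zero quantity treated as a return
        return False
    if row["stock_code"] in NON_MERCHANDISE:
        return False
    return True


clean_rows = [r for r in rows if is_completed_purchase(r)]

# Basket size = number of DISTINCT products per order, so the repeated
# "A1" line on invoice 1001 is counted once.
products_by_invoice = {}
for r in clean_rows:
    products_by_invoice.setdefault(r["invoice"], set()).add(r["stock_code"])

basket_size = {inv: len(products) for inv, products in products_by_invoice.items()}
print(basket_size)
```

Using a set per invoice keeps the basket-size definition honest even when the same product appears on several lines of one order.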

​

Python: Data Preparation & Feature Engineering

Python was used as the primary tool for data ingestion, validation, and transformation.

Key steps included:

​

  • Loading raw transaction data and validating structure, volume, and date ranges
     

  • Cleaning and standardising product descriptions and identifiers
     

  • Removing cancellations, returns, and non-product records
     

  • Engineering core metrics such as:
     

    • Line-level revenue
       

    • Basket size per invoice
       

    • Order-level revenue
       

    • Time-based attributes for filtering and segmentation
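As a rough illustration of the engineered metrics listed above, the following sketch derives line-level revenue, order-level revenue, basket size, and two time attributes using only the standard library. The field names, the date format, and the sample values are hypothetical.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical cleaned line items (field names and date format are assumptions).
lines = [
    {"invoice": "1001", "stock_code": "A1", "quantity": 2, "unit_price": 3.50,
     "invoice_date": "2024-03-15 10:30"},
    {"invoice": "1001", "stock_code": "B2", "quantity": 1, "unit_price": 10.00,
     "invoice_date": "2024-03-15 10:30"},
    {"invoice": "1002", "stock_code": "C3", "quantity": 3, "unit_price": 2.00,
     "invoice_date": "2024-04-02 16:05"},
]

orders = defaultdict(lambda: {"revenue": 0.0, "products": set()})
for line in lines:
    # Line-level revenue.
    line["revenue"] = line["quantity"] * line["unit_price"]

    placed_at = datetime.strptime(line["invoice_date"], "%Y-%m-%d %H:%M")
    order = orders[line["invoice"]]
    order["revenue"] += line["revenue"]        # order-level revenue
    order["products"].add(line["stock_code"])  # distinct products in the order
    order["month"] = placed_at.strftime("%Y-%m")  # time attributes for filtering
    order["hour"] = placed_at.hour

order_metrics = {
    invoice: {
        "revenue": round(o["revenue"], 2),
        "basket_size": len(o["products"]),
        "month": o["month"],
        "hour": o["hour"],
    }
    for invoice, o in orders.items()
}
print(order_metrics)
```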
       

Python was also used to:

​

  • Construct invoice-level baskets from transaction-level data
     

  • Generate unique product combinations within each order
     

  • Prepare analysis-ready datasets for downstream SQL and Tableau usage
     

This step ensured the analysis was reproducible, scalable, and aligned with best practices in data preparation.
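One way to build invoice-level baskets and generate unique product combinations is sketched below. It is a minimal example under stated assumptions: the (invoice, product) tuples are hypothetical, and only pairs are generated, although the same pattern extends to larger combinations.

```python
from collections import Counter
from itertools import combinations

# Hypothetical cleaned (invoice, product) line items.
clean_lines = [
    ("1001", "A1"), ("1001", "B2"), ("1001", "A1"), ("1001", "C3"),
    ("1002", "A1"), ("1002", "B2"),
    ("1003", "C3"),
]

# One set of distinct products per invoice.
baskets = {}
for invoice, product in clean_lines:
    baskets.setdefault(invoice, set()).add(product)

# Count each unordered pair once per order. Sorting the products first
# gives every pair a canonical order, so (A1, B2) and (B2, A1) never
# appear as separate keys.
pair_counts = Counter()
for products in baskets.values():
    for pair in combinations(sorted(products), 2):
        pair_counts[pair] += 1

print(pair_counts.most_common())
```

Single-item orders such as invoice 1003 contribute no pairs, but they still count towards the total number of orders when association metrics are calculated later.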

​

SQL: Aggregation & Association Metrics

SQL was used to formalise analytical logic and calculate association metrics in a structured, queryable way.

Using SQL, the project:

​

  • Aggregated transaction data to the order level
     

  • Calculated basket-level metrics such as total orders, average basket size, and Average Order Value
     

  • Computed product-association metrics, including:
     

    • Support (how often a product pair appears across all orders)
       

    • Confidence (likelihood of purchasing Product B given Product A)
       

    • Lift (how much more often a product pair is purchased together than would be expected if the two products were bought independently)
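The project calculated these metrics in SQL; the sketch below expresses the same standard definitions in Python purely for illustration. The baskets and product names are hypothetical, and the helper function `association` is not part of the original project.

```python
from collections import Counter
from itertools import combinations

# Hypothetical baskets: one set of distinct products per order.
baskets = [
    {"tea", "teapot"},
    {"tea", "teapot", "mug"},
    {"tea", "mug"},
    {"mug"},
]

n_orders = len(baskets)
item_counts = Counter(item for basket in baskets for item in basket)
pair_counts = Counter(
    pair for basket in baskets for pair in combinations(sorted(basket), 2)
)


def association(a, b):
    """Return (support, confidence, lift) for the rule 'a then b'."""
    together = pair_counts[tuple(sorted((a, b)))]
    support = together / n_orders                    # P(A and B)
    confidence = together / item_counts[a]           # P(B | A)
    lift = confidence / (item_counts[b] / n_orders)  # P(B | A) / P(B)
    return support, confidence, lift


support, confidence, lift = association("tea", "teapot")
print(f"support={support:.2f} confidence={confidence:.2f} lift={lift:.2f}")
```

Confidence depends on direction (tea then teapot differs from teapot then tea), whereas support and lift are symmetric. A lift above 1 indicates the pair occurs together more often than independence would predict.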
       

SQL views were created to:

​

  • Separate raw data from analytical outputs
     

  • Centralise business logic outside the dashboard layer
     

  • Provide Tableau with clean, performance-optimised tables
     

This approach mirrors real-world BI environments where dashboards consume curated analytical views rather than raw data.

​

Tableau: Interactive Analysis & Decision-Making

Tableau was used to translate analytical outputs into an executive-ready, interactive dashboard focused on insight and decision-making rather than static reporting.

The dashboard includes:

​

  • Basket size distribution to understand purchasing behaviour
     

  • Average Order Value by basket segment (small, medium, large)
     

  • Top product pairs ranked by strength of association
     

  • Cross-sell heatmaps highlighting clusters of related products
     

  • A prioritised list of recommended product bundles
     

  • KPI tiles summarising overall performance
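The order-level figures behind views such as Average Order Value by basket segment and the KPI tiles could be derived along the following lines. The segment thresholds here are hypothetical, since the cut-offs used for small, medium, and large baskets are not stated, and the order values are illustrative only.

```python
from collections import defaultdict

# Hypothetical order-level records.
orders = [
    {"invoice": "1001", "basket_size": 1, "revenue": 12.0},
    {"invoice": "1002", "basket_size": 2, "revenue": 20.0},
    {"invoice": "1003", "basket_size": 5, "revenue": 48.0},
    {"invoice": "1004", "basket_size": 7, "revenue": 60.0},
    {"invoice": "1005", "basket_size": 12, "revenue": 150.0},
]


def basket_segment(size):
    """Assign a segment label; the thresholds are assumptions."""
    if size <= 2:
        return "small"
    if size <= 8:
        return "medium"
    return "large"


# Overall KPIs.
total_orders = len(orders)
average_basket_size = sum(o["basket_size"] for o in orders) / total_orders
average_order_value = sum(o["revenue"] for o in orders) / total_orders

# Average Order Value per segment: accumulate [revenue, order count].
totals = defaultdict(lambda: [0.0, 0])
for o in orders:
    segment = basket_segment(o["basket_size"])
    totals[segment][0] += o["revenue"]
    totals[segment][1] += 1

aov_by_segment = {seg: revenue / count for seg, (revenue, count) in totals.items()}
print(total_orders, average_basket_size, average_order_value, aov_by_segment)
```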
     

Interactivity was designed intentionally:

  • Order-level views cross-filter one another
     

  • Product-pair views interact only with other product-pair views, so pair-level metrics are not mixed with order-level metrics
     

  • Global filters allow exploration by date range, country, and basket segment
     

This design preserves analytical correctness while enabling deep exploration, reflecting how professional dashboards are built in practice.

​

Key Insights

  • Larger baskets consistently generate higher Average Order Value
     

  • A small number of product pairs are purchased together far more frequently than expected by chance
     

  • High-lift, high-support product pairs represent strong cross-sell and bundling opportunities
     

  • Most multi-item orders are driven by a limited set of recurring product relationships
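To show how high-lift, high-support pairs might be shortlisted as bundle candidates, here is a minimal sketch. The metric values, the minimum-support and minimum-lift thresholds, and the product codes are all hypothetical; the project's actual selection criteria are not stated.

```python
# Hypothetical pair-level metrics.
pair_metrics = [
    {"pair": ("A1", "B2"), "support": 0.040, "lift": 6.5},
    {"pair": ("A1", "C3"), "support": 0.002, "lift": 25.0},  # very rare pair
    {"pair": ("B2", "D4"), "support": 0.030, "lift": 9.1},
    {"pair": ("C3", "D4"), "support": 0.050, "lift": 1.1},   # near-random
]

MIN_SUPPORT = 0.01  # assumed: ignore pairs seen in under 1% of orders
MIN_LIFT = 1.5      # assumed: require a clearly above-chance association
TOP_N = 5

# Filter on support first so that rare pairs with inflated lift do not
# dominate the ranking, then rank the remainder by lift.
candidates = [
    m for m in pair_metrics
    if m["support"] >= MIN_SUPPORT and m["lift"] >= MIN_LIFT
]
ranked = sorted(
    candidates, key=lambda m: (m["lift"], m["support"]), reverse=True
)[:TOP_N]

for rank, m in enumerate(ranked, start=1):
    print(rank, m["pair"], m["support"], m["lift"])
```

The support floor matters because lift can be very high for pairs that occur only a handful of times, which makes them unreliable as bundle candidates.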

​

Business Outcomes

Based on the analysis, the project delivers:

​

  • A ranked list of five high-priority product bundles
     

  • Clear cross-sell opportunities suitable for checkout recommendations or post-purchase emails
     

  • Evidence-based insights to support merchandising and pricing decisions
     

  • A scalable framework that could be extended to recommendation engines or A/B testing

​

Tools & Skills Demonstrated

  • Python: data cleaning, feature engineering, basket construction
     

  • SQL: aggregation, association metrics, analytical views
     

  • Tableau: interactive dashboard design, filtering logic, executive storytelling
     

  • Analytics Skills: market basket analysis, business framing, KPI design, insight communication

​

Final Output

The final deliverable is a portfolio-quality, interactive Tableau dashboard that demonstrates how raw transaction data can be transformed into actionable commercial insights. The project balances analytical rigour with clear storytelling, making it suitable for both technical review and business decision-making.

​

© 2026 by Shah Choudhury. 
