Natural Language Processing
Review Sentiment Analysis
About this Project
The goal of this project was to analyze product reviews from two different websites to determine the accuracy of
review scores using sentiment analysis. By leveraging Selenium and Python, I scraped product reviews from both
websites and employed Natural Language Processing (NLP) techniques to assess the sentiment conveyed in these
reviews.
Data Collection
Using Selenium, I automated the extraction of product reviews from two distinct websites. This allowed me to gather a
substantial amount of review data for subsequent analysis.
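Once Selenium has rendered a page, pulling the review text out is a plain parsing step. As a minimal sketch of that step (the real pages' HTML structure and class names are not shown in this write-up, so the `review-text` class below is a hypothetical stand-in), the stdlib `html.parser` can extract review bodies from a saved page:

```python
from html.parser import HTMLParser

# Hypothetical structure: each review sits in a <div class="review-text">.
# The real sites' markup differs; this only illustrates the extraction step.
class ReviewExtractor(HTMLParser):
    def __init__(self):
        super().__init__()
        self.reviews = []
        self._in_review = False

    def handle_starttag(self, tag, attrs):
        if tag == "div" and ("class", "review-text") in attrs:
            self._in_review = True

    def handle_endtag(self, tag):
        if tag == "div":
            self._in_review = False

    def handle_data(self, data):
        if self._in_review and data.strip():
            self.reviews.append(data.strip())

page = '<div class="review-text">Sturdy frame, weak drawers.</div>'
parser = ReviewExtractor()
parser.feed(page)
print(parser.reviews)
```

In the actual pipeline, Selenium handles navigation and pagination and hands the rendered HTML to a parser like this.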
Sentiment Analysis
Once the data was collected, I utilized NLP to analyze the sentiment of each review. Specifically, I employed two
models: VADER (a lexicon-based analyzer from NLTK) and a RoBERTa model from Hugging Face, to process and evaluate
the sentiment behind the review text.
The analysis revealed a notable discrepancy: one website appeared to artificially inflate its review scores. Despite
a broad range of sentiment in the review text itself, the displayed scores consistently fell between 4.1 and 5.0,
suggesting that the website's scoring system was skewed to present a more positive overall image.
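To illustrate the lexicon-based side of this comparison, here is a toy scorer in the spirit of VADER: each word carries a valence, and the normalized sum approximates a compound score. The tiny lexicon below is invented for illustration only; the real VADER ships a lexicon of thousands of rated tokens and also handles negation, intensifiers, and punctuation.

```python
# Invented mini-lexicon for illustration; not VADER's actual word scores.
LEXICON = {"sturdy": 1.5, "hoping": 0.4, "better": 0.9, "not": -1.2, "weak": -1.6}

def toy_compound(text: str) -> float:
    """Sum word valences and squash into [-1, 1], loosely mirroring
    VADER's compound-score normalization."""
    words = (w.strip(".,!?") for w in text.lower().split())
    score = sum(LEXICON.get(w, 0.0) for w in words)
    return score / (abs(score) + 5)

print(round(toy_compound("The frame is sturdy but the drawers are weak."), 3))
```

The RoBERTa model, by contrast, is a transformer classifier that reads the whole sentence in context rather than summing per-word scores, which is why running both gives a useful cross-check.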
Data Verification
To ensure the accuracy of the sentiment analysis, I manually compared several reviews' text against their sentiment
scores. This manual check confirmed that the website's review system was indeed skewed: the lower bound of its
4.1-5.0 scale misrepresented the range of a true 1-10 scale.
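One way to see why a 4.1-5.0 band is suspicious is to rescale it: map the observed band back onto a full rating scale and see what the non-compressed equivalents would be. The sketch below uses min-max rescaling with made-up example scores, and assumes a 1-5 star display scale for illustration (the scale bounds are parameters, so any target range works):

```python
# Illustrative only: map scores observed in [obs_lo, obs_hi] onto the
# full [true_lo, true_hi] band via min-max rescaling.
def rescale(score, obs_lo=4.1, obs_hi=5.0, true_lo=1.0, true_hi=5.0):
    frac = (score - obs_lo) / (obs_hi - obs_lo)
    return true_lo + frac * (true_hi - true_lo)

observed = [4.1, 4.3, 4.6, 5.0]   # made-up example scores
print([round(rescale(s), 2) for s in observed])
```

Under this mapping, a displayed 4.1 is the site's effective minimum, i.e. the equivalent of a bottom-of-scale rating.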
Analysis Results
In my analysis, I noticed a clear anomaly: nearly every review from one site (LivingSpaces) was rated between 4.1
and 5.0. This narrow range pointed to an inflated rating system that skewed the perceived positivity of the
reviews.
Proof is in the pudding:
I also noticed that LivingSpaces had identical reviews on completely different SKUs, often for totally unrelated items. This was not a one-time occurrence.
Identical Review Examples:
- ID 333, SKU 241131: "The body of the dresser is very sturdy, the drawers are not as sturdy. I bought this online and was hoping that the drawers would have been better."
- ID 283, SKU 81481: "The body of the dresser is very sturdy, the drawers are not as sturdy. I bought this online and was hoping that the drawers would have been better."
- ID 313, SKU 81478: "The body of the dresser is super sturdy, the drawers are not very sturdy. This is par with how everyone is making furniture anymore. I was hoping for better."
- ID 357, SKU 241125: "The body of the dresser is super sturdy, the drawers are not very sturdy. This is par with how everyone is making furniture anymore. I was hoping for better."
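The duplicate check above can be sketched as a simple grouping: index reviews by their text and flag any text that appears under more than one SKU. The IDs and SKUs below come from the examples above (review texts truncated for brevity):

```python
from collections import defaultdict

# (review_id, sku, text) tuples; IDs/SKUs from the examples above,
# texts truncated for brevity.
reviews = [
    (333, "241131", "The body of the dresser is very sturdy, the drawers are not as sturdy."),
    (283, "81481",  "The body of the dresser is very sturdy, the drawers are not as sturdy."),
    (313, "81478",  "The body of the dresser is super sturdy, the drawers are not very sturdy."),
    (357, "241125", "The body of the dresser is super sturdy, the drawers are not very sturdy."),
]

# Group SKUs by review text; any text attached to more than one SKU is suspect.
by_text = defaultdict(set)
for _id, sku, text in reviews:
    by_text[text].add(sku)

duplicates = {text: skus for text, skus in by_text.items() if len(skus) > 1}
for text, skus in sorted(duplicates.items()):
    print(f"{len(skus)} SKUs share: {text[:40]}...")
```

Exact-match grouping like this only catches verbatim copies; near-duplicates would need fuzzy matching, but verbatim copies were enough to make the point here.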
Key Insights
- One website's review scores were consistently high, indicating potential artificial inflation.
- Sentiment analysis using VADER and RoBERTa confirmed discrepancies in the review rating systems.
- Manual verification supported the finding of inflated review scores: even the lower end of the score range
represented an exaggeratedly positive rating.
Thank you for exploring this analysis!
Review sentiment analysis powered by Hugging Face NLP models
Full Jupyter notebook and code on GitHub