Free Projects

Practical walkthroughs with real code. No downloads, no paywalls.

All projects are written for and run on Google Colab — free, no local setup required.

Sales Performance Analysis
Groupby, aggregation, and ranked output
15 min · Beginner · Python

Build a clean sales summary from raw transaction data using pandas — the kind of output you'd produce before a weekly review meeting.
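
As a rough sketch of the kind of summary this project builds, the snippet below groups a tiny transaction table by sales rep, aggregates it, and ranks the result. The `transactions` data and the column names (`rep`, `amount`, `total_sales`, `deals`) are invented for illustration and are not taken from the project notebook.

```python
import pandas as pd

# Hypothetical transaction data; the real project uses its own dataset.
transactions = pd.DataFrame({
    "rep": ["Ada", "Ben", "Ada", "Cy", "Ben"],
    "amount": [100.0, 250.0, 300.0, 50.0, 100.0],
})

# One row per rep: total revenue and number of deals, highest total first.
summary = (
    transactions.groupby("rep", as_index=False)
    .agg(total_sales=("amount", "sum"), deals=("amount", "count"))
    .sort_values("total_sales", ascending=False)
    .reset_index(drop=True)
)

# Dense rank so tied totals would share a position.
summary["rank"] = (
    summary["total_sales"].rank(ascending=False, method="dense").astype(int)
)

print(summary)
```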

Clean a Messy Dataset
Nulls, duplicates, types, and dates
20 min · Beginner · Python

A practical reference for the most common cleaning operations in pandas — the work that happens before every real analysis.
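
A minimal sketch of the four operations named above (duplicates, types, dates, nulls), assuming a made-up `raw` table. The column names and the choice to drop rows with a missing `amount` are illustrative assumptions; the project itself may handle nulls differently.

```python
import pandas as pd

# Hypothetical messy input: one duplicate row, numbers stored as text,
# a missing amount, and an unparseable date.
raw = pd.DataFrame({
    "order_id": [1, 2, 2, 3],
    "amount": ["10.5", "20", "20", None],
    "order_date": ["2024-01-05", "not a date", "not a date", "2024-01-08"],
})

clean = raw.drop_duplicates().copy()                      # remove exact duplicate rows
clean["amount"] = pd.to_numeric(clean["amount"], errors="coerce")            # text -> float, bad values -> NaN
clean["order_date"] = pd.to_datetime(clean["order_date"], errors="coerce")   # bad dates -> NaT
clean = clean.dropna(subset=["amount"]).reset_index(drop=True)               # drop rows with no amount

print(clean)
```

Using `errors="coerce"` turns unparseable values into missing values rather than raising, which makes them easy to count and inspect afterwards.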

Classify Text with an LLM API
Sentiment tagging on product reviews
30 min · Beginner · Python / Groq

Call a free LLM API to classify review sentiment and extract themes — turning raw text into a structured, tagged dataset.

Generate SQL from Plain English
Natural language to query with OpenAI
30 min · Intermediate · Python / OpenAI

Translate plain-English questions into runnable SQL using the OpenAI API — a practical tool for analyst workflows.

Score and Prioritize Leads
LLM-powered pipeline qualification
45 min · Intermediate · Python / Groq

Extract qualification signals from raw CRM notes and rank your pipeline automatically using a free LLM API.

Build a Live Data Store
PostgreSQL persistence with Supabase
60 min · Advanced · Python / Supabase

Set up a real PostgreSQL database on Supabase's free tier, write data from Python, and query it back with filters.

Query a Database with SQL
SELECT, WHERE, ORDER BY, and LIMIT
20 min · Beginner · Python / SQLite

Write real SQL against a live in-memory database inside Colab — no credentials, no cloud account, just Python's built-in sqlite3.
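
For a sense of what that looks like, here is a small sketch using only the standard library. The `orders` table and its rows are hypothetical stand-ins for the project's data.

```python
import sqlite3

# An in-memory database exists only for the life of this connection.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer, amount) VALUES (?, ?)",
    [("Ada", 120.0), ("Ben", 45.5), ("Cy", 300.0), ("Dee", 80.0)],
)

# SELECT + WHERE + ORDER BY + LIMIT: the two largest orders above 50.
top_orders = conn.execute(
    "SELECT customer, amount FROM orders "
    "WHERE amount > ? ORDER BY amount DESC LIMIT 2",
    (50,),
).fetchall()

print(top_orders)
```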

SQL JOINs and Aggregations
Multi-table queries with GROUP BY
35 min · Intermediate · Python / SQLite

Join a three-table schema and aggregate the results by tier, category, and product — the operations behind most real business reporting.
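
The pattern can be sketched as follows, assuming a hypothetical three-table schema (`customers`, `products`, `orders`) invented for this example; the project's actual schema may differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, tier TEXT);
CREATE TABLE products  (id INTEGER PRIMARY KEY, name TEXT, category TEXT, price REAL);
CREATE TABLE orders    (id INTEGER PRIMARY KEY, customer_id INTEGER,
                        product_id INTEGER, quantity INTEGER);
INSERT INTO customers VALUES (1, 'Ada', 'gold'), (2, 'Ben', 'silver');
INSERT INTO products  VALUES (1, 'Desk', 'furniture', 200.0), (2, 'Lamp', 'lighting', 40.0);
INSERT INTO orders    VALUES (1, 1, 1, 1), (2, 1, 2, 2), (3, 2, 2, 1);
""")

# Join orders to both lookup tables, then total revenue per tier and category.
revenue_by_tier_category = conn.execute("""
SELECT c.tier, p.category, SUM(o.quantity * p.price) AS revenue
FROM orders AS o
JOIN customers AS c ON c.id = o.customer_id
JOIN products  AS p ON p.id = o.product_id
GROUP BY c.tier, p.category
ORDER BY c.tier, p.category
""").fetchall()

print(revenue_by_tier_category)
```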

Window Functions and CTEs
Rank, compare, and compute running totals
50 min · Advanced · Python / SQLite

Use RANK(), LAG(), and SUM() OVER to analyze trends across periods — plus CTEs to keep complex queries readable.
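
A short sketch of those three window functions inside a CTE, run against a hypothetical `monthly_sales` table. It assumes the SQLite bundled with your Python is version 3.25 or newer, which is when window functions were added.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE monthly_sales (month TEXT, revenue REAL);
INSERT INTO monthly_sales VALUES ('2024-01', 100), ('2024-02', 150), ('2024-03', 120);
""")

trend_rows = conn.execute("""
WITH sales AS (                      -- the CTE names the input once
    SELECT month, revenue FROM monthly_sales
)
SELECT
    month,
    revenue,
    RANK() OVER (ORDER BY revenue DESC)              AS revenue_rank,
    revenue - LAG(revenue) OVER (ORDER BY month)     AS change_vs_prev,
    SUM(revenue) OVER (ORDER BY month)               AS running_total
FROM sales
ORDER BY month
""").fetchall()

for row in trend_rows:
    print(row)
```

The first month has no previous row, so `LAG` returns NULL there and `change_vs_prev` comes back as `None` in Python.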

Query CSV Files with DuckDB
SQL directly on flat files — no load required
25 min · Intermediate · Python / DuckDB

Run SQL queries directly against a CSV file without first loading it into a DataFrame or a database table, then add window functions with QUALIFY to rank and filter in one pass.

Auto-Generate a Data Dictionary
Column docs from schema info via Claude API
30 min · Intermediate · Python / Claude API

Feed Claude your column names and sample values — get a full data dictionary back in seconds, ready to publish or hand off.

Build a Cohort Retention Table
Month-over-month retention matrix from transactions
45 min · Intermediate · Python / pandas

Tag customers with their cohort month, compute months-since-acquisition, and pivot into the retention matrix every SaaS analyst needs to know.
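
The three steps above can be sketched roughly as below. The `tx` table is invented for illustration, and the month arithmetic (year × 12 + month) is one simple way to compute months-since-acquisition, not necessarily the approach the project takes.

```python
import pandas as pd

# Hypothetical transactions: one row per order.
tx = pd.DataFrame({
    "customer_id": [1, 1, 2, 2, 3],
    "order_date": pd.to_datetime(
        ["2024-01-05", "2024-02-10", "2024-01-20", "2024-03-02", "2024-02-15"]
    ),
})

# Step 1: cohort month = month of each customer's first order.
first_order = tx.groupby("customer_id")["order_date"].transform("min")
tx["cohort"] = first_order.dt.strftime("%Y-%m")

# Step 2: months since acquisition, via a running month index.
order_idx = tx["order_date"].dt.year * 12 + tx["order_date"].dt.month
cohort_idx = first_order.dt.year * 12 + first_order.dt.month
tx["months_since"] = order_idx - cohort_idx

# Step 3: pivot to active customers per cohort and month, then divide by month 0.
counts = (
    tx.groupby(["cohort", "months_since"])["customer_id"]
    .nunique()
    .unstack(fill_value=0)
)
retention = counts.div(counts[0], axis=0)

print(retention)
```

Note that `fill_value=0` also fills months a cohort has not reached yet, so in real data you may prefer to leave those cells empty.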

Walk a Hierarchy with a Recursive CTE
Org charts and trees in a single SQL query
35 min · Advanced · Python / SQLite

Traverse any parent-child structure (org charts, category trees, bills of materials) using the recursive CTE pattern most SQL tutorials skip.
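
As an illustration of the pattern, the sketch below walks a small org chart from the top down. The `employees` table, its rows, and the CTE name `chain` are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employees (id INTEGER PRIMARY KEY, name TEXT, manager_id INTEGER);
INSERT INTO employees VALUES
    (1, 'Avery', NULL),
    (2, 'Blake', 1),
    (3, 'Casey', 1),
    (4, 'Drew',  2);
""")

org_chart = conn.execute("""
WITH RECURSIVE chain(id, name, depth) AS (
    -- Anchor: start at the root (no manager).
    SELECT id, name, 0 FROM employees WHERE manager_id IS NULL
    UNION ALL
    -- Recursive step: attach each employee to a manager already in the chain.
    SELECT e.id, e.name, chain.depth + 1
    FROM employees AS e
    JOIN chain ON e.manager_id = chain.id
)
SELECT name, depth FROM chain ORDER BY depth, name
""").fetchall()

print(org_chart)
```

The same anchor-plus-recursive-step shape applies to category trees or bills of materials; only the table and the parent column change.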

Ask Your Dataset Questions in Plain English
EDA via the Gemini API — no groupby required
20 min · Beginner · Python / Gemini

Load a DataFrame, send it to Gemini, and ask anything — get a first-pass analysis in seconds instead of writing queries from scratch.

Flag Outliers with Z-Scores
Global and per-group anomaly detection
30 min · Intermediate · Python / pandas

Compute z-scores in five lines and flag statistical outliers globally and per-group — find the weird rows before your stakeholders do.
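
A minimal sketch of the global versus per-group comparison, using an invented `orders` table. The threshold of 1.5 is an assumption chosen for this tiny sample (a cutoff of 3 is more common on real data, but a five-row group cannot produce a z-score that large).

```python
import pandas as pd

# Hypothetical data: "east" orders are small, "west" orders are large.
orders = pd.DataFrame({
    "region": ["east"] * 5 + ["west"] * 5,
    "amount": [10, 11, 9, 10, 50, 100, 102, 98, 101, 99],
})

# Global z-score: distance from the overall mean in standard deviations.
orders["z_global"] = (
    orders["amount"] - orders["amount"].mean()
) / orders["amount"].std()

# Per-group z-score: same formula, but mean and std come from each region.
by_region = orders.groupby("region")["amount"]
orders["z_group"] = (
    orders["amount"] - by_region.transform("mean")
) / by_region.transform("std")

THRESHOLD = 1.5  # assumed cutoff for this small example
orders["global_outlier"] = orders["z_global"].abs() > THRESHOLD
orders["group_outlier"] = orders["z_group"].abs() > THRESHOLD

print(orders)
```

In this sample the 50 in the east region looks unremarkable globally, because it sits between the two regions, but stands out once it is compared only against other east orders.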