Karina | Python | Excel | Stats | DataScience | DataAnalytics
Data analytics/science educator; active, high‑engagement posts on pandas/Python and big‑data‑adjacent workflows for analytics.

Learn SQL Through Games, Not Dry Tutorials
You can learn SQL by playing a game. I am not joking. Most people quit SQL tutorials because dry exercises on fake datasets feel pointless. These free games give you a reason to write queries. A murder to solve. An island to survive. A crime to crack. A fleet to command. Same SQL. Much harder to quit.

Use Pandas Query() for Cleaner, Chainable DataFrame Filters
Python tip: You've been filtering DataFrames like this: df[(df['region'] == 'UAE') & (df['revenue'] > 10000)]. There's a cleaner way: df.query("region == 'UAE' and revenue > 10000"). Same result. No brackets. No repeated df. Reads like a sentence. Where it really pays off is inside a chain. Use...
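A minimal sketch of the comparison above, on made-up data. Only the `region` and `revenue` column names come from the tip; the sample rows, the `min_revenue` variable, and the chained steps are assumptions added for illustration.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the tip.
df = pd.DataFrame({
    "region": ["UAE", "UAE", "KSA", "UAE"],
    "revenue": [12000, 8000, 15000, 20000],
})

# Boolean-mask version from the tip.
mask_result = df[(df["region"] == "UAE") & (df["revenue"] > 10000)]

# query() version: selects the same rows.
query_result = df.query("region == 'UAE' and revenue > 10000")

# Inside a method chain, query() needs no intermediate variable.
# Local Python variables are referenced with an @ prefix.
min_revenue = 10000
summary = (
    df.query("region == 'UAE' and revenue > @min_revenue")
      .assign(revenue_k=lambda d: d["revenue"] / 1000)
      .sort_values("revenue_k", ascending=False)
)
```

In the chained version the filter runs against whatever DataFrame the previous step produced, which is awkward to express with a boolean mask.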

Prefer UNION ALL for Speed; Use UNION Only for Deduplication
UNION vs UNION ALL in SQL: UNION deduplicates every row after combining the results. That means sorting, comparing, discarding. On large tables that's a real performance cost -- and most of the time, you don't even need it. UNION ALL stacks the...
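A small sketch of the difference, run against an in-memory SQLite database from Python. The table names and rows are hypothetical; the point is only the row counts each operator returns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical tables: ids 2 and 3 appear in both months.
conn.executescript("""
    CREATE TABLE jan_customers (customer_id INTEGER);
    CREATE TABLE feb_customers (customer_id INTEGER);
    INSERT INTO jan_customers VALUES (1), (2), (3);
    INSERT INTO feb_customers VALUES (2), (3), (4);
""")

# UNION removes duplicate rows from the combined result.
union_rows = conn.execute(
    "SELECT customer_id FROM jan_customers "
    "UNION SELECT customer_id FROM feb_customers"
).fetchall()

# UNION ALL keeps every row from both inputs.
union_all_rows = conn.execute(
    "SELECT customer_id FROM jan_customers "
    "UNION ALL SELECT customer_id FROM feb_customers"
).fetchall()

print(len(union_rows))      # 4 distinct ids
print(len(union_all_rows))  # 6 rows, duplicates kept
```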

Business Queries Demand More than Basic SQL Skills
There is a gap between knowing SQL and knowing enough SQL to answer the questions a business actually asks. "Show me each customer's rank within their segment." "Give me a running total of revenue by month." "Flag anyone earning above their...

Data Cleaning Is Core Analysis, Not Just Prep
I’ve never worked with a clean dataset. Every real project = messy data. And it always comes down to 4 things: • Missing values • Duplicates • Data types & formatting • Outliers Cleaning isn’t a “prep step”. It is the analysis.
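One way the four checks could look in pandas, as a rough sketch. The data, the column names, and the 1.5 x IQR outlier rule are all assumptions chosen for illustration; the post does not prescribe a method.

```python
import pandas as pd

# Hypothetical messy data: one missing amount, one duplicated row,
# numbers and dates stored as strings, one extreme value.
raw = pd.DataFrame({
    "order_id": [1, 2, 2, 3, 4, 5, 6, 7],
    "amount": ["100", "120", "120", None, "110", "130", "115", "99999"],
    "order_date": ["2024-01-05", "2024-01-06", "2024-01-06", "2024-01-07",
                   "2024-01-08", "2024-01-09", "2024-01-10", "2024-01-11"],
})

# 1. Missing values per column.
missing_per_column = raw.isna().sum()

# 2. Fully duplicated rows.
duplicate_rows = int(raw.duplicated().sum())

# 3. Data types and formatting: drop duplicates, then convert strings.
clean = raw.drop_duplicates().copy()
clean["amount"] = pd.to_numeric(clean["amount"], errors="coerce")
clean["order_date"] = pd.to_datetime(clean["order_date"])

# 4. Outliers: flag values outside 1.5 * IQR (an assumed rule of thumb).
q1, q3 = clean["amount"].quantile([0.25, 0.75])
iqr = q3 - q1
clean["is_outlier"] = (
    (clean["amount"] < q1 - 1.5 * iqr) | (clean["amount"] > q3 + 1.5 * iqr)
)
```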

Plans Are Starting Points; Embrace Pivots for Growth
38 🎂 At 20 I had a plan for my life. It bore almost no resemblance to what actually happened. Here is what I know at 38 that I didn’t know at 20. The plan is useful for getting started, but it...

Validate Data Loads Instantly with SQL EXCEPT
SQL tip: You ran a load job overnight. How do you know every record made it? Most people recount rows and hope the numbers match. There's a cleaner way: SELECT order_id FROM staging.orders EXCEPT SELECT order_id FROM production.orders; If this returns nothing, every order_id in staging also exists in production. If...
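The query from the tip can be tried end to end with Python's built-in `sqlite3` module. In this sketch the `staging` and `production` schemas are simulated by attaching two in-memory databases, and the order ids are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Simulate the two schemas with attached in-memory databases (an assumption
# made so the query text from the tip can run unchanged).
conn.execute("ATTACH DATABASE ':memory:' AS staging")
conn.execute("ATTACH DATABASE ':memory:' AS production")
conn.executescript("""
    CREATE TABLE staging.orders (order_id INTEGER);
    CREATE TABLE production.orders (order_id INTEGER);
    INSERT INTO staging.orders VALUES (101), (102), (103);
    INSERT INTO production.orders VALUES (101), (103);
""")

# Rows in staging that have no match in production.
missing = conn.execute("""
    SELECT order_id FROM staging.orders
    EXCEPT
    SELECT order_id FROM production.orders
""").fetchall()

print(missing)  # [(102,)]
```

EXCEPT is one-directional: this check finds rows missing from production, not extra rows that exist only in production. Swapping the two SELECTs covers the other direction.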

Smooth Daily Revenue with a 7‑Day Rolling Average
SQL tip: Daily revenue is noisy. One bad Monday skews the whole picture. A 7-day moving average smooths it out. ROWS BETWEEN 6 PRECEDING AND CURRENT ROW tells SQL to look at the current row plus the 6 rows before it, which is today plus the previous 6 days when the table has exactly one row per day. The result is a rolling...
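A runnable sketch of the window frame described above, using SQLite through Python (window functions need SQLite 3.25 or newer). The `daily_revenue` table, its columns, and the values are hypothetical, and the data assumes one row per day with no gaps.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE daily_revenue (day TEXT, revenue REAL)")

# Hypothetical data: seven flat days, then one spike on day 8.
rows = [(f"2024-01-{d:02d}", 100.0) for d in range(1, 8)]
rows.append(("2024-01-08", 800.0))
conn.executemany("INSERT INTO daily_revenue VALUES (?, ?)", rows)

result = conn.execute("""
    SELECT day,
           revenue,
           AVG(revenue) OVER (
               ORDER BY day
               ROWS BETWEEN 6 PRECEDING AND CURRENT ROW
           ) AS revenue_7d_avg
    FROM daily_revenue
    ORDER BY day
""").fetchall()

for day, revenue, avg7 in result:
    print(day, revenue, round(avg7, 2))
```

On day 8 the raw value jumps to 800 while the 7-day average moves only to 200. The first six rows average over fewer than seven days because there are not yet six preceding rows.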

Window Functions Rank without Collapsing Rows
SQL tip: GROUP BY collapses your rows. Sometimes you need the ranking without losing the detail. That's what window functions do. PARTITION BY region restarts the ranking for each region. ORDER BY total_spend DESC puts the highest spender at rank 1. Every row stays intact....
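A sketch of the ranking pattern, again via SQLite in Python. The tip names `PARTITION BY region` and `ORDER BY total_spend DESC`; the `customers` table, its rows, and the choice of `RANK()` (rather than `ROW_NUMBER()` or `DENSE_RANK()`) are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical customer table.
conn.executescript("""
    CREATE TABLE customers (customer TEXT, region TEXT, total_spend REAL);
    INSERT INTO customers VALUES
        ('Ana', 'EMEA', 500), ('Ben', 'EMEA', 900),
        ('Cy',  'APAC', 300), ('Dee', 'APAC', 700), ('Eli', 'APAC', 100);
""")

# Rank within each region; no rows are collapsed.
ranked = conn.execute("""
    SELECT customer,
           region,
           total_spend,
           RANK() OVER (
               PARTITION BY region
               ORDER BY total_spend DESC
           ) AS spend_rank
    FROM customers
    ORDER BY region, spend_rank
""").fetchall()

for row in ranked:
    print(row)
```

All five input rows come back, each with its rank inside its own region.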

Combine Multiple Aggregates in One Query Using CASE
SQL tip: You're running three separate queries to get this: SELECT SUM(amount) FROM orders WHERE user_type = 'premium'; SELECT COUNT(*) FROM orders WHERE is_first_order = TRUE; SELECT SUM(amount) FROM orders; You can get all three in one. This pattern works across Oracle, SQL Server, PostgreSQL, BigQuery...
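The excerpt is cut off before the combined query, so the following is only one plausible form of it: conditional aggregation with CASE inside SUM. The table layout follows the three queries in the tip; the rows are invented, and `is_first_order` is stored as 0/1 because this sketch runs on SQLite.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical rows; is_first_order is an integer flag (1 = true).
conn.executescript("""
    CREATE TABLE orders (amount REAL, user_type TEXT, is_first_order INTEGER);
    INSERT INTO orders VALUES
        (100, 'premium', 1), (50, 'basic', 0),
        (200, 'premium', 0), (25, 'basic', 1);
""")

# One pass over the table instead of three separate queries.
premium_revenue, first_orders, total_revenue = conn.execute("""
    SELECT
        SUM(CASE WHEN user_type = 'premium' THEN amount ELSE 0 END) AS premium_revenue,
        SUM(CASE WHEN is_first_order = 1 THEN 1 ELSE 0 END)         AS first_orders,
        SUM(amount)                                                 AS total_revenue
    FROM orders
""").fetchone()

print(premium_revenue, first_orders, total_revenue)  # 300.0 2 375.0
```

One behavioural difference from the separate queries: with `ELSE 0`, a category with no matching rows yields 0 rather than NULL.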

Group by Time with pd.Grouper: No Extra Columns
Python tip You've been creating extra columns just to group by month. pd.Grouper does it in one step, inside the groupby. Same result. No extra column. It works for any time frequency -- weekly, quarterly, custom intervals -- without touching your data.
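A short sketch contrasting the two approaches. The `sales` DataFrame and its column names are hypothetical; `freq="MS"` (month start) is used here because it labels each group by the first day of the month.

```python
import pandas as pd

# Hypothetical sales data.
sales = pd.DataFrame({
    "order_date": pd.to_datetime([
        "2024-01-05", "2024-01-20", "2024-02-03", "2024-03-15", "2024-03-28",
    ]),
    "revenue": [100, 150, 200, 50, 75],
})

# Extra-column approach: derive a month column, then group by it.
with_extra_column = (
    sales.assign(month=sales["order_date"].dt.to_period("M"))
         .groupby("month")["revenue"]
         .sum()
)

# pd.Grouper approach: group on the datetime column directly.
monthly = sales.groupby(pd.Grouper(key="order_date", freq="MS"))["revenue"].sum()

print(monthly)
```

Changing `freq` to `"W"` or `"QS"` switches to weekly or quarterly groups without touching `sales`. Note that a frequency-based Grouper also emits periods with no rows, which the extra-column approach would skip.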

Use Python Set Operators to Compare Lists Instantly
Python set operators analysts actually use: You already know sets remove duplicates. But they also do something more useful: compare lists without a single loop. | union -- combine two lists, no duplicates, e.g. all customers who bought in January OR February. & intersection...
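A minimal sketch of the operators on two made-up customer lists. The excerpt covers `|` and `&` before it is cut off; `-` and `^` are standard Python set operators included here as an addition, not as part of the original post.

```python
# Hypothetical customer lists; "ben" appears twice in January.
january = ["ana", "ben", "cy", "ben"]
february = ["cy", "dee", "ana"]

jan, feb = set(january), set(february)

either_month = jan | feb    # union: bought in January OR February
both_months = jan & feb     # intersection: bought in January AND February
only_january = jan - feb    # difference: bought in January but not February
exactly_one = jan ^ feb     # symmetric difference: bought in exactly one month

print(sorted(either_month))  # ['ana', 'ben', 'cy', 'dee']
print(sorted(both_months))   # ['ana', 'cy']
print(sorted(only_january))  # ['ben']
print(sorted(exactly_one))   # ['ben', 'dee']
```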

Segment Customers with RFM‑Based K‑Means Clustering
Let's build a mini data science project. Customer segmentation using K-Means clustering. What are we building? We're grouping customers by behaviour using RFM: Recency (days since last purchase), Frequency (total orders placed), Monetary (total spend). Three numbers per customer. That's...
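A sketch of just the RFM feature step in pandas, following the three definitions above. The `orders` data, column names, and `snapshot_date` are hypothetical, and the K-Means step itself is not shown here.

```python
import pandas as pd

# Hypothetical order history.
orders = pd.DataFrame({
    "customer_id": [1, 1, 2, 3, 3, 3],
    "order_date": pd.to_datetime([
        "2024-01-10", "2024-03-01", "2024-02-15",
        "2024-01-05", "2024-02-20", "2024-03-10",
    ]),
    "amount": [50.0, 70.0, 200.0, 20.0, 30.0, 25.0],
})

# Assumed reference date that recency is measured from.
snapshot_date = pd.Timestamp("2024-03-31")

rfm = orders.groupby("customer_id").agg(
    last_purchase=("order_date", "max"),
    frequency=("order_date", "count"),   # total orders placed
    monetary=("amount", "sum"),          # total spend
)
# Recency: days since the last purchase.
rfm["recency"] = (snapshot_date - rfm["last_purchase"]).dt.days
rfm = rfm[["recency", "frequency", "monetary"]]

print(rfm)
```

The resulting table has one row and three numbers per customer. Because the three columns sit on very different scales, they would typically be scaled before any distance-based clustering.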

Claude Skills: Reusable Prompts Boost AI Efficiency
I've been using Claude Skills and they're genuinely useful once you understand what they are. Think of them as reusable instructions Claude reads before doing a task. Anthropic has an official repo and there's a solid community list too. 🔗 github.com/anthropics/skills 🔗 github.com/travisvn/awesome-claude-skills

CTEs Turn Complex SQL Into Readable, Maintainable Code
SELECT, FROM, WHERE and JOINs will get you started. Then the work gets complicated and you realise tutorial SQL and production SQL are two very different things. Here's level 2. CTEs (readability): I was lost in my own nested subqueries. Couldn't follow...
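To illustrate the readability point, here is a sketch of the same question answered twice: once with nested subqueries and once with CTEs. The question (customers whose total spend is above the average customer total), the `orders` table, and the CTE names are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical orders table.
conn.executescript("""
    CREATE TABLE orders (customer_id INTEGER, amount REAL);
    INSERT INTO orders VALUES (1, 100), (1, 300), (2, 50), (3, 500), (3, 20);
""")

# Nested version: the per-customer total is written out twice.
nested = conn.execute("""
    SELECT customer_id, total
    FROM (SELECT customer_id, SUM(amount) AS total
          FROM orders GROUP BY customer_id)
    WHERE total > (SELECT AVG(total)
                   FROM (SELECT SUM(amount) AS total
                         FROM orders GROUP BY customer_id))
    ORDER BY customer_id
""").fetchall()

# CTE version: each step is named once and read top to bottom.
with_cte = conn.execute("""
    WITH customer_totals AS (
        SELECT customer_id, SUM(amount) AS total
        FROM orders
        GROUP BY customer_id
    ),
    average_total AS (
        SELECT AVG(total) AS avg_total FROM customer_totals
    )
    SELECT customer_id, total
    FROM customer_totals
    CROSS JOIN average_total
    WHERE total > avg_total
    ORDER BY customer_id
""").fetchall()

print(with_cte)  # [(1, 400.0), (3, 520.0)]
```

Both queries return the same rows; the CTE version defines the aggregation once and reuses it by name.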