
Everyone’s watching which LLM wins benchmarks. I’m watching Informatica suddenly show up everywhere. GCP coverage up 200%. The enterprise data stack isn’t being replaced by AI. It’s becoming the delivery mechanism for AI. The boring infrastructure is now the moat. https://t.co/9dxQv330NZ

70% of executives say they have difficulty acting on data. Meanwhile, Power BI just won the 2025 Gartner Magic Quadrant. Again. The tools keep getting better. The problem isn’t the tools. It never was. Source: https://t.co/KNtNLIRTOQ https://t.co/ZOAPhWZKSf
Simple is good. One-line code change to switch from Apache Cassandra to a @googlecloud Spanner database. https://t.co/2n6AJutoNM Generate embeddings automatically for @googlecloud BigQuery table. https://t.co/SqIQzawOvt https://t.co/zWknasRT6r
Git for data is still underexplored, and it is an area that is changing so fast. That's why we look at actual tools/features that showcase how to apply a Git-like workflow for data. I compared Git-like tools for data I could...

A TUI for managing Airflow jobs? Something like k9s? Flowrs seems to be just that - haven't tried yet, but looks really cool. Will try next time I have to use Airflow :) https://github.com/jvanbuel/flowrs
Yes, AI can write the SQL. But do you understand: * Why that join works? * Why that model makes sense? * Why that metric matters? AI lowers the barrier. Foundations raise your ceiling.
Controversial opinion: don't start with a semantic layer. A semantic layer makes sense when: - You have multiple consumers (BI, notebooks, apps) - KPIs are defined inconsistently across teams - You need a universal API for metrics If you're early stage with one BI tool,...

❌Most data science projects take 4 weeks because of meetings, reruns, and handoffs between teams ✅A good AI/DS workflow compresses it to ~15 minutes. I’m demo-ing how to do it live (free): https://learn.business-science.io/registration-ai-workshop-2
Salesforce is now bridging four domains at once: Salesforce Implementation (CRM) Databricks (data lake) Agentforce (AI agents) Data 360 (data platform) The platform wars are not about features. They are about who owns the most connected node in your stack.
Will Rust kill Python in data engineering? No. But it has already consumed much of the JavaScript tooling ecosystem. And it's quietly doing the same in data. The pattern: Python remains the interface, Rust becomes the engine. Polars, DataFusion - both Rust (DuckDB follows the same interface/engine split, but its internals are C++)...

ROW_NUMBER(), RANK(), DENSE_RANK(). Three functions, three different behaviors. Pick the wrong one and your rankings mislead. Here are 4 patterns to get it right: - ranking with gaps vs without - top-N per category - deduplication - running totals 1. ROW_NUMBER() vs RANK() vs DENSE_RANK() Three functions, three behaviors...
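A minimal sketch of the gaps vs no-gaps difference, run through Python's built-in sqlite3 module (assumes the bundled SQLite is 3.25 or newer, which is when window functions arrived). The scores table and its values are made up for illustration.

```python
import sqlite3

# Hypothetical data: two people tie on 90 so the three functions diverge.
rows = [("ana", 90), ("bo", 90), ("cy", 80), ("di", 70)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE scores (name TEXT, score INTEGER)")
conn.executemany("INSERT INTO scores VALUES (?, ?)", rows)

query = """
SELECT name,
       score,
       ROW_NUMBER() OVER (ORDER BY score DESC, name) AS row_num,   -- always unique
       RANK()       OVER (ORDER BY score DESC)       AS rnk,       -- ties share a rank, then a gap
       DENSE_RANK() OVER (ORDER BY score DESC)       AS dense_rnk  -- ties share a rank, no gap
FROM scores
ORDER BY score DESC, name
"""
ranked = conn.execute(query).fetchall()

for row in ranked:
    print(row)
```

With the tie on 90, ROW_NUMBER() gives 1, 2, 3, 4; RANK() gives 1, 1, 3, 4; DENSE_RANK() gives 1, 1, 2, 3. For deduplication you usually want ROW_NUMBER(), because it is the only one guaranteed to pick exactly one row per group.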
The Data Integrity Gap: From “Big Data” to “Reliable Physics”. Click to learn everything you need to know about issues you likely don't know you have, or will soon have, in your organisation: https://t.co/LrOOv5lGcm
This is a common problem and one of our biggest motivations in building Nile - to isolate tenants automatically and by default.
Centralizing analytics on a single platform? Not happening. The focus is on decentralized back-office systems and a common analytics layer for daily visualization. #Analytics #Strategy #BusinessTech https://t.co/7ObAL6iVQ5
AGI is in the noise bucket this week. Lakehouse architecture? Up 400%. While the industry debates the AI endgame, data infrastructure quietly becomes non-negotiable. The boring skills win again.
Unpopular opinion: Data analyst job postings ask for Python. Data analyst jobs don't actually use Python. What you'll use daily: Excel — every single day SQL — every single day Power BI or Tableau — multiple times per week Python — maybe once a month This pattern holds...
big fan of ontology btw, but noted. building the proprietary data fusion layer this weekend 🫡
I quickly recorded how easy and convenient it is to browse S3 files locally with a single command, blazingly fast. Even preview works with DuckDB integration. https://youtu.be/cimUvBd_9Ns
I struggle with the phrase “everyone’s a coder now.” And I hesitate to post because I don’t want you to read this as gatekeeping. If anything, I want more people to build, but in a stronger, more functional way. Building any...

The more if-elif chains you write, the harder your code gets to change. Python has cleaner patterns for this. Here are 4 worth knowing: - dictionary dispatch - guard clauses - match/case - conditional expressions 1. Dictionary dispatch. Replace long equality checks with a dict. Constant-time lookup. No branching....
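A minimal sketch of dictionary dispatch, assuming a hypothetical "area by shape" problem (the function and shape names are invented for illustration, not taken from the thread):

```python
import math


def circle_area(radius: float) -> float:
    return math.pi * radius ** 2


def square_area(side: float) -> float:
    return side ** 2


# The dict replaces an if/elif chain of `shape == "..."` checks.
AREA_BY_SHAPE = {
    "circle": circle_area,
    "square": square_area,
}


def area(shape: str, size: float) -> float:
    """Look up the handler for `shape` and call it."""
    try:
        handler = AREA_BY_SHAPE[shape]
    except KeyError:
        raise ValueError(f"unknown shape: {shape!r}") from None
    return handler(size)


print(area("square", 3))
```

Adding a new shape means adding one dict entry; the `area` function itself does not change.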

RIP BI Dashboards. Tools like Tableau and Power BI are about to become extinct. This is what's coming (and how to prepare):
Aspiring Data Professionals: Excel, SQL and PBI are great tools to build projects with. If you're completely confused, start with my Guide 👇🏾 https://tekdlin.com/data-analytics-guide/
This is an interesting thread. Everyone is suggesting tools to solve the problem. I’d start by asking more about the data and the questions the customer is trying to answer or problems they are trying to solve first before recommending...
Yesterday I was talking with another founder about ensuring their B2B SaaS product is data compliant. There’s so much complexity behind meeting required standards.
You don't need a bootcamp to become a data analyst. Everything you need is free: Excel/SQL/Python/Power BI tutorials — YouTube SQL practice — SQLZoo, LeetCode, HackerRank Datasets for portfolio projects — Kaggle, data.gov, Google Dataset Search Resume feedback — Reddit (r/datascience, r/resumes), LinkedIn communities Interview prep...
Not all retries are created equal. Immediate retry: usually fails again Exponential backoff: gives systems time to recover Exponential backoff with jitter: prevents thundering herd Most orchestrators have this built in. But you need to understand what's happening or you'll wonder why your retries...
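A rough sketch of exponential backoff with "full jitter" in plain Python, to show what an orchestrator is doing under the hood. The function names, defaults, and the injectable `sleep` parameter are assumptions made for illustration; real orchestrators expose this through their own settings.

```python
import random
import time


def jittered_delay(attempt: int, base: float = 0.5, cap: float = 30.0) -> float:
    """Pick a random delay in [0, min(cap, base * 2**attempt)] seconds."""
    return random.uniform(0, min(cap, base * 2 ** attempt))


def retry(func, attempts: int = 5, base: float = 0.5, cap: float = 30.0, sleep=time.sleep):
    """Call func until it succeeds or `attempts` calls have failed."""
    for attempt in range(attempts):
        try:
            return func()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            # The ceiling doubles each time; the randomness spreads clients
            # out so they do not all retry at the same instant.
            sleep(jittered_delay(attempt, base, cap))
```

Without the jitter, every client that failed at the same moment would retry at the same moment too, which is the thundering herd the post mentions.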
Test for superintelligence: when the data in Fivetran’s Salesforce is 100% accurate and up to date at all times, I’ll know we’re there.
The semantic layer is like a restaurant menu: you know what you're ordering, but not how it's made. This analogy comes from Maxime Beauchemin and I think it's perfect. Users shouldn't need to understand your star schema to calculate revenue. They should...
You've got data spread across geographies. What happens when you want to bring that data together? Usually ETL jobs or other mechanisms. We just launched @googlecloud BigQuery global queries. Do multi-location analysis with a single query: https://t.co/F3p2mn5SjZ
Data is objective only in appearance. Behind every dataset lies a human decision about what to measure
Data Quality and Data Governance are two of the most underrated but important areas in the data space. There are other areas to explore in data outside of Analytics.
Hot take: Pivot tables are the REPL for business data. Just like programmers use REPLs to quickly test code, business users use pivot tables to quickly test hypotheses about their data. Drag a field. See a result. Adjust. Repeat. This feedback loop is...

AI teams love tuning models. But they ignore the bike chain: data. Outsourcing labeling to people who care much less about the app’s success. Messy internal docs. No structured knowledge base. No call transcripts. No clean SOPs. Then they ask: “Why isn’t the model improving?” The highest ROI in...
Most analysts can run a regression. Very few can explain what the output actually means. That gap is a statistics fundamentals problem. Not a tools problem. Not a Python problem. Not a years-of-experience problem. If you can't explain what your numbers mean, you...

I did some digging with the help of ChatGPT and Claude. Here are 4 tech areas you can still explore in 2026, backed by data: • AI/ML – Data Analytics falls here • Cloud & Infrastructure • Security & Governance • Data Engineering Let me...

Common SQL errors and what they REALLY mean "Column ambiguously defined" You joined tables with the same column name. Fix: Add table aliases (customers.id not just id) "Not a single-group group function" You mixed aggregated + non-aggregated columns. Fix: Add all non-aggregated columns to GROUP BY "Division...
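A small sketch of the first error and its fix, reproduced with Python's built-in sqlite3 module. The quoted messages above look like Oracle wording; SQLite phrases the same problem as "ambiguous column name", and the tables here are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER, name TEXT)")
conn.execute("CREATE TABLE orders (id INTEGER, customer_id INTEGER)")
conn.execute("INSERT INTO customers VALUES (1, 'Ada')")
conn.execute("INSERT INTO orders VALUES (10, 1)")

join = "FROM customers JOIN orders ON customers.id = orders.customer_id"

# Both tables have an `id` column, so a bare `id` cannot be resolved.
try:
    conn.execute(f"SELECT id {join}")
    error_message = None
except sqlite3.OperationalError as exc:
    error_message = str(exc)

# Fix: qualify each column with its table name (or an alias).
fixed_rows = conn.execute(f"SELECT customers.id, orders.id {join}").fetchall()

print(error_message)
print(fixed_rows)
```

The GROUP BY error is not shown because SQLite tolerates bare non-aggregated columns, so it would not raise here.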
Aspiring Data Analyst? Don’t overcomplicate it. Start building projects with tools like Excel, SQL, and Power BI.
Working in entertainment analytics I am often asked how best to position a title for success. But data can help you aim more accurately and efficiently. What it can’t do is provide the single most important element to success: a...
I see data contracts and data quality as overlapping but different: Data contracts: what is the data and how do we enforce it Data products: why do we need this data In practice, I'd argue for asset-based data quality assertions. Every time a...
eczachly I hate Snowflake micro partitions and optimizations for a few reasons - they make data modeling lazy. If you don’t have to understand the partitioning or shape of your data, you can just slap the data into Snowflake and call it a...
From Zach Wilson, three signs your pipeline isn't idempotent: 1. It uses INSERT INTO instead of INSERT OVERWRITE or MERGE 2. Date filters have "date > start" but no "date < end" - this causes exponential backfill costs 3. Source tables are always...
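A minimal sketch of signs 1 and 2, using SQLite from Python as a stand-in for a warehouse. SQLite has no INSERT OVERWRITE, so delete-then-insert inside one transaction is used here to emulate it; the table, column, and function names are hypothetical.

```python
import sqlite3


def new_conn():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE daily_events (event_date TEXT, value INTEGER)")
    return conn


def load_day_append(conn, day, values):
    """Not idempotent: every rerun appends the same rows again."""
    with conn:
        conn.executemany(
            "INSERT INTO daily_events VALUES (?, ?)", [(day, v) for v in values]
        )


def load_day_overwrite(conn, day, values):
    """Idempotent: replace the day's partition in a single transaction."""
    with conn:
        conn.execute("DELETE FROM daily_events WHERE event_date = ?", (day,))
        conn.executemany(
            "INSERT INTO daily_events VALUES (?, ?)", [(day, v) for v in values]
        )


def rows_in_window(conn, start, end):
    """Bounded filter: start inclusive, end exclusive."""
    return conn.execute(
        "SELECT COUNT(*) FROM daily_events WHERE event_date >= ? AND event_date < ?",
        (start, end),
    ).fetchone()[0]


def count_after_two_runs(loader):
    conn = new_conn()
    for _ in range(2):  # simulate a rerun or backfill of the same day
        loader(conn, "2024-01-01", [1, 2])
    return rows_in_window(conn, "2024-01-01", "2024-01-02")


print(count_after_two_runs(load_day_append))
print(count_after_two_runs(load_day_overwrite))
```

Rerunning the append loader doubles the day's rows, while the overwrite loader leaves the table the same no matter how many times it runs.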
As a CS girlie, I started my journey in Analytics and to this day, I still believe it has one of the lowest barriers to entry into a career in tech. The barrier has never been lower. If you’re thinking about...
Ingest Structure Learn (ISL) is the new ETL. It used to be the case that a company would try to license this kind of data as an “edge”. I’ve seen many companies in SV try to make this claim. That...

Move over Tableau and Power BI. There's a new Python library that automates Business Intelligence with AI using Text2SQL. Let me introduce you to WrenAI:
Branches can become profit engines, not cost centers, if supported by unified data and modern infrastructure. Execution is the difference. We discuss this with Benjamin Conant of @alkamitech and co-founder of @Mantl_tech. Watch the full video now: https://t.co/Jamreeb077 https://t.co/L1lau9ecEU

Most portfolios fail because they stop at “model accuracy.” A good AI/DS portfolio has: 1. A model that predicts something the business can act on 2. An AI agent that turns outputs into next steps It's that simple. Want help?
I discovered a new favorite use of AI Agents. Get ☕️, it's a bit long: If you follow the postgres-hackers list, you know this pattern: - Someone submits a patch - Someone else raises performance concerns with the patch The rational thing to do...
After years in data engineering, I've realized the job is mostly pattern recognition. You see a problem. You recognize it as a variant of a problem you've solved before. You apply a known solution with modifications. This is why experience matters more...

Turning a DataFrame into a presentation-ready table in Python. Recently I tried a library called Great Tables and it makes formatting tables very easy. - Works with Pandas & Polars - 19 formatting methods (currency, percentages, dates, scientific notation) - Export to HTML, LaTeX,...
XGBoost Tips from 5x Kaggle Grandmaster Chris Deotte Top 5 ways to improve your ML models: