Confessions of a Data Guy - Latest News and Information

Confessions of a Data Guy

Publication

A practitioner’s take on data engineering practice and careers

Recent Posts

Spark, Lakehouse & AI: A Deep Conversation with Bart Konieczny
News•Feb 25, 2026


In a recent Data Engineering Central podcast, Bart Konieczny discussed the evolving synergy between Apache Spark, lakehouse architectures, and artificial intelligence. He highlighted Spark's latest performance enhancements, including Catalyst optimizer refinements and native GPU acceleration. Konieczny explained how lakehouses bridge traditional data warehousing with machine‑learning pipelines, simplifying data governance and real‑time analytics. The conversation also covered the growing open‑source ecosystem that accelerates enterprise adoption of AI‑driven data platforms.

By Confessions of a Data Guy
Introduction to Databricks SQL Temporary Tables
News•Feb 23, 2026


Databricks has introduced session‑scoped temporary tables for Databricks SQL, implemented as physical Delta tables stored in Unity Catalog. The tables persist only for the duration of a Spark SQL session and are automatically reclaimed, supporting full CRUD operations. This addition...

By Confessions of a Data Guy
Temporary Tables in Databricks SQL | Do You Actually Need Them?
News•Feb 17, 2026


The article reviews temporary tables in Databricks SQL, explaining how they store intermediate results for the duration of a session and can be referenced across multiple statements. It compares them to Common Table Expressions, highlighting performance gains when avoiding repeated...

By Confessions of a Data Guy
Migrating to Databricks – A Guide
News•Feb 13, 2026


The guide cautions that moving to Databricks won’t fix weak data fundamentals; organizations must first establish clear dev‑prod separation, version‑controlled code, and cost accountability. It urges teams to define real needs, avoid over‑architecting, and split infrastructure choices from data‑architecture decisions....

By Confessions of a Data Guy
Why Declarative (Lakeflow) Pipelines Are the Future of Spark
News•Feb 11, 2026


Spark is evolving from low‑level RDD and notebook‑driven workflows to declarative pipelines, branded as Lakeflow on Databricks. The new framework lets engineers define flows, datasets, and pipelines in a configuration‑first manner, while Spark handles execution for both batch and streaming....

By Confessions of a Data Guy
Robin Moffatt on the Evolution of Data Engineering: From Batch Jobs to Real-Time | Podcast Interview
News•Feb 11, 2026


Robin Moffatt discusses how data engineering has shifted from traditional batch processing to real‑time streaming in a recent podcast interview. He outlines the technical drivers—cloud scalability, event‑driven architectures, and low‑latency analytics—that enable continuous data pipelines. Moffatt also highlights emerging tools...

By Confessions of a Data Guy
The Lakehouse Architecture | Multimodal Data, Delta Lake, and Data Engineering with R. Tyler Croy
News•Feb 3, 2026


The article introduces the lakehouse architecture as a unified platform that combines the scalability of data lakes with the performance of data warehouses. It highlights how Delta Lake brings ACID transaction support and schema enforcement to open‑source storage, enabling reliable...

By Confessions of a Data Guy
Building Credible Data Systems | Hoyt Emerson on The Full Data Stack
News•Jan 30, 2026


Hoyt Emerson discusses how organizations can construct credible data systems that deliver trustworthy insights. He emphasizes the need for rigorous data governance, automated testing, and clear ownership across the data lifecycle. The conversation highlights real‑world examples where poor data quality...

By Confessions of a Data Guy
Data Engineering Career Path: From Circuits to Pipelines
News•Jan 30, 2026


The article maps a data‑engineering career trajectory that begins with hardware‑oriented roles and ends in building scalable data pipelines. It highlights how circuit‑design thinking translates into logical data modeling, while emphasizing the need to acquire SQL, Python, and cloud‑native tools....

By Confessions of a Data Guy
Apache Airflow vs Databricks Lakeflow | The Orchestration Battle
News•Jan 30, 2026


The article pits Apache Airflow, the open‑source workflow orchestrator, against Databricks Lakeflow, a newer Lakehouse‑native pipeline engine. It outlines core differences in architecture, integration depth with cloud data platforms, and pricing models. Airflow remains favored for heterogeneous environments, while Lakeflow...

By Confessions of a Data Guy
This One Polars Pattern Makes Code 10x Cleaner
News•Jan 30, 2026


The article highlights a single Polars pattern—using the pipe operator—to streamline data‑frame code, cutting boilerplate and boosting readability up to tenfold. By chaining transformations in a lazy execution graph, developers avoid intermediate variables and gain clearer, more maintainable pipelines. The...

By Confessions of a Data Guy
Apache Arrow ADBC Database Drivers
News•Jan 16, 2026


Apache Arrow’s ADBC (Arrow Database Connectivity) introduces a modern, columnar‑native driver that can replace or complement traditional ODBC/JDBC stacks. By moving Arrow RecordBatches end‑to‑end, it eliminates row‑by‑row marshaling and dramatically reduces serialization overhead. Benchmarks show Python ADBC achieving roughly 275 k...

By Confessions of a Data Guy