Know What's Happening in Big Data

Today's Big Data Pulse

CRN’s 2026 Big Data 100 maps $31.8B market surge and AI‑driven vendor moves

The report projects the big‑data market to reach $31.8 billion in 2026, driven by rapid AI adoption and new LLM‑enabled analytics. It highlights Alteryx’s $4.4 billion private‑equity buyout, AtScale’s Snowflake‑led financing round, and Hex’s $70 million Series C funding.

AI Fails without Clean, Documented, Owned Data
Social | Mar 20, 2026

Most companies experimenting with AI are not struggling with models. They’re struggling with:
- messy internal data
- inconsistent schemas
- no documentation
- no data ownership

You can’t plug OpenAI into chaos and expect magic. Data hygiene is important for AI.

By Ebere Oyek (Nelo) — Data | AI | ML

Planet Labs Posts 26% Revenue Rise and First Annual EBITDA Profit in Q4 2026
News | Mar 20, 2026

Planet Labs announced $307.7 million total revenue for 2026, a 26% year‑over‑year increase, and $86.8 million in Q4, up 41% YoY. The company posted its first full‑year adjusted EBITDA profit of $15.5 million and generated $52.9 million in free cash flow, driven by expanding...

By Pulse

From DLT to Lakeflow Declarative Pipelines: A Practical Migration Playbook
News | Mar 19, 2026

Databricks is rebranding Delta Live Tables as Lakeflow Spark Declarative Pipelines, adding open‑source Spark alignment and new features. Existing DLT pipelines run unchanged, but Databricks recommends updating imports, decorators, expectations, and CDC logic to the new `dp` API. The migration...

By DZone – DevOps & CI/CD

How to Build an Effective Big Data Strategy
News | Mar 19, 2026

Smart organizations leverage big data to boost performance, but without a clear strategy they risk duplicated projects, compliance breaches, and wasted spend. The article outlines a four‑step framework—defining business goals, assessing data readiness, prioritizing use cases, and creating a flexible...

By TechTarget SearchERP

IQIYI Repurchases $207.8 Million of Convertible Notes, Leaves $259K Outstanding
News | Mar 19, 2026

iQIYI finished a $207.8 million repurchase of its 6.50% convertible senior notes due 2028, leaving only $259,000 of principal outstanding. The move reduces the company's debt load but comes as its shares trade near a 52‑week low and analysts flag a...

By Pulse

LightningChart Introduces No-Code Visualization Platform Dashtera
News | Mar 19, 2026

LightningChart unveiled Dashtera, a no‑code, web‑based analytics platform that leverages GPU‑accelerated rendering to display up to 100 million data points in real time. The solution removes the need for extensive implementation projects, data reduction, or custom integration, delivering instant zoom and...

By SD Times

Informatica Adds Microsoft Fabric Support and Opens Swiss Data Center
News | Mar 19, 2026

Informatica announced general availability of Microsoft Fabric Open Mirroring within its Intelligent Data Management Cloud (IDMC) and launched a new Azure‑based IDMC delivery point in Switzerland. The Open Mirroring feature lets customers synchronize data between OneLake and Fabric Data Warehouse...

By ChannelE2E

Master the 10 Essential Clustering Techniques
Social | Mar 19, 2026

The 10 types of clustering that all data scientists need to know. Let's dive in:

By Matt Dancho

CollectForU Expert and Debt Hunter Reveal 70% of Hong Kong SMEs Lack Credit Defenses
News | Mar 19, 2026

Credit‑management firms CollectForU Expert and Debt Hunter released a joint report on March 16 showing more than 70% of Hong Kong SMEs lack solid credit‑defense mechanisms, leaving them vulnerable to liquidity strain. The study flags the 90‑day delinquency mark as...

By Pulse

Interview: Huy Dao, Director of Data and Machine Learning Platform, Booking.com
News | Mar 19, 2026

Booking.com’s data and machine‑learning platform, led by Huy Dao, has completed a seamless migration from on‑prem Hadoop to a Snowflake‑based cloud ecosystem. The new Booking Data Exchange serves over 1,500 practitioners, handling petabytes of data and billions of daily predictions...

By ComputerWeekly – DevOps

Dagster: Asset‑First Orchestration Over Task‑Centric Pipelines
Social | Mar 19, 2026

Dagster has a steep learning curve but a payoff. It is Vim for orchestration. The mental model shift: Dagster thinks in assets, not tasks. You define what data should exist, not what steps to run. The engine figures out dependencies and...

By SSP Data
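
The asset-first idea in the post above can be sketched in a few lines of plain Python. This is only an illustration of the mental model and does not use the Dagster library: the `asset` decorator, the `ASSETS` registry, and `materialize` are hypothetical names invented for this sketch, and the sample assets are made up. Each function declares a piece of data that should exist, and its parameter names say which other assets it needs; the resolver works out the order.

```python
import inspect

ASSETS = {}  # hypothetical registry: asset name -> function that computes it


def asset(fn):
    """Register a function as an asset; its parameter names are its upstream assets."""
    ASSETS[fn.__name__] = fn
    return fn


def materialize(name, cache=None):
    """Compute an asset after recursively computing everything it depends on."""
    cache = {} if cache is None else cache
    if name not in cache:
        fn = ASSETS[name]
        deps = inspect.signature(fn).parameters
        # Each dependency is resolved first, and only once per call tree.
        cache[name] = fn(**{dep: materialize(dep, cache) for dep in deps})
    return cache[name]


@asset
def raw_orders():
    # Made-up sample data for the sketch.
    return [{"id": 1, "amount": 40}, {"id": 2, "amount": 60}]


@asset
def order_total(raw_orders):
    return sum(row["amount"] for row in raw_orders)


@asset
def report(order_total):
    return f"total={order_total}"


print(materialize("report"))
```

Nothing in the sketch says "run step A, then step B": asking for `report` is enough, and the order follows from the declared inputs. A real orchestrator would add cycle detection, scheduling, and persistence, which are omitted here.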

Databricks' Genie Code Automates Data Science and Engineering
Social | Mar 19, 2026

I shared my thoughts with @Infoworld on the new Genie Code from @Databricks https://t.co/54nQ6q4vAQ The goal is to highly automate data science and engineering tasks.

By Dion Hinchcliffe

SAPinsider Las Vegas: Why Data Strategy Must Start With Trust
News | Mar 18, 2026

At SAPinsider Las Vegas 2026, Ingo Hilgefort warned that data‑driven AI projects fail when organizations lack trust in their data. He argued that inconsistent definitions and poor governance cause users to rebuild dashboards to verify numbers, stalling analytics adoption. Hilgefort...

By ERP Today

How a Nonprofit Transforms Data with Cloudera and AI
News | Mar 18, 2026

Rare Hope, a nonprofit focused on rare‑disease hypotheses, adopted Cloudera’s hybrid data‑and‑AI platform to turn unstructured research papers and medical images into structured insights. Using PySpark pipelines, the organization extracts disease‑drug correlations and feeds them to large language models for...

By AI Business

Federal AI Needs a New Data Foundation. Dell’s Platform Is Built for It.
News | Mar 18, 2026

The federal government is accelerating its adoption of generative AI, retrieval‑augmented generation, and early agentic systems, but agencies are constrained by legacy data architectures. Dell’s AI data platform offers a secure, federated foundation that lets classified and regulated data remain...

By FedTech Magazine

Taming the IoT Firehose: How Utilities Are Scaling Cloud DataOps for Smart Metering
News | Mar 18, 2026

Utilities are grappling with an "IoT firehose" as smart meters generate massive, continuous telemetry streams. To tame the volume, they are adopting cloud‑based DataOps frameworks that automate ingestion, normalize data, and deliver analytics‑ready datasets at scale. Automated, event‑driven pipelines enable...

By SmartData Collective

Universal Semantic Layer Needed for Multi-Tool Data Access
Social | Mar 18, 2026

The semantic layer isn't new. SAP BusinessObjects had one in 1991. What's new is the need for a universal semantic layer that works across BI tools, notebooks, and applications. When you only had one BI tool, that tool's semantic layer was enough....

By SSP Data

Microsoft Promises All-in-One Database Wrangling Hub on Fabric
News | Mar 18, 2026

Microsoft unveiled Database Hub, an early‑access tool built on the Fabric data platform that consolidates management of Azure SQL Server, Cosmos DB, PostgreSQL, MySQL, Azure Arc‑enabled SQL, and other services. The hub offers a single pane of glass for on‑premises,...

By The Register

Lloyd's Register, OneOcean Report Warns Shipping Must Master Data to Remain Competitive
News | Mar 18, 2026

Lloyd’s Register and OneOcean released a report warning that the maritime sector’s surge in operational data is hampered by fragmentation and low standardisation, jeopardising compliance and commercial advantage. Their Digital Maturity Index shows data standardisation at 2.45 / 4 while overall digital...

By MarineLink

Oracle Announced the General Availability of Oracle Analytics Server 2026
News | Mar 18, 2026

Oracle announced the general availability of Oracle Analytics Server 2026, delivering a suite of enhancements aimed at boosting adoption, performance, and governed self‑service. New defaults for the "Limit Values By" filter and a redesigned State menu streamline workbook interactions. The...

By Database Trends & Applications (DBTA)

DuckDB, AI, and the Future of Data Engineering
Podcast | Mar 18, 2026

In this episode, Dan Beach chats with State Farm staff engineer Matt Martin about his journey from industrial engineering to data engineering, his deep involvement with DuckDB, and the evolving landscape of data platforms. Matt shares how early automation with...

By Data Engineering Central

Nvidia GTC 2026: DDN Launches IndustrySync Pipelines for Financial Services and Life Sciences AI
Blog | Mar 18, 2026

DDN announced IndustrySync Pipelines, pre‑integrated AI data workflows for Financial Services and Life Sciences, deployable on its HyperPOD platform in days instead of months. The Financial Services pipeline promises up to 150× faster risk simulations and five‑minute risk metric refreshes,...

By StorageNewsletter

DataOps Engineers: The Underrated Backbone of AI Efficiency
Social | Mar 18, 2026

The most underrated AI role right now: DataOps Engineer. Not the ML engineer. Not the data scientist. The person who designs automation and testing infrastructure that makes everyone else dramatically more effective. Infrastructure that runs without you. That's the whole job. https://t.co/Cng5iC1BEB

By Yves Mulkers

GHD Appoints David McLaren to Lead Data and AI Capabilities Globally
News | Mar 18, 2026

GHD has appointed David McLaren as its Enterprise Data & AI Leader, based in Toronto. McLaren brings experience from Coca‑Cola Canada Bottling, where he built enterprise‑scale data platforms, automation and governance. At GHD he will steer the development of an...

By SalesTech Star

Nigerian Firms Chase Data Analytics Skills as 8% Revenue Boost Spurs Demand
News | Mar 18, 2026

Nigerian companies are rapidly adopting data analytics, motivated by research showing an average 8% revenue increase for firms that use analytics tools. The shift is creating a talent crunch as businesses, from banks to retailers, scramble to upskill staff and...

By Pulse

Data Lineage Documentation Matters for Enterprise Reliability
News | Mar 18, 2026

Enterprises are increasingly recognizing that knowing where data resides is insufficient without visibility into its lifecycle. Data lineage—tracking origin, transformations, and access—provides the transparency needed for accountability, data quality, compliance, and reduced technical debt. The article highlights how poor lineage...

By TechTarget SearchERP

Ibrar Ahmed: RAG With Transactional Memory and Consistency Guarantees Inside SQL Engines
News | Mar 18, 2026

Current retrieval‑augmented generation (RAG) systems were built for static document search, which creates consistency problems when multiple agents write concurrently. Without transactional control, memory updates can become partially committed, leading to answer drift and silent corruption. The article proposes using...

By Planet PostgreSQL

Nvidia‑Backed Starcloud Seeks FCC Approval for 88,000‑Satellite AI Data Center Constellation
News | Mar 18, 2026

Redmond‑based Starcloud, an Nvidia‑backed startup, filed an FCC application on March 16, 2026 to deploy up to 88,000 low‑Earth‑orbit satellites that would act as orbital data centers for AI workloads. The plan envisions a dusk‑dawn, sun‑synchronous constellation operating between 600...

By Pulse

Nvidia Unveils Groq 3 Inference Chip to Power Multi‑Agent AI at GTC 2026
News | Mar 18, 2026

On March 16, 2026 at its GTC conference in San Jose, Nvidia announced Groq 3, a dedicated inference processor built on technology licensed from Groq Inc. The chip arrives in 256‑LPU LPX server racks with 128 GB of solid‑state RAM and 40 PB/s...

By Pulse

Nvidia Unveils $1 Trillion AI Roadmap, Vera CPUs & BlueField‑4 Storage at GTC 2026
News | Mar 18, 2026

On March 16, 2026, Nvidia CEO Jensen Huang announced at the GTC developer conference in San Jose that the company expects $1 trillion in AI chip orders through 2027, unveiled the Vera Rubin CPU/GPU platform, and introduced the BlueField‑4 STX reference...

By Pulse

IBM Finalizes $10B Confluent Deal, Making Real‑Time Data Core of Enterprise AI
News | Mar 18, 2026

On March 18, 2026, IBM announced the completion of its $10 billion acquisition of data‑streaming platform Confluent, cementing the deal in the United States. The transaction gives IBM full ownership of Confluent’s Apache‑Kafka‑based technology, which IBM says will become the engine...

By Pulse

Intelligence and Interoperability: Data Catalog Must-Haves for AI Data Governance
News | Mar 17, 2026

Enterprises must move beyond static data catalogs toward a universal AI catalog that combines a business‑friendly semantic layer with cross‑platform interoperability. The semantic layer supplies machine‑readable context, preventing misinterpretations by AI agents, while universal interoperability ensures governance, security, and metadata...

By Snowflake Blog

IBM Joins Data Platform Race with Confluent Acquisition
Social | Mar 17, 2026

With the latest acquisition of Confluent by IBM, they follow up on the Fivetran, Databricks, and Snowflake stack. Or what do you think? With the latest acquisition in data engineering, it's a race of who gets the most complete data platform...

By SSP Data

Orchestration Turns Data Stack Flexibility Into Cohesion
Social | Mar 17, 2026

The Modern Data Stack promised best-of-breed tools that work together seamlessly. The paradox: the more tools you pick, the more integration work you create. One perspective I find helpful: Orchestration as the connective tissue. A good orchestrator doesn't just schedule jobs -...

By SSP Data

Datadobi Announces Early Access Program for Data Access Review
Blog | Mar 17, 2026

Datadobi has launched an Early Access Program for Data Access Review, a new permissions‑intelligence capability for its StorageMAP platform. The feature adds visibility into who can access unstructured data, helping organizations spot excessive, outdated, or inappropriate rights. Selected current StorageMAP...

By The Manufacturing Connection

IBM Acquires Confluent to Power Real‑time Enterprise AI
Social | Mar 17, 2026

.@IBM Completes Acquisition of Confluent, Making Real Time Data the Engine of Enterprise AI and Agents https://t.co/QqwqJPCT4P >> Congrats. A key augmentation for the IBM AI capabilities. Good news for customers. #NextGenApps https://t.co/aCKH7wuesW

By Holger Müller

Databricks, Accenture Launch Joint Business Venture Focused On Spurring AI Development
News | Mar 17, 2026

Databricks and Accenture have launched the Accenture Databricks Business Group, a joint venture designed to accelerate enterprise adoption of the Databricks Data Intelligence Platform for AI and data workloads. Backed by more than 25,000 Databricks‑trained professionals, the group will help...

By CRN (US)

Agentic AI Is Forcing Analytics and Operations to Converge
News | Mar 17, 2026

Investments in data platforms have shifted from siloed warehouses to unified, sovereign foundations as agentic AI collapses analytics, operations, and AI into single workflows. Enterprises now need platforms that govern operational execution, high‑concurrency analytics, and AI reasoning together, rather than...

By The Register – AI/ML (data-related)

Better Cotton Funds On-Farm Data-Collecting Project
News | Mar 17, 2026

The Better Cotton Initiative (BCI) is launching a $200,000 on‑farm data‑collection effort in partnership with the Soil Health Institute and ag‑tech provider Growers Guide. The program will analyze soil, plant tissue and sap samples across the Southeast and other Cotton Belt...

By Sourcing Journal

Big Changes in Latest GigaOm Unstructured Data Management Radar Report
News | Mar 17, 2026

GigaOm released version 6 of its Unstructured Data Management Radar, expanding the vendor set to 23 and appointing James Brown as the new analyst. The report reclassifies 11 suppliers as leaders and 12 as challengers, with notable moves such as Panzura shifting...

By Blocks & Files

Day 44: Real-Time Monitoring Dashboard with Kafka Streams
Blog | Mar 17, 2026

The post walks through building a production‑grade real‑time monitoring dashboard that ingests over 40,000 events per second using Kafka Streams. It shows how windowed aggregations, percentile calculations, and anomaly detection run on RocksDB‑backed state stores with exactly‑once guarantees. The stream...

By Hands On System Design Course - Code Everyday
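
To make the windowed aggregation and percentile idea from the summary above concrete, here is a minimal plain-Python sketch. It is not the Kafka Streams API and not the post's code: the function name `tumbling_window_stats`, the `(timestamp_ms, latency_ms)` event shape, and the sample events are assumptions made for illustration. It groups events into fixed (tumbling) windows and reports a count and a nearest-rank percentile per window.

```python
import math
from collections import defaultdict


def tumbling_window_stats(events, window_ms=1000, percentile=95):
    """Group (timestamp_ms, latency_ms) events into fixed windows and summarize each."""
    windows = defaultdict(list)
    for ts, latency in events:
        # Every event belongs to exactly one window, keyed by the window's start time.
        window_start = ts - (ts % window_ms)
        windows[window_start].append(latency)

    stats = {}
    for start, values in sorted(windows.items()):
        values.sort()
        # Nearest-rank percentile: the smallest value with at least
        # `percentile` percent of the samples at or below it.
        rank = max(1, math.ceil(percentile / 100 * len(values)))
        stats[start] = {"count": len(values), "p": values[rank - 1]}
    return stats


# Made-up sample events: three in the first second, two in the next.
events = [(0, 10), (200, 30), (900, 20), (1000, 50), (1500, 70)]
result = tumbling_window_stats(events, window_ms=1000, percentile=50)
print(result)
```

A streaming engine keeps this per-window state incrementally in a state store instead of holding all events in memory, and it also has to handle late or out-of-order events, which this batch-style sketch ignores.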

Noémi Ványi: We Skipped the OLAP Stack and Built Our Data Warehouse in Vanilla Postgres
News | Mar 17, 2026

Xata built a product analytics warehouse using vanilla Postgres, consolidating identity, usage, billing, and event data from four separate systems. They employed materialized views, pg_cron schedules, and database branches to flatten JSONB events, refresh data daily, and iterate safely on...

By Planet PostgreSQL

Visualizing the World with Planetary Computer
News | Mar 17, 2026

Microsoft’s Planetary Computer offers a free, standards‑based geospatial data platform that aggregates curated datasets from government, academic and commercial sources. It provides STAC‑compatible APIs, Python and R SDKs, and an Explorer UI for rapid prototyping of environmental applications such as...

By InfoWorld

Coles Sets up Standard Data Streaming Platform Groupwide
News | Mar 16, 2026

Coles Group has deployed an enterprise‑wide data streaming platform built on Confluent Cloud, unifying its real‑time data pipelines under a single Apache Kafka foundation. Previously, isolated event‑streaming stacks created silos, inconsistent models, and governance challenges. The new "enterprise event platform"...

By iTnews (Australia) – Government

IBM, Nvidia Tackle AI Data Woes
News | Mar 16, 2026

IBM expanded its partnership with Nvidia at GTC 2026 to address enterprise AI data management challenges. The collaboration integrates Nvidia’s cuDF toolkit with IBM’s Presto query engine and adds Nemotron models to IBM’s Docling PDF reader. Nvidia GPUs will also power...

By CIO Dive

Free Datasets + LLM Queries on Snowflake, BigQuery
Social | Mar 16, 2026

Snowflake and BigQuery have free datasets you can use to practice SQL with real data. Even better: LLMs are integrated, so you can query in natural language.

By Ebere Oyek (Nelo) — Data | AI | ML

AI Adoption Demands Stronger, More Responsive Data Foundations
Social | Mar 16, 2026

As AI moves to core operations, pressure on the data layer also intensifies. I canvassed leaders on the work required to build a well-functioning data environment responsive to today’s AI initiatives. (My latest in Database Trends) https://t.co/X8ar2pKnTZ @BigDataQtrly

By Joe McKendrick

Nvidia Plans to Make All Unstructured Data Structured
Blog | Mar 16, 2026

Nvidia announced a plan to structure hundreds of zettabytes of unstructured data each year, turning it into the ground‑truth foundation for artificial intelligence. The initiative relies on confidential computing, ensuring that even the platform operator cannot view the raw data....

By Next Big Future – Quantum

Online Feature Store for AI and Machine Learning with Apache Kafka and Flink
News | Mar 16, 2026

Wix.com has built a real‑time online feature store using Apache Kafka and Apache Flink to power personalized recommendations for its 200 million users. The architecture streams over 70 billion events per day through 50,000 Kafka topics, with FlinkSQL performing low‑latency transformations and...

By DZone – Big Data Zone