Know What's Happening in Big Data

Today's Big Data Pulse

Leadership Gaps Hamper Data Engineering Teams, Survey Finds

Three 2026 surveys of 1,629 data professionals reveal organizational issues now dominate data‑engineering bottlenecks. In January, weak leadership direction and poor requirements accounted for 40% of top‑bottleneck votes, while by April 50% cited lack of clear ownership as the biggest pain point. Legacy systems and tooling were far lower priorities, at 25% and under 5% respectively.

Maharashtra Unveils $200 Billion Data‑Centre Investment Pipeline
News | May 25, 2026

Maharashtra's state government disclosed a Rs 16.69 lakh crore ($200 bn) data‑centre investment pipeline covering 44 mega‑projects. The plan promises 23,800 MW of IT capacity and 146,000 direct jobs, and it positions the state as India's leading data hub.

By Pulse
Western Digital's Five‑Year AI Roadmap Puts Storage Ahead of Compute
News | May 24, 2026

Western Digital unveiled a five‑year AI infrastructure roadmap that shifts focus from expanding compute clusters to building ultra‑high‑capacity storage systems. The plan highlights 40TB UltraSMR ePMR drives now in qualification and a push toward 100TB+ HAMR HDDs, backed by long‑term...

By Pulse
AWS and SAP Launch Five Tools at SAPPHIRE 2026 to Cut Migration Time to Days
News | May 24, 2026

AWS and SAP announced five new capabilities at the SAPPHIRE 2026 conference, promising to reduce SAP cloud migration cycles from weeks to days for the 440,000+ companies that run SAP worldwide. The tools combine native AWS orchestration, private connectivity, generative...

By Pulse
Google Adds Continuous SQL Queries and Cross‑region Spanner Access to BigQuery
News | May 24, 2026

Google has rolled out continuous SQL queries and cross‑region federated access to Cloud Spanner in BigQuery, enabling always‑on analytics pipelines without extra egress fees. The upgrades turn the data warehouse into a reactive platform for real‑time decision making.

By Pulse
Databricks Launches Genie Code AI Agent and Deepens SAP Integration
News | May 24, 2026

Databricks announced the deployment of Genie Code, an autonomous AI agent embedded in its Lakeflow environment, and a tighter integration with SAP Business Data Cloud via Unity Catalog. The moves aim to automate data‑engineering workflows, improve governance, and expand the...

By Pulse
Google Cloud Enables Cross‑Engine Apache Iceberg Support in BigQuery
News | May 24, 2026

Google Cloud announced a preview that adds cross‑engine Apache Iceberg support to BigQuery, allowing the same Iceberg tables to be accessed from Spark, Flink, Trino and BigQuery without data duplication. The serverless Iceberg REST catalog aims to streamline lakehouse workflows...

By Pulse
Denodo Adds AWS Integrations to Power Governed Agentic AI in the Middle East
News | May 24, 2026

Denodo announced a suite of integrations with Amazon Web Services that embed its data‑virtualization platform into Bedrock AgentCore, SageMaker and Quick. The move aims to give AI agents trusted, real‑time access to governed data across hybrid and multi‑cloud environments, a...

By Pulse
Databricks Zerobus Streaming Ingestion for Delta Lake House
Blog | May 23, 2026

Databricks introduced Zerobus, a high‑throughput streaming service that writes data directly into Delta Lake tables, removing the need for external message buses like Kafka. The Python SDK (and others for Rust, Go, TypeScript, Java) lets developers stream Apache Arrow RecordBatches...

By Confessions of a Data Guy
Span and Nvidia Launch XFRA Mini‑Data‑Center Units to Turn Homes Into Edge AI Nodes
News | May 23, 2026

Smart‑panel startup Span, together with Nvidia, introduced XFRA—a compact AI compute unit the size of an air‑conditioner that can be installed in a single‑family home. Each node packs 16 GPUs, draws 12.5 kW, and 8,000 such nodes would equal the power...

By Pulse
Study Finds TikTok's Recommendation Engine Favours Anti‑Democratic Content in 2024 Election
News | May 23, 2026

Researchers at NYU Abu Dhabi’s AI and Society Lab used 323 bot accounts to audit TikTok’s For You page and found the platform disproportionately recommended anti‑Democratic videos during the 2024 election. The study, published in Nature, highlights algorithmic bias that...

By Pulse
WisdomAI Deploys Autonomous Analytics Agents to Streamline Enterprise Data Workflows
News | May 23, 2026

WisdomAI unveiled its Analytics Agents platform, enabling enterprises to design, test and deploy AI‑driven agents that autonomously explore, clean and act on data. The solution plugs into over 200 native integrations and is already being used by fintech firm Trumid...

By Pulse
Nvidia's Earnings Reveal AI Spending Shifts Beyond GPUs to Data Infrastructure
News | May 23, 2026

Nvidia reported record first‑quarter fiscal 2027 revenue of $81.6 billion, with data‑center sales climbing 92% to $75.2 billion. The company also announced a new reporting split that isolates a fast‑growing “ACIE” segment and highlighted networking revenue jumping 199% to $14.8 billion, underscoring a...

By Pulse
Semantic Layer Summit 2026 Positions Business Context as Core Infrastructure for Enterprise AI
News | May 23, 2026

More than 6,000 data leaders gathered at the Semantic Layer Summit 2026, where industry giants like Snowflake, Databricks and Anthropic affirmed the semantic layer as the foundation for accurate, governed enterprise AI. The summit underscored the need for shared business...

By Pulse
Michigan's Education Database Moves From Vision to Infrastructure
News | May 22, 2026

Michigan is advancing its MiGreatDataLake project by focusing on secure, interoperable infrastructure before deploying AI-driven analytics. The proof‑of‑concept phase, launched in January 2025, validates a medallion‑style data pipeline that moves raw student records through bronze, silver, gold, and platinum layers....

By GovTech — Education (K-12)
Don't Go Dark: Visibility Is a Data Engineering Skill
Blog | May 22, 2026

Data engineers often work in silence, producing valuable but invisible changes that can go unnoticed for weeks. The article revisits Jeff Atwood’s “don’t go dark” rule—three weeks without a visible deliverable signals a risk of hidden problems. It explains why...

By Ghost in the data
Legacy Data Stacks Falter as AI Demands Real‑Time, Distributed Access
News | May 22, 2026

Enterprise data architectures built for batch queries and static pipelines are being outpaced by AI workloads that require instant, multi‑source data access. Analysts warn that firms that cling to legacy warehouses risk falling behind as AI‑driven operations migrate to lakehouse...

By Pulse
Delivering Successfully Governed Self-Service Analytics with Informatica and TrustLogix
News | May 22, 2026

A DBTA webinar featuring Informatica’s Vaibhav Suresh and TrustLogix’s Simon Thornell highlighted the growing chaos in self‑service analytics and presented a governance framework to tame it. They cited a finding that 70% of data leaders believe their most valuable insights sit in...

By Database Trends & Applications (DBTA)
D&B's Database of 642 Million Businesses Was Built for Humans, Not AI Agents. So They Rebuilt It.
News | May 22, 2026

Dun & Bradstreet rebuilt its Commercial Graph, a 642‑million‑business database, to serve AI agents rather than human analysts. The legacy system’s fragmented architecture and static relationships could not meet the sub‑second latency and dynamic data needs of machine‑driven credit, procurement,...

By VentureBeat
IBM and U.S. Commerce Dept. Launch $1 B Quantum Foundry, Part of $2 B CHIPS Initiative
News | May 22, 2026

IBM and the U.S. Department of Commerce announced a $1 billion grant to create America’s first purpose‑built quantum foundry for superconducting wafers. The award is part of a $2.013 billion CHIPS and Science Act package that also funds GlobalFoundries and other firms,...

By Pulse
Confluent Current London 2026 - Confidence in the New Streaming Age
News | May 22, 2026

At Confluent Current London 2026, CPO Shaun Clowes warned that legacy data pipelines are hindering the rise of agentic AI. He highlighted Confluent’s Tableflow, which streams data in real time to lakes and warehouses, eliminating batch ETL and lineage gaps. With Kafka...

By Diginomica
Pinewood.AI Expands Dealer BI Platform with New Modules
News | May 22, 2026

Pinewood.AI has added Accounting & Finance and Customer modules to its dealer Business Intelligence platform, completing the Core Insights tier. The new modules deliver live operational dashboards, automated reporting and unified metrics, aiming to replace manual spreadsheet processes. Pinewood claims dealers can...

By AM Online
DataOps Market Projected to Hit $32.7 B by 2035, Forecast Shows 21% CAGR
News | May 22, 2026

MarketGenics estimates the global DataOps market was worth $4.7 billion in 2025 and will expand to $32.7 billion by 2035, a compound annual growth rate of 21.4%. The surge reflects rising enterprise demand for agile data pipelines, AI‑powered analytics and cloud‑native automation.
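
The growth rate quoted above can be checked against the two market-size figures. The sketch below is a minimal illustration, assuming the standard compound annual growth rate formula and a ten-year span from 2025 to 2035; the function name `cagr` is a hypothetical helper, not something taken from the MarketGenics forecast.

```python
def cagr(start_value: float, end_value: float, years: int) -> float:
    """Return the compound annual growth rate as a fraction (0.10 means 10%)."""
    return (end_value / start_value) ** (1 / years) - 1


# Figures from the summary, in billions of US dollars.
# Assumption: the forecast period is the ten years from 2025 to 2035.
rate = cagr(4.7, 32.7, 2035 - 2025)
print(f"Implied CAGR: {rate:.1%}")  # prints "Implied CAGR: 21.4%"
```

Under these assumptions, the implied rate matches the 21.4% figure in the forecast.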

By Pulse
Panasonic Boosts Enterprise BI Speed with Databricks Lakeflow
News | May 22, 2026

Panasonic's central data infrastructure team has migrated legacy ETL pipelines to Databricks Lakeflow, slashing data ingestion windows from five to six hours down to minutes and reducing pipeline failures that previously occurred about ten times a year. The move, detailed in a Databricks...

By Pulse
Snowflake Secures $58B SaaS Deal with GSA, Offering Up to 50% Off Cloud AI Services
News | May 22, 2026

Snowflake announced a multi-year agreement with the U.S. General Services Administration’s OneGov framework, delivering AI‑enabled data‑cloud services to all federal agencies at 20%‑50% discounted compute rates and a 27% storage cut. The deal, running through September 2027, marks a major...

By Pulse
Big Data Analytics in U.S. Finance: From Frontier to Settled Discipline
News | May 21, 2026

Big data analytics in U.S. finance has moved from a frontier technology to a settled discipline, with cloud warehouses, lakehouses and streaming pipelines now commoditized. Proven use cases—Customer‑360, risk, fraud and regulatory analytics—consistently generate ROI, while speculative projects often bleed...

By TechBullion
WisdomAI Launches Autonomous Analytics Agents to Automate Enterprise Workflows
News | May 21, 2026

WisdomAI unveiled its Analytics Agents, AI‑driven tools that not only generate insights but also execute approved actions across enterprise data stacks. The agents, built on the company’s Federated Agentic Intelligence platform, aim to automate routine business processes while preserving auditability...

By Pulse
Architecting Petabyte-Scale Hyperspectral Pipelines on AWS
News | May 21, 2026

The article outlines a petabyte‑scale hyperspectral data pipeline on AWS that moves raw sensor cubes from remote fields to queryable tables using an S3‑SQS‑Lambda‑Batch ingestion flow, aggressive S3 lifecycle tiering, and an Apache Iceberg medallion lakehouse. Edge containers on NVIDIA...

By DZone – Big Data Zone
Confluent Launches Real‑Time AI Suite to Secure Streaming Data at Scale
News | May 21, 2026

Confluent, the data‑streaming pioneer now owned by IBM, rolled out new features in Confluent Intelligence and Confluent Cloud that unify the AI lifecycle, automate privacy controls and enable private connectivity to external models. The upgrades target the security and complexity...

By Pulse
Teradata Factory Offers an On-Prem Foundation for the Agentic Enterprise
News | May 21, 2026

Teradata announced the Teradata Factory, an on‑premises extension of its Autonomous Knowledge Platform built on Dell Technologies hardware. The solution unifies the full Teradata software stack—including AI Studio—under a single management plane and supports enterprise data warehousing, lakehouse, and GPU‑accelerated...

By Database Trends & Applications (DBTA)
Dell Tech World 2026: Mazda Builds AI-Ready Data Foundation with Dell
Blog | May 21, 2026

Mazda Motor Corp. has deployed Dell PowerScale to unify its design, development, and CAD data into a single, scalable storage platform. The new infrastructure expands capacity from roughly 4 PB to 10 PB while cutting storage cost per unit by 90 percent....

By StorageNewsletter
When "Garbage In, Garbage Out" Gets It Wrong
Podcast | May 21, 2026 | 43 min

In this episode, Terence Lee St. John, founder of Enly and lead author of the paper "From Garbage to Gold: A Data Architectural Theory of Predictive Robustness," explains why machine‑learning models can achieve state‑of‑the‑art performance even when trained on noisy,...

By The Data Exchange
Databricks Unveils Real-Time Fraud Accelerator, Spark RTM Cuts Latency 92%
News | May 21, 2026

Databricks introduced a new Solution Accelerator that pairs Spark Real‑Time Mode with its Lakebase service to detect credit‑card fraud in under 300 ms, claiming latency up to 92% lower than Apache Flink and citing $33 billion in annual losses from card fraud.

By Pulse
AI Success Depends on These Data Governance Metrics
News | May 20, 2026

Enterprises are realizing that traditional data‑governance dashboards, which focus on documentation and ownership, fall short for AI workloads. New metrics—such as lineage completeness, certified dataset usage, and pipeline observability—measure data trust at runtime, ensuring AI systems draw from reliable, up‑to‑date...

By EnterpriseAI
Informatica Launches Headless Data Services and Unified Agent Governance
News | May 20, 2026

Informatica, now a Salesforce subsidiary, introduced a headless version of its Intelligent Data Management Cloud and a unified Agent and Context Catalog. The move targets enterprise AI agents that need governed, context‑rich data without traditional UI constraints, addressing a survey‑found...

By Pulse
Washington Tightens AI Rules, Targeting Deepfakes and Data Governance
News | May 20, 2026

Washington announced a sweeping AI regulatory push, licensing Anthropic’s Mythos model and launching enforcement of the Take It Down Act to force platforms to delete non‑consensual deepfakes within 48 hours. The moves signal a new era of data‑centric oversight for...

By Pulse
Reviewing Azure OneLake: Unified Data Lake Architecture for Modern Solutions
Blog | May 20, 2026

Azure OneLake launches as a unified data lake platform that consolidates structured, semi‑structured, and unstructured data into a single logical repository. It natively blends lakehouse capabilities with Azure services such as Synapse, Fabric, and Power BI, delivering real‑time ingestion, robust governance...

By Architecture & Governance Magazine – Elevating EA
DeepMind's Co‑Scientist AI Tackles Cancer Drug Discovery with Big‑data Analytics
News | May 20, 2026

Google DeepMind unveiled Co‑Scientist, a multi‑agent AI platform designed to accelerate biomedical research. In a pilot for acute myeloid leukemia, the system shortlisted 30 drug candidates, three of which showed promising activity in lab tests. The launch signals a new...

By Pulse
Re-Air: The Rise of the Citizen Developer: Solving Business Problems with Alteryx and AI with Andy Macmillan
Podcast | May 20, 2026 | 50 min

In this re‑aired episode, Alteryx CEO Andy Macmillan discusses the evolution of the citizen developer—business users with enough technical skill to build data solutions—and how AI is reshaping that role. He explains Alteryx’s mission to democratize data preparation and analytics,...

By The Data Stack Show
Dell and Palantir Unveil On-Prem AI Operating System to Accelerate Enterprise Data Integration
News | May 19, 2026

Dell Technologies and Palantir announced a joint on‑premises AI operating system at Dell Technologies World. The solution combines Dell’s AI Factory hardware with Palantir’s Foundry and Ontology platforms to create a unified, governed semantic layer for sensitive enterprise data. The...

By Pulse
Denodo Launches AWS Integrations to Power Trusted Data Foundations for Agentic AI
News | May 19, 2026

Denodo announced native integrations with Amazon Web Services—SageMaker, Bedrock AgentCore and Quick—and placed its platform on the AWS Marketplace. The enhancements deliver zero‑copy, real‑time data with a unified semantic layer, aiming to accelerate agentic AI deployments in finance, healthcare, manufacturing...

By Pulse
Databricks Launches Analytics Engineer Learning Pathway to Upskill SQL Practitioners
News | May 19, 2026

Databricks announced the Analytics Engineer Learning Pathway, a curriculum that trains SQL practitioners to build governed, AI‑ready data models, pipelines and metric views. The program, available now on Databricks Academy, aims to fill a talent gap as organizations lean on...

By Pulse
Snowflake Adds Dataiku, Bedrock, Valid Systems to AI Cloud, Launches Risk Tools
News | May 18, 2026

Snowflake announced new AI Data Cloud integrations with Dataiku, Bedrock and Valid Systems, plus a risk‑tool offering that lets smaller banks run sophisticated fraud decisioning on Snowflake’s platform. The moves aim to lock AI workloads inside Snowflake and broaden access...

By Pulse
Snowflake Secures IRAP PROTECTED Status on Google Cloud, Expanding Australian Govt Access
News | May 18, 2026

Snowflake announced it has passed the Australian Signals Directorate's IRAP PROTECTED assessment for its Google Cloud Melbourne region, joining AWS and Azure in offering government‑grade security. The milestone gives federal agencies confidence to run sensitive analytics and AI workloads on...

By Pulse
How Insurer Aviva Migrated 1.3PB of Siloed Data to Become "AI-Ready" In 7 Months
News | May 18, 2026

Aviva completed a lift‑and‑shift migration of 1.3 petabytes of siloed data from Oracle Cloud to Snowflake in just seven months, creating a unified data platform. The new architecture underpins its AI initiatives, allowing the insurer to launch AI‑driven services such as...

By The Stack (TheStack.technology)
Enterprise AI Stalls as Hidden “Pipeline Tax” Inflates Data‑movement Costs
News | May 18, 2026

EnterpriseDB’s CTO Quais Taraki warns that a hidden “pipeline tax” – the cumulative cost of moving data through multiple translation layers – is delaying AI projects by up to six months. The tax, invisible on balance sheets, is cited as...

By Pulse
SAP's $1.1B Data Push: Reltio Deal, Dremio Pending, Prior Labs AI Lab
News | May 18, 2026

SAP closed its $1.1 bn acquisition of master‑data specialist Reltio on May 7, announced a pending buyout of data‑lake platform Dremio, and committed more than €1 bn ($1.08 bn) to German AI startup Prior Labs. The three‑tier plan aims to unify data preparation, connectivity...

By Pulse
The AI Data Governance Gap that Keeps Getting Worse
News | May 18, 2026

Enterprises are rapidly embedding AI into products, but most overlook data governance. Production databases are routinely copied into dev environments, data lakes, and third‑party services without clear oversight, leaving real customer records exposed. The article cites a mid‑size lender where...

By CIO.com
Amagi Cuts Costs 45% with Unified Data Lake on Databricks
News | May 18, 2026

Amagi, the global media‑technology provider, announced a 45% reduction in operating costs and faster product rollout after consolidating its fragmented data environment onto Databricks' lakehouse platform. The move resolves cross‑region governance challenges and creates a single source of truth for...

By Pulse
Snowflake Adds Dataiku, Bedrock, Valid Systems to AI Data Cloud, Backs OSI Push
News | May 17, 2026

Snowflake announced a suite of new native integrations—Dataiku Cobuild, Bedrock Data’s free governance tier and Valid Systems’ fraud‑decisioning tools—into its AI Data Cloud, while also championing the Open Semantic Interchange (OSI) standard to keep AI workloads on‑platform. The moves aim...

By Pulse