Today's Big Data Pulse

Leadership Gaps Hamper Data Engineering Teams, Survey Finds
Three 2026 surveys of 1,629 data professionals reveal organizational issues now dominate data‑engineering bottlenecks. In January, weak leadership direction and poor requirements accounted for 40% of top‑bottleneck votes, while by April 50% cited lack of clear ownership as the biggest pain point. Legacy systems and tooling ranked far lower as bottlenecks, at 25% and under 5% respectively.
Also developing:
By the numbers: Sensor Tower acquires AppMagic to expand SMB offering
Broadcom Becomes Core AI Chip Supplier for Google, Anthropic, OpenAI and Meta
Broadcom's custom AI chips now power the compute engines behind Google, Anthropic, OpenAI and Meta, driving a 106% year‑over‑year surge in AI sales and fueling a $100 billion revenue target for 2027. The move reshapes the hardware foundation of big‑data analytics and generative‑AI workloads.
NeuroPace AI Suite Leverages 26 Million iEEG Records, Prompting Privacy and Bias Debate
NeuroPace unveiled an AI‑driven seizure‑management platform built on more than 26 million intracranial EEG recordings, and PAVmed reported a 12,000‑patient real‑world study of its EsoGuard test. Both moves spotlight the surge in personalized health analytics and the growing scrutiny over data...
Balcony Secures $12.7M Seed to Launch Keystone, Unified Property Platform
Balcony, a New Jersey PropTech startup founded in 2021, launched Keystone, a unified property intelligence platform for public land management, and closed a $12.7 million seed round led by Blockchange Ventures. The platform has already secured a five‑year contract to digitise 370,000...
AI‑Generated Papers Flood Science, Raising Big Data Validation Crisis
Researchers report a sharp rise in AI‑crafted scientific papers that reuse massive public datasets, overwhelming peer reviewers and exposing gaps in data validation. The trend threatens the credibility of big‑data‑driven research and forces publishers to rethink quality controls.
BigID Deploys Data Security Posture Management as Snowflake Native App
BigID announced the launch of its Data Security Posture Management (DSPM) solution as a native app on the Snowflake Marketplace. The integration lets joint customers discover, classify and protect sensitive data inside Snowflake without moving data, a move aimed at...
Transaera’s MOF‑Based AC Promises 40% Energy Cut for Data‑Center Cooling
MIT spin‑off Transaera launched a rooftop air‑conditioning system that uses metal‑organic frameworks to dehumidify air, delivering up to 40% lower energy consumption for data‑center cooling. The move comes as global cooling demand, already 5,000 TWh in 2022, is projected to triple...
China Narrows AI Gap as US Faces $2.5 B Chip Smuggling Case
A Stanford study shows China’s DeepSeek‑R1 model now nearly matches top US AI systems, trailing Anthropic’s flagship by 2.7%. At the same time, a $2.5 billion chip smuggling indictment underscores Washington’s alarm over Beijing’s push for compute power,...
Datavault AI Reaffirms $200M Revenue Target, Boosts Funding and Expansion Plans
Datavault AI (DVLT) reaffirmed its full‑year 2026 revenue target of $200 million during its Q1 earnings call, while announcing a $60 million private placement, a $120 million non‑dilutive financing term sheet, and $800 million in tokenization contracts. The company also detailed a rollout of...
Zeta Global and Snowflake Launch Open Semantic Interchange Standard for AI Marketing
Zeta Global and Snowflake announced a partnership to develop an Open Semantic Interchange (OSI) standard that will serve as a universal data language for AI‑powered marketing. The initiative seeks to break down data silos, streamline governance, and accelerate the deployment...
BoomiWorld 2026 Unveils AI Data Trust Suite and Governance Tools
At BoomiWorld 2026, Boomi rolled out a suite of AI‑focused data activation products, including Boomi Connect and Boomi Companion, and announced a Lunar.dev acquisition, while partnering with Guru to bridge knowledge and live data. The company highlighted that just 7% of enterprise data is...
Scality Launches Autonomous Data Infrastructure Platform to Power Enterprise AI Workloads
Scality introduced its Autonomous Data Infrastructure (ADI) platform, a self‑managing storage solution designed for enterprise AI, cyber resilience and sovereign data control. Built on the company's RING and ARTESCA foundations, ADI promises multi‑petabyte to exabyte scale, multi‑terabyte‑per‑second throughput and a...
ICE Gains Palantir‑Powered Access to Data on 20 Million Individuals
Immigration and Customs Enforcement officials announced that agents can now query a Palantir‑powered platform containing data on 20 million individuals, raising the agency’s success rate in locating targets to almost 80%. The rollout, demonstrated at the Border Security Expo, intensifies debate...
SAP Moves to Own AI Data Control Plane with Dremio and Prior Labs Acquisitions
SAP announced plans to acquire data‑lake vendor Dremio and AI‑governance startup Prior Labs, positioning its Business Data Cloud as a unified AI data control plane. The move seeks to tighten data governance, boost structured decision intelligence, and increase SAP's leverage...
Amazon Redshift Launches Graviton‑based RG Instances, Promising up to 2.2x Speed and 30% Lower Cost
Amazon Redshift announced RG instances built on AWS Graviton processors, delivering up to 2.2× faster query performance than RA3 and 30% lower price per vCPU. The new family adds a built‑in data‑lake query engine that accelerates SQL across Apache Iceberg...
Databricks Launches ABAC Row Filtering and Column Masking in Unity Catalog
Databricks announced that its Unity Catalog now supports generally available attribute‑based access control (ABAC) row‑filtering, column‑masking policies, governed tags and automated data classification. The upgrade aims to replace manual, per‑object security rules with a system that enforces protection automatically across...
Dremio Secures Representative Vendor Spot in Gartner 2026 Agentic Analytics Guide
Dremio has been named a Representative Vendor in Gartner’s 2026 Market Guide for Agentic Analytics, joining roughly 37 platforms shaping the nascent category. The recognition spotlights Dremio’s Agentic Lakehouse Platform as a key foundation for AI‑driven data workflows.
Deloitte Africa Teams with AWS to Open Nerve Operational Intelligence Centre
Deloitte Africa has launched the Nerve Operational Intelligence Centre in partnership with Amazon Web Services, creating an AI‑driven hub to deliver operational analytics to African clients. The move signals Deloitte’s intensified focus on data‑centric consulting services in the region.
Fivetran Takes Stewardship of Great Expectations Open‑Source Community and GX Core Project
Fivetran announced it will become the steward of the Great Expectations open‑source community and its GX Core project, a key data‑quality framework. The move aligns with its Open Data Infrastructure strategy and follows its planned merger with dbt Labs, signaling...

Your Data Engineers May Be More Influential than You Think
The role of data engineers has shifted from reactive ETL developers to owners of modern data platforms that power analytics, AI, and real‑time applications. Cloud‑native warehouses, tools like dbt, Airflow, and Fivetran, plus CI/CD practices have turned pipelines into software‑engineered...
Christophe Pettus: PARTITION MERGE/SPLIT, Once More With Locking
PostgreSQL 19 reintroduces the long‑awaited ALTER TABLE … MERGE PARTITIONS and ALTER TABLE … SPLIT PARTITION commands, allowing administrators to combine or divide partitions with a single DDL statement. The implementation opts for an AccessExclusiveLock on the parent table, meaning the table is completely...

From Bottlenecks to Breakthroughs, Enterprises Are Rethinking Analytics in the Lakehouse Era
Enterprises are abandoning fragmented data stacks in favor of open lakehouse architectures paired with next‑generation analytical engines. Legacy warehouses, OLAP tools, and streaming layers have become costly and brittle as petabyte‑scale data and real‑time use cases proliferate. Modern solutions such...
Ten Years of Beam: From Google's Dataflow Paper to 4 Trillion Events at LinkedIn
In August 2015 Google published the Dataflow paper that introduced a unified model for batch and streaming. The model became Apache Beam, now an Apache top‑level project that processes 4 trillion events per day at LinkedIn and powers workloads at Palo...
CDC’s Delayed Hantavirus Alert Exposes Weakened US Outbreak Surveillance
The U.S. Centers for Disease Control and Prevention issued a health alert for a hantavirus outbreak on the cruise ship MV Hondius weeks after the first cases were identified, underscoring a decline in surveillance capacity. Epidemiologist Jodie Guest attributes the...
From Data Chaos to Discovery: Building the Data Foundation for AI-Ready Scientific Research
Life science organizations are overwhelmed by fragmented, petabyte‑scale data that hampers AI deployment. Legacy storage‑centric architectures force repeated copying, incurring a “data tax” that slows research and raises compliance risk. Experts argue that an AI‑ready data strategy—rooted in FAIR principles,...
SAP Rolls Out 50+ AI Assistants and 200 Agents at Sapphire Conference
SAP introduced more than 50 AI assistants and roughly 200 AI agents at its Sapphire conference, positioning the suite as the core of an autonomous‑enterprise vision. The move aims to shift SAP applications from passive tools to proactive, data‑driven executors.
Snowflake Ecosystem Startups Pull $113 Billion in Venture Funding, Fueling AI‑Data Cloud Boom
Snowflake’s partner network has amassed more than $113 billion in venture capital across 1,300+ private firms since 2020, according to a new Crunchbase‑Snowflake report. The data shows a pivot from a broad funding surge in 2021 to a tighter, high‑value focus...
Snowflake Adds Malaysia’s Sovereign LLM ILMU to AI Data Cloud
Snowflake has integrated Malaysia’s national language model, ILMU, into its AI Data Cloud platform, aligning the rollout with the service’s April launch in AWS’s Malaysia region. The move aims to give regulated enterprises secure, locally governed AI capabilities.
Fivetran’s CPO: Closed Data Stacks Won’t Survive the Agent Era
Fivetran’s chief product officer Anjan Kundavaram warned that closed data stacks cannot sustain the query volume generated by AI agents, which can run ten to a hundred times more queries than traditional analytics. He argues that routing every request through...
China Deploys AI‑Driven Big Data Platforms Across Sports Industry
China's General Administration of Sport and leading sportswear groups Li‑Ning and Anta are rolling out AI‑driven big‑data platforms for coaching, product design and consumer fitness. The initiative, backed by government targets for a digital sports community by 2030, promises to...

Snap’s Secret to Processing 10 Petabytes a Day: GPU-Accelerated Spark | NVIDIA AI Podcast Ep. 298
Snap’s engineering platform head, Prudvi Vatala, explains how the company slashed data‑processing costs by 76% and reduced core usage by 62% by migrating its 10‑petabyte‑per‑day experimentation pipeline to GPU‑accelerated Spark using NVIDIA Spark RAPIDS on Google Cloud. The move delivered...

Beyond IT: A Three-Stage Framework for Turning Data Governance Into Board-Level Strategy
Boards are now required to oversee data governance as a core component of operational resilience, driven by regulations such as the EU NIS2 Directive and DORA. The article proposes a three‑stage framework that first translates cyber risk into business language,...
Hiring Managers Prioritize Hands‑On Skills Over CS Degrees in Data Science Hiring
Hiring managers at leading tech firms are increasingly valuing practical experience and domain expertise over traditional computer‑science degrees, according to a recent industry guide. The shift reflects a growing demand for applied data scientists who can navigate AI‑driven hiring tools...
o9 Solutions Links Digital Brain to Snowflake AI Data Cloud in New Integration
o9 Solutions announced a deep integration with Snowflake’s Connected Application framework, embedding its Digital Brain planning engine into the Snowflake AI Data Cloud. The joint offering promises continuous, AI‑driven planning across supply‑chain, commercial and finance functions on a single, governed...
NVIDIA, ServiceNow Push Enterprise AI Governance Into Data Centers
NVIDIA and ServiceNow unveiled Project Arc and an AI Control Tower integration that extend autonomous AI agent governance from desktop environments to enterprise data‑center workloads. The partnership, announced at ServiceNow Knowledge 2026 in Las Vegas, adds sandboxing, policy management and...
Data Governance in Healthcare for Trustworthy AI
Healthcare providers face 3‑7‑day prior‑authorization delays despite AI pilots. Executives recognize that the root cause is not technology but trust, which hinges on data governance, transparency, and human‑in‑the‑loop controls. Snowflake, partnered with Penguin AI, offers a unified, real‑time, governed data layer...
Malaysia’s AI Hub Goal Stumbles on Fragmented Enterprise Data
Databricks’ ASEAN VP Cecily Ng says Malaysia’s ambition to be a regional AI hub by 2030 is undermined by enterprises still operating fragmented, multi‑cloud data estates. The data foundation gap makes scaling AI beyond pilots difficult, raising doubts about the...

Migrating Data Ingestion Systems at Meta Scale
Meta has completely overhauled its massive data ingestion pipeline that extracts petabytes of social‑graph data from one of the world’s largest MySQL deployments. The new self‑managed warehouse architecture replaced customer‑owned pipelines and was migrated in three disciplined phases—shadow, reverse‑shadow, and...
One Context Layer, or Many?
The article debates whether a single, centralized context layer or multiple distributed metadata sources should power AI agents in the modern data stack. It argues that while centralized catalogs offer convenience, they become stale and lossy for agents that need...

Corvic AI Launches Individual Plans and Brings Its Agentic Data Engine to General Availability Across Cloud Marketplaces
Corvic AI announced the general availability of its V3 Intelligence Composition Platform, featuring an agentic data engineering engine that converts multimodal operational data into structured outputs. The rollout includes new Individual Plans aimed at AI engineers, analysts, and operations teams,...
Cloudera Hackathon Shows Enterprise AI Can Cut RFP Cycle to Hours, but Data Governance Lags
At Cloudera's recent hackathon, Citadel Edge delivered an RFP‑to‑Proof‑of‑Concept builder that shrank a three‑to‑four‑week process to a few hours, demonstrating how enterprise AI can move quickly from concept to market. Yet Cloudera's own readiness survey warned that just 7% of...
Nubank Deploys AI Model to Expand Credit While Keeping Risk Flat
Nubank rolled out nuFormer, a transformer‑based AI underwriting engine that slashes projected risk by 70% and helped the bank lift its credit‑card spend share by 50 basis points in Q4 2025. The move expands the lender’s credit portfolio 40% year‑over‑year while...

What Is the Solid Project and What Could It Mean for Businesses?
The Solid Project, championed by Tim Berners‑Lee, proposes personal data pods that let individuals own and control their digital information. If widely adopted, businesses will need to shift from hoarding data to accessing it via secure APIs and zero‑trust architectures....
FDA Deploys Elsa 4.0 AI Tool and HALO Data Platform to Modernize Operations
The U.S. Food and Drug Administration unveiled Elsa 4.0, an upgraded internal AI assistant, and completed the HALO data‑platform consolidation that unifies more than 40 disparate data sources. The moves are designed to cut manual work, accelerate regulatory reviews and...
Governance Success Depends on Change Management, Not Tools
Most data governance programs do not fail because of bad technology. They fail because leaders treat governance like a software rollout, when it is really an organizational change effort. In this conversation I explored with Grimme Bogaerts & Martijn Vanhauwaert from Colruyt...
Sorbonne University Abu Dhabi and Saal.ai Sign MoU to Boost Sovereign AI and Big Data Capabilities
Sorbonne University Abu Dhabi (SUAD) and UAE‑based AI firm Saal.ai signed a memorandum of understanding on May 11, 2026 to co‑develop sovereign AI infrastructure and large‑scale data analytics tools. The partnership ties SUAD’s academic programmes in data science to Saal.ai’s...

How EDF Is Making the Most of Its Data with Snowflake
EDF UK has made Snowflake AI Data Cloud the backbone of its data strategy, unifying hundreds of data sources for over 1,000 internal users. The utility’s federated hub‑and‑spoke model lets central data services handle tooling and architecture while business units build...

What’s Really Needed For Advanced Test?
Advanced test in semiconductor manufacturing promises adaptive binning, feed‑forward models and real‑time analytics, but the industry’s biggest obstacle is data quality. PDF Solutions highlights that misaligned metadata and incomplete tool‑level data routinely break automated test flows, forcing engineers to intervene...
England to Roll Out Unified Patient Record System Across NHS
The UK government confirmed that England will launch a unified patient record system for the National Health Service, giving every citizen a single, comprehensive medical history. The rollout, described as the largest health‑IT integration in the country, is intended to...
Yum Brands Deploys AI Backbone, Digital Sales Hit 70% of Taco Bell Revenue
Yum Brands chief digital and technology officer Jim Dausch detailed an AI‑centric overhaul that now drives 70% of Taco Bell’s sales, up from 1% in 2019. The company has deployed AI tools across 35,000 restaurants and partnered with AWS, Microsoft...
Alation Launches AI Governance Platform to Streamline Enterprise Compliance
Alation Inc. unveiled its AI Governance solution at the Gartner Data & Analytics Summit in London, creating a centralized system of record for AI model compliance. The offering maps AI assets to regulations such as the EU AI Act and...