Today's Big Data Pulse

Big Data 100 highlights $31.8B market surge and major AI‑driven deals
CRN’s 2026 Big Data 100 report projects the data‑analytics market to reach $31.8 billion this year, driven by rapid AI adoption and new agentic assistants. The list notes Alteryx’s $4.4 billion private‑equity buyout, AtScale’s financing led by Snowflake, and Hex’s $70 million Series C round.
Govt to Tap AI for Mapping Supply Chains and Investment Clusters
India’s Statistics Ministry is building a Statistical Business Register (SBR) that will use artificial intelligence and advanced analytics to map supply‑chain relationships and identify investment clusters. The centralized database will cover every registered business, enabling the government to target logistics infrastructure spending more precisely. Privacy safeguards and consent‑based data sharing are embedded in the framework. Data harmonisation and semantic interoperability across ministries are highlighted as critical for AI effectiveness.

REST Service Apache Livy Makes It Out of 9-Year Incubation
Apache Livy, the REST service that lets users submit and manage Apache Spark jobs over HTTP, has officially graduated from the Apache Software Foundation’s incubation phase after nine years. Recent spikes in code commits and contributor activity indicate heightened community...
AI Boom Triggers $1.1 Trillion Power Push and $5B High‑Density Rack Market Surge
Exelon and other utilities are committing roughly $1.1 trillion over the next five years to power the AI‑driven data‑center boom, while the high‑density AI rack market is projected to grow from $1.6 billion in 2026 to $5 billion by 2036. The twin pressures...

AI Agents and the Fight for Customer Data
In this episode, Martin Casado talks with George Fraser, CEO of Fivetran, about how AI agents are reshaping the need for unified data platforms. Fraser explains that while centralizing data has always been essential for business intelligence, it is now...

Snowflake Summit 2026 - How Sanofi Uses Snowflake and AI Agents to Boost Efficiency
Sanofi has transformed its fragmented data environment by unifying it on Snowflake and adding semantic layers, then layering agentic AI services from Elementum. The AI‑powered Concierge platform lets thousands of employees query and act on data directly in the Snowflake...

As Enterprise AI Scales, Capital One and Snowflake Focus on Governance and Trust
Enterprise AI adoption is prompting financial firms to tighten data governance. Capital One highlighted its partnership with Snowflake to leverage the cloud data platform’s Horizon Catalog for secure, policy‑driven data sharing. Snowflake’s recent enhancements—open sharing for Iceberg tables, AI‑automated metadata,...
Anthropic Publishes Claude Constitution, Igniting AI Consciousness Debate
Anthropic unveiled an 84‑page "constitution" for its Claude language model, prompting fierce discussion about AI consciousness, moral status and data governance. The document, written for Claude itself, has drawn comments from CEO Dario Amodei, in‑house philosopher Amanda Askell and external...
Microsoft Launches Agentic AI and Copilot Agents at Build, Early Tests Reveal Mixed Performance
Microsoft introduced its first autoagent, Scout, and a suite of premium Copilot agents at the Build conference, positioning them as the core of an "agentic OS" for data‑driven work. Early hands‑on testing showed the agents could automate routine tasks but...

How Canva Turns a Unified Data Strategy Into AI Personalization at Scale
Canva, the Australian design platform with 265 million monthly active users and $4 billion in annualized revenue, has deepened its partnership with Snowflake to create a unified data foundation that fuels AI‑driven personalization. By consolidating product, marketing and sales signals in Snowflake,...
Snowflake Secures Sanofi AI Drug‑Development Platform Deal, Expanding Pharma Cloud Footprint
Snowflake announced at its Summit 26 in San Francisco that Sanofi will use the Snowflake AI Data Cloud to power a new AI‑driven drug‑development platform, including the "Concierge for Field" sales agent. The partnership extends Snowflake’s reach across R&D, procurement,...

Convergence Evidence Maturity Hierarchy: From Raw Data to Convergence-Authoritative Evidence
Moh Kolb’s May 2026 article introduces the Convergence Evidence Maturity Hierarchy (CEMH), a five‑stage framework that moves semiconductor data from raw sensor streams to convergence‑authoritative evidence. The hierarchy—raw data, interoperable data, normalized evidence, admissible evidence, and authoritative evidence—provides the governance logic for...

Snowflake Summit 2026: CoCo, CoWork Drive Partner Growth
Snowflake’s 2026 Summit highlighted the GA release of its CoCo AI coding agent and new CoWork personal work agent features, including a Claude Code plugin and a secured sandbox. The company announced upcoming cloud‑based CoCo agents, a desktop app, and an...
Trump Budget Proposal Cuts $1.1 Billion From NOAA, Threatening Climate and Fisheries Data
The Trump administration’s 2027 budget proposal would trim $1.1 billion—about 18%—from NOAA’s $6.1 billion budget, eliminate over 1,000 positions and cancel three of five planned geostationary satellite instruments. Experts warn the cuts could cripple data streams that power big‑data climate models and...
Data Center Boom Hits US Suburbs, Triggers $500M Sale Proposal and Copper Theft Surge
A Virginia homeowner has offered to sell his 143‑home subdivision for more than $500 million to a data‑center developer, underscoring the rapid spread of data centers into U.S. suburbs. At the same time, AT&T reports a sharp rise in copper‑theft incidents,...
American Express: Democratize Analytics, Not Data
American Express chief data officer Chris Gifford says the company is moving from broad data democratization to democratizing analytics. By delivering governed, ready‑to‑use analytics, Amex aims to let employees and AI agents generate insights faster while mitigating operational and privacy...
Matia Launches on Snowflake Marketplace, Bringing Unified Data Operations to the AI Data Cloud
Matia, a unified data‑operations platform, announced its launch on Snowflake Marketplace at the Snowflake Summit. The integration lets customers deploy Matia’s ETL, reverse‑ETL, observability and catalog functions directly within the Snowflake AI Data Cloud. By offering a single‑pane solution, Matia...
Snowflake and Anthropic Double Down on AI to Power Data‑Driven Marketing
Snowflake and Anthropic announced an expanded partnership that embeds Claude models directly into Snowflake Cortex AI, giving marketers instant, governed access to AI on their data. The move, highlighted at Snowflake Summit 26, aims to shift AI from experimentation to...

How NIH Is Translating 70 Years of Health Data to Speak the Same Language
The National Institutes of Health is consolidating more than 12 petabytes of multimodal biomedical data through its BioData Catalyst cloud ecosystem. A LinkML‑based "converter box" maps legacy research datasets to common standards such as LOINC, FHIR and HPO, while clinical...
To Help Public Agencies Reduce Roadway Risk, Platform Combines Diverse Datasets
Replica and Arity have launched Safety Hub, a new platform that fuses Arity’s telematics data from more than 50 million active connections with government crash records and Replica’s mobility, demographic, land‑use and economic datasets. The tool enables public agencies to spot...
Canada Allocates AI Funding to Build National Health Data Platform
Canada announced funding under its AI strategy to create a centralized, AI‑enabled health data platform. The investment is intended to improve patient care, support research and modernize public health service delivery across the country.

Inecta Launches ETL Layer to Unlock Food ERP Data for Enterprise Analytics
Inecta, a niche ERP vendor for food and beverage manufacturers, unveiled inectaETL, a data‑connectivity layer that extracts and transforms Microsoft Business Central data for modern analytics platforms. The solution ships with pre‑built integrations to Snowflake, Fivetran, MotherDuck and Microsoft Fabric,...
Thomas Larson Honored for Scaling Enterprise AI and Big Data in Financial Services
Thomas Larson, senior client partner for the Salesforce practice at Persistent Systems, was named to Marquis Who's Who for his role in deploying AI‑driven data platforms at major U.S. banks. His work bridges Salesforce technology with financial‑services workflows, positioning Persistent...
Intel Logs 130+ Edge AI Design Wins and Unveils OpenVINO Physical AI Framework
Intel said it has secured more than 130 design engagements for its Series 3 edge AI processors and launched the open‑source OpenVINO Physical AI framework. The moves target fragmented robotics stacks and aim to lower total cost of ownership for...
Databricks' Knobless Liquid Clustering Eliminates Partitioning Pain
Six years ago I surveyed our field and found that most of the headaches customers faced came from partitioning their data (a technique Hive introduced long before). So we set out to get...
Nvidia Moves Vera Rubin AI Platform to Volume Production, Boosting Big Data Compute
Nvidia revealed at Computex that its Vera Rubin AI platform is now in volume production, leveraging over 350 supply‑chain partners across 30 countries. The new rack‑scale system promises ten‑fold AI throughput and a tenth of the cost per token, reshaping...
Nvidia Launches BlueField‑4 STX DPU, Promising 5x Token Throughput and 4x Energy Efficiency
Nvidia announced its BlueField‑4 STX data processing unit at GTC on March 16, delivering up to five‑fold token throughput and four‑fold energy efficiency for AI inference. Partners such as Supermicro, Cloudian and DDN will ship compatible servers, while eight cloud...

Trust3 AI Announces Integration with Snowflake to Govern MCP-Based Data Access and Accelerate Trusted Enterprise AI
Trust3 AI announced a native integration with Snowflake’s AI Data Cloud, linking its policy‑driven governance platform to Snowflake’s managed Model Context Protocol (MCP) servers. The partnership lets enterprises expose governed, business‑aligned data products to AI agents without deploying separate MCP...
Cloudflare Launches Town Lake Data Platform and Skipper Pipeline Framework
Cloudflare announced the rollout of Town Lake, its internal lakehouse platform, and Skipper, a low‑code data‑pipeline framework. Built on Cloudflare's own edge services, the tools promise fresh, governed data without requiring SQL expertise.

ServiceNow Knowledge26: Cloudera Launches Workflow Data Fabric Zero Copy Connector for ServiceNow
Cloudera unveiled a Workflow Data Fabric Zero Copy Connector for ServiceNow at the Knowledge26 conference. The integration lets enterprises query data where it resides, removing the need for costly data duplication while preserving security and governance across hybrid environments. By...
Amazon Redshift RG Instances Deliver Up to 2.2× Faster Queries and 30% Lower vCPU Costs
Amazon Web Services has made its new Redshift RG instance family generally available, promising up to 2.2 times faster warehouse queries and a 30 percent reduction in price per vCPU. Built on AWS Graviton processors, the RG line also adds native Apache...

The Data Governance Principles Healthcare Organizations Cannot Afford to Skip
Healthcare organizations face an average $10.1 million cost per data breach, highlighting governance failures that directly impact patient safety. With the sector generating about 30% of global data and growing 36% annually, ungoverned information leads to misdiagnoses, treatment errors, and regulatory...

Snowflake Signs $6bn Infrastructure Agreement with AWS
Snowflake announced a multi‑year, $6 billion infrastructure agreement with Amazon Web Services to secure the compute capacity needed for expanding AI and data workloads. The deal deepens Snowflake’s reliance on AWS, adding more Graviton‑based CPUs and GPU‑accelerated EC2 instances for model...
NVIDIA Pours $6.5B Into Photonics to Power Next‑gen AI Data Centers
NVIDIA has pledged at least $6.5 billion since March to photonics companies, including $2 billion for Lumentum, Coherent and Marvell and $500 million each for Corning and Ayar Labs. The funding is intended to accelerate silicon‑photonic interconnects that can move data faster than...
DataHub Launches Cloud V1, Boosting AI Analytics Agent Accuracy to 90%
DataHub unveiled DataHub Cloud v1, a SaaS layer that feeds enterprise analytics agents with unified metadata and query history. Early benchmarks show answer accuracy climbing from roughly 50% to 90%, a near‑doubling that could reshape AI‑driven data querying.
CNIL Hits IQVIA France with $5.4 Million Fine Over Health Data Warehouse Breaches
France's CNIL has levied a €5 million ($5.4 million) administrative fine on IQVIA OPERATIONS FRANCE for multiple violations tied to its LRX and EMR health data warehouses, which hold records on tens of millions of patients. The sanction follows a multi‑year investigation...
Cloudflare Deploys Town Lake Data Platform and Skipper AI Agent
Cloudflare has built an internal data platform called Town Lake and an AI-driven query assistant named Skipper, consolidating fragmented data sources across the company. In its early rollout, billing‑related queries made up 53% of the 91,760 queries run by 324...
Starburst Launches Governed Multi‑Cloud AI Platform, Announces Qlik Partnership
Starburst unveiled its Enterprise Intelligence Platform at the AI & Datanova conference, making its AI Data Assistant (AIDA) generally available and announcing a strategic partnership with Qlik. The launch targets fragmented data environments, promising real‑time AI on governed data while...
DataGrail Report Finds 63.6% of AI Vendors Hide Subprocessors, Raising Privacy Risks
DataGrail’s 2026 Privacy and AI Trends Report shows 63.6% of AI‑enabled software vendors do not list third‑party AI subprocessors in their data‑processing agreements. The gap threatens enterprise data governance as companies may be feeding personal data to unapproved models, amid...

The Metadata Hub: Unify Your Data Estate
Snowflake is positioning its Horizon Catalog as a Metadata Hub that federates metadata across disparate data platforms such as Snowflake, AWS Glue, Databricks Unity Catalog, and Apache Polaris. The hub relies on open standards like Apache Iceberg and the Iceberg...
Exasol Launches 2026.1, Branding Its Database as a Sovereign AI ‘Panic Room’
Exasol unveiled version 2026.1, positioning the product as a sovereign AI ‘panic room’ that keeps models and agents inside the database. The release adds native AI functions, an MCP server for controlled model access, and promises up to 1000× faster...
IBM Launches Cloud Sovereignty Risk Profile to Boost Data Governance
IBM introduced the Cloud Sovereignty Risk Profile, a tool that monitors data residency, encryption and operational controls across hybrid and multicloud environments. The platform aims to close the visibility gap highlighted by a study showing 93% of executives consider digital...
Databricks Launches Apache Iceberg V3 GA with Open Sharing and Unified Governance
Databricks announced the general availability of Apache Iceberg version 3 on its Unity Catalog, delivering open APIs, cross‑engine governance and secure data sharing. The upgrade positions Unity Catalog as the most interoperable Iceberg catalog, aiming to simplify multi‑engine data lake...
Starburst Unveils Enterprise Intelligence Platform with Semantic Context to Boost AI Trust
Starburst Data announced the Enterprise Intelligence Platform, featuring the generally available AIDA AI assistant and a semantic context layer that aggregates metadata, business rules and relationships. The platform lets enterprises run AI workloads directly on distributed data, reducing hallucinations and...
ClickHouse Targets APAC Growth as Data Demands Surge
ClickHouse is accelerating its Asia‑Pacific expansion as demand for real‑time analytics and AI observability surges. Nearing 500 employees, the firm plans to double its go‑to‑market team and deepen ecosystem ties, especially with Confluent, while leveraging its recent Langfuse acquisition to...
Unravel Data Unveils Arvix AI, Autonomous Engine for Databricks, Snowflake and BigQuery
Unravel Data Systems Inc. introduced Arvix AI, an agentic artificial‑intelligence engine that automatically optimizes and remediates workloads on Databricks, Snowflake and Google BigQuery. The platform moves beyond dashboards to execute changes autonomously, aiming to curb rising cloud‑data costs for enterprises.
Snowflake Targets AI Agent Adoption with AWS Deal, Acquisition
Snowflake announced a $6 billion, five‑year infrastructure commitment with Amazon Web Services to power its enterprise AI workloads. At the same time, the data‑cloud company acquired Natoma, a Model Context Protocol platform that adds an identity and governance layer for AI...
Qlik Partners with Starburst
Qlik announced a partnership with Starburst to fuse its data integration, replication and analytics suite with Starburst’s federated query engine and context layer. The joint offering lets enterprises query, move, prepare and govern data across cloud, on‑premises and hybrid environments...
Squirrel AI Powers Personalized Learning for Over 10 Million Chinese Students
Squirrel AI, the Chinese ed‑tech firm, now serves over 10 million students with AI‑driven, data‑rich tutoring that can accelerate learning up to tenfold. Backed by government policy, the company is scaling its 3,000 learning centers and planning a push into the...
Dell and Palantir Launch On‑Prem AI Operating System for Secure Enterprise Analytics
Dell Technologies and Palantir announced a joint on‑premises AI operating system at Dell Technologies World 2026, combining Palantir Foundry, Ontology and Dell AI Factory with NVIDIA hardware. The solution promises secure, governed AI workflows for regulated industries that cannot rely...
Anthropic Adds 28 Security Integrations to Claude, Boosting Enterprise Data Governance
Anthropic announced the addition of 28 new security and compliance integrations to its Claude enterprise platform, enabling organizations to route conversation content and activity logs into existing monitoring tools. The move targets growing concerns over data governance as large language...