Today's Big Data Pulse

Data‑Engineering Bottlenecks Shift From Legacy Tech to Leadership Gaps
Three 2026 surveys of 1,629 data professionals show that weak leadership direction and poor requirements now account for 40% of top‑bottleneck votes, outpacing legacy systems at 25%. By April, 50% of respondents cite lack of clear ownership as the biggest pain point, while better tooling is mentioned by under 5%.
Also developing:
By the Numbers: Ampere Analysis Acquires PlumResearch
CFTC Deploys AI Surveillance Tools to Flag Risky Derivatives Trades
The Commodity Futures Trading Commission is launching AI‑powered market‑surveillance systems that flag abnormal trading patterns and streamline registration reviews. The move, announced by Chairman Michael S. Selig, seeks to offset a more than 20% workforce reduction and speed up enforcement in fast‑moving derivatives markets.
NAMIC Issues Analysis to Counter Algorithmic‑Bias Bills Targeting Insurers' AI
The National Association of Mutual Insurance Companies (NAMIC) published a new report that refutes five common myths behind state‑level algorithmic‑bias bills. The analysis arrives as 18 states debate AI legislation and NAIC regulators prepare tighter AI standards, underscoring potential regulatory...
Microsoft Launches Agent 365 to Govern Enterprise AI Agents
Microsoft rolled out Agent 365, a centralized governance platform for AI agents, on schedule in May 2026. The suite integrates Defender, Entra, Purview and Intune to give IT and security teams visibility, identity control and data loss prevention for both internal...

Synthetic Data Enables Safe Workforce Evolution Simulations
Use Synthetic #Data to Simulate Workforce Evolution Without Exposing Employee Data by @antgrasso #DataScience #BigData https://t.co/p4BdPIKlSB
Clinical Data Foundries Are on the Horizon
Health systems are pivoting toward "clinical data foundries" by 2030, turning electronic health records into high‑velocity, monetizable assets. The shift is driven by rising labor costs, margin pressure and the promise of modular AI architectures that replace fragmented point solutions....
Thailand Overhauls Welfare Database, Cutting Recipients by 30% to Meet OECD Standards
Thailand's National Statistical Office launched a sweeping overhaul of its welfare database, targeting a reduction of eligible recipients from 13.4 million to roughly 9 million. The move, part of a broader data‑integration push, is designed to meet international standards ahead of the...
Manufacturers Lag in Data Readiness as AI Tools Accelerate, Panel Says
At Rapid + TCT 2026 in Boston, industry experts warned that manufacturers are falling behind on data readiness, with data scientists spending up to 90% of their time cleaning data instead of building models. The panel highlighted siloed spreadsheets, lack...
Fivetran Rolls Out Hybrid Deployment, Letting Enterprises Keep Data On‑Prem While Using Managed ETL
Fivetran announced a Hybrid Deployment option that separates the control plane from the data plane, enabling customers to run ETL pipelines on‑premise while still leveraging Fivetran’s managed interface. The move targets regulated sectors that need tighter data governance, promising faster...

World May Find Itself ‘in a Very Chinese Time’ of Data Governance
China inaugurated the World Data Organisation in Beijing, signalling a coordinated push to turn data into a national economic asset. Beijing’s new regime combines data assetisation, state‑backed exchanges, and public‑data franchising to feed specialised AI models that need high‑quality sector...
Google Adds Cross‑cloud Joins to BigQuery Omni, Tackling Data Silos
Google announced a cross‑cloud join capability for its BigQuery Omni service, allowing analysts to combine tables stored in Google Cloud, Amazon S3 and Azure Blob Storage with a single SQL statement. The move follows a 120% rise in data processed...
Enterprise AI Scaling Demands Data Superhighway, Governance and New Security Tools
Accenture’s research shows 86% of enterprises will boost AI spend in 2026, yet only 21% have redesigned end‑to‑end processes around AI. At the same time, Anthropic has launched Claude Security, a code‑vulnerability scanner for enterprise customers, underscoring the need for...
Encord Secures $60 Million Series C to Build AI‑Native Data Infrastructure
Encord announced a $60 million Series C round led by Wellington Management to expand its AI‑native data infrastructure, pushing total capital raised to $110 million. The funding backs Encord’s push to support physical AI applications such as autonomous vehicles and robotics, addressing data‑quality...
Intent‑Based Data Engineering Redefines AI‑Powered Pipelines
A growing cohort of data leaders is championing Intent‑Based Data Engineering (IBDE) as a replacement for traditional ticket‑driven ETL design. By declaring business outcomes instead of step‑by‑step instructions, IBDE promises self‑healing pipelines that adapt to source changes and compliance demands,...
UiPath and Databricks Partner to Fuse Automation with Enterprise Data Intelligence
UiPath has become a validated technology partner of Databricks, linking its RPA platform with Databricks’ data lake and AI engine. The integration lets enterprises pull trusted data into automated processes in real time, aiming to speed decisions and broaden AI...
Digital Wave and ChannelEngine Link Data Platforms to Power 1,300 Marketplaces
Digital Wave Technology and ChannelEngine announced a partnership that connects Digital Wave’s ONE platform with ChannelEngine’s network of over 1,300 global marketplaces. The integration automates product‑data enrichment, pricing and order management, promising faster time‑to‑market and higher data quality for brands...
MHRA Names Former CDC CIO Jason Bonander as New Chief Digital and Technology Officer
The Medicines and Healthcare products Regulatory Agency (MHRA) has appointed Jason Bonander, former CIO of the U.S. CDC, as its chief digital and technology officer. He will steer a five‑year modernization plan that emphasizes data, AI oversight and agile regulation,...
AWS Transform Introduces AI‑Driven Tool to Cut BI Migration to QuickSight to Days
AWS announced that its Transform service now offers AI‑powered agents that automate the migration of Tableau and Power BI dashboards to Amazon QuickSight. The new chat‑based workflow promises to shrink migration timelines from months to days, addressing a long‑standing bottleneck for...
Snowflake Teams with Appian to Embed AI in Data Cloud, Targeting Enterprise Automation
Snowflake announced a partnership with low‑code platform Appian to fuse Snowflake’s AI Data Cloud with Appian’s orchestration layer. The integration lets business users invoke Snowflake Cortex AI directly within workflow apps, a move aimed at accelerating data‑backed decisions across enterprises.
Recharge Acquires Skio to Form $20B Subscription Commerce Platform
Recharge announced the acquisition of Skio, merging two leading subscription‑management solutions into a single platform that now serves more than 20,000 merchants and processes over $20 billion in gross merchandise volume each year. The deal aims to combine data sets and...

‘Data Governance Is Equal to Trust’: Qlik’s Varun Babbar on AI’s Shift From Experimentation to Scale
Qlik’s India MD Varun Babbar warned that scaling enterprise AI hinges on data governance, equating it with trust. He outlined three AI trends—democratization, conversational analytics, and agentic AI—and stressed that without a solid data foundation, pilots stall. In India, firms...
Jacobs Solutions, Linked to Palantir, Shortlisted for $700 Million Milwaukee Wastewater Contract
Jacobs Solutions, a Dallas engineering firm with a strategic partnership with Palantir, has been named a finalist for the Milwaukee Metropolitan Sewerage District’s $700 million, decade‑long wastewater operations contract. The bid pits Jacobs against incumbent Veolia North America and raises questions...
Uber to Turn Driver Fleet Into Nationwide Sensor Network for AV Developers
Uber announced a plan to outfit its human‑driver fleet with sensor kits, turning cars into a rolling data source for autonomous‑vehicle (AV) developers. The initiative is tied to a new partnership with Hertz’s Oro Mobility, which will provide operational support...
AI Capex Surge Threatens Data‑Center Profitability as Industry Faces Bubble Concerns
Jefferies analysts warned that AI‑driven data‑center spending now exceeds $200 billion, squeezing hyperscalers' cash flow. While developers reap productivity gains from tools like Anthropic’s Claude Code, experts question whether the massive infrastructure build‑out can sustain profitability.
The Commodification of Sensitive Open Data
The European Union’s European Health Data Space (EHDS) regulation, adopted in March 2025, will make the electronic health records of roughly 450 million residents available for secondary use by March 2029. The framework defaults to inclusion, requiring citizens to opt out and offering...
Living Well with Data: Stewardship as a Just and Viable Paradigm
A new report by Reema Patel, authored by Stefaan Verhulst, outlines ten prevailing mental models that shape data governance, from data colonialism to data stewardship. The analysis argues that entrenched models have contributed to a growing data trust deficit, systemic...
Collate Launches AI‑driven Chat Analytics Dashboards Built on OpenMetadata
Collate Inc. introduced Collate AI Analytics, a chat‑based analytics platform that automatically discovers data sources, writes queries and creates dashboards using its OpenMetadata‑powered Semantic Context Graph. The tool is live for existing customers and will be generally available the first...
Optro Unveils MCP Server to Govern AI Access to GRC Data
Optro introduced a Model Context Protocol (MCP) server that lets customers link enterprise large language models to live governance, risk and compliance (GRC) data while preserving role‑based permissions. The product aims to cut manual data handling and provide an auditable...
GTCR Teams with Data Veteran Brian Crotty to Launch Avelis Holdings
GTCR announced a partnership with seasoned data executive Brian Crotty to form Avelis Holdings, a subscription‑based market‑intelligence platform targeting commodity and industrial sectors. The move underscores GTCR’s Leaders Strategy™ of building operating capabilities alongside seasoned CEOs to capture fragmented data...
Definity Secures $12 Million Series A to Build Agentic Data‑Engineering Platform
Chicago‑based Definity closed an oversubscribed $12 million Series A round led by GreatPoint Ventures, bringing total funding to $16.5 million. The financing will accelerate its agentic data‑engineering platform that claims to cut pipeline costs by over 30% and resolve Spark issues ten times...
Plaid Launches Upgraded Engine for Plaid Income, Boosting Verification Accuracy
Plaid announced a rebuilt engine behind its Plaid Income product, delivering a 48% lift in income classification accuracy and 84% precision for earned‑income detection. The upgrade adds a transformer‑based model and a richer six‑tier taxonomy, promising lower friction for fintech...
Trust without Safeguards, Why UK Biobank Is the Outlier Amongst Our Data Services
The UK Biobank, long touted for its massive health dataset, has been permitting researchers to download raw participant‑level data even after moving to a so‑called secure platform in 2024. Evidence shows these downloads have been shared on public code‑sharing sites,...
JWST’s ‘Red Monster’ Galaxy Pushes Early-Universe Limits, Highlights Big-Data Crunch
Astronomers using NASA’s James Webb Space Telescope have identified a dust‑laden galaxy, EGS‑z11‑R0, that existed just 400 million years after the Big Bang. The discovery, derived from a terabyte‑scale analysis of JWST’s public archive, challenges existing models of early galaxy formation...

Centralize Metrics with a DRY Analytics API
Instead of duplicating measures in each BI tool, store them centrally in an Analytics API. This is the DRY principle applied to metrics. One metric definition, accessible via GraphQL, SQL, or REST. https://www.ssp.sh/brain/analytics-api
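As a rough illustration of the "one metric definition" idea, here is a minimal Python sketch. It is not the linked article's implementation: the `Metric` class, the `METRICS` registry, the `orders` table and both rendering functions are hypothetical names chosen for this example. The point is only that each metric is defined once and every interface reads from that single definition.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Metric:
    """One central definition of a measure (all fields are illustrative)."""
    name: str
    expression: str  # SQL aggregate expression
    table: str       # source table


# Hypothetical registry: the only place where metrics are defined.
METRICS = {
    "revenue": Metric("revenue", "SUM(amount)", "orders"),
    "order_count": Metric("order_count", "COUNT(*)", "orders"),
}


def to_sql(metric_name: str, group_by: Optional[str] = None) -> str:
    """Render a metric as a SQL query, e.g. for a SQL-speaking BI tool."""
    m = METRICS[metric_name]
    if group_by is None:
        return f"SELECT {m.expression} AS {m.name} FROM {m.table}"
    return (
        f"SELECT {group_by}, {m.expression} AS {m.name} "
        f"FROM {m.table} GROUP BY {group_by}"
    )


def to_rest_payload(metric_name: str) -> dict:
    """Render the same metric as a JSON-style payload for a REST client."""
    m = METRICS[metric_name]
    return {"metric": m.name, "expression": m.expression, "source": m.table}


if __name__ == "__main__":
    print(to_sql("revenue", group_by="region"))
    print(to_rest_payload("order_count"))
```

Changing the expression for `revenue` in the registry changes it for every consumer at once, which is the duplication the DRY approach is meant to remove.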
Agents Disrupt Database Contracts; Build Defensive Data Layers
Have agents broken the unspoken contract we had with our databases? You know, human-authored apps running deterministic code and predictable queries? @arpit_bhayani wrote a terrific post for how to create a defensively designed data layer ... https://t.co/8OOXrmkD6g
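The post's specific recommendations are not reproduced in this digest, so the following is only one possible guardrail of the kind a defensive data layer might include, sketched in Python with the standard `sqlite3` module. The function name `run_agent_query` and the `MAX_ROWS` cap are assumptions made for illustration. It accepts a single read-only statement and bounds how many rows an agent can pull back.

```python
import sqlite3

MAX_ROWS = 100  # assumed cap on rows returned to an agent


def run_agent_query(conn: sqlite3.Connection, sql: str, max_rows: int = MAX_ROWS):
    """Run an agent-authored query under conservative, illustrative rules."""
    statement = sql.strip().rstrip(";").strip()
    # Conservative check: any remaining ';' is rejected, even inside a string literal.
    if ";" in statement:
        raise ValueError("multiple statements are not allowed")
    # Only plain SELECT statements pass; writes and DDL are refused.
    if not statement.lower().startswith("select"):
        raise ValueError("only SELECT statements are allowed")
    cursor = conn.execute(statement)
    # Bound the result size regardless of what the query asked for.
    return cursor.fetchmany(max_rows)


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE events (id INTEGER)")
    conn.executemany("INSERT INTO events VALUES (?)", [(i,) for i in range(250)])
    print(len(run_agent_query(conn, "SELECT id FROM events")))
```

A prefix check like this is deliberately crude. A production layer would more likely rely on database-level read-only roles, statement timeouts and query parsing rather than string inspection.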
Herb Morgan Pushes FI$Cal Modernization as Key to California’s $350B Budget Transparency
Herb Morgan, a former Wall Street CIO, is making FI$Cal modernization the centerpiece of his campaign for California State Controller, pledging a real‑time, AI‑enabled financial reporting platform. The effort targets the state’s $350 billion budget, aiming to flag suspicious spending daily...
Data Migration Remains Underestimated Despite Persistent Industry Challenges
Data Migration Is Still Hard: Why the Industry Keeps Underestimating It. "Every few years the IT industry rediscovers something that experienced practitioners already know: moving data is difficult." https://t.co/Qy0esoSAIR
AI Model Outperforms ER Doctors in Emergency Diagnosis, Study Shows
OpenAI's o1 reasoning model outperformed human emergency‑room physicians in a Science‑published study, delivering more accurate diagnoses across simulated and real‑world cases. Researchers say the result underscores the power of large‑scale clinical data and advanced analytics, while warning against replacing doctors...

AI vs Business Intelligence
Artificial Intelligence (AI) and Business Intelligence (BI) are distinct yet complementary data tools. AI focuses on learning from data to make real‑time predictions and automate complex decisions, while BI aggregates historical data for reporting and visualization. The article highlights AI’s...
Improving AI Accuracy with GraphRAG
AWS’s managed graph database Amazon Neptune is gaining traction as a catalyst for higher AI accuracy, especially in security and chatbot applications. Customers such as Trend Micro have lifted chatbot precision from 70% to 90% by leveraging Neptune’s relationship‑focused data...
Digital Tool to Analyse Maternity Data
The NHS is launching the Maternal Outcomes Signal System (MOSS), a digital platform that rapidly analyses routine maternity data to highlight emerging safety concerns. The tool will generate six‑month reports, prompting trusts to act on identified risks. The government has...
LegalOn Launches Vault AI Platform to Turn Contracts Into Business Intelligence
LegalOn Technologies introduced Vault, an AI‑driven add‑on that converts executed contracts into searchable business intelligence. The tool, now live for in‑house teams in North America and Europe, aims to solve the post‑signature management gap cited by 66% of legal professionals.
Turning Data Chaos Into Business‑Driving AI on Google Cloud
From Data Chaos to Confident AI: How Overdose Built a Semantic Foundation on Google Cloud https://t.co/MEyCiKRO4Q Overdose CEO Paul Pritchard unpacked exactly what it takes to get from scattered #data sources to #AI that actually moves the business forward.
AI Music Platforms Scale Up, Suno Claims 7 Million Songs Daily Amid Lawsuits
Suno, the most visible AI music startup, announced it can produce roughly seven million songs each day and recently settled a copyright dispute with Warner Music. The move comes as the company, along with rival Udio, navigates lawsuits from Universal...
Nigeria’s NAICOM Teams with NASRDA and UNDP to Deploy Satellite‑Based Flood Insurance Model
Nigeria’s National Insurance Commission (NAICOM) has signed a three‑way partnership with the National Space Research and Development Agency (NASRDA) and the United Nations Development Programme (UNDP) to roll out a geospatial flood‑risk insurance model for Lagos. The initiative leverages satellite...
Genomics Pioneer J. Craig Venter Dies, Leaving a Data‑Driven Legacy
J. Craig Venter, the founder of the J. Craig Venter Institute and a trailblazer in massive genomic sequencing, died on April 29, 2026. His work turned biology into a data‑rich discipline, influencing how big‑data tools are applied in life sciences. The loss...
Revolut Taps Akahu’s Open‑banking API to Launch NZ’s First Automated Credit‑card Underwriting
Revolut has teamed up with New Zealand’s open‑finance platform Akahu to pull real‑time banking data via APIs for credit‑card underwriting, becoming the first issuer in the market to automate decisions under the newly regulated open‑banking regime. The move promises faster...

Snowflake Intelligence Partner Solutions Bring AI Edge to Industries
Snowflake announced its Intelligence Partner Solutions, a suite of AI‑driven agents that let users ask natural‑language questions across both structured and unstructured data. Partners such as Anblicks, CitiusTech, Deloitte and others have built industry‑specific agents for retail, healthcare, finance and...

Boomi Shifts to Data Activation, Tackles AI Semantic Gap
I haven't written about @boomi in a while. That changed when I noticed they stopped calling themselves an integration company sometime in Q1 2026. The new label is "data activation company" and it's not cosmetic. Integration Platform as a Service...
Grant Is the Proper Term for Permission on Entities
Randomly remembered the time when Grant Henke worked on AuthZ for Apache Kafka and tried to come up with a term for "giving a user permission to perform action on an entity". Grant, it is called "grant". As in: "GRANT...

Datometry for Snowflake: Accelerate Teradata Migration
Datometry for Snowflake entered public preview, offering enterprises a lift‑and‑shift path from Teradata to Snowflake without code rewrites or downtime. The solution virtualizes Teradata SQL on Snowflake, enabling a three‑step repoint‑test‑transition workflow that can be completed in weeks. By eliminating...