Know What's Happening in Big Data

Today's Big Data Pulse

Leadership Gaps Hamper Data Engineering Teams, Survey Finds

Three 2026 surveys of 1,629 data professionals reveal organizational issues now dominate data‑engineering bottlenecks. In January, weak leadership direction and poor requirements accounted for 40% of top‑bottleneck votes, while by April 50% cited lack of clear ownership as the biggest pain point. Legacy systems and tooling were far lower priorities, at 25% and under 5% respectively.

Snowflake Kafka Connector V4
News | Apr 28, 2026

Snowflake announced the general availability of Kafka Connector V4, which defaults to schematized ingestion that maps each JSON key to a table column. The new connector runs on Java 11+, supports Apache Kafka 2.x‑3.x, and integrates with standard Confluent converters. Benchmarks...

By Snowflake Blog
Why Data Infrastructure Is the Key to AI in Finance
News | Apr 28, 2026

At the Microsoft AI Tour in London, LSEG highlighted how consolidating fragmented data into a unified lake—powered by Microsoft Foundry, Defender, Purview and OneLake—has unlocked over 33 petabytes of AI‑ready financial content. A McKinsey survey shows 63% of financial firms have...

By Fintech Global
I Reviewed 6 Best ETL Tools for Data Transfer Efficiency in 2026
News | Apr 28, 2026

Shreya Mattoo’s 2026 review identifies six ETL platforms—Google Cloud BigQuery, Databricks, Domo, IBM watsonx.data, SnapLogic, and Workato—as the market’s top performers based on G2 user data. BigQuery excels in real‑time analytics with a serverless model, while Databricks offers a unified lakehouse for...

By G2 Learn
Data Agents Need Context Graphs. Can Your Data Pipelines Cater?
News | Apr 28, 2026

The article argues that decision traces—approvals, exceptions, and reasoning captured in Slack, email, and workflow tools—are fundamentally events and can be processed with existing behavioral‑event pipelines. By storing these traces in a warehouse‑native context graph, organizations can reuse the same...

By RudderStack
Sensitive Data as Venn Diagram
Blog | Apr 28, 2026

Healthcare data is split into "Normal" (non‑sensitive) and "Restricted" (sensitive) categories. Sensitive records receive specific sensitivity codes in the FHIR Resource.meta.security tag, creating a Venn diagram of overlapping topics such as Sexual Health, Mental Health, and Substance Use. The tags...

By Healthcare Exchange Standards
Eli Lilly and NVIDIA Deploy 1,016‑GPU LillyPod Supercomputer to Speed AI Drug Discovery
News | Apr 28, 2026

Eli Lilly has launched LillyPod, a dedicated AI supercomputer built with 1,016 NVIDIA Blackwell Ultra GPUs, to accelerate drug discovery. The Indianapolis facility works alongside Lilly’s wet lab in San Francisco, promising faster analysis of legacy data and new molecular designs...

By Pulse
NAB Show 2026: Hydrolix Named “Data Observability Solution Provider of the Year” In 2026 Data Breakthrough Awards
Blog | Apr 28, 2026

Hydrolix was named Data Observability Solution Provider of the Year in the 2026 Data Breakthrough Awards, marking its third straight win. The award follows previous honors for cloud data warehousing and observability innovation, underscoring the platform’s real‑time visibility at petabyte...

By StorageNewsletter
AI Won’t Fix Your Data Problems. Data Engineering Will
News | Apr 28, 2026

Enterprise AI projects often prioritize models while overlooking the quality of internal data. Because most large‑language models are trained on public datasets, they lack the contextual grounding needed for a company’s unique customer, billing, and usage records. The resulting gaps...

By CIO.com
Google's LangExtract: Free, Open‑Source Tool Beats $100K Solutions
Social | Apr 28, 2026

RIP document extractors. Google just released LangExtract: Open-source. Free. Better than $100K enterprise tools. Here’s what it does: 🧵

By Matt Dancho
UK Revises AI Data Center Emissions Forecast to 123 Million Tons CO₂
News | Apr 28, 2026

British officials quietly updated the country's Compute Roadmap, raising projected AI data‑center carbon emissions to 123 million metric tons between 2025 and 2035, about 100 times the earlier 0.142 million‑ton estimate and comparable to the output of 2.7 million people. The revision, first spotted by Politico...

By Pulse
Insurers Need Real-Time Data Capabilities
Blog | Apr 28, 2026

Insurers are no longer struggling to collect data but to act on it before it becomes stale. Legacy batch‑processing systems and entrenched data silos create 24‑hour delays that expose insurers to fraud and inefficiencies. The article outlines a five‑step roadmap—prioritizing...

By Insurance Thought Leadership (ITL)
Bitemporal Modeling: Managing Data Across Valid and Transaction Times
Social | Apr 28, 2026

Valid time vs transaction time. When you need both. Bitemporal modeling handles historical data along two distinct timelines. https://www.ssp.sh/brain/bitemporal-modeling

By SSP Data
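
To make the two timelines in the item above concrete, here is a minimal Python sketch. It is not taken from the linked note: the `PriceVersion` record, its field names, and the `price_as_of` helper are all hypothetical, chosen only for illustration. Valid time records when a fact was true in the real world; transaction time records when the system stored that belief.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional

FOREVER = date.max  # sentinel for an open-ended interval (assumption of this sketch)


@dataclass(frozen=True)
class PriceVersion:
    """One bitemporal row; all field names are hypothetical."""
    sku: str
    price: float
    valid_from: date     # valid time: when the fact became true in the real world
    valid_to: date       # exclusive upper bound
    recorded_from: date  # transaction time: when this row was stored
    recorded_to: date    # exclusive; FOREVER means the row is still the current belief


def price_as_of(rows: List[PriceVersion], sku: str,
                valid_on: date, known_on: date) -> Optional[float]:
    """Return the price valid on `valid_on`, as the system believed it on `known_on`."""
    for row in rows:
        if (row.sku == sku
                and row.valid_from <= valid_on < row.valid_to
                and row.recorded_from <= known_on < row.recorded_to):
            return row.price
    return None


rows = [
    # Recorded Jan 1: the price is 10.0 from Jan 1 onward.
    PriceVersion("A", 10.0, date(2026, 1, 1), FOREVER, date(2026, 1, 1), date(2026, 2, 1)),
    # Correction recorded Feb 1: the price actually changed to 12.0 on Jan 15.
    PriceVersion("A", 10.0, date(2026, 1, 1), date(2026, 1, 15), date(2026, 2, 1), FOREVER),
    PriceVersion("A", 12.0, date(2026, 1, 15), FOREVER, date(2026, 2, 1), FOREVER),
]

before_fix = price_as_of(rows, "A", date(2026, 1, 20), known_on=date(2026, 1, 25))
after_fix = price_as_of(rows, "A", date(2026, 1, 20), known_on=date(2026, 2, 10))
```

The same real-world date (Jan 20) yields different answers depending on when the question is asked, because the correction closes the old row's transaction interval rather than overwriting it.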
Your LLM Issues Are Really Data Issues
Podcast | Apr 28, 2026 | 31 min

In this episode, Ryan Donovan talks with Harsha Chintalapani, co‑founder and CTO of Collate, about why the biggest challenges facing LLMs in production are actually data problems. Harsha explains how issues like schema drift, ambiguous business definitions, data discovery, lineage,...

By Stack Overflow Podcast
PharosIQ Launches atlasIQ Intelligence at Forrester B2B Summit, Targeting Real-Time Buyer Insight
News | Apr 28, 2026

pharosIQ debuted its atlasIQ Intelligence platform at the Forrester B2B Summit in Phoenix, positioning the solution as the next‑generation buyer‑intelligence layer for go‑to‑market teams. The launch coincides with the company's reported double‑digit organic revenue growth in 2025 and continued momentum...

By Pulse
Genie Becomes Databricks’ Core AI for Semantic Analysis
Social | Apr 28, 2026

Genie is now the most important way to do data analysis in Databricks. What's unique about it is its ability to extract semantics from your entire Lakehouse, enabling it to answer complex data questions that cripple agents without a deep...

By Ali Ghodsi
Amazon Quick Unifies Enterprise Data Across Cloud
Social | Apr 28, 2026

First announcement: Amazon Quick. A solution to bring enterprise data together. #CIO #AI #Cloud #WhatsNextWithAWS https://t.co/tgebOMT351

By Tim Crawford
Google Cloud and NetApp Launch Fully Available Flex Unified Storage, B2BDaily Reviews Impact
News | Apr 28, 2026

Google Cloud and NetApp announced that NetApp Volumes with Flex Unified is now generally available in every Google Cloud region. B2BDaily’s review highlights how the unified file‑and‑block service could reshape enterprise data lakes, tighten governance and accelerate AI analytics.

By Pulse
Pharma R&D Embraces Data Lakehouses, Boosting Speed and Cutting Costs
News | Apr 27, 2026

AstraZeneca, Illumina and Pfizer are moving their research data onto lakehouse platforms such as Databricks, Snowflake and Apache Iceberg. Pfizer reports a four‑fold query speed increase and a 57% reduction in total cost of ownership, signaling a sector‑wide shift toward...

By Pulse
Enterprises Rebuild Data Stack to Power AI, MIT Technology Review Finds
News | Apr 27, 2026

MIT Technology Review’s latest analysis warns that fragmented data infrastructures are the biggest barrier to enterprise AI adoption. Senior leaders from Databricks and Infosys stress that unified, governed data stacks are essential for trustworthy, scalable AI outcomes.

By Pulse
Backfilling: The Mark of a Great Data Engineer
Social | Apr 27, 2026

Backfilling is where you see the difference between a data engineer and a great data engineer. A backfill means taking a data asset normally updated incrementally and updating historical parts of it. https://www.ssp.sh/brain/backfill

By SSP Data
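
The definition in the item above can be sketched in a few lines of Python. This is an illustrative assumption, not the approach from the linked note: the `daily_totals` table, the `build_partition` transform, and the event tuples are hypothetical. The key property shown is that each partition is overwritten rather than appended to, so re-running a backfill over the same range is safe.

```python
from datetime import date, timedelta


def build_partition(events, day):
    """Hypothetical transform: total amount for a single day's partition."""
    return sum(amount for event_day, amount in events if event_day == day)


def run_incremental(table, events, day):
    """Normal incremental update: (re)build exactly one daily partition."""
    table[day] = build_partition(events, day)  # overwrite keeps the step idempotent


def backfill(table, events, start, end):
    """Rebuild every historical partition from start to end, inclusive."""
    day = start
    while day <= end:
        run_incremental(table, events, day)
        day += timedelta(days=1)


events = [(date(2026, 4, 1), 10), (date(2026, 4, 2), 5)]
daily_totals = {}
run_incremental(daily_totals, events, date(2026, 4, 1))
run_incremental(daily_totals, events, date(2026, 4, 2))

# A late event for Apr 1 arrives after that partition was already built.
events.append((date(2026, 4, 1), 7))
backfill(daily_totals, events, date(2026, 4, 1), date(2026, 4, 2))
```

Reusing the incremental step inside the backfill means historical and daily runs share one code path, which is one common way to keep the two from drifting apart.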
Alteryx Releases AI Insights Agent on Google Cloud Marketplace, Brings Trust to Datasets
News | Apr 27, 2026

Alteryx has launched the AI Insights Agent on Google Cloud Marketplace, embedding governed analytics into Gemini Enterprise. The agent runs predefined Alteryx One workflows directly on BigQuery, delivering AI‑generated answers that respect business logic, auditability and compliance. By keeping data...

By Database Trends & Applications (DBTA)
Structured Data Drives Accurate, Low‑Risk AI Results
Social | Apr 27, 2026

#AI is only as good as the structure behind your data. Same model. Two setups. Completely different results. Structured + metadata → higher accuracy, lower risk. Unstructured → messy, inconsistent outputs. That’s the bottleneck most teams ignore. Check out the...

By Ronald van Loon Threads
Metadata Is the Database; Data only Works Through It
Social | Apr 27, 2026

Metadata is the database. Data is just the thing metadata lets you find, interpret, route, lock, replicate, and recover.

By Gwen (Chen) Shapira
Why California's Data Broker Registry Matters More than Its Delete Button
News | Apr 27, 2026

California’s Delete Request and Opt‑Out Platform (DROP) shifts focus from consumer‑driven deletions to a public data‑broker registry that forces disclosure of sensitive data practices. Brokers must report whether they collect minors’ information, geolocation, or health‑related data, giving regulators a centralized...

By Route Fifty — Finance
Angry Chickz VP Tonya McCoy Fuels 63% Sales Surge with Data‑first Marketing
News | Apr 27, 2026

Tonya McCoy, Vice President of Marketing for the Angry Chickz franchise, has leveraged data‑driven tactics to lift territory unit sales by 63% and oversee multi‑million‑dollar marketing budgets. Her 28‑year food‑service marketing career underscores Angry Chickz’s people‑first, performance‑focused culture.

By Pulse
Dartmouth Researchers Launch Smartphone Study to Predict Alzheimer’s Risk in Williamstown Seniors
News | Apr 27, 2026

Dartmouth Medical School researchers began a pilot study with 23 Williamstown seniors, part of a nationwide 200‑person trial, to test the RealVision smartphone app that analyzes walking, speech, eye‑tracking and smiling to flag early Alzheimer’s risk. The effort showcases big‑data...

By Pulse
Regulators Target Hotel AI Use Over Data Privacy
Social | Apr 27, 2026

Regulators turn attention to hotel AI governance: “Hotels are increasingly deploying AI for functions such as … dynamic pricing, and personalised marketing. These systems rely on large volumes of behavioural and transactional data, raising questions about consent, storage, and secondary...

By Glen Gilmore
Visier Launches Next‑Gen Workforce AI Platform for Enterprise Talent Analytics
News | Apr 27, 2026

Visier announced the launch of its next‑generation Workforce AI platform at the Outsmart conference in Palm Springs, adding AI‑driven assistants and a Glean MCP integration. Built on data from more than 2 million users, the solution promises to embed workforce insights...

By Pulse
DESI Releases Petabyte-Scale 3‑D Cosmic Map, Sparking Big Data Race
News | Apr 27, 2026

The Dark Energy Spectroscopic Instrument (DESI) has completed its first full survey, releasing a 3‑D map of 47 million galaxies and 20 million stars that amounts to petabytes of data. The unprecedented dataset promises breakthroughs in cosmology while testing the limits of...

By Pulse
AI Dismisses Data Mesh vs Fabric, Emphasizes Fundamentals
Social | Apr 27, 2026

Data mesh vs data fabric was "the battle of the storytellers." Zhamak against the fabriconians. Then AI walked in and wiped both off the map. Storytellers win the round. Fundamentals win the decade. https://t.co/NPDJeWrmaE

By Yves Mulkers
Large UK Companies in the Dark About How Their Data Is Used Overseas by AI
News | Apr 27, 2026

Large UK corporations are increasingly uncertain about how their proprietary data is being accessed and processed by artificial‑intelligence systems located abroad. A recent industry survey reveals that most firms lack clear visibility into cross‑border data flows, leaving them vulnerable to...

By Financial Times – Technology
MoD Working up Enhanced ‘Commercial Leakage’ Analytics Capability, Perm Sec Says
News | Apr 27, 2026

The UK Ministry of Defence (MoD) is rolling out an enhanced "commercial leakage" analytics capability built on Oracle Fusion Cloud and AI‑driven cloud analytics to spot fraud and errors in its massive invoicing process. Over the past three years the...

By PublicTechnology.net (UK)
U.S. Air Force Issues AI‑Centric Data Strategy, Calls for Immediate Overhaul
News | Apr 27, 2026

On April 17, 2026, Secretary of the Air Force Troy Meink signed two new strategy documents that embed artificial intelligence at the heart of Air Force operations. The papers expose a broken data architecture, mandate a data‑mesh approach, and create...

By Pulse
Rubrik Adds Cyber‑Resilience to Google Cloud SQL, Boosting Immutable Backups for PostgreSQL
News | Apr 27, 2026

Rubrik announced today a cyber‑resilience add‑on for Google Cloud SQL that delivers immutable, automated backups for managed PostgreSQL workloads. The integration promises ransomware‑proof protection and rapid cross‑region recovery without altering existing disaster‑recovery architectures.

By Pulse
Hershey Deploys Real-Time AI Marketing Mix Model to Optimize Media Spend
News | Apr 27, 2026

Hershey announced that its AI-enabled marketing mix model will go live in May, giving the company real‑time visibility into sales and media performance. The system replaces a three‑times‑yearly planning cycle with monthly updates, ingesting hundreds of thousands of data points...

By Pulse
NOODL. An Experiment in Equitable Data Licensing: Promise and Limits
Blog | Apr 26, 2026

The Nwulite Obodo Open Data License (NOODL) is a tiered licensing model designed for African language datasets, aiming to close the equity gap between researchers in the Global South and multinational firms. Built on Creative Commons foundations, it grants permissive...

By GovLab — Digest —
IRS Deploys AI to Offset Auditor Shortages and Sharpen Audit Risk
News | Apr 26, 2026

The Internal Revenue Service announced the rollout of artificial‑intelligence and advanced‑analytics tools to identify high‑risk returns, a move designed to counter a quarter‑size loss of tax examiners and shrinking enforcement budgets. Agency leaders say the technology will improve compliance while...

By Pulse
TechVision Unveils QuantumMind AI Model to Accelerate Large-Scale Data Analysis
News | Apr 26, 2026

TechVision Inc. launched its QuantumMind AI model, claiming it can process massive data sets faster and more accurately than existing tools. The announcement positions the model as a potential game‑changer for enterprises grappling with ever‑growing data volumes.

By Pulse
Overstock.com Boosts Data Science Velocity 500% to Power Hyper‑Personalized Shopping
News | Apr 26, 2026

In a recent interview, Craig Kelly, group product manager at Overstock.com, detailed how the retailer amplified data‑science velocity by more than 500%, halved the cost of moving models to production and now launches new models five times faster. The changes...

By Pulse
OpenAI Triples Web Crawl Scale, Study Reveals
Social | Apr 26, 2026

Interested in how OpenAI is crawling (across its user-agents)? Good stuff from Chris Long of Nectiv based on 7B+ log file entries. Some good nuggets of information in the study. https://www.botify.com/blog/openai-tripled-web-crawl

By Glenn Gabe
Matrix Booking Launches Sensor‑integrated "Sense" Platform at The Workplace Event 2026
News | Apr 26, 2026

Matrix Booking announced its new Sense platform at The Workplace Event 2026, adding occupancy and environmental sensors to its booking software. The move targets hybrid‑work challenges, cost inefficiencies and sustainability pressures, offering real‑time, anonymised analytics for office managers.

By Pulse
ITC Holdings CEO Calls Grid Monitoring Key to Modernizing U.S. Transmission
News | Apr 26, 2026

ITC Holdings CEO Charles Marshall said the company’s expanded grid‑monitoring platform and a 450‑mile transmission build‑out in Michigan will boost reliability and accommodate renewable generation. The comments come as the firm operates 16,000 miles of high‑voltage lines across the Midwest.

By Pulse
Google Cloud Unveils Agentic Enterprise Stack and New TPUs at Next 2026
News | Apr 26, 2026

At its Las Vegas Next 2026 conference, Google Cloud introduced a suite of agentic‑focused services—including split eighth‑generation TPUs, the Gemini Enterprise Agent Platform, an Apache Iceberg‑based Agentic Data Cloud, and an Agentic Defense stack—supported by a planned $175‑185 billion capital spend...

By Pulse
SEBI Chief Calls for Vision‑Led Tech Framework as Indian Markets Face Disruption
News | Apr 26, 2026

At SEBI’s Foundation Day, Chairman Tuhin Kanta Pandey announced a push for a vision‑led, technology‑centric regulatory framework to safeguard India’s $120 billion annual capital formation market. He highlighted AI‑enabled supervision, e‑office migration and data‑analytics capacity as core pillars.

By Pulse
Snowflake Unveils AI Platform Upgrade to Serve as Enterprise Control Plane for Agents
News | Apr 26, 2026

Snowflake launched an AI platform upgrade that integrates Snowflake Intelligence and Cortex Code, positioning the data cloud as a control plane for AI agents. The upgrade adds direct hooks to everyday tools such as Gmail, Salesforce and Slack, and supports...

By Pulse
Loop Secures $95M Series C to Accelerate AI‑Driven Supply‑Chain Platform
News | Apr 26, 2026

Loop announced a $95 million Series C round led by Valor Equity Partners and the Valor Atreides AI Fund, with participation from 8VC, Founders Fund, Index Ventures, J.P. Morgan Growth Equity Partners, and Tao Capital Partners. The capital will fund team...

By Pulse
Netherlands Signs €0 Deal with STACKIT to Shift Government Cloud to Europe
News | Apr 26, 2026

The Netherlands has inked a contract with German cloud provider STACKIT to host government data within the EU, aiming to curb dependence on U.S. tech giants. Ministers say the move strengthens digital resilience and sparks European market growth.

By Pulse
Google Unveils Cross‑Cloud Lakehouse Service Linking BigQuery to Amazon S3
News | Apr 25, 2026

Google announced a cross‑cloud lakehouse service that lets BigQuery run SQL directly against Amazon S3 data using the Iceberg table format. The zero‑copy integration is positioned as a counter‑move to competitors’ more closed data ecosystems.

By Pulse
Cox Automotive Acquires Fullpath in $100M‑Plus AI Deal to Bolster Dealer Platform
News | Apr 25, 2026

Cox Automotive announced the acquisition of Jerusalem‑based Fullpath, an AI‑powered dealership data platform, in a deal estimated at over $100 million. The move deepens Cox’s AI capabilities for its connected retail suite, targeting the $2 trillion U.S. automotive market.

By Pulse