Today's Big Data Pulse

Data‑Engineering Bottlenecks Shift From Legacy Tech to Leadership Gaps
Three 2026 surveys of 1,629 data professionals show that weak leadership direction and poor requirements now account for 40% of top‑bottleneck votes, outpacing legacy systems at 25%. By April, 50% of respondents cite lack of clear ownership as the biggest pain point, while better tooling is mentioned by under 5%.
Also developing:
By the numbers: Ampere Analysis acquires PlumResearch

CyrusOne Partners with Constellation for 760MW Data Center Campus in Texas
CyrusOne announced a 760 MW data‑center campus in Freestone County, Texas, partnering with Constellation’s Calpine unit. The project secures two 380 MW power agreements—Phase I and Phase II—leveraging the adjacent natural‑gas Freestone Energy Center. Construction is underway with an expected operational date in Q4 2026. The deal follows a similar CyrusOne‑Calpine collaboration near Dallas‑Fort Worth, expanding CyrusOne’s footprint of over 50 global facilities.

Deutsche Telekom Launches Nvidia AI Factory Data Center in Munich's Tucherpark
Deutsche Telekom has opened an Nvidia‑powered AI factory at the Polarise data centre in Munich’s Tucherpark, installing roughly 10,000 Blackwell GPUs including DGX B200 and RTX PRO systems. The facility, branded the Industrial AI Cloud, underpins the ‘Germany Stack’ – a sovereign...

Meta Purchases Additional 1,400 Acres for Hyperion Mega-Data Center Expansion
Meta has acquired an extra 1,400 acres adjacent to its Hyperion data‑center campus in Louisiana, expanding the footprint of a project originally planned on 2,250 acres. The Hyperion complex is designed to deliver 2 GW of compute power, with the potential...

Intersect Power Files to Develop Data Center in Texas
Intersect Power, a clean‑energy firm being acquired by Alphabet, has filed a Texas Department of Licensing and Regulation application to build a 761,000‑square‑foot data center near Miami, Texas. The Project Pumpkin 2A development will cost roughly $400 million and is scheduled for...
What AI Builders Can Learn From Fraud Models that Run in 300 Milliseconds
Mastercard’s Decision Intelligence Pro (DI Pro) uses a sub‑300 ms recurrent neural network to assign risk scores to each payment transaction in real time. The platform treats fraud detection as an "inverse recommender" problem, comparing current merchant behavior to historical patterns. By...

Pulsant Launches £10m Expansion of Milton Keynes Data Center
Edge provider Pulsant has completed a £10 million expansion of its Milton Keynes data centre, adding a 1.2 MW data hall optimized for high‑density AI workloads. The new facility is part of the company’s platformEDGE network, which now spans 14 edge sites...

The Cost of Caution: Why Oversizing in Data Center Design Is Breaking the Bank
The article warns that excessive caution in data‑center design leads to chronic oversizing. Studies show many facilities operate at only 20‑60% of installed capacity, inflating capital, energy and maintenance costs by up to 30%. Oversized power, cooling and backup systems...
AI Tools Remove Data Barriers, Driving Explosive Growth
I now constantly get questions about the SaaS meltdown, the role of AI, systems of record, etc. I don't have answers to all of these. But I do know that we saw an acceleration in our business in Q2, Q3, and...

Storage News Ticker – 9 February 2026
The storage‑focused news ticker highlighted a wave of AI‑centric and security‑driven product launches, from Aerospike’s default Dynamic Data Masking to Cloudera’s on‑prem AI inference and Trino‑powered warehouse. Databricks secured a $5 billion equity round, reporting $5.4 billion ARR with strong AI revenue,...
New Tech, Data, AI Roles Shaping the Last Five Years
Job titles that have emerged in Tech, Data and AI in the last 5 years: 👇

Supercapacitor Developer Skeleton Opens First US Engineering Facility in Houston, Texas
Estonian supercapacitor maker Skeleton Technologies opened its first U.S. engineering facility in Houston, Texas, to support AI data‑center customers. The graphene‑based devices can smooth power spikes and claim up to 45% energy savings for high‑performance computing workloads. Skeleton already has...

Aerospike 8.1.1 Introduces New Native Dynamic Data Masking for PII Protection and Regulatory Compliance
Aerospike released version 8.1.1, introducing native Dynamic Data Masking (DDM) for its high‑performance NoSQL database. The feature lets administrators define masking rules that hide personally identifiable information at the database layer, automatically applying to all users and machines except those...
EV Batteries Age Twice as Fast with Ultra-Fast Charging
Geotab’s telematics analysis of 22,700 EVs shows ultra‑fast charging above 100 kW roughly doubles battery degradation, reaching about 2.5 % capacity loss per year versus 1.2‑1.5 % with slower Level 2 charging. The effect intensifies when more than 12 % of sessions use high‑power chargers,...

Johnson Controls Launches New Chillers, Carrier Launches CRAH
Johnson Controls unveiled the York YDAM air‑cooled magnetic‑bearing centrifugal chiller, delivering 3.5 MW of cooling and capable of operating with 45 °C warm water. The company also previewed the two‑stage YK‑HT economizer that can produce 44 °F chilled water and 140 °F hot water simultaneously....

AtNorth Files to Expand Planned Data Center Campus in Kouvola, Finland
Nordic data‑center operator atNorth has filed a permit request to add three new buildings to its FIN04 campus in Kouvola, Finland. The expansion would raise the site’s planned capacity to 430 MW across a 45‑hectare footprint, up from the original 60 MW...
VMware Exiteers Targeted by Alibaba’s NexaVM
Swiss‑based NexaVM, backed by Alibaba, delivers an all‑in‑one VMware replacement aimed at mid‑market cloud service providers, sovereign‑cloud operators, and hardware‑agnostic OEMs. The platform bundles a production‑grade KVM hypervisor, Ceph‑based hyper‑converged storage, integrated Kubernetes‑as‑a‑Service, and multi‑tenant management under a subscription licence....

Global AI Set to Develop Data Center Outside Denver, Colorado
Global AI, in partnership with Saudi AI firm Humain, is converting a former Kodak‑Carestream site near Windsor, Colorado into a high‑density AI data center. After purchasing 438 acres for $15.6 million, the company plans to launch an 18‑24 MW facility by the...

Corning and Meta Form Multiyear Partnership for up to $6 Billion to Accelerate U.S. Data Center Buildout
Corning and Meta have signed a multiyear agreement worth up to $6 billion to accelerate the construction of advanced data centers in the United States. Under the deal, Corning will provide Meta with its latest optical fiber, cable and connectivity solutions...

Red Hat and NVIDIA Expand Partnership to Align on Next-Gen AI Infrastructure
Red Hat announced an expanded partnership with NVIDIA, introducing Red Hat Enterprise Linux for NVIDIA—a specialized RHEL edition tuned for the NVIDIA Rubin platform. The collaboration delivers Day 0 support for the Vera CPU, Rubin GPUs and BlueField‑4 DPU across Red...

IBM Announces Global RFP Process for AI-Driven Solutions Shaping the Future of Work and Education
IBM has issued a global request for proposals for the next cohort of its Impact Accelerator, targeting AI-driven solutions in education and workforce development. The program invites nonprofits, government entities, and academic institutions to develop tools that bridge the widening...
StarRocks Delivers DWH‑Level Joins on Lakehouse Natively
Today, I dig into the details of StarRocks and how it is gaining traction in the real-time database world. DWH-like joins and fast retrieval from a #Lakehouse-native data architecture, without additional data engineering work to persist and ingest data. https://www.ssp.sh/blog/starrocks-lakehouse-native-joins/

Vantage Denied Permission for Data Center Outside Frankfurt
US‑based Vantage Data Centers was denied permission by the Groß‑Gerau city council to build a 174 MW, €2.5 billion data‑center campus on a 14‑hectare site outside Frankfurt. The council voted 18‑14 against contract negotiations, citing concerns over limited job creation, visual impact,...

Singtel's Nxera Opens Singapore Data Center
Singapore telecom giant Singtel’s data‑centre arm Nxera has opened DC Tuas, a 120,000‑sq‑ft, eight‑storey facility delivering 58 MW of power, the highest‑capacity data centre in the city‑state. More than 90% of its multi‑tenant capacity was pre‑leased, and the site features advanced...

Top 7 Embedded Analytics Benefits for Business Growth
Embedded analytics moves data insights from isolated dashboards into the applications where decisions happen, cutting context‑switching delays. By delivering real‑time metrics within workflows, it accelerates decision cycles, boosts operational efficiency, and encourages higher user adoption. The approach also fuels customer...
Global Chip Sales Are on Track to Hit $1 Trillion Thanks to AI
Global semiconductor revenue is projected to breach the $1 trillion mark, propelled by explosive AI demand. The Semiconductor Industry Association reported $791.7 billion in sales for 2025, a 25.6% year‑over‑year increase. Tech giants are pouring capital into AI‑focused data‑center infrastructure, accelerating the...
Modern Tools Reshape Kimball’s Data Modeling Techniques
What's changed since Kimball wrote The Data Warehouse Toolkit:
1. Surrogate keys are less necessary with better databases
2. Denormalization for performance matters less with modern engines
3. Snapshotting dimensions beats complex SCD2 logic
4. Collaboration requirements mean looser conformance
Kimball's principles still matter. But...
Fail Fast & Ship It with Jeremy Custenborder | Ep. 18
In this episode, Viktor Gamov interviews Jeremy Custenborder of Confluent about his journey from a paper boy to a leader in large‑scale systems, focusing on his experience keeping MySpace operational at massive pre‑cloud scale. Jeremy explains how he built custom...
AI Success Mirrors Sports: Master Fundamentals Daily
Most AI programmes fail for the same reason sports teams lose: weak fundamentals. 🏅 In this video, I share 6 lessons from elite sport that make AI adoption work when it moves from pilot to daily workflow: 1️⃣ Clear roles...

#345 How to Drive Innovation with Brian Solis, Head of Global Innovation at ServiceNow
In episode #345, DataFramed hosts Adel Nehme and Richie Cotton sit down with Brian Solis, Head of Global Innovation at ServiceNow, to explore how organizations can foster a culture of continuous innovation. Solis emphasizes the importance of aligning innovation with...

From Penetration to Inclusion: How CRC Credit Bureau Is Re-Engineering Nigeria’s Credit Ecosystem
Nigeria’s credit penetration has topped 40%, signaling a rapid shift toward broader financial inclusion. CRC Credit Bureau, the country’s largest licensed bureau, has built the most comprehensive credit data ecosystem by pulling information from banks, fintechs, utilities, telcos, and digital...
[Industry News] Yodo1 Unveils IPVerse to Eliminate High-Stakes Guesswork in Global Game Licensing
Yodo1 Games has launched IPVerse, a free data‑intelligence platform that consolidates global IP and licensing information for in‑game collaborations. The service draws from over 80,000 events, 20,000 brands and 15,000 games, offering AI‑driven matching, popularity indices and regional insights, including...

UAE’s TII Challenges Big Tech Dominance with Open Source Falcon AI Models
The United Arab Emirates’ Technology Innovation Institute (TII) has released its Falcon family of large language models as open‑source, positioning the nation as a challenger to big‑tech AI dominance. Falcon models emphasize efficiency, with the 7‑billion‑parameter H1R delivering high‑performance reasoning...

Why Coinbase and Pinterest Chose StarRocks: Lakehouse-Native Design and Fast Joins at Terabyte Scale
StarRocks is attracting heavyweight users such as Coinbase, Pinterest and Fresha because it delivers sub‑second query latency on terabyte‑scale analytics while reading directly from lakehouse storage. The platform’s shared‑nothing architecture, colocated joins, caching layer and a cost‑based optimizer let it...
Regina Obe: PostGIS Patch Releases
The PostGIS development team announced a series of bug‑fix patch releases covering versions 3.2 through 3.6, while designating 3.0.12 and 3.1.13 as end‑of‑life. New patches include 3.6.2, 3.5.5, 3.4.5, 3.3.9 and 3.2.9, each addressing stability and security issues. Users still...
VMware Workstation Pro 25H2 Expands Hardware and OS Support
VMware released Workstation Pro 25H2 (version 17.6.4), expanding hardware compatibility and adding a host of new guest operating systems. The update introduces USB 3.2, hardware version 22, and a new dictTool CLI for editing VM configuration files. Licensing changed in 2024, making Workstation Pro free...

Visualize Churn Timing with Python Cohort Heatmaps
At work, we look at customer retention every month. A single churn rate doesn't tell the whole story: you need to show when customers are leaving. Here's how you can do it with a cohort analysis heatmap in Python.
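
The post's own code is not part of this summary, so the following is only a minimal sketch of the idea, assuming a hypothetical activity log of (customer, month) pairs. It groups customers by the month they first appear and computes the share of each cohort still active in later months; the names `ACTIVITY`, `cohort_retention` and `render` are illustrative. The text rendering stands in for a real heatmap, and the same matrix could be handed to a plotting library instead.

```python
from collections import defaultdict

# Hypothetical activity log: (customer_id, month_index) pairs, where
# month_index 0 is the first month in the data set.
ACTIVITY = [
    ("a", 0), ("a", 1), ("a", 2),
    ("b", 0), ("b", 1),
    ("c", 1), ("c", 2), ("c", 3),
    ("d", 1),
]

def cohort_retention(activity):
    """Return {cohort_month: [share of cohort active at offset 0, 1, ...]}."""
    first_seen = {}
    for customer, month in sorted(activity, key=lambda row: row[1]):
        first_seen.setdefault(customer, month)

    active = defaultdict(set)  # (cohort, offset) -> customers active then
    for customer, month in activity:
        cohort = first_seen[customer]
        active[(cohort, month - cohort)].add(customer)

    last_month = max(month for _, month in activity)
    table = {}
    for cohort in sorted(set(first_seen.values())):
        size = len(active[(cohort, 0)])
        offsets = range(last_month - cohort + 1)
        table[cohort] = [len(active[(cohort, k)]) / size for k in offsets]
    return table

def render(table):
    """Print a crude text heatmap: denser glyphs mean higher retention."""
    shades = " .:*#"
    for cohort, row in table.items():
        cells = "".join(shades[min(int(r * 4), 4)] * 2 for r in row)
        print(f"cohort {cohort}: {cells}  {[round(r, 2) for r in row]}")

RETENTION = cohort_retention(ACTIVITY)
render(RETENTION)
```

Reading across a row shows when a given cohort drops off, which is the timing information a single churn rate hides.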

Choosing Exit Over 2% $50M Gamble
I've been thinking about this math a lot and weighing if that 2% chance of $50M is worth it. I've built DataExpert.io to $2M ARR over the last three years. I could exit for $8‑10M now. In 3 years not...

Jan Kristof Nidzwetzki: EBPF Tracing of PostgreSQL Spinlocks
Jan Kristof Nidzwetzki’s article breaks down PostgreSQL’s spinlock mechanism, detailing its low‑latency design and adaptive backoff strategy. It explains the four‑function API (Init, Acquire, Release, Free) and shows how contention is handled through incremental micro‑second sleeps. The piece also introduces the...
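
As a rough illustration of the spin-then-back-off pattern described above, here is a toy Python sketch. It is not PostgreSQL's implementation: the class `BackoffSpinLock`, its parameters, and the doubling sleep schedule are assumptions made for the example, and `threading.Lock` stands in for an atomic test-and-set instruction.

```python
import threading
import time

class BackoffSpinLock:
    """Toy spinlock: spin briefly, then sleep for growing intervals."""

    def __init__(self, spins_before_sleep=100, min_sleep=1e-6, max_sleep=1e-3):
        self._flag = threading.Lock()  # stands in for an atomic test-and-set
        self.spins_before_sleep = spins_before_sleep
        self.min_sleep = min_sleep
        self.max_sleep = max_sleep

    def acquire(self):
        spins, delay = 0, self.min_sleep
        while not self._flag.acquire(blocking=False):
            spins += 1
            if spins >= self.spins_before_sleep:
                time.sleep(delay)                       # back off under contention
                delay = min(delay * 2, self.max_sleep)  # grow the sleep, capped
                spins = 0

    def release(self):
        self._flag.release()

def worker(lock, state, iterations):
    """Increment a shared counter under the lock."""
    for _ in range(iterations):
        lock.acquire()
        try:
            state["counter"] += 1
        finally:
            lock.release()

LOCK = BackoffSpinLock()
STATE = {"counter": 0}
THREADS = [threading.Thread(target=worker, args=(LOCK, STATE, 2000)) for _ in range(4)]
for t in THREADS:
    t.start()
for t in THREADS:
    t.join()
print(STATE["counter"])
```

The point of the backoff is that a waiter stops burning CPU once the lock is clearly contended, at the cost of some extra latency when the lock frees up mid-sleep.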

AI Flood Drowns Human Filters, Killing Traditional Docs
The signal-to-noise problem is about to become unsolvable for humans alone. When AI makes ALL content (texts, emails, PRs, job apps, grant apps, college apps, mail) high-quality and infinite in volume, every filtering process breaks. And when it can’t be filtered...
Healing Tables: When Day-by-Day Backfills Become a Slow-Motion Disaster
A data engineering team discovered that a three‑year SCD Type 2 backfill executed day‑by‑day produced 47,000 overlapping records, timeline gaps, and unrecoverable errors. The author introduced "Healing Tables," a framework that separates change detection from period construction and rebuilds the dimension in...
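
The separation of change detection from period construction can be sketched as follows. This is not the author's framework, only a minimal illustration under assumed inputs: `SNAPSHOTS`, `detect_changes` and `build_periods` are hypothetical names, and the data is a handful of daily snapshots of one attribute. Because every period's end is taken directly from the next period's start, overlaps and gaps cannot arise by construction.

```python
from datetime import date
from itertools import groupby

# Hypothetical daily snapshots: (customer_id, snapshot_date, tier).
SNAPSHOTS = [
    ("c1", date(2024, 1, 1), "free"),
    ("c1", date(2024, 1, 2), "free"),
    ("c1", date(2024, 1, 3), "pro"),
    ("c1", date(2024, 1, 4), "pro"),
    ("c1", date(2024, 1, 5), "free"),
    ("c2", date(2024, 1, 2), "pro"),
    ("c2", date(2024, 1, 3), "pro"),
]

def detect_changes(snapshots):
    """Step 1: keep only rows whose value differs from the previous row per key."""
    changes = []
    ordered = sorted(snapshots, key=lambda r: (r[0], r[1]))
    for key, rows in groupby(ordered, key=lambda r: r[0]):
        previous = object()  # sentinel that never equals a real value
        for _, day, value in rows:
            if value != previous:
                changes.append((key, day, value))
                previous = value
    return changes

def build_periods(changes):
    """Step 2: each period ends exactly where the next one starts."""
    periods = []
    for key, rows in groupby(changes, key=lambda r: r[0]):
        rows = list(rows)
        for i, (_, start, value) in enumerate(rows):
            end = rows[i + 1][1] if i + 1 < len(rows) else None  # None = current
            periods.append((key, value, start, end))
    return periods

PERIODS = build_periods(detect_changes(SNAPSHOTS))
for row in PERIODS:
    print(row)
```

Rebuilding the whole timeline from the change list in one pass avoids the incremental close-and-open updates that a day-by-day backfill repeats for every date.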

When Data Moves, Risk Moves with It: The Hidden Challenges of Warehousing Data
The episode explores how moving data into modern warehouses and lakes introduces hidden risks that go beyond technical challenges, emphasizing governance, data quality, and transformation controls. It highlights that inconsistencies in source systems, ambiguous definitions, and poorly documented transformation logic...

Radim Marek: Reading Buffer Statistics in EXPLAIN Output
The article explains how to read buffer statistics from PostgreSQL's EXPLAIN output, a skill essential for diagnosing I/O bottlenecks. Starting with PostgreSQL 18, EXPLAIN ANALYZE reports buffer metrics by default, so the BUFFERS option no longer has to be requested explicitly. It breaks down shared, local, and...
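
To make the shared-buffer numbers concrete, here is a small sketch that parses a `Buffers:` line and derives a cache hit ratio and the volume read from outside shared buffers. The plan text is a made-up excerpt, `shared_buffer_stats` is a hypothetical helper, and the 8 kB block size is PostgreSQL's default rather than something confirmed for any particular installation.

```python
import re

BLOCK_SIZE_BYTES = 8192  # PostgreSQL's default block size (assumed here)

# Hypothetical excerpt of EXPLAIN (ANALYZE, BUFFERS) text output.
PLAN_TEXT = """
Seq Scan on orders  (cost=0.00..1834.00 rows=100000 width=16) (actual time=0.012..9.871 rows=100000 loops=1)
  Buffers: shared hit=600 read=234
"""

def shared_buffer_stats(plan_text):
    """Extract shared hit/read block counts from the first Buffers line."""
    line = re.search(r"Buffers:.*", plan_text)
    if line is None:
        return None
    counts = {"hit": 0, "read": 0}
    shared = re.search(r"shared((?: \w+=\d+)+)", line.group(0))
    if shared:
        for name, value in re.findall(r"(\w+)=(\d+)", shared.group(1)):
            if name in counts:
                counts[name] = int(value)
    total = counts["hit"] + counts["read"]
    return {
        "hit_blocks": counts["hit"],
        "read_blocks": counts["read"],
        "hit_ratio": counts["hit"] / total if total else None,
        "read_bytes": counts["read"] * BLOCK_SIZE_BYTES,
    }

STATS = shared_buffer_stats(PLAN_TEXT)
print(STATS)
```

The counts are in blocks, not bytes, which is why the sketch multiplies by the block size before reporting a volume.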

Kubernetes Could Use a Different Linux Scheduler
Cambridge researchers introduced Latency‑Aware Group Scheduling (LAGS), a Linux kernel patch that reshapes how the scheduler handles Kubernetes workloads. By favoring short‑running tasks and cutting context‑switch overhead, LAGS lifts cluster throughput by roughly 10‑20 %. Benchmarks show the same 600‑container workload...

Is Your Machine Learning Pipeline as Efficient as It Could Be?
Machine learning teams are increasingly overlooking pipeline efficiency, a hidden driver of productivity. Slow data I/O, redundant preprocessing, and mismatched compute inflate the iteration gap, limiting the number of hypotheses tested per week. The article outlines five audit areas—data ingestion,...

Voltage Ride-Through: A Key Ingredient in Data Center Resilience
Voltage ride‑through (VRT) enables data‑center hardware to stay connected to the grid during brief outages or brownouts, preventing immediate shutdowns. By using UPS units that buffer power and can switch instantly to backup sources, VRT eliminates the latency of manual...
ETL: The Backbone of Modern Data Workflows
📊 The ETL data pipeline: from raw sources (databases, APIs, files) → clean & transform (cleaning, joining, aggregating) → load into a warehouse or analytics store for BI, reports & ML. E → T → L: the backbone of modern data workflows. 🚀
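
The three stages can be sketched end to end with the standard library alone. Everything here is illustrative: the CSV content, the `revenue_by_country` table and the function names are assumptions, and an in-memory SQLite database stands in for a real warehouse.

```python
import csv
import io
import sqlite3

# Hypothetical raw source: a CSV export with messy casing and a bad row.
RAW_CSV = """order_id,country,amount
1,us,10.50
2,US,4.50
3,de,7.00
4,de,not_a_number
"""

def extract(text):
    """E: read rows from the raw source."""
    return list(csv.DictReader(io.StringIO(text)))

def transform(rows):
    """T: clean values, drop unparseable rows, aggregate per country."""
    totals = {}
    for row in rows:
        try:
            amount = float(row["amount"])
        except ValueError:
            continue  # skip rows that fail basic validation
        country = row["country"].strip().upper()
        totals[country] = totals.get(country, 0.0) + amount
    return sorted(totals.items())

def load(records, conn):
    """L: write the aggregated result into a warehouse-style table."""
    conn.execute("CREATE TABLE revenue_by_country (country TEXT PRIMARY KEY, revenue REAL)")
    conn.executemany("INSERT INTO revenue_by_country VALUES (?, ?)", records)
    conn.commit()

CONN = sqlite3.connect(":memory:")
load(transform(extract(RAW_CSV)), CONN)
RESULT = CONN.execute(
    "SELECT country, revenue FROM revenue_by_country ORDER BY country"
).fetchall()
print(RESULT)
```

Keeping each stage a separate function makes it easy to test the cleaning rules without touching the source or the target.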

Shinya Kato: Reducing Row Count Estimation Errors in PostgreSQL
The article outlines four progressive techniques for cutting row‑count estimation errors in PostgreSQL. It begins with per‑table autovacuum tuning to keep statistics fresh, then moves to raising column‑level statistics targets for finer sampling. Next, it introduces extended statistics to model...
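
One common source of row-count misestimates, and a typical use case for PostgreSQL's extended statistics, is correlated columns. The worked example below is an assumption-laden sketch, not taken from the article: it builds a hypothetical table where city and zip code always match, then compares the estimate obtained by multiplying per-column selectivities (the independence assumption) with the true count.

```python
# Hypothetical table: every row has a city and a matching zip code, so the
# two columns are perfectly correlated.
ROWS = (
    [("Berlin", "10115")] * 500
    + [("Munich", "80331")] * 300
    + [("Hamburg", "20095")] * 200
)

def selectivity(rows, predicate):
    """Fraction of rows matching a predicate."""
    return sum(1 for row in rows if predicate(row)) / len(rows)

TOTAL = len(ROWS)
SEL_CITY = selectivity(ROWS, lambda r: r[0] == "Munich")
SEL_ZIP = selectivity(ROWS, lambda r: r[1] == "80331")

# Independence assumption: multiply the per-column selectivities.
ESTIMATED = round(TOTAL * SEL_CITY * SEL_ZIP)
# Ground truth for WHERE city = 'Munich' AND zip = '80331'.
ACTUAL = sum(1 for r in ROWS if r == ("Munich", "80331"))

print(f"estimated rows: {ESTIMATED}, actual rows: {ACTUAL}")
```

In this toy case the multiplied estimate is well below the real count, the kind of gap that can push a planner toward a poor join strategy.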
New MIT Framework Uses Search to Handle LLM Errors in AI Agents
MIT CSAIL and Asari AI introduced EnCompass, a Python framework that adds systematic search and backtracking to AI agents using large language models. Developers annotate "branchpoints" where LLM outputs may vary, then the runtime explores execution paths with strategies such...
How Database Professionals Sent a Million Emails a Day From SQL Server
SQL Server Central’s engineers built a custom email‑queue using plain SQL Server tables and a lightweight .NET sender to replace an expensive third‑party service. By inserting newsletters and user addresses into an EmailNewsletter table, they processed batches of 500 rows,...
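
The table-as-queue pattern can be sketched in a few lines. This is only an illustration under stated assumptions: SQLite stands in for SQL Server, Python stands in for the .NET sender, the `EmailNewsletter` table name and the batch size of 500 come from the summary, and the column layout, `send_email` placeholder and `process_next_batch` helper are invented for the example.

```python
import sqlite3

BATCH_SIZE = 500  # batch size mentioned in the summary

CONN = sqlite3.connect(":memory:")
# EmailNewsletter is named in the summary; the columns are assumptions.
CONN.execute(
    "CREATE TABLE EmailNewsletter ("
    " id INTEGER PRIMARY KEY, address TEXT NOT NULL, sent INTEGER NOT NULL DEFAULT 0)"
)
CONN.executemany(
    "INSERT INTO EmailNewsletter (address) VALUES (?)",
    [(f"user{i}@example.com",) for i in range(1200)],
)

def send_email(address):
    """Placeholder for the real sender; always reports success here."""
    return True

def process_next_batch(conn, batch_size=BATCH_SIZE):
    """Send up to batch_size unsent rows and mark them; return how many were sent."""
    rows = conn.execute(
        "SELECT id, address FROM EmailNewsletter WHERE sent = 0 ORDER BY id LIMIT ?",
        (batch_size,),
    ).fetchall()
    sent_ids = [(row_id,) for row_id, address in rows if send_email(address)]
    conn.executemany("UPDATE EmailNewsletter SET sent = 1 WHERE id = ?", sent_ids)
    conn.commit()
    return len(sent_ids)

BATCH_COUNTS = []
while (count := process_next_batch(CONN)) > 0:
    BATCH_COUNTS.append(count)
print(BATCH_COUNTS)
```

Marking rows as sent only after the sender reports success means a crash mid-batch leaves unsent rows in the queue for the next run.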
VAST Data Plans Funding Round so Early Stock Holders Can Get Cash
VAST Data is closing a $1 billion financing round that values the AI‑focused storage vendor at roughly $30 billion. The bulk of the cash will come from a secondary sale in which existing shareholders sell 5‑6 shares for each newly issued share, generating $700‑$850 million...