Know What's Happening in Big Data

Today's Big Data Pulse

Data‑Engineering Bottlenecks Shift From Legacy Tech to Leadership Gaps

Three 2026 surveys of 1,629 data professionals show that weak leadership direction and poor requirements now account for 40% of top‑bottleneck votes, outpacing legacy systems at 25%. By April, 50% of respondents cite lack of clear ownership as the biggest pain point, while better tooling is mentioned by under 5%.

Cohesity Deepens Google Cloud Integration
News | Feb 5, 2026

Cohesity has integrated Google Cloud Threat Intelligence directly into the Cohesity Data Cloud UI and added Google Private Scanning for secure, privacy‑preserving malware detonation. The enhancement gives customers real‑time visibility into indicators of compromise and streamlines threat analysis without leaving...

By Blocks & Files
MinIO Plugs Apache Iceberg Tables Directly Into AIStor
News | Feb 5, 2026

MinIO has made its AIStor Tables feature generally available, embedding the full Apache Iceberg V3 Catalog REST API directly into its object storage platform. The integration lets enterprises treat Iceberg tables as first‑class citizens, unifying structured and unstructured data for...

By Blocks & Files
DDN Appoints Vice Chairman Amid Enterprise AI Expansion
News | Feb 5, 2026

Data Direct Networks (DDN) has named former Cisco and Groq executive Mohsen Moazami as vice chairman, a move seen as positioning the high‑performance computing firm for a public offering. The appointment follows a $300 million Blackstone investment and a strategic shift...

By Blocks & Files
Airbnb’s Open-Source GraphQL Framework with Adam Miskiewicz
Podcast | Feb 5, 2026 | 55 min

In this episode, Adam Miskiewicz, Principal Software Engineer at Airbnb, explains how the company built Viaduct, an open‑source data‑oriented service mesh and GraphQL platform that unifies a central schema across millions of microservices. He details the architectural principles—centralized schema, consistent...

By Software Engineering Daily – Data
Cyberhaven Introduces Unified AI and Data Security Platform
News | Feb 5, 2026

Cyberhaven launched a unified AI‑driven Data Security Posture Management platform that integrates DSPM, DLP, insider risk management and AI security across endpoints, SaaS, cloud and on‑prem environments. The solution leverages comprehensive data lineage and agentic AI to provide continuous visibility,...

By Database Trends & Applications (DBTA)
DWP Rejigs Operating Model for Data Transformation by 2030
News | Feb 5, 2026

The UK Department for Work and Pensions (DWP) unveiled a seven‑year data strategy (2023‑2030) that pivots to a federated hub‑and‑spoke operating model for data management and governance. The plan targets a 20% cost reduction over five years, modernises legacy systems,...

By Computer Weekly – Latest IT news
Deep Data Science and Startup Can't Coexist Full‑Time
Social | Feb 5, 2026

Let me explain to y’all saying can’t i do both the $370k full time data science role at Anthropic & my startup. As someone who has analyzed 2.5billion daily active youtube user data, data science is intense You cannot do deep...

By Omozara
Cerebras Systems Raises $1 Billion Series H
News | Feb 4, 2026

Cerebras Systems closed a $1 billion Series H financing round, valuing the company at roughly $23 billion post‑money. The round was led by Tiger Global and included investors such as Benchmark, Fidelity, AMD and Coatue. Proceeds will accelerate production of the Wafer Scale...

By EnterpriseAI (AIwire)
TUM Unveils EU’s 1st 7nm AI Chip with Local Processing and RISC-V Architecture
News | Feb 4, 2026

Technical University of Munich unveiled the EU’s first 7‑nanometer AI chip, a neuromorphic processor built on an open‑source RISC‑V architecture. Designed by Prof. Hussam Amrouch, the chip processes data locally, promising higher privacy and security than cloud‑centric solutions. Production will shift...

By EnterpriseAI (AIwire)
Cluster API Update Makes Managing Kubernetes Environments Simpler
News | Feb 4, 2026

The Cluster API project released version 1.12.0, adding in‑place machine updates and chained upgrade capabilities for Kubernetes clusters. The update introduces declarative update extensions that let teams modify existing nodes without recreating them, leveraging only create and delete primitives. Fabrizio Pandini...

By Container Journal
NV5 GeoAgent Offers Autonomous Geospatial Intelligence
News | Feb 4, 2026

NV5 unveiled GeoAgent, an agentic AI platform that automates geospatial intelligence through natural‑language interaction. The solution acts as an operational layer over existing tools, orchestrating data discovery, multimodal analytics, and mission‑ready outputs without requiring specialist expertise. By shifting from tool‑driven...

By Database Trends & Applications (DBTA)
Pentaho Enhances Flagship Data Integration Suite, Introducing Version 11
News | Feb 4, 2026

Pentaho announced the release of Data Integration and Business Analytics Version 11, a major platform upgrade aimed at simplifying data workflows and supporting AI initiatives. The update introduces a browser‑based Pipeline Designer, Project Profile for organizing ETL assets, and a new...

By Database Trends & Applications (DBTA)
Passing the Torch: Building a Workforce for the Next Generation of Data Centers
News | Feb 4, 2026

The data‑center industry faces a looming talent shortage as up to half of its engineers could retire within the next two years, creating a critical experience gap. Rapid growth in AI‑driven workloads and ultra‑high‑density designs intensifies the need for skilled...

By Data Center Knowledge
Oracle Life Sciences AI Data Platform Unites Data and Agentic Intelligence to Accelerate Medical Breakthroughs
News | Feb 4, 2026

Oracle introduced the Oracle Life Sciences AI Data Platform, a generative AI‑enabled solution that consolidates diverse life‑science datasets and applies agentic AI to accelerate research, clinical trials, and commercialization. The platform offers out‑of‑the‑box AI agents and tools for label expansion,...

By Database Trends & Applications (DBTA)
Oracle Releases New Agentic Platform for the Banking and Finance Space
News | Feb 4, 2026

Oracle Financial Services unveiled an enterprise‑class, AI‑infused platform designed for banks and finance firms. The suite bundles pre‑built AI agents, design tools, and decisioning frameworks that deliver conversational, hyper‑personalized experiences across digital and branch channels. Oracle emphasizes a "human‑in‑the‑loop" model...

By Database Trends & Applications (DBTA)
5 Open Source Image Editing AI Models
Blog | Feb 4, 2026

A new KDnuggets article spotlights five open‑source AI models that enable text‑driven image editing, ranging from Black Forest Labs' FLUX.2 [klein] 9B to Alibaba Cloud's Qwen‑Image‑Edit‑2511 and newer adapters like FLUX.2 [dev] Turbo. The models deliver real‑time generation, multi‑reference editing, bilingual support,...

By KDnuggets
MariaDB Discusses Database Scale and Active:active and Active:passive Architectures
News | Feb 4, 2026

MariaDB’s chief product officer Vikas Mathur explained the trade‑offs between active‑passive and active‑active database architectures. Active‑passive relies on a primary server with idle standby replicas, offering low complexity but higher cost and unused capacity. Active‑active runs multiple nodes handling reads...

By Blocks & Files
Robins Tharakan: The "Skip Scan" You Already Had Before V18
News | Feb 4, 2026

PostgreSQL 18 introduces a native skip‑scan operator for multicolumn B‑tree indexes, allowing the planner to jump between distinct leading‑key values instead of scanning the entire index. Earlier releases could already perform a full‑index scan when the index was smaller than...

By Planet PostgreSQL
AI Anomaly Detection for Warehouse Security: Smarter Protection Beyond Cameras
News | Feb 4, 2026

AI anomaly detection is reshaping warehouse security by using machine‑learning models to learn normal movement, access and handling patterns and flagging deviations in real time. The technology fuses video, IoT sensors, RFID and WMS data, delivering precise alerts while cutting...

By Datafloq
Hitachi Vantara May Be up for Sale
News | Feb 3, 2026

Hitachi Ltd. is exploring a sale of its Hitachi Vantara data‑storage unit for up to ¥200 billion ($1.3 billion). The move follows a strategic pivot toward higher‑margin businesses such as energy transmission and digital SaaS services. Hitachi Vantara generated roughly ¥300 billion in...

By Blocks & Files
The Lakehouse Architecture | Multimodal Data, Delta Lake, and Data Engineering with R. Tyler Croy
Blog | Feb 3, 2026

The article introduces the lakehouse architecture as a unified platform that combines the scalability of data lakes with the performance of data warehouses. It highlights how Delta Lake brings ACID transaction support and schema enforcement to open‑source storage, enabling reliable...

By Confessions of a Data Guy
Samsung Shipping Fast and Small PCIe Gen5 Bus 4TB Mini-Gumstick Drive
News | Feb 3, 2026

Samsung has begun shipping a 4 TB PM9E1 M.2 2242 SSD that targets space‑constrained AI workstations such as Nvidia DGX Spark. The drive uses Samsung’s 236‑layer Gen 8 V‑NAND, dual‑sided DRAM, and a 5 nm Presto controller, delivering up to 2 million random read IOPS, 2.64 million...

By Blocks & Files
Western Digital Blows Hard Disk Drive Future Wide Open
News | Feb 3, 2026

Western Digital announced qualification of a 40 TB UltraSMR hard‑disk drive and unveiled a roadmap that could reach 100 TB using heat‑assisted magnetic recording (HAMR) by 2027. The company introduced High‑Bandwidth Drive (HBD) technology that initially doubles I/O throughput, with plans for...

By Blocks & Files
Gartner: AI and Datacentre Spending Ramps Up
News | Feb 3, 2026

Gartner projects global IT spending to rise 10.8% to $6.2 trillion by 2026, with datacentre equipment spending surging 32% and software up nearly 15%. AI investment will total $2.52 trillion, a 44% year‑over‑year jump, driven largely by hyperscale cloud providers expanding AI‑optimized...

By Computer Weekly – Latest IT news
Umair Shahid: PostgreSQL Materialized Views: When Caching Your Query Results Makes Sense (And When It Doesn’t)
News | Feb 3, 2026

PostgreSQL materialized views create a physical snapshot of expensive query results, allowing fast, indexed reads while shifting computation to scheduled refreshes. The article demonstrates turning a 28‑second revenue aggregation into a 180‑millisecond lookup by building, indexing, and refreshing a materialized...

By Planet PostgreSQL
How Data Analytics Is Transforming Performance Appraisal in Modern HRM
News | Feb 3, 2026

Data analytics is reshaping HR performance appraisals in India, moving them from memory‑based, annual snapshots to continuous, evidence‑driven processes. By aggregating goal achievement, peer feedback, and productivity metrics, companies now generate real‑time dashboards that capture an employee’s full performance journey....

By Datafloq
Designing Reliable Data Pipelines in Cloud-Native Environments
News | Feb 3, 2026

Designing reliable data pipelines in cloud‑native environments begins with clear expectations and ownership before any code is written. Teams must assume upstream volatility, embrace failure as a design premise, and build idempotent, replay‑friendly workflows that limit blast radius. Robust observability—beyond...

By Container Journal
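
The "idempotent, replay-friendly" property mentioned in the pipeline item above can be shown with a minimal sketch. Everything here is an assumption for illustration, not taken from the article: the events table, the event_id key, the load_batch helper, and the use of an in-memory SQLite database. The load step upserts on a stable key, so replaying the same batch after a failure leaves the table unchanged instead of duplicating rows.

```python
import sqlite3


def load_batch(conn, rows):
    """Upsert rows keyed by event_id so that replaying a batch is harmless."""
    conn.executemany(
        "INSERT INTO events (event_id, amount) VALUES (?, ?) "
        "ON CONFLICT(event_id) DO UPDATE SET amount = excluded.amount",
        rows,
    )
    conn.commit()


# Hypothetical target table with a primary key that identifies each event.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (event_id TEXT PRIMARY KEY, amount REAL)")

batch = [("e1", 10.0), ("e2", 25.5)]
load_batch(conn, batch)
load_batch(conn, batch)  # simulated replay of the same batch

count, total = conn.execute(
    "SELECT COUNT(*), SUM(amount) FROM events"
).fetchone()
```

A plain INSERT would either fail or double-count on the replay; keying the write on a stable identifier is one common way to make retries safe.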
Why Should the Construction Industry Use ERP Software?
News | Feb 3, 2026

Construction firms are facing larger projects, tighter schedules, and heightened client expectations, exposing the limits of spreadsheets and paper-based processes. Enterprise Resource Planning (ERP) software consolidates field data, finance, procurement, and project management into a single platform, enabling real‑time visibility...

By Datafloq
CIQ Creates Startup Program to Offer High-Performance AI Infrastructure for Early-Stage Innovators
News | Feb 3, 2026

CIQ announced the CIQ Startup Program, giving early‑stage, VC‑backed startups six months of free access to its high‑performance AI infrastructure and up to an 80% discount for the following two years. The offering includes Rocky Linux AI with pre‑integrated frameworks,...

By Database Trends & Applications (DBTA)
Databricks Lakebase Is Now Generally Available, Delivering Reliability, Performance, and Governance
News | Feb 3, 2026

Databricks announced the general availability of Lakebase on AWS, a serverless Postgres‑compatible operational database built on its lakehouse platform. The service separates compute from storage, offering automatic scaling, zero‑copy branching, and point‑in‑time recovery for production workloads. Lakebase integrates with Unity...

By Database Trends & Applications (DBTA)
#290: Always Be Learning
Podcast | Feb 3, 2026 | 1h 6m

In this episode, Tim Wilson, Val Kroll, and Spotify product manager/data scientist Mårten Schultzberg discuss the limits of focusing solely on win rates in experimentation and introduce a broader "learning rate" metric that captures wins, regressions (avoiding bad outcomes), and neutral...

By Digital Analytics Power Hour
Rocket Software Announces Intent to Acquire Vertica Analytics Database Solution From OpenText?
News | Feb 2, 2026

Rocket Software announced a definitive agreement to acquire the Vertica analytics database from OpenText. Vertica, known for high‑performance, cloud‑ready analytics and AI/ML capabilities, will join Rocket’s portfolio of modernization tools. The cash‑funded deal is slated to close in mid‑2026, pending...

By Database Trends & Applications (DBTA)
Commvault Pitches Geo Shield for Sovereign Data Protection
News | Feb 2, 2026

Commvault has launched Geo Shield, a sovereign‑data protection suite that lets enterprises dictate where data resides, who controls access, and who holds encryption keys. The offering spans four deployment models—from local hyperscaler SaaS to private sovereign clouds—supporting both BYOK and HYOK...

By Blocks & Files
From “This May Never Work” To WarpStream with Richie Artoul | Ep. 17
Podcast | Feb 2, 2026 | 30 min

In this episode, Tim Berglund chats with data infrastructure veteran Richie Artoul about his unconventional path—from running a LAN gaming café to building log storage at Datadog and now leading WarpStream at Confluent. Richie shares the technical and cultural challenges...

By Streaming Audio (Kafka / Confluent)
Storage News Ticker – February 2
News | Feb 2, 2026

The February 2 storage news ticker packed a series of vendor recognitions, product launches and strategic moves across data quality, protection, AI and memory markets. Ataccama earned the top Forrester strategy score, while Coldago’s 2025 map highlighted Cohesity, Commvault, Rubrik and...

By Blocks & Files
Apply Sports AI Tactics: Real‑Time, Personalized, Scenario‑Driven Business
Social | Feb 2, 2026

4 practical AI lessons from sport. Sport is one of the best stress tests for AI, because decisions are fast, public, and high stakes. Here are 4 AI lessons every executive can steal from elite sport 👇 4) Fan Engagement...

By Bernard Marr
Match AI Capabilities to Tasks, Not Just Benchmarks
Social | Feb 2, 2026

Choosing The Right AI In 2026 Is No Longer About Choosing The Right Model. In 2026, choosing the right #AI comes down to matching #capability profiles to specific tasks, risk levels and business outcomes, rather than chasing benchmark winners. This...

By Bernard Marr
Converting Floats to Strings Quickly
Blog | Feb 1, 2026

Converting binary floating‑point numbers to decimal strings is a core step in JSON, CSV, and logging pipelines. Recent research benchmarks modern algorithms—Dragonbox, Schubfach, and Ryū—showing they are roughly ten times faster than the original Dragon4 from 1990. The study finds...

By Daniel Lemire’s blog
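
To make the float-to-string item above concrete: algorithms of this family aim for the shortest decimal string that still parses back to exactly the same binary value. The sketch below only demonstrates that round-trip property using Python's built-in repr(); it does not implement Dragonbox, Schubfach, or Ryū, and makes no claim about which algorithm the runtime uses internally. The helper names are hypothetical.

```python
def shortest_round_trip(x: float) -> str:
    """Shortest decimal string that converts back to exactly x.

    Python's repr() for floats provides this guarantee.
    """
    return repr(x)


def fixed_17_digits(x: float) -> str:
    """Always-safe but longer alternative: 17 significant digits."""
    return format(x, ".17g")


samples = [0.1, 1.5, 1e22, 5e-324, 123456.789]

for value in samples:
    short = shortest_round_trip(value)
    long_form = fixed_17_digits(value)
    # Both strings parse back to the identical double.
    assert float(short) == value
    assert float(long_form) == value
    # The shortest form is never longer than the fixed 17-digit form.
    assert len(short) <= len(long_form)
```

For 0.1, the shortest form is "0.1" while the 17-digit form is "0.10000000000000001"; both identify the same double, but the shorter one is what JSON and CSV writers usually want.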
The ROI Paradox: Why Small-Scale AI Architecture Outperforms Large Corporate Programs
News | Jan 31, 2026

An empirical analysis of 200 B2B AI projects from 2022‑2025 reveals a “Budget Paradox”: deployments under $20,000 achieve a median ROI of 159.8%, while larger, monolithic programs frequently fail to break even within two years. The study, validated by Harvard...

By Datafloq
Data Engineering Career Path: From Circuits to Pipelines
Blog | Jan 30, 2026

The article maps a data‑engineering career trajectory that begins with hardware‑oriented roles and ends in building scalable data pipelines. It highlights how circuit‑design thinking translates into logical data modeling, while emphasizing the need to acquire SQL, Python, and cloud‑native tools....

By Confessions of a Data Guy
Apache Airflow vs Databricks Lakeflow | The Orchestration Battle
Blog | Jan 30, 2026

The article pits Apache Airflow, the open‑source workflow orchestrator, against Databricks Lakeflow, a newer Lakehouse‑native pipeline engine. It outlines core differences in architecture, integration depth with cloud data platforms, and pricing models. Airflow remains favored for heterogeneous environments, while Lakeflow...

By Confessions of a Data Guy
This One Polars Pattern Makes Code 10x Cleaner
Blog | Jan 30, 2026

The article highlights a single Polars pattern, chaining transformations with the .pipe() method, to streamline data‑frame code, cutting boilerplate and boosting readability up to tenfold. By chaining transformations in a lazy execution graph, developers avoid intermediate variables and gain clearer, more maintainable pipelines. The...

By Confessions of a Data Guy
It's Friday, Juan and Tim Rant with Data Day Texas Takeaways
Podcast | Jan 30, 2026 | 34 min

In this 34‑minute episode, Juan and Tim unwind over a beer to discuss recent developments in the data landscape and share their key takeaways from Data Day Texas. They cover topics such as the hype around AI versus real monetary...

By Catalog & Cocktails
Etleap Introduces a New Managed Pipeline Layer Created for Apache Iceberg
News | Jan 30, 2026

Etleap announced a managed pipeline platform purpose‑built for Apache Iceberg, addressing the missing orchestration layer in Iceberg deployments. The solution consolidates ingestion, transformation, orchestration, and table operations into a single service that runs inside the customer’s virtual private cloud. By...

By Database Trends & Applications (DBTA)
Forget Quantum? Why Photonic Data Centers Could Arrive First
News | Jan 30, 2026

Photonic computing is emerging as a realistic path to higher‑throughput, more energy‑efficient data centers, potentially arriving before general‑purpose quantum machines. By using photonic integrated circuits to perform linear‑algebra operations in the optical domain, these systems promise faster speeds, greater bandwidth,...

By Data Center Knowledge
The Data Center Surge Has a Hidden Source of Carbon Emissions
News | Jan 29, 2026

Data center construction will need 2 million metric tons of cement by 2030, potentially releasing 1.9 million tons of CO₂ if conventional concrete is used. Tech giants such as Microsoft, Amazon and Meta have signed low‑carbon concrete offtake agreements with startups like...

By Data Center Knowledge
AmberSemi Launches PowerTile to Cut Data Center Power Drain
News | Jan 29, 2026

California fabless chipmaker AmberSemi announced its new PowerTile, a quarter‑size, 1,000‑amp vertical power‑delivery module designed to sit behind AI processors in servers. The device claims to cut board‑level power distribution losses by 85%, potentially saving 225 MW of electricity per year...

By Data Center Knowledge
Why Your AI Chip Utilization Problem Is Really a Storage Problem
News | Jan 29, 2026

AI performance hinges not just on GPUs or LLMs but on the storage layer that feeds data to accelerators. A Meta‑Stanford white paper shows storage can consume up to one‑third of the power used for deep‑learning training. When storage cannot...

By Data Center Knowledge
I Stress-Tested Cube's New AI Analytics Agent
Blog | Jan 29, 2026

In this episode, host Joe Reis shares his hobby of stress‑testing AI analytics agents and introduces his own testing framework. He evaluates Cube's new AI analytics agent, highlighting how its semantic‑layer approach prevents common failures like hallucinated tables and incorrect...

By Joe Reis (Substack)