
Online Retailer Zalando Trims AWS Bill by Getting Manual with Flink Stream Filtering
Zalando, which generates roughly €3 billion in quarterly fashion sales, ran into soaring AWS costs and unstable Flink clusters due to the way Flink 1.20’s Table API handled chained joins. The joins caused state to balloon to over 240 GB per application, leading to CPU‑bound snapshot jobs and lengthy restarts. Engineers replaced the declarative join chain with a custom KeyedProcessFunction that manually filters streams, slashing the AWS bill by about 13% and cutting restart times to under five minutes. The team now awaits Flink 2.1’s native multi‑way join to simplify the solution.
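
As a rough illustration of why a single keyed operator can hold less state than a chain of joins, the following plain-Python sketch buffers events per key, drops events from streams it does not need, and clears the key's state as soon as every expected stream has reported. It does not use the Flink API and is not Zalando's code; the class name `ManualMultiWayJoin`, the method `process_element`, and the stream names in the usage note are hypothetical.

```python
from collections import defaultdict


class ManualMultiWayJoin:
    """Toy stand-in for a keyed operator (hypothetical, not the Flink API)."""

    def __init__(self, sources):
        # Names of the streams that must all report before a record is emitted.
        self.sources = frozenset(sources)
        # Per-key buffer: key -> {source name: latest value}.
        self.state = defaultdict(dict)

    def process_element(self, key, source, value):
        # Manual filter: ignore events from streams the join does not need.
        if source not in self.sources:
            return None
        pending = self.state[key]
        pending[source] = value
        # Emit once every expected stream has contributed, then free the state
        # so the buffer for this key does not keep growing.
        if self.sources.issubset(pending):
            del self.state[key]
            return {"key": key, **pending}
        return None
```

For example, with `ManualMultiWayJoin(["orders", "payments", "shipments"])`, the first two matching events for a key return `None`, and the third returns the merged record and removes that key from `state`.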

15 Top Data Catalog Software Tools to Consider Using in 2026
Enterprises are grappling with fragmented data landscapes, prompting a surge in data catalog adoption. Modern catalog tools not only inventory metadata but also embed AI, generative AI, and natural‑language interfaces to accelerate discovery and governance. The article lists 15 leading...
[ EVENT ] CDO Retail Exchange 2026
CDO Retail Exchange 2026 convenes 70 senior retail data, analytics and AI executives from brands such as Adidas, IKEA and Shiseido. The closed‑door forum is designed to move AI projects from pilot to profit, focusing on real‑time decisioning, margin‑boosting use...
Designing Delta Tables with Liquid Clustering: Real-World Patterns for Data Engineers
Liquid Clustering is a Delta‑Lake layout strategy that dynamically groups rows by query‑driven columns instead of static folder partitions. By continuously reorganizing files, it makes file‑level statistics more useful, enabling stronger data skipping and smaller scan footprints. Engineers enable it...
Shifting Focus From AI Models to Data Architecture as Real-Time Streaming Gains Market Momentum
Confluent executives warned that AI initiatives will falter without fresh, governed data in motion, shifting focus from model perfection to real‑time data architecture. They described a transition from batch‑based business intelligence to continuous, autonomous AI that requires millisecond‑latency streams and...

The Iceberg Ecosystem Today (Anders Swanson)
The data industry is rapidly converging on open standards, and dbt Labs is leading the charge by migrating its entire data stack to an Iceberg‑based lake that supports multiple compute engines. In a recent podcast, Anders Swanson outlined the current...
Snowflake Cortex Code CLI Adds Dbt and Apache Airflow Support for AI-Powered Data Pipelines
Snowflake has expanded its Cortex Code CLI, an AI‑driven coding agent, to support the open‑source data‑pipeline frameworks dbt and Apache Airflow. The extension leverages Anthropic’s Agent Skills to automate debugging, testing, and optimization of pipelines, and is offered through a new...
Bruce Momjian: New Presentation
Bruce Momjian delivered two recent talks to the PostgreSQL community: a deep‑dive on the write‑ahead log (WAL) at the Scale conference on March 7, 2026, and a candid assessment of PostgreSQL’s missing features at Prague PostgreSQL Developer Day on January 28, 2026....

Mexico's Grupo Financiero Banorte Partners with Hitachi Vantara for Data Center Migration
Grupo Financiero Banorte teamed with Hitachi Vantara to relocate its primary data center from Mexico City to Querétaro, moving 450 TB of information in under an hour. The migration introduced two mainframes, three Hitachi storage arrays, and the Virtual Storage Platform...

AWS Likely Behind Plans for $750m Data Center in Clinton, Mississippi
Amazon Web Services is poised to invest $750 million in a new data center on a 99‑acre site in Clinton, Mississippi, repurposing the former Milwaukee Tool facility. The city council approved a fee‑in‑lieu tax arrangement, though final approval from the Mississippi...

MinIO Integrates Delta Sharing Open Protocol for Seamless Access to Enterprise Data
MinIO has launched AIStor Table Sharing, embedding the Delta Sharing open protocol directly into its AIStor object store. The feature lets enterprises expose on‑premises data to Databricks in real time, eliminating the need for costly data replication. Built on Iceberg...
Upping the Profiling of Chemical Exposures in the Omics Sciences
Panome Bio, a multi‑omics contract research organization, unveiled an exposomics service platform that pairs untargeted Discovery Exposomics with targeted quantification of priority chemicals. The Discovery workflow leverages the MassID™ engine and a 32,000‑compound database to profile environmental exposures without prior...
How Gen AI Can Turn Reams of Text Into Actionable Insights
Generative AI now turns dense, unstructured corporate text—especially 10‑K Item 1 disclosures—into structured, decision‑ready metrics. Researchers fine‑tuned a GPT model on 3,500 labeled sentences and applied it to nearly 10 million sentences from 39,710 filings, creating a climate‑solution intensity score for 4,483...
Seqster Unveils 1-Click DataLake for Clinical Trials
Seqster has introduced 1‑Click DataLake, a real‑world data platform that aggregates anonymized electronic health‑record information from over 150 million patients and 200,000 clinicians across the United States. The solution delivers real‑time, longitudinal patient journeys to speed trial design, feasibility assessments, and...

Why Your Planning Team Needs a BI Layer
Rail planning teams often add new data feeds that become extra log‑ins and reconciliation chores, leaving planners to rebuild spreadsheets for every decision. The article argues that a dedicated business intelligence (BI) layer, placed atop existing asset stores, can turn...

Data Quality Automation Startup Validio Raises $30M
Validio, a Stockholm‑based data‑quality automation startup, secured $30 million in Series A funding, bringing its total capital to $47 million after an 800% ARR surge last year. The round was led by Plural with participation from Lakestar, J12 and several angels. Validio’s AI‑driven...

Strength in Numbers: Nonprofit Launches Consortium to Improve Public Health Data and Outcomes
The Association of State and Territorial Health Officials (ASTHO) announced a new public‑health data consortium, partnering with Veritas Data Research and HealthVerity to create a secure data exchange for state and territorial health agencies. The effort seeks to integrate real‑world...

Validio Secures $30M To Enhance Enterprise AI Data Quality
Validio announced a $30 million Series A round led by Plural, bringing total funding to $47 million after an 800% revenue surge. The Stockholm‑based startup offers an automated data‑quality platform that monitors billions of records, detects anomalies, and maps lineage in days rather...
Data Supply Chains: The New Framework for Managing AI, Analytics, and Real-Time Insights
Enterprises are shifting from static data warehouses to a data supply chain model that manages information as a continuous, end‑to‑end flow. The framework defines stages—ingestion, transformation, storage, distribution, and consumption—optimizing each to support AI, analytics, and real‑time insights. By integrating...

Orange Wholesale CEO: We're Not Looking to Sell Data Centers
Orange Wholesale CEO Michaël Trabbia told MWC that the French telco will not sell its roughly 75 data‑centre assets across Europe, Africa and the Middle East. Instead, Orange plans to monetize the facilities by expanding colocation services for enterprise customers,...

Architecting Data And AI In The Era Of Enterprise Intelligence: Meet Shylaja Nathan, Principal Analyst
Shylaja Nathan, former senior vice president of architecture at Fidelity, joins Forrester as a principal analyst focusing on enterprise data and AI strategy. Drawing on more than two decades of experience modernizing data platforms for major financial institutions, she stresses...

Constructing Successful Digital Twins with Informatica
Many digital‑twin projects stall after pilot phases because they lack a trusted data foundation. At a recent DBTA webinar, Informatica’s Christian Farra explained that integrating master data and contextual information is essential to turn raw sensor signals into actionable insights....

BlueBox Systems Launches New Data Analytics Platform ‘Tradelane Intelligence’
BlueBox Systems unveiled Tradelane Intelligence, a data‑analytics platform that merges AI‑validated airfreight data with premium ocean data from Vizion. The solution delivers advanced reporting tools for carrier comparison, demurrage alerts, document verification, and an Eco‑Routing module that projects CO₂ emissions....

Agentic Business Intelligence Startup WisdomAI Shifts From Insights to Action
WisdomAI, an AI‑native business intelligence startup, announced the launch of its Federated Agentic Intelligence platform, shifting its focus from passive insights to autonomous enterprise execution. The platform combines an Enterprise Context Layer, a Model Context Protocol client, and an Adaptive...

Orizon Aerostructures Deploys Flexxbotics to Power Data-Driven Autonomy at Scale in Aerospace Manufacturing
Orizon Aerostructures has deployed Flexxbotics’ autonomous manufacturing platform to create a data‑driven, closed‑loop control environment across its aerospace production lines. The integration links CNC machines, FANUC robots, and enterprise PLM systems, feeding multimodal sensor streams into industrial AI for real‑time...

Codelco and Microsoft Sign Mining AI & Analytics Collaboration Agreement
Codelco, the world’s largest copper producer, has signed an 18‑month collaboration framework with Microsoft to embed artificial intelligence, advanced analytics, automation and digital security into its mining operations. Building on a 27‑year partnership, the deal will evaluate joint initiatives, pilot...
How to Overcome the Biggest Data Challenges in Startups
Startups often sideline data initiatives because of tight budgets and scarce talent, leaving them vulnerable to security risks and missed insights. Financial constraints and the inability to hire full‑time data experts hinder the development of robust data governance. The article...
HD Hyundai Selects Siemens Xcelerator for Integrated Digital Shipbuilding Platform
HD Hyundai’s shipbuilding arm, HD KSOE, has selected Siemens’ Xcelerator platform to create an integrated digital shipbuilding environment. The platform will provide a unified data backbone linking CAD, PLM, digital manufacturing, automation and simulation, eliminating data discontinuities from design through...
From Localization to Leverage: How Data Control Will Define India’s Digital Sovereignty
India’s data‑localization push laid the groundwork for digital sovereignty, but the focus is shifting from where data resides to who controls it. In the AI‑driven hybrid cloud era, governance, transparency, and accountability become critical as data fuels models across multiple...

Nvidia Hiring for Orbital Data Center System Architect, as Space Compute Market Grows
Nvidia announced a senior hire for an orbital data‑center system architect, offering a base salary between $224,000 and $356,500. The role will design end‑to‑end AI compute solutions that operate from the GPU chip through satellite platforms and inter‑satellite links. The...

The Pitfalls of the 95% Confidence Paradigm for Banking Data Quality
Bank executives often cite a 95% confidence level as the benchmark for data quality, yet studies show most banks operate at only 80‑90% confidence, which can erode to 50% as data moves through multiple systems. The shortfall has tangible costs:...

Circana Launches Complete Why Analytics Platform for CPG Sales Performance
Circana unveiled Complete Why, an AI‑driven analytics platform for the consumer packaged goods sector, embedded in its Unify+ visualization suite. The tool models sales performance at store‑ and week‑level, evaluating up to 60 drivers such as price, promotions, distribution, competition,...

Unique Capabilities of Edge Computing in IoT
The article outlines how edge computing transforms IoT by enabling federated learning, real‑time analytics, and stronger data sovereignty. By processing data locally, edge nodes cut latency, lower bandwidth demands, and keep sensitive information compliant with regulations such as GDPR and...

Missouri Team Shows How to Rewrite Bits Stored in DNA
University of Missouri researchers have demonstrated a technique to rewrite data stored in DNA, overcoming the long‑standing limitation that DNA‑encoded information was immutable. The method pairs a compact electronic module with a nanopore sensor, translating electrical signals into binary bits....

Oracle AI Database 26ai: Practical Features
Oracle introduced the AI Database 26ai, a new release that adds automatic transaction rollback, real‑time SQL plan management, and built‑in AI vector search. The platform promises more stable performance under unpredictable workloads, faster data ingestion, and a self‑managed in‑memory cache...

Harnessing the Potential of CXL for Cloud-Native Databases
Cloud‑native databases are increasingly critical, yet RDMA‑based memory disaggregation suffers from page‑level inefficiencies, contention, and slow recovery. Compute Express Link (CXL) offers a high‑bandwidth, low‑latency, cache‑coherent interconnect that enables fine‑grained memory access and instant recovery. Controlled tests show CXL can...
Databricks Lakeflow Spark Declarative Pipelines Migration From Non‑Unity Catalog to Unity Catalog
Databricks is transitioning Delta Live Tables pipelines from legacy Hive Metastore workspaces to Unity Catalog‑enabled environments, revealing consistent code refactoring and governance adjustments. Teams must adopt three‑level catalog.schema.table references, replace input_file_name() calls with the built‑in _metadata struct, and migrate notebook...
Havas Bets on AI Veteran Sharona Sankar-King to Lead Proprietary Tech Push
Havas Media Network North America has hired Sharona Sankar‑King as chief data and product officer to steer its proprietary AI platform, Converged.AI, and the broader analytics practice. Sankar‑King arrives from Harte Hanks after more than 25 years in agencies, consultancies...

MARS Coalition Advocates for Data-Driven Road Safety in the US
The Modern Analytics for Roadway Safety (MARS) Coalition is urging Congress to modernise federal road safety programs by adopting AI, telematics and predictive analytics. These technologies allow agencies to spot crash risks before they materialise, moving from reactive to preventive...

Capgemini Joins OpenAI's Frontier Alliance to Scale Enterprise AI
Capgemini has joined OpenAI’s newly launched Frontier Alliance as a founding partner, creating a dedicated delivery function to scale AI agents for enterprises. The firm will deploy OpenAI‑certified professionals to tackle data readiness, integration, operating‑model design and governance challenges. Capgemini...

Find Duplicate Rows in SQL Server with a CTE
The article shows how to locate and list duplicate rows in a SQL Server table using a Common Table Expression (CTE) that groups all columns and counts occurrences. It presents two queries: one that returns only unique rows (order_count = 1) and...
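
A minimal sketch of the grouping technique, run against an in-memory SQLite database through Python's `sqlite3` module rather than SQL Server. The `orders` table, its columns, and the sample rows are hypothetical, and the `order_count > 1` filter is an assumption about what the duplicate query looks like; only the `order_count` name and the idea of grouping on all columns come from the summary above.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory table with a few repeated rows (sample data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (customer TEXT, item TEXT, amount INTEGER)")
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?, ?)",
        [
            ("ann", "shoes", 50),
            ("ann", "shoes", 50),
            ("bob", "hat", 20),
            ("cy", "belt", 15),
            ("cy", "belt", 15),
            ("cy", "belt", 15),
        ],
    )
    return conn


def find_duplicate_rows(conn):
    """Group on every column in a CTE and keep groups seen more than once."""
    query = """
        WITH order_counts AS (
            SELECT customer, item, amount, COUNT(*) AS order_count
            FROM orders
            GROUP BY customer, item, amount
        )
        SELECT customer, item, amount, order_count
        FROM order_counts
        WHERE order_count > 1
        ORDER BY customer, item, amount
    """
    return conn.execute(query).fetchall()
```

Changing the filter to `order_count = 1` would return only the rows that appear exactly once.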
Lætitia AVROT: Mostly Dead Is Slightly Alive: Killing Zombie Sessions
PostgreSQL administrators frequently encounter zombie sessions—backend processes that remain active or idle in transaction after a client vanishes. Linux’s default TCP keepalive interval of two hours lets these dead connections retain locks and block vacuum, inflating the process list. The...

South Korea, Australia, Portugal Top OECD Digital Government Index for 2025
The OECD’s 2025 Digital Government Index (DGI) places South Korea at the top with a 0.95 composite score, followed by Australia (0.88) and Portugal (0.86). Korea is the only nation to break the 0.9 threshold across all six assessment categories,...

Gartner Acknowledges Growth of Decision Intelligence Platforms with Inaugural Magic Quadrant
Gartner released its inaugural Magic Quadrant for Decision Intelligence Platforms, signaling a shift from data‑driven to decision‑centric strategies. The report highlights legacy players like FICO alongside newer pro‑code solutions such as Quantexa, and notes that generative AI integration remains early....
NDAP Overhaul in Works to Handle Surge in Big Data
India’s National Data and Analytics Platform (NDAP) will undergo a major revamp as NITI Aayog seeks a private‑sector partner to redesign, operate and hand over the system. The upgrade aims to handle vastly larger data volumes, add advanced analytics and...

The Rise of Location Intelligence: Turning Geographic Data Into Competitive Advantage
Location intelligence is moving from a background reporting tool to a strategic asset as businesses combine geographic data with operational metrics. By layering spatial context onto demand, infrastructure and behavior datasets, firms uncover patterns that traditional analytics miss. AI and...
How Data Analytics Is Transforming Modern Risk Assessment
Data analytics is reshaping risk assessment from a reactive practice into a predictive science across finance, insurance, healthcare, and transportation. Predictive modeling, machine‑learning, and real‑time dashboards now enable firms to forecast exposure, micro‑segment customers, and allocate capital with greater confidence....

Macquarie Partners with KINX & Gabia for South Korean Data Center Build-Out
Macquarie Asset Management’s Asia‑Pacific Infrastructure Fund 4 has teamed with South Korean IT firm Gabia and its network subsidiary KINX to launch a $420 million hyperscale data‑center venture. The joint‑venture will initially build a 40 MW facility in Ansan, Seoul, and aims to...

Google Files for Fifth Data Center at Midlothian Campus in Texas
Google, via shell company Sharka LLC, filed to build a fifth data center on its Midlothian, Texas campus. The $880 million project will span 288,000 sq ft and is slated for completion by February 24, 2027. This addition follows a $100 million fourth building announced in...

Tonic Structural vs Informatica: Which Is Better for Test Data Management?
The article compares Tonic Structural and Informatica for test data management, highlighting that both generate privacy‑safe data but differ in deployment models and feature focus. Informatica is shifting to a cloud‑first strategy after its Salesforce acquisition, limiting on‑premises options, while...