VentureBeat

Publication

AI, data, and automation coverage with enterprise finance implications

News • Apr 14, 2026

Databricks Tested a Stronger Model Against Its Multi-Step Agent on Hybrid Queries. The Stronger Model Still Lost by 21%.

Databricks’ research shows that its multi-step Supervisor Agent beats single-turn retrieval-augmented generation (RAG) on hybrid queries, with gains of more than 20% on the STaRK benchmark. On academic tasks the agent kept a 21% advantage even when the single-turn baseline ran on a stronger foundation model. The study attributes the gap to architectural limitations rather than model quality and stresses the need for agents that can query structured SQL warehouses and unstructured vector stores together. By decomposing queries, self-correcting, and relying on declarative source descriptions, the agent handles complex enterprise questions without custom code. The findings point to a shift toward tool-oriented AI agents for data-rich businesses.

43% of AI-Generated Code Changes Need Debugging in Production, Survey Finds

Agentic Coding at Enterprise Scale Demands Spec-Driven Development

Is Anthropic 'Nerfing' Claude? Users Increasingly Report Performance Degradation as Leaders Push Back

Designing the Agentic AI Enterprise for Measurable Performance

Five Signs Data Drift Is Already Undermining Your Security Models

Your Developers Are Already Running AI Locally: Why On-Device Inference Is the CISO’s New Blind Spot

AI Agent Credentials Live in the Same Box as Untrusted Code. Two New Architectures Show Where the Blast Radius Actually...

OpenAI Introduces ChatGPT Pro $100 Tier with 5X Usage Limits for Codex Compared to Plus

Mythos Autonomously Exploited Vulnerabilities that Survived 27 Years of Human Review. Security Teams Need a New Detection Playbook

New Framework Lets AI Agents Rewrite Their Own Skills without Retraining the Underlying Model

LLM-Referred Traffic Converts at 30-40% — and Most Enterprises Aren't Optimizing for It

Block Introduces Managerbot, a Proactive Square AI Agent and the Clearest Proof Point yet for Jack Dorsey’s AI Bet

Amazon S3 Files Gives AI Agents a Native File System Workspace, Ending the Object-File Split that Breaks Multi-Agent Pipelines

Anthropic Says Its Most Powerful AI Cyber Model Is Too Dangerous to Release Publicly — so It Built Project Glasswing

How MassMutual and Mass General Brigham Turned AI Pilot Sprawl Into Production Results

OCSF Explained: The Shared Data Language Security Teams Have Been Missing

Karpathy Shares 'LLM Knowledge Base' Architecture that Bypasses RAG with an Evolving Markdown Library Maintained by AI

Nvidia Launches Enterprise AI Agent Platform with Adobe, Salesforce, SAP Among 17 Adopters at GTC 2026

Meta's New Structured Prompting Technique Makes LLMs Significantly Better at Code Review — Boosting Accuracy to 93% in Some Cases

OpenClaw Has 500,000 Instances and No Enterprise Kill Switch

When Product Managers Ship Code: AI Just Broke the Software Org Chart

IndexCache, a New Sparse Attention Optimizer, Delivers 1.82x Faster Inference on Long-Context AI Models

Intercom's New Post-Trained Fin Apex 1.0 Beats GPT-5.4 and Claude Sonnet 4.6 at Customer Service Resolutions

Three Ways AI Is Learning to Understand the Physical World

Scale AI Launches Voice Showdown, the First Real-World Benchmark for Voice AI — and the Results Are Humbling for some...

Why Enterprises Are Replacing Generic AI with Tools that Know Their Users

New MiniMax M2.7 Proprietary AI Model Is 'Self-Evolving' and Can Perform 30-50% of Reinforcement Learning Research Workflow

Enterprise AI Agents Keep Operating From Different Versions of Reality — Microsoft Says Fabric IQ Is the Fix

Mistral AI Launches Forge to Help Companies Build Proprietary AI Models, Challenging Cloud Giants

Nvidia's Agentic AI Stack Is the First Major Platform to Ship with Security at Launch, but Governance Gaps Remain

The Accessibility Gap: Why Good Intentions Aren’t Enough for Digital Compliance

Rethinking AEO when Software Agents Navigate the Web on Behalf of Users

NanoClaw and Docker Partner to Make Sandboxes the Safest Way for Enterprises to Deploy AI Agents

Y Combinator-Backed Random Labs Launches Slate V1, Claiming the First 'Swarm-Native' Coding Agent

Agents Need Vector Search More than RAG Ever Did

Manufact Raises $6.3M as MCP Becomes the ‘USB-C for AI’ Powering ChatGPT and Claude Apps

RSAC's Innovation Sandbox Is Where Cybersecurity's Next Giants Are Born

Google Finds that AI Agents Learn to Cooperate when Trained Against Unpredictable Opponents

How to Make Your E-Commerce Product Visible to AI Agents? Use This New System Trusted by L’Oréal, Unilever, Mars &...

Enterprise Agentic AI Requires a Process Layer Most Companies Haven’t Built

LangChain's CEO Argues that Better Models Alone Won't Get Your AI Agent to Production

Karpathy’s March of Nines Shows Why 90% AI Reliability Isn’t Even Close to Enough

Anthropic Launches Claude Marketplace, Giving Enterprises Access to Claude-Powered Tools From Replit, GitLab, Harvey and More

New KV Cache Compaction Technique Cuts LLM Memory 50x without Accuracy Loss

Google PM Open-Sources Always On Memory Agent, Ditching Vector Databases for LLM-Driven Persistent Memory

Databricks Built a RAG Agent It Says Can Handle Every Kind of Enterprise Search

Pentagon Vendor Cutoff Exposes the AI Dependency Map Most Enterprises Never Built

Did Alibaba Just Kneecap Its Powerful Qwen AI Team? Key Figures Depart in Wake of Latest Open Source Release

EY Hit 4x Coding Productivity by Connecting AI Agents to Engineering Standards
