VentureBeat

AI/data/automation with enterprise finance implications

Enterprise AI Coding Grows Teeth: GPT‑5.2‑Codex Weaves Security Into Large-Scale Software Refactors
News | Dec 18, 2025

OpenAI released GPT‑5.2‑Codex, an agentic coding model built on GPT‑5.2 with enhanced cybersecurity capabilities. The model achieved top scores on Capture‑the‑Flag and CVE‑Bench (87%), along with a 72.7% pass rate on Cyber Range tests, demonstrating improved long‑horizon code understanding. Enterprise users can...

By VentureBeat
JP Morgan’s AI Adoption Hit 50% of Employees. The Secret? A Connectivity-First Architecture
News | Dec 17, 2025

JPMorgan Chase rolled out an internal LLM‑powered assistant suite two‑and‑a‑half years ago, and adoption surged to over 60% of its 250,000‑plus workforce without mandates. The rapid, organic uptake stemmed from a connectivity‑first architecture that embeds AI into existing data, CRM,...

By VentureBeat
AI Agents Fail 63% of the Time on Complex Tasks. Patronus AI Says Its New 'Living' Training Worlds Can Fix...
News | Dec 17, 2025

Patronus AI, backed by $20 million, unveiled Generative Simulators—a dynamic training architecture that creates adaptive, continuously evolving environments for AI agents. The platform aims to replace static benchmarks, which have struggled to predict real‑world performance, by generating on‑the‑fly challenges and feedback....

By VentureBeat
AI Is Moving to the Edge – and Network Security Needs to Catch Up
News | Dec 17, 2025

Small and mid‑size businesses are rapidly deploying AI at the edge, moving workloads from centralized data centers to retail stores, clinics, and remote sites. This shift delivers real‑time insights, resilience, and faster deployment but strains network bandwidth and security controls....

By VentureBeat
Zoom Says It Aced AI’s Hardest Exam. Critics Say It Copied Off Its Neighbors.
News | Dec 16, 2025

Zoom announced that its federated AI system scored 48.1% on Humanity's Last Exam, surpassing the score of Google’s Gemini 3 Pro. The approach routes queries to multiple external models and selects the best output via a proprietary Z‑scorer. Critics...

By VentureBeat
With 91% Accuracy, Open Source Hindsight Agentic Memory Provides 20/20 Vision for AI Agents Stuck on Failing RAG
News | Dec 16, 2025

Vectorize.io’s open‑source Hindsight memory architecture outperforms traditional retrieval‑augmented generation (RAG) by organizing agent knowledge into four specialized networks. The system achieved a record 91.4% accuracy on the LongMemEval benchmark, dramatically boosting multi‑session recall, temporal reasoning, and knowledge‑update scores. Hindsight’s TEMPR...

By VentureBeat
Echo Raises $35M to Secure the Enterprise Cloud's Base Layer — Container Images — with Autonomous AI Agents
News | Dec 16, 2025

Israeli startup Echo raised a $35 million Series A to overhaul container base images, the hidden OS layer of cloud workloads, with a secure‑by‑design approach. The company rebuilds images from source, hardens them to SLSA Level 3, and uses autonomous AI agents to monitor and...

By VentureBeat
Zencoder Drops Zenflow, a Free AI Orchestration Tool that Pits Claude Against OpenAI’s Models to Catch Coding Errors
News | Dec 16, 2025

Zencoder unveiled Zenflow, a free desktop AI orchestration tool that coordinates multiple AI agents—such as Claude and OpenAI models—to plan, implement, test, and review code in structured workflows. The platform replaces ad‑hoc prompting with repeatable sequences, spec‑driven development, multi‑agent verification,...

By VentureBeat
Korean AI Startup Motif Reveals 4 Big Lessons for Training Enterprise LLMs
News | Dec 15, 2025

Korean startup Motif Technologies released Motif-2-12.7B-Reasoning, an open‑weight model that outperforms many larger U.S. and European counterparts on benchmark tests. The company also published a reproducible training recipe that isolates the real drivers of reasoning performance in enterprise LLMs. Four...

By VentureBeat
Bolmo’s Architecture Unlocks Efficient Byte‑level LM Training without Sacrificing Quality
News | Dec 15, 2025

The Allen Institute for AI unveiled Bolmo, a family of open‑source byte‑level language models (7B and 1B) built by "bytefying" its Olmo 3 architecture. By operating directly on raw UTF‑8 bytes, Bolmo eliminates the need for tokenizers, improving robustness to misspellings,...

By VentureBeat
Why Agentic AI Needs a New Category of Customer Data
News | Dec 15, 2025

Twilio argues that the data infrastructure behind most enterprises was built for batch‑oriented marketing, not the millisecond‑level, context‑rich interactions demanded by agentic AI. Conversational AI needs a new category of customer data—real‑time conversational memory that captures tone, intent, and sentiment...

By VentureBeat
Ai2's New Olmo 3.1 Extends Reinforcement Learning Training for Stronger Reasoning Benchmarks
News | Dec 12, 2025

The Allen Institute for AI unveiled Olmo 3.1, an upgraded 32‑billion‑parameter family that extends the original Olmo 3 models through an additional 21‑day reinforcement‑learning run on 224 GPUs. The Think 32B variant shows 5‑plus point gains on the AIME math benchmark and strong...

By VentureBeat
Marble Enters the Race to Bring AI to Tax Work, Armed with $9 Million and a Free Research Tool
News | Dec 11, 2025

Marble, a startup developing AI agents for tax professionals, announced a $9 million seed round led by Susa Ventures. The funding will support its free AI‑powered tax research tool and future agents that can analyze compliance scenarios and automate parts of...

By VentureBeat
Nous Research Just Released Nomos 1, an Open-Source AI that Ranks Second on the Notoriously Brutal Putnam Math Exam
News | Dec 11, 2025

Nous Research released Nomos 1, an open‑source AI mathematician that scored 87 out of 120 on the 2024 Putnam Competition, which would place it second among 3,988 participants. The system achieves this performance with a 30‑billion‑parameter mixture‑of‑experts model, activating only...

By VentureBeat
Cohere’s Rerank 4 Quadruples the Context Window over 3.5 to Cut Agent Errors and Boost Enterprise Search Accuracy
News | Dec 11, 2025

Cohere has released Rerank 4, expanding its context window to 32K tokens—four times larger than Rerank 3.5—and promising higher ranking accuracy for enterprise search. The model arrives in Fast and Pro variants, targeting speed‑critical and deep‑reasoning workloads respectively. Rerank 4 also introduces self‑learning...

By VentureBeat
The 70% Factuality Ceiling: Why Google’s New ‘FACTS’ Benchmark Is a Wake-Up Call for Enterprise AI
News | Dec 10, 2025

Google’s FACTS Benchmark Suite, released with Kaggle, evaluates large language models on factuality across four real‑world scenarios—parametric knowledge, search‑augmented retrieval, multimodal interpretation, and text grounding. The initial leaderboard shows Gemini 3 Pro topping the chart with a 68.8% overall score, while GPT‑5...

By VentureBeat
The AI that Scored 95% — Until Consultants Learned It Was AI
News | Dec 10, 2025

SAP secretly tested its AI co‑pilot Joule with five consultant teams, telling four of them that the answers came from junior interns. Those teams rated the output about 95% accurate, while the fifth team, told the answers were AI‑generated, rejected...

By VentureBeat
Quilter's AI Just Designed an 843‑part Linux Computer that Booted on the First Try. Hardware Will Never Be the Same.
News | Dec 10, 2025

Quilter, a San Francisco AI startup, used a physics‑driven system to design a two‑board Linux computer with 843 components in just one week, cutting human effort from an estimated 428 hours to 38.5. The AI generated a layout with 98%...

By VentureBeat
Mistral Launches Powerful Devstral 2 Coding Model Including Open Source, Laptop-Friendly Version
News | Dec 9, 2025

Mistral AI unveiled Devstral 2, a 123‑billion‑parameter coding model with a 256K‑token context window, alongside a 24‑billion‑parameter Devstral Small 2 that runs on a single laptop. Both models are open‑weight and available free for a limited time via API and...

By VentureBeat
Databricks' OfficeQA Uncovers Disconnect: AI Agents Ace Abstract Tests but Stall at 45% on Enterprise Docs
News | Dec 9, 2025

Databricks introduced OfficeQA, a benchmark that tests AI agents on document‑heavy enterprise tasks using 89,000 pages of U.S. Treasury Bulletins. Tests show top agents such as Claude Opus 4.5 and GPT‑5.1 achieve only 37‑44% accuracy on raw PDFs, rising to 68%...

By VentureBeat
Brand-Context AI: The Missing Requirement for Marketing AI
News | Dec 9, 2025

Marketing teams are adopting generative AI, but outputs often miss brand, audience, and strategic alignment because models lack contextual intelligence. BlueOcean argues that the missing ingredient is structured brand‑context, which unifies vertical data streams into a horizontal view for decision‑quality...

By VentureBeat
Z.ai Debuts Open Source GLM-4.6V, a Native Tool-Calling Vision Model for Multimodal Reasoning
News | Dec 9, 2025

Zhipu AI’s Z.ai has launched the GLM-4.6V series, an open‑source vision‑language model family featuring a 106‑billion‑parameter flagship and a 9‑billion‑parameter Flash variant. Both models introduce native multimodal function calling, allowing visual inputs to be passed directly to tools such as...

By VentureBeat
Booking.com’s Agent Strategy: Disciplined, Modular and Already Delivering 2× Accuracy
News | Dec 8, 2025

Booking.com has turned its homegrown conversational recommendation system into a disciplined, modular AI agent stack that blends small travel‑specific models with larger LLMs and in‑house evaluations. This hybrid approach has doubled accuracy on key retrieval, ranking and customer‑interaction tasks while...

By VentureBeat
Design in the Age of AI: How Small Businesses Are Building Big Brands Faster
News | Dec 8, 2025

Generative AI has turned design from a late‑stage expense into a front‑line capability for small businesses. Since 2022, searches for AI‑powered naming, logo and website generators have surged 700‑1,600%, indicating rapid adoption. Unified platforms like Design.com now deliver naming, logo...

By VentureBeat
Why AI Coding Agents Aren’t Production-Ready: Brittle Context Windows, Broken Refactors, Missing Operational Awareness
News | Dec 7, 2025

AI coding agents can generate snippets quickly, but they falter in enterprise settings due to limited context windows, service limits, and lack of hardware awareness. Indexing caps at 2,500 files and 500 KB per file leave large monorepos partially invisible, forcing...

By VentureBeat
Inside NetSuite’s Next Act: Evan Goldberg on the Future of AI-Powered Business Systems
News | Dec 4, 2025

Oracle NetSuite unveiled NetSuite Next at SuiteWorld 2025, branding it as the platform’s biggest product evolution. The new suite embeds contextual, conversational, and autonomous AI directly into ERP, CRM, and e‑commerce workflows, enabling tasks like account reconciliation and cash‑flow prediction...

By VentureBeat
Nvidia's New AI Framework Trains an 8B Model to Manage Tools Like a Pro
News | Dec 3, 2025

Nvidia and the University of Hong Kong unveiled Orchestrator, an 8‑billion‑parameter model that coordinates multiple tools and specialist LLMs to solve complex tasks. Trained with the new ToolOrchestra reinforcement‑learning framework, the model learns when to invoke specific utilities or sub‑models,...

By VentureBeat
Gemini 3 Pro Scores 69% Trust in Blinded Testing up From 16% for Gemini 2.5: The Case for Evaluating AI...
News | Dec 3, 2025

Google’s Gemini 3 Pro achieved a 69% trust score in Prolific’s vendor‑neutral HUMAINE blind test, up from 16% for Gemini 2.5. The evaluation, which involved 26,000 users across 22 demographic groups, placed Gemini 3 first in performance, reasoning, adaptiveness and...

By VentureBeat
Tariff Turbulence Exposes Costly Blind Spots in Supply Chains and AI
News | Dec 3, 2025

Tariff volatility forces companies to react within 48 hours, prompting a shift toward process intelligence (PI) and AI‑driven supply‑chain orchestration. At Celosphere 2025, Vinmar, Florida Crystals and ASOS demonstrated how Celonis’ PI platform creates real‑time digital twins that cut expedites,...

By VentureBeat
Workspace Studio Aims to Solve the Real Agent Problem: Getting Employees to Use Them
News | Dec 3, 2025

Google has made Workspace Studio generally available, letting employees design, manage, and share AI agents directly within Google Workspace. The platform, powered by Gemini 3, targets business teams rather than developers and offers templates that automate routine tasks across Docs, Sheets,...

By VentureBeat
AWS Claims 90% Vector Cost Savings with S3 Vectors GA, Calls It 'Complementary' - Analysts Split on What It Means...
News | Dec 3, 2025

Amazon Web Services announced the general availability of Amazon S3 Vectors, a native vector storage and similarity‑search capability built directly into its S3 object storage service. The GA release expands capacity to 2 billion vectors per index and up to 20 trillion...

By VentureBeat
Ascentra Labs Raises $2 Million to Help Consultants Use AI Instead of All-Night Excel Marathons
News | Dec 2, 2025

London‑based Ascentra Labs closed a $2 million seed round led by Berlin VC NAP to automate survey analysis in private‑equity due diligence. The platform ingests raw survey data and generates traceable Excel workbooks, promising 60‑80% time savings for consulting teams. Early...

By VentureBeat
New Training Method Boosts AI Multimodal Reasoning with Smaller, Smarter Datasets
News | Dec 2, 2025

Researchers at MiroMind AI and partner universities introduced OpenMMReasoner, a two‑stage training framework that first fine‑tunes a base vision‑language model on a curated, high‑quality dataset and then applies reinforcement learning to sharpen multimodal reasoning. The approach achieves state‑of‑the‑art performance on...

By VentureBeat
AWS Goes Beyond Prompt-Level Safety with Automated Reasoning in AgentCore
News | Dec 2, 2025

At re:Invent, AWS announced major upgrades to its Bedrock AgentCore platform, adding policy enforcement, episodic memory, and evaluation tools powered by automated reasoning. The new policy layer sits between agents and external tools, allowing enterprises to enforce guardrails after an...

By VentureBeat
With Nova Forge, AWS Gives Companies a Path to Build Foundation-Class Models without GPUs
News | Dec 2, 2025

AWS unveiled Nova Forge, a new service that lets enterprises fine‑tune its Nova 2 foundation models with proprietary data without needing costly GPU clusters. The offering creates custom “Novellas” that retain core reasoning abilities while gaining domain‑specific knowledge, and these...

By VentureBeat
Arcee Aims to Reboot U.S. Open Source AI with New Trinity Models Released Under Apache 2.0
News | Dec 2, 2025

Arcee AI unveiled Trinity Mini (26B parameters) and Trinity Nano (6B parameters) as the first U.S.-trained open‑weight Mixture‑of‑Experts models released under an Apache 2.0 license. The models are available for free download on Hugging Face and can be accessed via a...

By VentureBeat
DeepSeek Just Dropped Two Insanely Powerful AI Models that Rival GPT-5 and They're Totally Free
News | Dec 1, 2025

DeepSeek, a Chinese AI startup, unveiled two 685‑billion‑parameter models—DeepSeek‑V3.2 and the high‑performance DeepSeek‑V3.2‑Speciale—under an MIT open‑source license. The models employ a novel Sparse Attention architecture that halves inference costs for long‑context tasks, supporting 128,000‑token windows at roughly $0.70 per million...

By VentureBeat
AI Models Block 87% of Single Attacks, but Just 8% when Attackers Persist
News | Dec 1, 2025

Cisco’s AI Threat Research team discovered that open‑weight large language models block 87% of single‑turn malicious prompts but see attack success soar to 92% when adversaries persist across multiple turns. The study evaluated eight popular models and found multi‑turn success...

By VentureBeat
OpenAGI Emerges From Stealth with an AI Agent that It Claims Crushes OpenAI and Anthropic
News | Dec 1, 2025

OpenAGI, a stealth startup founded by MIT researcher Zengyi Qin, unveiled Lux, an AI foundation model that autonomously controls computers. Lux achieved an 83.6% success rate on the Online‑Mind2Web benchmark, outpacing OpenAI’s Operator (61.3%) and Anthropic’s Claude Computer Use (56.3%)....

By VentureBeat
Capture the Full Value of Your Technology with Financial Intelligence
News | Dec 1, 2025

Apptio’s Technology Business Management (TBM) platform adds a Financial Intelligence Layer that unifies data from ERP, cloud, ITSM, HR and other systems. By normalizing and enriching these inputs, the solution enables FinOps, IT financial management and strategic portfolio management teams...

By VentureBeat
Agent Coordination Is the Missing Piece in AI Commerce — New AWS and Visa Blueprints Target the Gap
News | Dec 1, 2025

AWS has added Visa’s Intelligent Commerce platform to its Marketplace, pairing Visa’s Trusted Agent Protocol tools with Amazon Bedrock and AgentCore. The joint effort includes publicly available blueprints that streamline multi‑agent workflows such as travel booking and B2B payment reconciliation....

By VentureBeat
Ontology Is the Real Guardrail: How to Stop AI Agents From Misunderstanding Your Business
News | Nov 30, 2025

Enterprises pour billions into AI agents, yet real‑world deployments falter because agents lack true understanding of siloed business data. The article proposes an ontology‑based single source of truth—defining concepts, hierarchies, and relationships—to bridge this gap, enabling agents to interpret context,...

By VentureBeat
What to Be Thankful for in AI in 2025
News | Nov 28, 2025

2025 marks a turning point for generative AI as the ecosystem diversifies beyond a handful of cloud‑only giants. OpenAI launched GPT‑5, GPT‑5.1, Atlas, Sora 2 and open‑weight models, while enterprises report over‑50% ticket‑resolution gains using the new models. China’s open‑source wave,...

By VentureBeat
Prompt Security's Itamar Golan on Why Generative AI Security Requires Building a Category, Not a Feature
News | Nov 27, 2025

Prompt Security, founded by Itamar Golan in August 2023, built a full‑stack GenAI security platform and was acquired by SentinelOne for an estimated $250 million in August 2025. The company pioneered runtime protection, shadow‑AI discovery, and real‑time data sanitization, moving beyond...

By VentureBeat
Black Forest Labs Launches Flux.2 AI Image Models to Challenge Nano Banana Pro and Midjourney
News | Nov 26, 2025

Black Forest Labs unveiled FLUX.2, a new family of image‑generation and editing models that includes four variants—Pro, Flex, Dev and the upcoming Klein—plus an open‑source VAE released under Apache 2.0. The models add multi‑reference conditioning, higher‑fidelity 4‑megapixel outputs, and markedly better...

By VentureBeat
What Enterprises Should Know About The White House's New AI 'Manhattan Project' The Genesis Mission
News | Nov 25, 2025

President Trump announced the Genesis Mission, an AI‑focused "Manhattan Project" that directs the Department of Energy to create a closed‑loop AI experimentation platform linking the nation’s 17 national labs, federal supercomputers and decades of government scientific data. The initiative aims...

By VentureBeat
OpenAI Now Lets Enterprises Choose Where to Host Their Data
News | Nov 25, 2025

OpenAI has broadened its data residency options for ChatGPT Enterprise, Edu, and approved API customers, now allowing data at rest to be stored and processed in ten regions—including the EU, UK, US, Canada, Japan, South Korea, Singapore, India, Australia, and...

By VentureBeat
DeepSeek Injects 50% More Security Bugs when Prompted with Chinese Political Triggers
News | Nov 24, 2025

CrowdStrike researchers found that the Chinese AI model DeepSeek‑R1 injects up to 50% more insecure code when prompts contain politically sensitive terms such as "Falun Gong," "Uyghurs" or "Tibet." The vulnerability stems from an embedded censorship mechanism in the model’s...

By VentureBeat
Microsoft’s Fara-7B Is a Computer-Use AI Agent that Rivals GPT-4o and Works Directly on Your PC
News | Nov 24, 2025

Microsoft unveiled Fara-7B, a 7‑billion‑parameter computer‑use AI agent that runs locally on a PC and interacts with web interfaces via pixel‑level visual input. In benchmark tests on WebVoyager, it achieved a 73.5% task‑success rate, surpassing larger models such as GPT‑4o...

By VentureBeat