Agentic AI

I share my thoughts on anything related to Agentic AI and Agentic AI Security topics.

Microsoft's Approach to LLM: MAI-Thinking-1
Blog | Jun 3, 2026

Microsoft’s AI team released a technical report on MAI‑Thinking‑1, a reasoning model that achieved 52.8% on SWE‑Bench Pro and 97% on AIME 2025, matching frontier‑size models. The model was trained on 30 trillion human‑written tokens without any external model distillation, emphasizing learning...

By Agentic AI
CLAUDE CODE ORCHESTRATION
Blog | May 29, 2026

Claude Code’s new orchestration model introduces three primitives—Dynamic Workflows, Subagents, and Agent Teams—to let developers match execution style to problem complexity. Dynamic Workflows auto‑generate JavaScript scripts that can spawn up to 1,000 parallel agents, run adversarial verification, and return only...

By Agentic AI
Self-Evolving Agent Skills: SkillOpt
Blog | May 28, 2026

SkillOpt, a new framework from Microsoft and leading Chinese universities, treats agent instructions as trainable external state, applying deep‑learning‑style optimization to text. By imposing a bounded edit budget—called a textual learning rate—the system performs controlled “textual gradient descent,” preventing erratic...

By Agentic AI
An Implementation Checklist to Claude Code in Large Codebases
Blog | May 20, 2026

The post translates Anthropic’s Claude Code best‑practice guide into a step‑by‑step checklist for large codebases. It emphasizes building a harness—root and subdirectory CLAUDE.md files, targeted hooks, and on‑demand skills—to give Claude the right context without maintaining an index. Each phase...

By Agentic AI
GCIS 2026: My Agentic-AI Lens on a Prestigious, Invitation-Only Cyber Summit Near Washington
Blog | May 12, 2026

The Global Cyber Innovation Summit (GCIS) in February 2026 gathered senior cyber and AI leaders to discuss the emerging agentic‑AI era. Attendees warned that trillions of autonomous agents will soon overwhelm existing infrastructure, while weaponized models can turn a disclosed...

By Agentic AI
Claude Agents Can Now Dream: How AI Engineers Should Use Anthropic’s New Agent Features Without Creating New Attack Paths
Blog | May 8, 2026

Anthropic unveiled three new capabilities for Claude Managed Agents—Dreaming, Outcomes, and Multi‑agent Orchestration—shifting the platform from single‑threaded assistants to a managed, inspectable agent operating system. Dreaming lets agents synthesize lessons from up to 100 past sessions into a separate memory...

By Agentic AI
What a Secure Harness for Agentic AI Actually Is
Blog | May 6, 2026

Enterprise teams are conflating terms like guardrails, gateways, and governance, leaving a critical gap in securing autonomous AI agents. A "secure harness" is defined as an engineered control layer that provides visibility, policy enforcement, and real‑time intervention across an agent’s...

By Agentic AI
Why Your Agentic AI Pentester Is Probably Just a Fancy Scanner
Blog | May 4, 2026

Ridge Security benchmarked three agentic AI pentesting platforms—RidgeGen, Shannon and Strix—against the OWASP Juice Shop. Using the same Gemini 3 Flash model, RidgeGen produced 55 fully verified findings with zero hallucinations, while Shannon generated 27 findings of which 63% were...

By Agentic AI
World Models, Architectures, and the Next Phase of AI
Blog | May 3, 2026

At the Spring School AI For Impact, Yann LeCun and Eric Xing sparred over the core design of world models: LeCun’s Joint Embedding Predictive Architecture (JEPA) predicts future latent representations without reconstruction, while Xing’s Generative Latent Prediction (GLP) re‑attaches a decoder to...

By Agentic AI
The Computational Wall: Why the Defense Trilemma and the NP-Hardness of Reward Hacking Detection Demand a New Security Posture for...
Blog | May 2, 2026

At a National Academies panel, researchers presented two converging impossibility results: the Defense Trilemma shows that wrapper defenses around LLMs cannot simultaneously guarantee continuity, utility preservation, and complete safety, and recent proofs demonstrate that detecting reward‑hacking is NP‑hard. Both findings...

By Agentic AI
Chapter 15: Structured Output and Schema-Constrained Generation (Claude Code Vs. Hermes Agent)
Blog | May 2, 2026

The post explains structured output—a technique that forces LLMs to return validated JSON instead of free text—and compares two implementations: Claude Code’s `jsonSchema` parameter and Hermes Agent’s tool‑forcing approach. Claude Code wraps the schema as a synthetic tool, tracks retries...

By Agentic AI
Chapter 14: Model Routing and Provider Abstraction (Claude Code Vs. Hermes Agent)
Blog | May 1, 2026

Model routing bridges the gap between an agent’s need for a language model and the actual API call, handling provider choice, format translation, context limits, and fallback. Claude Code implements this as a compile‑time TypeScript layer with static rules and...

By Agentic AI
Chapter 13: MCP Integration — Connecting Agents to the World (Claude Code Vs. Hermes Agent)
Blog | Apr 30, 2026

The Model Context Protocol (MCP) is emerging as a universal standard that lets AI agents invoke external tools without custom code. Claude Code embeds an MCP client directly into its TypeScript QueryEngine, using double‑underscore namespacing and dynamic discovery. Hermes solves...

By Agentic AI
Chapter 12: The Skill System Pattern (Claude Code Vs. Hermes Agent)
Blog | Apr 29, 2026

The chapter introduces the Skill System Pattern, contrasting Claude Code’s minimal CLAUDE.md approach with Hermes Agent’s full‑featured skill subsystem. Claude Code simply discovers static markdown files at session start, lacking versioning, creation, security scanning, or a marketplace. Hermes, by contrast,...

By Agentic AI