
MetaClaw is a continual‑learning framework for large language model agents that combines instant, text‑based skill injection with scheduled weight updates, eliminating service downtime. The fast loop distills concise behavioral rules from user failures and injects them directly into the prompt; the slower loop runs reinforcement learning on post‑skill query data during idle periods to update the model’s core weights. Reported benchmarks include up to 40.6% accuracy, an 8.25‑fold rise in task completion, and an 18% gain in robustness for autonomous research pipelines.
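The two‑loop pattern can be sketched in a few lines. Everything below — the class name, methods, and prompt layout — is an illustrative assumption, not the paper's actual API; the point is only that fast‑loop skills are plain text (active on the next query, no redeploy), while slow‑loop data is queued for offline RL.

```python
class TwoLoopAgent:
    """Hypothetical sketch: fast loop injects text rules, slow loop queues RL data."""

    def __init__(self, system_prompt: str):
        self.system_prompt = system_prompt
        self.rules: list[str] = []           # fast loop: instantly active skills
        self.replay_buffer: list[dict] = []  # slow loop: consumed during idle time

    def record_failure(self, failure: str, rule: str) -> None:
        # Fast loop: distill the failure into a concise behavioral rule
        # that takes effect on the very next query.
        self.rules.append(rule)
        self.replay_buffer.append({"failure": failure, "rule": rule})

    def build_prompt(self, query: str) -> str:
        # Skill injection is just prompt assembly, so there is no downtime.
        rule_block = "\n".join(f"- {r}" for r in self.rules)
        return (f"{self.system_prompt}\n"
                f"Rules learned from past failures:\n{rule_block}\n"
                f"User: {query}")

    def idle_update(self) -> list[dict]:
        # Slow loop: hand post-skill interaction data to an offline RL trainer
        # (stubbed here) and drain the buffer.
        batch, self.replay_buffer = self.replay_buffer, []
        return batch  # a real system would run e.g. a policy-gradient update here
```

A caller would record each failure as it happens, then trigger `idle_update()` on a schedule so weight updates never interrupt serving.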

The paper introduces ReMix, a reinforcement‑learning‑based routing strategy for Mixture‑of‑LoRAs that eliminates the common “routing‑weight collapse,” in which a single adapter comes to dominate. By assigning constant, equal weights to all activated adapters and training the router as a policy, ReMix...
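The equal‑weight idea can be illustrated with a minimal sketch, which is my own assumption of the mechanism rather than the paper's code: the router is a stochastic policy that *selects* k adapters, and every selected adapter gets the same weight 1/k, so the mixture weights themselves carry no gradient pressure toward a single dominant adapter.

```python
import numpy as np

rng = np.random.default_rng(0)

def route_equal_weight(router_logits: np.ndarray, k: int) -> np.ndarray:
    """Sample k adapters from the router policy; return equal mixture weights."""
    probs = np.exp(router_logits - router_logits.max())
    probs /= probs.sum()
    # The router acts as a policy: it decides WHICH adapters are active...
    active = rng.choice(len(probs), size=k, replace=False, p=probs)
    # ...but every activated adapter gets the same constant weight 1/k.
    weights = np.zeros_like(probs)
    weights[active] = 1.0 / k
    return weights

def mixture_output(x: np.ndarray, lora_deltas: list, weights: np.ndarray) -> np.ndarray:
    # Each active adapter contributes delta_i @ x; the mixture is an equal-weight sum.
    return sum(w * d @ x for w, d in zip(weights, lora_deltas) if w > 0)
```

Because gradients cannot reach the router through the (constant) weights, the router would have to be trained with a policy‑gradient objective on the selection itself — consistent with the summary's "training the router as a policy."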

The paper introduces CONSTORY‑CHECKER, an automated pipeline, and ConStory‑Bench, a 2,000‑prompt benchmark, to evaluate narrative consistency in long‑form story generation by LLMs. The four‑stage system extracts suspect spans, pairs conflicting statements, generates evidence chains, and produces anchored reports. Evaluation across...
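The four‑stage flow can be sketched as a toy pipeline. This is purely illustrative — the real CONSTORY‑CHECKER stages presumably use LLMs, not the regex heuristics below — but it shows the shape of the output: conflicting statement pairs anchored to character offsets in the story.

```python
import re
from dataclasses import dataclass

@dataclass
class Span:
    start: int
    end: int
    text: str

def extract_suspect_spans(story: str, entity: str) -> list[Span]:
    # Stage 1 (toy version): pull every sentence mentioning the tracked entity.
    return [Span(m.start(), m.end(), m.group().strip())
            for m in re.finditer(r"[^.]+\.", story) if entity in m.group()]

def pair_conflicts(spans: list[Span], attr_values: list[str]) -> list[tuple[Span, Span]]:
    # Stage 2 (toy version): two spans conflict if they assert different
    # values of the same attribute (e.g. "green eyes" vs "blue eyes").
    pairs = []
    for i, a in enumerate(spans):
        for b in spans[i + 1:]:
            va = {v for v in attr_values if v in a.text}
            vb = {v for v in attr_values if v in b.text}
            if va and vb and va != vb:
                pairs.append((a, b))
    return pairs

def anchored_report(pairs: list[tuple[Span, Span]]) -> list[dict]:
    # Stages 3-4 (collapsed): emit evidence chains anchored to char offsets.
    return [{"claim": a.text,
             "contradiction": b.text,
             "anchors": [(a.start, a.end), (b.start, b.end)]}
            for a, b in pairs]
```

Anchoring each finding to source offsets is what makes the report verifiable: a reader can jump straight to the two spans and confirm the contradiction.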