Sung Kim

Creator

0 followers

Semiconductor technologist/investor commenting on ASML and EUV deployment; brings practitioner insight on Europe’s chip equipment leader and related equity narratives.

Social•Jun 16, 2026

MLPerf v6.0 Adds MoE Benchmarks, Showcasing 671B DeepSeek

MLCommons Releases MLPerf Training v6.0 Results Two Mixture-of-Experts (MoE) benchmarks were added, reflecting where the AI training frontier actually is. 📍 DeepSeek V3 — 671B params (largest in MLPerf history) 📍 GPT-OSS 20B — 21B params https://mlcommons.org/2026/06/mlperf-training-v6-0-results/

By Sung Kim

Social•Jun 13, 2026

Zhipu AI Launches GLM‑5.2 for All Coding Plan Users

Z AI or Zhipu AI released GLM-5.2 (Not open-weight yet?) It is available to all GLM Coding Plan users, including Lite, Pro, Max, and Team plans. http://docs.z.ai/devpack/latest-model

By Sung Kim

Social•Jun 13, 2026

Meta's AI Team Likened to Soul‑crushing Gulag

Yeah, I was right... "Their assigned work? Generating puzzles and coding problems to train AI models. “It’s literally the gulag,” one employee told Wired. “Most people find the work soul-crushing,” said another." https://techcrunch.com/2026/06/12/metas-months-old-ai-unit-is-a-soul-crushing-gulag-say-the-engineers-stuck-inside-it

By Sung Kim

Social•Jun 12, 2026

Kimi‑K2.7‑Code Boosts Performance, Cuts Reasoning Tokens

Moonshot AI's Kimi-K2.7-Code - Improved coding & agent performance over K2.6: +21.8% on Kimi Code Bench v2, +11.0% on Program Bench, and +31.5% on MLS Bench Lite. - Reasoning efficiency: Less overthinking, with 30% lower reasoning-token usage compared to K2.6. - Long-horizon coding:...

By Sung Kim

Social•Jun 12, 2026

PaddleOCR's PP-OCRv6 Boosts Accuracy and Speed Across Devices

Baidu's PP-OCRv6 looks impressive. PaddleOCR’s new OCR model series scales from 1.5M to 34.5M parameters, bringing stronger accuracy, faster inference, and broader deployment options — from browsers and edge devices to servers. Web: https://paddleocr.com Repo: https://github.com/PaddlePaddle/PaddleOCR Models: https://huggingface.co/collections/PaddlePaddle/pp-ocrv6

By Sung Kim

Social•Jun 9, 2026

Anthropic's Fable Flags Content, Hinders Security Testing

Anthropic: Use Fable/Mythos to harden your app against security threats. Me: Tries to use Fable/Mythos to harden my app against security threats. Also, Anthropic: Fable 5's safety measures flagged this message for cybersecurity or biology topics. They may flag safe, normal content...

By Sung Kim

Social•May 30, 2026

Decentralized AI Agents Conduct End‑to‑End Scientific Research

AutoScientists: a decentralized team of AI agents that can generate hypotheses, design experiments, write code, test ideas, analyze failures, and revise strategy as evidence accumulates.

By Sung Kim

Social•May 30, 2026

StepFun AI's Flash Delivers 400

StepFun AI's Step 3.7 Flash (open-weight) 400 TPS. 198B sparse MoE, ~11B active. 256K context, 3 reasoning levels, built for agentic, coding, search, and multimodal workflows — balancing speed, cost, and reliable execution.

By Sung Kim

Social•May 28, 2026

Neural Weight Norm Mirrors Kolmogorov Complexity

He finds that neural networks may be biased toward the simplest fit that explains the data OR the neural network with the smallest possible weight norm (that fits the data) must encode the shortest program (that fits the data). "Neural Weight...

By Sung Kim

Social•May 27, 2026

FlashLib Delivers up to 208× GPU Speedup for Classic ML

FlashLib - a GPU library for fast, predictable, agent-ready classical ML operators. "Up to 26× on KMeans, 19× on KNN, 40× on HDBSCAN, 208× on TruncatedSVD, 47× on PCA, 147× on exact t-SNE, and 49× on MultinomialNB over state-of-the-art (cuML)." Blog: https://flashml-org.github.io Code:...

By Sung Kim

Social•May 26, 2026

Webwright Lets LLM Browse via Reusable Python Scripts

Microsoft's Webwright: Playwright for AI Agent CLI? Webwright gives LLM a terminal where it can launch multiple browser sessions to inspect the page and complete a web task. It captures and inspects page screenshots/states only when needed. It enforces each web...

By Sung Kim

Social•May 22, 2026

GPU Optimizations Are Interlinked, Necessitating Auto‑Tuning

A survey paper on Optimization Techniques for GPU Programming This survey discusses various optimization techniques found in 450 articles published in the last 14 years. They analyze the optimizations from different perspectives which shows that the various optimizations are highly interrelated,...

By Sung Kim

Social•May 22, 2026

Parallel Latent Trajectories Boost Test-Time Reasoning Breadth

They propose test time compute should scale not just by thinking deeper, but by thinking wider. They do this by sampling many latent reasoning trajectories, letting the model explore multiple hypotheses in parallel, so they don't follow one deterministic path...

By Sung Kim

Social•May 21, 2026

Stable Audio 3.0 Enables GPU-Free Six‑Minute Song Creation

Stability AI release Stable Audio 3.0, the open-weight model family built for artistic experimentation. "New and improved capabilities include variable-length generation up to six minutes, and full song composition on portable devices, no GPU required." Paper: https://arxiv.org/abs/2605.17991 Models: https://huggingface.co/collections/stabilityai/stable-audio-3 Web: https://stability.ai/stable-audio

By Sung Kim

Social•May 21, 2026

Delta Attention Residuals Learn Optimal Layer Routing

𝐃𝐞𝐥𝐭𝐚 𝐀𝐭𝐭𝐞𝐧𝐭𝐢𝐨𝐧 𝐑𝐞𝐬𝐢𝐝𝐮𝐚𝐥𝐬 A drop-in upgrade to residual connections that learns which past layers to route from, without the routing collapse that breaks prior cross-layer attention at scale.

By Sung Kim

Sung Kim

MLPerf v6.0 Adds MoE Benchmarks, Showcasing 671B DeepSeek

Zhipu AI Launches GLM‑5.2 for All Coding Plan Users

Meta's AI Team Likened to Soul‑crushing Gulag

Kimi‑K2.7‑Code Boosts Performance, Cuts Reasoning Tokens

PaddleOCR's PP-OCRv6 Boosts Accuracy and Speed Across Devices

Anthropic's Fable Flags Content, Hinders Security Testing

Decentralized AI Agents Conduct End‑to‑End Scientific Research

StepFun AI's Flash Delivers 400

Neural Weight Norm Mirrors Kolmogorov Complexity

FlashLib Delivers up to 208× GPU Speedup for Classic ML

Webwright Lets LLM Browse via Reusable Python Scripts

GPU Optimizations Are Interlinked, Necessitating Auto‑Tuning

Parallel Latent Trajectories Boost Test-Time Reasoning Breadth

Stable Audio 3.0 Enables GPU-Free Six‑Minute Song Creation

Delta Attention Residuals Learn Optimal Layer Routing

Technology Pulse