Sung Kim

Creator

Semiconductor technologist/investor commenting on ASML and EUV deployment; brings practitioner insight on Europe’s chip equipment leader and related equity narratives.

Social•May 19, 2026

Never Switch AI Models Mid‑Workflow—Lose Your Context

Advice for using Claude Code or Codex: don’t switch models mid-workflow. A model switch is like handing the task to a new engineer with the same laptop, files, terminal history, and notes, but none of the previous engineer’s mental flow. Anything not written down clearly is effectively gone.

By Sung Kim

Social•May 13, 2026

Token Superposition Cuts LLM Pre‑training Time 2‑3×

2x to 3x faster LLM pre-training without changing the model architecture, optimizer, tokenizer, or training data. Nous Research's Token Superposition Training (TST): During the first third of training, the model reads and predicts contiguous bags of tokens, averaging their embeddings...

By Sung Kim

Social•May 13, 2026

4B RL‑fine‑tuned Model

Can a 4B model learn to recursively call itself to answer hard long-context questions? So, they RL fine-tuned a small model to behave as a native RLM. They claim their 4B RLM matches Sonnet 4.6 in quality while running significantly...

By Sung Kim

Social•May 13, 2026

Jina AI Releases Open-Weight Multimodal Omni Embeddings

Jina AI releases open weights for jina-embeddings-v5-omni, their universal embedding model for text, images, audio, and video. Available in two sizes: small (1.57B, 1024-dim, 32K context) and nano (0.95B, 768-dim, 8K context). Both support Matryoshka truncation down to 32 dimensions.
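For readers unfamiliar with Matryoshka truncation, here is a minimal sketch of the general idea: keep only the leading components of an embedding and renormalize before comparing. It is not Jina's API. The helper names `truncate_embedding` and `cosine` are hypothetical, and the 6-dimensional vectors are toy stand-ins for real 1024- or 768-dimensional outputs.

```python
import math


def truncate_embedding(vec, dim):
    """Keep the first `dim` components and rescale to unit L2 length."""
    if not 1 <= dim <= len(vec):
        raise ValueError("dim must be between 1 and len(vec)")
    head = vec[:dim]
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        return list(head)  # nothing to rescale
    return [x / norm for x in head]


def cosine(a, b):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy vectors (assumed values, not real model output).
doc = [0.9, 0.3, 0.2, 0.1, 0.05, 0.02]
query = [0.8, 0.4, 0.1, 0.2, 0.01, 0.03]

short_doc = truncate_embedding(doc, 3)
short_query = truncate_embedding(query, 3)
print(len(short_doc), round(cosine(short_doc, short_query), 3))
```

Shorter vectors cost less to store and search. Truncation only preserves quality when the model was trained for it, which is what the post says both sizes support.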

By Sung Kim

Social•May 13, 2026

Programmable Renderers Streamline LLM Chat Template Processing

renderers: Programmable chat templates for LLM training and inference. A renderer turns a model's chat template into a Python object that can render messages → token ids, parse completion ids → structured assistant messages, and extend a multi-turn rollout without re-rendering...

By Sung Kim

Social•May 12, 2026

Cognition Doubles Devin Usage Every Eight Weeks, Hits $445M Run Rate

Just who is paying for or using Devin? "Cognition is also one of the fastest-growing businesses in the history of business, doubling usage of Devin every eight weeks and hitting $445 million of revenue run rate in its first 18 months of service....

By Sung Kim

Social•May 10, 2026

Redis Creator Launches DS4 Inference Engine for DeepSeek V4

DS4, a specialized inference engine for DeepSeek V4 Flash, by the creator of Redis. github.com/antirez/ds4

By Sung Kim

Social•May 10, 2026

Apple's Intel Partnership Validates Foundry, Spurs Industry Adoption

Apple using Intel is a big deal, not just because Apple would be using Intel’s 18A-P or 14A process, but because it would give Intel Foundry a major stamp of approval and make it easier for other customers to follow.

By Sung Kim

Social•May 9, 2026

YouTube DJ Sets Blur Live and Curated Performance

I’ve been watching a lot of YouTube videos where DJs play quasi-live music sets. They feel like a traditional live performance, but the DJ is playing a curated playlist with visuals.

By Sung Kim

Social•May 8, 2026

Zyphra Unveils ZAYA1-VL-8B Vision-Language Model

Zyphra (Open Superintelligence AI Lab in SF) released:
- ZAYA1-VL-8B, vision-language model

Blog: https://zyphra.com/post/zaya1-vl-8b
Technical report: http://zyphra.com/zaya1-vl-8b-technical-report
Weights: http://huggingface.co/Zyphra/ZAYA1-VL-8B

By Sung Kim

Social•May 7, 2026

Claude’s Hidden Activations Decoded Into Readable Text

Anthropic's Natural Language Autoencoders: Turning Claude’s thoughts into text You interact with Claude using language, but Claude processes those words as long lists of numbers, before again producing words as its output. These numbers in the middle are called activations—and like...

By Sung Kim

Social•May 7, 2026

TokenSpeed Delivers Light‑Speed LLM Inference, Open‑Source

LightSeek's TokenSpeed, a speed-of-light LLM inference engine
- TensorRT LLM level performance
- vLLM level usability
- Built by a lean and mission-driven team in two months
- MIT license, open-source

Blog: https://lightseek.org/blog/lightseek-tokenspeed.html
Repo: https://github.com/lightseekorg/tokenspeed

By Sung Kim

Social•May 7, 2026

Opus 4.7 Improves While GPT‑5.5 Declines

Is it just me, or has Opus 4.7 (Claude Code) gotten better while GPT-5.5 (Codex) has gotten worse?

By Sung Kim

Social•May 6, 2026

Deep Nets Learn Real Patterns Fast, Hide Noise Slowly

A Theory of Generalization in Deep Learning, by Elon Litman and Gabe Guo. Deep learning generalizes because the model learns “real patterns” quickly, while memorizing noise happens slowly and mostly gets hidden in parts of the model that do not matter for...

By Sung Kim

Social•May 6, 2026

Building and Scaling RL Environments for the LLM Era

The ultimate guide to RL environments: building and scaling them in the LLM era by Hugging Face https://huggingface.co/spaces/AdithyaSK/rl-environments-guide

By Sung Kim

Social•May 5, 2026

MATLAB Agentic Toolkit Integrates AI Coding Assistants

MATLAB Agentic Toolkit: MathWorks has released the MATLAB Agentic Toolkit, which will significantly improve the life of anyone who is using MATLAB and Simulink with agentic AI systems such as Claude Code or OpenAI Codex.

Blog: https://blogs.mathworks.com/matlab/2026/04/13/introducing-the-matlab-agentic-toolkit/
Repo: https://github.com/matlab/matlab-agentic-toolkit

By Sung Kim

Social•May 4, 2026

Leave Portable Batteries Behind for Korea‑U.S. Flights

Flying from Korea to the U.S.? You may want to discard your portable batteries before you travel, because airport security may discard them for you anyway.

By Sung Kim

Social•May 4, 2026

Brands Profit Selling $30 Glasses Abroad, Not US

Flying from Korea to LA. Interestingly, a lot of brands have their own lines of eyeglasses and sunglasses, selling for about $30 each. It seems like this must be very profitable for them. Something to think about the next time you...

By Sung Kim

Social•May 3, 2026

Short Social Media Breaks Boost Weight‑Training Gains

An easy exercise for everyone: weight training
1. Do a set of weights
2. Browse social media for 2 minutes
3. Do a second set of weights
4. Browse social media for 2 minutes
5. Repeat two more times
6. Go to another station
7. Repeat 1-6,...

By Sung Kim

Social•May 3, 2026

NIST CAISI Says DeepSeek V4 Pro Wid

NIST Center for AI Standards and Innovation (CAISI) Evaluation of DeepSeek V4 Pro. Is the gap widening or narrowing? NIST CAISI thinks it is widening. https://www.nist.gov/news-events/news/2026/05/caisi-evaluation-deepseek-v4-pro

By Sung Kim

Social•May 2, 2026

Agentic Data Creation Turns Inference Power Into Better Training

Meta's Autodata: an agentic data scientist to create high-quality data. They introduce a method for building agents that create high-quality training & evaluation data, and find that agentic data creation provides a way to convert increased inference compute into...

By Sung Kim

Social•May 1, 2026

Separate Review Agent Ensures Safe Autonomous AI Actions

OpenAI's blog: Auto-review of agent actions without synchronous human oversight. When Codex wants to run a risky action outside its sandbox, a separate Codex agent approves or denies it. The key is separation of concerns: the main agent is optimized to get...
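The approve-or-deny pattern described above can be sketched in a few lines. This is an illustration only, not OpenAI's implementation: `Action`, `toy_reviewer`, and `run_with_review` are hypothetical names, and the keyword check is a stand-in for what would really be a second model acting as reviewer.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Action:
    command: str
    in_sandbox: bool  # True when the action stays inside the sandbox


def toy_reviewer(action: Action) -> bool:
    """Hypothetical stand-in for the separate review agent.

    A real reviewer would be another model call; this keyword check
    only illustrates that the decision is made outside the main agent.
    """
    blocked = ("rm -rf", "sudo", "curl")
    return not any(token in action.command for token in blocked)


def run_with_review(
    action: Action,
    reviewer: Callable[[Action], bool],
    execute: Callable[[Action], str],
) -> str:
    """Run sandboxed actions directly; gate everything else on the reviewer."""
    if action.in_sandbox:
        return execute(action)
    if reviewer(action):
        return execute(action)
    return "denied"


def fake_execute(action: Action) -> str:
    # Assumed executor for the demo; it runs nothing for real.
    return f"ran: {action.command}"


print(run_with_review(Action("ls", True), toy_reviewer, fake_execute))
print(run_with_review(Action("sudo reboot", False), toy_reviewer, fake_execute))
```

Passing the reviewer in as a separate callable keeps the separation of concerns visible: the code that proposes an action never decides whether it is allowed.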

By Sung Kim

Social•May 1, 2026

TSMC N2 May Be Delayed, No Adoption Yet

Has anyone asked TSMC whether N2 is delayed? Please note that TSMC excels at incremental improvements. But TSMC N2 is not incremental. It is a major leap from FinFET to GAA/nanosheet. We know that customers such as Bitmain and AMD were expected...

By Sung Kim

Social•Apr 30, 2026

SK Hynix Vest Beats Luxury Brands as Korea’s Dating Dress

In Korea, the ultimate blind-date outfit is not Louis Vuitton, Prada, or any luxury brand. Please — that is so passé. It is an SK hynix employee vest. https://www.koreatimes.co.kr/southkorea/20260430/sk-hynix-vest-becomes-koreas-hottest-status-symbol

By Sung Kim

Social•Apr 30, 2026

Stubborn Faith in Intel Pays Off Amid Doom

It seems being stubborn as a mule about your core conviction, that the U.S. needs its own semiconductor champion, Intel, for example, can be very profitable, even when 99% of people on social media have spent years insisting the company...

By Sung Kim

Social•Apr 28, 2026

Curated Papers on Memory for LLM and Multimodal Agents

Awesome Agent Memory Papers A curated list of papers on memory for LLM / multimodal agents — methods, benchmarks, and surveys — covering episodic, semantic, procedural, and multimodal memory, with both parametric (internal) and retrieval-based (external) storage, learned via prompting, supervised...

By Sung Kim

Social•Apr 28, 2026

Mistral Launches AI Workflows, Echoing Temporal’s Model

Mistral releases Workflows for AI (I don't know... it looks like Temporal without AI to me) https://mistral.ai/news/workflows

By Sung Kim

Social•Apr 28, 2026

Abstract Token Reasoning Matches Verbal CoT, Cuts Cost

Does an LLM really need to think in English or Chinese? What if it instead thinks using a short sequence of reserved "abstract" tokens, learned through reinforcement learning? They find that it performs as well as verbalized CoT at a fraction of the...

By Sung Kim

Social•Apr 26, 2026

Xiaomi Stops Copying Cars, Hires European Designers Directly

Good news: Xiaomi won’t blatantly copy European car design anymore. They’ve found a faster method: hiring the designers directly.

By Sung Kim

Social•Apr 25, 2026

Enforce Strict Canonical Terms to Curb AI‑induced Debt

With AI coding agents, did you know you can accumulate years of technical debt in a matter of days? How do you prevent your codebase from becoming unmaintainable? Remember, these coding agents often rely heavily on text-search patterns to understand and...

By Sung Kim

Social•Apr 25, 2026

Intel's 18A Beats TSMC N2 in Real‑World Availability

So, who is ahead in fab technology: Intel or TSMC? Have you seen any products you can buy today that were fabbed on TSMC N2? I haven’t. Have you? They say TSMC's N2 has a higher yield, but again where are...

By Sung Kim

Social•Apr 24, 2026

RAG Lives On: Latency Irrelevant in Agentic AI

One more thing. Is RAG dead? No. In traditional AI chat, RAG was detrimental because of the added latency. But in agentic AI, latency does not matter nearly as much.

By Sung Kim

Social•Apr 22, 2026

New Prosumer Test Affects Only 2% of Signups

It seems they're running a small test on ~2% of new prosumer signups. Existing Pro and Max subscribers aren't affected (YET, BUT WILL BE).

By Sung Kim

Social•Apr 21, 2026

Claude Pro Loses Code Access as Token Costs Rise

Just wow. Tokens are getting more expensive. Claude Code is no longer included in the Claude Pro plan. https://claude.com/pricing

By Sung Kim

Social•Apr 21, 2026

Mozilla Fixes 271 Firefox Bugs Using Anthropic Mythos

Mozilla scanned the Firefox codebase with Anthropic Mythos and has fixed 271 vulnerabilities identified during this initial evaluation. https://blog.mozilla.org/en/privacy-security/ai-security-zero-day-vulnerabilities/

By Sung Kim

Social•Apr 20, 2026

AI Coding Hype Fades, Now Just Faster App Building

Back in December, when AI coding agents started getting good, it felt like magic. Now, the magic is gone, replaced by the same old process of building an app one step at a time, except you’re coding much less, or not...

By Sung Kim

Social•Apr 18, 2026

Hybrid Model Cuts KV Cache, Slashes Token Costs

Moonshot AI: prefill/decode disaggregation beyond a single cluster: cross-datacenter + heterogeneous hardware, unlocking the potential for significantly lower cost per token. This was previously blocked by KV cache transfer overhead. The key enabler is our hybrid model (Kimi Linear), which reduces...

By Sung Kim

Social•Apr 18, 2026

Rust Tailscale Library Expands with C, Elixir, Python Bindings

tailscale-rs is a work-in-progress Tailscale library written in Rust, with language bindings for C, Elixir, and Python. https://github.com/tailscale/tailscale-rs

By Sung Kim

Social•Apr 18, 2026

Rust-Powered LLM Framework Delivers 3× Speed, Cross‑platform

He trained a 12M-parameter LLM on his own ML framework using a Rust backend and CUDA kernels for flash attention, AdamW, and more. The framework features:
- Custom CUDA kernels (Flash Attention, fused LayerNorm, fused GELU) for 3x increased throughput
- Automatic...

By Sung Kim

Social•Apr 17, 2026

TPO Converts RL to Supervised Learning for Sparse Rewards

Target Policy Optimization (TPO): TPO turns GRPO into supervised learning: build a target distribution over sampled completions, then fit with cross-entropy. TPO brings the old “reweight, then fit” idea from the RL-as-EM line of work (e.g. REPS, MPO) into group RL....
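To make "reweight, then fit" concrete, here is a minimal sketch in the generic REPS/MPO style: a softmax over group rewards gives a target distribution, and the loss is the cross-entropy against the policy's log-probabilities of the same completions. The exact target TPO uses is not stated in the post, so the softmax form, the `temperature` parameter, and the function names are all assumptions.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def target_weights(rewards, temperature=1.0):
    """Reweight step (assumed form): higher reward gets more target mass."""
    return softmax([r / temperature for r in rewards])


def cross_entropy(target, logprobs):
    """Fit step: cross-entropy between the target and the policy's
    log-probabilities for the same sampled completions."""
    return -sum(w * lp for w, lp in zip(target, logprobs))


# One group of four sampled completions (toy numbers, not real data).
rewards = [1.0, 0.0, 0.0, 0.0]        # sparse: only one sample succeeded
logprobs = [-2.0, -1.0, -1.5, -3.0]   # policy log-probs of each completion

weights = target_weights(rewards, temperature=0.5)
loss = cross_entropy(weights, logprobs)
print([round(w, 3) for w in weights], round(loss, 3))
```

In a real training loop the loss would be minimized with respect to the policy parameters that produce `logprobs`. The target weights are treated as fixed labels, which is what makes the update look like supervised learning.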

By Sung Kim

Social•Apr 16, 2026

Pair Codex Backend with Claude Code Frontend for Polished UI

My general rule for AI coding agents is simple: use Codex for the backend and Claude Code for the frontend. Why? Codex is written in Rust, and Claude Code is written in TypeScript. In practice, that means you let Codex build the...

By Sung Kim

Social•Apr 16, 2026

Longing for Interpretable Models Over Opaque Modern AI

I really miss the good old days of linear and logistic regression, decision trees, random forests (kind of), SVMs, XGBoost (kind of), ARIMA, survival models, and similar methods, back when so-called AI was still something you could actually explain.

By Sung Kim

Social•Apr 16, 2026

Korea Alone Prevents Luxury Sales Collapse Worldwide

Koreans are single-handedly keeping luxury brands’ sales from cratering. 🤦‍♂️🤦🤦‍♂️ Literally, no word. None. Zilch. Nada. https://www.asiae.co.kr/en/article/2026040809431791412

By Sung Kim

Social•Apr 16, 2026

Looped Transformers Enable Implicit Reasoning and Broad Generalization

You have to admire the speed: generating a paper and a repo on Looped Transformer (LT) in about a week is impressive, even with AI assistance. LT is powerful because it can perform implicit reasoning over its parametric knowledge, unlocking...

By Sung Kim

Social•Apr 15, 2026

Distilling Experience Into Memory Primitives Amid Data Explosion

Harness, Memory, Context Fragments, & the Bitter Lesson by Viv: ...or how do we efficiently distill experiences into higher-level memory primitives that capture the important parts when the amount of data produced by those agents is growing hyper-exponentially?

By Sung Kim

Social•Apr 14, 2026

Host Your App at Home with Free Cloudflare Tunnel

Oh, I forgot to mention that you can even host your app at home if you have a spare PC lying around. PostGrip uses Cloudflare Tunnel, which is free, to securely expose your app to the internet.😀

By Sung Kim

Social•Apr 14, 2026

Deploy PostGrip: VPS, Docker Worker, Simple Setup

So, how does PostGrip work? It’s very simple. You go to a VPS provider like Hetzner and provision a VPS for your app(s). You download the worker Docker image from Docker Hub, run it on your VPS, and then follow the...

By Sung Kim

Social•Apr 13, 2026

Expensive $20K Setup Delivers MiniMax M2.7

$20,000 setup, but it is nice. MiniMax M2.7 200K running on 4x DGX Sparks and OpenCode frontend by @TheAhmadOsman

By Sung Kim

Social•Apr 13, 2026

Codex Currently Outperforms Claude Code

Just my personal observation. Right now, Codex is better than Claude Code. For reference, I have both subscriptions: Claude Code Max and ChatGPT Pro.

By Sung Kim

Social•Apr 10, 2026

Burning Tokens, Still Manual Testing: Are You Shipping?

Question for builders burning trillions of tokens across multiple Max and Pro accounts: are you actually shipping anything? At the end of the day, it still takes your own time to manually test new features.

By Sung Kim
