
Machine Learning at Scale (ML@Scale) announced a 2026 content schedule featuring four weekly formats, including a new Zürich Feed that curates Swiss machine‑learning job listings with compensation estimates. The newsletter offers a limited‑time early‑bird subscription at $15 per month (≈ 13 CHF) or $109 per year, after which the price rises to $20 per month (≈ 22 CHF) or $149 per year. Subscribers lock in the discounted rate permanently, while the free tier remains available. The promotional window closes in 48 hours.

Gauri Gupta’s LLM optimization notes map the current distributed training and inference landscape, emphasizing that naive implementations quickly hit memory limits. The guide details advanced parallelism techniques—ZeRO data parallelism, tensor and pipeline parallelism—and memory‑saving methods like Flash Attention. It also...
**800ms Latency Spikes From A $45K Redis Cluster That Looked Healthy [Edition #2]**
Fintech firm Veritas Pay, processing 800 million transactions annually, saw its real‑time fraud detection engine exceed the 150 ms SLA, with P99 latency spiking to 800 ms during peak loads. The root causes include Redis write saturation during six‑hour batch syncs, a Python...
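
A common mitigation for the kind of batch-sync write saturation described above is to break the sync into small, optionally throttled chunks so foreground reads are never starved by one giant write. A minimal sketch (the `write_chunk` callback and parameter values are hypothetical, not Veritas Pay's code):

```python
import time

def chunked(items, size):
    """Yield fixed-size chunks of a batch."""
    for i in range(0, len(items), size):
        yield items[i:i + size]

def throttled_sync(records, write_chunk, chunk_size=500, pause_s=0.0):
    """Push a large batch in small chunks, pausing between them so a
    multi-hour sync shares the store with latency-sensitive reads."""
    written = 0
    for chunk in chunked(records, chunk_size):
        write_chunk(chunk)   # e.g. one redis-py pipeline.execute() per chunk
        written += len(chunk)
        if pause_s:
            time.sleep(pause_s)
    return written
```

Tuning `chunk_size` and `pause_s` trades total sync duration against tail latency on the serving path.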

Datadog engineers moved from hand‑tuning Go assembly to an automated system called BitsEvolve that leverages large language models and evolutionary algorithms to optimize low‑level code. Manual removal of redundant bounds checks alone delivered a 25% CPU reduction on targeted functions....
**VectoScale Is Paying $237k/Month to Hide a Bad Architectural Decision [Edition #1]**
VectoScale, a Series B AI‑infrastructure startup handling 500 million daily queries, spends $237,000 a month on GPU inference and vector storage. Their hybrid retrieval pipeline suffers from an O(N) cross‑encoder reranker, unquantized 768‑dimensional vectors, and a one‑size‑fits‑all HNSW index, leading to p99...
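
To make the unquantized-vector cost concrete: symmetric int8 quantization shrinks 768-dimensional float32 vectors to roughly a quarter of their size at a small reconstruction error. A minimal sketch, not VectoScale's pipeline:

```python
import numpy as np

def quantize_int8(vecs):
    """Symmetric per-vector int8 quantization: store one float32 scale
    per vector plus int8 components, ~4x smaller than float32 storage."""
    scales = np.abs(vecs).max(axis=1, keepdims=True) / 127.0
    scales[scales == 0] = 1.0          # avoid divide-by-zero on empty vectors
    q = np.round(vecs / scales).astype(np.int8)
    return q, scales.astype(np.float32)

def dequantize(q, scales):
    """Recover approximate float vectors for scoring or reranking."""
    return q.astype(np.float32) * scales
```

At 500 million stored vectors, that 4x reduction translates directly into storage and memory-bandwidth savings, usually with negligible recall loss.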

Meta introduced GEM (Generative Ads Model), a foundation‑model approach that treats ad recommendation like a large language model. The architecture separates sequence and non‑sequence features, uses an InterFormer to handle long user histories, and adds a Student Adapter to keep...

UC Berkeley researchers introduced AI‑Driven Research for Systems (ADRS), a closed‑loop framework where large language models iteratively generate and refine system algorithms using simulators as hard verifiers. The approach treats code generation as an evolutionary search, allowing the LLM to...
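
The generate-and-refine loop can be sketched generically; here `propose` stands in for the LLM mutation step and `score` for the simulator that verifies each candidate (both are hypothetical placeholders, not the ADRS implementation):

```python
import random

def evolve(seed, propose, score, rounds=300, pop=8):
    """Skeleton of a verifier-in-the-loop evolutionary search: keep a
    small population of candidates, mutate a sampled parent, and let
    the simulator's score decide what survives."""
    population = [(score(seed), seed)]
    for _ in range(rounds):
        parent = max(random.sample(population, min(3, len(population))),
                     key=lambda sc: sc[0])[1]   # tournament selection
        child = propose(parent)                 # LLM-generated rewrite
        population.append((score(child), child))
        population.sort(key=lambda sc: sc[0], reverse=True)
        population = population[:pop]           # keep only the elite
    return population[0]
```

Because the simulator is a hard verifier, a hallucinated or broken candidate simply scores poorly and is discarded rather than propagated.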

Airbnb introduced an Embedding‑Based Retrieval (EBR) system to sharpen the candidate pool for its search experience. The model uses a two‑tower architecture, with offline‑precomputed listing embeddings and real‑time query embeddings, trained on session‑based hard negatives rather than random samples. For...
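
At serving time, the two-tower split reduces retrieval to a matrix-vector product over the offline-precomputed listing embeddings; a minimal sketch (illustrative, not Airbnb's code):

```python
import numpy as np

def top_k(query_emb, listing_embs, k=3):
    """Two-tower retrieval at serving time: listing embeddings are
    precomputed offline, so ranking a query against the whole corpus
    is one matrix-vector product plus a partial sort."""
    scores = listing_embs @ query_emb
    idx = np.argpartition(-scores, k)[:k]     # top-k, unordered
    return idx[np.argsort(-scores[idx])]      # order the k winners
```

The training-time choice of session-based hard negatives matters because random negatives are too easy: the dot-product scores above only discriminate well if the model was forced to separate plausible-but-unbooked listings.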

Continual learning for large language models (LLMs) is hampered by catastrophic forgetting when traditional fine‑tuning updates all parameters. A new approach replaces transformer feed‑forward layers with sparse memory layers, updating only a handful of key‑value slots identified via TF‑IDF. Experiments...
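
The slot-selection step can be illustrated with a tiny sketch: score a new document's tokens by TF-IDF against the existing corpus and update only the top-scoring slots, leaving the rest of the memory frozen (the token-to-slot mapping here is an illustrative assumption, not the paper's exact mechanism):

```python
import math
from collections import Counter

def tfidf_top_slots(doc_tokens, corpus, k=3):
    """Pick the k slot keys most specific to a new document via TF-IDF,
    so a continual-learning update touches only those key-value slots."""
    n = len(corpus)
    df = Counter(t for d in corpus for t in set(d))   # document frequency
    tf = Counter(doc_tokens)
    scores = {t: (tf[t] / len(doc_tokens)) * math.log((1 + n) / (1 + df[t]))
              for t in tf}
    return sorted(scores, key=scores.get, reverse=True)[:k]
```

Tokens that appear everywhere (low IDF) score near zero, so generic slots stay untouched and only document-specific knowledge is written, which is what limits catastrophic forgetting.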

The post demystifies a machine‑learning engineer’s routine, showing it’s less about glamorous model training and more about disciplined workflow. The author starts early, clears their email inbox, applies a five‑minute rule for quick actions, and parks larger tasks in a physical...