
What an API Actually Does
The video explains that an application programming interface (API) is the conduit through which software interacts with large language models, whether the model is proprietary, open‑weight, or open‑source. When a developer sends a prompt, the API forwards it to the provider’s servers, the model generates a completion, and the API returns that text to the calling app. This request‑response cycle isolates the heavy computation from the client. The presenter likens the process to ordering food via a delivery app and cites ChatGPT in a browser as a concrete example: the user’s query travels to OpenAI’s servers, the model crafts a reply, and the answer appears on screen. Even locally hosted models are typically exposed through an API to simplify integration. Understanding this pattern is crucial for developers because it dictates latency, cost, security, and scaling considerations; choosing or building an API layer becomes a primary architectural decision for any AI‑powered product.
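To make the request‑response cycle concrete, here is a minimal sketch in Python. It does not call any real provider: `fake_provider` is a hypothetical stand-in for the provider's servers, and the `prompt` and `completion` field names are assumptions chosen for illustration, not any vendor's actual schema.

```python
import json


def fake_provider(request_body: str) -> str:
    """Hypothetical stand-in for the provider's server.

    A real server would run the model here; this one just echoes the prompt.
    """
    request = json.loads(request_body)
    completion = f"Echo: {request['prompt']}"
    return json.dumps({"completion": completion})


def complete(prompt: str, transport=fake_provider) -> str:
    """Client side of the cycle: package the prompt, send it, unpack the reply."""
    request_body = json.dumps({"prompt": prompt})   # 1. serialize the request
    response_body = transport(request_body)         # 2. heavy work happens remotely
    return json.loads(response_body)["completion"]  # 3. return only the text


reply = complete("What is an API?")
print(reply)
```

In a real application, `transport` would be an HTTP call to the provider's endpoint. Keeping it swappable is one way to point the same client at either a hosted or a locally served model.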

The Future and Risk of Agents
The video examines how developers must decide which class of large‑language model to adopt when moving from experimentation to production. It outlines three categories—proprietary models such as OpenAI’s GPT‑5 or Google’s Gemini, open‑weight models like Meta’s Llama 3.1, Mistral, and Google’s Gemma,...

Make Prompts Reliable Through Systematic Testing, Not Tricks
If you’re a student or a professional using AI daily, you’ve seen this happen. A prompt works great today. Tomorrow it gives a weird answer. Next week it breaks after a model update 😅 A prompt that works once for one model isn’t reliable. A...

When NOT to Use Agents
The video examines the emerging class of agentic AI systems and warns against indiscriminate deployment. Unlike traditional reactive chatbots that wait for a prompt and return a single answer, agentic models can formulate plans, execute multiple actions, and deliver complex...

LLMs Learn by Predicting Tokens, Then Get Instructed
LLMs don’t wake up smart. They’re trained into it. Before a model can answer questions, follow instructions, or sound helpful, it goes through a long phase called pre-training. This is where: • random parameters • massive amounts of text and code • and one simple task come...
Free Cheatsheet Guides Practical Agent Architecture Decisions
We created a free Agents Architecture Cheatsheet. Here’s why 👇 A lot of people are building agent systems without a clear reason to do so. They mix tools with agents, over-complicate architectures, and struggle to move from demos to production. This cheatsheet is designed to be...

Why AI Makes Things Up
The video explains grounding – the practice of constraining large language model (LLM) responses to information drawn from verifiable external sources – as a core strategy to curb hallucinations. By forcing the model to rely on trusted data rather than...

This Setting Controls Randomness
The video explains how the temperature parameter governs the randomness of token selection in large language models, shaping whether outputs are deterministic or stochastic. A temperature of zero forces the model to pick the single most probable token, producing identical responses...
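As a rough illustration of the idea, the sketch below applies temperature to a toy set of scores. The `logits` values and the `sample_token` helper are made up for this example, and treating temperature zero as a plain argmax is a common convention rather than a description of any particular provider's implementation.

```python
import math
import random


def sample_token(logits: dict[str, float], temperature: float, rng: random.Random) -> str:
    """Pick one token from toy scores, with randomness controlled by temperature."""
    if temperature == 0:
        # Greedy choice: always the single highest-scoring token.
        return max(logits, key=logits.get)
    # Dividing by the temperature sharpens (<1) or flattens (>1) the distribution.
    scaled = {tok: score / temperature for tok, score in logits.items()}
    peak = max(scaled.values())
    # Softmax numerators; subtracting the peak avoids overflow in exp().
    weights = {tok: math.exp(s - peak) for tok, s in scaled.items()}
    tokens = list(weights)
    return rng.choices(tokens, weights=[weights[t] for t in tokens], k=1)[0]


logits = {"cat": 2.0, "dog": 1.0, "fish": 0.1}  # hypothetical scores
rng = random.Random(0)
greedy = [sample_token(logits, 0, rng) for _ in range(5)]
varied = [sample_token(logits, 1.5, rng) for _ in range(5)]
print(greedy)
print(varied)
```

With temperature 0 every draw is the same token. With a higher temperature, lower-scoring tokens get a real chance of being picked.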

Learn AI in Months, Job-Ready in 1‑2 Years
I’ve done quite a few AI workshops recently, and I keep getting the same questions 👇 “Where do I start?” “How long does it really take to learn AI?” “Can I actually become job-ready?” So to clear the confusion, I put all the resources...

From Workflows to Multi-Agent Systems: How to Choose
In this talk Luis Franis, CTO of TORZI, explains how AI engineers decide between workflows, single agents, and multi‑agent systems when building client solutions. He frames AI engineering as a bridge between model development and product integration, emphasizing constraints such...
LLMs Predict Tokens, Humans Predict Meaning
People say “LLMs learn like humans, we both copy patterns.” Sounds right. It’s also misleading. LLMs don’t learn language to understand meaning. They learn to predict the next token. Not the next word. Tokens. IDs. Math. Over and over, trillions of times,...

How LLMs Think Step by Step & Why AI Reasoning Fails
The video explains how large language models (LLMs) often stumble on multi‑step questions because they attempt to jump straight to a final answer, leading to logical slips and hallucinations. To mitigate this, practitioners employ a prompt‑engineering technique called chain‑of‑thought (CoT),...

2025: Massive Growth, New Courses, and Personal Milestones
My 2025 wrapped: - released our first ever course & product 🚀 - followed up with 3 more courses (and a 4th coming soon with a great friend, @pauliusztin_) - invited to NVIDIA GTC and briefly met Jensen + many amazing people - landed...

The Easiest Way to Improve Prompts
The video explains two foundational prompting strategies—zero-shot and few-shot learning—used to shape large language model outputs. Zero-shot prompting presents a plain instruction without any exemplars, trusting the model’s pre‑trained knowledge to generate an answer, such as asking a general‑purpose assistant...
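The difference between the two strategies is easiest to see in the prompt text itself. The sketch below only builds prompt strings; the `Input:` / `Output:` layout and the helper names are assumptions for illustration, not a required format.

```python
def zero_shot(instruction: str, query: str) -> str:
    """Instruction plus the query, with no worked examples."""
    return f"{instruction}\n\nInput: {query}\nOutput:"


def few_shot(instruction: str, examples: list[tuple[str, str]], query: str) -> str:
    """Same instruction, preceded by a handful of input/output exemplars."""
    shots = "\n\n".join(f"Input: {x}\nOutput: {y}" for x, y in examples)
    return f"{instruction}\n\n{shots}\n\nInput: {query}\nOutput:"


instruction = "Classify the sentiment as positive or negative."
examples = [("I loved it", "positive"), ("Waste of money", "negative")]

print(zero_shot(instruction, "Not bad at all"))
print("---")
print(few_shot(instruction, examples, "Not bad at all"))
```

The few-shot version costs more tokens per call, but the exemplars show the model the exact output format expected.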
Accuracy Isn’t Enough: Prioritize Relevance, Grounding, Faithfulness
Accuracy is a terrible metric for LLMs. And it’s the reason many AI demos look great but fall apart in real usage. LLMs don’t usually fail by being wrong. They fail by being irrelevant, ungrounded, or confidently misleading. An answer can be “accurate” in isolation and still be useless...

This Is How Much AI Can Remember
The video explains that a language model’s ability to remember is bounded by its context window – the maximum number of tokens it can see at once. The window comprises the system prompt, the full dialogue history, and any tokens the...
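One practical consequence is that an application has to decide what to drop when a conversation outgrows the window. The sketch below keeps the most recent messages that fit. This drop-the-oldest policy is an assumption for illustration, not something the video prescribes, and counting whitespace-separated words is only a rough stand-in for a real tokenizer.

```python
def count_tokens(text: str) -> int:
    """Crude token estimate: whitespace-separated words (real tokenizers differ)."""
    return len(text.split())


def fit_to_window(system_prompt: str, history: list[str],
                  max_tokens: int, reserve_for_reply: int) -> list[str]:
    """Keep the newest messages that fit beside the system prompt and the reply."""
    budget = max_tokens - reserve_for_reply - count_tokens(system_prompt)
    kept = []
    for message in reversed(history):  # walk from newest to oldest
        cost = count_tokens(message)
        if cost > budget:
            break                      # older messages no longer fit
        kept.append(message)
        budget -= cost
    return list(reversed(kept))        # restore chronological order


history = ["one two three", "four five", "six"]
visible = fit_to_window("be brief", history, max_tokens=10, reserve_for_reply=4)
print(visible)
```

Here the oldest message is silently dropped, which is why a long chat can appear to "forget" its beginning.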

Context Rot Causes AI Failures; Engineer Memory
Most AI failures don’t come from bad prompts or weak models. They come from bad context. As tasks get longer and agents take more steps, important information gets buried, forgotten, or drowned in noise, something we call “context rot.” The result looks...

Your Prompts Aren’t the Problem—Your Context Is
The video argues that the real bottleneck in AI assistants isn’t how you phrase a question but what information the model actually sees when it generates a reply. While traditional prompt engineering tweaks wording to coax better answers, "context engineering"...

Why Prompts Actually Work
The video breaks down why prompts work, defining a prompt as the full set of instructions and context sent to an LLM. It distinguishes two parts: a system prompt that establishes the model’s role and constraints, and a user prompt...

Constraints, Not Model Choice, Drive Real-World AI Success
Do you still care about picking the right model? GPT. Gemini. Claude. Bigger models. Bigger context windows. But when you actually work on real projects, you quickly realize something else. Most decisions aren’t driven by models. They’re governed by constraints: cost, latency, quality, and data privacy. Every model call has a...

RLHF Explained Simply
RLHF, or reinforcement learning from human feedback, is the technique powering modern large‑language‑model alignment. Rather than relying solely on static text corpora, developers augment training with human‑generated preference data, teaching models what constitutes a helpful, safe response. The workflow begins with...
Apply an Autonomy Slider:
Recently, a close friend of mine, @pauliusztin_, launched a free 9-lesson course on AI agent foundations, and I went through it. It’s short (around 1.5 hours total) and focuses purely on end-to-end fundamentals - no tools, no frameworks, just the core...

LLMs Turn Words Into Numbers Before Understanding Meaning
Your model doesn’t understand words. It understands numbers. That single fact explains a lot of confusing LLM behavior. Before an LLM can answer anything, your text goes through two quiet steps most people never see: Tokens: your sentence is broken into small pieces and...

How AI Gets Specialized (Fine-Tuning Explained)
The video demystifies fine‑tuning, the technique of taking a pre‑trained large language model and further training it on a narrow, high‑quality dataset to make it proficient at a specific task. Unlike the massive, generic corpus used for pre‑training, fine‑tuning relies on...

Base vs Instruct Models Explained
The video explains the fundamental distinction between base models and instruct models in modern AI development. A base model is the product of large‑scale pre‑training; it stores vast factual information but is not optimized for following user instructions or sustaining...

This Is How GPT Gets Built
The video walks through the foundational phase that turns a random‑parameter network into a functional language model, known as pre‑training. It describes how the model is fed an enormous corpus of text and code from the internet and tasked with...

Learn Python for AI by Building Real Projects
One of the best feelings in teaching AI? When a student describes your course exactly the way you hoped it would work. We just received a new review for our Beginner Python for AI Engineering course, and the part that hit me...
2026 Will Transform Creators After 2025 Image Boom
If you’re a creator, marketer, or video editor… 2026 is going to be very different.👇 2025 was dominated by image generation. Google’s Nano Banana Pro changed how we control style and lighting. ChatGPT made image consistency crazy. Images finally started doing what we asked fo...

Day 4/42: How AI Understands Meaning
The video explains how modern language models move beyond simple token IDs toward semantic representations called embeddings. While tokenization converts user input into arbitrary numeric identifiers, those IDs carry no information about word meaning or relationships, preventing the model from...
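A small sketch can show the contrast between IDs and embeddings. The token IDs and the three-dimensional vectors below are invented for illustration (real embeddings are learned and have hundreds or thousands of dimensions). Cosine similarity is one common way to compare them.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors: near 1.0 means similar direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical token IDs: adjacent numbers say nothing about meaning.
token_ids = {"king": 4021, "banana": 4022, "queen": 9987}

# Hypothetical embeddings: related words point in similar directions.
embeddings = {
    "king": [0.9, 0.8, 0.1],
    "queen": [0.85, 0.9, 0.15],
    "banana": [0.1, 0.05, 0.95],
}

royal = cosine_similarity(embeddings["king"], embeddings["queen"])
fruit = cosine_similarity(embeddings["king"], embeddings["banana"])
print(f"king vs queen:  {royal:.2f}")
print(f"king vs banana: {fruit:.2f}")
```

The IDs for "king" and "banana" are neighbors while their vectors are far apart, which is the gap embeddings are meant to close.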

Prediction Isn’t Understanding and That Difference Matters
The video tackles a common misconception that large language models (LLMs) learn in the same way humans do, arguing that the similarity ends at a superficial level of pattern imitation. It breaks the discussion into three parts – pre‑training, fine‑tuning/reinforcement...

ChatGPT Doesn’t “Know” Anything. This Is Why
The video demystifies large language models (LLMs) by framing them as sophisticated autocomplete engines. It explains that an LLM’s core task is to predict the most probable next token—whether a whole word, a sub‑word fragment, or punctuation—based on the preceding...
Six Years, 70k Subs: Persistence Beats Early Momentum
70k YouTube subscribers after 6 years. Sounds simple on paper. It wasn’t. For the first few years, everything felt easy. I was covering AI research papers, I loved it, and people loved it too. Consistency wasn’t a struggle because the content was...

Day 1/42: What Is Generative AI?
The video introduces a new daily short‑form series aimed at demystifying generative AI for a broad audience. It opens by acknowledging the common frustration of receiving slow, vague, or inaccurate answers from tools like ChatGPT, Gemini, or Google Cloud, and...
42 Days of No‑Hype AI Concept Videos
I’m publishing one AI video every day for the next 42 days. No math. No code. No hype. Just the concepts you actually need to understand LLMs. YouTube is where 𝗲𝘃𝗲𝗿𝘆𝘁𝗵𝗶𝗻𝗴 𝘀𝘁𝗮𝗿𝘁𝗲𝗱 for me. And honestly, I miss it. So on Monday, I’m coming back...
LLMs Trained on Reddit, Wikipedia, Now Eroding Them
It’s funny how LLMs depend on Reddit and Wikipedia content for their training, yet at the same time they’re killing both… https://t.co/MhE4knbw0t

Open This Before Jan 2
The episode highlights the rare, distraction‑free period between Christmas and New Year’s as an ideal time to decide whether to ship a real AI product in the coming year. It outlines a two‑part learning path—a free 10‑hour LLM Primer and...
Showcase AI Work Publicly; Jobs Find You
I didn’t get my first AI job by applying anywhere. It started with YouTube. 🎥👇 I was posting simple research explainers on YouTube when one day, the founder of a startup left a comment: “Can we talk?” He noticed I was also from Québec...

First University Talk: Turning AI Work Into Teach‑Ready Insight
I gave my first university talk this week at the University of San Diego. I was genuinely stressed. Not because the content was hard, but because it was the first time I had to turn what we do at Towards AI into...

Speech to Text Is Harder Than You Think
The video tackles a misconception that speech‑to‑text (STT) is merely a matter of converting audio into words. It argues that for production voice agents, transcription is only the first step; the real battle lies in extracting precise entities, handling latency,...
Voice Agents Require Far More than Simple Speech-to-Text
Most people think speech to text is just turning audio into words. Anyone who has built a real voice agent knows... that's the easy part. https://t.co/vNSn0Tum1y
Write to Learn: Content Accelerates Mastery, Not Virality
Sharing content online is one of the highest-ROI habits you can build, and it has nothing to do with going viral. When I started, my goal wasn’t audience or money. It was learning. Content forces clarity. Saying “I want to learn...

The Hidden Skill Boost Behind Posting Online
The video explores the often‑overlooked benefit of publishing content online: it serves as a powerful learning accelerator. The creator explains that his initial foray into content creation wasn’t driven by audience size, revenue, or virality, but by a desire to...
AI Amplifies What You Already Have, Good or Bad
I’ve been watching a pattern lately: a lot of businesses are trying to “add AI” hoping it will magically fix everything. But here’s the honest truth: AI won’t save a bad business. It will simply reveal what’s already broken. If your operations are messy,...
Avoid Prompt Debt: Understand AI‑Generated Code, Don’t Outsource Thinking
Prompt debt might be the most 2025 kind of technical debt. It’s what happens when AI writes your code… but you never build the mental model behind it. Shaw Talebi calls it out clearly: LLMs can generate code, architecture, even some...
Ship Fast: Use Your Known Stack, Not New Tools
Stuck choosing a tech stack for your next AI project? You might be overthinking it. My friend Shaw Talebi is an AI engineer who ships a lot of small AI SaaS projects fast. His rule is simple: build with what you...
Decode Papers Quickly: Abstract, Figure, AI Summaries
I used to be terrified of reading research papers... until I learned this 👇 The first time I opened one, I thought: “There’s no way I can read this.” Too formal, too technical, too many acronyms - and English wasn’t even my first language. But...
ChatGPT’s Three Modes Adjust Reasoning Depth, Not Models
Do you know how Instant vs Thinking vs Auto mode works in ChatGPT? And what the actual difference is?👇 These aren’t different models. They are different reasoning modes for the same model. Here’s what actually changes when you switch modes: 1. Instant Mode (Fast): Fast mode...
AI-Generated Replies Make Twitter Conversations Feel Generic
Yeah, I’ve been seeing more replies that read like polished, template-style LLM outputs—overly neutral tone, generic phrasing, and lots of “great question” energy. It definitely changes how discussions feel. Next, if you want, I can generate you a short checklist...
AI Generates New Images via Compressed Latent Space
How does AI create images or ideas that never existed? Not by remembering everything. AI reduces the data into a small, meaningful code: a compressed space where similar things stay close and different things spread apart. That space is called the latent space. The carousel below...
Chunk Retrieve Augment: RAG Cuts Hall
Here’s a simple breakdown of how a basic RAG pipeline actually works.👇 You start by breaking long documents into smaller, focused chunks and converting each chunk into an embedding vector. These embeddings capture semantic meaning which lets the system understand what each piece...
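The chunk, retrieve, and augment steps can be sketched in a few lines. This is a toy under stated assumptions: `embed` uses word counts as a stand-in for a real embedding model, chunks are fixed-size word windows, and all function names and the prompt wording are hypothetical.

```python
import math
import re
from collections import Counter


def chunk(text: str, size: int) -> list[str]:
    """Split a document into consecutive chunks of `size` words."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


def embed(text: str) -> Counter:
    """Toy 'embedding': lowercase word counts (a real system uses a learned model)."""
    return Counter(re.findall(r"\w+", text.lower()))


def similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(query: str, chunks: list[str], k: int = 1) -> list[str]:
    """Return the k chunks most similar to the query."""
    q = embed(query)
    return sorted(chunks, key=lambda c: similarity(q, embed(c)), reverse=True)[:k]


def augment(query: str, context: list[str]) -> str:
    """Build the final prompt: retrieved context first, then the question."""
    joined = "\n".join(context)
    return f"Answer using only this context:\n{joined}\n\nQuestion: {query}"


document = "Paris is the capital of France. Tokyo is the capital of Japan."
chunks = chunk(document, 6)
question = "What is the capital of Japan?"
prompt = augment(question, retrieve(question, chunks))
print(prompt)
```

The resulting prompt is what would be sent to the model, so the answer can lean on the retrieved chunk instead of the model's memory alone.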