Depth Anything 3 Reconstructs FPV Video in Seconds on A100
Depth Anything 3 can reconstruct this FPV video in just a few seconds on an A100 🤯 It was not long ago that I used to let Agisoft Metashape chug all night on a 3d scan, and here we are https://t.co/P6VMSZJcF9
AI Labels Will Soon Lose Meaning Amid Ubiquitous Integration
Yup. "Made with AI" labels will end up being pointless. From coding assistance to asset creation and optimization - the lines will get murky to the point of being unhelpful. Agree/disagree? https://t.co/SeL3yFGMbQ
VPS Gives AR Glasses and Robots Centimeter‑level Spatial Awareness
VPS or visual positioning system... this is how AR glasses and robots will understand where they are in the real world and know where to go. It’s also how you can spatially annotate reality with cm level accuracy - all...
AI Video Tools Unlock Affordable Faithful Classic Remasters
Editing video models (think nano banana for video) will cause a boom in faithful remasters of old classics. Suddenly you can afford decisions that used to be cost prohibitive. Maybe even tackle cult favorites that never had the fan base...
Soldiers Can Now See Through Walls: The Rise of Human–Robot Co-Perception
In this episode Bilawal Sidhu explores Anduril’s EagleEye AR headset, which fuses data from drones, satellites, ground sensors, and soldier-worn devices into a shared, real‑time 3D battlefield map that lets humans see through walls and coordinate with autonomous systems. He...
AI-Powered Keyframe Interpolation Delivers Local, Believable Animations
Wow.. AI-assisted keyframe interpolation in the latest release of Cascadeur. Define two keyframes manually and get believable animation in between. You can also mix regular, AI and physics-based interpolation - oh and it’s all generated locally: https://t.co/f3SqmcShbV
AI Video Shifts From Novelty to Everyday Utility
Nano banana pro is hitting the threshold for images that Veo 4 will unlock for video. We’ll suddenly go from static infographics to pro-grade animated motion graphics — like having a custom youtube video essay on any topic imaginable. And just like...
Video-to-Video Diffusion Expands VTuber Self-Image Flexibility
reskinning myself playing electric guitar. canon rock for the vibes. the future of vtubing is bright -- video-to-video diffusion is about to make everyone's self-image a lot more flexible. https://t.co/OWBIA5HVFu
Explore Cutting-Edge Visual Positioning and 3D Scanning Tech
check out @multiset_ai (visual positioning system & object tracking platform) and @XGRIDS2023 (3d scanning devices)
Real‑time 3D Mapping Delivers X‑Ray Precision Localization
3d visual positioning experiment -- look at the alignment between the 3d mesh and the live camera view. Truly feels magical -- like x-ray vision. Workflow: 1. Scanned a street in 15 mins w/ xgrids 2. Localized against that scan at night, in...
Grok Imagine Impresses with Quality Hindi Generation
@DevDminGod this is pretty good! i didn't know grok imagine could do such decent hindi
Vibe Coding Lets Anyone Build AR HUDs Instantly
Vibe coded this JARVIS inspired HUD with laser eyes and repulsor hands in 20 mins w/ Gemini 3 Pro using MediaPipe and ThreeJS. Augmented reality filters were always cool - but vibe coding makes them so much more accessible than complex...
Transform 360 Photos Into Immersive 3D Worlds Instantly
Turned my real world 360 images into 3d scenes with World Labs. You can use the built-in editing tools to stitch them together into large-scale 3d worlds - then use them as virtual set for your AI videos, games and VR...

Gaussian Splatting the World with Satellite 3D & Google's One-Two Punch
The episode spotlights Skyfall‑GS, a new method that combines 3D Gaussian splatting with diffusion models to generate detailed, city‑scale 3D maps from satellite imagery alone, opening up applications from conflict‑zone mapping to military terrain planning. It then shifts to Google’s...
Meta Cracks 3D Data Bottleneck with RLHF‑style Ranking
Meta just dropped SAM 3D, but more interestingly, they basically cracked the 3D data bottleneck that's been holding the field back for years. Manually creating or scanning 3D ground truth for the messy real world is basically impossible at scale. But what...
AGI's Multimodal Nature Treats Reality as Its Dataset
"AGI is multimodal and reality is the dataset of AGI"

AI Can Now Build 3D Worlds… And Live Inside Them
The episode explores the rapid convergence of AI-driven 3D world generators—such as World Labs' Marble and video‑diffusion system Genie‑3—and embodied agents like Google’s SIMA 2 that can perceive and act within those environments. Bilawal Sidhu explains how this fusion creates persistent,...
Real-Time 3D Home Mapping Showcases Computer Vision Power
Computer vision is fucking cool. Matic robo building a real time 3d map of your house. https://t.co/GSZq4fN11m
36‑Camera System Lets Fans Choose Angles—Do They Want It?
Pretty cool hack to blend between different video feeds to give you the feeling of free viewpoint video, AKA god's eye view. TL;DR: 36 cameras deployed at basketball & badminton venues for China's National Games, letting viewers drag around on their...
DeepMind's SIMA 2 Shows Incremental Yet Bittersweet Progress
Google DeepMind's SIMA 1 vs SIMA 2. The bitter lesson continues to be bittersweet https://t.co/puwR92vCla
SIMA 2 Shows Generalist AI Reasoning Across Games
Damn. DeepMind's generalist AI agent SIMA 2 evolved from basic instruction-following to actual reasoning companion. Uses vision and keyboard/mouse like a human player, works across dozens of games without touching game code. The robotics angle is obvious - if you...
From Complex 3D Scan Pipelines to Click‑Ready Simplicity
What was a complex hacky pipeline in 2023 to take indoor 3d scans and reskin them to different types of decor is now just a few clicks in 2025. World labs marble has collapsed a lot of the complexity involved...
Multimodal AI Companion Brings Real‑Life Jarvis to Users
Pretty big step towards a real life Jarvis - a multimodal ai assistant w/ a personality to boot. They're intentionally blurring the lines between a tool and a companion. This is what Siri should've been by now. Cool to see...
Seamless 3D Digital Twins: Merging Data, Unlocking Value
One of the coolest 3d real estate demos I've seen in a while. Treedis brings together gaussian splats for area understanding → BIM data to check which units are available → Matterport for indoor views → and image editing AI...
Douyin AI Videos: Mom Battles Escalating Alien Creature
Omg, AI videos on Douyin are a different breed. Chinese mom absolutely goes to town on a xenomorph 💀 And just when you think it’s over, it keeps escalating further: https://t.co/Ddgnb7gFLv
AI‑Powered Bike Survey Maps Ireland’s Empty Buildings
You can just bike around your city with a camera and auto identify vacant & derelict buildings. Ireland has a housing crisis. Thousands of empty buildings. Nobody knows where they all are. UCD's Spatial Dynamics Lab is using the latest in AI...
New Creation Engine: MotionStream Puppeteers Reality
I genuinely think we’re on the cusp of a new type of creation engine. Feels less like prompting and more like puppeteering reality itself. MotionStream is a taste of what’s to come: https://t.co/qWbOPnQD9R
Real-Time AI Video Generation via Interactive Motion Controls
We are just scratching the surface of precise control over AI video generation. MotionStream unlocks real-time video with interactive motion controls. You can interactively generate video based on motion inputs (like drawn trajectories, camera movements, or motion transfer). 29fps generation w/ 0.4...
Google's Gemini Adds Visual Landmarks to Navigation
Google is now using Gemini to cross-reference ~250M places with Street View imagery to identify visible landmarks for turn-by-turn nav. Think iconic buildings, gas stations and restaurants. So instead of "turn right in 500 feet" you get "turn right after the...
VR‑controlled Human Robots Erase Distance and Time
We will transcend space & time by teleoperating humanoid robots anywhere in the world using VR headsets. It’s wild how far we’ve come from iPads on wheels. https://t.co/lSx7n8Su9n
Big Tech's AI Twins: Your Identity Becomes Their Profit
Meta, Google, Apple - they're all building AI replicas that capture your face, expressions, movements, personality. This goes way beyond Face ID. They're basically creating a version of you that knows you better than you know yourself. The fidelity is remarkable...
Ilya’s Memo Exposes Brockman Feud, Failed Merger, Backs Altman’s Firing
Lots of RTs on my Helen Toner interview today. Over a year later, we hear Ilya’s side of this story. The Brockman beef was unexpected (Ilya wrote a memo on him too), as was the failed Anthropic merger. Interesting that most screenshots...

Matrix Oscar Winner: "It Was a Prototype Disguised as a Blockbuster"
John Gaeta, the Oscar‑winning VFX architect behind the Matrix’s bullet‑time, explains how the film’s pioneering capture and simulation techniques became the R&D foundation for today’s game engines, virtual‑production tools, and AI‑driven world models. He traces a 25‑year evolution from the...
Matrix Was Prototype, Now Powers Hollywood’s Digital Future
Matrix Oscar Winner: "It Was a Prototype Disguised as a Blockbuster" I spoke to John Gaeta, the man who built bullet time. While Hollywood thought they were making a sci-fi trilogy, his team was running a research project in plain sight. We...
Apple Adopts Gaussian Splats for Lifelike Vision Pro Avatars
Apple Sr. Director Jeff Norris on how Vision Pro persona went from looking like cadavers to life-like avatars: "First thing I'd want to point out is that we changed the whole rendering approach of Personas to a completely new technique...
Adobe's Spatial Lighting Mode Relights Images via AI
Okay coolest Adobe Sneaks of the year award goes to Project Light Touch. Adobe calls this "spatial lighting mode" -- interactively move your light source around within a 3D volume and voila -- your image is accurately relit. They're probably using ML...

Apple Leverages Gaussian Splatting for Realistic 3D Avatars
The secret's out -- Apple's persona avatars and 3D photos owe their quality boost to none other than Gaussian splatting. "In our meeting, Norris explains that Persona technology uses Gaussian splatting to create those surprisingly convincing 3D facial scans." "Norris says Apple...
AI-Powered Vision Reimagines Old Delhi Through Countless Eyes
Watching old delhi with a billion eyes (experimenting w/ generative ai + computer vision) https://t.co/v9eWuxmZS1
TED2025 Promise Fulfilled: Humanoid Robots Arrive at Home
When I brought 1X to TED2025 they promised something insane - people in the audience would have a humanoid robot in their homes within a year. Now these mad lads are delivering. Huge congrats Bernt, Dar, Eric and team! Order...
Exploring New Control Features in Generative Video Tools
Trying the new “control” features in generative video tools https://t.co/vRDrz0B7zf
1:1 3D Gaussian Splat Aligned with Reality
You can capture reality with remarkable fidelity - here’s a 1:1 3d gaussian splat perfectly aligned with reality. Once Meta layers in a scalable version of their codec avatar tech - this stuff is about to get trippy. Literally transcending time...
Just-in-Time AI Ads Will Predict Your Scroll
Just-in-time AI advertising is near. Here are the early signs. Even if these massive data center build outs don’t amount to AGI - you best believe personalized ads will be generated for you just a few videos before you scroll...
Hollywood's JARVIS Blueprint Shaped Real-World AI
Marvel gave Ian Dawson and his team six months to design Tony Stark's holographic AI assistant. "We weren't trying to predict the future. We were just solving script problems." Then Microsoft was building it. Tesla was building it. Apple was building it. Here's...
Google Turns Geospatial Data Into Everyday AI
Google just took another big step towards becoming ChatGPT for planet earth. I can’t overstate how important this is — geospatial AI commodified. Here’s what it can do: https://t.co/QjfwbEMXoy
PlayCanvas Introduces LOD Streaming for World‑Scale Splatting
I like big splats and I cannot lie. PlayCanvas's new LOD streaming system for 3D Gaussian Splatting is a big deal. It's a key step towards splatting the world and making it explorable. Check out a live demo below in your...
Robots Require Digital Twins via Reality Capture to Scale
Robots need a digital twin of the world to train inside. Reality capture techniques like radiance fields & surface reconstruction are crucial for robotics to scale in the real world — and this thread beautifully illustrates why.
Diffusion Models Turn Satellite Roofs Into Full 3D Cities
So these researchers figured out you can basically hallucinate 3D cities into existence using just satellite photos & a diffusion model. The problem's pretty straightforward: satellites only see rooftops. Building facades? Invisible. Street-level detail? Doesn't exist. But people want flyable 3D...
Meta Releases 500‑hour 3D Motion Dataset for AI
Datasets you need to build an AI JARVIS — Meta dropped 500 hours of 3D motion data spanning everything from individual gestures to multi-person conversations and co-living scenarios, complete with motion tracking, annotations, and audio tracks. https://t.co/ZDEiMmeq0Q
Mystery “World Models” Spark VC Excitement
World models: No one knows what it means, but it’s provocative. It gets the VCs going.
Simulon Delivers Superior Real‑time AI Lighting Estimation
I like light probes and I cannot lie. Especially when they’re being hallucinated by AI. Simulon is building a much higher quality version of the image-based lighting estimation that the Vision Pro or ARCore does in real time.