Data Science Dojo - Latest News and Information

Technology Pulse

Data Science Dojo

Publication

Educational AI data science and machine learning tutorials and talks

Recent Posts

Exploring the Origins with Word2Vec | Vector Databases for Beginners | Part 3
Video•Dec 13, 2025

The video "Exploring the Origins with Word2Vec | Vector Databases for Beginners | Part 3" walks viewers through the historical breakthrough that introduced word embeddings, focusing on the Word2Vec model and its role in turning raw text into numeric vectors. The presenter frames the discussion around a fundamental question, namely how a neural network learns to encode language, before diving into the mechanics of the original Word2Vec architecture.

Key technical insights are laid out step‑by‑step. Word2Vec was trained on a corpus exceeding 100 billion words using a shallow neural network that predicts surrounding words (the skip‑gram approach). By repeatedly feeding an input word and adjusting the network to minimize the error between its predicted context and the actual neighboring words, the model gradually learns vector representations that capture semantic relationships. The speaker illustrates the process with a concrete example: feeding the word “not” and expecting the model to predict “thou,” showing how an incorrect prediction (e.g., “taco”) triggers back‑propagation to refine the embeddings.

The presenter also highlights practical limitations that have shaped subsequent research. Word2Vec operates at the word level, so sentence‑level embeddings are cumbersome and require post‑hoc combinations of word vectors. Moreover, it assigns a single vector to each polysemous word, ignoring distinct senses. The “bank” example underscores this shortcoming: the model cannot differentiate between a financial institution, a riverbank, or a verb.

Finally, the video positions Word2Vec as the conceptual foundation for modern embedding techniques and vector databases used in search, recommendation, and AI‑driven analytics. Understanding its architecture and constraints helps businesses evaluate the suitability of legacy embeddings versus newer contextual models, informing decisions about data pipelines, storage strategies, and the scalability of AI solutions.
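
The skip‑gram training loop described in this summary can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the original Word2Vec implementation: the toy corpus, the function name train_skipgram, the embedding size, the learning rate, and the use of a full softmax (in place of the sampling tricks production systems rely on) are all hypothetical choices made for readability.

```python
import numpy as np

def train_skipgram(tokens, dim=8, window=1, lr=0.1, epochs=200, seed=0):
    """Train a tiny skip-gram model with a full softmax (illustrative only)."""
    vocab = sorted(set(tokens))
    index = {word: i for i, word in enumerate(vocab)}
    rng = np.random.default_rng(seed)
    w_in = rng.normal(scale=0.1, size=(len(vocab), dim))   # one embedding row per word
    w_out = rng.normal(scale=0.1, size=(dim, len(vocab)))  # weights that score context words

    # Build (center word, neighboring word) pairs from a sliding window.
    pairs = []
    for i, word in enumerate(tokens):
        for j in range(max(0, i - window), min(len(tokens), i + window + 1)):
            if j != i:
                pairs.append((index[word], index[tokens[j]]))

    losses = []
    for _ in range(epochs):
        total = 0.0
        for center, context in pairs:
            h = w_in[center].copy()              # the hidden layer is the input word's embedding
            scores = h @ w_out
            probs = np.exp(scores - scores.max())
            probs /= probs.sum()                 # softmax over the whole vocabulary
            total += -np.log(probs[context])     # cross-entropy against the true neighbor
            grad_scores = probs.copy()
            grad_scores[context] -= 1.0          # error between prediction and actual context
            grad_h = w_out @ grad_scores         # back-propagate to the embedding
            w_out -= lr * np.outer(h, grad_scores)
            w_in[center] -= lr * grad_h
        losses.append(total / len(pairs))
    return vocab, w_in, w_out, losses

# Hypothetical toy corpus; a real model would see billions of words.
tokens = "thou shalt not kill thou shalt not steal".split()
vocab, w_in, w_out, losses = train_skipgram(tokens)
print(f"mean loss: {losses[0]:.3f} -> {losses[-1]:.3f}")
```

Each row of w_in is the learned vector for one vocabulary word. Because there is exactly one row per word, a polysemous word such as "bank" would share a single vector across all of its senses, which is the limitation the video points out.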

By Data Science Dojo
What Are Vectors? | Vector Databases for Beginners | Part 2
Video•Dec 13, 2025

The video provides a beginner‑friendly overview of vector embeddings, tracing their academic roots back to early 2000s research and highlighting the watershed 2013 Word2Vec paper that brought vectors into mainstream industry use. It then connects that breakthrough to the later...

By Data Science Dojo
Why Your Brain Learns Better With A Good Story? Joshua Starmer X Data Science Dojo
Video•Dec 11, 2025

The video featuring Joshua Starmer and Data Science Dojo argues that storytelling is not a peripheral flourish but a core pedagogical tool, even when the subject matter is as technical as mathematics or machine learning. The speakers contend that a...

By Data Science Dojo
Deep Agents with LangGraph: From Planning to Persistent Reasoning | Community Webinar
Video•Dec 11, 2025

The webinar introduced Deep Agents built on LangGraph, positioning them as the next evolution in multi‑agent AI systems. Presenter Sajir Heather Zaddi, a senior software engineer specializing in LLM fine‑tuning and agentic workflows, framed the discussion around a recent tweet...

By Data Science Dojo
Introduction to Vector Embeddings | Vector Databases for Beginners | Part 1
Video•Dec 9, 2025

The video serves as an introductory tutorial on vector embeddings, presented by machine‑learning engineer Victoria Slocum in partnership with Data Science Dojo. Slocum frames embeddings as the bridge between raw media—text, images, audio, video—and the numerical representations that power modern AI...

By Data Science Dojo
How To Stay Ahead In A World Where AI Can Possibly Replace You? | Jay Alammar X Data Science Dojo
Video•Dec 9, 2025

The video features a conversation between AI educator Jay Alammar and Data Science Dojo on how knowledge workers can stay ahead in an economy where generative AI threatens to automate many tasks. The hosts frame the discussion around the age‑old...

By Data Science Dojo
Workshop: Transformer Models with @SerranoAcademy | Future of Data and AI | Agentic AI Conference
Video•Dec 8, 2025

The workshop hosted by Luis Serrano at the Agentic AI Conference provided a deep‑dive into transformer models, focusing on their architecture, practical strengths and weaknesses, and emerging techniques such as Retrieval‑Augmented Generation (RAG) and autonomous agents. After a brief introduction...

By Data Science Dojo
Running the Workflow & Final Output | Multi Agent Workflows for Beginners | Part 10
Video•Dec 4, 2025

The video demonstrates running a multi-agent workflow where a supervisor routes tasks to specialized agents: a coder agent that generates complete HTML/CSS/JavaScript portfolio code and a researcher agent that produces a structured, iterative research report on radiology. The presenter runs...

By Data Science Dojo
Should You Trust ChatGPT With Your Data? | Jerry Liu X Data Science Dojo
Video•Dec 1, 2025

Speakers argue that for most individual users, uploading personal or mundane documents to ChatGPT (or similar tools) poses minimal risk because OpenAI does not broadly use such data traces for model training. However, companies and users handling highly sensitive, classified,...

By Data Science Dojo
Workshop: Building Smarter Agents, Faster with Arize | Future of Data and AI | Agentic AI Conference
Video•Dec 1, 2025

Arize hosted a three-hour interactive workshop at the Agentic AI Conference to teach practitioners how to build and deploy smarter agents quickly. Product and community leads walked attendees through core concepts—RAG, tool-calling, model composition and evaluation—and provided hands-on Python labs...

By Data Science Dojo
Building the Agent Graph | Multi Agent Workflows for Beginners | Part 9
Video•Nov 30, 2025

The presenter walks through constructing an agent graph for a multi-agent workflow, demonstrating how to define nodes (researcher, coder, supervisor), import required libraries, and instantiate a class to set up the workflow. They explain adding conditional edges that route decisions...
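
The routing idea in this summary, where a supervisor decides which node runs next, can be sketched in plain Python. This is a hedged illustration only: it does not use LangGraph or any API shown in the video, and the names ROUTES, NODES, run, and the task labels are hypothetical. A real supervisor would typically ask an LLM for the routing decision, whereas this one simply picks the first unfinished task.

```python
from typing import Callable, Dict, List

State = Dict[str, List[str]]

def researcher(state: State) -> State:
    # Stand-in for an agent that would gather and summarize information.
    state["done"].append("research")
    return state

def coder(state: State) -> State:
    # Stand-in for an agent that would generate code.
    state["done"].append("code")
    return state

# Hypothetical mapping from a pending task to the node that handles it.
ROUTES: Dict[str, str] = {"research": "researcher", "code": "coder"}
NODES: Dict[str, Callable[[State], State]] = {"researcher": researcher, "coder": coder}

def supervisor(state: State) -> str:
    """Conditional edge: return the next node name, or FINISH when nothing is left."""
    for task in state["todo"]:
        if task not in state["done"]:
            return ROUTES[task]
    return "FINISH"

def run(state: State, max_steps: int = 10) -> List[str]:
    """Loop supervisor -> worker -> supervisor until FINISH or the step limit."""
    visited: List[str] = []
    for _ in range(max_steps):
        next_node = supervisor(state)
        if next_node == "FINISH":
            break
        visited.append(next_node)
        state = NODES[next_node](state)
    return visited

visited = run({"todo": ["research", "code"], "done": []})
print(visited)
```

The step limit is a simple guard against a supervisor that never returns FINISH; graph frameworks usually expose a comparable safeguard.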

By Data Science Dojo
Setting Up the Supervisor Agent | Multi Agent Workflows for Beginners | Part 8
Video•Nov 29, 2025

The video demonstrates setting up a Supervisor Agent as part of a multi-agent workflow. It walks through helper utilities, the agent’s message block and system prompt, and a prompt template that decides which agent should act next. The presenter names...

By Data Science Dojo
Setting Up Tools for Your Agents | Multi Agent Workflows for Beginners | Part 7
Video•Nov 27, 2025

The video walks through setting up tools and a supervisor agent for multi-agent workflows, using slides and screenshots to explain architecture rather than live coding. The instructor shows creating two tools—a web search tool and a Python REPL tool—importing and...

By Data Science Dojo
Workshop: Agentic AI for Semantic Search | Future of Data and AI | Agentic AI Conference
Video•Nov 26, 2025

Pinecone hosted a three-hour workshop titled “Agentic AI for Semantic Search” that walked developers through the theory and hands-on construction of agent-driven semantic search applications. Hosts from Pinecone introduced agentic AI concepts, detailed Pinecone’s vector database architecture and differentiators, and...

By Data Science Dojo
Coding & Environment Setup | Multi Agent Workflows for Beginners | Part 6
Video•Nov 24, 2025

The video walks through a hands-on notebook that builds a multi-agent supervisor: after installing required Python packages (langchain, langsmith, pandas, etc.) and setting environment variables, the instructor creates a supervisor agent that can route queries to two specialist agents. The...

By Data Science Dojo
