KDnuggets

Publication

Longstanding data science/analytics publication covering Big Data, ML, and data engineering.

A Guide to Kedro: Your Production-Ready Data Science Toolbox
News · Mar 4, 2026

QuantumBlack’s open‑source Kedro framework helps data scientists move from exploratory notebooks to production‑ready pipelines. The guide walks users through installing Kedro, setting up a project, defining a data catalog, building pipelines with nodes, and configuring parameters. It also covers optional...

By KDnuggets
5 Useful Python Scripts for Automated Data Quality Checks
News · Feb 26, 2026

The article presents five open‑source Python scripts that automate common data‑quality checks, ranging from missing‑value analysis to cross‑field consistency validation. Each tool reads CSV, Excel or JSON inputs, applies schema‑driven rules or statistical methods, and generates detailed reports with actionable...

By KDnuggets
5 Python Data Validation Libraries You Should Be Using
News · Feb 24, 2026

Data validation is gaining prominence as pipelines become more complex, and Python now offers a diverse set of libraries to address this need. The article reviews five tools—Pydantic, Cerberus, Marshmallow, Pandera, and Great Expectations—each targeting a different validation paradigm, from...

By KDnuggets
7 XGBoost Tricks for More Accurate Predictive Models
News · Feb 20, 2026

The article outlines seven practical XGBoost tricks that boost predictive accuracy on tabular data. It demonstrates how adjusting learning rate, tree depth, subsampling, regularization, early stopping, hyper‑parameter search, and class weighting can transform a baseline model. Code snippets using the...

By KDnuggets
FastMCP: The Pythonic Way to Build MCP Servers and Clients
News · Feb 19, 2026

FastMCP is a Python framework that streamlines building Model Context Protocol (MCP) servers and clients using decorator‑based abstractions. It handles JSON‑RPC 2.0 messaging, async execution, and multiple transports such as stdio, HTTP, WebSocket, and SSE, while providing built‑in error handling and...

By KDnuggets
From Messy to Clean: 8 Python Tricks for Effortless Data Preprocessing
News · Feb 18, 2026

The article outlines eight concise Python tricks that streamline data preprocessing, from normalizing column names to clipping outliers. Each technique uses pandas functions to handle whitespace, type conversion, date parsing, missing values, categorical standardization, duplicate removal, and quantile‑based capping. The...

By KDnuggets
All About Feature Stores
News · Feb 16, 2026

Feature stores have moved from niche tools to core infrastructure for operational machine learning, providing a single source of truth for features used in both training and online inference. The concept was coined by Uber in 2017 and commercialized by Tecton...

By KDnuggets
Learn Python, SQL and PowerBI to Become a Certified Data Analyst for FREE This Week
News · Feb 16, 2026

From February 16–22, DataCamp’s entire curriculum is 100% free.

By KDnuggets
Self-Hosted AI: A Complete Roadmap for Beginners
News · Feb 16, 2026

The article outlines a step‑by‑step roadmap for building a private AI hub using Docker, Ollama, and n8n. It targets beginners seeking to run large language models locally without relying on cloud providers. The guide emphasizes containerization, open‑source model serving, and...

By KDnuggets
Versioning and Testing Data Solutions: Applying CI and Unit Tests on Interview-Style Queries
News · Feb 11, 2026

The article walks through solving a Tesla interview question in Python, calculating each car maker’s net product launch change between 2019 and 2020 using pandas. It then refactors the script into a reusable function and adds a unit‑test suite to...

By KDnuggets
Building Your Modern Data Analytics Stack with Python, Parquet, and DuckDB
News · Feb 10, 2026

Modern data analytics can be streamlined using a trio of open‑source tools: Python for scripting, Parquet for columnar storage, and DuckDB as an in‑process SQL engine. The article demonstrates how DuckDB reads and writes Parquet files directly, eliminating data movement...

By KDnuggets
7 Python EDA Tricks to Find and Fix Data Issues
News · Feb 9, 2026

The article outlines seven practical Python techniques for early-stage exploratory data analysis aimed at uncovering and correcting data quality problems. It highlights core pandas functions, visualization tools, and string‑matching methods that streamline the detection of missing values, outliers, and duplicate...

By KDnuggets
Is Your Machine Learning Pipeline as Efficient as It Could Be?
News · Feb 6, 2026

Machine learning teams are increasingly overlooking pipeline efficiency, a hidden driver of productivity. Slow data I/O, redundant preprocessing, and mismatched compute inflate the iteration gap, limiting the number of hypotheses tested per week. The article outlines five audit areas—data ingestion,...

By KDnuggets
5 Open Source Image Editing AI Models
News · Feb 4, 2026

A new KDnuggets article spotlights five open‑source AI models that enable text‑driven image editing, ranging from Black Forest Labs' FLUX.2 [klein] 9B to Alibaba Cloud's Qwen‑Image‑Edit‑2511 and newer adapters like FLUX.2 [dev] Turbo. The models deliver real‑time generation, multi‑reference editing, bilingual support,...

By KDnuggets