Decode-AiML

We’re on a mission to build the world’s most structured and industry-relevant AI course, crafted to help you crack interviews at MAANG and top startups. And the best part? It’s 100% FREE!

Section 6 - EDA & Feature Engineering - Basic to Advanced

Welcome to the series: EDA & Feature Engineering - Basic to Advanced. Notes Link: EDA & Feature Engineering - Basic to Advanced

Github Code Repository Link: Decode-AiML Repository


Lecture 6.1) Introduction to Exploratory Data Analysis (EDA) & Feature Engineering - ML Lifecycle - Hindi

Video Link: Watch on YouTube

Video Description:

📘 Topics Covered:

  1. Machine Learning Lifecycle Explained
  2. Introduction to Exploratory Data Analysis (EDA)
  3. Common Techniques of EDA with Examples
  4. Introduction to Feature Engineering
  5. Key Steps in Feature Engineering
  6. Why are EDA & Feature Engineering important in Machine Learning?

#EDA #FeatureEngineering #MachineLearning #DataScience #ExploratoryDataAnalysis #MLLifecycle #DecodeAiML


Lecture 6.2) Exploratory Data Analysis in Seaborn - Housing Prices Dataset EDA - Hindi

Video Link: Watch on YouTube

Video Description:

📘 Topics Covered:

  1. Introduction to House Price Prediction Dataset from Kaggle
  2. Loading dataset into a Pandas DataFrame
  3. Exploring dataset using built-in methods – info(), describe(), head(), columns
  4. Seaborn Introduction – Seaborn vs Matplotlib vs Pandas built-ins
  5. Univariate EDA – Bar Plot, KDE Plot & Box Plot in Seaborn
  6. Outliers in Data – Detection & Handling using clip() in Pandas
  7. Bivariate EDA – Scatter Plot & Box Plot in Seaborn
  8. Multivariate EDA – Heatmap visualization with Seaborn

#Seaborn #ExploratoryDataAnalysis #HousingPrices #HousingPricesPrediction #HousingPricesDataset #PythonForDataScience #DataAnalysis #EDA #MachineLearning #Pandas #DataVisualization #DecodeAiML
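
The inspection and outlier-capping steps listed for Lecture 6.2 can be sketched roughly as follows. This is a minimal illustration, not the lecture's notebook: the DataFrame is synthetic, the column names (`area_sqft`, `bedrooms`, `price`) are assumptions standing in for the Kaggle dataset, and the 1st/99th percentile bounds are just one common choice for `clip()`.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the Kaggle housing data; all column names are assumptions.
rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({
    "area_sqft": rng.normal(1500, 300, n).round(),
    "bedrooms": rng.integers(1, 6, n),
})
df["price"] = df["area_sqft"] * 120 + df["bedrooms"] * 10_000 + rng.normal(0, 20_000, n)
df.loc[0, "price"] = 5_000_000  # inject one extreme value to act as an outlier

# Built-in inspection methods
print(df.head())
df.info()
print(df.describe())
print(df.columns.tolist())

# Cap the price at its 1st and 99th percentiles with clip()
lower, upper = df["price"].quantile([0.01, 0.99])
df["price_clipped"] = df["price"].clip(lower=lower, upper=upper)

# Correlation matrix; this is the table a Seaborn heatmap would visualise
corr = df[["area_sqft", "bedrooms", "price_clipped"]].corr()
print(corr.round(2))
```

With Seaborn installed, `sns.boxplot(x=df["price"])` before and after clipping shows the effect on the outlier, and `sns.heatmap(corr, annot=True)` draws the multivariate view.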


Lecture 6.3) Handling Outliers in Feature Engineering - Z-Score, IQR, Transformations Explained - Hindi

Video Link: Watch on YouTube

Video Description:

📘 Topics Covered:

  1. What are outliers and why they matter in data.
  2. Sources of outliers in datasets.
  3. Detecting outliers with examples and Python code.
  4. Different methods for handling outliers in Python.
  5. Understanding Z-score – definition, calculation, and Python implementation.
  6. Handling outliers using Capping/Winsorization with code.
  7. Capping vs Transformation – when to use which approach.
  8. Applying Log, Square Root, and Yeo-Johnson Transformations in Python.
  9. Choosing the right method to handle outliers.
  10. Final summary and key takeaways.

#Outliers #DataPreprocessing #ZScore #DataScienceHindi #PythonForDataScience #MachineLearning #EDA #FeatureEngineering #Statistics #PythonCode #DecodeAiML
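
As a rough sketch of the techniques named in Lecture 6.3, the snippet below detects outliers with the Z-score and IQR rules, caps them (winsorization), and applies log, square-root and Yeo-Johnson transformations. The sample values, the |z| > 3 cutoff and the 1.5 × IQR fences are conventional choices assumed here for illustration; the lecture may use different data and thresholds.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import PowerTransformer

# Small illustrative sample with one obvious extreme value (95)
values = pd.Series([10, 12, 11, 13, 12, 11, 10, 12, 13, 11, 12, 95], dtype=float)

# Z-score: distance from the mean in units of standard deviation
z = (values - values.mean()) / values.std(ddof=0)
z_outliers = values[z.abs() > 3]

# IQR rule: flag points beyond 1.5 * IQR from the quartiles
q1, q3 = values.quantile(0.25), values.quantile(0.75)
iqr = q3 - q1
lower_fence, upper_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
iqr_outliers = values[(values < lower_fence) | (values > upper_fence)]

# Capping / winsorization: pull extreme values back to the fences
capped = values.clip(lower=lower_fence, upper=upper_fence)

# Transformations: compress the long right tail instead of capping it
log_t = np.log1p(values)      # log(1 + x), valid for x >= 0
sqrt_t = np.sqrt(values)      # valid for x >= 0
yeo_johnson = PowerTransformer(method="yeo-johnson").fit_transform(values.to_frame())

print("Z-score outliers:", z_outliers.tolist())
print("IQR outliers:", iqr_outliers.tolist())
print("Capped max:", capped.max())
```

Capping keeps the row but discards how extreme the value was, while a transformation keeps the ordering of all points and only shrinks the gaps between them.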


Lecture 6.4) Handling Missing Values in EDA - Simple, Iterative & KNN Imputer - Hindi

Video Link: Watch on YouTube

Video Description:

📘 Topics Covered:

  1. What are missing values and why they matter in EDA?
  2. Dropping rows/columns based on threshold score.
  3. Replacing null values with constants or computed values.
  4. Using SimpleImputer in sklearn with examples.
  5. Time series imputation techniques – forward fill & backward fill.
  6. Multivariate approaches for handling missing values.
  7. Iterative Imputer explained with sklearn implementation.
  8. KNN Imputer explained with sklearn implementation.

#MissingValues #DataPreprocessing #EDA #PythonForDataScience #MachineLearning #FeatureEngineering #SimpleImputer #KNNImputer #IterativeImputer #DataCleaning #DecodeAiML
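
The imputation approaches from Lecture 6.4 could look something like the sketch below. The tiny DataFrame and its column names are invented for illustration, and the thresholds and `n_neighbors=2` are arbitrary assumptions. Note that `IterativeImputer` is still marked experimental in scikit-learn, which is why the extra `enable_iterative_imputer` import is needed.

```python
import numpy as np
import pandas as pd
from sklearn.experimental import enable_iterative_imputer  # noqa: F401 (exposes IterativeImputer)
from sklearn.impute import IterativeImputer, KNNImputer, SimpleImputer

# Hypothetical data with gaps; column names are assumptions
df = pd.DataFrame({
    "age":    [25, np.nan, 35, 40, np.nan, 50, np.nan],
    "income": [30, 35, np.nan, 60, 65, 80, np.nan],
    "score":  [1.0, 2.0, 3.0, np.nan, 5.0, 6.0, 7.0],
})

# Threshold-based dropping: keep rows with at least 2 non-null values,
# and keep columns where less than half of the values are missing
rows_kept = df.dropna(thresh=2)
cols_kept = df.loc[:, df.isna().mean() < 0.5]

# Univariate: replace each gap with the column median
median_filled = SimpleImputer(strategy="median").fit_transform(rows_kept)

# Time-series style: carry the last value forward, then back-fill any leading gap
ffilled = rows_kept.ffill().bfill()

# Multivariate: estimate each gap from the other columns
knn_filled = KNNImputer(n_neighbors=2).fit_transform(rows_kept)
iter_filled = IterativeImputer(random_state=0, max_iter=10).fit_transform(rows_kept)

print(pd.DataFrame(median_filled, columns=rows_kept.columns))
```

Forward fill only makes sense when the rows are ordered in time; for unordered tabular data the median, KNN or iterative approaches are the usual candidates.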


Lecture 6.5) Categorical Data Encoding - Label, One-Hot, Ordinal & Target Encoding - Hindi

Video Link: Watch on YouTube

Video Description:

📘 Topics Covered:

  1. What are categorical features and why they matter in ML.
  2. Types of categorical data – Nominal vs Ordinal.
  3. Creating a synthetic dataset in pandas to simulate categorical features.
  4. Identifying categorical columns in a dataset.
  5. EDA & visualization of categorical data.
  6. Encoding techniques – Label, One-Hot, Ordinal & Target Encoding.
  7. LabelEncoding vs OrdinalEncoding explained.
  8. Implementing LabelEncoder & OrdinalEncoder in sklearn.
  9. OneHotEncoding with low cardinality categorical data using sklearn.
  10. OneHotEncoding explained with examples, limitations & why we drop one column.

#MachineLearning #DataScience #FeatureEngineering #DataPreprocessing #DataScienceTutorial #MachineLearningHindi #CategoricalData #CategoricalEncoding #LabelEncoding #OneHotEncoding #OrdinalEncoding #TargetEncoding #EncodingTechniques #Python #PythonProgramming #ScikitLearn #Pandas #Numpy #DSATutorial #MLForBeginners #LearnDataScience #DataScienceProjects #MLHindi #DecodeAiML
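
A compact sketch of the four encoders from Lecture 6.5 is given below. The dataset is synthetic and the column names (`city`, `size`, `bought`) are assumptions. The target encoding shown is the plain per-category mean computed with pandas, without the smoothing or cross-fitting a production setup would normally add, so treat it as an illustration of the idea only.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder, OneHotEncoder, OrdinalEncoder

# Synthetic data; column names are assumptions
df = pd.DataFrame({
    "city":   ["Delhi", "Mumbai", "Pune", "Delhi", "Pune", "Mumbai"],    # nominal
    "size":   ["small", "large", "medium", "medium", "small", "large"],  # ordinal
    "bought": ["no", "yes", "yes", "no", "no", "yes"],                   # target
})

# LabelEncoder is meant for the target column; classes are sorted alphabetically
y = LabelEncoder().fit_transform(df["bought"])

# OrdinalEncoder is meant for features; pass the category order explicitly
size_encoder = OrdinalEncoder(categories=[["small", "medium", "large"]])
size_encoded = size_encoder.fit_transform(df[["size"]])

# One-hot encoding for a low-cardinality nominal feature;
# drop="first" removes one redundant column (its value is implied by the others)
onehot = OneHotEncoder(drop="first")
city_onehot = onehot.fit_transform(df[["city"]]).toarray()

# Naive target (mean) encoding: replace each city with its mean target value
city_target_mean = df.assign(y=y).groupby("city")["y"].mean()
city_target_encoded = df["city"].map(city_target_mean)

print(y, size_encoded.ravel(), city_onehot, city_target_encoded.tolist(), sep="\n")
```

Because the target mean is computed from the labels themselves, it should be fitted on training rows only when used in a real model.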


Lecture 6.6) Feature Creation & Feature Engineering Explained - Hindi

Video Link: Watch on YouTube

Video Description:

📘 Topics Covered:

  1. Creating a synthetic e-commerce dataset for ML.
  2. Generating new features from existing data using domain knowledge.
  3. Handling datetime columns and extracting meaningful features.
  4. Creating customer-level and category-level aggregates.
  5. Handling categorical data with encoding techniques.
  6. Preprocessing text features from reviews.
  7. Creating interaction features and engineered numerical features.
  8. Feature selection to retain useful variables.
  9. Building a pipeline for feature engineering.
  10. Training a model with engineered features for better performance.

#MachineLearning #FeatureEngineering #FeatureCreation #DataScience #MachineLearningHindi #DataScienceProjects #Python #ScikitLearn #Pandas #Numpy #MLPipeline #FeatureSelection #TextFeatures #CategoricalData #DateTimeFeatures #DecodeAiML #DSATutorial #MLForBeginners #LearnDataScience
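
Some of the feature-creation steps from Lecture 6.6 (datetime features, customer-level and category-level aggregates, and an interaction feature) might be sketched like this. The e-commerce table is made up, and every column name is a hypothetical choice rather than the dataset used in the video.

```python
import pandas as pd

# Made-up e-commerce orders; all column names are assumptions
orders = pd.DataFrame({
    "customer_id": [1, 1, 2, 2, 2, 3],
    "category": ["books", "toys", "books", "books", "toys", "toys"],
    "order_time": pd.to_datetime([
        "2024-01-06 10:15", "2024-01-08 21:40", "2024-01-07 09:05",
        "2024-01-09 14:30", "2024-01-13 18:00", "2024-01-10 08:20",
    ]),
    "price": [200.0, 500.0, 150.0, 250.0, 400.0, 600.0],
    "quantity": [2, 1, 4, 1, 2, 1],
})

# Interaction feature from domain knowledge: total value of the order
orders["order_value"] = orders["price"] * orders["quantity"]

# Datetime features
orders["order_hour"] = orders["order_time"].dt.hour
orders["order_dayofweek"] = orders["order_time"].dt.dayofweek  # Monday = 0
orders["is_weekend"] = orders["order_dayofweek"] >= 5

# Customer-level and category-level aggregates, broadcast back to each row
by_customer = orders.groupby("customer_id")["order_value"]
orders["cust_avg_value"] = by_customer.transform("mean")
orders["cust_order_count"] = by_customer.transform("count")
orders["cat_avg_price"] = orders.groupby("category")["price"].transform("mean")

# Ratio feature: how large is this order relative to the customer's usual order?
orders["value_vs_cust_avg"] = orders["order_value"] / orders["cust_avg_value"]

print(orders.drop(columns=["order_time"]))
```

Using `transform` instead of `agg` keeps the result aligned with the original rows, so the aggregate can be assigned directly as a new column.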


Lecture 6.7) Feature Scaling in Machine Learning - MinMaxScaler, StandardScaler, RobustScaler - Hindi

Video Link: Watch on YouTube

Video Description:

📘 Topics Covered:

  1. What is Feature Scaling & why is it important in ML?
  2. Types of Feature Scaling techniques.
  3. Min-Max Scaling – concept, use-cases & implementation in Python (scikit-learn).
  4. Standardization (Z-Score Scaling) – concept, use-cases & implementation in Python.
  5. RobustScaler – concept, use-cases & implementation in Python.
  6. Effect of scaling on data distribution – visual explanation.
  7. Checklist for choosing the right scaling method.
  8. Feature Scaling vs Feature Transformation explained with examples.

#FeatureScaling #MachineLearning #MinMaxScaler #StandardScaler #RobustScaler #DataPreprocessing #Python #ScikitLearn #DataScience #MachineLearningHindi #MLForBeginners #DecodeAiML #DataPreprocessingPython #FeatureEngineering #DSATutorial
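
To make the difference between the three scalers in Lecture 6.7 concrete, here is a small sketch on a single column that contains one outlier. The five sample values are an assumption chosen so the arithmetic is easy to follow by hand.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler, RobustScaler, StandardScaler

# One feature with an outlier (100) to show how each scaler reacts
X = np.array([[1.0], [2.0], [3.0], [4.0], [100.0]])

# Min-Max scaling: (x - min) / (max - min), output in [0, 1]
minmax = MinMaxScaler().fit_transform(X)

# Standardization: (x - mean) / std, output has mean 0 and std 1
standard = StandardScaler().fit_transform(X)

# Robust scaling: (x - median) / IQR, driven by the middle 50% of the data
robust = RobustScaler().fit_transform(X)

for name, scaled in [("minmax", minmax), ("standard", standard), ("robust", robust)]:
    print(name, np.round(scaled.ravel(), 3))
```

With Min-Max scaling the outlier squeezes the four ordinary values into the bottom few percent of the range, while RobustScaler keeps them evenly spaced around zero because the median (3) and IQR (2) ignore the extreme point.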


Lecture 6.8) Data Leakage in Machine Learning - Types of Data Leakage - Hindi

Video Link: Watch on YouTube

Video Description:

📘 Topics Covered:

  1. Introduction to Data Leakage in ML.
  2. Ecommerce Order dataset explained with practical examples.
  3. Temporal vs Non-temporal datasets.
  4. Types of Data Leakage explained with examples.
  5. Future Info Leakage, Train-Test Contamination, Temporal Leakage & Target Leakage explained with Python code.
  6. Checklist to avoid Data Leakage (with code).
  7. Drop leakage-prone features.
  8. Use time-based train-test split.
  9. Scale numeric features using only training statistics.
  10. Encode categorical features using only training categories.
  11. Replace categorical columns with encoded versions.

#DataLeakage #MachineLearning #FeatureEngineering #DataScience #MLTips #MLHindi #EcommerceDataset #Python #ScikitLearn #MLForBeginners #DecodeAiML #DataPreprocessing #AvoidDataLeakage #FeatureEngineeringInML #AIML
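
The leakage checklist from Lecture 6.8 (items 7 to 11) can be sketched along these lines. The order table is invented, the column names are assumptions, and treating `refund_issued` as the leakage-prone feature is a hypothetical example of information that only exists after the target (`returned`) is known. The 75/25 cutoff is likewise arbitrary.

```python
import pandas as pd
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Invented order data; column names are assumptions
orders = pd.DataFrame({
    "order_date": pd.to_datetime([
        "2024-01-01", "2024-01-02", "2024-01-03", "2024-01-04",
        "2024-01-05", "2024-01-06", "2024-01-07", "2024-01-08",
    ]),
    "amount": [100, 200, 150, 300, 250, 200, 400, 350],
    "payment_mode": ["card", "upi", "card", "cod", "upi", "card", "wallet", "upi"],
    "refund_issued": [0, 0, 1, 0, 0, 1, 0, 1],  # only known after a return: leakage-prone
    "returned": [0, 0, 1, 0, 0, 1, 0, 1],       # target
})

# 7. Drop leakage-prone features
features = orders.drop(columns=["refund_issued"])

# 8. Time-based split: train on the past, test on the future
features = features.sort_values("order_date").reset_index(drop=True)
cutoff = int(len(features) * 0.75)
train, test = features.iloc[:cutoff], features.iloc[cutoff:]

# 9. Fit the scaler on training rows only
scaler = StandardScaler().fit(train[["amount"]])

# 10. Fit the encoder on training categories only; unseen ones become all zeros
encoder = OneHotEncoder(handle_unknown="ignore").fit(train[["payment_mode"]])
mode_cols = [f"payment_mode_{c}" for c in encoder.categories_[0]]

def prepare(frame):
    """Apply the train-fitted scaler and encoder to any split."""
    out = frame.drop(columns=["payment_mode"]).reset_index(drop=True)
    out["amount"] = scaler.transform(frame[["amount"]]).ravel()
    onehot = pd.DataFrame(encoder.transform(frame[["payment_mode"]]).toarray(), columns=mode_cols)
    # 11. Replace the categorical column with its encoded version
    return pd.concat([out, onehot], axis=1)

train_final, test_final = prepare(train), prepare(test)
print(test_final)
```

The `wallet` payment mode appears only in the test period, so its one-hot row is all zeros; fitting the encoder on the full dataset instead would have let test-period information shape the training features.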