EDUREKA Data Science and ML Internship Program V2 - Program Brochure


EDUREKA

DATA SCIENCE AND MACHINE LEARNING


INTERNSHIP PROGRAM
About Edureka
Edureka is one of the world’s largest and most effective online education
platforms for students. Our easy and affordable learning solution is accessible to
millions who aspire to be technology professionals.

11 Years of Excellence
176 Countries Reached
100,000+ Learner Community

Ever since our inception, we have dedicated ourselves towards helping students
and working professionals learn Programming, Data Science, Big Data, Cloud
Computing, DevOps, Business Analytics, Mobile Technologies, Software Testing,
Web Development, System Engineering, Project Management, Digital Marketing,
Business Intelligence, Cybersecurity, RPA and more.

About the Program


Edureka’s Data Science and Machine Learning Internship Program is carefully
designed by industry experts to help learners master the foundational skills
indispensable for Data Science professionals. Throughout the program, you will
learn Python from Level 0, master Microsoft SQL Server, learn important Machine
Learning concepts, build Machine Learning and Deep Learning models, and practice
Data Analysis, Data Management, and much more. All this will help you get
acquainted with the technical responsibilities that you will be handling at your
future company.

Highest Course Completion Rate
Our learners' course completion rate is above 80%

Lifetime Course Access
Enjoy lifetime access to complete course content

University & Corporate Alliances
Courses curated in partnership with IITs, NITs & top MNCs

24/7 Live Support
Get real-time doubt resolution for technical & general queries
INTERNSHIP PROGRAM CURRICULUM I 01

CONTENT
MODULE 1
Python
Python Basics
Python Essentials
Control Flow in Python
Functions in Python
Packages, Modules, and File Handling
NumPy Arrays
NumPy Functions
Array Manipulation
Pandas
Functionality of Pandas Dataframe
Functionalities
Data Visualization
Web Scraping

MODULE 2
Probability and Statistics
Statistical Analysis in Data Science
Measures of Dispersion and Position
Exploratory Data Analysis - I
Exploratory Data Analysis - II
Probability - I
Probability - II
Probability Distribution, Skewness, and Kurtosis
Probability Distribution Essentials
Inferential Statistics
Hypothesis Tests

MINI PROJECT 1
Travel Aggregator Analysis

MODULE 3
Database Management using Microsoft SQL Server
Database Essentials
Querying Data using Built-in Functions and T-SQL - I
Querying Data using Built-in Functions and T-SQL - II
Advanced SQL - I
Advanced SQL - II
Advanced SQL - III

MINI PROJECT 2
Human Resource Management System

MODULE 4
Supervised Machine Learning
Introduction to Machine Learning
Regression
Regularization Techniques
Classification
Logistic Regression
Decision Tree
Naïve Bayes Classification
k Nearest Neighbor
Support Vector Machine

MODULE 5
Unsupervised Machine Learning
Dimensionality Reduction
LDA
Clustering - I
Clustering - II
Clustering - III
Association Rule Mining and Market Basket Analysis
Recommendation Engine
Time Series Analysis
Time Series Models Using Python

MODULE 6
Model Evaluation and Optimization
Model Selection
Model Evaluation
Performance Metrics
Hyperparameter Tuning
Bagging, Boosting, and Model Optimization
Linear Programming

MINI PROJECT 3
Heart Disease Prediction

MODULE 7
Deep Learning and NLP
Introduction to Deep Learning
Backpropagation
TensorFlow
CNN
RNN
LSTM
Reinforcement Learning
Introduction to NLP
Text Pre-processing
Feature Extraction
Sentiment Analysis

MODULE 8
Data Visualization using Tableau
Introduction to Tableau
Data Granularity and Sorting
Data Blending, Joins, and Unions
Calculations and Built-In Functions
Table Calculations, Parameters, and Level of Detail Calculations
Advanced Visualization Techniques
Maps, Effective Chart Usage, and Dashboard Basics
Interactive Dashboards and Story Points

MINI PROJECT 4
Core Infrastructure Company Data Visualization

CAPSTONE PROJECT
Movie Recommendation System like Netflix

TOOLS & TECHNOLOGIES

EVALUATIONS

CERTIFICATES

PLACEMENT ASSISTANCE

MODULE 1

Python

Day 0 - Data Science and Machine Learning Internship Program Orientation

Introduction to Data Science and Machine Learning Internship Program

What is Unique about this Internship?

Program Details and Features

Internship Class Schedule

Understanding Data Science and Machine Learning

Why Learn Data Science and Machine Learning

Market Demand for Data Science Professionals

Job Roles and Responsibilities

Day 1 - Python Basics

Basics of Python

Tokens

Keywords

Literals

Identifiers

Operators

Day 2 - Python Essentials

Variables

Python Data Types

Numbers

Strings

Lists

Tuples

Sets

Dictionaries

Day 3 - Control Flow in Python

Control Flow in Python

Conditional Statements

Loops

Nested Loops

Loop Control Statements


Day 4 - Functions in Python

Functions in Python

Types of Arguments

Range Function

Lambda Function

Day 5 - Packages, Modules, and File Handling

Packages and Modules in Python

File Handling

Day 6 - NumPy Arrays

What is NumPy?

Installing NumPy

N-dimensional array

Array Creation Routines

Arrays of ones and zeroes

From existing data

Using numerical ranges


Day 7 - NumPy Functions

Arithmetic Operators

Single Dimensional Arrays

Multi Dimensional Arrays

Matrix Product

Day 8 - Array Manipulation

Array Manipulation

Changing array shape

Transpose-like operations

Array manipulation

Joining arrays

Splitting arrays

Day 9 - Pandas

Introduction to Pandas Library in Python

Pandas Data Structures

Importing and Exporting Data Using Pandas


Day 10 - Functionality of Pandas DataFrame

Functionality of Pandas Series

Understanding Functionality of pandas DataFrame

Functionality of Pandas DataFrame

Updating Cells

Day 11 - Functionalities

Filtering a DataFrame

Concatenate and Merge Data

Join data

Combining Data

Data Cleaning Using Pandas – Use Case

Check for missing values

Handling missing values

Data Cleaning

Cleaning Data

Day 12 - Data Visualization

Data visualization

Libraries and tools for data visualization

Matplotlib

Types of plots and charts

Data visualization using seaborn

Matplotlib Plots and Charts

Plotting Different Types of Charts (Demonstration)

Customizing Visualizations

Customizing Visualizations Using Matplotlib (Demonstration)

Saving Plots

Subplots

Day 13 - Web Scraping

Basics of Web Scraping

Illustration of Web Scraping Process Flow

Beautiful Soup

Inspecting a Web Page

Fetching Book Categories Using Beautiful Soup


MODULE 2

Probability and Statistics

Day 14 - Statistical Analysis in Data Science

Introduction to Data Types

Categorical Data

Numerical Data

Data I/O and Data Types

Introduction to Statistics

Statistical Analysis Divisions

Introduction to Measures of Central Tendency

Calculate Measures of Central Tendency Using Python


Day 15 - Measures of Dispersion and Position

Introduction to Measures of Dispersion

Calculate Measures of Dispersion Using Python

Introduction to Measures of Position

Calculate Measures of Position Using Python

Day 16 - Exploratory Data Analysis - I

EDA

Basics of Univariate Non-Graphical EDA

Existence of Outliers

Detecting and Removing Outliers

Measures of Shape

Basics of Univariate Graphical EDA

Visualize Data Using Pie Chart

Visualize Data Using Histogram

Visualize Data Using Line Graph (Self-paced)

Visualize Data Using Box Plot (Self-paced)


Day 17 - Exploratory Data Analysis - II

Multivariate EDA

Analyzing Multivariate Non-Graphical EDA

Perform Cross Tabulation on Data

Covariance and Correlation

Analyzing Multivariate Graphical EDA

Visualize Data Using Scatter Plot

Heat Maps

Visualize Data Using Heat Map (Self-paced)

Summarizing Graphical EDA Techniques (Self-paced)

Day 18 - Probability - I

Need for Probability Theory

Basics of Probability

Working with Events

Day 19 - Probability - II

Calculating Probabilities

Conditional Probability

Introduction to Bayes’ Theorem

Bayes’ Theorem

Application of Bayes’ Theorem in Data Science

Expected Values

Applications of Conditional Probability

Day 20 - Probability Distribution, Skewness, and Kurtosis

Types of Probability Distributions

Introduction to Skewness

Introduction to Kurtosis

Normal Distribution and Z-Distribution


Day 21 - Probability Distribution Essentials

Introduction to Probability Distribution

Population and Sample

Student’s t-Distribution

Introduction to Sampling Distribution

Sampling Distribution and Standard Error

Central Limit Theorem

Day 22 - Inferential Statistics

Introduction to Inferential Statistics

Estimation

Bias of an Estimator

Point Estimation and Interval Estimation

MLE

Basics of Confidence Interval

Margin of Error and Confidence Interval

Hypothesis Testing (Self-paced)

Decision Errors and Decision Rules (Self-paced)

Statistical Test Interpretation (Self-paced)


Day 23 - Hypothesis Tests

Two-Tailed and One-Tailed Tests

Hypothesis Test For Single Population Mean: Z-Test

Calculating Z-Statistic

Hypothesis Test for Single Population Mean: T-Test

Independent Two Sample T-Test

Performing T-Tests

Introduction to Chi-Square Tests

Performing Chi-Square Tests


Day 24 - Interview Preparation


Python, Probability and Statistics

Day 25 - MINI PROJECT 1


Travel Aggregator Analysis

Python For Data Science


MODULE 3

Database Management using Microsoft SQL Server

Day 26 - Database Essentials

Transaction (Self-paced)

ER-Model (Self-paced)

Cardinality (Self-paced)

Degree of Relationship (Self-paced)

Client Server Architecture (Self-paced)

Microsoft SQL Server (Self-paced)

Microsoft SQL Server Management Studio (SSMS)

Degree of Relationship

Client Server Architecture

Normalization

1NF, 2NF, 3NF, 4NF and BCNF

Operation on Keys (Self-paced)

SQL Commands (Self-paced)

Day 27 - Querying Data using Built-in Functions and T-SQL - I

Transact Structured Query Language

Aggregate Functions

Day 28 - Querying Data using Built-in Functions and T-SQL - II

Scalar Functions

SQL Server Operators

Microsoft SQL Server Management Studio (SSMS)

Day 29 - Advanced SQL - I

SQL Queries

SELECT Statement

Clauses of SELECT Statement


Day 30 - Advanced SQL - II

SQL Commands

Subqueries

Subqueries with various operators

SQL Server View

Altering and Removing Views

Case Expressions

Day 31 - Advanced SQL - III

Stored Procedures

Life Cycle of Stored Procedures


Concurrency Control and its types

Ranking Functions

Date and Time Functions

UDFs (User Defined Functions) (Self-paced)

Backup and Restore Databases (Self-paced)

Triggers (Self-paced)

Index (Self-paced)

Introduction to Optimization (Self-paced)


Understanding Performance (Self-paced)

Optimizing Queries (Self-paced)

Indexing for Performance (Self-paced)

Performance Tuning (Self-paced)

Day 32 - Interview Preparation


Database Management using Microsoft SQL Server

Day 33 - MINI PROJECT 2


Human Resource Management System

Database Management with Microsoft SQL Server

MODULE 4

Supervised Machine Learning

Day 34 - Introduction to Machine Learning

What is Machine Learning?

AI vs. Machine Learning vs. Deep Learning

Significance of Machine Learning

Applications of Machine Learning

The myth about Machine Learning

Types of Machine Learning

Data Pre-processing Techniques

Train/Test split method


Day 35 - Regression

What is Supervised Learning?

Introduction to Regression

What is Linear Regression?

Assumptions of Linear Regression

Types of Linear Regression (Self-paced)

OLS Regression Results Summary (Self-paced)

Calculation of R² (Self-paced)

Gradient Descent (Self-paced)

Day 36 - Regularization Techniques

Regularization Techniques

Ridge Regression

Lasso Regression

Elastic Net Regression


Day 37 - Classification

What is Classification?

Classification vs. Regression

Types of Classification Algorithms

Use Case: Classification

Day 38 - Logistic Regression

Logistic Regression

What is Logistic Regression?

Log Odds

Logistic Regression Cost Function

Maximum Likelihood

Evaluation Parameters

Day 39 - Decision Tree

Decision Tree

Common Terminologies

Decision Tree using CART Algorithm

Decision Tree using ID3 Algorithm

Attribute Selection

Random Forest

Day 40 - Naïve Bayes Classification

Naïve Bayes Algorithm

Understanding Bayes Theorem

Working of Naïve Bayes

Types of Naïve Bayes in Scikit-learn

Naïve Bayes Classification (Demonstration)

Day 41 - k Nearest Neighbor

Understanding K Nearest Neighbor (KNN)

How does K Nearest Neighbor Work?

Significance of K in KNN Algorithm

K Nearest Neighbor Classifier (Demonstration)


Day 42 - Support Vector Machine

Introduction to Support Vector Machines

Introduction to Nonlinear SVMs

SVM (Demonstration)

MODULE 5

Unsupervised Machine Learning

Day 43 - Dimensionality Reduction

Curse of Dimensionality

Dimensionality Reduction

Understanding PCA

Working with Data

Performing PCA

Day 44 - LDA

How Linear Discriminant Analysis (LDA) Works

Dimensionality Reduction Using LDA

LDA vs. Principal Component Analysis (PCA)

Various Dimensionality Reduction Techniques

Implementing Common Techniques of Dimensionality Reduction

Day 45 - Clustering - I

Basics of Unsupervised Learning

Clustering Key Concepts

Types of Clustering

Implementing Hierarchical Clustering

Calculating the Value of K (Self-paced)

Day 46 - Clustering - II

Introduction to K-Means Clustering (Self-paced)

Calculating the Value of K (Self-paced)

K-Means Algorithm on 2D Plot (Self-paced)

Implementing K-Means Clustering (Self-paced)


Day 47 - Clustering - III

C-Means Clustering Algorithm

Pros and Cons of C-Means Clustering

DBSCAN Clustering

Day 48 - Association Rule Mining and Market Basket Analysis

Introduction to Association Rule Mining

Association Rule Mining: Parable

How to Find/Generate Association Rules?

Step-By-Step Market Basket Analysis With Apriori Algorithm (Self-paced)

Day 49 - Recommendation Engine

Important Measures to be Remembered

Recommendation Engine/Recommender System

Step-by-Step User-Based Collaborative Filtering (UBCF)

Content-Based Filtering (CBF)

Step-by-Step CBF

Day 50 - Time Series Analysis

Introduction to TSA

Components of Time Series

Forms of Data

Methods to Check for Stationarity of Data

AR, MA, and ARMA Models

Understanding ARIMA Model

Day 51 - Time Series Models Using Python

TSA

TSA Using Python (Demonstration)

Case Study: Association Rule Mining and TSA on Retail Store Data

MODULE 6

Model Evaluation and Optimization

Day 52 - Model Selection

What is Model Selection?

How Does K-Fold Cross-Validation Work?

Perform K-Fold Cross-Validation Using Python

Day 53 - Model Evaluation

Introduction to Model Evaluation

Metrics for Model Evaluation: Regression Models

Model Evaluation Metrics for Regression

Introduction to Model Evaluation Metrics in Classification Models

Day 54 - Performance Metrics

Confusion Matrix

How to Calculate a Confusion Matrix?

Introduction to ROC and AUC

Understanding Precision, Recall, and F1 Score

Day 55 - Hyperparameter Tuning

Understanding Hyperparameter Tuning

Types of Optimization

Grid Search

Perform Grid Search Using Python (Demonstration)


Day 56 - Bagging, Boosting, and Model Optimization

What is Ensemble Learning?

Bagging

Boosting

Adaptive Boosting Algorithm (AdaBoost)

Introduction to Gradient Boosting

Implementing Gradient Boosting Using Python


What is XGBoost? (Self-paced)

Implementing XGBoost Using Python (Self-paced)

Model Optimization (Self-paced)

Day 57 - Linear Programming

What is Linear Programming?

Linear Programming using PuLP in Python (Demonstration) (Self-paced)

Introduction to Formulating Optimization Problem (Self-paced)

Gradient Methods (Self-paced)

Second-Order Methods (Self-paced)

Day 58 - Interview Preparation


Machine Learning

Day 59 - MINI PROJECT 3


Heart Disease Prediction

Machine Learning

MODULE 7

Deep Learning and NLP

Day 60 - Introduction to Deep Learning

Deep Learning

Understanding Neural Networks

Working on a Neural Network

Perceptron

Activation Functions

Introduction to MLP

Day 61 - Backpropagation

Introduction to Backpropagation

Understanding Backpropagation Using Gradient Descent

Day 62 - TensorFlow

Understanding TensorFlow

TensorFlow 2.x and Keras

Understanding MNIST Digit Classification Using TensorFlow 2.x

Day 63 - CNN

Limitations of MLP

Basics of CNN

Understanding Image Recognition

Image Classification Using CNN


Day 64 - RNN

Understanding RNN

Introduction to Architecture of RNN

Use Case: Workout Schedule

Training an RNN

Disadvantages of Backpropagation

Day 65 - LSTM

Introduction to LSTM Networks

LSTM Structure

Day 66 - Reinforcement Learning

Introduction to RL

Introduction to RL Process

Elements of RL

Basics of RL Agent Taxonomy (Self-paced)

Introduction to OpenAI Gym (Self-paced)


Day 67 - Introduction to NLP

Natural Language Processing

NLTK for Natural Language Processing

Day 68 - Text Pre-processing

Text Preprocessing using NLTK

Tokenization

POS Tagging

Stop Words Removal

Stemming

Lemmatization

NER

WSD

Day 69 - Feature Extraction

Feature Extraction: Bag-of-Words

Feature Extraction: TF-IDF


Day 70 - Sentiment Analysis

What is Sentiment Analysis?

Understanding Sentiment Analysis Through Case Study


MODULE 8

Data Visualization using Tableau

Day 71 - Introduction to Tableau

Data Visualization

BI Tools

Introduction to Tableau

Data Connections

Bar Charts, Line Charts, and Pie Charts

Creating a Bar, Line, and Pie Chart


Day 72 - Data Granularity and Sorting

Hierarchies

Data Granularity

Sorting

Introduction to Data Grouping

Introduction to Filtering

Implement Filters

Sets

Implement Sets

Day 73 - Data Blending, Joins, and Unions

Introduction to Data Blending

Data Blending

Introduction to Joins and Unions

Day 74 - Calculations and Built-In Functions

Introduction to Calculations

Built-In Functions in Tableau


Day 75 - Table Calculations, Parameters, and Level of Detail Calculations

Introduction to Table Calculations

Parameters

Introduction to Level of Detail Calculations

LOD Use Cases

Day 76 - Advanced Visualization Techniques

Understanding Trend Lines

Trend Lines

Understanding Reference Lines, Bands, and Distributions

Forecasting

Implementing Forecasting

Clustering

Implementing Clustering

Day 77 - Maps, Effective Chart Usage, and Dashboard Basics

Types of Maps

Spatial Files

WMS

How to Use Charts Effectively?

Day 78 - Interactive Dashboards and Story Points

Basics of Dashboards

Dashboard Interface

Dashboard Objects

Adding Objects to the Dashboard

Building a Dashboard

Introduction to Dashboard Layouts and Formatting

Interactive Dashboards Using Actions

Designing Dashboards for Devices

Basics of Story Points (Self-paced)

Visual Best Practices (Self-paced)


Day 79 - Interview Preparation


Deep Learning and Data Visualization

Day 80 - MINI PROJECT 4


Core Infrastructure Company Data Visualization

Deep Learning and Data Visualization Using Tableau

Day 81-82 - CAPSTONE PROJECT


Build a Movie Recommender System like Netflix

Day 83 - Mock Exam and Real-world Use Cases

Day 84 - FINAL ASSESSMENT

Day 85-86 - Profile Building

Day 87-88 - Logical Reasoning


Day 89-90 - Aptitude

Day 91-92 - Communication Skills

Day 93-96 - Mock Interviews


Tools and Technologies


Python

NumPy

Jupyter

Machine Learning

NLTK

Seaborn

Deep Learning


Evaluations:

After completion of the course, final evaluations are conducted:

1 Final project submission (Industry-grade Project)

2 Final project evaluation

3 Mock Interview round conducted by Data Science Professionals who are Subject Matter Experts

Certifications:

You will get one of the three certificates based on your performance in the evaluation:

1 Course Completion Certificate (Final evaluation is NOT mandatory)

2 Edureka Internship Certificate (Final evaluation is mandatory)

3 Edureka Super Intern Certificate (Final evaluation is mandatory)

Placement Assistance:

1 Resume Building

2 Professional skill development session

3 Increasing online visibility on platforms like LinkedIn, Naukri, etc.

4 Additional Interview Preparation study material

5 Placement assistance with Elevayt


Thank you!

Data Science and Machine Learning Internship Program Curriculum
