## Overview

**What is data science?**

Data science combines math and statistics, specialized programming, advanced analytics, artificial intelligence (AI), and machine learning with specific subject matter expertise to uncover actionable insights hidden in an organization’s data.

These insights can be used to guide decision making and strategic planning.

**Why learn data science?**

Data science has the potential to improve the way we live and work, and it can empower others to make better decisions, solve problems, discover new advancements, and address some of the world’s most pressing issues. With a data science career, you can be a part of this transformation.

- Demand: Data scientists are in high demand.
- Earning potential: Data science careers have high earning potential.
- Growth: Data science is a fast-growing field.
- Opportunity: Data science offers a range of potential job opportunities.
- Flexibility: Data scientists are needed in various sectors.

**Who is called a data scientist?**

A data scientist is a professional who writes code and combines it with statistical knowledge to extract insights from data.

**How to learn data science?**

The best way to learn data science is to **work on projects** so you can gain data science skills that can be applied immediately and are useful from a real-world implementation perspective. The sooner you start working on diverse data science projects, the faster you will learn the related concepts.

TalhaTraining teaches you the essential concepts of the various data science disciplines, such as traditional data, big data, BI, traditional data science, and ML.

**Training Objectives**

We will teach you how to build the technical knowledge and skills needed to work as a data scientist.

**Prerequisites**

The course can be customized to any level of programming and relational database familiarity.

**Hands-on/Lecture Ratio**

This training class is 80% hands-on and 20% lecture. Students learn by doing, with immediate opportunities to apply what they learn to real-world problems.

**Training Materials**

All related software and lecture sheets will be provided in class.

**Training Objectives**

- Data Science – The Benefits
- Popular Data Science Techniques
- Popular Data Science Tools
- Careers in Data Science
- Debunking Common Misconceptions
- Probability
- Statistics
- Python Programming for Data Scientists
- Big Data and Spark with Python
- Advanced Statistical Methods
- Mathematics
- JavaScript for Data Scientists
- Deep Learning

**Training Outline**

**The Field of Data Science – The Various Data Science Disciplines**

- Data Science and Business Buzzwords: Why are there so Many?
- What is the difference between Analysis and Analytics?
- Business Analytics, Data Analytics, and Data Science: An Introduction
- Continuing with BI, ML, and AI
- A Breakdown of our Data Science Infographic

**The Field of Data Science – Connecting the Data Science Disciplines**

- Applying Traditional Data, Big Data, BI, Traditional Data Science and ML

**The Field of Data Science – The Benefits of Each Discipline**

- The Reason Behind These Disciplines

**The Field of Data Science – Popular Data Science Techniques**

- Techniques for Working with Traditional Data
- Real Life Examples of Traditional Data
- Techniques for Working with Big Data
- Real Life Examples of Big Data
- Business Intelligence (BI) Techniques
- Real Life Examples of Business Intelligence (BI)
- Techniques for Working with Traditional Methods
- Real Life Examples of Traditional Methods
- Machine Learning (ML) Techniques
- Types of Machine Learning
- Real Life Examples of Machine Learning (ML)

**The Field of Data Science – Popular Data Science Tools**

- Necessary Programming Languages and Software Used in Data Science

**The Field of Data Science – Careers in Data Science**

- Finding the Job – What to Expect and What to Look for

**The Field of Data Science – Debunking Common Misconceptions**

- Debunking Common Misconceptions

**Probability**

- The Basic Probability Formula
- Computing Expected Values
- Frequency
- Events and Their Complements

**Probability – Combinatorics**

- Fundamentals of Combinatorics
- Permutations and How to Use Them
- Simple Operations with Factorials
- Solving Variations with Repetition
- Solving Variations without Repetition
- Solving Combinations
- Symmetry of Combinations
- Solving Combinations with Separate Sample Spaces
- Combinatorics in Real-Life: The Lottery
- A Recap of Combinatorics
- A Practical Example of Combinatorics

**Probability – Bayesian Inference**

- Sets and Events
- Ways Sets Can Interact
- Intersection of Sets
- Union of Sets
- Mutually Exclusive Sets
- Dependence and Independence of Sets
- The Conditional Probability Formula
- The Law of Total Probability
- The Additive Rule
- The Multiplication Law
- Bayes’ Law
- A Practical Example of Bayesian Inference

**Probability – Distributions**

- Fundamentals of Probability Distributions
- Types of Probability Distributions
- Characteristics of Discrete Distributions
- Discrete Distributions: The Uniform Distribution
- Discrete Distributions: The Bernoulli Distribution
- Discrete Distributions: The Binomial Distribution
- Discrete Distributions: The Poisson Distribution
- Characteristics of Continuous Distributions
- Continuous Distributions: The Normal Distribution
- Continuous Distributions: The Standard Normal Distribution
- Continuous Distributions: The Student’s T Distribution
- Continuous Distributions: The Chi-Squared Distribution
- Continuous Distributions: The Exponential Distribution
- Continuous Distributions: The Logistic Distribution
- A Practical Example of Probability Distributions

**Probability – Probability in Other Fields**

- Probability in Finance
- Probability in Statistics
- Probability in Data Science

**Statistics**

- Population and Sample

**Statistics – Descriptive Statistics**

- Types of Data
- Levels of Measurement
- Categorical Variables – Visualization Techniques
- Categorical Variables Exercise
- Numerical Variables – Frequency Distribution Table
- Numerical Variables Exercise
- The Histogram
- Histogram Exercise
- Cross Tables and Scatter Plots
- Cross Tables and Scatter Plots Exercise
- Mean, Median and Mode
- Mean, Median and Mode Exercise
- Skewness
- Skewness Exercise
- Variance
- Variance Exercise
- Standard Deviation and Coefficient of Variation
- Standard Deviation
- Standard Deviation and Coefficient of Variation Exercise
- Covariance
- Covariance Exercise
- Correlation Coefficient
- Correlation
- Correlation Coefficient Exercise

**Statistics – Practical Example: Descriptive Statistics**

- Practical Example: Descriptive Statistics
- Practical Example: Descriptive Statistics Exercise

**Statistics – Inferential Statistics Fundamentals**

- Introduction
- What is a Distribution
- The Normal Distribution
- The Standard Normal Distribution
- The Standard Normal Distribution Exercise
- Central Limit Theorem
- Standard Error
- Estimators and Estimates

**Statistics – Inferential Statistics: Confidence Intervals**

- What are Confidence Intervals?
- Confidence Intervals; Population Variance Known; Z-score
- Confidence Intervals; Population Variance Known; Z-score; Exercise

**Statistics – Confidence Interval Clarifications**

- Student’s T Distribution
- Confidence Intervals; Population Variance Unknown; T-score
- Confidence Intervals; Population Variance Unknown; T-score; Exercise
- Margin of Error
- Confidence Intervals. Two Means. Dependent Samples
- Confidence Intervals. Two Means. Dependent Samples. Exercise
- Confidence Intervals. Two Means. Independent Samples (Part 1)
- Confidence Intervals. Two Means. Independent Samples (Part 1). Exercise
- Confidence Intervals. Two Means. Independent Samples (Part 2)
- Confidence Intervals. Two Means. Independent Samples (Part 2). Exercise
- Confidence Intervals. Two Means. Independent Samples (Part 3)

**Statistics – Practical Example: Inferential Statistics**

- Practical Example: Inferential Statistics
- Practical Example: Inferential Statistics Exercise

**Statistics – Hypothesis Testing**

- Null vs Alternative Hypothesis
- Further Reading on Null and Alternative Hypothesis
- Rejection Region and Significance Level
- Type I Error and Type II Error
- Test for the Mean. Population Variance Known
- Test for the Mean. Population Variance Known Exercise
- p-value
- Test for the Mean. Population Variance Unknown
- Test for the Mean. Population Variance Unknown Exercise
- Test for the Mean. Dependent Samples
- Test for the Mean. Dependent Samples Exercise
- Test for the Mean. Independent Samples (Part 1)
- Test for the Mean. Independent Samples (Part 1). Exercise
- Test for the Mean. Independent Samples (Part 2)
- Test for the Mean. Independent Samples (Part 2). Exercise

**Statistics – Practical Example: Hypothesis Testing**

- Practical Example: Hypothesis Testing
- Practical Example: Hypothesis Testing Exercise

**Introduction to Python**

- Introduction to Programming
- Why Python?
- Anaconda Details
- Why Jupyter?
- Installing Python and Jupyter
- Understanding Jupyter’s Interface – the Notebook Dashboard
- Prerequisites for Coding in the Jupyter Notebooks
- Jupyter Notebooks
- Jupyter’s Interface

**Python – Variables and Data Types**

- Variables
- Numbers and Boolean Values in Python
- Python Strings

**Python – Basic Python Syntax**

- Using Arithmetic Operators in Python
- The Double Equality Sign
- How to Reassign Values
- Add Comments
- Understanding Line Continuation
- Indexing Elements
- Structuring with Indentation

**Python – Other Python Operators**

- Comparison Operators
- Logical and Identity Operators

**Python – Conditional Statements**

- The IF Statement
- The ELSE Statement
- A Note on Boolean Values

**Python – Python Functions**

- Defining a Function in Python
- How to Create a Function with a Parameter
- Defining a Function in Python – Part II
- How to Use a Function within a Function
- Conditional Statements and Functions
- Functions Containing a Few Arguments
- Built-in Functions in Python
- Python Functions

**Python – Sequences**

- Lists
- Using Methods
- List Slicing
- Tuples
- Dictionaries

**Python – Iterations**

- For Loops
- While Loops and Incrementing
- Lists with the range() Function
- Conditional Statements and Loops
- Conditional Statements, Functions, and Loops
- How to Iterate over Dictionaries

**Python for Data Analysis – NumPy**

- Introduction to NumPy
- NumPy Arrays
- Array Indexing
- NumPy Array Indexing
- NumPy Operations
- NumPy Exercises Overview
- NumPy Exercises Solutions

**Python for Data Analysis – Pandas**

- Introduction to Pandas
- Series
- DataFrames – Part 1
- DataFrames – Part 2
- DataFrames – Part 3
- Missing Data
- Groupby
- Merging, Joining, and Concatenating
- Operations
- Data Input and Output

**Python for Data Analysis – Pandas Exercises**

- SF Salaries Exercise Overview
- SF Salaries Solutions
- Ecommerce Purchases Exercise Overview
- Ecommerce Purchases Exercise Solutions

**Python for Data Visualization – Matplotlib**

- Introduction to Matplotlib
- Matplotlib Part 1
- Matplotlib Part 2
- Matplotlib Part 3
- Matplotlib Exercises Overview
- Matplotlib Exercises – Solutions

**Python for Data Visualization – Seaborn**

- Introduction to Seaborn
- Distribution Plots
- Categorical Plots
- Matrix Plots
- Grids
- Regression Plots
- Style and Color
- Seaborn Exercise Overview
- Seaborn Exercise Solutions

**Python for Data Visualization – Pandas Built-in Data Visualization**

- Pandas Built-in Data Visualization
- Pandas Data Visualization Exercise
- Pandas Data Visualization Exercise – Solutions

**Python for Data Visualization – Plotly and Cufflinks**

- Introduction to Plotly and Cufflinks
- Plotly and Cufflinks

**Python for Data Visualization – Geographical Plotting**

- Introduction to Geographical Plotting
- Choropleth Maps – Part 1
- Choropleth Maps – Part 2
- Choropleth Exercises
- Choropleth Exercises – Solutions

**Big Data and Spark with Python**

- Welcome to the Big Data Section!
- Big Data Overview
- Spark Overview
- Local Spark Set-Up
- AWS Account Set-Up
- Quick Note on AWS Security
- EC2 Instance Set-Up
- SSH with Mac or Linux
- PySpark Setup
- Lambda Expressions Review
- Introduction to Spark and Python
- RDD Transformations and Actions

**Python – Advanced Python Tools**

- Object Oriented Programming
- Modules and Packages
- What is the Standard Library?
- Importing Modules in Python

**Advanced Statistical Methods in Python**

- Introduction to Regression Analysis

**Advanced Statistical Methods – Linear Regression with StatsModels**

- The Linear Regression Model
- Correlation vs Regression
- Geometrical Representation of the Linear Regression Model
- Python Packages Installation
- First Regression in Python
- First Regression in Python Exercise
- Using Seaborn for Graphs
- How to Interpret the Regression Table
- Decomposition of Variability
- What is the OLS?
- R-Squared

**Advanced Statistical Methods – Multiple Linear Regression with StatsModels**

- Multiple Linear Regression
- Adjusted R-Squared
- Multiple Linear Regression Exercise
- Test for Significance of the Model (F-Test)
- OLS Assumptions
- A1: Linearity
- A2: No Endogeneity
- A3: Normality and Homoscedasticity
- A4: No Autocorrelation
- A5: No Multicollinearity
- Dealing with Categorical Data – Dummy Variables
- Making Predictions with the Linear Regression

**Advanced Statistical Methods – Linear Regression with sklearn**

- What is sklearn and How is it Different from Other Packages?
- How are we Going to Approach this Section?
- Simple Linear Regression with sklearn
- Simple Linear Regression with sklearn – A StatsModels-like Summary Table
- A Note on Normalization
- Simple Linear Regression with sklearn – Exercise
- Multiple Linear Regression with sklearn
- Calculating the Adjusted R-Squared in sklearn
- Calculating the Adjusted R-Squared in sklearn – Exercise
- Feature Selection (F-regression)
- A Note on Calculation of P-values with sklearn
- Creating a Summary Table with P-values
- Multiple Linear Regression – Exercise
- Feature Scaling (Standardization)
- Feature Selection through Standardization of Weights
- Predicting with the Standardized Coefficients
- Feature Scaling (Standardization) – Exercise
- Underfitting and Overfitting
- Train – Test Split Explained

**Advanced Statistical Methods – Practical Example: Linear Regression**

- Practical Example: Linear Regression (Part 1)
- Practical Example: Linear Regression (Part 2)
- A Note on Multicollinearity
- Practical Example: Linear Regression (Part 3)
- Dummies and Variance Inflation Factor – Exercise
- Practical Example: Linear Regression (Part 4)
- Dummy Variables – Exercise
- Practical Example: Linear Regression (Part 5)
- Linear Regression – Exercise

**Advanced Statistical Methods – Logistic Regression**

- Introduction to Logistic Regression
- A Simple Example in Python
- Logistic vs Logit Function
- Building a Logistic Regression
- Building a Logistic Regression – Exercise
- An Invaluable Coding Tip
- Understanding Logistic Regression Tables
- Understanding Logistic Regression Tables – Exercise
- What do the Odds Actually Mean?
- Binary Predictors in a Logistic Regression
- Binary Predictors in a Logistic Regression – Exercise
- Calculating the Accuracy of the Model
- Underfitting and Overfitting
- Testing the Model
- Testing the Model – Exercise

**Advanced Statistical Methods – Cluster Analysis**

- Introduction to Cluster Analysis
- Some Examples of Clusters
- Difference between Classification and Clustering
- Math Prerequisites

**Advanced Statistical Methods – K-Means Clustering**

- K-Means Clustering
- A Simple Example of Clustering
- A Simple Example of Clustering – Exercise
- Clustering Categorical Data
- Clustering Categorical Data – Exercise
- How to Choose the Number of Clusters
- How to Choose the Number of Clusters – Exercise
- Pros and Cons of K-Means Clustering
- To Standardize or not to Standardize
- Relationship between Clustering and Regression
- Market Segmentation with Cluster Analysis (Part 1)
- Market Segmentation with Cluster Analysis (Part 2)
- How is Clustering Useful?
- EXERCISE: Species Segmentation with Cluster Analysis (Part 1)
- EXERCISE: Species Segmentation with Cluster Analysis (Part 2)

**Advanced Statistical Methods – Other Types of Clustering**

- Types of Clustering
- Dendrogram
- Heatmaps

**Mathematics**

- What is a Matrix?
- Scalars and Vectors
- Linear Algebra and Geometry
- Arrays in Python – A Convenient Way to Represent Matrices
- What is a Tensor?
- Addition and Subtraction of Matrices
- Errors when Adding Matrices
- Transpose of a Matrix
- Dot Product
- Dot Product of Matrices
- Why is Linear Algebra Useful?

**Deep Learning**

- What to Expect from this Part?

**Deep Learning – Introduction to Neural Networks**

- Introduction to Neural Networks
- Training the Model
- Types of Machine Learning
- The Linear Model (Linear Algebraic Version)
- The Linear Model
- The Linear Model with Multiple Inputs
- The Linear Model with Multiple Inputs and Multiple Outputs
- Graphical Representation of Simple Neural Networks
- What is the Objective Function?
- Common Objective Functions: L2-norm Loss
- Common Objective Functions: Cross-Entropy Loss
- Optimization Algorithm: 1-Parameter Gradient Descent
- Optimization Algorithm: n-Parameter Gradient Descent

**Deep Learning – How to Build a Neural Network from Scratch with NumPy**

- Basic NN Example (Part 1)
- Basic NN Example (Part 2)
- Basic NN Example (Part 3)
- Basic NN Example (Part 4)
- Basic NN Example Exercises

**JavaScript**

**Deep Learning – TensorFlow**

- Deep Learning – TensorFlow 2.0: Introduction
- How to Install TensorFlow 2.0

**TensorFlow Outline and Comparison with Other Libraries**

- TensorFlow 1 vs TensorFlow 2
- A Note on TensorFlow 2 Syntax
- Types of File Formats Supporting TensorFlow
- Outlining the Model with TensorFlow 2

**Interpreting the Result and Extracting the Weights and Bias**

- Customizing a TensorFlow 2 Model
- Basic NN with TensorFlow: Exercises

**Deep Learning – Digging Deeper into NNs: Introducing Deep Neural Networks**

- What is a Layer?
- What is a Deep Net?
- Digging into a Deep Net
- Non-Linearities and their Purpose
- Activation Functions
- Activation Functions: Softmax Activation
- Backpropagation
- Backpropagation Picture
- Backpropagation – A Peek into the Mathematics of Optimization

**Deep Learning – Overfitting**

- What is Overfitting?
- Underfitting and Overfitting for Classification
- What is Validation?
- Training, Validation, and Test Datasets
- N-Fold Cross Validation
- Early Stopping or When to Stop Training

**Deep Learning – Initialization**

- What is Initialization?
- Types of Simple Initializations
- State-of-the-Art Method – (Xavier) Glorot Initialization

**Deep Learning – Digging into Gradient Descent and Learning Rate Schedules**

- Stochastic Gradient Descent
- Problems with Gradient Descent
- Momentum
- Learning Rate Schedules, or How to Choose the Optimal Learning Rate
- Learning Rate Schedules Visualized
- Adaptive Learning Rate Schedules (AdaGrad and RMSprop)
- Adam (Adaptive Moment Estimation)

**Deep Learning – Preprocessing**

- Preprocessing Introduction
- Types of Basic Preprocessing
- Standardization
- Preprocessing Categorical Data
- Binary and One-Hot Encoding

**Deep Learning – Classifying on the MNIST Dataset**

- MNIST: The Dataset
- MNIST: How to Tackle the MNIST
- MNIST: Importing the Relevant Packages and Loading the Data
- MNIST: Preprocess the Data – Create a Validation Set and Scale It
- MNIST: Preprocess the Data – Scale the Test Data – Exercise
- MNIST: Preprocess the Data – Shuffle and Batch
- MNIST: Preprocess the Data – Shuffle and Batch – Exercise
- MNIST: Outline the Model
- MNIST: Select the Loss and the Optimizer
- MNIST: Learning
- MNIST – Exercises
- MNIST: Testing the Model

**Deep Learning – Business Case Example**

- Business Case: Exploring the Dataset and Identifying Predictors
- Business Case: Outlining the Solution
- Business Case: Balancing the Dataset
- Business Case: Preprocessing the Data
- Business Case: Preprocessing the Data – Exercise
- Business Case: Load the Preprocessed Data
- Business Case: Load the Preprocessed Data – Exercise
- Business Case: Learning and Interpreting the Result
- Business Case: Setting an Early Stopping Mechanism
- Setting an Early Stopping Mechanism – Exercise
- Business Case: Testing the Model
- Business Case: Final Exercise

**Deep Learning – Conclusion**

- Summary on What You’ve Learned
- What’s Further Out There in Terms of Machine Learning
- DeepMind and Deep Learning
- An Overview of CNNs
- An Overview of RNNs
- An Overview of non-NN Approaches

**Certificates will be awarded to participants at the end of training.**

**Seats are limited. To confirm your enrollment, pay the course fee to the following account:**

- A/C Name: TalhaTraining
- A/C No.: 2141116000973
- Bank Name: Prime Bank Limited

**Then mail us after paying the course fee.**

#### For registration or information please call or contact any of the following addresses

**TalhaTraining**

**Mobile & WhatsApp:** 01712742217

**Email:** training@talhatraining.com or talhatraining@gmail.com

**Website:** talhatraining.com

### Course Features

- Lectures: 425
- Quizzes: 0
- Duration: 96 hours
- Skill level: All levels
- Language: English
- Students: 21
- Certificate: Yes
- Assessments: Yes