DATA SCIENCES
Data science, also known as data-driven science, is an interdisciplinary field that uses scientific methods, processes, and systems to extract knowledge or insights from data in various forms, either structured or unstructured, similar to data mining.
What will you learn?
The only Data Sciences training program where you get in-depth knowledge of all 6 modules of Data Sciences with practical hands-on exposure.
Introduction to data science
The Introduction to Data Science class will survey the foundational topics in data science.
 What is data Science? – Introduction.
 Importance of Data Science.
 Demand for Data Science Professionals.
 Brief Introduction to Big data and Data Analytics.
 Lifecycle of data science.
 Tools and Technologies used in data Science.
 Business Intelligence vs Data Science.
 Role of a data scientist.
R programming basics
Use the R Programming Language to execute data science projects and become a data scientist.
 Introduction to R
 R Basics, background.
 Comprehensive R Archive Network
 Demo of Installing R on Windows from the CRAN Website
 Installing RStudio on Windows OS
 Setting Up R Workspace.
 Getting Help for R: How to use the help system
 Installing Packages – Loading And Unloading Packages
 Starting with R : Getting familiar with basics
 Operators in R – Arithmetic, Relational, Logical and Assignment Operators
 Variables, Types of Variables, Using variables
 Conditional statements, ifelse(), switch
 Loops: For Loops, While Loops, Using Break statement, Switch
 The R Programming Language Data Types And Functions
 Use R for simple maths, creating data objects from the keyboard.
 How to make different types of data objects.
 Understand the various data types that the language supports.
 Introduction to Functions in R
 Types of data structures in R
 Arrays and Lists: Create and access the elements
 Vectors – Create Vectors, Vectorized Operations, Power of Vectorized Operations
 Matrices: Building the first matrices, Matrix Operations, Subsetting, Visualising subsets, Visualising with matplot()
 Factors – Creating a Factor
 Data Frames: Create and filter data frames, Building and Merging data frames.
 Functions And Importing data into R
 Function Overview – Naming Guidelines
 Argument Matching, Functions with Multiple Arguments
 Additional Arguments using Ellipsis, Lazy Evaluation
 Multiple Return Values
 Functions as Objects, Anonymous Functions
 Importing and exporting data in R: importing from files like Excel, CSV and Minitab.
 Import from URL and Excel Files
 Import from database.
 Data Descriptive Statistics, Tabulation, Distribution
 Summary Statistics for Matrix Objects.
 apply() Command. Converting an Object into a Table
 Histograms, Stem and Leaf Plot, Density Function. Normal Distribution
 Graphics in R – Types of graphics
 Bar Chart, Pie Chart, Histograms: Create and edit.
 Box Plots: Basics of Boxplots, Create and Edit
 Visualisation in R using ggplot2.
 More About Graphs: Adding Legends to Graphs
 Adding Text to Graphs, Orienting the Axis Label.
Introduction to SQL
SQL is a standard language for accessing and manipulating databases.
 Introduction to SQL Server and RDBMS
 Covers an overview of using relational databases. You’ll learn basic terminology used in future modules.
 SQL Server Management Studio is the primary tool used to create queries and manage objects in SQL Server databases.
 SQL Operations
 Single Table Queries – SELECT, WHERE, ORDER BY, DISTINCT, AND, OR
 Multiple Table Queries: INNER, SELF, CROSS, and OUTER joins, Left Join, Right Join, Full Join, Union, and more
 Advanced SQL Operations
 Data Aggregations and summarizing the data
 Ranking Functions: Top-N Analysis
 Advanced SQL Queries for Analytics
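
As a rough illustration of the query types listed above, the sketch below runs a single-table query, a join with aggregation, and a Top-N query. It is only a sketch under stated assumptions: it uses Python's built-in sqlite3 module with an in-memory SQLite database instead of SQL Server, and the customers and orders tables and their rows are made up for the example. SQLite writes Top-N with LIMIT, where SQL Server would use TOP.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for SQL Server here.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, city TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
    INSERT INTO customers VALUES (1, 'Asha', 'Pune'), (2, 'Ben', 'Delhi'), (3, 'Chen', 'Pune');
    INSERT INTO orders VALUES (1, 1, 250.0), (2, 1, 100.0), (3, 2, 400.0);
""")

# Single-table query: SELECT, WHERE, ORDER BY, DISTINCT.
cities = [row[0] for row in conn.execute(
    "SELECT DISTINCT city FROM customers WHERE city IS NOT NULL ORDER BY city")]

# Multiple-table query: LEFT JOIN keeps customers without orders;
# GROUP BY with SUM aggregates the order amounts per customer.
totals = conn.execute("""
    SELECT c.name, COALESCE(SUM(o.amount), 0) AS total
    FROM customers AS c
    LEFT JOIN orders AS o ON o.customer_id = c.id
    GROUP BY c.id, c.name
    ORDER BY total DESC, c.name
""").fetchall()

# Top-N analysis: the two customers with the highest order totals.
top_two = conn.execute("""
    SELECT c.name, SUM(o.amount) AS total
    FROM customers AS c
    INNER JOIN orders AS o ON o.customer_id = c.id
    GROUP BY c.id, c.name
    ORDER BY total DESC
    LIMIT 2
""").fetchall()

print(cities)
print(totals)
print(top_two)
```

The LEFT JOIN is used for the totals so that a customer with no orders still appears with a total of zero; the INNER JOIN in the Top-N query drops such customers.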
Python for data science
A comprehensive learning path to become a data scientist using Python. Topics include machine learning, deep learning & pandas on Python.
 Python Programming Basics
 Installing Jupyter Notebooks
 Python Overview
 Python 2.7 vs Python 3
 Python Identifiers
 Various Operators and Operators Precedence
 Getting input from User, Comments, Multi-line Comments.
 Making Decisions And Loop Control
 Simple if Statement, if-else Statement
 if-elif Statement.
 Introduction To while Loops.
 Introduction To for Loops, Using continue and break
 Python Data Types: Lists, Tuples, Dictionaries
 Python Lists, Tuples, Dictionaries
 Accessing Values
 Basic Operations
 Indexing, Slicing, and Matrices
 Built-in Functions & Methods
 Exercises on Lists, Tuples and Dictionaries
 Functions And Modules
 Introduction To Functions – Why
 Defining Functions
 Calling Functions
 Functions With Multiple Arguments.
 Anonymous Functions – Lambda
 Using Built-In Modules, User-Defined Modules, Module Namespaces
 Iterators And Generators
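
The topics in this unit can be sketched in a few lines. The example below is illustrative only: the function names describe and countdown and the sample values are hypothetical, chosen for the example rather than taken from the course material.

```python
import math  # a built-in module, imported into its own namespace


def describe(name, *scores, unit="points"):
    """Function with multiple arguments: positional, variadic and keyword."""
    return f"{name}: {sum(scores)} {unit}"


def countdown(start):
    """Generator: lazily yields start, start - 1, ..., 1."""
    while start > 0:
        yield start
        start -= 1


# Calling a function with several arguments.
summary = describe("Asha", 10, 20, 5)

# Anonymous function (lambda) used as a sort key.
pairs = [("b", 3), ("a", 1), ("c", 2)]
by_value = sorted(pairs, key=lambda pair: pair[1])

# An iterator is consumed with next() or by a loop; a generator is an iterator.
it = iter(countdown(3))
first = next(it)
rest = list(it)

# Names from a module are reached through the module namespace.
root = math.sqrt(16)

print(summary, by_value, first, rest, root)
```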
 File I/O And Exception Handling
 Opening and Closing Files
 open Function, file Object Attributes
 close() Method, read, write, seek. Exception Handling, the try-finally Clause
 Raising Exceptions, User-Defined Exceptions
 Regular Expression Search and Replace
 Regular Expression Modifiers
 Regular Expression Patterns, re module
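
A minimal sketch of how these pieces fit together is shown below. It assumes a throwaway file in a temporary directory; the file name, its contents, the helper read_text and the exception EmptyFileError are all hypothetical names made up for this example.

```python
import os
import re
import tempfile


class EmptyFileError(Exception):
    """User-defined exception (hypothetical name for this sketch)."""


def read_text(path):
    """Open a file, read it, and always close it via try-finally."""
    handle = open(path, "r", encoding="utf-8")
    try:
        text = handle.read()
        if not text:
            raise EmptyFileError(f"{path} is empty")
        return text
    finally:
        # Runs whether read() succeeded or an exception was raised.
        handle.close()


# Write a small file to a temporary location, then read it back.
path = os.path.join(tempfile.mkdtemp(), "notes.txt")
with open(path, "w", encoding="utf-8") as out:
    out.write("Order 42 shipped on 2024-01-15.\nOrder 7 pending.\n")

text = read_text(path)

# Search: find every order number; re.IGNORECASE is a modifier (flag).
order_ids = re.findall(r"order (\d+)", text, flags=re.IGNORECASE)

# Replace: mask anything shaped like a YYYY-MM-DD date.
masked = re.sub(r"\d{4}-\d{2}-\d{2}", "<date>", text)

print(order_ids)
print(masked)
```

In everyday code a with block is the usual way to close files; the explicit try-finally form is shown here because it is the clause named in the topic list.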
 NumPy
 Introduction to NumPy. Array Creation, Printing Arrays
 Basic Operations, Indexing, Slicing and Iterating
 Shape Manipulation – Changing shape, stacking and splitting of arrays
 Vector stacking
 Pandas And Matplotlib
 Introduction to Pandas
 Importing data into Python
 Pandas Data Frames, Indexing Data Frames, Basic Operations With Data Frames, Renaming Columns, Subsetting and filtering a data frame.
 Matplotlib – Introduction, plot(), Controlling Line Properties, Working with Multiple Figures, Histograms
Introduction to statistics
Numerical descriptive measures computed from data are called statistics.
 Fundamentals of Math and Probability

 Basic understanding of linear algebra, matrices, vectors
 Addition and multiplication of matrices
 Fundamentals of Probability
 Probability distribution function and cumulative distribution function.
Class Hands-on
 Problem solving using R for vector manipulation
 Problem solving for probability assignments

 Descriptive Statistics

 Describe or summarise a set of data: measures of central tendency and measures of dispersion.
 The mean, median, mode, kurtosis and skewness. Computing standard deviation and variance.
 Types of distribution.
Class Hands-on
 5-Point Summary, Box Plot
 Histogram and Bar Chart
 Exploratory analytics R Methods
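
The measures listed in this unit can be computed in a few lines. The class hands-on uses R; the sketch below uses Python's standard statistics module instead, on a small made-up data set. The skewness shown is the simple moment-based (population) version, which is one of several definitions in use.

```python
import statistics

data = [2, 4, 4, 5, 7, 9, 11]  # made-up sample for illustration

# Measures of central tendency.
mean = statistics.mean(data)
median = statistics.median(data)
mode = statistics.mode(data)

# Measures of dispersion (sample versions, dividing by n - 1).
variance = statistics.variance(data)
stdev = statistics.stdev(data)

# Five-point summary: minimum, Q1, median, Q3, maximum.
q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")
five_point = (min(data), q1, q2, q3, max(data))

# Moment-based skewness: third central moment over (second central moment)^1.5.
n = len(data)
m2 = sum((x - mean) ** 2 for x in data) / n
m3 = sum((x - mean) ** 3 for x in data) / n
skewness = m3 / m2 ** 1.5

print(mean, median, mode, variance, stdev)
print(five_point, skewness)
```

A positive skewness here reflects the longer tail on the right of the data, which is also what a box plot of the five-point summary would show.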

 Inferential Statistics

 What is inferential statistics
 Different types of Sampling techniques
 Central Limit Theorem
 Point estimate and Interval estimate
 Creating confidence interval for population parameter
 Characteristics of Z-distribution and T-distribution
 Basics of Hypothesis Testing
 Type of test and rejection region
 Types of errors in Hypothesis testing: Type I errors and Type II errors
 P-Value and Z-Score Method
 T-Test, Analysis of variance (ANOVA) and Analysis of covariance (ANCOVA)
 Regression analysis in ANOVA
Class Hands-on
 Problem solving for C.L.T
 Problem solving Hypothesis Testing
 Problem solving for T-test, Z-score test
 Case study and model run for ANOVA, ANCOVA
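
As one worked instance of the interval-estimate and Z-score topics above, the sketch below builds a 95% confidence interval for a population mean and runs a two-sided Z-test. It rests on assumptions made for the example only: the sample values are invented, the population standard deviation sigma is treated as known, and the hypothesised mean mu_0 is arbitrary.

```python
import math
from statistics import NormalDist, mean

sample = [52, 48, 50, 51, 49, 53, 47, 50, 52, 48]  # made-up data
sigma = 2.0   # assumption: known population standard deviation
mu_0 = 49.0   # assumption: population mean under the null hypothesis
alpha = 0.05  # significance level

n = len(sample)
x_bar = mean(sample)               # point estimate of the population mean
std_error = sigma / math.sqrt(n)   # standard error of the mean

# Interval estimate: x_bar +/- z_crit * standard error.
z_crit = NormalDist().inv_cdf(1 - alpha / 2)
ci = (x_bar - z_crit * std_error, x_bar + z_crit * std_error)

# Z-score method: how many standard errors x_bar lies from mu_0.
z_score = (x_bar - mu_0) / std_error

# P-value method: probability of a result at least this extreme under H0.
p_value = 2 * (1 - NormalDist().cdf(abs(z_score)))
reject_h0 = p_value < alpha

print(ci, z_score, p_value, reject_h0)
```

Both methods agree: the Z-score lies inside the rejection boundaries of plus or minus z_crit exactly when the p-value is at least alpha. When sigma is not known, the T-distribution would be used in place of the normal distribution.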

 Hypothesis Testing

 Hypothesis Testing
 Basics of Hypothesis Testing
 Type of test and Rejection Region
 Types of errors: Type 1 Errors, Type 2 Errors
 P-value method, Z-score Method

Understanding and implementing machine learning
Implementing machine learning algorithms from scratch is a good way for a programmer to understand machine learning.
 Introduction To Machine Learning
 What is Machine Learning?
 What is the Challenge?
 Introduction to Supervised Learning, Unsupervised Learning
 What is Reinforcement Learning?
 Linear Regression
 Introduction to Linear Regression
 Linear Regression with Multiple Variables
 Disadvantage of Linear Models
 Interpretation of Model Outputs
 Understanding Covariance and Collinearity
 Understanding Heteroscedasticity
Case Study – Application of Linear Regression for Housing Price Prediction
 Logistic Regression
 Introduction to Logistic Regression – Why Logistic Regression?
 Introduce the notion of classification
 Cost function for logistic regression
 Application of logistic regression to multiclass classification.
 Confusion Matrix, Odds Ratio and ROC Curve
 Advantages And Disadvantages of Logistic Regression.
Case Study: Classifying an email as spam or not spam using Logistic Regression.
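
To make the classification and confusion-matrix topics concrete, here is a small sketch in the spirit of the spam case study. Everything specific in it is an assumption: the two features, the six labelled emails and the weights are hand-picked for illustration and are not fitted to any real data. Fitting would normally minimise the logistic cost function, which is not shown here.

```python
import math


def sigmoid(z):
    """Logistic function: maps any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(features, weights, bias):
    """Probability of the positive class (spam) for one email."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return sigmoid(z)


# Hypothetical, hand-picked parameters (not fitted).
# Features: [count of the word "free", sender is a known contact (0 or 1)].
weights, bias = [1.5, -2.0], -0.5

# (features, actual label) pairs; 1 = spam, 0 = not spam. Made-up data.
emails = [([3, 0], 1), ([0, 1], 0), ([1, 0], 1),
          ([0, 0], 0), ([2, 1], 0), ([0, 0], 1)]

# Confusion matrix counts at a 0.5 decision threshold.
tp = tn = fp = fn = 0
for features, actual in emails:
    predicted = 1 if predict_proba(features, weights, bias) >= 0.5 else 0
    if predicted == 1 and actual == 1:
        tp += 1
    elif predicted == 0 and actual == 0:
        tn += 1
    elif predicted == 1 and actual == 0:
        fp += 1
    else:
        fn += 1

accuracy = (tp + tn) / len(emails)
precision = tp / (tp + fp)
recall = tp / (tp + fn)

# Odds ratio for the first feature: each extra "free" multiplies the odds by exp(weight).
odds_ratio_free = math.exp(weights[0])

print(tp, tn, fp, fn, accuracy, precision, recall, odds_ratio_free)
```

Sweeping the 0.5 threshold from 0 to 1 and recording the true-positive and false-positive rates at each step is what produces an ROC curve.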
 Decision Trees And Supervised Learning
 Decision Tree – data set
 How to build decision tree?
 Understanding the CART Model
 Classification Rules, Overfitting Problem
 Stopping Criteria and Pruning
 How to find the final size of trees?
 Modelling a decision tree.
 Naive Bayes
 Random Forests and Support Vector Machines
 Interpretation of Model Outputs
Case Study:
1 Business Case Study for CART Model
2 Business Case Study for Random Forest
3 Business Case Study for SVM
 Unsupervised Learning
 Hierarchical Clustering
 k-Means algorithm for clustering – groupings of unlabeled data points.
 Principal Component Analysis (PCA)
 Independent Component Analysis (ICA)
 Anomaly Detection
 Recommender System: collaborative filtering algorithm
Case Study – Recommendation Engine for e-commerce/retail chain
 Introduction to Deep Learning
 Neural Network
 Understanding Neural Network Model
 Understanding Tuning of Neural Network
Case Study:
Case study using Neural Network
 Natural Language Processing
 Introduction to Natural Language Processing (NLP).
 Word Frequency Algorithms for NLP
 Sentiment Analysis
Case Study:
Twitter data analysis using NLP
 Apache Spark Analytics
 What is Spark
 Introduction to Spark RDD
 Introduction to Spark SQL and Dataframes
 Using RSpark for machine learning
Hands-on:
1 Installation and configuration of Spark
2 Hands-on Spark RDD programming
3 Hands-on Spark SQL and Dataframe programming
4 Using RSpark for machine learning programming
 Introduction to Tableau/Spotfire
 Connecting to data source
 Creating dashboard pages
 How to create calculated columns
 Different charts
Hands-on:
1 Hands-on connecting to a data source and data cleansing
2 Hands-on various charts
3 Hands-on deployment of a predictive model in visualisation