1. Introduction to Data Science/Analytics 

Why do companies care about Data Scientists/Analysts?
Common myths & confusions: Data Analyst, Business Analyst, BI, Data Miner, ML Engineer, Data Scientist, etc.
What is Data Science? Why Data Science?
Data driven product engineering
Skillset of a Data Scientist and how to become one
Who is hiring? Career Opportunities


3. Technology overview for Data Science/Analytics 

Detailed lifecycle for solving Data Science problems
Technologies for Data Science/Analytics
Languages: R/Python/Julia/Scala
Frameworks & packages for structured data
Frameworks & packages for structured big data
Frameworks & packages for unstructured (big) data
Datasets for doing data science/analytics 

5. Applied Linear Algebra for data scientists

Applied perspective of Linear Algebra
Vector Algebra
ideas that map to vectors
understanding vector operations
understanding linear independence
Matrix Algebra
ideas that map to matrices
fundamental ideas in matrix algebra:
matrix operations
determinant,
eigenvalues and eigenvectors,
inverse,
rank
positive definiteness & semidefiniteness
basis, orthogonal and orthonormal basis
understanding factorization
SVD factorization
(Optional) LU factorization
(Optional) QR factorization
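
The factorization topics above can be tried out with a minimal sketch like the following. It assumes NumPy is available; the matrix A is a made-up example, not course data. It shows how SVD relates to rank, eigenvalues, and low-rank approximation.

```python
import numpy as np

# Hypothetical 3x2 data matrix, chosen only for illustration.
A = np.array([[3.0, 1.0],
              [1.0, 3.0],
              [0.0, 2.0]])

# Thin SVD: A = U @ diag(s) @ Vt, with singular values in descending order.
U, s, Vt = np.linalg.svd(A, full_matrices=False)
A_rebuilt = U @ np.diag(s) @ Vt

# The rank equals the number of (numerically) nonzero singular values.
rank = int(np.sum(s > 1e-10))

# Keeping only the largest singular value gives a rank-1 approximation of A.
A_rank1 = s[0] * np.outer(U[:, 0], Vt[0])

# The eigenvalues of A^T A are the squared singular values of A.
eigvals = np.linalg.eigvalsh(A.T @ A)[::-1]
```

The columns of U and the rows of Vt form orthonormal bases, which ties this sketch back to the basis topics listed earlier in the section.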

7. Applied Probability for data scientists

Applied perspective of Probability theory
Basic Probability, Conditional Probability
Bayes Rule/Reasoning, MAP vs MLE Reasoning
Mapping Random process to Random variable
Properties of Random variables: expectation, variance, entropy and cross-entropy, covariance and correlation
Understanding standard random processes
Probability Distributions: Normal, Gamma, Poisson, Dirichlet, Bernoulli, Binomial, Power law, Log-normal, Multinomial
Parameter Estimation in Distributions: MAP and MLE approaches 
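
As a small illustration of MAP vs MLE parameter estimation, here is a hedged sketch for a Bernoulli (coin flip) model. The function names and the Beta(2, 2) prior are assumptions made for this example only.

```python
def bernoulli_mle(successes, trials):
    """Maximum likelihood estimate of the success probability."""
    return successes / trials


def bernoulli_map(successes, trials, alpha=2.0, beta=2.0):
    """MAP estimate under a Beta(alpha, beta) prior.

    The posterior is Beta(alpha + successes, beta + failures); this returns
    its mode, which assumes alpha >= 1 and beta >= 1 so the mode is well defined.
    """
    return (successes + alpha - 1.0) / (trials + alpha + beta - 2.0)


# Hypothetical data: 7 successes in 10 trials.
mle = bernoulli_mle(7, 10)
map_estimate = bernoulli_map(7, 10)
```

With a uniform prior (alpha = beta = 1) the MAP estimate reduces to the MLE; a stronger prior pulls the estimate toward the prior mode, which matters most when data is scarce.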

9. Applied Optimization theory for data scientists

Applied perspective of optimization theory
Non-ML vs ML optimization problems
Modelling ML problems with optimization requirements
Solving unconstrained optimization problems
Solving optimization problems with linear constraints
Gradient descent variations
Batch vs stochastic gradients 
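
The difference between batch and stochastic gradients can be sketched on a one-variable least-squares problem. This assumes NumPy; the data is synthetic and noise-free, and the learning rate and step counts are illustrative choices rather than recommendations.

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(-1.0, 1.0, 50)
y = 2.0 * x + 1.0  # synthetic target: slope 2, intercept 1


def gradient(w, b, xs, ys):
    """Gradient of the mean squared error for the model w * x + b."""
    residual = w * xs + b - ys
    return 2.0 * np.mean(residual * xs), 2.0 * np.mean(residual)


def batch_gd(lr=0.1, steps=500):
    """Each update uses the gradient over the full dataset."""
    w, b = 0.0, 0.0
    for _ in range(steps):
        gw, gb = gradient(w, b, x, y)
        w, b = w - lr * gw, b - lr * gb
    return w, b


def stochastic_gd(lr=0.1, epochs=20):
    """Each update uses the gradient of a single, randomly ordered sample."""
    w, b = 0.0, 0.0
    for _ in range(epochs):
        for i in rng.permutation(len(x)):
            gw, gb = gradient(w, b, x[i:i + 1], y[i:i + 1])
            w, b = w - lr * gw, b - lr * gb
    return w, b


w_batch, b_batch = batch_gd()
w_sgd, b_sgd = stochastic_gd()
```

Both variants recover the slope and intercept here because the data is noise-free; on noisy data the stochastic version usually needs a decaying learning rate to settle.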

11. Introduction to Machine Learning 

Pattern discovery: Manual vs Automated
Supervised ML
Unsupervised ML
Reinforcement ML
Overfitted vs underfitted models
Techniques to mitigate overfitting 
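
Underfitting and overfitting can be observed by fitting polynomials of increasing degree to a small noisy sample. The sketch below assumes NumPy; the sine target, noise level, and degrees are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(42)


def make_data(n):
    """Noisy samples of one period of a sine wave."""
    x = rng.uniform(0.0, 1.0, n)
    y = np.sin(2.0 * np.pi * x) + rng.normal(0.0, 0.2, n)
    return x, y


x_train, y_train = make_data(15)
x_test, y_test = make_data(200)


def mse(coeffs, x, y):
    """Mean squared error of a fitted polynomial on (x, y)."""
    return float(np.mean((np.polyval(coeffs, x) - y) ** 2))


# Map each polynomial degree to (training error, test error).
results = {}
for degree in (1, 3, 9):
    coeffs = np.polyfit(x_train, y_train, degree)
    results[degree] = (mse(coeffs, x_train, y_train), mse(coeffs, x_test, y_test))
```

Training error can only go down as the degree grows, while test error typically falls and then rises again; a growing gap between the two is the usual sign of overfitting.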

13. Exploration of structured data (EDA)

pandas support for data sources & formats
working with CSV & TSV files
working with RDBMS data
working with JSON files
pandas dataframes
creating dataframes from different sources & formats
query with dataframe API
visual EDA
univariate plots: bar chart, histogram, box plot, density curve
multivariate plots: facetgrids, factorplots, scatterplots
tSNE plots 
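
Creating dataframes from different formats and querying them can be sketched as follows. This assumes pandas is installed; the in-memory CSV text and its column names are invented for the example, and plotting is left out to keep it self-contained.

```python
import io

import pandas as pd

# Hypothetical in-memory data standing in for a file on disk.
csv_text = "name,dept,salary\nAsha,eng,120\nBen,ops,80\nChen,eng,100\nDina,ops,90\n"

# The same reader handles CSV and TSV; only the separator changes.
df = pd.read_csv(io.StringIO(csv_text))
tsv_df = pd.read_csv(io.StringIO(csv_text.replace(",", "\t")), sep="\t")

# Round-trip through JSON records to show the JSON reader.
json_text = df.to_json(orient="records")
json_df = pd.read_json(io.StringIO(json_text), orient="records")

# Query with the dataframe API: filter rows, then aggregate by group.
high_paid = df.query("salary >= 100")
mean_by_dept = df.groupby("dept")["salary"].mean()
```

For RDBMS data, pandas offers read_sql, which takes a query and a database connection; it is omitted here because it needs a live database.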

15. Feature Engineering 

Feature filtering techniques
Variance based filtering
Correlation based filtering
Feature Creation
Techniques to create new features
Feature Selection
Statistical feature selection
Model based feature selection
Feature Extraction & Transformation
PCA 
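
Variance based and correlation based filtering can be sketched in a few lines. This assumes NumPy; the function names, thresholds, and the synthetic feature matrix are hypothetical choices for illustration.

```python
import numpy as np


def variance_filter(X, threshold=1e-8):
    """Return indices of columns whose variance exceeds the threshold."""
    return [j for j in range(X.shape[1]) if np.var(X[:, j]) > threshold]


def correlation_filter(X, columns, threshold=0.95):
    """Greedily keep a column only if it is not highly correlated
    (in absolute value) with any column already kept."""
    corr = np.abs(np.corrcoef(X[:, columns], rowvar=False))
    kept = []  # positions within `columns`
    for j in range(len(columns)):
        if all(corr[j, k] < threshold for k in kept):
            kept.append(j)
    return [columns[j] for j in kept]


rng = np.random.default_rng(0)
x0 = rng.normal(size=200)
X = np.column_stack([
    x0,                                       # informative feature
    2.0 * x0 + 0.01 * rng.normal(size=200),   # near-duplicate of x0
    np.full(200, 5.0),                        # constant feature
    rng.normal(size=200),                     # independent feature
])

after_variance = variance_filter(X)
selected = correlation_filter(X, after_variance)
```

Running the variance filter first also avoids undefined correlations, since a constant column has zero standard deviation.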

17. Usecase driven approach to solve Classification problem 

Applied perspective of classification problem
Machine learning approaches to solve classification problem
Tree learning approaches
Algorithms: CART, C4.5, C5.0
Overfitting control techniques: Prepruning, Cost complexity pruning, Pessimistic pruning
Probabilistic learning approaches
Algorithms: Naive Bayes
Overfitting control techniques
Objective based learning approaches
Algorithms: SVM, Logistic Regression, Neural Network
Overfitting control techniques: Lasso, Ridge & Elastic net penalties
Instance based learning: KNN
Ensemble based learning
Algorithms: Voting, Stacking, Random Forest, AdaBoost, Gradient boosting, Extreme gradient boosting (XGBoost)
Overfitting control techniques
Evaluation Metrics for Classification Algorithms
Confusion matrix
Accuracy, Error Rate
Precision, Recall and FScore
ROC curve, AUC 
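
The confusion-matrix based metrics above can be computed by hand for a binary problem. The sketch below uses only the standard library; the labels are invented, and the function name is a hypothetical helper.

```python
def binary_metrics(y_true, y_pred):
    """Confusion-matrix counts and derived metrics for labels in {0, 1}."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "tp": tp, "tn": tn, "fp": fp, "fn": fn,
        "accuracy": accuracy, "error_rate": 1.0 - accuracy,
        "precision": precision, "recall": recall, "f1": f1,
    }


# Hypothetical ground truth and predictions.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 1, 0, 1, 0]
metrics = binary_metrics(y_true, y_pred)
```

Accuracy alone can mislead on imbalanced classes, which is why precision, recall, and the F-score are listed alongside it.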

19. Usecase driven approach to Recommendation problem 

Applied perspective of recommendation problem
TopN Recommender
Rating Prediction
Machine learning approaches to solve recommendation problem
Content based learning approaches
Collaborative filtering (KNN based) approaches
Algorithms: UBCF, IBCF
Overfitting control techniques
Latent factor learning approaches
Algorithms: Funk algorithm
Overfitting control techniques: Lasso, Ridge & Elastic net penalties
Hybrid learning approaches
Evaluation Metrics for Recommendation Algorithms
TopN Recommender: Accuracy, Error Rate
Rating Prediction: RMSE 
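
A latent factor model in the spirit of the Funk algorithm can be sketched with stochastic gradient descent and a ridge (L2) penalty. This assumes NumPy; the rating triples, the number of factors, and the hyperparameters are hypothetical, and the RMSE shown is computed on the training ratings only.

```python
import numpy as np

# Hypothetical (user, item, rating) triples on a 1-5 scale.
ratings = [(0, 0, 5.0), (0, 1, 3.0), (0, 2, 1.0),
           (1, 0, 4.0), (1, 2, 1.0),
           (2, 1, 2.0), (2, 2, 5.0),
           (3, 0, 1.0), (3, 1, 2.0), (3, 2, 4.0)]
n_users, n_items, n_factors = 4, 3, 2

rng = np.random.default_rng(0)
P = 0.1 * rng.normal(size=(n_users, n_factors))  # user factors
Q = 0.1 * rng.normal(size=(n_items, n_factors))  # item factors


def rmse(P, Q):
    """Root mean squared error over the known ratings."""
    errors = [r - P[u] @ Q[i] for u, i, r in ratings]
    return float(np.sqrt(np.mean(np.square(errors))))


initial_rmse = rmse(P, Q)

lr, reg = 0.01, 0.02  # learning rate and ridge penalty strength
for _ in range(500):
    for u, i, r in ratings:
        err = r - P[u] @ Q[i]
        p_old = P[u].copy()
        # Gradient step on the squared error plus the L2 penalty.
        P[u] += lr * (err * Q[i] - reg * P[u])
        Q[i] += lr * (err * p_old - reg * Q[i])

final_rmse = rmse(P, Q)
```

A predicted rating for an unseen (user, item) pair is the dot product P[u] @ Q[i]; a fair evaluation would measure RMSE on held-out ratings rather than on the training set.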

21. Usecase driven approach to Clustering problem 

Applied perspective of clustering problem
Machine learning approaches to solve clustering problem
Iterative algorithms: K-means, K-medoids
Hierarchical algorithms: Ward's algorithm
Density based algorithms: DBSCAN
BIRCH algorithm
Evaluation Metrics for Clustering Algorithms
Ground truth (GT) based metrics: Adjusted Rand Index, Mutual Information
NoGT based metrics: Silhouette Coefficient, Calinski-Harabasz Index
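
The iterative K-means procedure can be sketched as alternating assignment and update steps. This assumes NumPy; the farthest-first initialisation is a simplification chosen to keep the example deterministic, and the two synthetic blobs are invented data.

```python
import numpy as np


def kmeans(X, k, n_iter=100):
    """Lloyd's algorithm with a simple farthest-first initialisation."""
    # Start from the first point, then repeatedly add the point that is
    # farthest from all centroids chosen so far.
    centroids = [X[0]]
    for _ in range(1, k):
        dists = np.min([np.linalg.norm(X - c, axis=1) for c in centroids], axis=0)
        centroids.append(X[np.argmax(dists)])
    centroids = np.array(centroids, dtype=float)

    for _ in range(n_iter):
        # Assignment step: each point joins its nearest centroid.
        distances = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = np.argmin(distances, axis=1)
        # Update step: each centroid moves to the mean of its points
        # (an empty cluster keeps its previous centroid).
        new_centroids = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return labels, centroids


rng = np.random.default_rng(0)
blob_a = rng.normal(0.0, 0.5, size=(20, 2))
blob_b = rng.normal(5.0, 0.5, size=(20, 2))
X = np.vstack([blob_a, blob_b])

labels, centroids = kmeans(X, k=2)
```

K-medoids follows the same loop but restricts each centre to an actual data point, which makes it less sensitive to outliers.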

23. Usecase driven approach to Dimensionality Reduction problem 

Applied perspective of feature reduction problem
Machine learning approaches to reduce dimensionality
Variance based approach
Algorithms: Linear PCA, Nonlinear PCA
Evaluation: Captured Variance
Neighborhood based approach
Algorithms: tSNE
Evaluation: KL divergence
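
For the variance based approach, linear PCA and its captured variance can be sketched through the SVD of the centered data. This assumes NumPy; the three-feature dataset is synthetic and built so that most of the variance lies along one direction.

```python
import numpy as np

rng = np.random.default_rng(0)
t = rng.normal(size=200)
X = np.column_stack([
    t,
    2.0 * t + 0.1 * rng.normal(size=200),
    0.1 * rng.normal(size=200),
])

# PCA via SVD: center the data, then the rows of Vt are the principal axes.
X_centered = X - X.mean(axis=0)
U, s, Vt = np.linalg.svd(X_centered, full_matrices=False)

# Variance captured by each component, and its share of the total.
explained_variance = s ** 2 / (len(X) - 1)
explained_ratio = explained_variance / explained_variance.sum()

# Project onto the first principal component to get a 1-D representation.
X_reduced = X_centered @ Vt[:1].T
```

A common heuristic is to keep the smallest number of components whose cumulative captured variance passes a chosen threshold, for example 95%.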
