Anaconda Machine Learning

Course Code

BD75

Duration

3 Days

Prerequisites: general knowledge of data stores, advanced math, and analytics, plus a working knowledge of the Python language.

Students will learn and use the primary machine learning and data analysis tools in Anaconda, including: NumPy, which provides a fast numerical array structure and helper functions; Pandas, which provides a DataFrame structure for storing and working with in-memory data efficiently; scikit-learn, the essential machine learning package in Python; Matplotlib for basic plotting; and Seaborn for advanced statistical plotting.

Students will learn basic theoretical principles, algorithms, and applications of machine learning using the Anaconda packages. The course covers the main learning techniques: supervised learning (classification and regression), unsupervised learning (clustering and dimensionality reduction), and recommendation algorithms using collaborative filtering.

The students gain hands-on experience applying these principles using Scikit-learn pipelines and model evaluation techniques.
This course is designed for data engineers, analysts, architects, software engineers, IT operations staff, and technical managers interested in a thorough, hands-on course covering machine learning with Anaconda.

In this course, participants will:

  • Understand machine learning concepts
  • Install Anaconda
  • Prepare data for analysis
  • Learn the difference between supervised and unsupervised learning
  • Apply ML algorithms to LIBSVM data
  • Articulate and implement typical use cases for machine learning
  • Understand and improve model performance
  • Build data pipelines with Scikit-learn and Pandas DataFrames
  • Understand and use Classification
  • Prepare and visualize ML data
  • Understand and use Regression
  • Use Dimensionality Reduction and Principal Component Analysis

1. Overview
Anaconda
Big Data
History
Anaconda Ecosystem
Machine Learning
Scikit-Learn ML Libraries
Types of Learning
Data Preparation
Algorithms
Iterative Processes
Scalability
Ensemble Modeling
AnacondaCON 2018
Lab

2. Installation
Platforms
Prerequisites
Download
Windows
Mac OS
Linux
Anaconda Packages
R Essentials Bundle
conda
Lab

3. Anaconda Basics
Concepts
Conda Directory Structure
Conda Environments
Conda Packages
Anaconda Python
Python Basics
Anaconda R
R Basics
Numpy
Matplotlib
Seaborn
Scikit-learn
Pandas
Lab

4. Machine Learning
Machine Learning
Sklearn ML Libraries
Types of Learning
Supervised Learning
Unsupervised Learning
Recommendation Learning
Algorithms
Data Types
Sparse Vectors
LIBSVM
LibSVM Training Set
Machine Learning Flow
Naive Bayes Classifier
Anaconda
Lab

5. Predicting Titanic Survival
The Titanic Data Set
Building Good Training Sets
Pandas DataFrame
Feature Engineering
Age and Gender
Family Size
Class and Fare
Category Table
Survival Analysis
Naive Bayes Prediction
Naive Bayes in R
Classification
Lab

6. Classification Models
Data Sets
Classification
Naive Bayes
Logistic Regression
Y Intercept
Decision Tree Classifier
Over Fitting
Gini Impurity
Ensemble
Random Forest Classifier
Gradient-Boosted Tree
Additional Classifiers
Lab

7. Sklearn Pipelines
DataFrame API
ML Pipelines
DataFrames
Data Types
Vectors
Transformers
Estimators
Parameters
Pipelines
fit()
fit_predict
fit_transform
Pipeline Examples
Saving and Loading
Lab

8. Extracting Features
Nomenclature
Feature Selection
Dimensionality
Feature Reduction
Tokenizer
CountVectorizer
Stop Words
Term Frequency (TF)
Inverse Document Frequency (IDF)
TfidfVectorizer
String Transformers
Anaconda Word2Vec
Natural Language Toolkit
Lab

9. Regression Models
Regression
Linear Regression
Regularization
Lasso Model
Ridge Regression
Generalized Linear Regression
Decision Tree Regression
Random Forest Regression
Gradient-Boosted Tree Regression
Survival Regression
Lab

10. Evaluating Predictions
Model Evaluation
Classification Evaluation
Confusion Matrix
Binary Classification
classification_report
Threshold Tuning
Multiclass Classification
Label Based Metrics
Multilabel Classification
Regression Evaluation
Lab

11. Clustering Models
Unsupervised Learning
Clustering
K-Means
K-means Clustering
K-means Cluster Processing
Within Set Sum of Squared Error
Latent Dirichlet Allocation
LDA Pipeline
LDA Example
LDA LibSVM
Bisecting K-means
Gaussian Mixture Model
K-Means in SparkR
Lab

12. Recommendation Models
Apriori
Collaborative Filtering
Singular Value Decomposition
MovieLens Dataset
MovieLens Dataset Analysis
Evaluating Recommendation Engines
Eclat Algorithm
FP Algorithm
Sequence Mining
R PrefixSpan
Lab

13. Model Selection & Tuning
Hyperparameter Tuning
Model Selection
Cross-Validation
Cross-Validation Example
Stochastic Gradient Descent
SGDRegressor
Broyden–Fletcher–Goldfarb–Shanno
lbfgs and newton-cg Solvers
LogisticRegression With LBFGS
Lab

14. Deep Learning
Machine versus Deep Learning
Multilayer Perceptron Classifier
Feedforward Artificial Neural Network
Hidden Layers
Python Example
R Example
Iris Dataset
Popular DNN Frameworks
TensorFlow
Lab

15. Business Applications of ML
Checklist
Marketing Use Cases
Healthcare Use Cases
Expedia
Expedia Scratchpad
Datacenter Network Traffic
Cisco ML Applications
Tetration Analytics
Stealthwatch Learning Network
Machine Learning Model Factory
Propensity to Buy
Companies Using ML Spark
Lab