Data Visualization Training in Belfast City

Belfast City Centre

Forsythe House - Belfast
Cromac Square
Belfast BT2 8LA
United Kingdom
The Cromac Street Centre is located in an office development with a three-storey glass-fronted lobby on a prominent corner site in Belfast's city centre.

Client Testimonials

A practical introduction to Data Analysis and Big Data

presentation of technologies

Continental AG / Department: CF IT Finance

Data Visualization

Trainer was enthusiastic.

Diane Lucas - Virginia Department of Education

Data Visualization

Good real world examples, reviews of existing reports

Ronald Parrish - Virginia Department of Education

A practical introduction to Data Analysis and Big Data

Willingness to share more

Balaram Chandra Paul - MOL Information Technology Asia Limited

Beyond the relational database: neo4j

Flexibility to blend in with Autodata related details to get more of a real world scenario as we went on.

Autodata Ltd

Data Visualization

I am a hands-on learner and this was something that he did a lot of.

Lisa Comfort - Virginia Department of Education

Data Visualization

I thought that the information was interesting.

Allison May - Virginia Department of Education

A practical introduction to Data Analysis and Big Data

It covered a broad range of information.

Continental AG / Department: CF IT Finance

Data Visualization

Learning about all the chart types and what they are used for. Learning the value of decluttering. Learning about the methods to show time data.

Susan Williams - Virginia Department of Education

Data Visualization

I really appreciated that Jeff utilized data and examples that were applicable to education data. He made it interesting and interactive.

Carol Wells Bazzichi - Virginia Department of Education

Beyond the relational database: neo4j

The trainer did bring some good insight and ways to approach developing a graph database. He used examples from the slides presented but also drew on his own experience which was good.

Autodata Ltd

Data Visualization

The examples.

Peter Coleman - Virginia Department of Education

Data Visualization

Content / Instructor

Craig Roberson - Virginia Department of Education

A practical introduction to Data Analysis and Big Data

Overall the Content was good.

Sameer Rohadia - Continental AG / Department: CF IT Finance

Data Visualization Course Events - Belfast City

Code Name Venue Duration Course Date Course Price [Remote / Classroom]
datavisR1 Introduction to Data Visualization with R Belfast City Centre 28 hours Tue, 2018-01-30 09:30 £5200 / £6200
pythonmultipurpose Advanced Python Belfast City Centre 28 hours Tue, 2018-01-30 09:30 £4400 / £5400
nlpwithr NLP: Natural Language Processing with R Belfast City Centre 21 hours Wed, 2018-01-31 09:30 £3300 / £4050
highcharts Highcharts for Data Visualization Belfast City Centre 7 hours Thu, 2018-02-01 09:30 £1100 / £1350
BigData_ A practical introduction to Data Analysis and Big Data Belfast City Centre 35 hours Mon, 2018-02-05 09:30 £5500 / £6500
kdd Knowledge Discovery in Databases (KDD) Belfast City Centre 21 hours Tue, 2018-02-06 09:30 £3300 / £4050
embeddingprojector Embedding Projector: Visualizing your Training Data Belfast City Centre 14 hours Wed, 2018-02-21 09:30 £2200 / £2700
zeppelin Zeppelin for interactive data analytics Belfast City Centre 14 hours Thu, 2018-02-22 09:30 £2200 / £2700
powerbiforbiandanalytics Power BI for Business Analysts Belfast City Centre 21 hours Tue, 2018-02-27 09:30 £3300 / £4050
datavisualizationreports Data Visualization: Creating Captivating Reports Belfast City Centre 21 hours Wed, 2018-02-28 09:30 £3300 / £4050
druid Druid: Build a fast, real-time data analysis system Belfast City Centre 21 hours Mon, 2018-03-05 09:30 £3300 / £4050
octnp Octave not only for programmers Belfast City Centre 21 hours Mon, 2018-03-05 09:30 £3300 / £4050
datameer Datameer for Data Analysts Belfast City Centre 14 hours Tue, 2018-03-06 09:30 £2200 / £2700
tidyverse Introduction to Data Visualization with Tidyverse and R Belfast City Centre 7 hours Tue, 2018-03-06 09:30 £1100 / £1350
d3js D3.js for Data Visualization Belfast City Centre 7 hours Thu, 2018-03-08 09:30 £1100 / £1350
neo4j Beyond the relational database: neo4j Belfast City Centre 21 hours Tue, 2018-03-13 09:30 £3300 / £4050
fsharpfordatascience F# for Data Science Belfast City Centre 21 hours Wed, 2018-03-21 09:30 £3300 / £4050
OpenNN OpenNN: Implementing neural networks Belfast City Centre 14 hours Wed, 2018-03-21 09:30 £2600 / £3100
pythonmultipurpose Advanced Python Belfast City Centre 28 hours Tue, 2018-03-27 09:30 £4400 / £5400
datavisR1 Introduction to Data Visualization with R Belfast City Centre 28 hours Tue, 2018-03-27 09:30 £5200 / £6200
nlpwithr NLP: Natural Language Processing with R Belfast City Centre 21 hours Wed, 2018-03-28 09:30 £3300 / £4050
BigData_ A practical introduction to Data Analysis and Big Data Belfast City Centre 35 hours Tue, 2018-04-03 09:30 £5500 / £6500
deckgl deck.gl: Visualizing Large-scale Geospatial Data Belfast City Centre 14 hours Tue, 2018-04-03 09:30 £2200 / £2700
kdd Knowledge Discovery in Databases (KDD) Belfast City Centre 21 hours Wed, 2018-04-04 09:30 £3300 / £4050
highcharts Highcharts for Data Visualization Belfast City Centre 7 hours Thu, 2018-04-05 09:30 £1100 / £1350
zeppelin Zeppelin for interactive data analytics Belfast City Centre 14 hours Mon, 2018-04-16 09:30 £2200 / £2700
powerbiforbiandanalytics Power BI for Business Analysts Belfast City Centre 21 hours Mon, 2018-04-23 09:30 £3300 / £4050
datavisualizationreports Data Visualization: Creating Captivating Reports Belfast City Centre 21 hours Wed, 2018-04-25 09:30 £3300 / £4050
druid Druid: Build a fast, real-time data analysis system Belfast City Centre 21 hours Wed, 2018-04-25 09:30 £3300 / £4050
tidyverse Introduction to Data Visualization with Tidyverse and R Belfast City Centre 7 hours Wed, 2018-04-25 09:30 £1100 / £1350
datameer Datameer for Data Analysts Belfast City Centre 14 hours Thu, 2018-04-26 09:30 £2200 / £2700
d3js D3.js for Data Visualization Belfast City Centre 7 hours Tue, 2018-05-01 09:30 £1100 / £1350
octnp Octave not only for programmers Belfast City Centre 21 hours Wed, 2018-05-02 09:30 £3300 / £4050
embeddingprojector Embedding Projector: Visualizing your Training Data Belfast City Centre 14 hours Thu, 2018-05-03 09:30 £2200 / £2700
neo4j Beyond the relational database: neo4j Belfast City Centre 21 hours Wed, 2018-05-09 09:30 £3300 / £4050
OpenNN OpenNN: Implementing neural networks Belfast City Centre 14 hours Thu, 2018-05-10 09:30 £2600 / £3100
fsharpfordatascience F# for Data Science Belfast City Centre 21 hours Mon, 2018-05-14 09:30 £3300 / £4050
datavisR1 Introduction to Data Visualization with R Belfast City Centre 28 hours Mon, 2018-05-21 09:30 £5200 / £6200
nlpwithr NLP: Natural Language Processing with R Belfast City Centre 21 hours Mon, 2018-05-21 09:30 £3300 / £4050
pythonmultipurpose Advanced Python Belfast City Centre 28 hours Tue, 2018-05-22 09:30 £4400 / £5400
deckgl deck.gl: Visualizing Large-scale Geospatial Data Belfast City Centre 14 hours Wed, 2018-05-23 09:30 £2200 / £2700
kdd Knowledge Discovery in Databases (KDD) Belfast City Centre 21 hours Tue, 2018-05-29 09:30 £3300 / £4050
BigData_ A practical introduction to Data Analysis and Big Data Belfast City Centre 35 hours Mon, 2018-06-04 09:30 £5500 / £6750
zeppelin Zeppelin for interactive data analytics Belfast City Centre 14 hours Wed, 2018-06-06 09:30 £2200 / £2700
tidyverse Introduction to Data Visualization with Tidyverse and R Belfast City Centre 7 hours Wed, 2018-06-13 09:30 £1100 / £1350
datavisualizationreports Data Visualization: Creating Captivating Reports Belfast City Centre 21 hours Mon, 2018-06-18 09:30 £3300 / £4050
druid Druid: Build a fast, real-time data analysis system Belfast City Centre 21 hours Mon, 2018-06-18 09:30 £3300 / £4050
datameer Datameer for Data Analysts Belfast City Centre 14 hours Mon, 2018-06-18 09:30 £2200 / £2700
powerbiforbiandanalytics Power BI for Business Analysts Belfast City Centre 21 hours Mon, 2018-06-18 09:30 £3300 / £4050
d3js D3.js for Data Visualization Belfast City Centre 7 hours Thu, 2018-06-21 09:30 £1100 / £1350
octnp Octave not only for programmers Belfast City Centre 21 hours Mon, 2018-06-25 09:30 £3300 / £4050
embeddingprojector Embedding Projector: Visualizing your Training Data Belfast City Centre 14 hours Tue, 2018-06-26 09:30 £2200 / £2700
OpenNN OpenNN: Implementing neural networks Belfast City Centre 14 hours Mon, 2018-07-02 09:30 £2600 / £3100
neo4j Beyond the relational database: neo4j Belfast City Centre 21 hours Tue, 2018-07-03 09:30 £3300 / £4050
fsharpfordatascience F# for Data Science Belfast City Centre 21 hours Wed, 2018-07-04 09:30 £3300 / £4050
highcharts Highcharts for Data Visualization Belfast City Centre 7 hours Wed, 2018-07-11 09:30 £1100 / £1350
datavisR1 Introduction to Data Visualization with R Belfast City Centre 28 hours Mon, 2018-07-16 09:30 £5200 / £6200
pythonmultipurpose Advanced Python Belfast City Centre 28 hours Tue, 2018-07-17 09:30 £4400 / £5400
nlpwithr NLP: Natural Language Processing with R Belfast City Centre 21 hours Wed, 2018-07-18 09:30 £3300 / £4050
deckgl deck.gl: Visualizing Large-scale Geospatial Data Belfast City Centre 14 hours Wed, 2018-07-18 09:30 £2200 / £2700
BigData_ A practical introduction to Data Analysis and Big Data Belfast City Centre 35 hours Mon, 2018-07-30 09:30 £5500 / £6750
kdd Knowledge Discovery in Databases (KDD) Belfast City Centre 21 hours Wed, 2018-08-01 09:30 £3300 / £4050
zeppelin Zeppelin for interactive data analytics Belfast City Centre 14 hours Mon, 2018-08-06 09:30 £2200 / £2700

Course Outlines

Code Name Duration Outline
datameer Datameer for Data Analysts 14 hours

Datameer is a business intelligence and analytics platform built on Hadoop. It allows end-users to access, explore and correlate large-scale, structured, semi-structured and unstructured data in an easy-to-use fashion.

In this instructor-led, live training, participants will learn how to use Datameer to overcome Hadoop's steep learning curve as they step through the setup and analysis of a series of big data sources.

By the end of this training, participants will be able to:

  • Create, curate, and interactively explore an enterprise data lake
  • Access business intelligence data warehouses, transactional databases and other analytic stores
  • Use a spreadsheet user-interface to design end-to-end data processing pipelines
  • Access pre-built functions to explore complex data relationships
  • Use drag-and-drop wizards to visualize data and create dashboards
  • Use tables, charts, graphs, and maps to analyze query results

Audience

  • Data analysts

Format of the course

  • Part lecture, part discussion, exercises and heavy hands-on practice

To request a customized course outline for this training, please contact us.

deckgl deck.gl: Visualizing Large-scale Geospatial Data 14 hours

deck.gl is an open-source, WebGL-powered library for exploring and visualizing data assets at scale. Created by Uber, it is especially useful for gaining insights from geospatial data sources, such as data on maps.

This instructor-led, live training introduces the concepts and functionality behind deck.gl and walks participants through the set up of a demonstration project.

By the end of this training, participants will be able to:

  • Take data from very large collections and turn it into compelling visual representations
  • Visualize data collected from transportation and journey-related use cases, such as pick-up and drop-off experiences, network traffic, etc.
  • Apply layering techniques to geospatial data to depict changes in data over time
  • Integrate deck.gl with React (for Reactive programming) and Mapbox GL (for visualizations on Mapbox based maps).
  • Understand and explore other use cases for deck.gl, including visualizing points collected from a 3D indoor scan, visualizing machine learning models in order to optimize their algorithms, etc.
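
As a quick illustration of how such layers come together, here is a minimal sketch using pydeck, the Python bindings for deck.gl; the coordinates, layer settings, and output file name are placeholders rather than course material.

```python
import pandas as pd
import pydeck as pdk

# Hypothetical pick-up locations (longitude/latitude).
df = pd.DataFrame({
    "lng": [-5.93, -5.92, -5.94, -5.91],
    "lat": [54.60, 54.61, 54.59, 54.60],
})

# Aggregate the points into extruded hexagons, a typical deck.gl layer.
layer = pdk.Layer(
    "HexagonLayer",
    df,
    get_position="[lng, lat]",
    radius=200,
    elevation_scale=4,
    extruded=True,
)

view = pdk.ViewState(latitude=54.60, longitude=-5.93, zoom=11, pitch=45)

# Write a standalone HTML page that renders the scene with deck.gl in the browser.
pdk.Deck(layers=[layer], initial_view_state=view).to_html("pickups.html")
```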

Audience

  • Developers
  • Data scientists

Format of the course

  • Part lecture, part discussion, exercises and heavy hands-on practice

To request a customized course outline for this training, please contact us.

embeddingprojector Embedding Projector: Visualizing your Training Data 14 hours

Embedding Projector is an open-source web application for visualizing the data used to train machine learning systems. Created by Google, it is part of TensorFlow.

This instructor-led, live training introduces the concepts behind Embedding Projector and walks participants through the setup of a demo project.

By the end of this training, participants will be able to:

  • Explore how data is being interpreted by machine learning models
  • Navigate through 3D and 2D views of data to understand how a machine learning algorithm interprets it
  • Understand the concepts behind embeddings and their role in representing images, words and numerals as mathematical vectors.
  • Explore the properties of a specific embedding to understand the behavior of a model
  • Apply Embedding Projector to real-world use cases, such as building a song recommendation system for music lovers
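
A minimal setup sketch, assuming TensorFlow 2 and TensorBoard are installed; the embedding matrix is random placeholder data and the tensor name follows the pattern used in TensorFlow's own Projector tutorial.

```python
import os

import numpy as np
import tensorflow as tf
from tensorboard.plugins import projector

log_dir = "logs/embedding-demo"
os.makedirs(log_dir, exist_ok=True)

# Placeholder 100 x 64 embedding matrix; in practice these vectors come from a trained model.
embedding_var = tf.Variable(np.random.randn(100, 64).astype("float32"), name="embedding")

# One label per row so points are identifiable in the Projector UI.
with open(os.path.join(log_dir, "metadata.tsv"), "w") as f:
    for i in range(100):
        f.write(f"item_{i}\n")

# Save the variable as a checkpoint that the Projector can read.
checkpoint = tf.train.Checkpoint(embedding=embedding_var)
checkpoint.save(os.path.join(log_dir, "embedding.ckpt"))

config = projector.ProjectorConfig()
emb = config.embeddings.add()
emb.tensor_name = "embedding/.ATTRIBUTES/VARIABLE_VALUE"
emb.metadata_path = "metadata.tsv"
projector.visualize_embeddings(log_dir, config)
# Then run: tensorboard --logdir logs/embedding-demo  and open the Projector tab.
```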

Audience

  • Developers
  • Data scientists

Format of the course

  • Part lecture, part discussion, exercises and heavy hands-on practice

To request a customized course outline for this training, please contact us.

datavis1 Data Visualization 28 hours

This course is intended for engineers and decision makers working in data mining and knowledge discovery.

You will learn how to create effective plots and how to present and represent your data in ways that appeal to decision makers and help them uncover hidden information.

Day 1:

  • what is data visualization
  • why it is important
  • data visualization vs data mining
  • human cognition
  • HMI
  • common pitfalls

Day 2:

  • different type of curves
  • drill down curves
  • categorical data plotting
  • multi variable plots
  • data glyph and icon representation

Day 3:

  • plotting KPIs with data
  • R and X charts examples
  • what if dashboards
  • parallel axes mixing
  • categorical data with numeric data

Day 4:

  • different hats of data visualization
  • how can data visualization lie
  • disguised and hidden trends
  • a case study of student data
  • visual queries and region selection
datavisualizationreports Data Visualization: Creating Captivating Reports 21 hours

In this instructor-led, live training, participants will learn the skills, strategies, tools and approaches for visualizing and reporting data for different audiences. Case studies are also analyzed and discussed to exemplify how data visualization solutions are being applied in the real world to derive meaning out of data and answer crucial questions.

By the end of this training, participants will be able to:

  • Write reports with captivating titles, subtitles, and annotations using the most suitable highlighting, alignment, and color schemes for readability and user friendliness.
  • Design charts that fit the audience's information needs and interests
  • Choose the best chart types for a given dataset (beyond pie charts and bar charts)
  • Identify and analyze the most valuable and relevant data quickly and efficiently
  • Select the best file formats to include in reports (graphs, infographics, references, GIFs, etc.)
  • Create effective layouts for displaying time series data, part-to-whole relationships, geographic patterns, and nested data
  • Use effective color-coding to display qualitative and text-based data such as sentiment analysis, timelines, calendars, and diagrams
  • Apply the most suitable tools for the job (Excel, R, Tableau, mapping programs, etc.)
  • Prepare datasets for visualization
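
To make the decluttering idea concrete, here is a small matplotlib sketch (one possible tool among those listed above) that direct-labels a two-line time series and removes chart junk; the product names and figures are invented for illustration.

```python
import matplotlib.pyplot as plt

# Invented monthly sales for two product lines.
months = list(range(1, 13))
product_a = [12, 14, 15, 17, 18, 21, 22, 25, 27, 28, 30, 33]
product_b = [10, 11, 11, 12, 13, 13, 14, 15, 15, 16, 17, 17]

fig, ax = plt.subplots(figsize=(7, 3.5))
ax.plot(months, product_a, color="#1f77b4", linewidth=2)
ax.plot(months, product_b, color="#bbbbbb", linewidth=2)

# Declutter: no legend, no box; label the lines directly at their endpoints.
for side in ("top", "right"):
    ax.spines[side].set_visible(False)
ax.text(12.2, product_a[-1], "Product A", color="#1f77b4", va="center")
ax.text(12.2, product_b[-1], "Product B", color="#777777", va="center")

# A title that states the takeaway rather than just naming the chart.
ax.set_title("Product A keeps pulling ahead", loc="left", fontweight="bold")
ax.set_xlabel("Month")
ax.set_ylabel("Units sold (thousands)")
fig.tight_layout()
plt.show()
```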

Audience

  • Data analysts
  • Business managers

Format of the course

  • Part lecture, part discussion, exercises and heavy hands-on practice

Introduction to data visualization

Selecting and creating effective reports

Data visualization tools and resources

Generating and revising your visualizations

Closing remarks

datavisR1 Introduction to Data Visualization with R 28 hours

This course is intended for data engineers, decision makers, and data analysts. It will teach you to create highly effective plots in RStudio that appeal to decision makers, help them uncover hidden information, and support the right decisions.

 

Day 1:

  • overview of R programming
  • introduction to data visualization
  • scatter plots and clusters
  • the use of noise and jitters

Day 2:

  • other type of 2D and 3D plots
  • histograms
  • heat charts
  • categorical data plotting

Day 3:

  • plotting KPIs with data
  • R and X charts examples
  • dashboards
  • parallel axes
  • mixing categorical data with numeric data

Day 4:

  • different hats of data visualization
  • disguised and hidden trends
  • case studies
  • saving plots and loading Excel files
fsharpfordatascience F# for Data Science 21 hours

Data science is the application of statistical analysis, machine learning, data visualization and programming for the purpose of understanding and interpreting real-world data. F# is a well-suited programming language for data science as it combines efficient execution, REPL scripting, powerful libraries and scalable data integration.

In this instructor-led, live training, participants will learn how to use F# to solve a series of real-world data science problems.

By the end of this training, participants will be able to:

  • Use F#'s integrated data science packages
  • Use F# to interoperate with other languages and platforms, including Excel, R, Matlab, and Python
  • Use the Deedle package to solve time series problems
  • Carry out advanced analysis with minimal lines of production-quality code
  • Understand how functional programming is a natural fit for scientific and big data computations
  • Access and visualize data with F#
  • Apply F# for machine learning
  • Explore solutions for problems in domains such as business intelligence and social gaming

Audience

  • Developers
  • Data scientists

Format of the course

  • Part lecture, part discussion, exercises and heavy hands-on practice

To request a customized course outline for this training, please contact us.

deepmclrg Machine Learning & Deep Learning with Python and R 14 hours

MACHINE LEARNING

1: Introducing Machine Learning

  • The origins of machine learning
  • Uses and abuses of machine learning
  • Ethical considerations
  • How do machines learn?
  • Abstraction and knowledge representation
  • Generalization
  • Assessing the success of learning
  • Steps to apply machine learning to your data
  • Choosing a machine learning algorithm
  • Thinking about the input data
  • Thinking about types of machine learning algorithms
  • Matching your data to an appropriate algorithm
  • Using R for machine learning
  • Installing and loading R packages
  • Installing an R package
  • Installing a package using the point-and-click interface
  • Loading an R package
  • Summary

2: Managing and Understanding Data

  • R data structures
  • Vectors
  • Factors
  • Lists
  • Data frames
  • Matrixes and arrays
  • Managing data with R
  • Saving and loading R data structures
  • Importing and saving data from CSV files
  • Importing data from SQL databases
  • Exploring and understanding data
  • Exploring the structure of data
  • Exploring numeric variables
  • Measuring the central tendency – mean and median
  • Measuring spread – quartiles and the five-number summary
  • Visualizing numeric variables – boxplots
  • Visualizing numeric variables – histograms
  • Understanding numeric data – uniform and normal distributions
  • Measuring spread – variance and standard deviation
  • Exploring categorical variables
  • Measuring the central tendency – the mode
  • Exploring relationships between variables
  • Visualizing relationships – scatterplots
  • Examining relationships – two-way cross-tabulations
  • Summary

3: Lazy Learning – Classification Using Nearest Neighbors

  • Understanding classification using nearest neighbors
  • The kNN algorithm
  • Calculating distance
  • Choosing an appropriate k
  • Preparing data for use with kNN
  • Why is the kNN algorithm lazy?
  • Diagnosing breast cancer with the kNN algorithm
    • Step 1 – collecting data
    • Step 2 – exploring and preparing the data
  • Transformation – normalizing numeric data
  • Data preparation – creating training and test datasets
    • Step 3 – training a model on the data
    • Step 4 – evaluating model performance
    • Step 5 – improving model performance
  • Transformation – z-score standardization
  • Testing alternative values of k
  • Summary
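
The chapter itself works in R; purely to illustrate the same workflow, here is a scikit-learn sketch in Python that normalizes the breast-cancer measurements and classifies them with kNN.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import MinMaxScaler

# Collect and split the data.
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0)

# Normalize so no single measurement dominates the distance calculation.
scaler = MinMaxScaler().fit(X_train)
X_train, X_test = scaler.transform(X_train), scaler.transform(X_test)

# Train and evaluate; k is typically tuned, sqrt(n) is a common starting point.
knn = KNeighborsClassifier(n_neighbors=21).fit(X_train, y_train)
print("test accuracy:", knn.score(X_test, y_test))
```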

4: Probabilistic Learning – Classification Using Naive Bayes

  • Understanding naive Bayes
  • Basic concepts of Bayesian methods
  • Probability
  • Joint probability
  • Conditional probability with Bayes' theorem
  • The naive Bayes algorithm
  • The naive Bayes classification
  • The Laplace estimator
  • Using numeric features with naive Bayes
  • Example – filtering mobile phone spam with the naive Bayes algorithm
    • Step 1 – collecting data
    • Step 2 – exploring and preparing the data
  • Data preparation – processing text data for analysis
  • Data preparation – creating training and test datasets
  • Visualizing text data – word clouds
  • Data preparation – creating indicator features for frequent words
    • Step 3 – training a model on the data
    • Step 4 – evaluating model performance
    • Step 5 – improving model performance
  • Summary
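
Again for illustration only (the chapter uses R), a compact scikit-learn version of the spam-filtering idea: bag-of-words counts feeding a multinomial naive Bayes classifier with Laplace smoothing. The four messages are made-up stand-ins for the SMS corpus.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

texts = [
    "win a free prize now",
    "call me when you get home",
    "free entry in a weekly draw",
    "are we still on for lunch",
]
labels = ["spam", "ham", "spam", "ham"]

# alpha=1.0 is the Laplace estimator discussed above.
model = make_pipeline(CountVectorizer(), MultinomialNB(alpha=1.0))
model.fit(texts, labels)

print(model.predict(["free prize draw", "lunch at home"]))
```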

5: Divide and Conquer – Classification Using Decision Trees and Rules

  • Understanding decision trees
  • Divide and conquer
  • The C5.0 decision tree algorithm
  • Choosing the best split
  • Pruning the decision tree
  • Example – identifying risky bank loans using C5.0 decision trees
    • Step 1 – collecting data
    • Step 2 – exploring and preparing the data
  • Data preparation – creating random training and test datasets
    • Step 3 – training a model on the data
    • Step 4 – evaluating model performance
    • Step 5 – improving model performance
  • Boosting the accuracy of decision trees
  • Making some mistakes more costly than others
  • Understanding classification rules
  • Separate and conquer
  • The One Rule algorithm
  • The RIPPER algorithm
  • Rules from decision trees
  • Example – identifying poisonous mushrooms with rule learners
    • Step 1 – collecting data
    • Step 2 – exploring and preparing the data
    • Step 3 – training a model on the data
    • Step 4 – evaluating model performance
    • Step 5 – improving model performance
  • Summary

6: Forecasting Numeric Data – Regression Methods

  • Understanding regression
  • Simple linear regression
  • Ordinary least squares estimation
  • Correlations
  • Multiple linear regression
  • Example – predicting medical expenses using linear regression
    • Step 1 – collecting data
    • Step 2 – exploring and preparing the data
  • Exploring relationships among features – the correlation matrix
  • Visualizing relationships among features – the scatterplot matrix
    • Step 3 – training a model on the data
    • Step 4 – evaluating model performance
    • Step 5 – improving model performance
  • Model specification – adding non-linear relationships
  • Transformation – converting a numeric variable to a binary indicator
  • Model specification – adding interaction effects
  • Putting it all together – an improved regression model
  • Understanding regression trees and model trees
  • Adding regression to trees
  • Example – estimating the quality of wines with regression trees and model trees
    • Step 1 – collecting data
    • Step 2 – exploring and preparing the data
    • Step 3 – training a model on the data
  • Visualizing decision trees
    • Step 4 – evaluating model performance
  • Measuring performance with mean absolute error
    • Step 5 – improving model performance
  • Summary
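
A hedged Python sketch of the medical-expenses idea using scikit-learn (the chapter's own code is in R); the age, BMI, and expense figures are synthetic.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error

# Synthetic stand-in: expenses driven by age and BMI plus noise.
rng = np.random.default_rng(1)
age = rng.integers(18, 65, size=200)
bmi = rng.normal(28, 4, size=200)
expenses = 250 * age + 120 * bmi + rng.normal(0, 2000, size=200)

X = np.column_stack([age, bmi])
model = LinearRegression().fit(X, expenses)

print("coefficients:", model.coef_, "intercept:", model.intercept_)
print("MAE:", mean_absolute_error(expenses, model.predict(X)))
```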

7: Black Box Methods – Neural Networks and Support Vector Machines

  • Understanding neural networks
  • From biological to artificial neurons
  • Activation functions
  • Network topology
  • The number of layers
  • The direction of information travel
  • The number of nodes in each layer
  • Training neural networks with backpropagation
  • Modeling the strength of concrete with ANNs
    • Step 1 – collecting data
    • Step 2 – exploring and preparing the data
    • Step 3 – training a model on the data
    • Step 4 – evaluating model performance
    • Step 5 – improving model performance
  • Understanding Support Vector Machines
  • Classification with hyperplanes
  • Finding the maximum margin
  • The case of linearly separable data
  • The case of non-linearly separable data
  • Using kernels for non-linear spaces
  • Performing OCR with SVMs
    • Step 1 – collecting data
    • Step 2 – exploring and preparing the data
    • Step 3 – training a model on the data
    • Step 4 – evaluating model performance
    • Step 5 – improving model performance
  • Summary

8: Finding Patterns – Market Basket Analysis Using Association Rules

  • Understanding association rules
  • The Apriori algorithm for association rule learning
  • Measuring rule interest – support and confidence
  • Building a set of rules with the Apriori principle
  • Example – identifying frequently purchased groceries with association rules
    • Step 1 – collecting data
    • Step 2 – exploring and preparing the data
  • Data preparation – creating a sparse matrix for transaction data
  • Visualizing item support – item frequency plots
  • Visualizing transaction data – plotting the sparse matrix
    • Step 3 – training a model on the data
    • Step 4 – evaluating model performance
    • Step 5 – improving model performance
  • Sorting the set of association rules
  • Taking subsets of association rules
  • Saving association rules to a file or data frame
  • Summary
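
For a Python view of the same Apriori workflow (the chapter uses R's arules), here is a sketch that assumes the third-party mlxtend package; the four transactions are invented.

```python
import pandas as pd
from mlxtend.frequent_patterns import apriori, association_rules
from mlxtend.preprocessing import TransactionEncoder

transactions = [
    ["milk", "bread", "butter"],
    ["bread", "butter"],
    ["milk", "bread"],
    ["milk", "butter"],
]

# One-hot encode the transactions into an item matrix.
te = TransactionEncoder()
onehot = pd.DataFrame(te.fit_transform(transactions), columns=te.columns_)

# Mine frequent itemsets, then derive rules ranked by confidence.
itemsets = apriori(onehot, min_support=0.5, use_colnames=True)
rules = association_rules(itemsets, metric="confidence", min_threshold=0.6)
print(rules[["antecedents", "consequents", "support", "confidence"]])
```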

9: Finding Groups of Data – Clustering with k-means

  • Understanding clustering
  • Clustering as a machine learning task
  • The k-means algorithm for clustering
  • Using distance to assign and update clusters
  • Choosing the appropriate number of clusters
  • Finding teen market segments using k-means clustering
    • Step 1 – collecting data
    • Step 2 – exploring and preparing the data
  • Data preparation – dummy coding missing values
  • Data preparation – imputing missing values
    • Step 3 – training a model on the data
    • Step 4 – evaluating model performance
    • Step 5 – improving model performance
  • Summary
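
An illustrative scikit-learn rendering of the clustering step (the chapter works in R); the interest-score matrix below is synthetic.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Two synthetic "segments" of users, each described by five interest scores.
rng = np.random.default_rng(42)
X = np.vstack([rng.normal(0, 1, (100, 5)),
               rng.normal(3, 1, (100, 5))])

# Standardize, then assign each user to one of k clusters.
X_scaled = StandardScaler().fit_transform(X)
km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X_scaled)

print("cluster sizes:", np.bincount(km.labels_))
print("cluster centers:\n", km.cluster_centers_)
```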

10: Evaluating Model Performance

  • Measuring performance for classification
  • Working with classification prediction data in R
  • A closer look at confusion matrices
  • Using confusion matrices to measure performance
  • Beyond accuracy – other measures of performance
  • The kappa statistic
  • Sensitivity and specificity
  • Precision and recall
  • The F-measure
  • Visualizing performance tradeoffs
  • ROC curves
  • Estimating future performance
  • The holdout method
  • Cross-validation
  • Bootstrap sampling
  • Summary
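
Most of these measures have direct scikit-learn equivalents; as a quick Python reference (the chapter computes them in R), with made-up predictions:

```python
from sklearn.metrics import (cohen_kappa_score, confusion_matrix, f1_score,
                             precision_score, recall_score, roc_auc_score)

# Invented ground truth, hard predictions, and predicted probabilities.
y_true = [1, 0, 1, 1, 0, 1, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 0, 1, 1, 0, 1, 0]
y_score = [0.9, 0.2, 0.8, 0.4, 0.1, 0.7, 0.6, 0.3, 0.95, 0.05]

print(confusion_matrix(y_true, y_pred))
print("kappa:    ", cohen_kappa_score(y_true, y_pred))
print("precision:", precision_score(y_true, y_pred))
print("recall:   ", recall_score(y_true, y_pred))
print("F-measure:", f1_score(y_true, y_pred))
print("ROC AUC:  ", roc_auc_score(y_true, y_score))
```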

11: Improving Model Performance

  • Tuning stock models for better performance
  • Using caret for automated parameter tuning
  • Creating a simple tuned model
  • Customizing the tuning process
  • Improving model performance with meta-learning
  • Understanding ensembles
  • Bagging
  • Boosting
  • Random forests
  • Training random forests
  • Evaluating random forest performance
  • Summary

DEEP LEARNING with R

1: Getting Started with Deep Learning

  • What is deep learning?
  • Conceptual overview of neural networks
  • Deep neural networks
  • R packages for deep learning
  • Setting up reproducible results
  • Neural networks
  • The deepnet package
  • The darch package
  • The H2O package
  • Connecting R and H2O
  • Initializing H2O
  • Linking datasets to an H2O cluster
  • Summary

2: Training a Prediction Model

  • Neural networks in R
  • Building a neural network
  • Generating predictions from a neural network
  • The problem of overfitting data – the consequences explained
  • Use case – build and apply a neural network
  • Summary

3: Preventing Overfitting

  • L1 penalty
  • L1 penalty in action
  • L2 penalty
  • L2 penalty in action
  • Weight decay (L2 penalty in neural networks)
  • Ensembles and model averaging
  • Use case – improving out-of-sample model performance using dropout
  • Summary

4: Identifying Anomalous Data

  • Getting started with unsupervised learning
  • How do auto-encoders work?
  • Regularized auto-encoders
  • Penalized auto-encoders
  • Denoising auto-encoders
  • Training an auto-encoder in R
  • Use case – building and applying an auto-encoder model
  • Fine-tuning auto-encoder models
  • Summary

5: Training Deep Prediction Models

  • Getting started with deep feedforward neural networks
  • Common activation functions – rectifiers, hyperbolic tangent, and maxout
  • Picking hyperparameters
  • Training and predicting new data from a deep neural network
  • Use case – training a deep neural network for automatic classification
  • Working with model results
  • Summary

6: Tuning and Optimizing Models

  • Dealing with missing data
  • Solutions for models with low accuracy
  • Grid search
  • Random search
  • Summary

DEEP LEARNING WITH PYTHON

I Introduction

1 Welcome

  • Deep Learning The Wrong Way
  • Deep Learning With Python
  • Summary

II Background

2 Introduction to Theano

  • What is Theano?
  • How to Install Theano
  • Simple Theano Example
  • Extensions and Wrappers for Theano
  • More Theano Resources
  • Summary

3 Introduction to TensorFlow

  • What is TensorFlow?
  • How to Install TensorFlow
  • Your First Examples in TensorFlow
  • Simple TensorFlow Example
  • More Deep Learning Models
  • Summary

4 Introduction to Keras

  • What is Keras?
  • How to Install Keras
  • Theano and TensorFlow Backends for Keras
  • Build Deep Learning Models with Keras
  • Summary

5 Project: Develop Large Models on GPUs Cheaply In the Cloud

  • Project Overview
  • Setup Your AWS Account
  • Launch Your Server Instance
  • Login, Configure and Run
  • Build and Run Models on AWS
  • Close Your EC2 Instance
  • Tips and Tricks for Using Keras on AWS
  • More Resources For Deep Learning on AWS
  • Summary

III Multilayer Perceptrons

6 Crash Course In Multilayer Perceptrons

  • Crash Course Overview
  • Multilayer Perceptrons
  • Neurons
  • Networks of Neurons
  • Training Networks
  • Summary

7 Develop Your First Neural Network With Keras

  • Tutorial Overview
  • Pima Indians Onset of Diabetes Dataset
  • Load Data
  • Define Model
  • Compile Model
  • Fit Model
  • Evaluate Model
  • Tie It All Together
  • Summary
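
A condensed sketch of the define-compile-fit-evaluate workflow this chapter walks through, using the Keras Sequential API; random data stands in for the Pima Indians dataset.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

# Random stand-in for the Pima data: 8 numeric features, binary outcome.
rng = np.random.default_rng(7)
X = rng.normal(size=(768, 8)).astype("float32")
y = (rng.random(768) > 0.65).astype("float32")

# Define the model.
model = keras.Sequential([
    layers.Input(shape=(8,)),
    layers.Dense(12, activation="relu"),
    layers.Dense(8, activation="relu"),
    layers.Dense(1, activation="sigmoid"),
])

# Compile, fit, evaluate.
model.compile(loss="binary_crossentropy", optimizer="adam", metrics=["accuracy"])
model.fit(X, y, epochs=20, batch_size=10, verbose=0)
loss, acc = model.evaluate(X, y, verbose=0)
print(f"accuracy: {acc:.2f}")
```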

8 Evaluate The Performance of Deep Learning Models

  • Empirically Evaluate Network Configurations
  • Data Splitting
  • Manual k-Fold Cross Validation
  • Summary

9 Use Keras Models With Scikit-Learn For General Machine Learning

  • Overview
  • Evaluate Models with Cross Validation
  • Grid Search Deep Learning Model Parameters
  • Summary

10 Project: Multiclass Classification Of Flower Species

  • Iris Flowers Classification Dataset
  • Import Classes and Functions
  • Initialize Random Number Generator
  • Load The Dataset
  • Encode The Output Variable
  • Define The Neural Network Model
  • Evaluate The Model with k-Fold Cross Validation
  • Summary

11 Project: Binary Classification Of Sonar Returns

  • Sonar Object Classification Dataset
  • Baseline Neural Network Model Performance
  • Improve Performance With Data Preparation
  • Tuning Layers and Neurons in The Model
  • Summary

12 Project: Regression Of Boston House Prices

  • Boston House Price Dataset
  • Develop a Baseline Neural Network Model
  • Lift Performance By Standardizing The Dataset
  • Tune The Neural Network Topology
  • Summary

IV Advanced Multilayer Perceptrons and Keras

13 Save Your Models For Later With Serialization

  • Tutorial Overview
  • Save Your Neural Network Model to JSON
  • Save Your Neural Network Model to YAML
  • Summary

14 Keep The Best Models During Training With Checkpointing

  • Checkpointing Neural Network Models
  • Checkpoint Neural Network Model Improvements
  • Checkpoint Best Neural Network Model Only
  • Loading a Saved Neural Network Model
  • Summary

15 Understand Model Behavior During Training By Plotting History

  • Access Model Training History in Keras
  • Visualize Model Training History in Keras
  • Summary

16 Reduce Overfitting With Dropout Regularization

  • Dropout Regularization For Neural Networks
  • Dropout Regularization in Keras
  • Using Dropout on the Visible Layer
  • Using Dropout on Hidden Layers
  • Tips For Using Dropout
  • Summary
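
A minimal sketch of dropout on both the visible (input) layer and a hidden layer in Keras; the 60-feature input shape is an assumption made purely for illustration.

```python
from tensorflow import keras
from tensorflow.keras import layers

# Dropout randomly zeroes a fraction of activations during training,
# discouraging co-adaptation between neurons and reducing overfitting.
model = keras.Sequential([
    layers.Input(shape=(60,)),
    layers.Dropout(0.2),                  # dropout on the visible layer
    layers.Dense(64, activation="relu"),
    layers.Dropout(0.5),                  # dropout on a hidden layer
    layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.summary()
```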

17 Lift Performance With Learning Rate Schedules

  • Learning Rate Schedule For Training Models
  • Ionosphere Classification Dataset
  • Time-Based Learning Rate Schedule
  • Drop-Based Learning Rate Schedule
  • Tips for Using Learning Rate Schedules
  • Summary

V Convolutional Neural Networks

18 Crash Course In Convolutional Neural Networks

  • The Case for Convolutional Neural Networks
  • Building Blocks of Convolutional Neural Networks
  • Convolutional Layers
  • Pooling Layers
  • Fully Connected Layers
  • Worked Example
  • Convolutional Neural Networks Best Practices
  • Summary

19 Project: Handwritten Digit Recognition

  • Handwritten Digit Recognition Dataset
  • Loading the MNIST dataset in Keras
  • Baseline Model with Multilayer Perceptrons
  • Simple Convolutional Neural Network for MNIST
  • Larger Convolutional Neural Network for MNIST
  • Summary

20 Improve Model Performance With Image Augmentation

  • Keras Image Augmentation API
  • Point of Comparison for Image Augmentation
  • Feature Standardization
  • ZCA Whitening
  • Random Rotations
  • Random Shifts
  • Random Flips
  • Saving Augmented Images to File
  • Tips For Augmenting Image Data with Keras
  • Summary

21 Project Object Recognition in Photographs

  • Photograph Object Recognition Dataset
  • Loading The CIFAR-10 Dataset in Keras
  • Simple CNN for CIFAR-10
  • Larger CNN for CIFAR-10
  • Extensions To Improve Model Performance
  • Summary

22 Project: Predict Sentiment From Movie Reviews

  • Movie Review Sentiment Classification Dataset
  • Load the IMDB Dataset With Keras
  • Word Embeddings
  • Simple Multilayer Perceptron Model
  • One-Dimensional Convolutional Neural Network
  • Summary

VI Recurrent Neural Networks

23 Crash Course In Recurrent Neural Networks

  • Support For Sequences in Neural Networks
  • Recurrent Neural Networks
  • Long Short-Term Memory Networks
  • Summary

24 Time Series Prediction with Multilayer Perceptrons

  • Problem Description: Time Series Prediction
  • Multilayer Perceptron Regression
  • Multilayer Perceptron Using the Window Method
  • Summary

25 Time Series Prediction with LSTM Recurrent Neural Networks

  • LSTM Network For Regression
  • LSTM For Regression Using the Window Method
  • LSTM For Regression with Time Steps
  • LSTM With Memory Between Batches
  • Stacked LSTMs With Memory Between Batches
  • Summary
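
A toy sketch of the window method with a Keras LSTM; a sine wave stands in for the time series used in the chapter.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

# Synthetic univariate series.
series = np.sin(np.linspace(0, 20, 200)).astype("float32")

# Window method: use the previous 3 values to predict the next one.
window = 3
X = np.array([series[i:i + window] for i in range(len(series) - window)])
y = series[window:]
X = X.reshape((X.shape[0], window, 1))   # samples, time steps, features

model = keras.Sequential([
    layers.Input(shape=(window, 1)),
    layers.LSTM(16),
    layers.Dense(1),
])
model.compile(optimizer="adam", loss="mse")
model.fit(X, y, epochs=10, batch_size=16, verbose=0)

print(model.predict(X[:1], verbose=0))
```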

26 Project: Sequence Classification of Movie Reviews

  • Simple LSTM for Sequence Classification
  • LSTM For Sequence Classification With Dropout
  • LSTM and CNN For Sequence Classification
  • Summary

27 Understanding Stateful LSTM Recurrent Neural Networks

  • Problem Description: Learn the Alphabet
  • LSTM for Learning One-Char to One-Char Mapping
  • LSTM for a Feature Window to One-Char Mapping
  • LSTM for a Time Step Window to One-Char Mapping
  • LSTM State Maintained Between Samples Within A Batch
  • Stateful LSTM for a One-Char to One-Char Mapping
  • LSTM with Variable Length Input to One-Char Output
  • Summary

28 Project: Text Generation With Alice in Wonderland

  • Problem Description: Text Generation
  • Develop a Small LSTM Recurrent Neural Network
  • Generating Text with an LSTM Network
  • Larger LSTM Recurrent Neural Network
  • Extension Ideas to Improve the Model
  • Summary
tidyverse Introduction to Data Visualization with Tidyverse and R 7 hours

The Tidyverse is a collection of versatile R packages for cleaning, processing, modeling, and visualizing data. Some of the packages included are: ggplot2, dplyr, tidyr, readr, purrr, and tibble.

In this instructor-led, live training, participants will learn how to manipulate and visualize data using the tools included in the Tidyverse.

By the end of this training, participants will be able to:

  • Perform data analysis and create appealing visualizations
  • Draw useful conclusions from various datasets of sample data
  • Filter, sort and summarize data to answer exploratory questions
  • Turn processed data into informative line plots, bar plots, histograms
  • Import and filter data from diverse data sources, including Excel, CSV, and SPSS files

Audience

  • Beginners to the R language
  • Beginners to data analysis and data visualization

Format of the course

  • Part lecture, part discussion, exercises and heavy hands-on practice

Introduction
    Tidyverse vs traditional R plotting

Setting up your working environment

Preparing the dataset

Importing and filtering data

Wrangling the data

Visualizing the data (graphs, scatter plots)

Grouping and summarizing the data

Visualizing the data (line plots, bar plots, histograms, boxplots)

Working with non-standard data

Closing remarks

neo4j Beyond the relational database: neo4j 21 hours

Relational, table-based databases such as Oracle and MySQL have long been the standard for organizing and storing data. However, the growing size and fluidity of data have made it difficult for these traditional systems to efficiently execute highly complex queries on the data. Imagine replacing rows-and-columns-based data storage with object-based data storage, whereby entities (e.g., a person) could be stored as data nodes, then easily queried on the basis of their vast, multi-linear relationships with other nodes. And imagine querying these connections and their associated objects and properties using a compact syntax, up to 20 times lighter than SQL. This is what graph databases, such as neo4j, offer.

In this hands-on course, we will set up a live project and put into practice the skills to model, manage and access your data. We contrast and compare graph databases with SQL-based databases as well as other NoSQL databases and clarify when and where it makes sense to implement each within your infrastructure.

Audience

  • Database administrators (DBAs)
  • Data analysts
  • Developers
  • System Administrators
  • DevOps engineers
  • Business Analysts
  • CTOs
  • CIOs

Format of the course

  • Heavy emphasis on hands-on practice. Most of the concepts are learned through samples, exercises and hands-on development.

Getting started with neo4j

  • neo4j vs relational databases
  • neo4j vs other NoSQL databases
  • Using neo4j to solve real world problems
  • Installing neo4j

Data modeling with neo4j

  • Mapping white-board diagrams and mind maps to neo4j

Working with nodes

  • Creating, changing and deleting nodes
  • Defining node properties

Node relationships

  • Creating and deleting relationships
  • Bi-directional relationships

Querying your data with Cypher

  • Querying your data based on relationships
  • MATCH, RETURN, WHERE, REMOVE, MERGE, etc.
  • Setting indexes and constraints
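
As a small taste of Cypher in practice, the sketch below uses the official neo4j Python driver against a hypothetical local instance (the connection URI and credentials are placeholders) to MERGE two nodes, relate them, and MATCH the result.

```python
from neo4j import GraphDatabase

# Placeholder connection details for a local neo4j instance.
driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password"))

with driver.session() as session:
    # MERGE creates the nodes and relationship only if they do not already exist.
    session.run(
        "MERGE (a:Person {name: $a}) "
        "MERGE (b:Person {name: $b}) "
        "MERGE (a)-[:KNOWS]->(b)",
        a="Ada", b="Grace",
    )
    result = session.run(
        "MATCH (p:Person)-[:KNOWS]->(friend) "
        "RETURN p.name AS name, friend.name AS knows"
    )
    for record in result:
        print(record["name"], "knows", record["knows"])

driver.close()
```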

Working with the REST API

  • REST operations on nodes
  • REST operations on relationships
  • REST operations on indexes and constraints

Accessing the core API for application development

  • Working with the .NET, Java, JavaScript, and Python APIs

Closing remarks

 

powerbiforbiandanalytics Power BI for Business Analysts 21 hours

Microsoft Power BI is a free Software as a Service (SaaS) suite for analyzing data and sharing insights. Power BI dashboards provide a 360-degree view of the most important metrics in one place, updated in real time, and available on all of a user's devices.

In this instructor-led, live training, participants will learn how to use Microsoft Power BI to analyze and visualize data using a series of sample data sets.

By the end of this training, participants will be able to:

  • Create visually compelling dashboards that provide valuable insights into data
  • Obtain and integrate data from multiple data sources
  • Build and share visualizations with team members
  • Adjust data with Power BI Desktop

Audience

  • Business managers
  • Business analysts
  • Data analysts
  • Business Intelligence (BI) and Data Warehouse (DW) teams
  • Report developers

Format of the course

  • Part lecture, part discussion, exercises and heavy hands-on practice

 

Introduction

Data Visualization

  • Authoring in Power BI Desktop
  • Creating reports
  • Interacting with reports
  • Uploading reports to the Power BI Service
  • Revising report layouts
  • Publishing to PowerBI.com
  • Sharing and collaborating with team members

Data Modeling

  • Acquiring data
  • Modeling data
  • Security
  • Working with DAX
  • Refreshing the source data
  • Securing data

Advanced querying and data modeling

  • Data modeling principles
  • Complex DAX patterns
  • Power BI tips and tricks

Closing remarks

kdd Knowledge Discovery in Databases (KDD) 21 hours

Knowledge discovery in databases (KDD) is the process of discovering useful knowledge from a collection of data. Real-life applications for this data mining technique include marketing, fraud detection, telecommunication and manufacturing.

In this course, we introduce the processes involved in KDD and carry out a series of exercises to practice the implementation of those processes.

Audience
    Data analysts or anyone interested in learning how to interpret data to solve problems

Format of the course
    After a theoretical discussion of KDD, the instructor will present real-life cases which call for the application of KDD to solve a problem. Participants will prepare, select and cleanse sample data sets and use their prior knowledge about the data to propose solutions based on the results of their observations.

Introduction
    KDD vs data mining

Establishing the application domain

Establishing relevant prior knowledge

Understanding the goal of the investigation

Creating a target data set

Data cleaning and preprocessing

Data reduction and projection

Choosing the data mining task

Choosing the data mining algorithms

Interpreting the mined patterns

d3js D3.js for Data Visualization 7 hours

D3.js (or D3 for Data-Driven Documents) is a JavaScript library that uses SVG, HTML5, and CSS for producing dynamic, interactive data visualizations in web browsers.

In this instructor-led, live training, participants will learn how to create web-based data-driven visualizations that run on multiple devices responsively.

By the end of this training, participants will be able to:

  • Use D3 to create interactive graphics, information dashboards, infographics and maps
  • Control HTML with jQuery-like selections
  • Transform the DOM by selecting elements and joining to data
  • Export SVG for use in print publications

Audience

  • Developers
  • Data scientists

Format of the course

  • Part lecture, part discussion, exercises and heavy hands-on practice

Introduction

Overview of the data visualization process

Data visualization components: HTML, CSS, Javascript, DOM, D3, SVG

D3 methods: scaling, events, transitions, and animations

Attaching your data to DOM (Document Object Model) elements

Using CSS3, HTML, and/or SVG to showcase data

Making data interactive with D3.js data-driven transformations and transitions

Working with layouts

Exporting SVG

Closing remarks

OpenNN OpenNN: Implementing neural networks 14 hours

OpenNN is an open-source class library, written in C++, that implements neural networks for use in machine learning.

In this course we go over the principles of neural networks and use OpenNN to implement a sample application.

Audience
    Software developers and programmers wishing to create Deep Learning applications.

Format of the course
    Lecture and discussion coupled with hands-on exercises.

Introduction to OpenNN, Machine Learning and Deep Learning

Downloading OpenNN

Working with Neural Designer
    Using Neural Designer for descriptive, diagnostic, predictive and prescriptive analytics

OpenNN architecture
    CPU parallelization

OpenNN classes
    Data set, neural network, loss index, training strategy, model selection, testing analysis
    Vector and matrix templates

Building a neural network application
    Choosing a suitable neural network
    Formulating the variational problem (loss index)
    Solving the reduced function optimization problem (training strategy)

Working with datasets
     The data matrix (columns as variables and rows as instances)

Learning tasks
    Function regression
    Pattern recognition

Compiling with QT Creator

Integrating, testing and debugging your application

The future of neural networks and OpenNN

highcharts Highcharts for Data Visualization 7 hours

Highcharts is a JavaScript library for creating interactive graphical charts on the Web. It is commonly used to represent data in a more user-readable and interactive fashion.

In this instructor-led, live training, participants will learn how to create high-quality data visualizations for web applications using Highcharts.

By the end of this training, participants will be able to:

  • Set up interactive charts on the Web using only HTML and JavaScript
  • Represent large datasets in visually interesting and interactive ways
  • Export charts to JPEG, PNG, SVG, or PDF
  • Integrate Highcharts with jQuery Mobile for cross-platform compatibility

Audience

  • Developers

Format of the course

  • Part lecture, part discussion, exercises and heavy hands-on practice

Introduction

Configuring Highcharts

Chart Types, Series Types, Layouts, Options, and Styling

Working with Data: CSV, XML, and JSON using AJAX

Enabling User Interaction

Sharing Charts

Highcharts on Mobile Platforms

Closing Remarks

druid Druid: Build a fast, real-time data analysis system 21 hours

Druid is an open-source, column-oriented, distributed data store written in Java. It was designed to quickly ingest massive quantities of event data and execute low-latency OLAP queries on that data. Druid is commonly used in business intelligence applications to analyze high volumes of real-time and historical data. It is also well suited for powering fast, interactive, analytic dashboards for end-users. Druid is used by companies such as Alibaba, Airbnb, Cisco, eBay, Netflix, Paypal, and Yahoo.

In this course we explore some of the limitations of data warehouse solutions and discuss how Druid can complement those technologies to form a flexible and scalable streaming analytics stack. We walk through many examples, offering participants the chance to implement and test Druid-based solutions in a lab environment.
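
As one small example of the kind of low-latency query Druid serves, the sketch below POSTs a SQL statement to Druid's HTTP SQL endpoint from Python; the endpoint URL and the quickstart-style "wikipedia" datasource are assumptions about the lab setup, not fixed course details.

```python
import requests

# Assumed router/broker address for a local quickstart deployment.
DRUID_SQL_URL = "http://localhost:8888/druid/v2/sql"

query = """
SELECT channel, COUNT(*) AS edits
FROM wikipedia
GROUP BY channel
ORDER BY edits DESC
LIMIT 5
"""

# Druid's SQL endpoint accepts a JSON payload containing the query string.
response = requests.post(DRUID_SQL_URL, json={"query": query}, timeout=30)
response.raise_for_status()

for row in response.json():
    print(row["channel"], row["edits"])
```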

Audience
    Application developers
    Software engineers
    Technical consultants
    DevOps professionals
    Architecture engineers

Format of the course
    Part lecture, part discussion, heavy hands-on practice, occasional tests to gauge understanding

Introduction

Installing and starting Druid

Druid architecture and design

Real-time ingestion of event data

Sharding and indexing

Loading data

Querying data

Visualizing data

Running a distributed cluster

Druid + Apache Hive

Druid + Apache Kafka

Druid + others

Troubleshooting

Administrative tasks

nlpwithr NLP: Natural Language Processing with R 21 hours

It is estimated that unstructured data accounts for more than 90 percent of all data, much of it in the form of text. Blog posts, tweets, social media, and other digital publications continuously add to this growing body of data.

This course centers around extracting insights and meaning from this data. Utilizing the R Language and Natural Language Processing (NLP) libraries, we combine concepts and techniques from computer science, artificial intelligence, and computational linguistics to algorithmically understand the meaning behind text data. Data samples are available in various languages per customer requirements.

By the end of this training participants will be able to prepare data sets (large and small) from disparate sources, then apply the right algorithms to analyze and report on its significance.

Audience
    Linguists and programmers

Format of the course
    Part lecture, part discussion, heavy hands-on practice, occasional tests to gauge understanding

Introduction
    NLP and R vs Python

Installing and configuring R Studio

Installing R packages related to Natural Language Processing (NLP).

An overview of R’s text manipulation capabilities

Getting started with an NLP project in R

Reading and importing data files into R

Text manipulation with R

Document clustering in R

Parts of speech tagging in R

Sentence parsing in R

Working with regular expressions in R

Named-entity recognition in R

Topic modeling in R

Text classification in R

Working with very large data sets

Visualizing your results

Optimization

Integrating R with other languages (Java, Python, etc.)

Closing remarks

BigData_ A practical introduction to Data Analysis and Big Data 35 hours

Participants who complete this training will gain a practical, real-world understanding of Big Data and its related technologies, methodologies and tools.

Participants will have the opportunity to put this knowledge into practice through hands-on exercises. Group interaction and instructor feedback make up an important component of the class.

The course starts with an introduction to elemental concepts of Big Data, then progresses into the programming languages and methodologies used to perform Data Analysis. Finally, we discuss the tools and infrastructure that enable Big Data storage, Distributed Processing, and Scalability.
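
To give a flavour of the distributed-processing material covered later in the outline, here is a minimal PySpark sketch that reads a hypothetical CSV of events and computes a simple distributed aggregation; the file name and column names are placeholders.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("big-data-intro").getOrCreate()

# Any large, partitioned dataset works the same way; this file name is a placeholder.
events = spark.read.csv("events.csv", header=True, inferSchema=True)

# A simple distributed aggregation: event counts per day.
daily = (events
         .groupBy("event_date")
         .agg(F.count("*").alias("n_events"))
         .orderBy("event_date"))

daily.show(10)
spark.stop()
```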

Audience

  • Developers / programmers
  • IT consultants

Format of the course

  • Part lecture, part discussion, hands-on practice and implementation, occasional quizzing to measure progress.

Introduction to Data Analysis and Big Data

  • What makes Big Data "big"?
    • Velocity, Volume, Variety, Veracity (VVVV)
  • Limits to traditional Data Processing
  • Distributed Processing
  • Statistical Analysis
  • Types of Machine Learning Analysis
  • Data Visualization

Languages used for Data Analysis

  • R language
    • Why R for Data Analysis?
    • Data manipulation, calculation and graphical display
  • Python
    • Why Python for Data Analysis?
    • Manipulating, processing, cleaning, and crunching data

Approaches to Data Analysis

  • Statistical Analysis
    • Time Series analysis
    • Forecasting with Correlation and Regression models
    • Inferential Statistics (estimating)
    • Descriptive Statistics in Big Data sets (e.g. calculating mean)
  • Machine Learning
    • Supervised vs unsupervised learning
    • Classification and clustering
    • Estimating cost of specific methods
    • Filtering
  • Natural Language Processing
    • Processing text
    • Understanding the meaning of the text
    • Automatic text generation
    • Sentiment analysis / Topic analysis
  • Computer Vision
    • Acquiring, processing, analyzing, and understanding images
    • Reconstructing, interpreting and understanding 3D scenes
    • Using image data to make decisions

Big Data infrastructure

  • Data Storage
    • Relational databases (SQL)
      • MySQL
      • Postgres
      • Oracle
    • Non-relational databases (NoSQL)
      • Cassandra
      • MongoDB
      • Neo4j
    • Understanding the nuances
      • Hierarchical databases
      • Object-oriented databases
      • Document-oriented databases
      • Graph-oriented databases
      • Other
  • Distributed Processing
    • Hadoop
      • HDFS as a distributed filesystem
      • MapReduce for distributed processing
    • Spark
      • All-in-one in-memory cluster computing framework for large-scale data processing
      • Structured streaming
      • Spark SQL
      • Machine Learning libraries: MLlib
      • Graph processing with GraphX
  • Scalability
    • Public cloud
      • AWS, Google, Aliyun, etc.
    • Private cloud
      • OpenStack, Cloud Foundry, etc.
    • Auto-scalability
  • Choosing the right solution for the problem
  • The future of Big Data
  • Closing remarks
octnp Octave not only for programmers 21 hours

This course is dedicated to those who would like to learn an alternative to the commercial MATLAB package. The three-day training provides comprehensive information on navigating the environment and using the Octave package for data analysis and engineering calculations. The training is aimed at beginners, but also at those who already know the program and would like to systematize their knowledge and improve their skills. Knowledge of other programming languages is not required, but it will greatly facilitate learning. The course will show you how to use the program in many practical examples.

Introduction

Simple calculations

  • Starting Octave, Octave as a calculator, built-in functions

The Octave environment

  • Named variables, numbers and formatting, number representation and accuracy, loading and saving data 

Arrays and vectors

  • Extracting elements from a vector, vector maths

Plotting graphs

  • Improving the presentation, multiple graphs and figures, saving and printing figures

Octave programming I: Script files

  • Creating and editing a script, running and debugging scripts,

Control statements

  • If else, switch, for, while

Octave programming II: Functions

Matrices and vectors

  • Matrix, the transpose operator, matrix creation functions, building composite matrices, matrices as tables, extracting bits of matrices, basic matrix functions

Linear and Nonlinear Equations

More graphs

  • Putting several graphs in one window, 3D plots, changing the viewpoint, plotting surfaces, images and movies,

 Eigenvectors and the Singular Value Decomposition

 Complex numbers

  • Plotting complex numbers,

 Statistics and data processing

 GUI Development

matlabdsandreporting MATLAB Fundamentals, Data Science & Report Generation 126 hours

In the first part of this training, we cover the fundamentals of MATLAB and its function as both a language and a platform.  Included in this discussion is an introduction to MATLAB syntax, arrays and matrices, data visualization, script development, and object-oriented principles.

In the second part, we demonstrate how to use MATLAB for data mining, machine learning and predictive analytics. To provide participants with a clear and practical perspective of MATLAB's approach and power, we draw comparisons between using MATLAB and using other tools such as spreadsheets, C, C++, and Visual Basic.

In the third part of the training, participants learn how to streamline their work by automating their data processing and report generation.

Throughout the course, participants will put into practice the ideas learned through hands-on exercises in a lab environment. By the end of the training, participants will have a thorough grasp of MATLAB's capabilities and will be able to employ it for solving real-world data science problems as well as for streamlining their work through automation.

Assessments will be conducted throughout the course to gauge progress.

Format of the course

  • Course includes theoretical and practical exercises, including case discussions, sample code inspection, and hands-on implementation.

Note

  • Practice sessions will be based on pre-arranged sample data report templates. If you have specific requirements, please contact us to arrange.

Introduction
MATLAB for data science and reporting

 

Part 01: MATLAB fundamentals

Overview
    MATLAB for data analysis, visualization, modeling, and programming.

Working with the MATLAB user interface

Overview of MATLAB syntax

Entering commands
    Using the command line interface

Creating variables
    Numeric vs character data

Analyzing vectors and matrices
    Creating and manipulating
    Performing calculations

Visualizing vector and matrix data

Working with data files
    Importing data from Excel spreadsheets

Working with data types
    Working with table data

Automating commands with scripts
    Creating and running scripts
    Organizing and publishing your scripts

Writing programs with branching and loops
    User interaction and flow control

Writing functions
    Creating and calling functions
    Debugging with MATLAB Editor

Applying object-oriented programming principles to your programs

 

Part 02: MATLAB for data science

Overview
    MATLAB for data mining, machine learning and predictive analytics

Accessing data
    Obtaining data from files, spreadsheets, and databases
    Obtaining data from test equipment and hardware
    Obtaining data from software and the Web

Exploring data
    Identifying trends, testing hypotheses, and estimating uncertainty

Creating customized algorithms

Creating visualizations

Creating models

Publishing customized reports

Sharing analysis tools
    As MATLAB code
    As standalone desktop or Web applications

Using the Statistics and Machine Learning Toolbox

Using the Neural Network Toolbox

 

Part 03: Report generation

Overview
    Presenting results from MATLAB programs, applications, and sample data
    Generating Microsoft Word, PowerPoint®, PDF, and HTML reports.
    Templated reports
    Tailor-made reports
        Using organization’s templates and standards

Creating reports interactively vs programmatically
    Using the Report Explorer
    Using the DOM (Document Object Model) API

Creating reports interactively using Report Explorer
    Report Explorer Examples
        Magic Squares Report Explorer Example

    Creating reports
        Using Report Explorer to create a report setup file and define report structure and content

    Formatting reports
        Specifying default report style and format for Report Explorer reports

    Generating reports
        Configuring Report Explorer for processing and running the report

    Managing report conversion templates
        Copying and managing Microsoft Word, PDF, and HTML conversion templates for Report Explorer reports

    Customizing Report Conversion templates
        Customizing the style and format of Microsoft Word and HTML conversion templates for Report Explorer reports

    Customizing components and style sheets
        Customizing report components and defining layout style sheets

Creating reports programmatically in MATLAB
    Template-Based Report Object (DOM) API Examples
        Functional report
        Object-oriented report
        Programmatic report formatting

    Creating report content
        Using the Document Object Model (DOM) API

    Report format basics
        Specifying format for report content

    Creating form-based reports
        Using the DOM API to fill in the blanks in a report form

    Creating object-oriented reports
        Deriving classes to simplify report creation and maintenance

    Creating and formatting report objects
        Lists, tables, and images

    Creating DOM Reports from HTML
        Appending an HTML string or file to a Microsoft® Word, PDF, or HTML report generated by the Document Object Model (DOM) API

    Creating report templates
        Creating templates to use with programmatic reports

    Formatting page layouts
        Formatting pages in Microsoft Word and PDF reports


Summary and closing remarks

pythonmultipurpose Advanced Python 28 hours

In this instructor-led training, participants will learn advanced Python programming techniques, including how to apply this versatile language to solve problems in areas such as distributed applications, finance, data analysis and visualization, UI programming and maintenance scripting.

Audience

  • Developers

Format of the course

  • Part lecture, part discussion, exercises and heavy hands-on practice

Notes

  • If you wish to add, remove or customize any section or topic within this course, please contact us to arrange.

Introduction

  • Python versatility: from data analysis to web crawling

Python data structures and operations

  • Integers and floats
  • Strings and bytes
  • Tuples and lists
  • Dictionaries and ordered dictionaries
  • Sets and frozen sets
  • Data frame (pandas)
  • Conversions

Object-oriented programming with Python

  • Inheritance
  • Polymorphism
  • Static classes
  • Static functions
  • Decorators
  • Other
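
As a brief illustration of several of the object-oriented topics above, here is a minimal, self-contained sketch (the Shape/Circle classes and the log_call decorator are invented for illustration and are not taken from the course materials):

    from functools import wraps

    def log_call(func):
        """Decorator that prints the function name before each call."""
        @wraps(func)
        def wrapper(*args, **kwargs):
            print("calling", func.__name__)
            return func(*args, **kwargs)
        return wrapper

    class Shape:
        def area(self):
            raise NotImplementedError

        @staticmethod
        def describe():
            return "a generic shape"

    class Circle(Shape):
        def __init__(self, radius):
            self.radius = radius

        @log_call
        def area(self):                        # polymorphic override of Shape.area
            return 3.14159 * self.radius ** 2

    print(Shape.describe())                    # static method, no instance needed
    print(Circle(2.0).area())                  # inheritance plus a decorated method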

Data Analysis with pandas

  • Data cleaning
  • Using vectorized data in pandas
  • Data wrangling
  • Sorting and filtering data
  • Aggregate operations
  • Analyzing time series
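
To give a flavour of the pandas topics above, the following is a minimal sketch on invented sample data (the column names are illustrative only), covering cleaning, sorting and filtering, aggregation and a simple time-series resample:

    import pandas as pd

    df = pd.DataFrame({
        "date": pd.date_range("2018-01-01", periods=6, freq="D"),
        "region": ["north", "north", "south", "south", "north", None],
        "sales": [100.0, 120.0, None, 90.0, 130.0, 110.0],
    })

    df["region"] = df["region"].fillna("unknown")                 # data cleaning
    df["sales"] = df["sales"].fillna(df["sales"].mean())

    top = df[df["sales"] > 100].sort_values("sales")              # filtering and sorting
    totals = df.groupby("region")["sales"].sum()                  # aggregate operations
    weekly = df.set_index("date")["sales"].resample("W").mean()   # time-series analysis

    print(top, totals, weekly, sep="\n\n")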

Data visualization

  • Plotting diagrams with matplotlib
  • Using matplotlib from within pandas
  • Creating quality diagrams
  • Visualizing data in Jupyter notebooks
  • Other visualization libraries in Python
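
A short sketch of plotting directly from a pandas DataFrame with matplotlib (the figures and output file name are placeholders):

    import matplotlib.pyplot as plt
    import pandas as pd

    df = pd.DataFrame({"month": range(1, 7),
                       "revenue": [10, 12, 9, 14, 15, 13]})

    ax = df.plot(x="month", y="revenue", kind="line", title="Revenue by month")
    ax.set_ylabel("revenue")
    plt.tight_layout()
    plt.savefig("revenue.png")    # or plt.show(), or inline display in a Jupyter notebook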

Vectorizing Data in Numpy

  • Creating Numpy arrays
  • Common operations on matrices
  • Using ufuncs
  • Views and broadcasting on Numpy arrays
  • Optimizing performance by avoiding loops
  • Optimizing performance with cProfile
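
The sketch below, written for illustration only, contrasts an explicit Python loop with the equivalent vectorised ufunc call and shows broadcasting between arrays of different shapes:

    import numpy as np

    x = np.arange(1_000_000, dtype=np.float64)

    # Loop version: squares each element one at a time (slow in pure Python).
    squares_loop = np.empty_like(x)
    for i in range(x.size):
        squares_loop[i] = x[i] ** 2

    # Vectorised version: the ** ufunc operates on the whole array at once.
    squares_vec = x ** 2
    assert np.allclose(squares_loop, squares_vec)

    # Broadcasting: a (3, 1) column combined with a (4,) row yields a (3, 4) grid.
    grid = np.arange(3).reshape(3, 1) + np.arange(4)
    print(grid.shape)    # (3, 4)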

Processing Big Data with Python

  • Building and supporting distributed applications with Python
  • Data storage: Working with SQL and NoSQL databases
  • Distributed processing with Hadoop and Spark
  • Scaling your applications
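
As one hedged example of the distributed-processing topics above, the sketch below uses PySpark for a simple distributed aggregation; it assumes a local Spark installation and a hypothetical sales.csv file, neither of which is specified by the course materials:

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("sales-summary").getOrCreate()

    # sales.csv is a hypothetical input file with "region" and "amount" columns.
    df = spark.read.csv("sales.csv", header=True, inferSchema=True)
    summary = df.groupBy("region").sum("amount")    # executed across the cluster
    summary.show()

    spark.stop()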

Python for finance

  • Packages, libraries and APIs for financial processing
    • Zipline
    • PyAlgoTrade
    • Pybacktest
    • quantlib
    • Python APIs

Extending Python (and vice versa) with other languages

  • C#
  • Java
  • C++
  • Perl
  • Others

Python multi-threaded programming

  • Modules
  • Synchronizing
  • Prioritizing
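
A minimal sketch of the threading module, with a Lock used for synchronisation (the worker function and the counts are invented for illustration):

    import threading

    counter = 0
    lock = threading.Lock()

    def worker(n):
        global counter
        for _ in range(n):
            with lock:              # synchronise access to shared state
                counter += 1

    threads = [threading.Thread(target=worker, args=(10_000,)) for _ in range(4)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    print(counter)                  # 40000, regardless of thread scheduling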

UI programming with Python

  • Framework options for building GUIs in Python
    • Tkinter
    • PyQt

Python for maintenance scripting

  • Raising and catching exceptions correctly
  • Organizing code into modules and packages
  • Understanding symbol tables and accessing them in code
  • Picking a testing framework and applying TDD in Python
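
A short sketch of raising and catching a specific exception together with a minimal unittest test case (parse_port is a hypothetical helper, not part of the course materials):

    import unittest

    def parse_port(text):
        """Return text as a TCP port number, raising ValueError if it is invalid."""
        port = int(text)                       # int() itself raises ValueError on bad input
        if not 0 < port < 65536:
            raise ValueError("port out of range: %d" % port)
        return port

    class ParsePortTests(unittest.TestCase):
        def test_valid_port(self):
            self.assertEqual(parse_port("8080"), 8080)

        def test_out_of_range(self):
            with self.assertRaises(ValueError):
                parse_port("70000")

    if __name__ == "__main__":
        unittest.main()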

Python for the web

  • Packages for web processing
  • Web crawling
  • Parsing HTML and XML
  • Filling web forms automatically
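
A hedged sketch of fetching and parsing a page; it assumes the requests and beautifulsoup4 packages (the URL is a placeholder, and the course may use different libraries):

    import requests
    from bs4 import BeautifulSoup

    response = requests.get("https://example.com")     # placeholder URL
    response.raise_for_status()

    soup = BeautifulSoup(response.text, "html.parser")
    for link in soup.find_all("a"):                     # list every hyperlink on the page
        print(link.get("href"), link.get_text(strip=True))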

Closing remarks

zeppelin Zeppelin for interactive data analytics 14 hours

Apache Zeppelin is a web-based notebook for capturing, exploring, visualizing and sharing Hadoop- and Spark-based data.

This instructor-led, live training introduces the concepts behind interactive data analytics and walks participants through the deployment and usage of Zeppelin in a single-user or multi-user environment.

By the end of this training, participants will be able to:

  • Install and configure Zeppelin
  • Develop, organize, execute and share data analyses in a browser-based interface
  • Visualize results without referring to the command line or cluster details
  • Execute and collaborate on long workflows
  • Work with any of a number of plug-in language/data-processing backends, such as Scala (with Apache Spark), Python (with Apache Spark), Spark SQL, JDBC, Markdown and Shell
  • Integrate Zeppelin with Spark, Flink and MapReduce
  • Secure multi-user instances of Zeppelin with Apache Shiro

Audience

  • Data engineers
  • Data analysts
  • Data scientists
  • Software developers

Format of the course

  • Part lecture, part discussion, exercises and heavy hands-on practice

To request a customized course outline for this training, please contact us.

 

