Data Visualization Training Courses

Client Testimonials

Data Visualization

I thought that the information was interesting.

Allison May - Virginia Department of Education

Data Visualization

I really appreciated that Jeff utilized data and examples that were applicable to education data. He made it interesting and interactive.

Carol Wells Bazzichi - Virginia Department of Education

Data Visualization

Learning about all the chart types and what they are used for. Learning the value of decluttering. Learning about the methods to show time data.

Susan Williams - Virginia Department of Education

Data Visualization

Trainer was enthusiastic.

Diane Lucas - Virginia Department of Education

Data Visualization

Content / Instructor

Craig Roberson - Virginia Department of Education

Data Visualization

I am a hands-on learner and this was something that he did a lot of.

Lisa Comfort - Virginia Department of Education

Data Visualization

The examples.

Peter Coleman - Virginia Department of Education

Data Visualization

Good real world examples, reviews of existing reports

Ronald Parrish - Virginia Department of Education

A practical introduction to Data Analysis and Big Data

Willingness to share more

Balaram Chandra Paul - MOL Information Technology Asia Limited

Data Visualization Course Outlines

Embedding Projector: Visualizing your Training Data
Code: embeddingprojector. Duration: 14 hours.

Embedding Projector is an open-source web application for visualizing the data used to train machine learning systems. Created by Google, it is part of TensorFlow. This instructor-led, live training introduces the concepts behind Embedding Projector and walks participants through the setup of a demo project.

By the end of this training, participants will be able to:

- Explore how data is being interpreted by machine learning models
- Navigate through 3D and 2D views of data to understand how a machine learning algorithm interprets it
- Understand the concepts behind embeddings and their role in representing mathematical vectors for images, words and numerals
- Explore the properties of a specific embedding to understand the behavior of a model
- Apply Embedding Projector to real-world use cases such as building a song recommendation system for music lovers

Audience: Developers, data scientists

Format of the course: Part lecture, part discussion, exercises and heavy hands-on practice

To request a customized course outline for this training, please contact us.

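For a flavor of the demo-project setup, here is a minimal sketch (not part of the official outline) that exports embeddings in the two-TSV format the standalone Embedding Projector at projector.tensorflow.org can load; the vocabulary and the random 300-dimensional vectors are placeholders.

```python
# Minimal sketch: exporting embedding vectors in the TSV format that the
# standalone Embedding Projector (https://projector.tensorflow.org) accepts.
# The word list and random 300-dimensional vectors are placeholders.
import numpy as np

words = ["king", "queen", "apple", "banana"]   # placeholder vocabulary
vectors = np.random.rand(len(words), 300)      # placeholder embeddings

# vectors.tsv: one tab-separated embedding per row
with open("vectors.tsv", "w") as f:
    for vec in vectors:
        f.write("\t".join(f"{x:.6f}" for x in vec) + "\n")

# metadata.tsv: one label per row, aligned with vectors.tsv
with open("metadata.tsv", "w") as f:
    for word in words:
        f.write(word + "\n")

# Load both files via "Load data" in the Projector UI to explore 2D/3D views.
```
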
Data Visualization: Creating Captivating Reports
Code: datavisualizationreports. Duration: 21 hours.

In this instructor-led, live training, participants will learn the skills, strategies, tools and approaches for visualizing and reporting data for different audiences. Case studies are also analyzed and discussed to exemplify how data visualization solutions are being applied in the real world to derive meaning out of data and answer crucial questions.

By the end of this training, participants will be able to:

- Write reports with captivating titles, subtitles, and annotations using the most suitable highlighting, alignment, and color schemes for readability and user-friendliness
- Design charts that fit the audience's information needs and interests
- Choose the best chart types for a given dataset (beyond pie charts and bar charts)
- Identify and analyze the most valuable and relevant data quickly and efficiently
- Select the best file formats to include in reports (graphs, infographics, references, GIFs, etc.)
- Create effective layouts for displaying time series data, part-to-whole relationships, geographic patterns, and nested data
- Use effective color-coding to display qualitative and text-based data such as sentiment analysis, timelines, calendars, and diagrams
- Apply the most suitable tools for the job (Excel, R, Tableau, mapping programs, etc.)
- Prepare datasets for visualization

Audience: Data analysts, business managers

Format of the course: Part lecture, part discussion, exercises and heavy hands-on practice

Outline: Introduction to data visualization; Selecting and creating effective reports; Data visualization tools and resources; Generating and revising your visualizations; Closing remarks

F# for Data Science
Code: fsharpfordatascience. Duration: 21 hours.

Data science is the application of statistical analysis, machine learning, data visualization and programming for the purpose of understanding and interpreting real-world data. F# is a well-suited programming language for data science as it combines efficient execution, REPL scripting, powerful libraries and scalable data integration.

In this instructor-led, live training, participants will learn how to use F# to solve a series of real-world data science problems.

By the end of this training, participants will be able to:

- Use F#'s integrated data science packages
- Use F# to interoperate with other languages and platforms, including Excel, R, Matlab, and Python
- Use the Deedle package to solve time series problems
- Carry out advanced analysis with minimal lines of production-quality code
- Understand how functional programming is a natural fit for scientific and big data computations
- Access and visualize data with F#
- Apply F# for machine learning
- Explore solutions for problems in domains such as business intelligence and social gaming

Audience: Developers, data scientists

Format of the course: Part lecture, part discussion, exercises and heavy hands-on practice

To request a customized course outline for this training, please contact us.

Machine Learning & Deep Learning with Python and R
Code: deepmclrg. Duration: 14 hours.

MACHINE LEARNING

1: Introducing Machine Learning
The origins of machine learning; Uses and abuses of machine learning; Ethical considerations; How do machines learn?; Abstraction and knowledge representation; Generalization; Assessing the success of learning; Steps to apply machine learning to your data; Choosing a machine learning algorithm; Thinking about the input data; Thinking about types of machine learning algorithms; Matching your data to an appropriate algorithm; Using R for machine learning; Installing and loading R packages; Installing an R package; Installing a package using the point-and-click interface; Loading an R package; Summary

2: Managing and Understanding Data
R data structures; Vectors; Factors; Lists; Data frames; Matrixes and arrays; Managing data with R; Saving and loading R data structures; Importing and saving data from CSV files; Importing data from SQL databases; Exploring and understanding data; Exploring the structure of data; Exploring numeric variables; Measuring the central tendency – mean and median; Measuring spread – quartiles and the five-number summary; Visualizing numeric variables – boxplots; Visualizing numeric variables – histograms; Understanding numeric data – uniform and normal distributions; Measuring spread – variance and standard deviation; Exploring categorical variables; Measuring the central tendency – the mode; Exploring relationships between variables; Visualizing relationships – scatterplots; Examining relationships – two-way cross-tabulations; Summary

3: Lazy Learning – Classification Using Nearest Neighbors
Understanding classification using nearest neighbors; The kNN algorithm; Calculating distance; Choosing an appropriate k; Preparing data for use with kNN; Why is the kNN algorithm lazy?; Diagnosing breast cancer with the kNN algorithm; Step 1 – collecting data; Step 2 – exploring and preparing the data; Transformation – normalizing numeric data; Data preparation – creating training and test datasets; Step 3 – training a model on the data; Step 4 – evaluating model performance; Step 5 – improving model performance; Transformation – z-score standardization; Testing alternative values of k; Summary

4: Probabilistic Learning – Classification Using Naive Bayes
Understanding naive Bayes; Basic concepts of Bayesian methods; Probability; Joint probability; Conditional probability with Bayes' theorem; The naive Bayes algorithm; The naive Bayes classification; The Laplace estimator; Using numeric features with naive Bayes; Example – filtering mobile phone spam with the naive Bayes algorithm; Step 1 – collecting data; Step 2 – exploring and preparing the data; Data preparation – processing text data for analysis; Data preparation – creating training and test datasets; Visualizing text data – word clouds; Data preparation – creating indicator features for frequent words; Step 3 – training a model on the data; Step 4 – evaluating model performance; Step 5 – improving model performance; Summary

5: Divide and Conquer – Classification Using Decision Trees and Rules
Understanding decision trees; Divide and conquer; The C5.0 decision tree algorithm; Choosing the best split; Pruning the decision tree; Example – identifying risky bank loans using C5.0 decision trees; Step 1 – collecting data; Step 2 – exploring and preparing the data; Data preparation – creating random training and test datasets; Step 3 – training a model on the data; Step 4 – evaluating model performance; Step 5 – improving model performance; Boosting the accuracy of decision trees; Making some mistakes more costly than others; Understanding classification rules; Separate and conquer; The One Rule algorithm; The RIPPER algorithm; Rules from decision trees; Example – identifying poisonous mushrooms with rule learners; Step 1 – collecting data; Step 2 – exploring and preparing the data; Step 3 – training a model on the data; Step 4 – evaluating model performance; Step 5 – improving model performance; Summary

6: Forecasting Numeric Data – Regression Methods
Understanding regression; Simple linear regression; Ordinary least squares estimation; Correlations; Multiple linear regression; Example – predicting medical expenses using linear regression; Step 1 – collecting data; Step 2 – exploring and preparing the data; Exploring relationships among features – the correlation matrix; Visualizing relationships among features – the scatterplot matrix; Step 3 – training a model on the data; Step 4 – evaluating model performance; Step 5 – improving model performance; Model specification – adding non-linear relationships; Transformation – converting a numeric variable to a binary indicator; Model specification – adding interaction effects; Putting it all together – an improved regression model; Understanding regression trees and model trees; Adding regression to trees; Example – estimating the quality of wines with regression trees and model trees; Step 1 – collecting data; Step 2 – exploring and preparing the data; Step 3 – training a model on the data; Visualizing decision trees; Step 4 – evaluating model performance; Measuring performance with mean absolute error; Step 5 – improving model performance; Summary

7: Black Box Methods – Neural Networks and Support Vector Machines
Understanding neural networks; From biological to artificial neurons; Activation functions; Network topology; The number of layers; The direction of information travel; The number of nodes in each layer; Training neural networks with backpropagation; Modeling the strength of concrete with ANNs; Step 1 – collecting data; Step 2 – exploring and preparing the data; Step 3 – training a model on the data; Step 4 – evaluating model performance; Step 5 – improving model performance; Understanding Support Vector Machines; Classification with hyperplanes; Finding the maximum margin; The case of linearly separable data; The case of non-linearly separable data; Using kernels for non-linear spaces; Performing OCR with SVMs; Step 1 – collecting data; Step 2 – exploring and preparing the data; Step 3 – training a model on the data; Step 4 – evaluating model performance; Step 5 – improving model performance; Summary

8: Finding Patterns – Market Basket Analysis Using Association Rules
Understanding association rules; The Apriori algorithm for association rule learning; Measuring rule interest – support and confidence; Building a set of rules with the Apriori principle; Example – identifying frequently purchased groceries with association rules; Step 1 – collecting data; Step 2 – exploring and preparing the data; Data preparation – creating a sparse matrix for transaction data; Visualizing item support – item frequency plots; Visualizing transaction data – plotting the sparse matrix; Step 3 – training a model on the data; Step 4 – evaluating model performance; Step 5 – improving model performance; Sorting the set of association rules; Taking subsets of association rules; Saving association rules to a file or data frame; Summary

9: Finding Groups of Data – Clustering with k-means
Understanding clustering; Clustering as a machine learning task; The k-means algorithm for clustering; Using distance to assign and update clusters; Choosing the appropriate number of clusters; Finding teen market segments using k-means clustering; Step 1 – collecting data; Step 2 – exploring and preparing the data; Data preparation – dummy coding missing values; Data preparation – imputing missing values; Step 3 – training a model on the data; Step 4 – evaluating model performance; Step 5 – improving model performance; Summary

10: Evaluating Model Performance
Measuring performance for classification; Working with classification prediction data in R; A closer look at confusion matrices; Using confusion matrices to measure performance; Beyond accuracy – other measures of performance; The kappa statistic; Sensitivity and specificity; Precision and recall; The F-measure; Visualizing performance tradeoffs; ROC curves; Estimating future performance; The holdout method; Cross-validation; Bootstrap sampling; Summary

11: Improving Model Performance
Tuning stock models for better performance; Using caret for automated parameter tuning; Creating a simple tuned model; Customizing the tuning process; Improving model performance with meta-learning; Understanding ensembles; Bagging; Boosting; Random forests; Training random forests; Evaluating random forest performance; Summary

DEEP LEARNING WITH R

1: Getting Started with Deep Learning
What is deep learning?; Conceptual overview of neural networks; Deep neural networks; R packages for deep learning; Setting up reproducible results; Neural networks; The deepnet package; The darch package; The H2O package; Connecting R and H2O; Initializing H2O; Linking datasets to an H2O cluster; Summary

2: Training a Prediction Model
Neural networks in R; Building a neural network; Generating predictions from a neural network; The problem of overfitting data – the consequences explained; Use case – build and apply a neural network; Summary

3: Preventing Overfitting
L1 penalty; L1 penalty in action; L2 penalty; L2 penalty in action; Weight decay (L2 penalty in neural networks); Ensembles and model averaging; Use case – improving out-of-sample model performance using dropout; Summary

4: Identifying Anomalous Data
Getting started with unsupervised learning; How do auto-encoders work?; Regularized auto-encoders; Penalized auto-encoders; Denoising auto-encoders; Training an auto-encoder in R; Use case – building and applying an auto-encoder model; Fine-tuning auto-encoder models; Summary

5: Training Deep Prediction Models
Getting started with deep feedforward neural networks; Common activation functions – rectifiers, hyperbolic tangent, and maxout; Picking hyperparameters; Training and predicting new data from a deep neural network; Use case – training a deep neural network for automatic classification; Working with model results; Summary

6: Tuning and Optimizing Models
Dealing with missing data; Solutions for models with low accuracy; Grid search; Random search; Summary

DEEP LEARNING WITH PYTHON

Part I: Introduction
1 Welcome: Deep Learning The Wrong Way; Deep Learning With Python; Summary

Part II: Background
2 Introduction to Theano: What is Theano?; How to Install Theano; Simple Theano Example; Extensions and Wrappers for Theano; More Theano Resources; Summary
3 Introduction to TensorFlow: What is TensorFlow?; How to Install TensorFlow; Your First Examples in TensorFlow; Simple TensorFlow Example; More Deep Learning Models; Summary
4 Introduction to Keras: What is Keras?; How to Install Keras; Theano and TensorFlow Backends for Keras; Build Deep Learning Models with Keras; Summary
5 Project: Develop Large Models on GPUs Cheaply In the Cloud: Project Overview; Setup Your AWS Account; Launch Your Server Instance; Login, Configure and Run; Build and Run Models on AWS; Close Your EC2 Instance; Tips and Tricks for Using Keras on AWS; More Resources For Deep Learning on AWS; Summary

Part III: Multilayer Perceptrons
6 Crash Course In Multilayer Perceptrons: Crash Course Overview; Multilayer Perceptrons; Neurons; Networks of Neurons; Training Networks; Summary
7 Develop Your First Neural Network With Keras (see the sketch after this outline): Tutorial Overview; Pima Indians Onset of Diabetes Dataset; Load Data; Define Model; Compile Model; Fit Model; Evaluate Model; Tie It All Together; Summary
8 Evaluate The Performance of Deep Learning Models: Empirically Evaluate Network Configurations; Data Splitting; Manual k-Fold Cross Validation; Summary
9 Use Keras Models With Scikit-Learn For General Machine Learning: Overview; Evaluate Models with Cross Validation; Grid Search Deep Learning Model Parameters; Summary
10 Project: Multiclass Classification Of Flower Species: Iris Flowers Classification Dataset; Import Classes and Functions; Initialize Random Number Generator; Load The Dataset; Encode The Output Variable; Define The Neural Network Model; Evaluate The Model with k-Fold Cross Validation; Summary
11 Project: Binary Classification Of Sonar Returns: Sonar Object Classification Dataset; Baseline Neural Network Model Performance; Improve Performance With Data Preparation; Tuning Layers and Neurons in The Model; Summary
12 Project: Regression Of Boston House Prices: Boston House Price Dataset; Develop a Baseline Neural Network Model; Lift Performance By Standardizing The Dataset; Tune The Neural Network Topology; Summary

Part IV: Advanced Multilayer Perceptrons and Keras
13 Save Your Models For Later With Serialization: Tutorial Overview; Save Your Neural Network Model to JSON; Save Your Neural Network Model to YAML; Summary
14 Keep The Best Models During Training With Checkpointing: Checkpointing Neural Network Models; Checkpoint Neural Network Model Improvements; Checkpoint Best Neural Network Model Only; Loading a Saved Neural Network Model; Summary
15 Understand Model Behavior During Training By Plotting History: Access Model Training History in Keras; Visualize Model Training History in Keras; Summary
16 Reduce Overfitting With Dropout Regularization: Dropout Regularization For Neural Networks; Dropout Regularization in Keras; Using Dropout on the Visible Layer; Using Dropout on Hidden Layers; Tips For Using Dropout; Summary
17 Lift Performance With Learning Rate Schedules: Learning Rate Schedule For Training Models; Ionosphere Classification Dataset; Time-Based Learning Rate Schedule; Drop-Based Learning Rate Schedule; Tips for Using Learning Rate Schedules; Summary

Part V: Convolutional Neural Networks
18 Crash Course In Convolutional Neural Networks: The Case for Convolutional Neural Networks; Building Blocks of Convolutional Neural Networks; Convolutional Layers; Pooling Layers; Fully Connected Layers; Worked Example; Convolutional Neural Networks Best Practices; Summary
19 Project: Handwritten Digit Recognition: Handwritten Digit Recognition Dataset; Loading the MNIST dataset in Keras; Baseline Model with Multilayer Perceptrons; Simple Convolutional Neural Network for MNIST; Larger Convolutional Neural Network for MNIST; Summary
20 Improve Model Performance With Image Augmentation: Keras Image Augmentation API; Point of Comparison for Image Augmentation; Feature Standardization; ZCA Whitening; Random Rotations; Random Shifts; Random Flips; Saving Augmented Images to File; Tips For Augmenting Image Data with Keras; Summary
21 Project: Object Recognition in Photographs: Photograph Object Recognition Dataset; Loading The CIFAR-10 Dataset in Keras; Simple CNN for CIFAR-10; Larger CNN for CIFAR-10; Extensions To Improve Model Performance; Summary
22 Project: Predict Sentiment From Movie Reviews: Movie Review Sentiment Classification Dataset; Load the IMDB Dataset With Keras; Word Embeddings; Simple Multilayer Perceptron Model; One-Dimensional Convolutional Neural Network; Summary

Part VI: Recurrent Neural Networks
23 Crash Course In Recurrent Neural Networks: Support For Sequences in Neural Networks; Recurrent Neural Networks; Long Short-Term Memory Networks; Summary
24 Time Series Prediction with Multilayer Perceptrons: Problem Description: Time Series Prediction; Multilayer Perceptron Regression; Multilayer Perceptron Using the Window Method; Summary
25 Time Series Prediction with LSTM Recurrent Neural Networks: LSTM Network For Regression; LSTM For Regression Using the Window Method; LSTM For Regression with Time Steps; LSTM With Memory Between Batches; Stacked LSTMs With Memory Between Batches; Summary
26 Project: Sequence Classification of Movie Reviews: Simple LSTM for Sequence Classification; LSTM For Sequence Classification With Dropout; LSTM and CNN For Sequence Classification; Summary
27 Understanding Stateful LSTM Recurrent Neural Networks: Problem Description: Learn the Alphabet; LSTM for Learning One-Char to One-Char Mapping; LSTM for a Feature Window to One-Char Mapping; LSTM for a Time Step Window to One-Char Mapping; LSTM State Maintained Between Samples Within A Batch; Stateful LSTM for a One-Char to One-Char Mapping; LSTM with Variable Length Input to One-Char Output; Summary
28 Project: Text Generation With Alice in Wonderland: Problem Description: Text Generation; Develop a Small LSTM Recurrent Neural Network; Generating Text with an LSTM Network; Larger LSTM Recurrent Neural Network; Extension Ideas to Improve the Model; Summary

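As a hedged illustration of the Keras material in Part III above (chapter 7), the following minimal Python sketch runs the define/compile/fit/evaluate steps end to end; synthetic NumPy data stands in for the Pima Indians diabetes dataset (8 features, one binary label).

```python
# Minimal sketch of the define/compile/fit/evaluate workflow from the
# "Develop Your First Neural Network With Keras" chapter. Synthetic data
# stands in for the Pima Indians diabetes dataset.
import numpy as np
from tensorflow import keras

rng = np.random.default_rng(seed=7)
X = rng.random((768, 8))                      # placeholder features
y = (X.sum(axis=1) > 4.0).astype("float32")   # placeholder binary labels

# Define: a small multilayer perceptron
model = keras.Sequential([
    keras.layers.Input(shape=(8,)),
    keras.layers.Dense(12, activation="relu"),
    keras.layers.Dense(8, activation="relu"),
    keras.layers.Dense(1, activation="sigmoid"),
])

# Compile: choose loss, optimizer, and metrics
model.compile(loss="binary_crossentropy", optimizer="adam",
              metrics=["accuracy"])

# Fit, then evaluate
model.fit(X, y, epochs=10, batch_size=32, verbose=0)
loss, acc = model.evaluate(X, y, verbose=0)
print(f"accuracy: {acc:.3f}")
```
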
Introduction to Data Visualization with Tidyverse and R
Code: tidyverse. Duration: 7 hours.

The Tidyverse is a collection of versatile R packages for cleaning, processing, modeling, and visualizing data. Some of the packages included are: ggplot2, dplyr, tidyr, readr, purrr, and tibble.

In this instructor-led, live training, participants will learn how to manipulate and visualize data using the tools included in the Tidyverse.

By the end of this training, participants will be able to:

- Perform data analysis and create appealing visualizations
- Draw useful conclusions from various datasets of sample data
- Filter, sort and summarize data to answer exploratory questions
- Turn processed data into informative line plots, bar plots, and histograms
- Import and filter data from diverse data sources, including Excel, CSV, and SPSS files

Audience: Beginners to the R language; beginners to data analysis and data visualization

Format of the course: Part lecture, part discussion, exercises and heavy hands-on practice

Outline: Introduction (Tidyverse vs traditional R plotting); Setting up your working environment; Preparing the dataset; Importing and filtering data; Wrangling the data; Visualizing the data (graphs, scatter plots); Grouping and summarizing the data; Visualizing the data (line plots, bar plots, histograms, boxplots); Working with non-standard data; Closing remarks

Knowledge Discovery in Databases (KDD)
Code: kdd. Duration: 21 hours.

Knowledge discovery in databases (KDD) is the process of discovering useful knowledge from a collection of data. Real-life applications for this data mining technique include marketing, fraud detection, telecommunication and manufacturing. In this course, we introduce the processes involved in KDD and carry out a series of exercises to practice the implementation of those processes.

Audience: Data analysts or anyone interested in learning how to interpret data to solve problems

Format of the course: After a theoretical discussion of KDD, the instructor will present real-life cases which call for the application of KDD to solve a problem. Participants will prepare, select and cleanse sample data sets and use their prior knowledge about the data to propose solutions based on the results of their observations.

Outline: Introduction (KDD vs data mining); Establishing the application domain; Establishing relevant prior knowledge; Understanding the goal of the investigation; Creating a target data set; Data cleaning and preprocessing; Data reduction and projection; Choosing the data mining task; Choosing the data mining algorithms; Interpreting the mined patterns

A practical introduction to Data Analysis and Big Data
Code: BigData_. Duration: 35 hours.

Participants who complete this training will gain a practical, real-world understanding of Big Data and its related technologies, methodologies and tools. Participants will have the opportunity to put this knowledge into practice through hands-on exercises. Group interaction and instructor feedback make up an important component of the class.

The course starts with an introduction to elemental concepts of Big Data, then progresses into the programming languages and methodologies used to perform data analysis. Finally, we discuss the tools and infrastructure that enable Big Data storage, distributed processing, and scalability.

Audience: Developers/programmers, IT consultants

Format of the course: Part lecture, part discussion, hands-on practice and implementation, occasional quizzing to measure progress

Outline:

- Introduction to Data Analysis and Big Data: What makes Big Data "big"?; Velocity, Volume, Variety, Veracity (VVVV); Limits to traditional data processing; Distributed processing; Statistical analysis; Types of machine learning analysis; Data visualization
- Languages used for Data Analysis: R language (Why R for data analysis?; Data manipulation, calculation and graphical display); Python (Why Python for data analysis?; Manipulating, processing, cleaning, and crunching data)
- Approaches to Data Analysis: Statistical analysis; Time series analysis; Forecasting with correlation and regression models; Inferential statistics (estimating); Descriptive statistics in Big Data sets (e.g. calculating mean; illustrated in the sketch after this outline)
- Machine Learning: Supervised vs unsupervised learning; Classification and clustering; Estimating the cost of specific methods; Filtering
- Natural Language Processing: Processing text; Understanding the meaning of the text; Automatic text generation; Sentiment analysis / topic analysis
- Computer Vision: Acquiring, processing, analyzing, and understanding images; Reconstructing, interpreting and understanding 3D scenes; Using image data to make decisions
- Big Data infrastructure: Data storage; Relational databases (SQL): MySQL, Postgres, Oracle; Non-relational databases (NoSQL): Cassandra, MongoDB, Neo4j; Understanding the nuances; Hierarchical databases; Object-oriented databases; Document-oriented databases; Graph-oriented databases; Other
- Distributed Processing: Hadoop (HDFS as a distributed filesystem; MapReduce for distributed processing); Spark (all-in-one in-memory cluster computing framework for large-scale data processing; Structured Streaming; Spark SQL; machine learning libraries: MLlib; graph processing with GraphX)
- Scalability: Public cloud (AWS, Google, Aliyun, etc.); Private cloud (OpenStack, Cloud Foundry, etc.); Auto-scalability
- Choosing the right solution for the problem
- The future of Big Data
- Closing remarks

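To illustrate the descriptive-statistics topic above, here is a minimal pandas sketch; the file name sales.csv and the columns amount and region are placeholders, not course materials.

```python
# Minimal sketch: descriptive statistics with pandas, the kind of analysis
# covered under "Approaches to Data Analysis". The file name and column
# names ("sales.csv", "amount", "region") are placeholders.
import pandas as pd

df = pd.read_csv("sales.csv")

print(df["amount"].mean())      # central tendency: mean
print(df["amount"].median())    # central tendency: median
print(df["amount"].std())       # spread: standard deviation
print(df.groupby("region")["amount"].describe())  # summary per group
```
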
OpenNN: Implementing neural networks
Code: OpenNN. Duration: 14 hours.

OpenNN is an open-source class library written in C++ which implements neural networks, for use in machine learning. In this course we go over the principles of neural networks and use OpenNN to implement a sample application.

Audience: Software developers and programmers wishing to create Deep Learning applications

Format of the course: Lecture and discussion coupled with hands-on exercises

Outline:

- Introduction to OpenNN, Machine Learning and Deep Learning
- Downloading OpenNN
- Working with Neural Designer: Using Neural Designer for descriptive, diagnostic, predictive and prescriptive analytics
- OpenNN architecture: CPU parallelization
- OpenNN classes: Data set, neural network, loss index, training strategy, model selection, testing analysis; Vector and matrix templates
- Building a neural network application: Choosing a suitable neural network; Formulating the variational problem (loss index); Solving the reduced function optimization problem (training strategy)
- Working with datasets: The data matrix (columns as variables and rows as instances)
- Learning tasks: Function regression; Pattern recognition
- Compiling with Qt Creator
- Integrating, testing and debugging your application
- The future of neural networks and OpenNN

Octave not only for programmers
Code: octnp. Duration: 21 hours.

This course is dedicated to those who would like to learn an alternative to the commercial MATLAB package. The three-day training provides comprehensive information on moving around the environment and using the Octave package for data analysis and engineering calculations. It is aimed at beginners, but also at those who already know the program and would like to systematize their knowledge and improve their skills. Knowledge of other programming languages is not required, but it will greatly facilitate the learners' acquisition of knowledge. The course will show you how to use the program in many practical examples.

Outline:

- Introduction
- Simple calculations: Starting Octave, Octave as a calculator, built-in functions
- The Octave environment: Named variables, numbers and formatting, number representation and accuracy, loading and saving data
- Arrays and vectors: Extracting elements from a vector, vector maths
- Plotting graphs: Improving the presentation, multiple graphs and figures, saving and printing figures
- Octave programming I: Script files: Creating and editing a script, running and debugging scripts
- Control statements: If else, switch, for, while
- Octave programming II: Functions
- Matrices and vectors: Matrix, the transpose operator, matrix creation functions, building composite matrices, matrices as tables, extracting bits of matrices, basic matrix functions
- Linear and nonlinear equations
- More graphs: Putting several graphs in one window, 3D plots, changing the viewpoint, plotting surfaces, images and movies
- Eigenvectors and the Singular Value Decomposition
- Complex numbers: Plotting complex numbers
- Statistics and data processing
- GUI development

Druid: Build a fast, real-time data analysis system
Code: druid. Duration: 21 hours.

Druid is an open-source, column-oriented, distributed data store written in Java. It was designed to quickly ingest massive quantities of event data and execute low-latency OLAP queries on that data. Druid is commonly used in business intelligence applications to analyze high volumes of real-time and historical data. It is also well suited for powering fast, interactive, analytic dashboards for end-users. Druid is used by companies such as Alibaba, Airbnb, Cisco, eBay, Netflix, Paypal, and Yahoo.

In this course we explore some of the limitations of data warehouse solutions and discuss how Druid can complement those technologies to form a flexible and scalable streaming analytics stack. We walk through many examples, offering participants the chance to implement and test Druid-based solutions in a lab environment.

Audience: Application developers, software engineers, technical consultants, DevOps professionals, architecture engineers

Format of the course: Part lecture, part discussion, heavy hands-on practice, occasional tests to gauge understanding

Outline: Introduction; Installing and starting Druid; Druid architecture and design; Real-time ingestion of event data; Sharding and indexing; Loading data; Querying data (see the sketch below); Visualizing data; Running a distributed cluster; Druid + Apache Hive; Druid + Apache Kafka; Druid + others; Troubleshooting; Administrative tasks

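As an illustration of the "Querying data" topic, the sketch below posts a SQL query to Druid's SQL endpoint from Python; the router address (localhost:8888) and the events datasource are assumptions made for this example.

```python
# Minimal sketch: issuing a Druid SQL query over HTTP, the kind of
# low-latency OLAP query covered under "Querying data". Assumes a local
# Druid router on port 8888 and a datasource named "events" (placeholders).
import requests

query = """
SELECT channel, COUNT(*) AS cnt
FROM events
WHERE __time >= CURRENT_TIMESTAMP - INTERVAL '1' HOUR
GROUP BY channel
ORDER BY cnt DESC
LIMIT 10
"""

resp = requests.post(
    "http://localhost:8888/druid/v2/sql",
    json={"query": query},
)
resp.raise_for_status()
for row in resp.json():   # the endpoint returns a JSON array of row objects
    print(row)
```
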
NLP: Natural Language Processing with R
Code: nlpwithr. Duration: 21 hours.

It is estimated that unstructured data accounts for more than 90 percent of all data, much of it in the form of text. Blog posts, tweets, social media, and other digital publications continuously add to this growing body of data. This course centers around extracting insights and meaning from this data. Utilizing the R language and Natural Language Processing (NLP) libraries, we combine concepts and techniques from computer science, artificial intelligence, and computational linguistics to algorithmically understand the meaning behind text data. Data samples are available in various languages per customer requirements.

By the end of this training, participants will be able to prepare data sets (large and small) from disparate sources, then apply the right algorithms to analyze and report on their significance.

Audience: Linguists and programmers

Format of the course: Part lecture, part discussion, heavy hands-on practice, occasional tests to gauge understanding

Outline: Introduction (NLP and R vs Python); Installing and configuring R Studio; Installing R packages related to Natural Language Processing (NLP); An overview of R's text manipulation capabilities; Getting started with an NLP project in R; Reading and importing data files into R; Text manipulation with R; Document clustering in R; Parts-of-speech tagging in R; Sentence parsing in R; Working with regular expressions in R; Named-entity recognition in R; Topic modeling in R; Text classification in R; Working with very large data sets; Visualizing your results; Optimization; Integrating R with other languages (Java, Python, etc.); Closing remarks

Beyond the relational database: neo4j
Code: neo4j. Duration: 21 hours.

Relational, table-based databases such as Oracle and MySQL have long been the standard for organizing and storing data. However, the growing size and fluidity of data have made it difficult for these traditional systems to efficiently execute highly complex queries on the data. Imagine replacing rows-and-columns-based data storage with object-based data storage, whereby entities (e.g., a person) could be stored as data nodes, then easily queried on the basis of their vast, multi-linear relationships with other nodes. And imagine querying these connections and their associated objects and properties using a compact syntax, up to 20 times lighter than SQL. This is what graph databases such as neo4j offer.

In this hands-on course, we will set up a live project and put into practice the skills to model, manage and access your data. We contrast and compare graph databases with SQL-based databases as well as other NoSQL databases and clarify when and where it makes sense to implement each within your infrastructure.

Audience: Database administrators (DBAs), data analysts, developers, system administrators, DevOps engineers, business analysts, CTOs, CIOs

Format of the course: Heavy emphasis on hands-on practice. Most of the concepts are learned through samples, exercises and hands-on development.

Outline:

- Getting started with neo4j: neo4j vs relational databases; neo4j vs other NoSQL databases; Using neo4j to solve real-world problems; Installing neo4j
- Data modeling with neo4j: Mapping white-board diagrams and mind maps to neo4j
- Working with nodes: Creating, changing and deleting nodes; Defining node properties
- Node relationships: Creating and deleting relationships; Bi-directional relationships
- Querying your data with Cypher: Querying your data based on relationships; MATCH, RETURN, WHERE, REMOVE, MERGE, etc.; Setting indexes and constraints (see the sketch below)
- Working with the REST API: REST operations on nodes; REST operations on relationships; REST operations on indexes and constraints
- Accessing the core API for application development: Working with .NET, Java, JavaScript, and Python APIs
- Closing remarks

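To give a sense of how compact Cypher queries are compared with multi-join SQL, here is a minimal sketch using the official neo4j Python driver; the connection URI, credentials, and the Person/KNOWS graph model are placeholders.

```python
# Minimal sketch: running a Cypher query through the official neo4j Python
# driver, as covered under "Querying your data with Cypher". The connection
# URI, credentials, and the Person/KNOWS data model are placeholders.
from neo4j import GraphDatabase

driver = GraphDatabase.driver("bolt://localhost:7687",
                              auth=("neo4j", "password"))

with driver.session() as session:
    # One MATCH pattern expresses what would be a multi-join query in SQL
    result = session.run(
        "MATCH (p:Person {name: $name})-[:KNOWS]->(friend:Person) "
        "RETURN friend.name AS name",
        name="Alice",
    )
    for record in result:
        print(record["name"])

driver.close()
```
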
MATLAB Fundamentals, Data Science & Report Generation
Code: matlabdsandreporting. Duration: 126 hours.

In the first part of this training, we cover the fundamentals of MATLAB and its function as both a language and a platform. Included in this discussion is an introduction to MATLAB syntax, arrays and matrices, data visualization, script development, and object-oriented principles.

In the second part, we demonstrate how to use MATLAB for data mining, machine learning and predictive analytics. To provide participants with a clear and practical perspective of MATLAB's approach and power, we draw comparisons between using MATLAB and using other tools such as spreadsheets, C, C++, and Visual Basic.

In the third part of the training, participants learn how to streamline their work by automating their data processing and report generation.

Throughout the course, participants will put into practice the ideas learned through hands-on exercises in a lab environment. By the end of the training, participants will have a thorough grasp of MATLAB's capabilities and will be able to employ it for solving real-world data science problems as well as for streamlining their work through automation. Assessments will be conducted throughout the course to gauge progress.

Format of the course: Course includes theoretical and practical exercises, including case discussions, sample code inspection, and hands-on implementation.

Note: Practice sessions will be based on pre-arranged sample data report templates. If you have specific requirements, please contact us to arrange.

Outline:

Introduction: MATLAB for data science and reporting

Part 01: MATLAB fundamentals
- Overview: MATLAB for data analysis, visualization, modeling, and programming
- Working with the MATLAB user interface
- Overview of MATLAB syntax
- Entering commands: Using the command line interface
- Creating variables: Numeric vs character data
- Analyzing vectors and matrices: Creating and manipulating; Performing calculations
- Visualizing vector and matrix data
- Working with data files: Importing data from Excel spreadsheets
- Working with data types: Working with table data
- Automating commands with scripts: Creating and running scripts; Organizing and publishing your scripts
- Writing programs with branching and loops: User interaction and flow control
- Writing functions: Creating and calling functions; Debugging with MATLAB Editor
- Applying object-oriented programming principles to your programs

Part 02: MATLAB for data science
- Overview: MATLAB for data mining, machine learning and predictive analytics
- Accessing data: Obtaining data from files, spreadsheets, and databases; Obtaining data from test equipment and hardware; Obtaining data from software and the Web
- Exploring data: Identifying trends, testing hypotheses, and estimating uncertainty
- Creating customized algorithms
- Creating visualizations
- Creating models
- Publishing customized reports
- Sharing analysis tools: As MATLAB code; As standalone desktop or Web applications
- Using the Statistics and Machine Learning Toolbox
- Using the Neural Network Toolbox

Part 03: Report generation
- Overview: Presenting results from MATLAB programs, applications, and sample data; Generating Microsoft Word, PowerPoint, PDF, and HTML reports; Templated reports; Tailor-made reports (using the organization's templates and standards)
- Creating reports interactively vs programmatically: Using the Report Explorer; Using the DOM (Document Object Model) API
- Creating reports interactively using Report Explorer: Report Explorer examples (Magic Squares Report Explorer example); Creating reports (using Report Explorer to create a report setup file and define report structure and content); Formatting reports (specifying default report style and format for Report Explorer reports); Generating reports (configuring Report Explorer for processing and running reports); Managing report conversion templates (copying and managing Microsoft Word, PDF, and HTML conversion templates for Report Explorer reports); Customizing report conversion templates (customizing the style and format of Microsoft Word and HTML conversion templates for Report Explorer reports); Customizing components and style sheets (customizing report components, defining layout style sheets)
- Creating reports programmatically in MATLAB: Template-based report object (DOM) API examples (functional report; object-oriented report; programmatic report formatting); Creating report content (using the Document Object Model (DOM) API); Report format basics (specifying format for report content); Creating form-based reports (using the DOM API to fill in the blanks in a report form); Creating object-oriented reports (deriving classes to simplify report creation and maintenance); Creating and formatting report objects (lists, tables, and images); Creating DOM reports from HTML (appending an HTML string or file to a Microsoft Word, PDF, or HTML report generated by the DOM API); Creating report templates (creating templates to use with programmatic reports); Formatting page layouts (formatting pages in Microsoft Word and PDF reports)

Summary and closing remarks

Advanced Python
Code: pythonmultipurpose. Duration: 28 hours.

In this instructor-led training, participants will learn advanced Python programming techniques, including how to apply this versatile language to solve problems in areas such as distributed applications, finance, data analysis and visualization, UI programming and maintenance scripting.

Audience: Developers

Format of the course: Part lecture, part discussion, exercises and heavy hands-on practice

Note: If you wish to add, remove or customize any section or topic within this course, please contact us to arrange.

Outline:

- Introduction: Python versatility: from data analysis to web crawling
- Python data structures and operations: Integers and floats; Strings and bytes; Tuples and lists; Dictionaries and ordered dictionaries; Sets and frozen sets; Data frames (pandas); Conversions
- Object-oriented programming with Python: Inheritance; Polymorphism; Static classes; Static functions; Decorators; Other
- Data analysis with pandas: Data cleaning; Using vectorized data in pandas; Data wrangling; Sorting and filtering data; Aggregate operations; Analyzing time series
- Data visualization: Plotting diagrams with matplotlib; Using matplotlib from within pandas (see the sketch below); Creating quality diagrams; Visualizing data in Jupyter notebooks; Other visualization libraries in Python
- Vectorizing data in NumPy: Creating NumPy arrays; Common operations on matrices; Using ufuncs; Views and broadcasting on NumPy arrays; Optimizing performance by avoiding loops; Optimizing performance with cProfile
- Processing Big Data with Python: Building and supporting distributed applications with Python; Data storage: working with SQL and NoSQL databases; Distributed processing with Hadoop and Spark; Scaling your applications
- Python for finance: Packages, libraries and APIs for financial processing: Zipline, PyAlgoTrade, Pybacktest, quantlib, Python APIs
- Extending Python (and vice versa) with other languages: C#, Java, C++, Perl, others
- Python multi-threaded programming: Modules; Synchronizing; Prioritizing
- UI programming with Python: Framework options for building GUIs in Python: Tkinter, PyQt
- Python for maintenance scripting: Raising and catching exceptions correctly; Organizing code into modules and packages; Understanding symbol tables and accessing them in code; Picking a testing framework and applying TDD in Python
- Python for the web: Packages for web processing; Web crawling; Parsing HTML and XML; Filling web forms automatically
- Closing remarks

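A minimal sketch of the pandas-plus-matplotlib plotting covered in the data visualization section above; the daily time series here is synthetic placeholder data.

```python
# Minimal sketch: plotting directly from pandas with matplotlib, as covered
# under "Data visualization". The random-walk time series is synthetic.
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Synthetic daily time series
dates = pd.date_range("2024-01-01", periods=90, freq="D")
series = pd.Series(np.random.randn(90).cumsum(), index=dates)

# pandas wraps matplotlib, so a Series can plot itself
ax = series.rolling(window=7).mean().plot(title="7-day rolling mean")
ax.set_xlabel("Date")
ax.set_ylabel("Value")
plt.tight_layout()
plt.show()
```
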
Zeppelin for interactive data analytics
Code: zeppelin. Duration: 14 hours.

Apache Zeppelin is a web-based notebook for capturing, exploring, visualizing and sharing Hadoop- and Spark-based data. This instructor-led, live training introduces the concepts behind interactive data analytics and walks participants through the deployment and usage of Zeppelin in a single-user or multi-user environment.

By the end of this training, participants will be able to:

- Install and configure Zeppelin
- Develop, organize, execute and share data in a browser-based interface
- Visualize results without referring to the command line or cluster details
- Execute and collaborate on long workflows
- Work with any of a number of plug-in language/data-processing backends, such as Scala (with Apache Spark), Python (with Apache Spark), Spark SQL, JDBC, Markdown and Shell (see the sketch below)
- Integrate Zeppelin with Spark, Flink and MapReduce
- Secure multi-user instances of Zeppelin with Apache Shiro

Audience: Data engineers, data analysts, data scientists, software developers

Format of the course: Part lecture, part discussion, exercises and heavy hands-on practice

To request a customized course outline for this training, please contact us.

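For illustration, here is what a single Zeppelin notebook paragraph using the PySpark backend might look like; the file path and column name are placeholders, and the spark and z objects are provided by Zeppelin itself rather than imported.

```python
# Minimal sketch of a Zeppelin notebook paragraph using the PySpark backend.
# The %pyspark directive selects the interpreter; "/data/events.csv" and the
# "event_type" column are placeholders; `spark` and `z` come from Zeppelin.

%pyspark
df = spark.read.csv("/data/events.csv", header=True, inferSchema=True)

counts = df.groupBy("event_type").count().orderBy("count", ascending=False)

# z.show() renders the result with Zeppelin's built-in table/chart widgets
z.show(counts)
```
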
Datameer for Data Analysts
Code: datameer. Duration: 14 hours.

Datameer is a business intelligence and analytics platform built on Hadoop. It allows end-users to access, explore and correlate large-scale, structured, semi-structured and unstructured data in an easy-to-use fashion.

In this instructor-led, live training, participants will learn how to use Datameer to overcome Hadoop's steep learning curve as they step through the setup and analysis of a series of big data sources.

By the end of this training, participants will be able to:

- Create, curate, and interactively explore an enterprise data lake
- Access business intelligence data warehouses, transactional databases and other analytic stores
- Use a spreadsheet user interface to design end-to-end data processing pipelines
- Access pre-built functions to explore complex data relationships
- Use drag-and-drop wizards to visualize data and create dashboards
- Use tables, charts, graphs, and maps to analyze query results

Audience: Data analysts

Format of the course: Part lecture, part discussion, exercises and heavy hands-on practice

To request a customized course outline for this training, please contact us.

Data Visualization
Code: datavis1. Duration: 28 hours.

This course is intended for engineers and decision makers working in data mining and knowledge discovery. You will learn how to create effective plots and ways to present and represent your data in a way that will appeal to decision makers and help them understand hidden information.

Day 1: What is data visualization; Why it is important; Data visualization vs data mining; Human cognition; HMI; Common pitfalls

Day 2: Different types of curves; Drill-down curves; Categorical data plotting; Multi-variable plots; Data glyph and icon representation

Day 3: Plotting KPIs with data; R and X charts; Examples; What-if dashboards; Parallel axes; Mixing categorical data with numeric data

Day 4: Different hats of data visualization; How data visualization can lie; Disguised and hidden trends; A case study of student data; Visual queries and region selection

Introduction to Data Visualization with R
Code: datavisR1. Duration: 28 hours.

This course is intended for data engineers, decision makers and data analysts. It will lead you to create very effective plots using R Studio that appeal to decision makers and help them find hidden information and make the right decisions.

Day 1: Overview of R programming; Introduction to data visualization; Scatter plots and clusters; The use of noise and jitters

Day 2: Other types of 2D and 3D plots; Histograms; Heat charts; Categorical data plotting

Day 3: Plotting KPIs with data; R and X charts; Examples; Dashboards; Parallel axes; Mixing categorical data with numeric data

Day 4: Different hats of data visualization; Disguised and hidden trends; Case studies; Saving plots and loading Excel files

deck.gl: Visualizing Large-scale Geospatial Data
Code: deckgl. Duration: 14 hours.

deck.gl is an open-source, WebGL-powered library for exploring and visualizing data assets at scale. Created by Uber, it is especially useful for gaining insights from geospatial data sources, such as data on maps.

This instructor-led, live training introduces the concepts and functionality behind deck.gl and walks participants through the setup of a demonstration project.

By the end of this training, participants will be able to:

- Take data from very large collections and turn it into compelling visual representations
- Visualize data collected from transportation and journey-related use cases, such as pick-up and drop-off experiences, network traffic, etc.
- Apply layering techniques to geospatial data to depict changes in data over time
- Integrate deck.gl with React (for reactive programming) and Mapbox GL (for visualizations on Mapbox-based maps)
- Understand and explore other use cases for deck.gl, including visualizing points collected from a 3D indoor scan, visualizing machine learning models in order to optimize their algorithms, etc.

Audience: Developers, data scientists

Format of the course: Part lecture, part discussion, exercises and heavy hands-on practice

To request a customized course outline for this training, please contact us.

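As a taste of the library, here is a minimal sketch using pydeck, deck.gl's Python binding, to render a scatterplot layer over a map; the coordinates are placeholder points around San Francisco.

```python
# Minimal sketch: rendering a deck.gl ScatterplotLayer via pydeck, the
# Python binding for deck.gl. The three coordinates are placeholder data.
import pydeck as pdk

data = [
    {"lng": -122.42, "lat": 37.77},
    {"lng": -122.40, "lat": 37.78},
    {"lng": -122.41, "lat": 37.76},
]

layer = pdk.Layer(
    "ScatterplotLayer",        # one of deck.gl's built-in layer types
    data=data,
    get_position="[lng, lat]",
    get_radius=200,            # radius in meters
    get_fill_color=[255, 0, 0],
)

view = pdk.ViewState(latitude=37.77, longitude=-122.41, zoom=11)

# Writes a self-contained HTML page with the interactive map
pdk.Deck(layers=[layer], initial_view_state=view).to_html("deck_demo.html")
```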

Upcoming Courses

Data Visualization training is available as weekend or evening courses, boot camps, instructor-led classes, private one-on-one sessions, and on-site training.

Course Discounts Newsletter

We respect the privacy of your email address. We will not pass on or sell your address to others.
You can always change your preferences or unsubscribe completely.

Some of our clients