Data Science Training Courses

Data Science Training

Data science, also known as data-driven science, is an interdisciplinary field that uses scientific methods, processes and systems to extract knowledge or insights from data in various forms, either structured or unstructured,[1][2] similar to Knowledge Discovery in Databases (KDD).

Client Testimonials

Data and Analytics - from the ground up

The way the trainer made complex subjects easy to understand.

Adam Drewry - Digital Jersey

A practical introduction to Data Analysis and Big Data

Overall the Content was good.

Sameer Rohadia - Continental AG / Abteilung: CF IT Finance

A practical introduction to Data Analysis and Big Data

Willingness to share more

Balaram Chandra Paul - MOL Information Technology Asia Limited

Data and Analytics - from the ground up

Learning how to use Excel properly

Torin Mitchell - Digital Jersey

Data and Analytics - from the ground up

real life practical examples

Wioleta (Vicky) Celinska-Drozd - Digital Jersey

Data and Analytics - from the ground up

First session. Very intensive and quick.

Digital Jersey

A practical introduction to Data Analysis and Big Data

It covered a broad range of information.

Continental AG / Abteilung: CF IT Finance

Data and Analytics - from the ground up

I enjoyed the Excel sheets provided having the exercises with examples. This meant that if Kamil was held up helping other people, I could crack on with the next parts.

Luke Pontin - Digital Jersey

Data and Analytics - from the ground up

Kamil is a very knowledgeable and nice person; I have learned a lot from him.

Aleksandra Szubert - Digital Jersey

A practical introduction to Data Analysis and Big Data

presentation of technologies

Continental AG / Abteilung: CF IT Finance

Data and Analytics - from the ground up

The patience of Kamil.

Laszlo Maros - Digital Jersey

Data and Analytics - from the ground up

Detailed and comprehensive instruction given by experienced and clearly knowledgeable expert on the subject.

Justin Roche - Digital Jersey

Data Science Course Outlines

Code Name Duration Overview
matlabdsandreporting MATLAB Fundamentals, Data Science & Report Generation 126 hours In the first part of this training, we cover the fundamentals of MATLAB and its function as both a language and a platform.  Included in this discussion is an introduction to MATLAB syntax, arrays and matrices, data visualization, script development, and object-oriented principles. In the second part, we demonstrate how to use MATLAB for data mining, machine learning and predictive analytics. To provide participants with a clear and practical perspective of MATLAB's approach and power, we draw comparisons between using MATLAB and using other tools such as spreadsheets, C, C++, and Visual Basic. In the third part of the training, participants learn how to streamline their work by automating their data processing and report generation. Throughout the course, participants will put into practice the ideas learned through hands-on exercises in a lab environment. By the end of the training, participants will have a thorough grasp of MATLAB's capabilities and will be able to employ it for solving real-world data science problems as well as for streamlining their work through automation. Assessments will be conducted throughout the course to gauge progress. Format of the course Course includes theoretical and practical exercises, including case discussions, sample code inspection, and hands-on implementation. Note Practice sessions will be based on pre-arranged sample data report templates. If you have specific requirements, please contact us to arrange.
hdp Hortonworks Data Platform (HDP) for administrators 21 hours Hortonworks Data Platform is an open-source Apache Hadoop support platform that provides a stable foundation for developing big data solutions on the Apache Hadoop ecosystem. This instructor-led, live training introduces Hortonworks and walks participants through the deployment of a Spark + Hadoop solution. By the end of this training, participants will be able to: Use Hortonworks to reliably run Hadoop at a large scale Unify Hadoop's security, governance, and operations capabilities with Spark's agile analytic workflows Use Hortonworks to investigate, validate, certify and support each of the components in a Spark project Process different types of data, including structured, unstructured, in-motion, and at-rest Audience Hadoop administrators Format of the course Part lecture, part discussion, exercises and heavy hands-on practice
zeppelin Zeppelin for interactive data analytics 14 hours Apache Zeppelin is a web-based notebook for capturing, exploring, visualizing and sharing Hadoop- and Spark-based data. This instructor-led, live training introduces the concepts behind interactive data analytics and walks participants through the deployment and usage of Zeppelin in a single-user or multi-user environment. By the end of this training, participants will be able to: Install and configure Zeppelin Develop, organize, execute and share data in a browser-based interface Visualize results without referring to the command line or cluster details Execute and collaborate on long workflows Work with any of a number of plug-in language/data-processing backends, such as Scala (with Apache Spark), Python (with Apache Spark), Spark SQL, JDBC, Markdown and Shell Integrate Zeppelin with Spark, Flink and MapReduce Secure multi-user instances of Zeppelin with Apache Shiro Audience Data engineers Data analysts Data scientists Software developers Format of the course Part lecture, part discussion, exercises and heavy hands-on practice
d2dbdpa From Data to Decision with Big Data and Predictive Analytics 21 hours Audience If you are trying to make sense of the data you have access to, or want to analyse unstructured data available on the net (such as Twitter, LinkedIn, etc.), this course is for you. It is mostly aimed at decision makers and people who need to choose what data is worth collecting and what is worth analyzing. It is not aimed at people configuring the solution, although those people will benefit from the big picture. Delivery Mode During the course, delegates will be presented with working examples of mostly open source technologies. Short lectures will be followed by presentations and simple exercises by the participants. Content and Software used All software used is updated each time the course is run, so we use the newest versions possible. The course covers the process of obtaining, formatting, processing and analysing data, and explains how to automate the decision-making process with machine learning.
snorkel Snorkel: Rapidly process training data 7 hours Snorkel is a system for rapidly creating, modeling, and managing training data. It focuses on accelerating the development of structured or "dark" data extraction applications for domains in which large labeled training sets are not available or easy to obtain. In this instructor-led, live training, participants will learn techniques for extracting value from unstructured data such as text, tables, figures, and images through modeling of training data with Snorkel. By the end of this training, participants will be able to: Programmatically create and label massive training sets Train high-quality end models by first modeling noisy training sets Use Snorkel to implement weak supervision techniques and apply data programming to weakly-supervised machine learning systems Audience Developers Data scientists Format of the course Part lecture, part discussion, exercises and heavy hands-on practice
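
To make the idea of data programming more concrete, here is a minimal sketch in plain Python (deliberately not the Snorkel API itself) of how several noisy, heuristic labeling functions can vote on unlabeled text; the heuristics and the simple majority-vote combiner are illustrative assumptions only.

ABSTAIN, HAM, SPAM = -1, 0, 1

def lf_contains_link(text):
    # Heuristic: messages with links are often spam.
    return SPAM if "http://" in text or "https://" in text else ABSTAIN

def lf_contains_offer(text):
    # Heuristic: promotional wording suggests spam.
    return SPAM if "free" in text.lower() or "winner" in text.lower() else ABSTAIN

def lf_short_message(text):
    # Heuristic: very short messages tend to be harmless.
    return HAM if len(text.split()) < 5 else ABSTAIN

def weak_label(text, lfs):
    # Combine labeling-function votes with a simple majority vote.
    votes = [lf(text) for lf in lfs if lf(text) != ABSTAIN]
    return max(set(votes), key=votes.count) if votes else ABSTAIN

lfs = [lf_contains_link, lf_contains_offer, lf_short_message]
print(weak_label("You are a winner! Claim your free prize at http://example.com", lfs))  # prints 1 (SPAM)
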
dsbda Data Science for Big Data Analytics 35 hours Big data refers to data sets that are so voluminous and complex that traditional data processing application software is inadequate to deal with them. Big data challenges include capturing data, data storage, data analysis, search, sharing, transfer, visualization, querying, updating and information privacy.
jupyter Jupyter for Data Science Teams 7 hours Jupyter is an open-source, web-based interactive IDE and computing environment. This instructor-led, live training introduces the idea of collaborative development in data science and demonstrates how to use Jupyter to track and participate as a team in the "life cycle of a computational idea". It walks participants through the creation of a sample data science project built on top of the Jupyter ecosystem. By the end of this training, participants will be able to: Install and configure Jupyter, including the creation and integration of a team repository on Git Use Jupyter features such as extensions, interactive widgets, multiuser mode and more to enable project collaboration Create, share and organize Jupyter Notebooks with team members Choose from Scala, Python, and R to write and execute code against big data systems such as Apache Spark, all through the Jupyter interface Audience Data science teams Format of the course Part lecture, part discussion, exercises and heavy hands-on practice Note The Jupyter Notebook supports over 40 languages including R, Python, Scala, Julia, etc. To customize this course to your language(s) of choice, please contact us to arrange.
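
As a small taste of working programmatically with the notebook format the course is built around, the following sketch uses the nbformat package (assumed to be installed alongside Jupyter) to create and save a notebook that could then be committed to a team Git repository; the file name and cell contents are hypothetical.

import nbformat
from nbformat.v4 import new_notebook, new_markdown_cell, new_code_cell

# Build a notebook with one markdown cell and one code cell.
nb = new_notebook()
nb.cells = [
    new_markdown_cell("# Team analysis notebook"),
    new_code_cell("print('hello from the team repository')"),
]

# Write it to disk so it can be versioned and shared.
with open("team_analysis.ipynb", "w", encoding="utf-8") as f:
    nbformat.write(nb, f)
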
marvin Marvin Image Processing Framework - creating image and video processing applications with Marvin 14 hours Marvin is an extensible, cross-platform, open-source image and video processing framework developed in Java.  Developers can use Marvin to manipulate images, extract features from images for classification tasks, generate figures algorithmically, process video file datasets, and set up unit test automation. Some of Marvin's video applications include filtering, augmented reality, object tracking and motion detection. In this course participants will learn the principles of image and video analysis and utilize the Marvin Framework and its image processing algorithms to construct their own application. Audience     Software developers wishing to utilize a rich, plug-in based open-source framework to create image and video processing applications Format of the course     The basic principles of image analysis, video analysis and the Marvin Framework are first introduced. Students are given project-based tasks which allow them to practice the concepts learned. By the end of the class, participants will have developed their own application using the Marvin Framework and libraries.
deckgl deck.gl: Visualizing Large-scale Geospatial Data 14 hours deck.gl is an open-source, WebGL-powered library for exploring and visualizing data assets at scale. Created by Uber, it is especially useful for gaining insights from geospatial data sources, such as data on maps. This instructor-led, live training introduces the concepts and functionality behind deck.gl and walks participants through the set up of a demonstration project. By the end of this training, participants will be able to: Take data from very large collections and turn it into compelling visual representations Visualize data collected from transportation and journey-related use cases, such as pick-up and drop-off experiences, network traffic, etc. Apply layering techniques to geospatial data to depict changes in data over time Integrate deck.gl with React (for Reactive programming) and Mapbox GL (for visualizations on Mapbox based maps). Understand and explore other use cases for deck.gl, including visualizing points collected from a 3D indoor scan, visualizing machine learning models in order to optimize their algorithms, etc. Audience Developers Data scientists Format of the course Part lecture, part discussion, exercises and heavy hands-on practice
Torch Torch: Getting started with Machine and Deep Learning 21 hours Torch is an open source machine learning library and a scientific computing framework based on the Lua programming language. It provides a development environment for numerics, machine learning, and computer vision, with a particular emphasis on deep learning and convolutional nets. It is one of the fastest and most flexible frameworks for Machine and Deep Learning and is used by companies such as Facebook, Google, Twitter, NVIDIA, AMD, Intel, and many others. In this course we cover the principles of Torch, its unique features, and how it can be applied in real-world applications. We step through numerous hands-on exercises all throughout, demonstrating and practicing the concepts learned. By the end of the course, participants will have a thorough understanding of Torch's underlying features and capabilities as well as its role and contribution within the AI space compared to other frameworks and libraries. Participants will have also received the necessary practice to implement Torch in their own projects. Audience     Software developers and programmers wishing to enable Machine and Deep Learning within their applications Format of the course     Overview of Machine and Deep Learning     In-class coding and integration exercises     Test questions sprinkled along the way to check understanding
kdbplusandq kdb+ and q: Analyze time series data 21 hours kdb+ is an in-memory, column-oriented database and q is its built-in, interpreted vector-based language. In kdb+, tables are columns of vectors and q is used to perform operations on the table data as if it was a list. kdb+ and q are commonly used in high frequency trading and are popular with the major financial institutions, including Goldman Sachs, Morgan Stanley, Merrill Lynch, JP Morgan, etc. In this instructor-led, live training, participants will learn how to create a time series data application using kdb+ and q. By the end of this training, participants will be able to: Understand the difference between a row-oriented database and a column-oriented database Select data, write scripts and create functions to carry out advanced analytics Analyze time series data such as stock and commodity exchange data Use kdb+'s in-memory capabilities to store, analyze, process and retrieve large data sets at high speed Think of functions and data at a higher level than the standard function(arguments) approach common in non-vector languages Explore other time-sensitive applications for kdb+, including energy trading, telecommunications, sensor data, log data, and machine and network usage monitoring Audience Developers Database engineers Data scientists Data analysts Format of the course Part lecture, part discussion, exercises and heavy hands-on practice
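
The course itself is taught in q, but the column-oriented, vectorized style it describes can be previewed in Python with pandas (an analogy only, not kdb+/q syntax); the symbols, prices and sizes below are made-up sample data.

import pandas as pd

# A "table" as a set of equal-length column vectors.
trades = pd.DataFrame({
    "sym":   ["AAPL", "AAPL", "MSFT", "MSFT"],
    "price": [189.2, 189.5, 410.1, 409.8],
    "size":  [100, 250, 300, 50],
})

# Whole-column (vector) operations replace row-by-row loops.
trades["notional"] = trades["price"] * trades["size"]
vwap = trades.groupby("sym")["notional"].sum() / trades.groupby("sym")["size"].sum()
print(vwap)  # volume-weighted average price per symbol
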
OpenNN OpenNN: Implementing neural networks 14 hours OpenNN is an open-source class library, written in C++, which implements neural networks for use in machine learning. In this course we go over the principles of neural networks and use OpenNN to implement a sample application. Audience Software developers and programmers wishing to create Deep Learning applications. Format of the course Lecture and discussion coupled with hands-on exercises.
fsharpfordatascience F# for Data Science 21 hours Data science is the application of statistical analysis, machine learning, data visualization and programming for the purpose of understanding and interpreting real-world data. F# is a well suited programming language for data science as it combines efficient execution, REPL-scripting, powerful libraries and scalable data integration. In this instructor-led, live training, participants will learn how to use F# to solve a series of real-world data science problems. By the end of this training, participants will be able to: Use F#'s integrated data science packages Use F# to interoperate with other languages and platforms, including Excel, R, Matlab, and Python Use the Deedle package to solve time series problems Carry out advanced analysis with minimal lines of production-quality code Understand how functional programming is a natural fit for scientific and big data computations Access and visualize data with F# Apply F# for machine learning Explore solutions for problems in domains such as business intelligence and social gaming Audience Developers Data scientists Format of the course Part lecture, part discussion, exercises and heavy hands-on practice
BigData_ A practical introduction to Data Analysis and Big Data 35 hours Participants who complete this training will gain a practical, real-world understanding of Big Data and its related technologies, methodologies and tools. Participants will have the opportunity to put this knowledge into practice through hands-on exercises. Group interaction and instructor feedback make up an important component of the class. The course starts with an introduction to the fundamental concepts of Big Data, then progresses into the programming languages and methodologies used to perform Data Analysis. Finally, we discuss the tools and infrastructure that enable Big Data storage, Distributed Processing, and Scalability. Audience Developers / programmers IT consultants Format of the course Part lecture, part discussion, hands-on practice and implementation, occasional quizzing to measure progress.
tidyverse Introduction to Data Visualization with Tidyverse and R 7 hours The Tidyverse is a collection of versatile R packages for cleaning, processing, modeling, and visualizing data. Some of the packages included are: ggplot2, dplyr, tidyr, readr, purrr, and tibble. In this instructor-led, live training, participants will learn how to manipulate and visualize data using the tools included in the Tidyverse. By the end of this training, participants will be able to: Perform data analysis and create appealing visualizations Draw useful conclusions from various datasets of sample data Filter, sort and summarize data to answer exploratory questions Turn processed data into informative line plots, bar plots, histograms Import and filter data from diverse data sources, including Excel, CSV, and SPSS files Audience Beginners to the R language Beginners to data analysis and data visualization Format of the course Part lecture, part discussion, exercises and heavy hands-on practice
DatSci7 Data Science Programme 245 hours The explosion of information and data in today's world is unparalleled; our ability to innovate and push the boundaries of the possible is growing faster than it ever has. The role of Data Scientist is one of the most in-demand roles across industry today. We offer much more than learning through theory; we deliver practical, marketable skills that bridge the gap between the world of academia and the demands of industry. This 7-week curriculum can be tailored to your specific industry requirements; please contact us for further information or visit the Nobleprog Institute website www.inobleprog.co.uk Audience: This programme is aimed at graduates, as well as anyone with the required prerequisite skills, which will be determined by an assessment and interview. Delivery: Delivery of the course will be a mixture of Instructor Led Classroom and Instructor Led Online; typically the 1st week will be 'classroom led', weeks 2 - 6 'virtual classroom' and week 7 back to 'classroom led'.
matlabpredanalytics Matlab for Predictive Analytics 21 hours Predictive analytics is the process of using data analytics to make predictions about the future. This process uses data along with data mining, statistics, and machine learning techniques to create a predictive model for forecasting future events. In this instructor-led, live training, participants will learn how to use Matlab to build predictive models and apply them to large sample data sets to predict future events based on the data. By the end of this training, participants will be able to: Create predictive models to analyze patterns in historical and transactional data Use predictive modeling to identify risks and opportunities Build mathematical models that capture important trends Use data from devices and business systems to reduce waste, save time, or cut costs Audience Developers Engineers Domain experts Format of the course Part lecture, part discussion, exercises and heavy hands-on practice
opennmt OpenNMT: Setting up a Neural Machine Translation System 7 hours OpenNMT is a full-featured, open-source (MIT) neural machine translation system that utilizes the Torch mathematical toolkit. In this training participants will learn how to set up and use OpenNMT to carry out translation of various sample data sets. The course starts with an overview of neural networks as they apply to machine translation. Participants will carry out live exercises throughout the course to demonstrate their understanding of the concepts learned and get feedback from the instructor. By the end of this training, participants will have the knowledge and practice needed to implement a live OpenNMT solution. Source and target language samples will be pre-arranged per the audience's requirements. Audience Localization specialists with a technical background Global content managers Localization engineers Software developers in charge of implementing global content solutions Format of the course Part lecture, part discussion, heavy hands-on practice
mlbankingr Machine Learning for Banking (with R) 28 hours In this instructor-led, live training, participants will learn how to apply machine learning techniques and tools for solving real-world problems in the banking industry. R will be used as the programming language. Participants first learn the key principles, then put their knowledge into practice by building their own machine learning models and using them to complete a number of live projects. Audience Developers Data scientists Banking professionals with a technical background Format of the course Part lecture, part discussion, exercises and heavy hands-on practice
Fairsec Fairsec: Setting up a CNN-based machine translation system 7 hours Fairseq is an open-source sequence-to-sequence learning toolkit created by Facebook for use in Neural Machine Translation (NMT). In this training participants will learn how to use Fairseq to carry out translation of sample content. By the end of this training, participants will have the knowledge and practice needed to implement a live Fairseq-based machine translation solution. Source and target language content samples can be prepared according to the audience's requirements. Audience Localization specialists with a technical background Global content managers Localization engineers Software developers in charge of implementing global content solutions Format of the course Part lecture, part discussion, heavy hands-on practice
mlbankingpython_ Machine Learning for Banking (with Python) 21 hours In this instructor-led, live training, participants will learn how to apply machine learning techniques and tools for solving real-world problems in the banking industry. Python will be used as the programming language. Participants first learn the key principles, then put their knowledge into practice by building their own machine learning models and using them to complete a number of team projects. Audience Developers Data scientists Format of the course Part lecture, part discussion, exercises and heavy hands-on practice
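
As a rough illustration of the kind of model participants might build, here is a scikit-learn sketch that fits a logistic regression to synthetic loan data; the features, the toy default rule and all numbers are invented for the example and are not real banking logic.

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
income = rng.normal(50_000, 15_000, 1_000)                   # synthetic annual income
debt_ratio = rng.uniform(0, 1, 1_000)                        # synthetic debt-to-income ratio
default = (debt_ratio * 40_000 > income * 0.4).astype(int)   # toy labeling rule, illustrative only

X = np.column_stack([income, debt_ratio])
X_train, X_test, y_train, y_test = train_test_split(X, default, random_state=0)

model = LogisticRegression().fit(X_train, y_train)
print("held-out accuracy:", model.score(X_test, y_test))
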
ApHadm1 Apache Hadoop: Manipulation and Transformation of Data Performance 21 hours This course is intended for developers, architects, data scientists or any profile that requires access to data either intensively or on a regular basis. The major focus of the course is data manipulation and transformation. Among the tools in the Hadoop ecosystem this course includes the use of Pig and Hive both of which are heavily used for data transformation and manipulation. This training also addresses performance metrics and performance optimisation. The course is entirely hands on and is punctuated by presentations of the theoretical aspects.
undnn Understanding Deep Neural Networks 35 hours This course begins by giving you conceptual knowledge of neural networks and machine learning algorithms in general, and of deep learning (algorithms and applications). Part 1 (40%) of this training focuses more on the fundamentals, but will help you choose the right technology: TensorFlow, Caffe, Theano, DeepDrive, Keras, etc. Part 2 (20%) of this training introduces Theano - a Python library that makes writing deep learning models easy. Part 3 (40%) of the training is based extensively on TensorFlow - the 2nd-generation API of Google's open source software library for Deep Learning. The examples and hands-on exercises are all done in TensorFlow. Audience This course is intended for engineers seeking to use TensorFlow for their Deep Learning projects. After completing this course, delegates will: have a good understanding of deep neural networks (DNN), CNN and RNN understand TensorFlow's structure and deployment mechanisms be able to carry out installation / production environment / architecture tasks and configuration be able to assess code quality, perform debugging and monitoring be able to implement advanced production tasks like training models, building graphs and logging Not all the topics can be covered in a public classroom of 35 hours' duration due to the vastness of the subject. The duration of the complete course is around 70 hours, not 35 hours.
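
For illustration only (the architecture and dataset below are not taken from the course materials), a TensorFlow/Keras model of the kind discussed might look like this small dense network trained on MNIST.

import tensorflow as tf

# Load and scale the MNIST digits.
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0

# A minimal feed-forward network.
model = tf.keras.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dense(10, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.fit(x_train, y_train, epochs=1, validation_data=(x_test, y_test))
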
danagr Data and Analytics - from the ground up 42 hours Data analytics is a crucial tool in business today. We will focus throughout on developing skills for practical, hands-on data analysis. The aim is to help delegates give evidence-based answers to questions: What has happened? processing and analyzing data producing informative data visualizations What will happen? forecasting future performance evaluating forecasts What should happen? turning data into evidence-based business decisions optimizing processes The course itself can be delivered either as a 6-day classroom course or remotely over a period of weeks if preferred. We can work with you to deliver the course to best suit your needs.
pythonfinance Python Programming for Finance 35 hours Python is a programming language that has gained huge popularity in the financial industry. Used by the largest investment banks and hedge funds, it is being employed to build a wide range of financial applications ranging from core trading programs to risk management systems. In this instructor-led, live training, participants will learn how to use Python to develop practical applications for solving a number of specific finance-related problems. By the end of this training, participants will be able to: Understand the fundamentals of the Python programming language Download, install and maintain the best development tools for creating financial applications in Python Select and utilize the most suitable Python packages and programming techniques to organize, visualize, and analyze financial data from various sources (CSV, Excel, databases, web, etc.) Build applications that solve problems related to asset allocation, risk analysis, investment performance and more Troubleshoot, integrate, deploy and optimize a Python application Audience Developers Analysts Quants Format of the course Part lecture, part discussion, exercises and heavy hands-on practice Note This training aims to provide solutions for some of the principal problems faced by finance professionals. However, if you have a particular topic, tool or technique that you wish to append or elaborate further on, please contact us to arrange.
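
A minimal sketch of the style of analysis covered, assuming historical price data in a CSV file: loading it with pandas and computing daily returns and annualised volatility. The file name "prices.csv" and its "date"/"close" columns are hypothetical.

import pandas as pd

prices = pd.read_csv("prices.csv", parse_dates=["date"], index_col="date")
returns = prices["close"].pct_change().dropna()   # simple daily returns

annual_vol = returns.std() * (252 ** 0.5)         # ~252 trading days per year
print(f"Annualised volatility: {annual_vol:.2%}")
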
Fairseq Fairseq: Setting up a CNN-based machine translation system 7 hours Fairseq is an open-source sequence-to-sequence learning toolkit created by Facebook for use in Neural Machine Translation (NMT). In this training participants will learn how to use Fairseq to carry out translation of sample content. By the end of this training, participants will have the knowledge and practice needed to implement a live Fairseq-based machine translation solution. Audience Localization specialists with a technical background Global content managers Localization engineers Software developers in charge of implementing global content solutions Format of the course Part lecture, part discussion, heavy hands-on practice Note If you wish to use specific source and target language content, please contact us to arrange.
rforfinance R Programming for Finance 28 hours R is a popular programming language in the financial industry. It is used in financial applications ranging from core trading programs to risk management systems. In this instructor-led, live training, participants will learn how to use R to develop practical applications for solving a number of specific finance-related problems. By the end of this training, participants will be able to: Understand the fundamentals of the R programming language Select and utilize R packages and techniques to organize, visualize, and analyze financial data from various sources (CSV, Excel, databases, web, etc.) Build applications that solve problems related to asset allocation, risk analysis, investment performance and more Troubleshoot, integrate, deploy and optimize an R application Audience Developers Analysts Quants Format of the course Part lecture, part discussion, exercises and heavy hands-on practice Note This training aims to provide solutions for some of the principal problems faced by finance professionals. However, if you have a particular topic, tool or technique that you wish to append or elaborate further on, please contact us to arrange.
facebooknmt Facebook NMT: Setting up a Neural Machine Translation System 7 hours Fairseq is an open-source sequence-to-sequence learning toolkit created by Facebook for use in Neural Machine Translation (NMT). In this training participants will learn how to use Fairseq to carry out translation of sample content. By the end of this training, participants will have the knowledge and practice needed to implement a live Fairseq-based machine translation solution. Audience Localization specialists with a technical background Global content managers Localization engineers Software developers in charge of implementing global content solutions Format of the course Part lecture, part discussion, heavy hands-on practice Note If you wish to use specific source and target language content, please contact us to arrange.
mlfinancepython Machine Learning for Finance (with Python) 21 hours Machine learning is a branch of Artificial Intelligence wherein computers have the ability to learn without being explicitly programmed. In this instructor-led, live training, participants will learn how to apply machine learning techniques and tools for solving real-world problems in the finance industry. Python will be used as the programming language. Participants first learn the key principles, then put their knowledge into practice by building their own machine learning models and using them to complete a number of team projects. By the end of this training, participants will be able to: Understand the fundamental concepts in machine learning Learn the applications and uses of machine learning in finance Develop their own algorithmic trading strategy using machine learning with Python Audience Developers Data scientists Format of the course Part lecture, part discussion, exercises and heavy hands-on practice
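
To give a flavour of the machine-learning-for-trading idea, here is a deliberately toy sketch: lagged returns are used as features to predict whether the next day's return is positive. The returns are synthetic and the model choice is an illustrative assumption, not a recommended strategy.

import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(1)
returns = rng.normal(0, 0.01, 500)                 # synthetic daily returns

# Features: the previous 3 daily returns; target: next-day direction.
X = np.column_stack([returns[i:i + 497] for i in range(3)])
y = (returns[3:] > 0).astype(int)

split = 400
clf = RandomForestClassifier(random_state=0).fit(X[:split], y[:split])
print("directional accuracy on held-out days:", clf.score(X[split:], y[split:]))
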

