A practical introduction to Data Analysis and Big Data Training Course

Course Code

BigData_

Duration

35 hours (usually 5 days including breaks)

Requirements

  • A general understanding of math
  • A general understanding of programming
  • A general understanding of databases

Overview

Participants who complete this training will gain a practical, real-world understanding of Big Data and its related technologies, methodologies and tools.

Participants will have the opportunity to put this knowledge into practice through hands-on exercises. Group interaction and instructor feedback make up an important component of the class.

The course starts with an introduction to the fundamental concepts of Big Data, then progresses to the programming languages and methodologies used to perform Data Analysis. Finally, we discuss the tools and infrastructure that enable Big Data storage, Distributed Processing, and Scalability.

Audience

  • Developers / programmers
  • IT consultants

Format of the course

  • Part lecture, part discussion, with hands-on practice and implementation and occasional quizzes to measure progress.

Course Outline

Introduction to Data Analysis and Big Data

  • What makes Big Data "big"?
    • Velocity, Volume, Variety, Veracity (VVVV)
  • Limits to traditional Data Processing
  • Distributed Processing
  • Statistical Analysis
  • Types of Machine Learning Analysis
  • Data Visualization

Languages used for Data Analysis

  • R language
    • Why R for Data Analysis?
    • Data manipulation, calculation and graphical display
  • Python
    • Why Python for Data Analysis?
    • Manipulating, processing, cleaning, and crunching data

Approaches to Data Analysis

  • Statistical Analysis
    • Time Series analysis
    • Forecasting with Correlation and Regression models
    • Inferential Statistics (estimating)
    • Descriptive Statistics in Big Data sets (e.g. calculating mean)
  • Machine Learning
    • Supervised vs unsupervised learning
    • Classification and clustering
    • Estimating cost of specific methods
    • Filtering
  • Natural Language Processing
    • Processing text
    • Understanding the meaning of text
    • Automatic text generation
    • Sentiment analysis / Topic analysis
  • Computer Vision
    • Acquiring, processing, analyzing, and understanding images
    • Reconstructing, interpreting and understanding 3D scenes
    • Using image data to make decisions
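
As an illustration of the "Descriptive Statistics in Big Data sets (e.g. calculating mean)" item above, here is a minimal sketch of how a mean and variance can be computed in a single pass, without loading the whole data set into memory. It uses Welford's online method, which is one common choice and not necessarily the technique taught in the course; the function name `running_mean_variance` and the sample data are hypothetical.

```python
from typing import Iterable, Tuple


def running_mean_variance(values: Iterable[float]) -> Tuple[int, float, float]:
    """Return (count, mean, sample variance) in one pass (Welford's method)."""
    count = 0
    mean = 0.0
    m2 = 0.0  # running sum of squared deviations from the current mean
    for x in values:
        count += 1
        delta = x - mean
        mean += delta / count
        m2 += delta * (x - mean)
    # Sample variance is undefined for fewer than two values; report 0.0 here.
    variance = m2 / (count - 1) if count > 1 else 0.0
    return count, mean, variance


if __name__ == "__main__":
    # A generator stands in for a data source too large to hold in memory.
    readings = (float(i) for i in range(1, 1_000_001))
    n, mean, var = running_mean_variance(readings)
    print(n, mean, var)
```

Because each value is seen once and only three numbers are kept, the same idea carries over to data that is streamed from disk or split across machines.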

Big Data infrastructure

  • Data Storage
    • Relational databases (SQL)
      • MySQL
      • Postgres
      • Oracle
    • Non-relational databases (NoSQL)
      • Cassandra
      • MongoDB
      • Neo4j
    • Understanding the nuances
      • Hierarchical databases
      • Object-oriented databases
      • Document-oriented databases
      • Graph-oriented databases
      • Other
  • Distributed Processing
    • Hadoop
      • HDFS as a distributed filesystem
      • MapReduce for distributed processing
    • Spark
      • All-in-one in-memory cluster computing framework for large-scale data processing
      • Structured streaming
      • Spark SQL
      • Machine Learning libraries: MLlib
      • Graph processing with GraphX
  • Scalability
    • Public cloud
      • AWS, Google, Aliyun, etc.
    • Private cloud
      • OpenStack, Cloud Foundry, etc.
    • Auto-scalability
  • Choosing the right solution for the problem
  • The future of Big Data
  • Closing remarks
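
To make the "MapReduce for distributed processing" item in the outline above more concrete, the following is a small single-machine sketch of the map, shuffle and reduce phases applied to a word count. It is plain Python and does not use the Hadoop API; the function names (`map_phase`, `shuffle_phase`, `reduce_phase`, `word_count`) are hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(line: str) -> Iterator[Tuple[str, int]]:
    """Emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def shuffle_phase(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Group all emitted values by key, as the framework would between phases."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key: str, values: List[int]) -> Tuple[str, int]:
    """Combine the values for one key into a single total."""
    return key, sum(values)


def word_count(lines: Iterable[str]) -> Dict[str, int]:
    """Run map, shuffle and reduce in sequence on a single machine."""
    pairs = (pair for line in lines for pair in map_phase(line))
    grouped = shuffle_phase(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    print(word_count(["big data is big", "data is everywhere"]))
```

In a real cluster the map and reduce calls would run in parallel on different nodes and the shuffle would move data across the network; the sketch only shows how the three phases fit together.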

Testimonials

Willingness to share more

Balaram Chandra Paul - MOL Information Technology Asia Limited

It covered a broad range of information.

Continental AG / Abteilung: CF IT Finance

presentation of technologies

Continental AG / Abteilung: CF IT Finance

Overall the Content was good.

Sameer Rohadia - Continental AG / Abteilung: CF IT Finance

Bookings, Prices and Enquiries

Guaranteed to run even with a single delegate!
Private Classroom
Participants are from one organisation only. No external participants are allowed. Usually customised to a specific group, course topics are agreed between the client and the trainer.
Private Remote
From £5500
The instructor and the participants are in two different physical locations and communicate via the Internet.

The more delegates, the greater the savings per delegate. The table shows the price per delegate and is for illustration purposes only; actual prices may differ.

Number of Delegates   Private Remote
1                     £5500
2                     £3875
3                     £3333
4                     £3063
Public Classroom
From £6250
Participants from multiple organisations. Topics usually cannot be customised.


Number of Delegates   Public Classroom
1                     £6250
2                     £4275
3                     £3617
4                     £3288

Upcoming Courses

Venue                                     Course Date             Course Price [Remote / Classroom]
Belfast City Centre                       Mon, 2018-02-05 09:30   £5500 / £6500
Edinburgh Training and Conference Venue   Mon, 2018-02-05 09:30   £5500 / £5875
Newcastle                                 Tue, 2018-02-06 09:30   £5500 / £6300
Liverpool                                 Tue, 2018-02-06 09:30   £5500 / £6900
York - Priory Street Centre               Mon, 2018-02-12 09:30   £5500 / £6100
