Introduction to Graph Computing Training Course




28 hours (usually 4 days including breaks)


Requirements

  • An understanding of Java programming and frameworks
  • A general understanding of Python is helpful but not required
  • A general understanding of database concepts


Many real-world problems can be described in terms of graphs. Examples include the Web graph, social network graphs, train network graphs and language graphs. These graphs tend to be extremely large; processing them requires a specialized set of tools and a mindset referred to as graph computing.

In this instructor-led, live training, participants will learn about the various technology offerings and implementations for processing graph data. The aim is to identify real-world objects, their characteristics and relationships, then model these relationships and process them as data using graph computing approaches. We start with a broad overview and narrow in on specific tools as we step through a series of case studies, hands-on exercises and live deployments.

By the end of this training, participants will be able to:

  • Understand how graph data is persisted and traversed
  • Select the best framework for a given task (from graph databases to batch processing frameworks)
  • Use Hadoop, Spark, GraphX and Pregel to carry out graph computing across many machines in parallel
  • View real-world big data problems in terms of graphs, processes and traversals


Audience

  • Developers

Format of the course

  • Part lecture, part discussion, exercises and heavy hands-on practice

Course Outline

    Graph databases and libraries

Understanding graph data
    The graph as a data structure
    Using vertices (dots) and edges (lines) to model real-world scenarios
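
As a taste of this topic, here is a minimal sketch of a graph held as an adjacency list, using only the Python standard library. The station names and the `build_graph` helper are hypothetical and chosen purely for illustration; the course tools (graph databases, NetworkX) offer richer structures.

```python
from collections import defaultdict


def build_graph(edges):
    """Build an undirected adjacency-list graph from (u, v) pairs."""
    adjacency = defaultdict(set)
    for u, v in edges:
        adjacency[u].add(v)
        adjacency[v].add(u)
    return dict(adjacency)


# Hypothetical train network: stations are vertices, direct lines are edges.
train_edges = [
    ("London", "Reading"),
    ("Reading", "Bristol"),
    ("London", "Cambridge"),
]
train_graph = build_graph(train_edges)

# Stations directly reachable from Reading.
neighbours_of_reading = sorted(train_graph["Reading"])
```

Storing neighbours in sets keeps edge lookups cheap and silently ignores duplicate edges, which is usually what is wanted for a simple undirected model.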

Using Graph databases to model, persist and process graph data
    Local graph algorithms/traversals
    Neo4j, OrientDB and Titan

Exercise: Modeling Graph Data with Neo4j
    Whiteboard data modeling

Beyond Graph databases: Graph computing
    Understanding the property graph
    Graph modeling different scenarios (software graph, discussion graph, concept graph)
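
To make the property graph idea concrete, the sketch below models a tiny discussion graph in which both vertices and edges carry a label and key/value properties. The class names, method names and sample data are assumptions made for this illustration; they do not mirror the API of any particular graph database.

```python
from dataclasses import dataclass, field


@dataclass
class Vertex:
    vertex_id: str
    label: str
    properties: dict = field(default_factory=dict)


@dataclass
class Edge:
    source: str
    target: str
    label: str
    properties: dict = field(default_factory=dict)


class PropertyGraph:
    """Directed, labelled multigraph with properties on vertices and edges."""

    def __init__(self):
        self.vertices = {}
        self.edges = []

    def add_vertex(self, vertex_id, label, **properties):
        self.vertices[vertex_id] = Vertex(vertex_id, label, properties)

    def add_edge(self, source, target, label, **properties):
        self.edges.append(Edge(source, target, label, properties))

    def out_neighbours(self, vertex_id, label):
        """Follow outgoing edges with the given label from one vertex."""
        return [e.target for e in self.edges
                if e.source == vertex_id and e.label == label]


# Hypothetical discussion graph: users write posts, posts reply to posts.
discussion = PropertyGraph()
discussion.add_vertex("alice", "user", name="Alice")
discussion.add_vertex("post1", "post", title="Graph basics")
discussion.add_vertex("post2", "post", title="Re: Graph basics")
discussion.add_edge("alice", "post1", "wrote", year=2016)
discussion.add_edge("post2", "post1", "replies_to")
```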

Solving Real-World Problems with Traversals
    Algorithmic/directed walk over the graph
    Determining circular dependencies
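
One common way to detect circular dependencies is a depth-first traversal that tracks which nodes are on the current path. The sketch below assumes the graph is given as a dictionary mapping each module to the modules it depends on; the module names are hypothetical, and the recursive form is only suitable for small graphs.

```python
def find_cycle(dependencies):
    """Return one dependency cycle as a list of nodes, or None if acyclic."""
    WHITE, GREY, BLACK = 0, 1, 2   # unvisited, on current path, finished
    colour = {}
    path = []

    def visit(node):
        colour[node] = GREY
        path.append(node)
        for nxt in dependencies.get(node, ()):
            state = colour.get(nxt, WHITE)
            if state == GREY:
                # nxt is already on the current path: we closed a loop.
                return path[path.index(nxt):] + [nxt]
            if state == WHITE:
                cycle = visit(nxt)
                if cycle:
                    return cycle
        path.pop()
        colour[node] = BLACK
        return None

    for node in dependencies:
        if colour.get(node, WHITE) == WHITE:
            cycle = visit(node)
            if cycle:
                return cycle
    return None


# Hypothetical software graph: module -> modules it depends on.
software_graph = {
    "app": ["db", "auth"],
    "auth": ["crypto"],
    "crypto": ["app"],
    "db": [],
}
cycle = find_cycle(software_graph)
```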

Case Study: Ranking Discussion Contributors
    Ranking by number and depth of contributed discussions
    A note on sentiment and concept analysis
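
The outline does not define how number and depth are combined, so the following is only one possible reading: contributors are ranked first by how many posts they wrote, then by the deepest reply level they reached in a thread. The `posts` data and the tie-breaking rule are assumptions made for this sketch.

```python
from collections import defaultdict

# Hypothetical discussion: post id -> (author, parent post id or None).
posts = {
    "p1": ("alice", None),
    "p2": ("bob", "p1"),
    "p3": ("alice", "p2"),
    "p4": ("carol", "p1"),
    "p5": ("bob", "p3"),
}


def depth(posts, post_id):
    """Number of reply hops from a post up to the root of its thread."""
    hops = 0
    parent = posts[post_id][1]
    while parent is not None:
        hops += 1
        parent = posts[parent][1]
    return hops


def rank_contributors(posts):
    """Rank authors by post count, then by deepest reply (assumed rule)."""
    count = defaultdict(int)
    deepest = defaultdict(int)
    for post_id, (author, _parent) in posts.items():
        count[author] += 1
        deepest[author] = max(deepest[author], depth(posts, post_id))
    # Sort descending on both measures; the name breaks remaining ties.
    return sorted(count, key=lambda a: (-count[a], -deepest[a], a))


ranking = rank_contributors(posts)
```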

Graph Computing: Local, in-memory graph toolkits
    Graph analysis and visualization
    JUNG, NetworkX, and igraph

Exercise: Modeling Graph Data with NetworkX
    Using NetworkX to model a complex s

Graph Computing: Batch Processing Graph Frameworks
    Leveraging Hadoop for storage (HDFS) and processing (MapReduce)
    Overview of iterative algorithms
    Hama, Giraph, and GraphLab

Graph Computing: Graph-parallel Computation
    Unifying ETL, exploratory analysis, and iterative graph computation within a single system

Setup and Installation
    Hadoop and Spark

GraphX Operators
    Property, structural, join, neighborhood aggregation, caching and uncaching

Iterating with Pregel API
    Passing arguments for sending, receiving and computing
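
The vertex-centric model behind Pregel can be illustrated on a single machine. The sketch below is a simplified, plain-Python simulation rather than the GraphX API: the `pregel` function and its parameter names (`vprog`, `send_msg`, `merge_msg`) are assumptions that loosely mirror the roles of computing, sending and receiving. It is applied to single-source shortest paths on a hypothetical weighted graph.

```python
import math


def pregel(vertex_values, edges, initial_msg, vprog, send_msg, merge_msg,
           max_iterations=20):
    """Run supersteps until no messages are sent or the limit is reached."""
    # Superstep 0: every vertex receives the initial message.
    values = {v: vprog(v, val, initial_msg) for v, val in vertex_values.items()}
    active = set(values)
    for _ in range(max_iterations):
        inbox = {}
        for src, dst, attr in edges:
            # Only edges touching a vertex that just changed may send.
            if src not in active and dst not in active:
                continue
            for target, msg in send_msg(src, values[src], dst, values[dst], attr):
                # Combine messages addressed to the same vertex.
                inbox[target] = merge_msg(inbox[target], msg) if target in inbox else msg
        if not inbox:
            break
        for v, msg in inbox.items():
            values[v] = vprog(v, values[v], msg)
        active = set(inbox)
    return values


# Hypothetical weighted, directed graph: (source, destination, weight).
edges = [("a", "b", 1.0), ("b", "c", 2.0), ("a", "c", 5.0), ("c", "d", 1.0)]
initial = {v: (0.0 if v == "a" else math.inf) for v in "abcd"}


def vprog(vertex, dist, msg):
    """Compute: keep the shorter of the known and the offered distance."""
    return min(dist, msg)


def send_msg(src, src_dist, dst, dst_dist, weight):
    """Send: offer a shorter path to the destination if one exists."""
    if src_dist + weight < dst_dist:
        return [(dst, src_dist + weight)]
    return []


distances = pregel(initial, edges, math.inf, vprog, send_msg, min)
```

In a distributed engine the same three functions run in parallel across partitions; the loop structure is what carries over from this sketch.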

Building a Graph
    Using vertices and edges in an RDD or on disk

Designing Scalable Algorithms
    GraphX Optimization

Accessing Additional Algorithms
    PageRank, Connected Components, Triangle Counting
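
To show what one of these algorithms computes, here is a small single-machine sketch of triangle counting on an undirected graph. It is not the GraphX implementation; the function name and sample edges are assumptions for illustration.

```python
from itertools import combinations


def count_triangles(edges):
    """Count distinct triangles in an undirected graph given as (u, v) pairs."""
    adjacency = {}
    for u, v in edges:
        if u == v:
            continue  # self-loops cannot be part of a triangle
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)

    found = 0
    for node, neighbours in adjacency.items():
        # A triangle exists when two neighbours of node are also connected.
        for a, b in combinations(neighbours, 2):
            if b in adjacency[a]:
                found += 1
    # Each triangle was seen once from each of its three corners.
    return found // 3


sample_edges = [(1, 2), (2, 3), (1, 3), (3, 4)]
triangle_count = count_triangles(sample_edges)
```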

Exercise: PageRank and Top Users
    Building and processing graph data using text files as input
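
The shape of this exercise can be previewed without a cluster. The sketch below parses a whitespace-separated edge list (one "follower followed" pair per line, an assumed format) and runs a basic power-iteration PageRank whose scores sum to 1. The user names, damping factor and iteration count are assumptions, and the scores may differ from what GraphX reports.

```python
def parse_edges(text):
    """Parse 'source destination' pairs, skipping blanks and # comments."""
    edges = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        src, dst = line.split()[:2]
        edges.append((src, dst))
    return edges


def pagerank(edges, damping=0.85, iterations=50):
    """Power-iteration PageRank; dangling rank is spread over all nodes."""
    nodes = sorted({n for edge in edges for n in edge})
    out_links = {n: [] for n in nodes}
    for src, dst in edges:
        out_links[src].append(dst)
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        dangling = sum(rank[node] for node in nodes if not out_links[node])
        new_rank = {node: (1 - damping) / n + damping * dangling / n
                    for node in nodes}
        for node in nodes:
            if out_links[node]:
                share = damping * rank[node] / len(out_links[node])
                for dst in out_links[node]:
                    new_rank[dst] += share
        rank = new_rank
    return rank


# Hypothetical input, standing in for the contents of a text file.
followers_text = """
# follower followed
bob alice
carol alice
alice bob
dave carol
"""

ranks = pagerank(parse_edges(followers_text))
top_users = sorted(ranks.items(), key=lambda kv: -kv[1])[:2]
```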

Deploying to Production

Closing Remarks

