Data Mining Training Courses

Data Mining refers to the process of automatically searching large data sets to discover patterns and useful information.
NobleProg live Data Mining training courses demonstrate through hands-on practice the fundamentals of Data Mining, the fields its methods draw on (artificial intelligence, machine learning, statistics, and database systems), and its uses and applications.
Data Mining training is available in various formats, including onsite live training and live instructor-led training over an interactive remote desktop. Local Data Mining training can be carried out live on customer premises or in NobleProg local training centers.
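The pattern-discovery idea described above can be sketched in a few lines of Python. This is an illustrative toy only, not material from any of the courses below: a single-pass frequent-pair count over invented transaction data, in the spirit of market-basket analysis.

```python
from collections import Counter
from itertools import combinations

# Invented toy transaction data for illustration.
transactions = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"milk", "butter"},
    {"bread", "milk", "butter"},
]

def frequent_pairs(transactions, min_support=0.4):
    """Return item pairs whose support (fraction of transactions
    containing both items) is at least min_support."""
    counts = Counter()
    for t in transactions:
        for pair in combinations(sorted(t), 2):
            counts[pair] += 1
    n = len(transactions)
    return {pair: c / n for pair, c in counts.items() if c / n >= min_support}

print(frequent_pairs(transactions))
```

Real data mining work uses far richer algorithms and tooling, but the core loop is the same: scan the data, count candidate patterns, and keep those that clear a significance threshold.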
Data Mining Course Outlines
Code | Name | Duration | Overview |
---|---|---|---|
datama | Data Mining and Analysis | 28 hours | Objective: Delegates will be able to analyse big data sets, extract patterns, and choose the right variables impacting the results, so that a new model with predictive results can be forecast. |
matlabfundamentalsfinance | MATLAB Fundamentals + MATLAB for Finance | 35 hours | This course provides a comprehensive introduction to the MATLAB technical computing environment plus an introduction to using MATLAB for financial applications. The course is intended for beginning users and those looking for a review. No prior programming experience or knowledge of MATLAB is assumed. Themes of data analysis, visualization, modeling, and programming are explored throughout the course. Topics include: working with the MATLAB user interface; entering commands and creating variables; analyzing vectors and matrices; visualizing vector and matrix data; working with data files; working with data types; automating commands with scripts; writing programs with logic and flow control; writing functions; and using the Financial Toolbox for quantitative analysis. |
rintrob | Introductory R for Biologists | 28 hours | R is a free, open-source programming language for statistical computing, data analysis, and graphics. R is used by a growing number of managers and data analysts inside corporations and academia. R has also found followers among statisticians, engineers, and scientists without computer programming skills who find it easy to use. Its popularity is due to the increasing use of data mining for various goals, such as setting ad prices, finding new drugs more quickly, or fine-tuning financial models. R has a wide variety of packages for data mining. |
matlabdsandreporting | MATLAB Fundamentals, Data Science & Report Generation | 126 hours | In the first part of this training, we cover the fundamentals of MATLAB and its function as both a language and a platform. Included in this discussion is an introduction to MATLAB syntax, arrays and matrices, data visualization, script development, and object-oriented principles. In the second part, we demonstrate how to use MATLAB for data mining, machine learning, and predictive analytics. To provide participants with a clear and practical perspective of MATLAB's approach and power, we draw comparisons between using MATLAB and using other tools such as spreadsheets, C, C++, and Visual Basic. In the third part of the training, participants learn how to streamline their work by automating their data processing and report generation. Throughout the course, participants will put the ideas learned into practice through hands-on exercises in a lab environment. By the end of the training, participants will have a thorough grasp of MATLAB's capabilities and will be able to employ it for solving real-world data science problems as well as for streamlining their work through automation. Assessments will be conducted throughout the course to gauge progress. Format of the course: theoretical and practical exercises, including case discussions, sample code inspection, and hands-on implementation. Note: practice sessions will be based on pre-arranged sample data report templates. If you have specific requirements, please contact us to arrange them. |
bigddbsysfun | Big Data & Database Systems Fundamentals | 14 hours | The course is part of the Data Scientist skill set (Domain: Data and Technology). |
TalendDI | Talend Open Studio for Data Integration | 28 hours | Talend Open Studio for Data Integration is an open-source data integration product used to combine, convert, and update data in various locations across a business. In this instructor-led, live training, participants will learn how to use the Talend ETL tool to carry out data transformation, data extraction, and connectivity with Hadoop, Hive, and Pig. By the end of this training, participants will be able to: explain the concepts behind ETL (Extract, Transform, Load) and propagation; define ETL methods and ETL tools to connect with Hadoop; efficiently amass, retrieve, digest, consume, transform, and shape big data in accordance with business requirements; and upload to and extract large records from Hadoop, Hive, and NoSQL databases. Audience: business intelligence professionals; project managers; database professionals; SQL developers; ETL developers; solution architects; data architects; data warehousing professionals; system administrators and integrators. Format of the course: part lecture, part discussion, exercises, and heavy hands-on practice. |
mdlmrah | Model MapReduce and Apache Hadoop | 14 hours | The course is intended for IT specialists who work with the distributed processing of large data sets across clusters of computers. |
rprogda | R Programming for Data Analysis | 14 hours | This course is part of the Data Scientist skill set (Domain: Data and Technology) |
PentahoDI | Pentaho Data Integration Fundamentals | 21 hours | Pentaho Data Integration is an open-source data integration tool for defining jobs and data transformations. In this instructor-led, live training, participants will learn how to use Pentaho Data Integration's powerful ETL capabilities and rich GUI to manage an entire big data lifecycle, maximizing the value of data to the organization. By the end of this training, participants will be able to: create, preview, and run basic data transformations containing steps and hops; configure and secure the Pentaho Enterprise Repository; harness disparate sources of data and generate a single, unified version of the truth in an analytics-ready format; and provide results to third-party applications for further processing. Audience: data analysts; ETL developers. Format of the course: part lecture, part discussion, exercises, and heavy hands-on practice. |
sspsspas | Statistics with SPSS Predictive Analytics Software | 14 hours | Goal: learning to work with SPSS independently. Audience: analysts, researchers, scientists, students, and all those who want to acquire the ability to use the SPSS package and learn popular data mining techniques. |
dmmlr | Data Mining & Machine Learning with R | 14 hours | R is an open-source free programming language for statistical computing, data analysis, and graphics. R is used by a growing number of managers and data analysts inside corporations and academia. R has a wide variety of packages for data mining. |
datavault | Data Vault: Building a Scalable Data Warehouse | 28 hours | Data vault modeling is a database modeling technique that provides long-term historical storage of data that originates from multiple sources. A data vault stores a single version of the facts, or "all the data, all of the time". Its flexible, scalable, consistent, and adaptable design encompasses the best aspects of third normal form (3NF) and the star schema. In this instructor-led, live training, participants will learn how to build a data vault. By the end of this training, participants will be able to: understand the architecture and design concepts behind Data Vault 2.0 and its interaction with Big Data, NoSQL, and AI; use data vaulting techniques to enable auditing, tracing, and inspection of historical data in a data warehouse; develop a consistent and repeatable ETL (Extract, Transform, Load) process; and build and deploy highly scalable and repeatable warehouses. Audience: data modelers; data warehousing specialists; business intelligence specialists; data engineers; database administrators. Format of the course: part lecture, part discussion, exercises, and heavy hands-on practice. |
bdbiga | Big Data Business Intelligence for Govt. Agencies | 35 hours | Advances in technologies and the increasing amount of information are transforming how business is conducted in many industries, including government. Government data generation and digital archiving rates are on the rise due to the rapid growth of mobile devices and applications, smart sensors and devices, cloud computing solutions, and citizen-facing portals. As digital information expands and becomes more complex, information management, processing, storage, security, and disposition become more complex as well. New capture, search, discovery, and analysis tools are helping organizations gain insights from their unstructured data. The government market is at a tipping point, realizing that information is a strategic asset, and government needs to protect, leverage, and analyze both structured and unstructured information to better serve and meet mission requirements. As government leaders strive to evolve data-driven organizations to successfully accomplish their missions, they are laying the groundwork to correlate dependencies across events, people, processes, and information. High-value government solutions will be created from a mashup of the most disruptive technologies: mobile devices and applications, cloud services, social business technologies and networking, and Big Data and analytics. IDC predicts that by 2020, the IT industry will reach $5 trillion, approximately $1.7 trillion larger than today, and that 80% of the industry's growth will be driven by these 3rd Platform technologies. In the long term, these technologies will be key tools for dealing with the complexity of increased digital information. Big Data is one of the intelligent industry solutions and allows government to make better decisions by taking action based on patterns revealed by analyzing large volumes of data — related and unrelated, structured and unstructured.
But accomplishing these feats takes far more than simply accumulating massive quantities of data. “Making sense of these volumes of Big Data requires cutting-edge tools and technologies that can analyze and extract useful knowledge from vast and diverse streams of information,” Tom Kalil and Fen Zhao of the White House Office of Science and Technology Policy wrote in a post on the OSTP Blog. The White House took a step toward helping agencies find these technologies when it established the National Big Data Research and Development Initiative in 2012. The initiative included more than $200 million to make the most of the explosion of Big Data and the tools needed to analyze it. The challenges that Big Data poses are nearly as daunting as its promise is encouraging. Storing data efficiently is one of these challenges. As always, budgets are tight, so agencies must minimize the per-megabyte price of storage and keep the data within easy access so that users can get it when they want it and how they need it. Backing up massive quantities of data heightens the challenge. Analyzing the data effectively is another major challenge. Many agencies employ commercial tools that enable them to sift through the mountains of data, spotting trends that can help them operate more efficiently. (A recent study by MeriTalk found that federal IT executives think Big Data could help agencies save more than $500 billion while also fulfilling mission objectives.) Custom-developed Big Data tools are also allowing agencies to address the need to analyze their data. For example, the Oak Ridge National Laboratory’s Computational Data Analytics Group has made its Piranha data analytics system available to other agencies. The system has helped medical researchers find a link that can alert doctors to aortic aneurysms before they strike. It’s also used for more mundane tasks, such as sifting through résumés to connect job candidates with hiring managers. |
datavis1 | Data Visualization | 28 hours | This course is intended for engineers and decision makers working in data mining and knowledge discovery. You will learn how to create effective plots and ways to present and represent your data in a way that will appeal to decision makers and help them understand hidden information. |
zendfundamentals | Zend Framework: Fundamentals | 21 hours | Zend Framework is an open-source, object-oriented framework for developing, deploying, and managing enterprise-ready, PHP-based web applications and services. Zend Framework utilizes the Model-View-Controller (MVC) paradigm to develop basic structures for applications. Zend is considered a "component library"; its unique modular design enables users to use components independently of one another. In this instructor-led, live training, participants will learn how to create a reliable and scalable web application using Zend Framework. By the end of this training, participants will be able to: use Model-View-Controller design patterns to build a database-backed web application; receive and process forms; set up input validation and view scripts; handle the various types of MVC events and services offered by the Zend Framework MVC component library; and prepare and execute queries for a database adapter. Audience: intermediate to advanced PHP developers seeking to develop secure, enterprise-scale web applications. Format of the course: part lecture, part discussion, exercises, and heavy hands-on practice. |
datamin | Data Mining | 21 hours | The course can be delivered with any tools, including free open-source data mining software and applications. |
dsbda | Data Science for Big Data Analytics | 35 hours | Big data refers to data sets that are so voluminous and complex that traditional data processing application software is inadequate to deal with them. Big data challenges include capturing data, data storage, data analysis, search, sharing, transfer, visualization, querying, updating, and information privacy. |
monetdb | MonetDB | 28 hours | MonetDB is an open-source database that pioneered the column-store technology approach. In this instructor-led, live training, participants will learn how to use MonetDB and how to get the most value out of it. By the end of this training, participants will be able to: understand MonetDB and its features; install and get started with MonetDB; explore and perform different functions and tasks in MonetDB; and accelerate the delivery of their projects by maximizing MonetDB's capabilities. Audience: developers; technical experts. Format of the course: part lecture, part discussion, exercises, and heavy hands-on practice. |
d2dbdpa | From Data to Decision with Big Data and Predictive Analytics | 21 hours | Audience: If you are trying to make sense of the data you have access to, or want to analyse unstructured data available on the net (such as Twitter, LinkedIn, etc.), this course is for you. It is mostly aimed at decision makers and people who need to choose which data is worth collecting and which is worth analyzing. It is not aimed at people configuring the solution, though those people will benefit from the big picture. Delivery mode: During the course, delegates will be presented with working examples of mostly open-source technologies. Short lectures will be followed by presentations and simple exercises by the participants. Content and software used: All software used is updated each time the course is run, so we use the newest versions available. The course covers the process from obtaining, formatting, processing, and analysing the data, through to automating the decision-making process with machine learning. |
neo4j | Beyond the relational database: neo4j | 21 hours | Relational, table-based databases such as Oracle and MySQL have long been the standard for organizing and storing data. However, the growing size and fluidity of data have made it difficult for these traditional systems to efficiently execute highly complex queries on the data. Imagine replacing rows-and-columns-based data storage with object-based data storage, whereby entities (e.g., a person) could be stored as data nodes, then easily queried on the basis of their vast, multi-linear relationships with other nodes. And imagine querying these connections and their associated objects and properties using a compact syntax, up to 20 times lighter than SQL. This is what graph databases such as neo4j offer. In this hands-on course, we will set up a live project and put into practice the skills to model, manage, and access your data. We contrast and compare graph databases with SQL-based databases as well as other NoSQL databases, and clarify when and where it makes sense to implement each within your infrastructure. Audience: database administrators (DBAs); data analysts; developers; system administrators; DevOps engineers; business analysts; CTOs; CIOs. Format of the course: heavy emphasis on hands-on practice; most of the concepts are learned through samples, exercises, and hands-on development. |
foundr | Foundation R | 7 hours | The objective of the course is to enable participants to gain a mastery of the fundamentals of R and how to work with data. |
pmml | Predictive Models with PMML | 7 hours | The course is designed for scientists, developers, analysts, and anyone else who wants to standardize or exchange their models using the Predictive Model Markup Language (PMML) file format. |
processmining | Process Mining | 21 hours | Process mining, or Automated Business Process Discovery (ABPD), is a technique that applies algorithms to event logs for the purpose of analyzing business processes. Process mining goes beyond data storage and data analysis; it bridges data with processes and provides insights into the trends and patterns that affect process efficiency. Format of the course: The course starts with an overview of the most commonly used techniques for process mining. We discuss the various process discovery algorithms and tools used for discovering and modeling processes based on raw event data. Real-life case studies are examined and data sets are analyzed using the ProM open-source framework. Audience: data science professionals; anyone interested in understanding and applying process modeling and data mining. |
dataminr | Data Mining with R | 14 hours | R is an open-source free programming language for statistical computing, data analysis, and graphics. R is used by a growing number of managers and data analysts inside corporations and academia. R has a wide variety of packages for data mining. |
kdd | Knowledge Discovery in Databases (KDD) | 21 hours | Knowledge discovery in databases (KDD) is the process of discovering useful knowledge from a collection of data. Real-life applications of this data mining technique include marketing, fraud detection, telecommunications, and manufacturing. In this course, we introduce the processes involved in KDD and carry out a series of exercises to practice the implementation of those processes. Audience: data analysts or anyone interested in learning how to interpret data to solve problems. Format of the course: After a theoretical discussion of KDD, the instructor will present real-life cases which call for the application of KDD to solve a problem. Participants will prepare, select, and cleanse sample data sets and use their prior knowledge about the data to propose solutions based on the results of their observations. |
68780 | Apache Spark | 14 hours | |
druid | Druid: Build a fast, real-time data analysis system | 21 hours | Druid is an open-source, column-oriented, distributed data store written in Java. It was designed to quickly ingest massive quantities of event data and execute low-latency OLAP queries on that data. Druid is commonly used in business intelligence applications to analyze high volumes of real-time and historical data. It is also well suited for powering fast, interactive, analytic dashboards for end-users. Druid is used by companies such as Alibaba, Airbnb, Cisco, eBay, Netflix, PayPal, and Yahoo. In this course, we explore some of the limitations of data warehouse solutions and discuss how Druid can complement those technologies to form a flexible and scalable streaming analytics stack. We walk through many examples, offering participants the chance to implement and test Druid-based solutions in a lab environment. Audience: application developers; software engineers; technical consultants; DevOps professionals; architecture engineers. Format of the course: part lecture, part discussion, heavy hands-on practice, and occasional tests to gauge understanding. |
psr | Introduction to Recommendation Systems | 7 hours | Audience: marketing department employees, IT strategists, and other people involved in decisions related to the design and implementation of recommender systems. Format: a short theoretical background, followed by analysis of working examples and short, simple exercises. |
BigData_ | A Practical Introduction to Data Analysis and Big Data | 35 hours | Participants who complete this training will gain a practical, real-world understanding of Big Data and its related technologies, methodologies, and tools. Participants will have the opportunity to put this knowledge into practice through hands-on exercises. Group interaction and instructor feedback make up an important component of the class. The course starts with an introduction to the elementary concepts of Big Data, then progresses into the programming languages and methodologies used to perform data analysis. Finally, we discuss the tools and infrastructure that enable Big Data storage, distributed processing, and scalability. Audience: developers/programmers; IT consultants. Format of the course: part lecture, part discussion, hands-on practice and implementation, with occasional quizzing to measure progress. |
datashrinkgov | Data Shrinkage for Government | 14 hours | The objective of the course is to enable participants to gain a mastery of the fundamentals of data shrinkage for government. |
DatSci7 | Data Science Programme | 245 hours | The explosion of information and data in today’s world is unparalleled; our ability to innovate and push the boundaries of the possible is growing faster than it ever has. The role of data scientist is one of the most in-demand roles across industry today. We offer much more than learning through theory; we deliver practical, marketable skills that bridge the gap between the world of academia and the demands of industry. This 7-week curriculum can be tailored to your specific industry requirements; please contact us for further information or visit the NobleProg Institute website, www.inobleprog.co.uk. Audience: This programme is aimed at postgraduates, as well as anyone with the required prerequisite skills, which will be determined by an assessment and interview. Delivery: Delivery of the course will be a mixture of instructor-led classroom and instructor-led online training; typically the first week will be 'classroom led', weeks 2-6 'virtual classroom', and week 7 back to 'classroom led'. |
matlab2 | MATLAB Fundamentals | 21 hours | This three-day course provides a comprehensive introduction to the MATLAB technical computing environment. The course is intended for beginning users and those looking for a review. No prior programming experience or knowledge of MATLAB is assumed. Themes of data analysis, visualization, modeling, and programming are explored throughout the course. Topics include: working with the MATLAB user interface; entering commands and creating variables; analyzing vectors and matrices; visualizing vector and matrix data; working with data files; working with data types; automating commands with scripts; writing programs with logic and flow control; and writing functions. |
ApHadm1 | Apache Hadoop: Manipulation and Transformation of Data Performance | 21 hours | This course is intended for developers, architects, data scientists, or any profile that requires access to data either intensively or on a regular basis. The major focus of the course is data manipulation and transformation. Among the tools in the Hadoop ecosystem, this course includes the use of Pig and Hive, both of which are heavily used for data transformation and manipulation. This training also addresses performance metrics and performance optimisation. The course is entirely hands-on and is punctuated by presentations of the theoretical aspects. |
osqlide | Oracle SQL Intermediate - Data Extraction | 14 hours | The objective of the course is to enable participants to gain a mastery of how to work with the SQL language in an Oracle database for data extraction at an intermediate level. |
danagr | Data and Analytics - from the ground up | 42 hours | Data analytics is a crucial tool in business today. We will focus throughout on developing skills for practical, hands-on data analysis. The aim is to help delegates give evidence-based answers to three questions. What has happened? (processing and analyzing data; producing informative data visualizations). What will happen? (forecasting future performance; evaluating forecasts). What should happen? (turning data into evidence-based business decisions; optimizing processes). The course itself can be delivered either as a six-day classroom course or remotely over a period of weeks if preferred. We can work with you to deliver the course to best suit your needs. |
datapro | Data Protection | 35 hours | This is an instructor-led course, and is the non-certification version of the "CDP - Certificate in Data Protection" course. Those experienced in data protection issues, as well as those new to the subject, need to be trained so that their organisations are confident that legal compliance is continually addressed. It is necessary to identify issues requiring expert data protection advice in good time, in order that organisational reputation and credibility are enhanced through relevant data protection policies and procedures. Objectives: The aim of the syllabus is to promote an understanding of how the data protection principles work, rather than simply focusing on the mechanics of regulation. The syllabus places the Act in the context of human rights and promotes good practice within organisations. On completion you will have: an appreciation of the broader context of the Act; an understanding of the way in which the Act and the Privacy and Electronic Communications (EC Directive) Regulations 2003 work; a broad understanding of the way associated legislation relates to the Act; and an understanding of what has to be done to achieve compliance. Course synopsis: The syllabus comprises three main parts, each with sub-sections. Context - this will address the origins of and reasons for the Act, together with consideration of privacy in general. Law - Data Protection Act - this will address the main concepts and elements of the Act and subordinate legislation. Application - this will consider how compliance is achieved and how the Act works in practice. |
matfin | MATLAB for Financial Applications | 21 hours | MATLAB is a numerical computing environment and programming language developed by MathWorks. |
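The node-and-relationship model described in the neo4j outline above can be sketched in plain Python. This is a toy adjacency-list illustration with invented names (`Node`, `connect`, `related`); it is not neo4j's actual API or query language.

```python
# Toy sketch of graph-style storage: entities become nodes, and queries
# follow typed relationships between them. Invented for illustration only.
class Node:
    def __init__(self, label, **props):
        self.label = label
        self.props = props
        self.edges = []  # outgoing (relationship_type, target_node) pairs

    def connect(self, rel_type, target):
        """Add a typed, directed relationship to another node."""
        self.edges.append((rel_type, target))

    def related(self, rel_type):
        """Return nodes reachable via one hop of the given relationship type."""
        return [target for rel, target in self.edges if rel == rel_type]

alice = Node("Person", name="Alice")
acme = Node("Company", name="ACME")
alice.connect("WORKS_AT", acme)

print([n.props["name"] for n in alice.related("WORKS_AT")])
```

In a real graph database, the same one-hop query would be expressed declaratively (in neo4j's case, in the Cypher query language) rather than by walking edge lists by hand, and the engine would index relationships so multi-hop traversals stay fast.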
Upcoming Courses
Course | Course Date | Course Price [Remote / Classroom] |
---|---|---|
Data Mining - Edinburgh | Wed, 2018-05-09 09:30 | £3900 / £5400 |
Beyond the relational database: neo4j - Leeds | Wed, 2018-05-09 09:30 | £3300 / £3900 |
Model MapReduce and Apache Hadoop - Swindon | Thu, 2018-05-10 09:30 | £2200 / £2550 |
Introduction to Recommendation Systems - Belfast City Centre | Fri, 2018-05-11 09:30 | £1100 / £1350 |
Data Vault: Building a Scalable Data Warehouse - Oxford | Tue, 2018-05-29 09:30 | £4400 / £5500 |