Course Outline

Week 1 Big Data concepts

  • Definition of the four Vs (Velocity, Volume, Variety, Veracity)
  • Limits to traditional data processing capacity
  • Distributed Processing
  • Statistical Analysis
  • Machine Learning Analysis Types
  • Data Visualization
  • Distributed Processing (e.g. map-reduce)
  • Introduction to the languages used
  • R crash course
  • Python crash course
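
As an illustration of the "Distributed Processing (e.g. map-reduce)" topic above, here is a minimal single-machine sketch of the map-reduce pattern in plain Python. It is not taken from the course material: the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) and the sample text chunks are assumptions chosen for the example, and a real cluster framework would run each phase on separate workers.

```python
from collections import defaultdict


def map_phase(chunk):
    """Map step: emit a (word, 1) pair for every word in one text chunk."""
    return [(word.lower(), 1) for word in chunk.split()]


def shuffle(pairs):
    """Shuffle step: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Reduce step: collapse each key's list of values into one total."""
    return {key: sum(values) for key, values in groups.items()}


def word_count(chunks):
    """Run map, shuffle and reduce over a list of text chunks."""
    mapped = [pair for chunk in chunks for pair in map_phase(chunk)]
    return reduce_phase(shuffle(mapped))


# Hypothetical input: each string stands in for data held on a different node.
chunks = ["big data big ideas", "data beats ideas"]
counts = word_count(chunks)
print(counts)
```

Because each chunk is mapped independently, the map step can in principle be spread across machines; only the shuffle needs to bring values with the same key together.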

Weeks 2 & 3 Performing Data Analysis

  • Statistical Analysis
  • Descriptive Statistics in Big Data sets (e.g. calculating the mean)
  • Inferential Statistics (estimation)
  • Forecasting with Correlation and Regression models
  • Time Series analysis
  • Basics of Machine Learning
  • Supervised vs unsupervised learning
  • Classification and clustering
  • Estimating the cost of specific methods
  • Filter
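
To make the "Descriptive Statistics in Big Data sets (e.g. calculating mean)" topic above concrete, the following is a small sketch, under assumptions, of how a mean can be computed when the data is split across partitions. The helper names (`partial_stats`, `combine`, `distributed_mean`) and the sample partitions are hypothetical and not part of the course material. Each partition reports only its sum and count, so the full data set never has to sit in one place.

```python
from functools import reduce


def partial_stats(partition):
    """Summarise one partition as a (sum, count) pair."""
    return (sum(partition), len(partition))


def combine(a, b):
    """Merge two (sum, count) pairs into one."""
    return (a[0] + b[0], a[1] + b[1])


def distributed_mean(partitions):
    """Compute the overall mean from per-partition (sum, count) pairs."""
    total, count = reduce(combine, map(partial_stats, partitions), (0.0, 0))
    if count == 0:
        raise ValueError("cannot compute the mean of an empty data set")
    return total / count


# Hypothetical partitions of unequal size, e.g. held on three different nodes.
partitions = [[1.0, 2.0, 3.0], [4.0, 5.0], [6.0]]
mean = distributed_mean(partitions)
print(mean)
```

Carrying the count alongside the sum matters: simply averaging the three per-partition means would give a different, incorrect result here because the partitions differ in size.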

Week 4 Natural Language Processing

  • Processing text
  • Understanding the meaning of text
  • Automatic text generation
  • Sentiment/Topic Analysis
  • Computer Vision

Weeks 5 & 6 Tooling concepts

  • Data storage solutions (SQL, NoSQL, hierarchical, object-oriented, document-oriented)
  • Examples: MySQL, Cassandra, MongoDB, Elasticsearch, HDFS, etc.
  • Choosing the right solution for the problem
  • Distributed Processing
  • Spark
  • Machine Learning with Spark (MLlib)
  • Spark SQL
  • Scalability
  • Public cloud (AWS, Google, etc.)
  • Private cloud (OpenStack, Cloud Foundry)
  • Autoscaling

Week 7 Soft Skills

  • Advisory & Leadership Skills
  • Making an impact: data-driven storytelling
  • Understanding your audience
  • Effective data presentation - getting your message across
  • Influence effectiveness and change leadership
  • Handling difficult situations

Exam

  • End-of-programme graduation exam

Requirements

Participants should have a good grounding in maths, at least at high-school level.

Programming skills are not required, but any programming experience will be useful.

Participants will be assessed and interviewed prior to participation in this training programme.

 245 Hours
