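Course Outline
- Introduction
  - Hadoop history, concepts
  - Ecosystem
  - Distributions
  - High-level architecture
  - Hadoop myths
  - Hadoop challenges (hardware / software)
  - Lab: discuss your Big Data projects and problems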
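- HDFS operations
  - Concepts (horizontal scaling, replication, data locality, rack awareness)
  - Nodes and daemons (NameNode, Secondary NameNode, HA Standby NameNode, DataNode)
  - Health monitoring
  - Command-line and browser-based administration
  - Adding storage, replacing defective drives
  - Lab: getting familiar with the HDFS command line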
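- MapReduce operations and administration
  - Parallel computing before MapReduce: comparing HPC and Hadoop administration
  - MapReduce cluster loads
  - Nodes and daemons (JobTracker, TaskTracker)
  - MapReduce UI walk-through
  - MapReduce configuration
  - Job configuration
  - Optimizing MapReduce
  - Fool-proofing MR: what to tell your programmers
  - Lab: running MapReduce examples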
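- Advanced topics
  - Hardware monitoring
  - Cluster monitoring
  - Adding and removing servers, upgrading Hadoop
  - Backup, recovery and business continuity planning
  - Oozie job workflows
  - Hadoop high availability (HA)
  - Hadoop Federation
  - Securing your cluster with Kerberos
  - Lab: set up monitoring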
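Requirements
- Familiarity with basic Linux system administration
- Basic scripting skills
Knowledge of Hadoop and distributed computing is not required; both will be introduced and explained in the course.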
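Lab environment
Zero install: there is no need to install Hadoop software on students' machines. A working Hadoop cluster will be provided for students.
Students will need the following
- An SSH client (Linux and Mac already have ssh clients; for Windows, PuTTY is recommended)
- A browser to access the cluster. We recommend Firefox with the FoxyProxy extension installed
Customer reviews (5)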
Trainer's preparation & organization, and quality of materials provided on github.
Mateusz Rek - MicroStrategy Poland Sp. z o.o.
Course - Impala for Business Intelligence
The VM I liked very much. The teacher was very knowledgeable regarding the topic as well as other topics, he was very nice and friendly. I liked the facility in Dubai.
Safar Alqahtani - Elm Information Security
Course - Big Data Analytics in Health
I thought he did a great job of tailoring the experience to the audience. This class is mostly designed to cover data analysis with HIVE, but me and my co-worker are doing HIVE administration with no real data analytics responsibilities.
Ian Reif - Franchise Tax Board
Course - Data Analysis with Hive/HiveQL
I genuinely enjoyed the many hands-on sessions.
Jacek Pieczątka
Course - Administrator Training for Apache Hadoop
The fact that all the data and software was ready to use on an already prepared VM, provided by the trainer in external disks.