What you’ll learn
- Design distributed systems that manage “big data” using Hadoop and related technologies.
- Use HDFS and MapReduce for storing and analyzing data at scale.
- Use Pig and Spark to create scripts that process data on a Hadoop cluster in more complex ways.
- Analyze relational data using Hive and MySQL.
- Analyze non-relational data using HBase, Cassandra, and MongoDB.
- Query data interactively with Drill, Phoenix, and Presto.
- Choose an appropriate data storage technology for your application.
- Understand how Hadoop clusters are managed by YARN, Tez, Mesos, Zookeeper, Zeppelin, Hue, and Oozie.
- Publish data to your Hadoop cluster using Kafka, Sqoop, and Flume.
- Consume streaming data using Spark Streaming, Flink, and Storm.
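To give a flavor of the MapReduce model the course builds on, here is a toy word count sketched in plain Python, with no Hadoop cluster required. It mimics the three MapReduce phases: a map step emits (key, value) pairs, a shuffle groups values by key, and a reduce step aggregates each group. The function names and sample data are illustrative, not taken from the course:

```python
from collections import defaultdict

def map_phase(lines):
    # Map: emit a (word, 1) pair for every word in every line.
    for line in lines:
        for word in line.split():
            yield (word.lower(), 1)

def shuffle(pairs):
    # Shuffle: group all emitted values by their key.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    # Reduce: aggregate each key's values into a final count.
    return {key: sum(values) for key, values in groups.items()}

lines = ["big data big cluster", "data at scale"]
counts = reduce_phase(shuffle(map_phase(lines)))
print(counts)  # {'big': 2, 'data': 2, 'cluster': 1, 'at': 1, 'scale': 1}
```

On a real Hadoop cluster the same logic runs in parallel across many machines, with HDFS supplying the input splits and the framework handling the shuffle between nodes.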
Who this course is for:
- Software engineers and programmers who want to understand the larger Hadoop ecosystem, and use it to store, analyze, and vend “big data” at scale.
- Project, program, or product managers who want to understand the lingo and high-level architecture of Hadoop.
- Data analysts and database administrators who are curious about Hadoop and how it relates to their work.
- System architects who need to understand the components available in the Hadoop ecosystem, and how they fit together.