Scala and Spark Training: Objective
Objective:
By the end of this training, participants will:
Have a thorough working knowledge of Scala
Understand Spark and its core abstractions
Be able to write data-processing programs using Scala and Spark
Prerequisite:
Java programming (good to know)
Familiarity with the Hadoop ecosystem – Hive, HDFS, etc. (good to know)
Familiarity with Unix systems – basic directory navigation commands, file creation
commands, environment variables
A system with sufficient RAM – minimum of 8 GB
A 64-bit machine with VT-x enabled, to allow running a virtual machine
Hands-On:
This is a hands-on session; each section has a relevant hands-on lab
We will use both:
o a base Ubuntu VM
o Cloudera’s QuickStart VM
SCALA – 3 days
Introduction to Scala
1. Why Scala?
2. What makes Scala tick
3. Scala interpreter
4. Variables
5. Functions
6. Control Statements (if, else, for, foreach)
7. Basics of lists, tuples, sets, maps, and arrays
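The basics above can be tried directly in the Scala interpreter. A minimal sketch (names and values are illustrative only):

```scala
// vals are immutable, vars are mutable
val greeting = "hello"
var count = 0

// a simple function definition
def square(x: Int): Int = x * x

// if/else is an expression that returns a value
val parity = if (count % 2 == 0) "even" else "odd"

// for loop over a range
for (i <- 1 to 3) count += i   // count is now 6

// basic collections
val nums = List(1, 2, 3)
val squares = nums.map(square) // List(1, 4, 9)
```

Pasting these lines into the Scala REPL shows each result as it is evaluated.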
Collections in details
1. Lists
2. Sequences
3. Sets
4. Maps
5. Tuples
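The collection types above share a common set of operations (map, filter, pattern-matching destructure). A short sketch, with illustrative data:

```scala
// List: ordered, allows duplicates
val fruits = List("apple", "banana", "cherry")
val lengths = fruits.map(_.length)                  // List(5, 6, 6)

// Map: key-value pairs
val prices = Map("apple" -> 2, "banana" -> 1)
val cheap = prices.filter { case (_, p) => p < 2 }  // Map(banana -> 1)

// Set: no duplicates
val uniq = Set(1, 2, 2, 3)                          // Set(1, 2, 3)

// Tuple: fixed-size, mixed types; destructured by pattern
val pair = ("spark", 2)
val (name, version) = pair

// Seq: general ordered sequence
val sorted = Seq(3, 1, 2).sorted                    // Seq(1, 2, 3)
```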
SBT
sbt – the Scala build tool
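An sbt project is driven by a build.sbt file at the project root. A minimal sketch (the project name and version numbers here are illustrative, not prescriptive):

```scala
// build.sbt – minimal example project definition
name := "scala-spark-training"
scalaVersion := "2.12.18"

// Spark dependency, marked Provided since the cluster supplies it at runtime
libraryDependencies += "org.apache.spark" %% "spark-core" % "3.5.0" % Provided
```

Running `sbt compile` or `sbt run` from the project root picks up this definition.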
SPARK – 3 days
Introduction to Spark
1. What is Spark
2. Spark stack
3. Where does Spark fit in the Hadoop stack
4. Uses of Spark
Spark setup
1. Downloading
2. Starting the Spark shell
a. Python
b. Scala
3. SparkContext and SparkSession
Pair RDDs
1. Creating Pair RDDs
2. Transformations on Pair RDDs
3. Data Partitioning
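Pair RDD transformations such as reduceByKey and groupByKey operate on collections of (key, value) tuples. A plain-Scala analogy that runs without a cluster (the data is illustrative; on a real RDD these would be rdd.reduceByKey and rdd.groupByKey):

```scala
val pairs = List(("a", 1), ("b", 2), ("a", 3))

// analogous to pairs.reduceByKey(_ + _) on a Pair RDD
val reduced = pairs
  .groupBy(_._1)
  .map { case (k, vs) => k -> vs.map(_._2).sum }   // Map(a -> 4, b -> 2)

// analogous to groupByKey: collect all values per key
val grouped = pairs
  .groupBy(_._1)
  .map { case (k, vs) => k -> vs.map(_._2) }
```

On a real Pair RDD, reduceByKey is preferred over groupByKey because it combines values on each partition before shuffling.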
Spark Streaming
1. Understanding DStreams
2. Architecture
3. Transformations
4. Performance Considerations
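A DStream is a sequence of micro-batches, and windowed transformations aggregate over a sliding window of those batches. A plain-Scala analogy of a 3-batch window sliding by 1 (the per-batch counts are made up for illustration):

```scala
// pretend each element is the record count of one micro-batch
val batchCounts = List(3, 1, 4, 1, 5)

// analogous to a countByWindow over 3 batches, sliding by 1 batch
val windowedCounts = batchCounts
  .sliding(3, 1)        // windows: (3,1,4), (1,4,1), (4,1,5)
  .map(_.sum)
  .toList               // List(8, 6, 10)
```

In real Spark Streaming the window and slide durations are given in time units (e.g. Seconds(30)), and must be multiples of the batch interval.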