Scala and Spark Training: Objective


SCALA AND SPARK TRAINING

Objective:
By the end of this training, participants will be able to:
• Demonstrate a thorough knowledge of Scala
• Understand Spark
• Write data processing programs using Scala and Spark

Duration: 5-6 days (40 to 48 hours)


• SCALA – 2-3 days
• SPARK – 3 days

Prerequisite:
• Java programming (good to know)
• Familiarity with the Hadoop ecosystem: Hive, HDFS, etc. (good to know)
• Familiarity with Unix systems: basic directory navigation and file creation commands,
environment variables
• A system with sufficient RAM: a minimum of 8 GB
• A 64-bit machine with VT-x enabled, to allow running the virtual machines

Hands-On:
• This is a hands-on course; each section has a relevant lab
• We will use both
o a base Ubuntu VM
o Cloudera's QuickStart VM

SCALA – 3 days
Introduction to Scala
1. Why Scala?
2. What makes Scala tick
3. Scala interpreter
4. Variables
5. Functions
6. Control statements (if, else, for, foreach)
7. Basics of lists, tuples, sets, maps, arrays
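The basics in this module can be sketched in a few lines, runnable directly in the Scala interpreter:

```scala
// Variables: val is immutable, var is mutable.
val greeting = "hello"
var counter = 0

// A function with an explicit return type.
def square(n: Int): Int = n * n

// Control statements: if/else is an expression; for iterates; foreach takes a function.
val parity = if (counter % 2 == 0) "even" else "odd"
for (i <- 1 to 3) counter += i
List(1, 2, 3).foreach(n => counter += n)

// The basic collections.
val xs = List(1, 2, 3)
val tup = ("spark", 2)          // a tuple
val s = Set(1, 2, 2)            // duplicates collapse: two elements
val m = Map("a" -> 1)
val arr = Array(1, 2, 3)
```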

Introduction to Classes and Objects
1. Classes, fields, methods
2. Singleton objects
3. A Scala application
4. Application trait
5. Case classes
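A minimal sketch of classes with fields and methods, a singleton object, and a case class (the App trait is covered in the labs):

```scala
// A class with a private field and a method.
class Counter {
  private var count = 0
  def increment(): Int = { count += 1; count }
}

// A singleton object: exactly one instance, created on first use.
object Registry {
  private var names = List.empty[String]
  def register(n: String): Unit = names = n :: names
  def size: Int = names.length
}

// A case class gets equality, toString and copy for free.
case class Point(x: Int, y: Int)
val p = Point(1, 2)
val q = p.copy(y = 5)
```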

Functional programming in Scala
1. Treating functions as first class citizens
2. Closures and higher order functions
3. Tail recursion
4. Function literals
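The four ideas in this module fit in one short sketch:

```scala
// A function literal stored in a value: functions are first-class citizens.
val double: Int => Int = _ * 2

// A higher-order function takes a function as a parameter.
def applyTwice(f: Int => Int, x: Int): Int = f(f(x))

// A closure captures a variable from its surrounding scope.
var factor = 3
val scale = (x: Int) => x * factor

// Tail recursion: the annotation asks the compiler to verify
// the recursive call is in tail position (so it compiles to a loop).
@annotation.tailrec
def sumTo(n: Int, acc: Int = 0): Int =
  if (n == 0) acc else sumTo(n - 1, acc + n)
```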

Packages and Imports
1. Putting code in packages
2. Imports
3. Access modifiers
4. Package objects
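Package declarations live in compiled source files rather than interpreter sessions, so this sketch focuses on the import forms: wildcard, selective, and rename.

```scala
// Wildcard import of a package, then a selective import with a rename:
// mutable.Map is brought in under the alias MutableMap.
import scala.collection.mutable
import scala.collection.mutable.{ArrayBuffer, Map => MutableMap}

val buf = ArrayBuffer(1, 2)
buf += 3                        // mutable: append in place

val mm = MutableMap("a" -> 1)
mm("b") = 2                     // mutable: update in place
```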

Collections in details
1. Lists
2. Sequences
3. Sets
4. Maps
5. Tuples
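A quick tour of the collection operations the labs exercise:

```scala
val xs = List(10, 20, 30)
val doubled = xs.map(_ * 2)               // transform each element
val big = xs.filter(_ > 15)               // keep matching elements
val total = xs.foldLeft(0)(_ + _)         // aggregate to a single value

val seq: Seq[Int] = Vector(1, 2, 3)       // Seq is the general interface
val union = Set(1, 2) ++ Set(2, 3)        // sets ignore duplicates
val ages = Map("ann" -> 30, "bob" -> 25)
val older = ages.filter { case (_, age) => age > 26 }
val zipped = xs.zip(List("a", "b", "c"))  // a list of tuples
```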

Combining Scala and Java
1. Using Scala from Java
2. Annotations
3. Existential types
4. Compiling Scala and Java together
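A small interop sketch: Scala calls Java APIs directly, and converters bridge the two collection libraries. The converter import shown is for Scala 2.13+; older releases use scala.collection.JavaConverters.

```scala
// Java classes are used directly from Scala.
val jlist = new java.util.ArrayList[String]()
jlist.add("scala")
jlist.add("java")

// Converters turn a Java collection into a Scala one (and back with asJava).
import scala.jdk.CollectionConverters._
val slist = jlist.asScala.toList
```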

SBT (Scala build tool)
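A minimal build.sbt sketch of the kind built in the labs; the project name and version numbers are illustrative only:

```scala
// build.sbt -- a minimal sketch; versions are illustrative.
name := "spark-training"
scalaVersion := "2.12.18"
libraryDependencies += "org.apache.spark" %% "spark-sql" % "3.4.1" % "provided"
```

The "provided" scope keeps Spark out of the packaged jar, since the cluster supplies it at runtime.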
SPARK – 3 days
Introduction to Spark
1. What is Spark?
2. The Spark stack
3. Where does Spark fit in the Hadoop stack?
4. Use cases for Spark

Spark setup
1. Downloading Spark
2. Starting the Spark shell
a. Python
b. Scala
3. SparkContext and SparkSession
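A minimal sketch of creating a SparkSession (Spark 2.x+), which wraps the older SparkContext; this assumes Spark is on the classpath and runs in local mode:

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder
  .appName("setup-demo")
  .master("local[*]")     // local mode; on a cluster this comes from spark-submit
  .getOrCreate()

// The underlying SparkContext is still available for RDD work.
val sc = spark.sparkContext
```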

Resilient Distributed Datasets (RDDs)
1. RDD Basics
2. Creating RDDs
3. Transformations
4. Actions
5. Lazy evaluation of RDDs
6. Persistence
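The RDD lifecycle above in one sketch, assuming Spark on the classpath in local mode:

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.storage.StorageLevel

val sc = new SparkContext(new SparkConf().setMaster("local[*]").setAppName("rdd-demo"))

val nums = sc.parallelize(1 to 10)         // creating an RDD from a local collection
val evens = nums.filter(_ % 2 == 0)        // transformation: lazy, nothing runs yet
val doubled = evens.map(_ * 2)             // another lazy transformation
doubled.persist(StorageLevel.MEMORY_ONLY)  // keep results cached across actions
val total = doubled.reduce(_ + _)          // action: this triggers the computation
```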

Pair RDDs
1. Creating Pair RDDs
2. Transformations on Pair RDDs
3. Data Partitioning
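A pair-RDD sketch in the word-count style used throughout the labs (local mode assumed):

```scala
import org.apache.spark.{HashPartitioner, SparkConf, SparkContext}

val sc = new SparkContext(new SparkConf().setMaster("local[*]").setAppName("pair-demo"))

val words = sc.parallelize(Seq("spark", "scala", "spark"))
val pairs = words.map(w => (w, 1))          // a pair RDD: (key, value)
val counts = pairs.reduceByKey(_ + _)       // per-key aggregation
val wordCounts = counts.collectAsMap()

// Explicit data partitioning: hash keys into 4 partitions.
val partitioned = counts.partitionBy(new HashPartitioner(4))
val numParts = partitioned.partitions.length
```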

Loading and Saving Data
1. Working with Local File System
2. Working with HDFS
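The same textFile/saveAsTextFile API covers both cases; only the URI scheme changes (file:// for the local file system, hdfs://namenode:port/path for HDFS). A local round-trip sketch:

```scala
import org.apache.spark.{SparkConf, SparkContext}

val sc = new SparkContext(new SparkConf().setMaster("local[*]").setAppName("io-demo"))

// saveAsTextFile writes to a directory that must not already exist,
// so we put the output under a fresh temp directory.
val out = java.nio.file.Files.createTempDirectory("spark-io").toString + "/lines"
sc.parallelize(Seq("line one", "line two")).saveAsTextFile(out)

// Reading back: textFile accepts the same path (an HDFS URI would work identically).
val back = sc.textFile(out)
val lineCount = back.count()
```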

Running Spark on a cluster
1. Spark runtime architecture
a. Driver
b. Executor
c. Cluster Manager
2. Deploying applications using spark-submit
3. Packaging code and dependencies
4. Building a spark application through sbt
5. Intro to cluster managers
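A typical spark-submit invocation looks like the following sketch; the class name, jar path and input path are illustrative only:

```shell
# Submit a packaged application to a YARN cluster; names are illustrative.
spark-submit \
  --class com.example.WordCount \
  --master yarn \
  --deploy-mode cluster \
  --executor-memory 2G \
  target/scala-2.12/wordcount_2.12-0.1.jar hdfs:///input/data.txt
```

The driver and executors described above are launched by the cluster manager named in --master (yarn, a Spark standalone URL, or local[*] for testing).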

Spark SQL, DataFrames and Datasets
1. Using DataFrames
2. Using Spark SQL
3. Using Datasets
4. Loading and saving data
a. JSON
b. Parquet
c. Apache Hive
d. RDDs
5. User-defined functions (UDFs)
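A DataFrame, SQL and UDF sketch; the data comes from a local Seq here, where the labs would load JSON, Parquet or Hive tables (typed Datasets via .as[T] follow the same pattern):

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.udf

val spark = SparkSession.builder.master("local[*]").appName("sql-demo").getOrCreate()
import spark.implicits._

// A DataFrame from a local collection.
val df = Seq(("alice", 34), ("bob", 29)).toDF("name", "age")

// Spark SQL over a temporary view.
df.createOrReplaceTempView("people")
val adultNames = spark.sql("SELECT name FROM people WHERE age > 30")
  .collect().map(_.getString(0)).toList

// A user-defined function applied as a new column.
val shout = udf((s: String) => s.toUpperCase)
val shouted = df.withColumn("NAME", shout($"name"))
  .select("NAME").collect().map(_.getString(0)).toSet
```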

Spark Streaming
1. Understanding DStreams
2. Architecture
3. Transformations
4. Performance Considerations
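A DStream is a sequence of RDDs, one per batch interval. A local sketch using queueStream, which feeds pre-built RDDs into the stream and is handy for experiments without a live source:

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import scala.collection.mutable

val sparkConf = new SparkConf().setMaster("local[2]").setAppName("dstream-demo")
val ssc = new StreamingContext(sparkConf, Seconds(1))

// One queued RDD becomes one micro-batch.
val queue = mutable.Queue(ssc.sparkContext.makeRDD(Seq("to be or not to be")))
val words = ssc.queueStream(queue).flatMap(_.split(" "))
val counts = words.map(w => (w, 1)).reduceByKey(_ + _)

// Collect results on the driver so we can inspect them after stopping.
val results = mutable.Map.empty[String, Int]
counts.foreachRDD(rdd => rdd.collect().foreach { case (w, n) => results(w) = n })

ssc.start()
ssc.awaitTerminationOrTimeout(5000)  // let a few 1-second batches run
ssc.stop()
```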

Machine Learning Overview
1. Machine learning basics
2. Data Types
3. Algorithms
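The MLlib data types start with vectors, which feed the algorithms covered here; a minimal sketch (requires the spark-mllib module on the classpath):

```scala
import org.apache.spark.ml.linalg.Vectors

// A dense vector stores every entry; a sparse one stores (index, value) pairs.
val dense = Vectors.dense(1.0, 0.0, 3.0)
val sparse = Vectors.sparse(3, Array(0, 2), Array(1.0, 3.0))
```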

Tuning and Debugging Spark
1. Jobs
2. Tasks
3. Stages
4. Finding the right information
5. Parallelism
6. Serialization format
7. Memory Management
8. Hardware provisioning
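Several of these knobs are set through SparkConf; a sketch with illustrative values (the right numbers depend on the workload and hardware):

```scala
import org.apache.spark.SparkConf

val conf = new SparkConf()
  .setAppName("tuned-app")
  // Serialization format: Kryo is typically faster and more compact than Java serialization.
  .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
  // Parallelism: default number of partitions for shuffles on RDDs.
  .set("spark.default.parallelism", "16")
  // Memory management: fraction of heap used for execution and storage.
  .set("spark.memory.fraction", "0.6")
```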
