Spark for Developers Training

This training class introduces Apache Spark to developers and data analysts. Students will learn how Spark fits into the Big Data ecosystem and how to use Spark for data analysis. The course covers the Spark shell for interactive data analysis, Spark internals, Spark APIs, Spark SQL, Spark Streaming, machine learning with MLlib, and GraphX.

Location

Public Classes: Delivered live online via WebEx and guaranteed to run. Join from anywhere!

Private Classes: Delivered at your offices or any other location of your choice.

Outline
  1. Scala Primer
    1. A Quick Introduction to Scala
  2. Spark Basics
    1. Background and History
    2. Spark and Hadoop
    3. Spark Concepts and Architecture
    4. Spark Ecosystem (Core, Spark SQL, MLlib, Streaming)
  3. RDDs
    1. Running Spark in Local Mode
    2. Spark Web UI
    3. Spark Shell
    4. Analyzing Dataset - part 1
    5. Inspecting RDDs
  4. RDDs In Depth
    1. Partitions
    2. RDD Operations / Transformations
    3. RDD Types
    4. Key-Value Pair RDDs
    5. MapReduce on RDD
    6. Caching and Persistence
  5. Spark and Hadoop
    1. Hadoop Intro (HDFS / YARN)
    2. Hadoop + Spark Architecture
    3. Running Spark on Hadoop YARN
    4. Processing HDFS Files Using Spark
  6. Spark API Programming
    1. Introduction to Spark API / RDD API
    2. Submitting the First Program to Spark
    3. Debugging / Logging
    4. Configuration Properties
  7. Spark SQL
    1. SQL Context
    2. Defining Tables and Importing Datasets
    3. Querying
  8. Spark Streaming
    1. Streaming Overview
    2. Streaming Operations
    3. Sliding Window Operations
    4. Writing Spark Streaming Applications
  9. Spark MLlib
    1. MLlib Intro
    2. MLlib Algorithms
    3. Writing MLlib Applications
  10. Spark GraphX
    1. GraphX Library Overview
    2. GraphX APIs
    3. Processing Graph Data Using Spark
  11. Spark Performance and Tuning
    1. Broadcast Variables
    2. Accumulators
    3. Memory Management
  12. Bonus Lab: Running Spark in Cluster Mode
    1. Inspecting Master / Workers in UIs
    2. Configurations
    3. Distributed Processing of Large Data Sets
Class Materials

Each student in our Live Online and our Onsite classes receives a comprehensive set of materials, including course notes and all the class examples.

Class Prerequisites

Experience in the following is required for this Spark class:

  • Familiarity with Java, Scala, or Python (labs are in Scala and Python).
  • Basic understanding of the Linux development environment (command-line navigation and editing files using vi or nano).

Training for your Team

Length: 3 Days
  • Private Class for your Team
  • Online or On-location
  • Customizable
  • Expert Instructors

What people say about our training

This was my first time using Webucator. I really liked the personalized attention and that the instructor was able to incorporate my website into the classroom demos. It made it much easier to apply some of the concepts as it relates to how I will be using them.
Tammy Rosen
Fur-Get Me Not
I found this course to be invaluable. The instructor was friendly, knowledgeable, and engaging, and he clearly knew the program inside and out. The course is 7 hours long, but the day flew by quickly because it was very interactive, with lots of exercises spread throughout the day. I have taken other courses before through other training companies, and the quality and content of this course was far superior. I am pleased with the course, and really appreciate the instructional booklet which I will be able to refer to as needed as well.
Rebecca Cohen
Metsa Board Americas
Wonderful class - really exposes the learner to all the possibilities available in the software.
Susan McKibben
The University of Akron
It was a great experience taking the web based class and I would recommend this for people who can't travel. You get to remote into your own computer and all the scenarios are set up for you in advance.
Hang Voong
Direct Energy

No cancelation for low enrollment

Certified Microsoft Partner

Registered Education Provider (R.E.P.)

GSA schedule pricing

62,921

Students who have taken Instructor-led Training

11,864

Organizations who trust Webucator for their Instructor-led training needs

100%

Satisfaction guarantee and retake option

Students rated our trainers 9.29 out of 10 based on 29,583 reviews

Contact Us or call 1-877-932-8228