Scala for Spark Training in Singapore and India

Sale
Original price Rs. 174,000.00
Current price Rs. 135,000.00
Overview
Scala for Spark Training by Cloud Enabled Pte Ltd in Singapore and India
Course Summary

This one-stop course introduces you to Spark development and gives you the technical know-how to practice it. By the end of the course you will be able to earn a Spark professional credential, and you will be capable of handling terabyte-scale data and analyzing it with Spark and its ecosystem. Scala is a concise JVM language, interoperable with Java, that supports both functional and object-oriented programming at scale. Spark Streaming is an extension of the core Spark API for processing live data as real-time streams. Together, Spark Streaming and Scala let you build streaming applications for big data.

Course Objectives
  • Create Spark applications with the Scala programming language.
  • Use Spark Streaming to process continuous streams of data.
  • Process streams of real-time data with Spark Streaming.
Course Prerequisites
  • Programming and scripting experience
Course Duration
  • 21 hours – 3 days
Course Outline

Introduction

Scala Programming in Depth Review

  • Syntax and structure
  • Flow control and functions

Spark Internals

  • Resilient Distributed Datasets (RDD)
  • From Spark script to execution graph to cluster

Overview of Spark Streaming

  • Streaming architecture
  • Intervals in streaming
  • Fault tolerance

Preparing the Development Environment

  • Installing and configuring Apache Spark
  • Installing and configuring the Scala IDE
  • Installing and configuring JDK

Spark Streaming Beginner to Advanced

  • Working with key/value RDDs
  • Filtering RDDs
  • Improving Spark scripts with regular expressions
  • Sharing data on a cluster
  • Working with network data sets
  • Implementing BFS algorithms
  • Creating Spark driver scripts
  • Tracking in real time with scripts
  • Writing continuous applications
  • Streaming linear regression
  • Using Spark Machine Learning Library

Spark and Clusters

  • Bundling dependencies and Spark scripts using the SBT tool
  • Using Amazon EMR to illustrate clusters
  • Optimizing by partitioning RDDs
  • Using Spark logs 

Integration in Spark Streaming

  • Integrating Apache Kafka and working with Kafka topics
  • Integrating Apache Flume and working with pull-based/push-based Flume configurations
  • Writing a custom receiver class
  • Integrating Cassandra and exposing data as real-time services

In Production

  • Packaging an application and running it with spark-submit
  • Troubleshooting, tuning, and debugging Spark Jobs and clusters
Training Delivery Mode

Online: live instructor-led training

Due to Covid, we are not offering classroom training until the situation improves.

Got Questions?

Please email info@thecloudenabled.com and we will be happy to help.

This course is designed, developed, and delivered by Cloud Enabled Pte Ltd.