Created by Packt Publishing | Video: h264, 1280×720 | Audio: AAC 48KHz 2ch | Duration: 03:37 H/M | Lec: 84 | 5.10 GB | Language: English | Sub: English [Auto-generated]

Uncover the lesser known secrets of powerful big data processing with Spark and Kafka

What you’ll learn
How to attain a solid foundation in the most powerful and versatile technologies involved in data streaming: Apache Spark and Apache Kafka
Form a robust and clean architecture for a data streaming pipeline
Ways to implement the correct tools to bring your data streaming architecture to life
How to create robust processing pipelines by testing Apache Spark jobs
How to create highly concurrent Spark programs by leveraging immutability
How to solve repeated problems by leveraging the GraphX API
How to solve long-running computation problems by leveraging lazy evaluation in Spark
Tips to avoid memory leaks by understanding the internal memory management of Apache Spark
Troubleshoot real-time pipelines written in Spark Streaming

Requirements
To pick up this course, you don’t need to be an expert with Spark.
Learners should be familiar with Java or Scala.

Description
Video Learning Path Overview
A Learning Path is a specially tailored course that brings together two or more different topics that lead you to achieve an end goal.
Much thought goes into the selection of the assets for a Learning Path, and this is done through a complete understanding of the requirements to achieve a goal.
Today, organizations have a difficult time working with large datasets.
In addition, big data processing and analysis need to be done in real time to gain valuable insights quickly.
This is where data streaming and Spark come in.
In this well-thought-out Learning Path, you will not only learn how to work with Spark to solve the problem of analyzing massive amounts of data for your organization, but you’ll also learn how to tune it for performance.
Beginning with a step-by-step approach, you’ll get comfortable using Spark and will learn how to implement some practical and proven techniques to improve particular aspects of programming and administration in Apache Spark.
You’ll be able to perform tasks and get the best out of your databases much faster.
Moving further and accelerating the pace a bit, you’ll learn some of the lesser-known techniques to squeeze the best out of Spark, and then you’ll learn to overcome several problems you might come across when working with Spark without breaking a sweat. The simple and practical solutions provided will get you back in action in no time at all!
By the end of the course, you will be well versed in using Spark in your day-to-day projects.

Key Features
From blueprint architecture to complete code solution, this course treats every important aspect involved in architecting and developing a data streaming pipeline.
Test Spark jobs using unit, integration, and end-to-end techniques to make your data pipeline robust and bulletproof.
Solve several painful issues like slow-running jobs that affect the performance of your application.

Who this course is for?
An Application Developer, Data Scientist, Analyst, Statistician, Big Data Engineer, or anyone who has some experience with Spark will feel perfectly comfortable in understanding the topics presented.
They usually work with large amounts of data on a day-to-day basis.
They may or may not have used Spark, but it’s an added advantage if they have some experience with the tool.