MP4 | Video: h264, 1280×720 | Audio: AAC, 48 kHz, 2 Ch
Genre: eLearning | Language: English + .VTT | Duration: 3 hours | Size: 1.75 GB

Learn Apache Spark's key concepts using real-world examples

What you'll learn
- How to create RDDs, DataFrames, and Datasets
- How to properly use map, reduce, and filter
- How to partition RDDs in distributed systems
- Caching datasets in memory to reduce computation
- How to tune Spark programs
- How to run iterative algorithms on a cluster
- The difference between groupByKey and reduceByKey

Requirements
- Familiarity with Ubuntu
- Familiarity with Scala

Description
Learn Apache Spark's key concepts using real-world examples.
This course goes over everything you need to know to get started using Spark.
We start with resilient distributed datasets (RDDs) and the main transformations and actions that can be performed on them.
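As a sketch of what that first part covers, here is a minimal RDD example showing lazy transformations (map, filter) followed by an action (reduce); the numbers and object name are illustrative, not taken from the course, and running it requires a Spark installation:

```scala
import org.apache.spark.{SparkConf, SparkContext}

object RddBasics {
  def main(args: Array[String]): Unit = {
    // Local SparkContext for experimentation; a real cluster would use a master URL.
    val conf = new SparkConf().setAppName("RddBasics").setMaster("local[*]")
    val sc   = new SparkContext(conf)

    // Create an RDD from a local collection.
    val numbers = sc.parallelize(1 to 10)

    // Transformations are lazy: nothing executes until an action is called.
    val evens   = numbers.filter(_ % 2 == 0)  // transformation
    val squared = evens.map(n => n * n)       // transformation

    // reduce is an action: it triggers execution and returns a result.
    val total = squared.reduce(_ + _)
    println(s"Sum of squared evens: $total")  // 4 + 16 + 36 + 64 + 100 = 220

    sc.stop()
  }
}
```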
Then we move on to advanced Spark concepts such as partitioning and persistence.
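Those two ideas can be sketched in a few lines; the key/value data below is made up for illustration, and the block assumes a Spark runtime:

```scala
import org.apache.spark.{HashPartitioner, SparkConf, SparkContext}
import org.apache.spark.storage.StorageLevel

object PartitionAndPersist {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(
      new SparkConf().setAppName("PartitionAndPersist").setMaster("local[*]"))

    val pairs = sc.parallelize(Seq(("a", 1), ("b", 2), ("a", 3), ("c", 4)))

    // partitionBy controls how keys are distributed across the cluster;
    // co-partitioning keyed RDDs avoids shuffles in later joins.
    val partitioned = pairs.partitionBy(new HashPartitioner(4))

    // persist keeps the partitioned RDD in memory, so iterative jobs
    // reuse it instead of recomputing the whole lineage each pass.
    partitioned.persist(StorageLevel.MEMORY_ONLY)

    // reduceByKey combines values per key on each partition before shuffling.
    val counts = partitioned.reduceByKey(_ + _).collect()
    counts.foreach(println)

    sc.stop()
  }
}
```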
Finally, the course ends with Spark's SQL API, which provides two data abstractions, DataFrames and Datasets, that sit on top of Spark RDDs.
They enable new levels of query optimization and SQL querying capabilities.
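A rough sketch of what that SQL API looks like in practice; the people.json input file and the Person schema are hypothetical, and the block assumes a Spark runtime:

```scala
import org.apache.spark.sql.SparkSession

object SqlApiSketch {
  case class Person(name: String, age: Long)

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("SqlApiSketch")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    // A DataFrame is an untyped, columnar view over distributed data.
    val df = spark.read.json("people.json")  // hypothetical input file

    // Registering it as a view enables plain SQL queries, which Spark's
    // optimizer turns into an efficient plan over the underlying RDDs.
    df.createOrReplaceTempView("people")
    spark.sql("SELECT name FROM people WHERE age > 30").show()

    // A Dataset adds compile-time types on top of the same engine.
    val people = df.as[Person]
    people.filter(_.age > 30).map(_.name).show()

    spark.stop()
  }
}
```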
Who this course is for:
- Beginner Scala developers curious about data science