Using R for Big Data with Spark

CG数据库 >> Using R for Big Data with Spark

Using R for Big Data with Spark的图片1

Genre: Development / Programming

Data analysts familiar with R will learn to leverage the power of Spark, distributed computing and cloud storage in this course that shows you how to use your R skills in a big data environment.

You'll learn to create Spark clusters on the Amazon Web Services (AWS) platform; perform cluster based data modeling using Gaussian generalized linear models, binomial generalized linear models, Naive Bayes, and K-means modeling; access data from S3 Spark DataFrames and other formats like CSV, Json, and HDFS; and do cluster based data manipulation operations with tools like SparkR and SparkSQL. By course end, you'll be capable of working with massive data sets not possible on a single computer. This hands-on class requires each learner to set-up their own extremely low-cost, easily terminated AWS account.

Discover how to use your R skills in a big data distributed cloud computing cluster environment

Gain hands-on experience setting up Spark clusters on Amazon's AWS cloud services platform

Understand how to control a cloud instance on AWS using SSH or PuTTY

Explore basic distributed modeling techniques like GLM, Naive Bayes, and K-means

Learn to do cloud based data manipulation and processing using SparkR and SparkSQL

Understand how to access data from the CSV, Json, HDFS, and S3 formats

Using R for Big Data with Spark的图片2

下载链接-百度网盘

发布日期: 2016-10-28

CG数据库关于本站联系方式 Archiver