Coursera - Introduction to Data Science
MP4 | AVC 88kbps | English | 960x540 | 30fps | 16h 03mins | AAC stereo 113kbps | 3.88 GB
Genre: Video Training
Commerce and research are being transformed by data-driven discovery and prediction. The skills required for data analytics at massive scale (scalable data management on and off the cloud, parallel algorithms, statistical modeling, and proficiency with a complex ecosystem of tools and platforms) span a variety of disciplines and are not easy to obtain through conventional curricula. Tour the basic techniques of data science, including both SQL and NoSQL solutions for massive data management (e.g., MapReduce and contemporaries), algorithms for data mining (e.g., clustering and association rule mining), and basic statistical modeling (e.g., linear and non-linear regression).
Categories:
Information, Tech & Design
Computer Science: Systems & Security
Computer Science: Software Engineering
Statistics and Data Analysis
Course Syllabus:
Part 0: Introduction
Examples, data science articulated, history and context, technology landscape
Part 1: Data Manipulation, at Scale
Databases and the relational algebra
Parallel databases, parallel query processing, in-database analytics
MapReduce, Hadoop, relationship to databases, algorithms, extensions, languages
Key-value stores and NoSQL; tradeoffs of SQL and NoSQL
Entity resolution, record linkage, data cleaning
Part 2: Analytics
Basic statistical modeling, experiment design, introduction to machine learning, overfitting
Supervised learning: overview, simple nearest neighbor, decision trees/forests, regression
Unsupervised learning: k-means, multi-dimensional scaling
Graph Analytics: PageRank, community detection, recursive queries, iterative processing
Text Analytics: latent semantic analysis
Collaborative Filtering: slope-one
Part 3: Communicating Results
Visualization, data products, visual data analytics
Provenance, privacy, ethics, governance
Part 4: Guest Lectures
Guest Lectures: AMPLab, Datameer, SciDB, more
Release date: 2016-02-17