Strata + Hadoop World 2016 - San Jose, California - Data Science & Advanced Analytics
MP4 | Video: AVC 1280x720 | Audio: AAC 44 kHz 2ch | Duration: 25 Hours | 6.42 GB
Genre: eLearning | Language: English
Data wrangling and intro to pandas - Part 1 - T.J. Alumbaugh (Continuum Analytics), James Powell (NumFOCUS) 00:57:37
Data wrangling and intro to pandas - Part 2 - T.J. Alumbaugh (Continuum Analytics), James Powell (NumFOCUS) 00:54:47
Intro to data visualization with Bokeh - Part 1 - Bryan Van de Ven (Continuum Analytics), Sarah Bird (Aptivate) 01:06:12
Intro to data visualization with Bokeh - Part 2 - Bryan Van de Ven (Continuum Analytics), Sarah Bird (Aptivate) 00:46:22
Intro to machine learning with scikit-learn - Part 1 - Jake Vanderplas (eScience Institute, University of Washington), Katrina Riehl (Continuum Analytics) 00:58:36
Intro to machine learning with scikit-learn - Part 2 - Jake Vanderplas (eScience Institute, University of Washington), Katrina Riehl (Continuum Analytics) 00:54:49
R quickstart: Transform and visualize data - Garrett Grolemund (RStudio, Inc.) 01:07:14
Validating models in R - Part 1 - Nina Zumel (Win-Vector LLC), John Mount (Win-Vector LLC) 00:49:13
Validating models in R - Part 2 - Nina Zumel (Win-Vector LLC), John Mount (Win-Vector LLC) 00:38:44
Scaling R: Analytics for big data - Stephen Elston (Quantia Analytics, LLC) 01:04:29
Reproducible reports with big data - Garrett Grolemund (RStudio, Inc.) 01:02:59
A year of anomalies: Building shared infrastructure for anomaly detection - Chris Sanden (Netflix), Christopher Colburn (Netflix) 00:42:01
Augmenting machine learning with human computation for better personalization - Eric Colson (Stitch Fix) 00:47:33
Real-time fraud detection using process mining with Spark Streaming - Hylke Hendriksen (ING) 00:37:15
Building a marketplace: Eventbrite's approach to search and recommendation - John Berryman (Eventbrite) 00:42:18
Docker for data scientists - Michelangelo D'Agostino (Civis Analytics) 00:42:49
How to make analytic operations look more like DevOps: Lessons learned moving machine-learning algorithms to production environments - Robert Grossman (University of Chicago) 00:41:29
Analyzing time series data with Spark - Sandy Ryza (Cloudera) 00:31:38
Faster conclusions using in-memory columnar SQL and machine learning - Wes McKinney (Cloudera), Jacques Nadeau (Dremio) 00:47:23
Putting the “science” into data science: The importance of reproducibility and peer review for quantitative research - Erik Andrejko (The Climate Corporation) 00:38:27
Can deep neural networks save your neural network? Artificial intelligence, sensors, and strokes - Brandon Ballinger (Cardiogram), Johnson Hsieh (Cardiogram) 00:44:30
Deep learning and recurrent neural networks applied to electronic health records - Josh Patterson (Patterson Consulting), David Kale (University of Southern California), Zachary Lipton (University of California, San Diego) 00:45:34
Data science teams: Hold out for the unicorn or build bands of steeds? - Michael Dauber (Amplify), Yael Garten (LinkedIn), Monica Rogati (Data Natives), Daniel Tunkelang (Various) 00:43:20
How LinkedIn built a text analytics platform at scale - Chi-Yi Kuan (LinkedIn), Weidong Zhang (LinkedIn), Yongzheng Zhang (LinkedIn) 00:40:10
Python scalability: A convenient truth - Travis Oliphant (Continuum Analytics) 00:41:28
Data modeling for data science: Simplify your workload with complex types - Marcel Kornacker (Cloudera) 00:38:15
Atom smashing using machine learning at CERN - Siddha Ganju (Carnegie Mellon University) 00:37:54
Large-scale product classification via text and image-based signals using a fusion of discriminative and deep learning-based classifiers - Sreeni Iyer (Quad Analytix), Anurag Bhardwaj (Quad Analytix) 00:49:22
Vowpal Wabbit: The essence of speed in machine learning - Jeroen Janssens (Tilburg University) 00:36:00
The polyglot Beaker notebook - Scott Draves (Two Sigma Open Source) 00:40:26
Release date: 2018-04-14