Ambitious Data Science requires massive computational experimentation; in some fields, the entry ticket for a solid PhD is now to conduct experiments involving 1 million CPU hours. Recently, several groups have created efficient computational environments that make it painless to run such massive experiments. This course reviews state-of-the-art practices for running massive computational experiments on compute clusters in a painless and reproducible manner. Students will learn how to automate their computing experiments, first using nuts-and-bolts tools such as Perl and Bash, and later using comprehensive frameworks such as ClusterJob and CodaLab, which enable them to take on ambitious Data Science projects. The course also features a few guest lectures by renowned scientists in the field of Data Science. Students should be familiar with computational experiments and be facile in a high-level computer language such as R, Matlab, or Python.
STATS285 course website
stats285/stats285.github.io