On the occasion of the 3rd NESUS Winter School and PhD Symposium on Data Science and Heterogeneous Computing, Dr. S. Varrette was invited to give a lecture and hands-on session on “Big Data Analytics”.
- Date: Tuesday, January 23rd, 2018, 9h – 13h.
- Location: Zagreb, Croatia
- Slides (PDF) – Reference site: http://nesusws-tutorials-bd-dl.rtfd.io/
- GitHub sources
This tutorial offered a synthetic view of Big Data Analytics challenges and of the tools available to address them, then focused on some of these tools through a practical session with a set of concrete examples.
Time | Session |
---|---|
09:00 - 09:30 | Discover the Hands-on tool: Vagrant |
09:30 - 10:00 | HPC and Big Data (BD): Architectures and Trends |
10:00 - 10:30 | Interlude: Software Management in HPC systems |
10:30 - 11:00 | [Big] Data Management in HPC Environment: Overview and Challenges |
11:00 - 11:15 | Coffee Break |
11:15 - 12:30 | Big Data Analytics with Hadoop & Spark |
12:30 - 13:00 | Deep Learning Analytics with TensorFlow |
13:00 | Lunch |
Title: Big Data Analytics: Overview and Practical Examples
Topics
- Focus on practical tools rather than theoretical content
- starts with daily data management…
- … before speaking about Big data management
- in particular: data transfer (over SSH), data versioning with Git
- continue with classical tools and their usage in HPC
- review HPC environments and the hands-on environment
- reviewing Environment Modules and Lmod
- introducing Vagrant and EasyBuild
- introduction to Big Data processing engines: Hadoop, Spark
- introduction to TensorFlow, a Machine Learning (ML)/Deep Learning (DL) processing framework
Level: beginner to advanced
For more details: see this blog post