ISUM 2018
March 5-9, 2018
Mérida, Yucatán, México
9th International Supercomputing Conference In Mexico
Creating an Insightful World through Supercomputing




Workshop 3: Data processing and querying at scale using Big Data Analytics Stacks


Genoveva Vargas Solar
After more than thirty years, querying remains an important and central issue. Today, querying techniques can be organised along a spectrum of approaches for exploring and exploiting data in order to extract value from it. Each point on this spectrum also corresponds to a family of commercial and non-commercial tools that address a given type of querying. These families are not orthogonal: a single family often provides techniques for more than one type of querying. The objective of this tutorial is to give an overview of the data querying spectrum and to offer hands-on practice with some of the most prominent of these tools. The tutorial covers the following topics:

  • Using big data analytics stacks for managing data collections (PigLatin and MongoDB)
  • Exploring Data Collections: control flow & dataflow-oriented approaches (Spark)
  • Understanding data collections: machine learning & visualization (Spark & other tools)