Analyzing Data Using Spark 2.0 DataFrames With Python


Bibliographic Details
Other Authors: Portilla, Jose (author)
Format: Video
Language: English
Published: Infinite Skills, 2017.
Edition: 1st edition
View at Universitat Ramon Llull Library: https://discovery.url.edu/permalink/34CSUC_URL/1im36ta/alma991009631248106719
Description
Summary: Apache Spark 2.0 has become the gold standard for processing large datasets. This course, designed for learners with basic Python programming experience, takes you on an introductory journey into the world of big data analysis using Spark 2.0, Python, and the Spark DataFrame API. Beginning with an overview of Spark 2.0 and Python, and then moving into a detailed examination of DataFrames, you'll learn about using SQL with DataFrames, DataFrame dates and timestamps, DataFrame aggregate operations, and handling missing data in DataFrames. The course includes a hands-on data analysis exercise using real stock data. Learners should have Python and Spark installed on their computers before starting the class.

Gain a core understanding of Spark 2.0 and Spark DataFrames. Learn how to use Python with Spark DataFrames. Gain big data experience analyzing stock data with Python and Spark DataFrames.

Jose Marcial Portilla is Head of Data Science at Pierian Data, based in the San Francisco Bay Area, where he creates and delivers data science and Python training courses for Fortune 500 clients such as Credit Suisse, General Electric, and The New York Times. Jose holds degrees in Mechanical Engineering from Santa Clara University.
Notes: Title from title screen (viewed June 9, 2017).
Date of publication from resource description page.
Physical Description: 1 online resource (1 video file, approximately 1 hr., 17 min.)