PySpark Recipes: A Problem-Solution Approach with PySpark2
Quickly find solutions to common programming problems encountered while processing big data. Content is presented in the popular problem-solution format. Look up the programming problem that you want to solve. Read the solution. Apply the solution directly in your own code. Problem solved! PySpark R...
Main author: | |
---|---|
Format: | E-book |
Language: | English |
Published: | Berkeley, CA : Apress, 2018. |
Edition: | 1st ed. 2018. |
Subjects: | |
View in the Universitat Ramon Llull Library: | https://discovery.url.edu/permalink/34CSUC_URL/1im36ta/alma991009630430306719 |
Table of Contents:
- Chapter 1: The Era of Big Data, Hadoop, and Other Big Data Processing Frameworks
- Chapter 2: Installation
- Chapter 3: Introduction to Python and NumPy
- Chapter 4: Spark Architecture and Resilient Distributed Dataset
- Chapter 5: The Power of Pairs: Paired RDD
- Chapter 6: IO in PySpark
- Chapter 7: Optimizing PySpark and PySpark Streaming
- Chapter 8: PySparkSQL
- Chapter 9: PySpark MLlib and Linear Regression