Real-time stream processing using Apache Spark 3 for Python developers

Bibliographic Details
Corporate Author: Packt Publishing (publisher)
Format: Video
Language: English
Published: [Birmingham, United Kingdom] : Packt Publishing [2022]
Edition: [First edition]
Subjects:
View at Universitat Ramon Llull Library: https://discovery.url.edu/permalink/34CSUC_URL/1im36ta/alma991009649836306719
Description
Summary: Build your own real-time stream processing applications using Apache Spark 3.x and PySpark.

About This Video: Learn real-time stream processing concepts. Understand the Spark Structured Streaming APIs and architecture. Work with file streams and the Kafka source, and integrate Spark with Kafka.

In Detail: Take your first steps towards discovering, learning, and using Apache Spark 3.0. This carefully structured course takes a live-coding approach and explains the core concepts as they come up. It begins with real-time stream processing concepts and the Spark Structured Streaming APIs and architecture, then works with file streams, the Kafka source, and the integration of Spark with Kafka. Next come stateless and stateful streaming transformations, followed by windowing aggregates in Spark streaming, and then watermarking and state cleanup. After that, the course covers streaming joins and aggregation, including how to handle memory problems with streaming joins. Finally, it shows how to create arbitrary streaming sinks. By the end of this course, you will be able to create real-time stream processing applications using Apache Spark.

Audience: This course is designed for software engineers and architects who want to design and develop big data engineering projects using Apache Spark. It is also aimed at programmers and developers who want to learn and grow in data engineering with Apache Spark. You need to know Spark fundamentals and have some exposure to the Spark DataFrame APIs. You should also know Kafka fundamentals, have a working knowledge of Apache Kafka, and be able to program in Python.
Physical Description: 1 online resource (1 video file (4 hr., 36 min.)) : sound, color
ISBN:9781803246543