PySpark SQL Recipes: With HiveQL, Dataframe and Graphframes - Softcover

 
9781484243367: PySpark SQL Recipes: With HiveQL, Dataframe and Graphframes

No copies are currently available for this ISBN.

Synopsis

Chapter 1: Introduction to PySparkSQL

Chapter Goal: The reader will gain an understanding of PySpark, PySparkSQL, the Catalyst optimizer, Project Tungsten, and Hive.

No. of pages: 20-30

Sub-Topics:

1. PySpark

2. PySparkSQL

3. Hive

4. Catalyst

5. Project Tungsten

 

Chapter 2: Some Time with Installation

Chapter Goal: The reader will learn how to install Spark, Hive, PostgreSQL, MySQL, MongoDB, Cassandra, etc.

No. of pages: 30-40

Sub-Topics:

1. Installing Spark

2. Installing Hive

3. Installing MySQL

4. Installing MongoDB

Chapter 3: IO in PySparkSQL

Chapter Goal: This chapter provides recipes that enable the reader to create PySparkSQL DataFrames from different sources.

No. of pages: 40-50

Sub-Topics:

1. Creating a DataFrame from data

2. Reading a CSV file to create a DataFrame

3. Reading a JSON file to create a DataFrame

4. Saving DataFrames to different formats

 

Chapter 4: Operations on PySparkSQL DataFrames

Chapter Goal: The reader will learn about data filtering, data manipulation, descriptive data analysis, dealing with missing values, etc.

No. of pages: 40-50

1. Data filtering

2. Data manipulation

3. Row and column manipulation

 

Chapter 5: Data Merging and Data Aggregation Using PySparkSQL

Chapter Goal: The reader will learn about data merging and aggregation using PySparkSQL.

1. Data merging

2. Data aggregation

 

Chapter 6: SQL, NoSQL and PySparkSQL

Chapter Goal: The reader will learn to run SQL and HiveQL queries on DataFrames.

No. of pages: 30-40

Sub-Topics:

1. Running SQL on a DataFrame

2. Running HiveQL

 

Chapter 7: Structured Streaming

Chapter Goal: The reader will gain an understanding of structured streaming.

No. of pages: 30-40

1. Different types of modes

2. Data aggregation in structured streaming

3. Different types of sources

 

Chapter 8: Optimizing PySparkSQL

Chapter Goal: The reader will learn about optimizing PySparkSQL.

No. of pages: 20-30

Optimizing PySparkSQL

 

Chapter 9: GraphFrames

Chapter Goal: The reader will gain an understanding of graph data analysis with GraphFrames.

No. of pages: 30-40

1. GraphFrame Creat

The information in the "Synopsis" section may refer to different editions of this title.

Other known editions of the same title

9781484243343: PySpark SQL Recipes: With HiveQL, Dataframe and Graphframes

Featured edition

ISBN 10: 148424334X ISBN 13: 9781484243343
Publisher: Apress, 2019
Softcover