[Arch] SparkSQL Internals – Part 2: SparkSQL Data Flow

The SparkSQL paper includes a very nice figure of the SparkSQL data flow. I have worked with Apache Pig for more than a year, so I thought it would be better to put them all together: I created a new figure that includes the data flow of Hive, Pig, and SparkSQL. I know a little …
