[Arch] An overview of Spark components and their dependencies

Here I sketch the components inside Spark and the dependencies between them, to give you a general overview of how Spark is organized.

Each component is in charge of a particular function, and for most of them the function is clear from the name. I only explain the components that are not easy to understand:
– repl: the interactive shell for Spark, such as spark-shell and pyspark.
– bagel: Spark's implementation of Google’s Pregel graph processing framework. It will be replaced by GraphX.
– catalyst: a query optimization framework for Spark.
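
To give a rough feel for what a query optimization framework does, here is a toy sketch of one classic rewrite, constant folding, applied to a small expression tree. It is plain Python and does not use Catalyst's API; the names `Lit`, `Col`, `Add` and `fold_constants` are made up for this illustration only.

```python
from dataclasses import dataclass
from typing import Union


# Hypothetical expression nodes, invented for this sketch (not Catalyst classes).
@dataclass(frozen=True)
class Lit:
    value: int


@dataclass(frozen=True)
class Col:
    name: str


@dataclass(frozen=True)
class Add:
    left: "Expr"
    right: "Expr"


Expr = Union[Lit, Col, Add]


def fold_constants(expr: Expr) -> Expr:
    """Rewrite the tree bottom-up, replacing Add(Lit, Lit) with a single Lit."""
    if isinstance(expr, Add):
        left = fold_constants(expr.left)
        right = fold_constants(expr.right)
        if isinstance(left, Lit) and isinstance(right, Lit):
            # Both sides are known before execution, so compute the sum now.
            return Lit(left.value + right.value)
        return Add(left, right)
    # Literals and column references are already as simple as they can be.
    return expr


# x + (1 + 2) is rewritten to x + 3 before any data is touched.
plan = Add(Col("x"), Add(Lit(1), Lit(2)))
optimized = fold_constants(plan)
print(optimized)
```

The idea is only that the optimizer treats a query as a tree and applies rewrite rules to it before execution; a real optimizer has many more node types and rules than this sketch.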

[Figure: spark-components, a sketch of the Spark components and their dependencies]

 

The next “ARCH” posts will focus on the most important components of Spark Core.

