[SysDeg] Worksharing Framework and its design – Part 2: Communication method

After gaining a basic understanding of Spark and SparkSQL, I came back to my system. The high-level design remains the same as I described two months ago: it is a client-server model, but the server has changed from the Spark server to the SparkSQL server. I spent roughly two weeks on some coding …


[Arch] SparkSQL Internals – Part 2: SparkSQL Data Flow

The SparkSQL paper provides a very nice figure of the SparkSQL data flow. I have more than a year of experience with Apache Pig, so I realized it would be better to put them all together. I created a new figure that includes the data flow of Hive, Pig, and SparkSQL. I know a little …

[Arch] SparkContext and its components

When you work with Spark or read documents about it, you will definitely encounter SparkContext, which lives inside the driver on the client side. It confused me and made me curious when I first heard about it, so I decided to dig into it. To summarize it in a few words, I would say that SparkContext, in general, is …

My internship and some documents on Apache Spark

Six months ago, I heard what I would be doing in my internship. Well, to be precise, it was right after I finished my summer internship. The task is to design and build a worksharing framework (scan, computation) for Pig queries on Hadoop MapReduce, focusing mostly on the GROUPING SETS operation. Six months later, my internship still …

Some experiences on building my own Pig

Pig is a high-level platform for creating MapReduce programs used with Hadoop. The language for this platform is called Pig Latin. Pig Latin abstracts the programming from the Java MapReduce idiom into a notation which makes MapReduce programming high level, similar to that of SQL for RDBMS systems. Pig Latin can be extended using UDF …