[SysDeg] Worksharing Framework and its design: Some modifications

After a lot of discussion, we decided to change the design of the system slightly so that it is more general and extensible. The new design is shown in the figure above. The WorkSharing Detector remains the same as in the old design. Its goal is to generate bags of DAGs which are labeled with …

[SysDeg] Worksharing Framework and its design – Part 3: Prototype for the first version

Long time no see! After a month of playing with caching in Spark, I learned many valuable lessons (which I will cover in other blog posts about Spark's Cache Manager and Block Manager). Our team came back to the design of the system: the SparkSQL server. To be honest, I spent too much time …

[SysDeg] Worksharing Framework and its design – Part 2: Communication method

After gaining a basic understanding of Spark and SparkSQL, I came back to my system. The high-level design of the system remains the same as I described two months ago. It is a client-server model, but the server has changed from the Spark server to the SparkSQL server. I spent roughly two weeks on some coding …

[Arch] SparkSQL Internals – Part 2: SparkSQL Data Flow

The SparkSQL paper provides a very nice figure of the SparkSQL data flow. I have more than a year of experience with Apache Pig, so I realized it would be better to put them all together. I created a new figure that includes the data flows of Hive, Pig, and SparkSQL. I know a little …

[SysDeg] Worksharing Framework and its design – Part 1

As I mentioned in the previous post, my internship will mostly focus on designing and implementing a worksharing framework for GROUPING SETS on Apache Spark. In this post, I will briefly discuss the concept of the framework and its design. Actually, many sharing (scan, computation) frameworks have been proposed both in traditional database systems and in …