I am a PhD student working in the HPCS Lab, advised by Xiaodong Zhang.
You can contact me at email@example.com
YSmart is a correlation-aware SQL-to-MapReduce translator. It automatically generates
MapReduce programs for a SQL query. Its optimizer tries to generate fewer MapReduce jobs by
eliminating unnecessary data shuffling.
The core optimizer of YSmart has been integrated into
Apache Hive, a data warehouse system for Hadoop.
This Hive optimizer is called the Correlation Optimizer.
Understanding Insights into the Basic Structure and Essential Issues of Table Placement Methods in Clusters,
To be presented at VLDB 2014.
Accelerating Pathology Image Data Cross Comparison on CPU-GPU Hybrid Systems,
Cathy H. Xia,
DOT: A Matrix Model for Analyzing, Optimizing and Deploying
Software for Big Data Analytics in Distributed Systems,
YSmart: Yet another SQL-to-MapReduce Translator,
ICDCS, 2011, Best Paper Award.
RCFile: A Fast and Space-efficient Data Placement Structure in MapReduce-based Warehouse Systems,
- SoCC 2011 Conference Talk, October 2011.
I am a committer of Apache Hive. (My Hive Jiras)