Jiang Lin, Qingda Lu, Xiaoning Ding, Zhao Zhang, Xiaodong Zhang, and P. Sadayappan, "Gaining Insights into Multi-Core Cache Partitioning: Bridging the Gap between Simulation and Real Systems", 14th International Symposium on High-Performance Computer Architecture (HPCA), 2008.
Qingda Lu, Sriram Krishnamoorthy, P. Sadayappan, "Combining Analytical and Empirical Approaches in Tuning Matrix Transposition", 15th International Conference on Parallel Architectures and Compilation Techniques (PACT), 2006.
Albert Hartono, Qingda Lu, Xiaoyang Gao, Sriram Krishnamoorthy, Marcel Nooijen, Gerald Baumgartner, David E. Bernholdt, Venkatesh Choppella, Russell M. Pitzer, J. Ramanujam, Atanas Rountev, P. Sadayappan, "Identifying Cost-Effective Common Subexpressions to Reduce Operation Count in Tensor Contraction Evaluations", International Conference on Computational Science (ICCS), 2006.
Xiaoyang Gao, Swarup Kumar Sahoo, Qingda Lu, Gerald Baumgartner, Chi-Chung Lam, J. Ramanujam, P. Sadayappan, "Performance Modeling and Optimization of Parallel Out-of-Core Tensor Contractions", ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP), 2005.
G. Baumgartner, A. Auer, D.E. Bernholdt, A. Bibireata, V. Choppella, D. Cociorva, X. Gao, R.J. Harrison, S. Hirata, S. Krishnamoorthy, S. Krishnan, C. Lam, Q. Lu, M. Nooijen, R.M. Pitzer, J. Ramanujam, P. Sadayappan and A. Sibiryakov, "Synthesis of High-Performance Parallel Programs for a Class of Ab Initio Quantum Chemistry Models", invited paper for special issue of Proceedings of the IEEE, January 2005.
Qingda Lu, Xiaoyang Gao, Sriram Krishnamoorthy, Gerald Baumgartner, J. Ramanujam, P. Sadayappan, "Empirical Performance-Model Driven Data Layout Optimization", Proceedings of the 17th International Workshop on Languages and Compilers for Parallel Computing (LCPC), September 2004.
Qingda Lu, Jiesheng Wu, Dhabaleswar K. Panda and P. Sadayappan, "Applying MPI Derived Datatypes to the NAS Benchmarks: A Case Study", 3rd Workshop on Compile and Runtime Techniques for Parallel Computing (CRTPC-3), held in conjunction with the 2004 International Conference on Parallel Processing (ICPP-2004), August 2004.