-
- 1
-
N. Ahmed.
Locality Enhancement for Imperfectly Nested Loops.
PhD thesis, Cornell Computer Science Department, 2000.
- 2
-
N. Ahmed, N. Mateev, and K. Pingali.
Synthesizing transformations for locality enhancement of imperfectly
nested loops.
In Proc. of ACM Intl. Conf. on Supercomputing, 2000.
- 3
-
G. Almasi, C. Cascaval, and D. A. Padua.
Calculating stack distances efficiently.
In Proceedings of the workshop on Memory system performance,
pages 37-43. ACM Press, 2002.
- 4
-
J. M. Anderson, S. P. Amarasinghe, and M. S. Lam.
Data and Computation Transformations for Multiprocessors.
In Proc. of the Fifth ACM SIGPLAN Symposium on Principles and
Practice of Parallel Programming, July 1995.
- 5
-
G. Baumgartner, D. Bernholdt, D. Cociorva, R. Harrison, S. Hirata, C. Lam,
M. Nooijen, R. Pitzer, J. Ramanujam, and P. Sadayappan.
A High-Level Approach to Synthesis of High-Performance Codes for
Quantum Chemistry.
In Proc. of Supercomputing 2002, November 2002.
- 6
-
C. Cascaval and D. A. Padua.
Estimating cache misses and locality using stack distances.
In Proceedings of the 17th annual international conference on
Supercomputing, pages 150-159. ACM Press, 2003.
- 7
-
S. Chatterjee, E. Parker, P. J. Hanlon, and A. R. Lebeck.
Exact analysis of the cache behavior of nested loops.
In Proceedings of the ACM SIGPLAN 2001 conference on Programming
language design and implementation, pages 286-297. ACM Press, 2001.
- 8
-
D. Cociorva, G. Baumgartner, C. Lam, P. Sadayappan, J. Ramanujam, M. Nooijen,
D. Bernholdt, and R. Harrison.
Space-Time Trade-Off Optimization for a Class of Electronic
Structure Calculations.
In Proc. of ACM SIGPLAN 2002 Conference on Programming Language
Design and Implementation (PLDI), pages 177-186, 2002.
- 9
-
D. Cociorva, X. Gao, S. Krishnan, G. Baumgartner, C. Lam, P. Sadayappan, and
J. Ramanujam.
Global Communication Optimization for Tensor Contraction Expressions
under Memory Constraints.
In Proc. of Seventeenth International Parallel and Distributed
Processing Symposium (IPDPS), 2003.
- 10
-
D. Cociorva, J. Wilkins, G. Baumgartner, P. Sadayappan, J. Ramanujam,
M. Nooijen, D. E. Bernholdt, and R. Harrison.
Towards Automatic Synthesis of High-Performance Codes for Electronic
Structure Calculations: Data Locality Optimization.
In Proc. of the Intl. Conf. on High Performance Computing,
volume 2228, pages 237-248. Springer-Verlag, 2001.
- 11
-
D. Cociorva, J. Wilkins, C. Lam, G. Baumgartner, P. Sadayappan, and
J. Ramanujam.
Loop optimization for a class of memory-constrained computations.
In Proc. of the Fifteenth ACM International Conference on
Supercomputing (ICS'01), pages 500-509, 2001.
- 12
-
S. Ghosh, M. Martonosi, and S. Malik.
Precise miss analysis for program transformations with caches of
arbitrary associativity.
In Proceedings of the eighth international conference on
Architectural support for programming languages and operating systems, pages
228-239. ACM Press, 1998.
- 13
-
S. Ghosh, M. Martonosi, and S. Malik.
Cache miss equations: a compiler framework for analyzing and tuning
memory behavior.
ACM Trans. Program. Lang. Syst., 21(4):703-746, 1999.
- 14
-
I. Kodukula, N. Ahmed, and K. Pingali.
Data-centric multi-level blocking.
In Proc. of SIGPLAN Conf. Programming Language Design and
Implementation, 1997.
- 15
-
C. Lam.
Performance Optimization of a Class of Loops Implementing
Multi-Dimensional Integrals.
PhD thesis, The Ohio State University, Columbus, OH, August 1999.
- 16
-
C. Lam, D. Cociorva, G. Baumgartner, and P. Sadayappan.
Memory-optimal evaluation of expression trees involving large
objects.
In Proc. of Intl. Conf. on High Perf. Comp., 1999.
- 17
-
C. Lam, D. Cociorva, G. Baumgartner, and P. Sadayappan.
Optimization of Memory Usage and Communication Requirements for a
Class of Loops Implementing Multi-Dimensional Integrals.
In Proc. of Twelfth LCPC Workshop, 1999.
- 18
-
C. Lam, P. Sadayappan, and R. Wenger.
On Optimizing a Class of Multi-Dimensional Loops with Reductions for
Parallel Execution.
Parallel Processing Letters, 7(2):157-168, 1997.
- 19
-
C. Lam, P. Sadayappan, and R. Wenger.
Optimization of a Class of Multi-Dimensional Integrals on Parallel
Machines.
In Proc. of Eighth SIAM Conf. on Parallel Processing for
Scientific Computing, 1997.
- 20
-
A. W. Lim, S.-W. Liao, and M. S. Lam.
Blocking and array contraction across arbitrarily nested loops using
affine partitioning.
In Proc. of the Eighth ACM SIGPLAN Symposium on Principles and
Practices of Parallel Programming, pages 103-112. ACM Press, 2001.
- 21
-
K. S. McKinley, S. Carr, and C.-W. Tseng.
Improving Data Locality with Loop Transformations.
ACM TOPLAS, 18(4):424-453, July 1996.
rajkiran panuganti
2005-05-12