
This document lists the titles of selected technical reports (published or to be published) of the High Performance Computing and Software Laboratory (since 1994) with links to the corresponding abstracts. The heading of each abstract includes a link to download the actual technical report.
I/O Management and Data-intensive Computing
Processor Caches and DRAM Memory Systems
Multimedia Streaming and Other Applications in Internet Systems
Performance and Reliability of Internet Systems
P2P and Overlay Networks
Wireless and Sensor Networks
Cluster Systems and Computing
Parallel Systems and Computing
I/O Management and Data-intensive Computing
``Accelerating pathology image data cross-comparison on CPU-GPU hybrid systems",
Proceedings of 38th ACM International Conference on Very Large Databases
(VLDB 2012), Istanbul, Turkey, August 27-31, 2012.
``hStorage-DB: heterogeneity-aware data management to exploit full
capacity of hybrid storage systems",
Proceedings of 38th ACM International Conference on Very Large Databases
(VLDB 2012), Istanbul, Turkey, August 27-31, 2012.
``DOT: a matrix model for analyzing, optimizing and deploying software
for big data analytics in distributed systems",
Proceedings of 2nd ACM Symposium on Cloud Computing
(SOCC 2011), Cascais, Portugal, October 27-28, 2011.
``YSmart: Yet another SQL-to-MapReduce Translator",
Proceedings of 31st International Conference on Distributed Computing
Systems (ICDCS 2011), Minneapolis, Minnesota, June 20-24, 2011.
Best Paper Award.
YSmart is used both as independent open source software
and as a component in large production systems.
``Hystor: Making the Best Use of Solid State Drives in High Performance
Storage Systems",
Proceedings of 25th ACM International Conference on Supercomputing
(ICS 2011), Tucson, Arizona, May 31 - June 4, 2011.
Best Paper Award.
``CAFTL: a content-aware flash translation layer enhancing the lifespan
of flash memory based solid state drives",
Proceedings of 9th USENIX Conference on File and Storage
Technologies
(FAST'11), San Jose, California,
February 15-17, 2011.
``Essential roles of exploiting internal parallelism of flash memory based
solid state drives in high-speed data processing",
Proceedings of 17th International Symposium on High Performance
Computer Architecture
(HPCA-17), San Antonio, Texas,
February 12-16, 2011.
``RCFile: a fast and space-efficient data placement structure in
MapReduce-based Warehouse systems",
Proceedings of International Conference on Data Engineering
(ICDE 2011), Hannover, Germany,
April 11-16, 2011.
RCFile has been widely used for
big data analytics in distributed systems.
``PS-BC: power-saving considerations in design of buffer caches serving
heterogeneous storage devices",
Proceedings of 16th ACM International Symposium on Low Power
Electronics and Design (ISLPED 2010), Austin, Texas,
August 18-20, 2010.
``Understanding
intrinsic characteristics and system implications of
flash memory based solid state drives",
Proceedings of 2009 ACM SIGMETRICS Conference on Measurement and Modeling
of Computer Systems
(SIGMETRICS/Performance 2009), Seattle, WA, June 15-19, 2009.
``BP-Wrapper: a system framework making any replacement algorithms (almost)
lock contention free",
Proceedings of 25th International Conference on Data Engineering
(ICDE'09), Shanghai, China, March 29 - April 4, 2009.
``Caching for Bursts (C-Burst): let hard disks sleep well and work
energetically",
Proceedings of 13th ACM/IEEE International Symposium on Low Power
Electronics and Design (ISLPED'08), Bangalore, India, August 11-13,
2008.
``Cost-aware caching algorithms for distributed storage servers",
Proceedings of the 21st International Symposium on Distributed Computing
(DISC'07),
Lemesos, Cyprus, September 24-26, 2007.
``STEP: Sequentiality and Thrashing Detection based Prefetching to improve
performance of networked storage servers",
Proceedings of the 27th International Conference on Distributed Computing
Systems (ICDCS'07),
Toronto, Canada, June 25-29, 2007.
``DiskSeen: exploiting disk layout and access history to enhance I/O prefetch",
Proceedings of 2007 USENIX Annual Technical Conference (USENIX'07),
Santa Clara, California, June 17-22, 2007.
``Coordinated multilevel buffer cache management with consistent access
locality quantification",
IEEE Transactions on Computers, Vol. 56,
No. 1, 2007.
``SmartSaver: turning flash drive into a disk energy saver for mobile
computers",
Proceedings of 11th ACM/IEEE International Symposium on Low Power
Electronics and Design (ISLPED'06), Tegernsee, Germany, October 4-6,
2006.
``A locality-aware cooperative cache management protocol to improve
network file system performance",
Proceedings of the 26th International Conference on Distributed
Computing Systems (ICDCS'06), Lisbon, Portugal, July 4-7, 2006.
``DULO: an effective buffer cache management scheme to exploit both temporal and spatial localities",
Proceedings of the 4th USENIX Conference on File and Storage Technologies (FAST'05),
San Francisco, CA, December 14-16, 2005.
``CLOCK-Pro: an effective improvement of the CLOCK replacement",
Proceedings of 2005 USENIX Annual Technical Conference (USENIX'05),
Anaheim, CA, April 10-15, 2005.
CLOCK-Pro has been adopted in NetBSD and patched in the
Linux kernel.
``Making LRU friendly to weak locality workloads: a novel replacement algorithm to improve
buffer cache performance",
IEEE Transactions on Computers, Vol. 54, No. 8,
2005.
``Token-ordered LRU: an effective page replacement policy and its
implementation in Linux systems",
Performance Evaluation, Vol. 60, Issue 1-4, 2005.
The token algorithm has been adopted in the Linux kernel.
``ULC: A file block placement and replacement protocol to effectively
exploit hierarchical locality in multi-level buffer caches",
Proceedings of the 24th International Conference
on Distributed
Computing Systems, (ICDCS'04), Tokyo, Japan, March 23-26, 2004.
``Efficient Distributed Disk Caching in Data Grid Management",
Proceedings of IEEE International Conference on Cluster Computing,
(Cluster'03), December 1-4, 2003.
``LIRS: an efficient low inter-reference
recency set replacement to improve buffer cache performance",
Proceedings of the 2002 ACM SIGMETRICS Conference on Measurement and
Modeling of Computer Systems,
(SIGMETRICS'02), Marina Del Rey, California, June 15-19, 2002.
The LIRS caching algorithm has been adopted in MySQL databases.
``TPF: a system thrashing protection facility",
Software: Practice and Experience, Vol. 32, Issue 3, 2002.
``Adaptive page replacement to protect thrashing in Linux",
Proceedings of the 5th USENIX Annual Linux Showcase and Conference,
(ALS'01), Oakland, California, November 5-10, 2001.
Processor Caches and DRAM Memory Systems
``BWS: Balanced Work Stealing for time-sharing multicores",
Proceedings of ACM EuroSys'12,
Bern, Switzerland,
April 10-13, 2012.
``SRM-Buffer: An OS Buffer Management Technique to Prevent Last Level
Caches from Thrashing in Multicores",
Proceedings of ACM EuroSys'11,
Salzburg, Austria,
April 10-13, 2011.
``ULCC: a user-level facility for optimizing shared cache performance
on multicores",
Proceedings of 16th ACM SIGPLAN Annual Symposium on Principles and
Practice of Parallel Programming
(PPoPP 2011), San Antonio, Texas,
February 12-16, 2011.
``Enabling software management for multicore caches with a lightweight
hardware support",
Proceedings of 22nd ACM/IEEE Annual Conference on Supercomputing
(SC09), Portland, Oregon, November 14-20, 2009.
``Soft-OLP: improving hardware cache performance through software-controlled
object-level partitioning",
Proceedings of 18th International Conference on Parallel Architectures and
Compilation Techniques
(PACT 2009), Raleigh, North Carolina, September 12-16, 2009.
``MCC-DB: minimizing cache conflicts in multi-core processors for
databases",
Proceedings of 35th International Conference on Very Large Data
Bases
(VLDB 2009), Lyon, France, August 24-28, 2009.
``Gaining insights into multicore cache partitioning: bridging the gap
between simulation and real systems",
Proceedings
of the 14th International Symposium on High Performance Computer
Architecture
(HPCA'08), Salt Lake City, Utah, February 16-20, 2008.
The cache partitioning method has been
adopted by Intel.
``MESA: reducing cache conflicts by integrating static and run-time
methods",
Proceedings of International Symposium on Performance Analysis of
Systems and Software (ISPASS-2006), Austin, Texas, March 19-21, 2006.
``Look-ahead architecture adaptation to reduce processor power
consumption",
IEEE Micro,
Vol. 25, No. 4, July/August, 2005.
``Design and optimization of large size and low overhead off-chip caches",
IEEE Transactions on Computers,
Vol. 53, No. 7, 2004.
``Access-mode predictions for low-power cache design",
IEEE Micro,
Vol. 22, No. 2, March/April, 2002.
``Fine-grain priority scheduling on multi-channel memory systems",
Proceedings of the 8th International Symposium on High
Performance Computer Architecture, (HPCA-8),
Cambridge, Massachusetts, February 2-6, 2002.
``Breaking address mapping symmetry at multi-level of memory hierarchy to
reduce DRAM row-buffer conflicts",
Journal of Instruction-Level Parallelism,
Vol. 3, 2001.
``Cached DRAM for ILP processor memory
access latency reduction",
IEEE Micro, Vol. 21, No. 4, July/August, 2001.
``Fast bit-reversals on uniprocessors and shared-memory multiprocessors",
SIAM Journal on Scientific Computing, Vol. 22, No. 6, 2001.
``A permutation-based page interleaving scheme to reduce
row-buffer conflicts and exploit data locality",
Proceedings of the 33rd Annual International Symposium
on Microarchitecture, (Micro-33), Monterey, California,
December 10-13, 2000.
The permutation technique has been adopted in the memory controller in the
Sun Microsystems UltraSPARC IIIi processor.
``Improving memory performance of sorting algorithms",
ACM Journal on Experimental Algorithmics, Vol. 5, 2000.
``Cacheminer: a runtime approach to exploit cache locality
on SMP",
IEEE Transactions on Parallel and Distributed Systems,
Vol. 11, No. 4, 2000.
``Cache-optimal methods for bit-reversals",
Proceedings of Supercomputing'99, (SC'99),
November, Portland, Oregon, 1999.
``A memory-layout oriented run-time technique for locality optimization",
Proceedings of 1998 International Conference on Parallel Processing,
(ICPP'98), August 1998.
Exploiting Cache Locality on Symmetric Multiprocessors: A Run-Time
Approach,
Ph.D. Dissertation, College of William and Mary,
May 1998.
``Two fast and high-associativity cache schemes",
IEEE Micro, October, 1997.
Multimedia Streaming and Other Applications in Internet Systems
``Spam behavior analysis and detection in user generated content on
social networks",
Proceedings of 32nd International Conference on Distributed Computing
Systems (ICDCS 2012), Macau, China, June 18-21, 2012.
``CUBS: coordinated upload bandwidth sharing in residential networks",
Proceedings of 17th International Conference on Network Protocols
(ICNP 2009), Princeton, NJ, October 13-16, 2009.
``Analyzing patterns of user content generation in online social networks",
Proceedings of 15th ACM SIGKDD Conference on Knowledge Discovery and Data
Mining
(KDD-09), Paris, France, June 28 - July 1, 2009.
``The stretched exponential distribution of Internet media access patterns",
Proceedings of 27th ACM Symposium on Principles of Distributed Computing
(PODC 2008), Toronto, Canada, August 18-21,
2008.
``SProxy: a caching infrastructure to support Internet streaming",
IEEE Transactions on Multimedia, Vol. 9,
No. 5, 2007.
``Does Internet media traffic really follow Zipf-like distribution?",
(an extended abstract),
Proceedings of 2007 ACM International Conference on Measurement and
Modeling of Computer Systems (SIGMETRICS'07),
San Diego, California, June 12-16, 2007.
``Delving into Internet streaming media delivery: a quality and resource
utilization perspective",
Proceedings of ACM SIGCOMM Internet Measurement Conference (IMC'06),
Rio de Janeiro, Brazil, October 25-27, 2006.
``Segment-based streaming media proxy: modeling and optimization",
IEEE Transactions on Multimedia, Vol. 8,
No. 2, 2006.
``Design and evaluation of a scalable and reliable P2P assisted proxy
for on-demand streaming media delivery",
IEEE Transactions on Knowledge and Data Engineering, Vol. 18,
No. 5, 2006.
``Fast proxy delivery of multiple streaming sessions in shared running buffers",
IEEE Transactions on Multimedia,
Vol. 7, No. 6, December 2005.
``Segment-based proxy caching for Internet streaming media delivery",
IEEE Multimedia, Vol. 12, No. 3,
July-September, 2005.
``Analysis of multimedia workloads with implications for internet streaming",
Proceedings of the 14th International World Wide Web Conference,
(WWW'2005),
Chiba, Japan, May 10-14, 2005.
``DISC:
Dynamic Interleaved Segment Caching for interactive streaming accesses",
Proceedings of the 25th International Conference
on Distributed
Computing Systems, (ICDCS'2005),
Columbus, Ohio, June 6-9, 2005.
``PROP: a scalable and reliable P2P assisted streaming proxy system",
Proceedings of the 24th International Conference
on Distributed
Computing Systems, (ICDCS'04), Tokyo, Japan, March 23-26, 2004.
``SRB: Shared Running Buffers in proxy to exploit memory locality of
multiple streaming media sessions",
Proceedings of the 24th International Conference
on Distributed
Computing Systems, (ICDCS'04), Tokyo, Japan, March 23-26, 2004.
``Designs of high quality streaming proxy systems",
Proceedings of IEEE INFOCOM'04, Hong Kong, March 7-11, 2004.
``Investigating performance insights of segment-based proxy caching of
streaming media strategies",
Proceedings of ACM International Conference on Multimedia Computing and
Networking (MMCN'04),
January 21-22, 2004.
``Adaptive and lazy segmentation based proxy caching for streaming media
delivery",
Proceedings of 13th
ACM International Workshop on Network and Operating Systems Support for
Digital Audio and Video, (NOSSDAV'03),
Monterey, California, USA, June 1-3, 2003.
Performance and Reliability of Internet Systems
``Splitter: a proxy-based approach for post-migration testing of Web
applications",
Proceedings of ACM EuroSys 2010, Paris, France,
April 13-16, 2010.
``Maintaining strong cache consistency for the Domain Name System",
IEEE Transactions on Knowledge and Data Engineering, Vol. 19,
No. 8, 2007.
``DNScup: a strong cache consistency protocol for DNS",
Proceedings of the 26th International Conference on Distributed
Computing Systems (ICDCS'06), Lisbon, Portugal, July 4-7, 2006.
``Coordinated data prefetching for Web contents",
Computer Communications,
Vol. 28, Issue 17, October 2005.
``Enforcing direct communications between clients and Web servers to
improve proxy performance and security",
Software: Practice and Experience,
Vol. 34, Issue 12, October 2004.
``Strong cache consistency support for domain name system", a poster
presentation in SIGCOMM'04, Portland, Oregon,
August 31 - September 3, 2004.
``Accurately modeling workload interactions for deploying prefetching in
Web servers",
Proceedings of 2003 International Conference on Parallel Processing,
(ICPP'03),
Kaohsiung, Taiwan, China, October 6-9, 2003.
``On scalable and locality aware Web file sharing",
Journal of Parallel and Distributed Computing,
Vol. 63, No. 10, 2003.
``A popularity-based prediction model for Web prefetching",
IEEE Computer,
Vol. 36, No. 3, March, 2003.
``Detective browsers: a software technique to improve Web access performance
and security",
Proceedings of the 7th International Workshop on
Web Content Caching and Distribution, (WCW'02), Boulder, Colorado,
August 14-16, 2002.
``Coordinated data prefetching by utilizing reference information
at both proxy and Web servers",
Proceedings of the ACM Workshop on Performance and Architecture
of Web Servers,
(PAWS-2001), Boston, Massachusetts, June 16-17, 2001.
``Exploiting neglected data locality in browsers",
Proceedings of the 10th International World Wide Web Conference,
(WWW10), Hong Kong, May 1-5, 2001, (an extended abstract).
P2P and Overlay Networks
``TopBT: a topology-aware and infrastructure-independent BitTorrent client",
Proceedings of INFOCOM'10, San Diego, California,
March 15-19, 2010.
TopBT is open source software.
``A Performance Study of BitTorrent-like Peer-to-Peer
Systems",
IEEE Journal on Selected Areas in Communications, Vol. 25,
No. 1, 2007.
``ASAP: an AS-Aware Peer-relay protocol for high quality VoIP",
Proceedings of the 26th International Conference on Distributed
Computing Systems (ICDCS'06), Lisbon, Portugal, July 4-7, 2006.
``Measurement, analysis, and modeling of BitTorrent-like systems",
Proceedings of ACM SIGCOMM Internet Measurement Conference (IMC'05),
New Orleans, LA, October 19-21, 2005.
``Fast and low-cost search schemes by exploiting localities in P2P
networks",
Journal of Parallel and Distributed Computing,
Vol. 65, Issue 6, 2005.
``Locality awareness in unstructured peer-to-peer systems",
IEEE Transactions on Parallel and Distributed Systems,
Vol. 16, No. 2, February 2005.
``SCOPE: scalable consistency maintenance in structured P2P systems",
Proceedings of IEEE INFOCOM 2005 Conference, Miami, Florida,
March 13-17, 2005.
``Exploiting content localities for efficient search in P2P systems",
Proceedings of the 18th International Symposium on Distributed
Computing (DISC 2004),
Amsterdam, Netherlands, October 4 - 8, 2004.
``Building a large and efficient hybrid peer-to-peer Internet caching
system",
IEEE Transactions on Knowledge and Data Engineering,
Vol. 16, No. 6, 2004.
``SAT-Match: a self-adaptive topology matching method to achieve low
lookup latency in structured P2P overlay networks",
Proceedings of the 18th International Parallel and
Distributed Processing Symposium (IPDPS'04),
Santa Fe, New Mexico,
April 26-30,
2004.
``Locality-aware topology matching in P2P systems",
Proceedings of IEEE INFOCOM'04, Hong Kong, March 7-11, 2004.
``Low cost and reliable mutual anonymity protocols in peer-to-peer
networks",
IEEE Transactions on Parallel and Distributed Systems,
Vol. 14, No. 9, 2003.
``LightFlood: an efficient flooding scheme for file search in unstructured
peer-to-peer systems",
Proceedings of 2003 International Conference on Parallel Processing,
(ICPP'03),
Kaohsiung, Taiwan, China, October 6-9, 2003.
``Mutual anonymity protocols for hybrid peer-to-peer systems",
Proceedings of 23rd
International Conference on Distributed Computing Systems,
(ICDCS'03), Providence, Rhode Island, May 19-22, 2003.
``On reliable and scalable peer-to-peer web document sharing",
Proceedings of 2002
International Parallel and Distributed Processing Symposium,
(IPDPS'02), Fort Lauderdale, Florida, April 15-19, 2002.
Wireless and Sensor Networks
``PSM-Throttling: minimizing energy consumption for bulk data communications
in WLANs",
Proceedings of the 15th International Conference on Network Protocols,
(ICNP'07),
Beijing, China, October 16-19, 2007.
``SCAP: Smart Caching in wireless Access Points to improve P2P streaming",
Proceedings of the 27th International Conference on Distributed Computing
Systems (ICDCS'07),
Toronto, Canada, June 25-29, 2007.
``Design and Analysis of Sensing Scheduling Algorithms under Partial
Coverage for Object Detection in Sensor Networks",
IEEE Transactions on Parallel and Distributed Systems, Vol. 18,
No. 3, 2007.
``Cooperative Relay Service in a Wireless LAN",
IEEE Journal on Selected Areas in Communications, Vol. 25,
No. 2, 2007.
``Exploiting idle communication power to improve wireless network performance
and energy efficiency",
Proceedings of INFOCOM'06, Barcelona, Spain,
April 23-29, 2006.
``Design and analysis of wave sensing scheduling protocols for
object-tracking applications",
Proceedings of the First International Conference on Distributed
Computing in Sensor Systems (DCOSS '05),
Marina del Rey, California, June 30 - July 1, 2005.
``Analyzing object detection quality under probabilistic coverage in sensor
networks",
Proceedings of the 13th International Workshop on Quality of Service,
(IWQoS'05),
Passau, Germany, June 21 - 23, 2005.
``A study on object tracking quality under probabilistic coverage in
sensor networks", a poster
presentation in MobiCom'04, Philadelphia, Pennsylvania,
September 26 to October 1, 2004; an extended abstract published in
ACM Mobile Computing and Communication Review (MC2R), Vol. 9, No. 1,
pp 73-76, January 2005.
Cluster Systems and Computing
``Automatic software fault diagnosis by exploiting application signatures",
Proceedings of 22nd USENIX Conference on Large Installation System Administration
(LISA'08), San Diego, California, November 9-14,
2008.
(Best Paper Award).
``Adaptive memory allocations in clusters to handle unexpectedly large
data-intensive jobs",
IEEE Transactions on Parallel and Distributed Systems,
Vol. 15, No. 7, 2004.
``Auto-CFD: efficiently
parallelizing CFD applications on clusters",
Proceedings of IEEE International Conference on Cluster Computing,
(Cluster'03), December 1-4, 2003.
``Adaptive and virtual reconfigurations for dynamic job scheduling in clusters",
Proceedings of 22nd
International Conference on Distributed Computing Systems,
(ICDCS'02), Vienna, Austria, July 2-5, 2002.
``Dynamic cluster resource allocations for jobs with known and unknown
memory demands",
IEEE Transactions on Parallel and Distributed Systems, Vol. 13,
No. 3, 2002.
``Dynamic load sharing with unknown memory demands in clusters",
Proceedings of the 21st International Conference on Distributed Computing
Systems, (ICDCS'2001), Phoenix, Arizona, April 16-19, 2001.
``Architectural effects of symmetric multiprocessors on TPC-C commercial
workload",
Journal of Parallel and Distributed Computing, Vol. 61, 2001.
``Memory hierarchy considerations for cost-effective cluster computing",
IEEE Transactions on Computers, Vol. 49, No. 9, 2000.
``Incorporating job
migration and network RAM to share memory resources",
Proceedings of the 9th IEEE International Symposium on
High Performance Distributed Computing, (HPDC-9), Pittsburgh,
Pennsylvania, August 1-4, 2000.
``Effective Load Sharing on Heterogeneous Networks of Workstations",
Proceedings of the 2000 International Parallel and Distributed
Processing Symposium, (IPDPS'2000), Cancun, Mexico, May 1-5, 2000.
``Improving distributed workload
performance by sharing both CPU and memory resources",
Proceedings of the 20th International Conference on
Distributed Computing
Systems, (ICDCS'2000), Taipei, Taiwan, April 10-13, 2000.
``Analysis of commercial workload on SMP multiprocessors",
Proceedings of Performance'99,
August, 1999.
``Profit-effective parallel computing",
IEEE Concurrency, Vol. 7, No. 2, 1999.
``The impact of memory hierarchies
on cluster computing",
Proceedings of 13th
International Parallel Processing
Symposium & 10th Symposium on Parallel and Distributed Processing,
(Second Merged Symposium IPPS/SPDP'99),
April, 1999.
``Engineering workstations",
Encyclopedia of Electrical and Electronics Engineering,
John Wiley & Sons, Publishers, February, 1999.
``Performance models and
simulation", Chapter 6,
High Performance Cluster Computing, Volume 1,
edited by R. Buyya, Prentice Hall, New Jersey, 1999.
``Comparative evaluation and case studies of shared-memory and
data-parallel execution patterns",
Scientific Programming,
Vol. 7, No. 1, 1999.
``Characterizing and scheduling
communication tasks of parallel and sequential jobs on
networks of workstations",
Computer Communications, Vol. 21, Issue 5, 1998.
``An Integrated Approach of Performance Prediction on
Networks of Workstations",
Chapter 4,
Advanced Computer System Design,
K. Bagchi, J. Walrand and
G. Zobrist, Eds., Gordon and Breach Publishers, 1998.
``A comparative evaluation of hierarchical network architecture
of the HP-Convex Exemplar",
Proceedings of ICCD'97.
``Coordinating parallel processes on networks of workstations",
Journal of Parallel and Distributed Computing, Vol. 46, No. 2,
1997.
``Effectively scheduling parallel tasks and communications
on networks of workstations",
Proceedings of Euro-Par'97.
``Simulation of heterogeneous networks of workstations",
Proceedings of MASCOTS'96, IEEE Computer Society Press,
February, 1996.
``Software
support for asynchronous computing across networks",
Proceedings of the 19th Annual International Computer Software
and Application Conference, IEEE Computer Society Press, August, 1995.
Parallel Systems and Computing
``Lock Bypassing: an efficient algorithm for concurrently accessing
priority heaps",
ACM Journal on Experimental Algorithmics,
Vol. 3, No. 3, 1998.
``Nova visualization for optimization of data-parallel programs",
Proceedings of Euro-Par'97.
``Distributed edge detection: issues and implementations",
IEEE Computational Science and Engineering, Spring Issue, 1997.
``Software
support for multiprocessor latency measurement and
evaluation",
IEEE Transactions on Software Engineering, Vol. 23, No. 1, 1997.
``Adaptively
scheduling parallel loops on
distributed shared-memory systems", IEEE Transactions on
Parallel and Distributed Systems, Vol. 8, No. 1, 1997.
``Semi-empirical multiprocessor performance predictions",
Journal of Parallel and Distributed Computing, Vol. 39, No. 1,
1996.
``An effective and practical performance prediction model for
parallel computing on non-dedicated heterogeneous NOW",
Journal of Parallel and Distributed Computing, Vol. 38, No. 1,
1996.
``An adaptive loop scheduling algorithm on shared-memory systems",
Proceedings of the 8th Symposium on Parallel and Distributed
Processing, IEEE Computer Society Press, October, 1996.
``Evaluating and designing
software mutual exclusion algorithms on shared-memory multiprocessors",
IEEE Parallel & Distributed Technology, Spring Issue, 1996.
``A fast token-chasing
mutual exclusion algorithm in arbitrary network topologies",
Journal of Parallel and Distributed Computing, Vol. 35, No. 2, 1996.
``Parallelizing FDTD Methods for
Solving Electromagnetic Scattering Problems",
Applications on Advanced Architecture Computers,
G. Astfalk, Ed., SIAM Press, 1996.
``Comparative modeling
and evaluation of CC-NUMA and COMA on hierarchical ring architectures",
IEEE Transactions on Parallel and Distributed Systems, Vol. 6,
No. 12, 1995.
``Modeling and characterizing parallel computing performance on
heterogeneous networks of workstations",
Proceedings of the 7th IEEE Symposium on
Parallel and Distributed Processing, IEEE Computer Society Press,
October, 1995.
``*Graph: a tool for visualizing communication and optimizing
layout in data-parallel programs",
Proceedings of the 1995 International Conference on
Parallel Processing, CRC Press, Vol. 2, August, 1995.
``Multiprocessor scalability predictions
through detailed program execution analysis",
Proceedings of the 9th ACM International Conference on
Supercomputing, ACM Press, July, 1995.
(Best Paper Award).
``Comparative performance
analysis and evaluation of hot
spots on network-based shared-memory architectures",
IEEE Transactions on Parallel and Distributed Systems, Vol. 6, No. 8,
1995.
``Parallelizing an oil
refining simulation: numerical methods,
implementations and experience", Parallel Computing, Vol. 21,
No. 4, 1995.
``Distributed image
edge detection methods and performance",
Proceedings of the Sixth IEEE Symposium on Parallel and
Distributed Processing,
IEEE Computer Society Press, October, 1994.
``Performance
predictions on implicit communication systems",
Proceedings of the Sixth IEEE Symposium on Parallel and
Distributed Processing,
IEEE Computer Society Press, October, 1994.
``Distributed computation of
electromagnetic scattering problems using finite-difference time-domain
decompositions", Proceedings of the Third IEEE International
Symposium on High-Performance Distributed Computing, IEEE Computer
Society Press,
August, 1994.
``Latency metric:
an experimental method for measuring
and evaluating parallel program and architecture scalability",
Journal of Parallel and Distributed Computing,
Vol. 22, No. 3, 1994.
``Measuring and analyzing
parallel computing scalability", Proceedings of the 1994
International Conference on Parallel Processing, CRC Press, Vol. II,
August, 1994.
``Comparative performance evaluation of spin-lock synchronization
on MIN-based and HR-based multiprocessors",
IEEE Parallel and Distributed Technology, Spring Issue, 1994.
``Computation and communication patterns of large-scale
image convolutions on parallel architectures", Proceedings of
the 8th International Parallel Processing Symposium,
IEEE Computer Society Press, April, 1994.
``Evaluation and
measurement of multiprocessor latency patterns",
Proceedings of
the 8th International Parallel Processing Symposium,
IEEE Computer Society Press, April, 1994.
Tutorial on
Multiprocessor Performance Measurement and Evaluation, IEEE Computer
Society Press, 1994.
``Triangular decomposition methods for solving reducible nonlinear
systems of equations", SIAM Journal on Optimization, Vol. 5,
No. 2, 1994.