- Student Information Sheet – please fill out this form, attach a recent recognizable photograph, and turn it in Wednesday, January 23rd.
- Class Auditor Permit – follow the instructions on this form if you want to officially audit the class.
This section lists the papers referenced in class. Some links may require you to log in with your UT EID when accessed from off campus.
First Lecture
Simultaneous Multithreading
- Burton Smith. “Architecture and applications of the HEP multiprocessor computer system,” Proc. SPIE, vol. 298 Real-Time Signal Processing IV, 1981, pp. 241-248.
- Mario Nemirovsky, Forrest Brewer, Roger C. Wood. DISC: Dynamic Instruction Stream Computer. MICRO'91, 1991.
- D.M. Tullsen, S.J. Eggers, H.M. Levy. Simultaneous Multithreading: Maximizing On-Chip Parallelism. Proceedings of ISCA-22, June 1995.
- Robert S. Chappell, et al. Simultaneous subordinate microthreading (SSMT). ISCA 26, 1999.
Out-of-Order and Superscalar
- Joseph A. Fisher. Very Long Instruction Word architectures and the ELI-512. ISCA 10, 1983.
- James E. Smith. Decoupled Access/Execute Computer Architectures. 1984. (revised journal version)
- Yale Patt, Wen-mei Hwu, and Michael Shebanow. HPS, a new microarchitecture: rationale and introduction. MICRO'85, 1985.
- Yale Patt, Stephen W. Melvin, Wen-mei Hwu, and Michael Shebanow. Critical issues regarding HPS, a high performance microarchitecture. MICRO'85, 1985.
Dynamic Instruction Scheduling
Runahead Execution
- James Dundas and Trevor Mudge. Improving data cache performance by pre-executing instructions under a cache miss. ICS 11, 1997.
- Onur Mutlu, Jared Stark, Chris Wilkerson, and Yale N. Patt. Runahead Execution: An Alternative to Very Large Instruction Windows for Out-of-order Processors. HPCA 9, 2003.
- Onur Mutlu, Hyesoon Kim, and Yale N. Patt. Techniques for Efficient Processing in Runahead Execution Engines. ISCA 32, 2005.
- Onur Mutlu, Hyesoon Kim, and Yale N. Patt. Address-Value Delta (AVD) Prediction: Increasing the Effectiveness of Runahead Execution by Exploiting Regular Memory Allocation Patterns. MICRO'05, 2005.
Trace Cache
- Stephen W. Melvin and Yale N. Patt. Performance benefits of large execution atomic units in dynamically scheduled machines. ICS 3, 1989.
- Alexander Peleg and Uri Weiser. Dynamic flow instruction cache memory organized around trace segments independent of virtual address line. U.S. Patent 5381533, 1994.
- Daniel H. Friendly, Sanjay J. Patel, and Yale N. Patt. Alternative Fetch and Issue Policies for the Trace Cache Fetch Mechanism. MICRO'97, 1997.
- Sanjay J. Patel, Marius Evers, and Yale N. Patt. Improving trace cache effectiveness with branch promotion and trace packing. ISCA 25, 1998.
- Eric Rotenberg, Steve Bennett, and Jim Smith. Trace Cache: a Low Latency Approach to High Bandwidth Instruction Fetching. MICRO'96, 1996.
- Eric Rotenberg, Quinn Jacobson, Yiannakis Sazeides, and Jim Smith. Trace processors. MICRO'97, 1997.
- Bryan Black, Bohuslav Rychlik, and John Paul Shen. The block-based trace cache. ISCA 26, 1999.
Post-Decode Cache
Block-Structured ISA
Superblocks and Hyperblocks
Virtual Machines
- Wayne T. Wilner. “Design of Burroughs B1700.” AFIPS Conf. Proc., Vol. 41, Part 1, 1972.
Branch Prediction
- J.K.F. Lee, Alan J. Smith. Branch Prediction Strategies and Branch Target Buffer Design. IEEE Computer Vol. 17, Iss. 1, 1984.
- Tse-Yu Yeh and Yale Patt. Two-Level Adaptive Training Branch Prediction. MICRO 24, 1991.
- Shien-Tai Pan, Kimming So, Joseph T. Rahmeh. Improving the accuracy of dynamic branch prediction using branch correlation. ASPLOS-V, 1992.
- Scott McFarling. Combining Branch Predictors. WRL Technical Note TN-36, DEC, 1993.
- Eric Sprangle, et al. The Agree Predictor: A Mechanism For Reducing Negative Branch History Interference. ISCA 24, 1997.
- Daniel A. Jiménez and Calvin Lin. Dynamic Branch Prediction with Perceptrons. HPCA 7, 2001.
- Andre Seznec. Analysis of the OGEHL predictor. ISCA 32, 2005.
- Andre Seznec, Pierre Michaud. A case for (partially) tagged Geometric History Length Branch Prediction. Journal of Instruction Level Parallelism, Feb. 2006.
Performance Measurement
RISC
Cache Coherence
Parallel Computers
- Doug Burger, Stephen W. Keckler, Kathryn S. McKinley, et al. Scaling to the End of Silicon with EDGE Architectures. IEEE Computer, 37 (7), 2004.
- Michael Bedford Taylor et al. The Raw Microprocessor: A Computational Fabric for Software Circuits and General Purpose Programs. IEEE Micro, 22 (2), 2002.
- Steve Swanson et al. WaveScalar. MICRO 03, 2003.
- W. Daniel Hillis. The connection machine. 1985.
- B. A. Kahle and W. D. Hillis. The Connection Machine model CM-1 architecture. IEEE Trans. on Systems, Man and Cybernetics. 1989.
- D. E. Shaw. “NON-VON: A Parallel Machine Architecture for Knowledge Based Information Processing.” Proc. of the Seventh International Joint Conf. on Artificial Intelligence, 1981.
- Robert A. Wagner. The Boolean Vector Machine (BVM). ISCA 10, 1983.
Consistency Models
Books
Look for the following x86 manuals (with 64-bit extensions) at Intel's Software Developer's Manuals webpage:
- IA-32 Intel Architecture Software Developer's Manual Volume 1: Basic Architecture
- IA-32 Intel Architecture Software Developer's Manual Volume 2A: Instruction Set Reference, A-M
- IA-32 Intel Architecture Software Developer's Manual Volume 2B: Instruction Set Reference, N-Z
- IA-32 Intel Architecture Software Developer's Manual Volume 3A: System Programming Guide
- IA-32 Intel Architecture Software Developer's Manual Volume 3B: System Programming Guide
Alternatively, you may download the older manuals (without 64-bit extensions):