Computer Organization and Design: The Hardware/Software Interface (English Edition, Original 5th Edition, RISC-V Edition)
Authors: David A. Patterson, John L. Hennessy
Series: Classic Original Editions Library
Publication date: 2019-07-09
ISBN: 978-7-111-63111-8
Price: ¥229.00
Additional Information
Language: English
Pages: 692
Format: 16mo
Original title: Computer Organization and Design: The Hardware/Software Interface, RISC-V Edition
Original publisher: Elsevier (Singapore) Pte Ltd
Category: Textbook
Includes CD: No
Out of print:
Book Description

This book is the latest edition of the classic Computer Organization and Design, following the MIPS and ARM editions; this edition focuses on RISC-V and is another major work from Patterson and Hennessy. RISC-V, the first open-source instruction set architecture, was designed for modern computing environments such as cloud computing, mobile computing, and all manner of embedded systems. The book pays particular attention to the changes of the post-PC era, introducing the latest computing models through examples and exercises, with updated coverage of tablet computers, cloud infrastructure, and the ARM (mobile computing devices) and x86 (cloud computing) architectures.

Preface

The most beautiful thing we can experience is the mysterious. It is the source of all true art and science.
Albert Einstein, What I Believe, 1930
About This Book
We believe that learning in computer science and engineering should reflect the current state of the field, as well as introduce the principles that are shaping computing. We also feel that readers in every specialty of computing need to appreciate the organizational paradigms that determine the capabilities, performance, energy, and, ultimately, the success of computer systems.
Modern computer technology requires professionals of every computing specialty to understand both hardware and software. The interaction between hardware and software at a variety of levels also offers a framework for understanding the fundamentals of computing. Whether your primary interest is hardware or software, computer science or electrical engineering, the central ideas in computer organization and design are the same. Thus, our emphasis in this book is to show the relationship between hardware and software and to focus on the concepts that are the basis for current computers.
The recent switch from uniprocessor to multicore microprocessors confirmed the soundness of this perspective, given since the first edition. While programmers could ignore the advice and rely on computer architects, compiler writers, and silicon engineers to make their programs run faster or be more energy-efficient without change, that era is over. For programs to run faster, they must become parallel. While the goal of many researchers is to make it possible for programmers to be unaware of the underlying parallel nature of the hardware they are programming, it will take many years to realize this vision. Our view is that for at least the next decade, most programmers are going to have to understand the hardware/software interface if they want programs to run efficiently on parallel computers.
The audience for this book includes those with little experience in assembly language or logic design who need to understand basic computer organization as well as readers with backgrounds in assembly language and/or logic design who want to learn how to design a computer or understand how a system works and why it performs as it does.
About the Other Book
Some readers may be familiar with Computer Architecture: A Quantitative Approach, popularly known as Hennessy and Patterson. (This book in turn is often called Patterson and Hennessy.) Our motivation in writing the earlier book was to describe the principles of computer architecture using solid engineering fundamentals and quantitative cost/performance tradeoffs. We used an approach that combined examples and measurements, based on commercial systems, to create realistic design experiences. Our goal was to demonstrate that computer architecture could be learned using quantitative methodologies instead of a descriptive approach. It was intended for the serious computing professional who wanted a detailed understanding of computers.
A majority of the readers for this book do not plan to become computer architects. The performance and energy efficiency of future software systems will be dramatically affected, however, by how well software designers understand the basic hardware techniques at work in a system. Thus, compiler writers, operating system designers, database programmers, and most other software engineers need a firm grounding in the principles presented in this book. Similarly, hardware designers must understand clearly the effects of their work on software applications.
Thus, we knew that this book had to be much more than a subset of the material in Computer Architecture, and the material was extensively revised to match the different audience. We were so happy with the result that the subsequent editions of Computer Architecture were revised to remove most of the introductory material; hence, there is much less overlap today than with the first editions of both books.
Why RISC-V for This Edition?
The choice of instruction set architecture is clearly critical to the pedagogy of a computer architecture textbook. We didn’t want an instruction set that required describing unnecessary baroque features for someone’s first instruction set, no matter how popular it is. Ideally, your initial instruction set should be an exemplar, just like your first love. Surprisingly, you remember both fondly.
Since there were so many choices at the time, for the first edition of Computer Architecture: A Quantitative Approach we invented our own RISC-style instruction set. Given the growing popularity and the simple elegance of the MIPS instruction set, we switched to it for the first edition of this book and to later editions of the other book. MIPS has served us and our readers well.
It’s been 20 years since we made that switch, and while billions of chips that use MIPS continue to be shipped, they are typically found in embedded devices where the instruction set is nearly invisible. Thus, for a while now it’s been hard to find a real computer on which readers can download and run MIPS programs.
The good news is that an open instruction set that adheres closely to the RISC principles has recently debuted, and it is rapidly gaining a following. RISC-V, which was developed originally at UC Berkeley, not only cleans up the quirks of the MIPS instruction set, but it offers a simple, elegant, modern take on what instruction sets should look like in 2017.
Moreover, because it is not proprietary, there are open-source RISC-V simulators, compilers, debuggers, and so on easily available, and even open-source RISC-V implementations written in hardware description languages. In addition, there will soon be low-cost hardware platforms on which to run RISC-V programs. Readers will not only benefit from studying these RISC-V designs, they will be able to modify them and go through the implementation process in order to understand the impact of their hypothetical changes on performance, die size, and energy.
This is an exciting opportunity for the computing industry as well as for education, and thus at the time of this writing more than 40 companies have joined the RISC-V foundation. This sponsor list includes virtually all the major players except for ARM and Intel, including AMD, Google, Hewlett Packard Enterprise, IBM, Microsoft, NVIDIA, Oracle, and Qualcomm.
It is for these reasons that we wrote a RISC-V edition of this book, and we are switching Computer Architecture: A Quantitative Approach to RISC-V as well.
Given that RISC-V offers both 32-bit address instructions and 64-bit address instructions with essentially the same instruction set, we could have switched instruction sets but kept the address size at 32 bits. Our publisher polled the faculty who used the book and found that 75% either preferred larger addresses or were neutral, so we increased the address space to 64 bits, which may make more sense today than 32 bits.
The only changes for the RISC-V edition from the MIPS edition are those associated with the change in instruction sets, which primarily affects Chapter 2, Chapter 3, the virtual memory section in Chapter 5, and the short VMIPS example in Chapter 6. In Chapter 4, we switched to RISC-V instructions, changed several figures, and added a few “Elaboration” sections, but the changes were simpler than we had feared. Chapter 1 and the rest of the appendices are virtually unchanged.
The extensive online documentation of RISC-V, combined with the sheer magnitude of the instruction set, makes it difficult to come up with a replacement for the MIPS version of Appendix A (“Assemblers, Linkers, and the SPIM Simulator” in the MIPS Fifth Edition). Instead, Chapters 2, 3, and 5 include quick overviews of the hundreds of RISC-V instructions outside of the core RISC-V instructions that we cover in detail in the rest of the book.
Note that we are not (yet) saying that we are permanently switching to RISC-V. For example, in addition to this new RISC-V edition, there are ARMv8 and MIPS versions available for sale now. One possibility is that there will be a demand for all versions for future editions of the book, or for just one. We’ll cross that bridge when we come to it. For now, we look forward to your reaction to and feedback on this effort.
Changes for the Fifth Edition
We had six major goals for the fifth edition of Computer Organization and Design: demonstrate the importance of understanding hardware with a running example; highlight main themes across the topics using margin icons that are introduced early; update examples to reflect the changeover from the PC era to the post-PC era; spread the material on I/O throughout the book rather than isolating it into a single chapter; update the technical content to reflect changes in the industry since the publication of the fourth edition in 2009; and put appendices and optional sections online instead of including a CD to lower costs and to make this edition viable as an electronic book.
Before discussing the goals in detail, let’s look at the table on the next page. It shows the hardware and software paths through the material. Chapters 1, 4, 5, and 6 are found on both paths, no matter what the experience or the focus. Chapter 1 discusses the importance of energy and how it motivates the switch from single core to multicore microprocessors and introduces the eight great ideas in computer architecture. Chapter 2 is likely to be review material for the hardware-oriented, but it is essential reading for the software-oriented, especially for those readers interested in learning more about compilers and object-oriented programming languages. Chapter 3 is for readers interested in constructing a datapath or in learning more about floating-point arithmetic. Some will skip parts of Chapter 3, either because they don’t need them, or because they offer a review. However, we introduce the running example of matrix multiply in this chapter, showing how subword parallelism offers a fourfold improvement, so don’t skip Sections 3.6 to 3.8. Chapter 4 explains pipelined processors. Sections 4.1, 4.5, and 4.10 give overviews, and Section 4.12 gives the next performance boost for matrix multiply for those with a software focus. Those with a hardware focus, however, will find that this chapter presents core material; they may also, depending on their background, want to read Appendix A on logic design first. The last chapter, on multicores, multiprocessors, and clusters, is mostly new content and should be read by everyone. It was significantly reorganized in this edition to make the flow of ideas more natural and to include much more depth on GPUs, warehouse-scale computers, and the hardware–software interface of network interface cards that are key to clusters.
The first of the six goals for this fifth edition was to demonstrate the importance of understanding modern hardware to get good performance and energy efficiency with a concrete example. As mentioned above, we start with subword parallelism in Chapter 3 to improve matrix multiply by a factor of 4. We double performance in Chapter 4 by unrolling the loop to demonstrate the value of instruction-level parallelism. Chapter 5 doubles performance again by optimizing for caches using blocking. Finally, Chapter 6 demonstrates a speedup of 14 from 16 processors by using thread-level parallelism. All four optimizations in total add just 24 lines of C code to our initial matrix multiply example.
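To make the running example concrete, below is a minimal C sketch of the kind of unoptimized double-precision matrix multiply (DGEMM) that such an optimization sequence starts from. The function name, signature, and column-major layout here are illustrative assumptions, not the book's exact listing.

#include <stddef.h>

/* Baseline n x n double-precision matrix multiply: C += A * B.
   Matrices are assumed stored column-major for illustration, so
   element (i,j) of a matrix M lives at M[i + j*n]. */
void dgemm(size_t n, const double *A, const double *B, double *C)
{
    for (size_t i = 0; i < n; ++i)
        for (size_t j = 0; j < n; ++j) {
            double cij = C[i + j * n];              /* cij = C[i][j] */
            for (size_t k = 0; k < n; ++k)
                cij += A[i + k * n] * B[k + j * n]; /* cij += A[i][k] * B[k][j] */
            C[i + j * n] = cij;
        }
}

Each of the chapters named above then reworks these loops in turn: subword parallelism vectorizes the inner loop, unrolling exposes instruction-level parallelism, blocking restructures the traversal to fit the caches, and the final version distributes the work across threads.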
The second goal was to help readers separate the forest from the trees by identifying eight great ideas of computer architecture early and then pointing out all the places they occur throughout the rest of the book. We use (hopefully) easy-to-remember margin icons and highlight the corresponding word in the text to remind readers of these eight themes. There are nearly 100 citations in the book. No chapter has fewer than seven examples of great ideas, and no idea is cited fewer than five times. Performance via parallelism, pipelining, and prediction are the three most popular great ideas, followed closely by Moore’s Law. Chapter 4, The Processor, is the one with the most examples, which is not a surprise since it probably received the most attention from computer architects. The one great idea found in every chapter is performance via parallelism, which is a pleasant observation given the recent emphasis on parallelism in the field and in editions of this book.
The third goal was to recognize the generation change in computing from the PC era to the post-PC era by this edition with our examples and material. Thus, Chapter 1 dives into the guts of a tablet computer rather than a PC, and Chapter 6 describes the computing infrastructure of the cloud. We also feature the ARM, which is the instruction set of choice in the personal mobile devices of the post-PC era, as well as the x86 instruction set that dominated the PC era and (so far) dominates cloud computing.
The fourth goal was to spread the I/O material throughout the book rather than have it in its own chapter, much as we spread parallelism throughout all the chapters in the fourth edition. Hence, I/O material in this edition can be found in Sections 1.4, 4.9, 5.2, 5.5, 5.11, and 6.9. The thought is that readers (and instructors) are more likely to cover I/O if it’s not segregated to its own chapter.
This is a fast-moving field, and, as is always the case for our new editions, an important goal is to update the technical content. The running examples are the ARM Cortex-A53 and the Intel Core i7, reflecting our post-PC era. Other highlights include a tutorial on GPUs that explains their unique terminology, more depth on the warehouse-scale computers that make up the cloud, and a deep dive into 10 Gigabit Ethernet cards.
To keep the main book short and compatible with electronic books, we placed the optional material as online appendices instead of on a companion CD as in prior editions.
Finally, we updated all the exercises in the book.
While some elements changed, we have preserved useful book elements from prior editions. To make the book work better as a reference, we still place definitions of new terms in the margins at their first occurrence. The book element called “Understanding Program Performance” helps readers understand the performance of their programs and how to improve it, just as the “Hardware/Software Interface” book element helps readers understand the tradeoffs at this interface. “The Big Picture” section remains, so that the reader sees the forest despite all the trees. “Check Yourself” sections help readers confirm their comprehension of the material on the first time through, with answers provided at the end of each chapter. This edition still includes the green RISC-V reference card, which was inspired by the “Green Card” of the IBM System/360. This card has been updated and should be a handy reference when writing RISC-V assembly language programs.
Instructor Support
We have collected a great deal of material to help instructors teach courses using this book. Solutions to exercises, figures from the book, lecture slides, and other materials are available to instructors who register with the publisher. In addition, the companion Web site provides links to free RISC-V software. Check the publisher’s Web site for more information:
textbooks.elsevier.com/9780128122754
Concluding Remarks
If you read the following acknowledgments section, you will see that we went to great lengths to correct mistakes. Since a book goes through many printings, we have the opportunity to make even more corrections. If you uncover any remaining, resilient bugs, please contact the publisher by electronic mail at codRISCVbugs@mkp.com or by low-tech mail using the address found on the copyright page.
This edition is the third break in the long-standing collaboration between Hennessy and Patterson, which started in 1989. The demands of running one of the world’s great universities meant that President Hennessy could no longer make the substantial commitment to create a new edition. The remaining author felt once again like a tightrope walker without a safety net. Hence, the people in the acknowledgments and Berkeley colleagues played an even larger role in shaping the contents of this book. Nevertheless, this time around there is only one author to blame for the new material in what you are about to read.
Acknowledgments
With every edition of this book, we are very fortunate to receive help from many readers, reviewers, and contributors. Each of these people has helped to make this book better.
We are grateful for the assistance of Khaled Benkrid and his colleagues at ARM Ltd., who carefully reviewed the ARM-related material and provided helpful feedback.
Chapter 6 was so extensively revised that we did a separate review for ideas and contents, and I made changes based on the feedback from every reviewer. I’d like to thank Christos Kozyrakis of Stanford University for suggesting using the network interface for clusters to demonstrate the hardware–software interface of I/O and for suggestions on organizing the rest of the chapter; Mario Flagsilk of Stanford University for providing details, diagrams, and performance measurements of the NetFPGA NIC; and the following for suggestions on how to improve the chapter: David Kaeli of Northeastern University, Partha Ranganathan of HP Labs, David Wood of the University of Wisconsin, and my Berkeley colleagues Siamak Faridani, Shoaib Kamil, Yunsup Lee, Zhangxi Tan, and Andrew Waterman.
Special thanks goes to Rimas Avizenis of UC Berkeley, who developed the various versions of matrix multiply and supplied the performance numbers as well. As I worked with his father while I was a graduate student at UCLA, it was a nice symmetry to work with Rimas at UCB.
I also wish to thank my longtime collaborator Randy Katz of UC Berkeley, who helped develop the concept of great ideas in computer architecture as part of the extensive revision of an undergraduate class that we did together.
I’d like to thank David Kirk, John Nickolls, and their colleagues at NVIDIA (Michael Garland, John Montrym, Doug Voorhies, Lars Nyland, Erik Lindholm, Paulius Micikevicius, Massimiliano Fatica, Stuart Oberman, and Vasily Volkov) for writing the first in-depth appendix on GPUs. I’d like to express again my appreciation to Jim Larus, recently named Dean of the School of Computer and Communications Science at EPFL, for his willingness in contributing his expertise on assembly language programming, as well as for welcoming readers of this book with regard to using the simulator he developed and maintains.
I am also very grateful to Zachary Kurmas of Grand Valley State University, who updated and created new exercises, based on originals created by Perry Alexander (The University of Kansas); Jason Bakos (University of South Carolina); Javier Bruguera (Universidade de Santiago de Compostela); Matthew Farrens (University of California, Davis); David Kaeli (Northeastern University); Nicole Kaiyan (University of Adelaide); John Oliver (Cal Poly, San Luis Obispo); Milos Prvulovic (Georgia Tech); Jichuan Chang (Google); Jacob Leverich (Stanford); Kevin Lim (Hewlett-Packard); and Partha Ranganathan (Google).
Additional thanks goes to Peter Ashenden for updating the lecture slides.
I am grateful to the many instructors who have answered the publisher’s surveys, reviewed our proposals, and attended focus groups. They include the following individuals: Focus Groups: Bruce Barton (Suffolk County Community College), Jeff Braun (Montana Tech), Ed Gehringer (North Carolina State), Michael Goldweber (Xavier University), Ed Harcourt (St. Lawrence University), Mark Hill (University of Wisconsin, Madison), Patrick Homer (University of Arizona), Norm Jouppi (HP Labs), Dave Kaeli (Northeastern University), Christos Kozyrakis (Stanford University), Jae C. Oh (Syracuse University), Lu Peng (LSU), Milos Prvulovic (Georgia Tech), Partha Ranganathan (HP Labs), David Wood (University of Wisconsin), Craig Zilles (University of Illinois at Urbana-Champaign). Surveys and Reviews: Mahmoud Abou-Nasr (Wayne State University), Perry Alexander (The University of Kansas), Behnam Arad (Sacramento State University), Hakan Aydin (George Mason University), Hussein Badr (State University of New York at Stony Brook), Mac Baker (Virginia Military Institute), Ron Barnes (George Mason University), Douglas Blough (Georgia Institute of Technology), Kevin Bolding (Seattle Pacific
University), Miodrag Bolic (University of Ottawa), John Bonomo (Westminster College), Jeff Braun (Montana Tech), Tom Briggs (Shippensburg University), Mike Bright (Grove City College), Scott Burgess (Humboldt State University), Fazli Can (Bilkent University), Warren R. Carithers (Rochester Institute of Technology), Bruce Carlton (Mesa Community College), Nicholas Carter (University of Illinois at Urbana-Champaign), Anthony Cocchi (The City University of New York), Don Cooley (Utah State University), Gene Cooperman (Northeastern University), Robert D. Cupper (Allegheny College), Amy Csizmar Dalal (Carleton College), Daniel Dalle (Université de Sherbrooke), Edward W. Davis (North Carolina State University), Nathaniel J. Davis (Air Force Institute of Technology), Molisa Derk (Oklahoma City University), Andrea Di Blas (Stanford University), Derek Eager (University of Saskatchewan), Ata Elahi (Southern Connecticut State University), Ernest Ferguson (Northwest Missouri State University), Rhonda Kay Gaede (The University of Alabama), Etienne M. Gagnon (L’Université du Québec à Montréal), Costa Gerousis (Christopher Newport University), Paul Gillard (Memorial
University of Newfoundland), Michael Goldweber (Xavier University), Georgia Grant (College of San Mateo), Paul V. Gratz (Texas A&M University), Merrill Hall (The Master’s College), Tyson Hall (Southern Adventist University), Ed Harcourt (St. Lawrence University), Justin E. Harlow (University of South Florida), Paul F. Hemler (Hampden-Sydney College), Jayantha Herath (St. Cloud State University), Martin Herbordt (Boston University), Steve J. Hodges (Cabrillo College), Kenneth Hopkinson (Cornell University), Bill Hsu (San Francisco State University), Dalton Hunkins (St. Bonaventure University), Baback Izadi (State University of New York—New Paltz), Reza Jafari, Robert W. Johnson (Colorado Technical University), Bharat Joshi (University of North Carolina, Charlotte), Nagarajan Kandasamy (Drexel University), Rajiv Kapadia, Ryan Kastner (University of California, Santa Barbara), E.J. Kim (Texas A&M University), Jihong Kim (Seoul National University), Jim Kirk (Union University), Geoffrey S. Knauth (Lycoming College), Manish M. Kochhal (Wayne State), Suzan Koknar-Tezel (Saint Joseph’s University), Angkul Kongmunvattana (Columbus State University), April Kontostathis (Ursinus
College), Christos Kozyrakis (Stanford University), Danny Krizanc (Wesleyan University), Ashok Kumar, S. Kumar (The University of Texas), Zachary Kurmas (Grand Valley State University), Adrian Lauf (University of Louisville), Robert N. Lea (University of Houston), Alvin Lebeck (Duke University), Baoxin Li (Arizona State University), Li Liao (University of Delaware), Gary Livingston (University of Massachusetts), Michael Lyle, Douglas W. Lynn (Oregon Institute of Technology), Yashwant K Malaiya (Colorado State University), Stephen Mann (University of Waterloo), Bill Mark (University of Texas at Austin), Ananda Mondal (Claflin University), Alvin Moser (Seattle University),
Walid Najjar (University of California, Riverside), Vijaykrishnan Narayanan (Penn State University), Danial J. Neebel (Loras College), Victor Nelson (Auburn University), John Nestor (Lafayette College), Jae C. Oh (Syracuse University), Joe Oldham (Centre College), Timour Paltashev, James Parkerson (University of Arkansas), Shaunak Pawagi (SUNY at Stony Brook), Steve Pearce, Ted Pedersen (University of Minnesota), Lu Peng (Louisiana State University), Gregory D. Peterson (The University of Tennessee), William Pierce (Hood College), Milos Prvulovic (Georgia Tech), Partha Ranganathan (HP Labs), Dejan Raskovic (University of Alaska, Fairbanks), Brad Richards (University of Puget Sound), Roman Rozanov, Louis Rubinfield (Villanova University), Md Abdus Salam (Southern University), Augustine Samba (Kent State University), Robert Schaefer (Daniel Webster College), Carolyn J. C. Schauble (Colorado State University), Keith Schubert (CSU San Bernardino), William L. Schultz, Kelly Shaw (University of Richmond), Shahram Shirani (McMaster University), Scott Sigman (Drury University), Shai Simonson (Stonehill College), Bruce Smith, David Smith, Jeff W. Smith (University of Georgia, Athens), Mark Smotherman (Clemson University), Philip Snyder (Johns Hopkins University), Alex Sprintson (Texas A&M), Timothy D. Stanley (Brigham Young University), Dean Stevens (Morningside College), Nozar Tabrizi (Kettering University), Yuval Tamir (UCLA), Alexander Taubin (Boston University), Will Thacker (Winthrop University), Mithuna Thottethodi (Purdue University), Manghui Tu (Southern Utah University), Dean Tullsen (UC San Diego), Steve VanderLeest (Calvin College), Christopher Vickery (Queens College of CUNY), Rama Viswanathan (Beloit College), Ken Vollmar (Missouri State University), Guoping Wang (Indiana-Purdue University), Patricia Wenner (Bucknell University), Kent Wilken (University of California, Davis), David Wolfe (Gustavus Adolphus College), David Wood (University of Wisconsin, Madison), Ki Hwan Yum (University of Texas, San Antonio), Mohamed Zahran (City College of New York), Amr Zaky (Santa Clara University), Gerald D. Zarnett (Ryerson University), Nian Zhang (South Dakota School of Mines & Technology), Jiling Zhong (Troy University), Huiyang Zhou (North Carolina State University), Weiyu Zhu (Illinois Wesleyan University).
A special thanks also goes to Mark Smotherman for making multiple passes to find technical and writing glitches that significantly improved the quality of this edition.
We wish to thank the extended Morgan Kaufmann family for agreeing to publish this book again under the able leadership of Katey Birtcher, Steve Merken, and Nate McFadden: I certainly couldn’t have completed the book without them. We also want to extend thanks to Lisa Jones, who managed the book production process, and Victoria Pearson Esser, who did the cover design. The cover cleverly connects the post-PC era content of this edition to the cover of the first edition.
Finally, I owe a huge debt to Yunsup Lee and Andrew Waterman for taking on this conversion to RISC-V in their spare time while founding a startup company. Kudos to Eric Love as well, who made RISC-V versions of the exercises in this edition while finishing his Ph.D. We’re all excited to see what will happen with RISC-V in academia and beyond.
The contributions of the nearly 150 people we mentioned here have helped make this new edition what I hope will be our best book yet. Enjoy!
David A. Patterson

Shelving Category

Computer Architecture

Back Cover Copy

Hardly anyone among the world's programmers and engineers is unfamiliar with Patterson and Hennessy's masterwork, and the arrival of this RISC-V edition has rekindled that enthusiasm. RISC-V is an open architecture that began as a vehicle for research and teaching and has since grown into an industry-standard instruction set. Those of you reading this book now or soon will not only grasp the essence of RISC-V from the wisdom of its pioneers, but may also go on to build your own RISC-V cores and contribute to the vast open-source hardware and software ecosystem.
—— Krste Asanović, Chairman of the RISC-V Foundation

Choosing a textbook is often a frustrating exercise in compromise: pedagogical fit, breadth of coverage, fluency of prose, rigor of content, and cost all have to be weighed. This book is that rare find which satisfies on every count, with no compromise required. It is not only a textbook on computer organization but a model for all computer science textbooks.
—— Michael Goldweber, Xavier University

Whether you came of age in the 1980s, the 1990s, or the 2000s, this is a computer architecture textbook to keep on your bookshelf (or on your iPad). It is at once classic and new: it not only presents the great principles (Moore's Law, abstraction, making the common case fast, dependability, memory hierarchy, parallelism, and pipelining) but also illustrates them with modern designs.
—— Mark D. Hill, University of Wisconsin-Madison

This book does more than explain computer architecture; it equips readers for the changes and challenges ahead. The difficulty of continued semiconductor process scaling has made every system power-limited, even as the performance demands of mobile systems and big-data processing keep growing. In this new era of computing, hardware and software must be co-designed, and system-level architectural optimization matters as much as component-level optimization.
—— Christos Kozyrakis, Stanford University

Patterson and Hennessy address the important issues in ever-changing computer hardware architecture, emphasizing how hardware and software modules interact across levels of abstraction. Covering a range of hardware and software mechanisms and weaving the concepts of I/O and parallelism throughout, the book offers a panoramic view of computer architecture in the post-PC era. Whether you are a tablet hardware engineer or a cloud software architect grappling with energy efficiency and parallelization, this book will be the right choice.
—— Jae C. Oh, Syracuse University

Table of Contents

CHAPTERS
1 Computer Abstractions and Technology 2
1.1 Introduction 3
1.2 Eight Great Ideas in Computer Architecture 11
1.3 Below Your Program 13
1.4 Under the Covers 16
1.5 Technologies for Building Processors and Memory 24
1.6 Performance 28
1.7 The Power Wall 40
1.8 The Sea Change: The Switch from Uniprocessors to Multiprocessors 43
1.9 Real Stuff: Benchmarking the Intel Core i7 46
1.10 Fallacies and Pitfalls 49
1.11 Concluding Remarks 52
1.12 Historical Perspective and Further Reading 54
1.13 Exercises 54
2 Instructions: Language of the Computer 60
2.1 Introduction 62
2.2 Operations of the Computer Hardware 63
2.3 Operands of the Computer Hardware 67
2.4 Signed and Unsigned Numbers 74
2.5 Representing Instructions in the Computer 81
2.6 Logical Operations 89
2.7 Instructions for Making Decisions 92
2.8 Supporting Procedures in Computer Hardware 98
2.9 Communicating with People 108
2.10 RISC-V Addressing for Wide Immediates and Addresses 113
2.11 Parallelism and Instructions: Synchronization 121
2.12 Translating and Starting a Program 124
2.13 A C Sort Example to Put it All Together 133
2.14 Arrays versus Pointers 141
2.15 Advanced Material: Compiling C and Interpreting Java 144
2.16 Real Stuff: MIPS Instructions 145
2.17 Real Stuff: x86 Instructions 146
2.18 Real Stuff: The Rest of the RISC-V Instruction Set 155
2.19 Fallacies and Pitfalls 157
2.20 Concluding Remarks 159
2.21 Historical Perspective and Further Reading 162
2.22 Exercises 162
3 Arithmetic for Computers 172
3.1 Introduction 174
3.2 Addition and Subtraction 174
3.3 Multiplication 177
3.4 Division 183
3.5 Floating Point 191
3.6 Parallelism and Computer Arithmetic: Subword Parallelism 216
3.7 Real Stuff: Streaming SIMD Extensions and Advanced Vector Extensions in x86 217
3.8 Going Faster: Subword Parallelism and Matrix Multiply 218
3.9 Fallacies and Pitfalls 222
3.10 Concluding Remarks 225
3.11 Historical Perspective and Further Reading 227
3.12 Exercises 227
4 The Processor 234
4.1 Introduction 236
4.2 Logic Design Conventions 240
4.3 Building a Datapath 243
4.4 A Simple Implementation Scheme 251
4.5 An Overview of Pipelining 262
4.6 Pipelined Datapath and Control 276
4.7 Data Hazards: Forwarding versus Stalling 294
4.8 Control Hazards 307
4.9 Exceptions 315
4.10 Parallelism via Instructions 321
4.11 Real Stuff: The ARM Cortex-A53 and Intel Core i7 Pipelines 334
4.12 Going Faster: Instruction-Level Parallelism and Matrix Multiply 342
4.13 Advanced Topic: An Introduction to Digital Design Using a Hardware Design Language to Describe and Model a Pipeline and More Pipelining Illustrations 345
4.14 Fallacies and Pitfalls 345
4.15 Concluding Remarks 346
4.16 Historical Perspective and Further Reading 347
4.17 Exercises 347
5 Large and Fast: Exploiting Memory Hierarchy 364
5.1 Introduction 366
5.2 Memory Technologies 370
5.3 The Basics of Caches 375
5.4 Measuring and Improving Cache Performance 390
5.5 Dependable Memory Hierarchy 410
5.6 Virtual Machines 416
5.7 Virtual Memory 419
5.8 A Common Framework for Memory Hierarchy 443
5.9 Using a Finite-State Machine to Control a Simple Cache 449
5.10 Parallelism and Memory Hierarchy: Cache Coherence 454
5.11 Parallelism and Memory Hierarchy: Redundant Arrays of Inexpensive Disks 458
5.12 Advanced Material: Implementing Cache Controllers 459
5.13 Real Stuff: The ARM Cortex-A53 and Intel Core i7 Memory Hierarchies 459
5.14 Real Stuff: The Rest of the RISC-V System and Special Instructions 464
5.15 Going Faster: Cache Blocking and Matrix Multiply 465
5.16 Fallacies and Pitfalls 468
5.17 Concluding Remarks 472
5.18 Historical Perspective and Further Reading 473
5.19 Exercises 473
6 Parallel Processors from Client to Cloud 490
6.1 Introduction 492
6.2 The Difficulty of Creating Parallel Processing Programs 494
6.3 SISD, MIMD, SIMD, SPMD, and Vector 499
6.4 Hardware Multithreading 506
6.5 Multicore and Other Shared Memory Multiprocessors 509
6.6 Introduction to Graphics Processing Units 514
6.7 Clusters, Warehouse Scale Computers, and Other Message-Passing Multiprocessors 521
6.8 Introduction to Multiprocessor Network Topologies 526
6.9 Communicating to the Outside World: Cluster Networking 529
6.10 Multiprocessor Benchmarks and Performance Models 530
6.11 Real Stuff: Benchmarking and Rooflines of the Intel Core i7 960 and the NVIDIA Tesla GPU 540
6.12 Going Faster: Multiple Processors and Matrix Multiply 545
6.13 Fallacies and Pitfalls 548
6.14 Concluding Remarks 550
6.15 Historical Perspective and Further Reading 553
6.16 Exercises 553
APPENDIX
A The Basics of Logic Design A-2
A.1 Introduction A-3
A.2 Gates, Truth Tables, and Logic Equations A-4
A.3 Combinational Logic A-9
A.4 Using a Hardware Description Language A-20
A.5 Constructing a Basic Arithmetic Logic Unit A-26
A.6 Faster Addition: Carry Lookahead A-37
A.7 Clocks A-47
A.8 Memory Elements: Flip-Flops, Latches, and Registers A-49
A.9 Memory Elements: SRAMs and DRAMs A-57
A.10 Finite-State Machines A-66
A.11 Timing Methodologies A-71
A.12 Field Programmable Devices A-77
A.13 Concluding Remarks A-78
A.14 Exercises A-79
Index I-1
ONLINE CONTENT
Graphics and Computing GPUs B-2
B.1 Introduction B-3
B.2 GPU System Architectures B-7
B.3 Programming GPUs B-12
B.4 Multithreaded Multiprocessor Architecture B-25
B.5 Parallel Memory System B-36
B.6 Floating Point Arithmetic B-41
B.7 Real Stuff: The NVIDIA GeForce 8800 B-46
B.8 Real Stuff: Mapping Applications to GPUs B-55
B.9 Fallacies and Pitfalls B-72
B.10 Concluding Remarks B-76
B.11 Historical Perspective and Further Reading B-77
Mapping Control to Hardware C-2
C.1 Introduction C-3
C.2 Implementing Combinational Control Units C-4
C.3 Implementing Finite-State Machine Control C-8
C.4 Implementing the Next-State Function with a Sequencer C-22
C.5 Translating a Microprogram to Hardware C-28
C.6 Concluding Remarks C-32
C.7 Exercises C-33
A Survey of RISC Architectures for Desktop, Server, and Embedded Computers D-2
D.1 Introduction D-3
D.2 Addressing Modes and Instruction Formats D-5
D.3 Instructions: the MIPS Core Subset D-9
D.4 Instructions: Multimedia Extensions of the Desktop/Server RISCs D-16
D.5 Instructions: Digital Signal-Processing Extensions of the Embedded RISCs D-19
D.6 Instructions: Common Extensions to MIPS Core D-20
D.7 Instructions Unique to MIPS-64 D-25
D.8 Instructions Unique to Alpha D-27
D.9 Instructions Unique to SPARC v9 D-29
D.10 Instructions Unique to PowerPC D-32
D.11 Instructions Unique to PA-RISC 2.0 D-34
D.12 Instructions Unique to ARM D-36
D.13 Instructions Unique to Thumb D-38
D.14 Instructions Unique to SuperH D-39
D.15 Instructions Unique to M32R D-40
D.16 Instructions Unique to MIPS-16 D-40
D.17 Concluding Remarks D-43
Glossary G-1
Further Reading FR-1
