教学资源 – 图书教辅

扩展信息

语种 : 英文

页数 : 1080

开本 : 16

原书名 : 深入理解计算机系统（英文版第2版）

原出版社: Pearson Education Asia

属性分类: 教材

包含CD : 无

绝版 : 无

图书简介

本书的主要读者是计算机科学家、计算机工程师，以及那些想通过学习计算机系统的内在运作而能够写出更好程序的人。
我们的目的是解释所有计算机系统的本质概念，并向你展示这些概念是如何实实在在地影响应用程序的正确性、性能和实用性的。本书从程序员的角度来讲述应用程序员如何能够利用系统知识来写出更好的程序。

图书前言

本书的主要读者是计算机科学家、计算机工程师，以及那些想通过学习计算机系统的内在运作而能够写出更好程序的人。
　　我们的目的是解释所有计算机系统的本质概念，并向你展示这些概念是如何实实在在地影响应用程序的正确性、性能和实用性的。其他的系统类书籍都是从构建者的角度来写的，讲述如何实现硬件或是系统软件，包括操作系统、编译器和网络接口。而本书是从程序员的角度来写的，讲述应用程序员如何能够利用系统知识来编写出更好的程序。当然，学习一个计算机系统应该做些什么，是学习如何构建一个计算机系统的很好的出发点，所以，对于希望继续学习系统软硬件实现的人来说，本书也是一本很有价值的介绍性读物。
本书概述
　　本书由12 章组成，旨在阐述计算机系统的核心概念。
　　. 第1 章：计算机系统漫游。这一章通过研究“hello, world”这个简单程序的生命周期，介绍计算机系统的主要概念和主题。
　　. 第2 章：信息的表示和处理。我们讲述了计算机的算术运算，重点描述了会对程序员有影响的无符号数和数的二进制补码（two’s complement）表示的特性。我们考虑数字是如何表示的，以及由此确定对于一个给定的字长，其可能编码值的范围。我们讨论该如何表示数字，以及因此用给定的字长能编码的数值的范围。我们探讨有符号和无符号数字之间类型转换的效果，还阐述算术运算的数学特性。菜鸟级程序员经常很惊奇地了解到（用二进制补码表示的）两个正数的和或者积可能为负。另一方面，二进制补码的算术运算满足代数环的特性，因此，编译器可以很安全地把一个常量乘法转化为一系列的移位和加法。我们用C 语言的位级操作来说明布尔代数的原理和应用。我们从两个方面讲述了IEEE 标准的浮点格式：一是如何用它来表示数值，一是浮点运算的数学属性。
　　. 第3 章：程序的机器级表示。我们教读者如何阅读由C 编译器生成的IA32 和x86-64 汇编语言。我们说明为不同控制结构，比如条件、循环和开关语句，生成的基本指令模式。我们还讲述过程的执行，包括栈分配、寄存器使用惯例和参数传递。我们讨论不同数据结构（如结构、联合（union）和数组）的分配和访问方式。我们还以分析程序在机器级的样子作为途径，来理解常见的代码安全漏洞，例如，缓冲区溢出，以及理解程序员、编译器和操作系统可以采取的减轻这些威胁的措施。
　　. 第4 章：处理器体系结构。这一章讲述基本的组合和时序逻辑元素，并展示这些元素如何在数据通路（datapath）中组合到一起来执行IA32 指令集的一个称为“Y86”的简化子集。本章中处理器设计的控制逻辑是用一种称为HCL 的简单硬件描述语言来描述的。用HCL 写的硬件设计能够编译和链接到本书提供的模拟器中，还可以根据这些设计生成Verilog 描述，它适合合成（synthesis）到实际可以运行的硬件上去。
　　. 第5 章：优化程序性能。在这一章里，我们介绍了许多提高代码性能的技术，主要思想就是让程序员通过使编译器能够生成更有效的机器代码来学习编写C 代码。
　　. 第6 章：存储器层次结构。对应用程序员来说，存储器系统是计算机系统中最直接可见的部分之一。我们讲述不同类型的随机存取存储器（RAM）和只读存储器（ROM），以及磁盘和固态硬盘的几何形状和组织构造。我们描述这些存储设备是如何放置在层次结构中的，讲述访问局部性是如何使这种层次结构成为可能的。我们通过一个独特的观点使这些理论具体化、形象化，那就是将存储器系统视为一个“存储器山”，山脊是时间局部性，而斜坡是空间局部性。最后，我们向读者阐述如何通过改善程序的时间局部性和空间局部性来提高应用程序的性能。
　　. 第7 章：链接。本章讲述静态和动态链接，包括的概念有可重定位的（relocatable）和可执行的目标文件、符号解析、重定位（relocation）、静态库、共享目标库，以及与位置无关的代码。
　　. 第8 章：异常控制流。在本书的这个部分，我们通过介绍异常控制流（比如，除了正常分支和过程调用以外的控制流的变化）的一般概念，打破单一程序的模型。我们给出存在于系统所有层次的异常控制流的例子，从底层的硬件异常和中断，到并发进程的上下文切换，到由于Unix 信号传送引起的控制流突变，到C 语言中破坏栈原则的非本地跳转（nonlocal jump）。
　　. 第9 章：虚拟存储器。我们讲述虚拟存储器系统是希望读者对它是如何工作的以及它的特性有所了解。我们想让读者了解为什么不同的并发进程各自都有一个完全相同的地址范围，能共享某些页，而又独占另外一些页。我们还覆盖讲了一些管理和操纵虚拟存储器的问题。特别地，我们讨论了存储分配操作，就像Unix 的malloc 和free 操作。
　　. 第10 章：系统级I/O 。我们讲述Unix I/O 的基本概念，例如文件和描述符。我们描述如何共享文件，I/O 重定向是如何工作的，还有如何访问文件的元数据。我们还开发了一个健壮的带缓冲区的I/O 包，可以正确处理一种称为short counts 的奇特行为，也就是库函数只读取一部分的输入数据。我们阐述C 的标准I/O 库，以及它与Unix I/O 的关系，重点谈到标准I/O 的局限性，这些局限性使之不适合网络编程。
　　. 第11 章：网络编程。对编程而言，网络是非常有趣的I/O 设备，将许多我们前面文中学习的概念，比如进程、信号、字节顺序（byte order）、存储器映射和动态存储器分配，联系在一起。网络程序还为下一章的主题—并发，提供了一个很令人信服的上下文。本章只是网络编程的一个很小的部分，使读者能够编写一个Web 服务器。我们还讲述了位于所有网络程序底层的客户端-服务器模型。我们展现了一个程序员对Internet 的观点，并且教读者如何用套接字（socket）接口来编写Internet 客户端和服务器。最后，我们介绍超文本传输协议HTTP，并开发了一个简单的迭代式（iterative）Web 服务器。
　　. 第12 章：并发编程。这一章以Internet 服务器设计为例介绍了并发编程。我们比较对照了三种编写并发程序的基本机制（进程、I/O 多路复用技术和线程），并且展示如何用它们来建造并发Internet 服务器。我们探讨了用P、V 信号操作来实现同步、线程安全和可重入（reentrancy）、竞争条件以及死锁等的基本原则。我们还讲述了线程级编程的使用方法，来解释应用程序中的并行性，使得程序在多核的处理器上能执行得更快。
本版新增内容
　　本书的第1 版于2003 年出版。考虑到计算机技术发展如此迅速，这本书的内容还算是保持得很好。事实证明Intel x86 的机器上运行类Unix 操作系统，加上采用C 语言编程，是一种能够涵盖当今许多系统的组合。硬件技术和编译器的变化，以及很多教师教授这些内容的经验，都促使我们做了大量的修改。
　　下面列出的是一些更加详细的改进：
　　. 第2 章：信息的表示和处理。通过更加详细地解释概念以及更多的练习题和家庭作业，我们试图使这部分内容更加易懂。我们将一些比较偏理论的内容放到了网络旁注里。还讲述了一些由于计算机算术运算的溢出造成的安全漏洞。
　　. 第3 章：程序的机器级表示。我们将内容的覆盖范围扩展到了包括x86-64，也就是将x86 处理器扩展到了64 位字长。也使用了更新版本的GCC 产生的代码。另外还增强了对缓冲区溢出漏洞的描述。在网络旁注里，我们给出了两类不同的浮点指令，还介绍了当编译器试图做更高等级优化的时候，做的一些奇特的变换。另外，还有一个网络旁注描述了如何在一个C 语言程序中嵌入x86 汇编代码。
　　. 第4 章：处理器体系结构。更加详细地说明了我们的处理器设计中的异常发现和处理。在网络旁注里，我们也给出了处理器设计的Verilog 描述映射，使得我们的设计能够合成到可运行的硬件上。
　　. 第5 章：优化程序性能。我们极大地改变了对乱序处理器如何运行的描述，还提出了一种简单的技术，能够基于程序的数据流图表示中的路径来分析程序的性能。在网络旁注里，描述了C 语言程序员如何能够利用较新的x86 处理器中提供的SIMD（单指令流，多数据流）指令来编程。
　　. 第6 章：存储器层次结构。我们增加了固态硬盘的内容，还更新了我们的表述，使之基于Intel Core i7 处理器的存储器层次结构。
　　. 第7 章：链接。本章的变化不大。
　　. 第8 章：异常控制流。我们改进了对于进程模型如何引入一些基本的并发概念的讨论，例如非确定性。
　　. 第9 章：虚拟存储器。我们更新了存储器系统案例研究，采用了64 位Intel Core i7 处理器为例来讲述。我们还更新了malloc 函数的示例实现，使之既能在32 位也能在64 位环境中执行。
　　. 第10 章：系统级I/O 。本章的变化不大。
　　. 第11 章：网络编程。本章的变化不大。
　　. 第12 章：并发编程。我们增加了关于并发性一般原则的内容，还讲述了程序员如何利用线程级并行性使得程序在多核机器上能运行得更快。
　　此外，我们还增加和修改了很多练习题和家庭作业。

上架指导

计算机\硬件

封底文字

“2005年，我开始采用Bryant和O’Hallaron的这本书作为本科生计算机系统课程的教材。三年后，这本书仍然是我的计算机系统课程教科书的首选。”　　　　　　　　　　　　　　　　　　　　　　　—— Mirela Damian，维拉诺瓦大学
　　“本书表述清晰、恰到好处——举重若轻地呈现了那些非常复杂的内容。”　　　　　　　　　　　　　　　　　　　　　　　—— Ibrahim Matta, 波士顿大学
　　“这是一本学习计算机硬件和软件如何‘真正’协同工作的好书，还教会你为什么了解这些知识会使你成为一个更有价值的程序员。本书还帮你为学习像操作系统和编译器这样的高级课程做好准备。在本书中，我最喜欢的章节是关于缓存的，当我第一次发现缓存有多重要时，真是难以置信！”　　　　　　　　　　　　　　　　　　　　　　—— Vishal Shah，Ask.com总架构师
　　本书是一本将计算机软件和硬件理论结合讲述的经典教程，内容覆盖计算机导论、体系结构和处理器设计等多门课程。本书的最大优点是为程序员描述计算机系统的实现细节，通过描述程序是如何映射到系统上，以及程序是如何执行的，使读者更好地理解程序的行为为什么是这样的，以及造成效率低下的原因。
相对于第1版，本版主要是反映了过去十年间硬件技术和编译器的变化，具体更新如下：
对系统的介绍（特别是实际使用部分）做了增加和修改。例如，既保持了原有的针对32位系统的说明，又增加了对64位系统的描述。
增加了很多关于由算术运算溢出以及缓冲区溢出造成安全漏洞的内容。
更详细讲述了处理器对异常的发现和处理。
描述了基于Intel Core i7处理器的存储器层次结构，还增加了固态硬盘的内容。
强调并发性，增加了关于并发性一般原则的内容。

作者简介

（美）Randal E. Bryant; David R. O'Hallaron 著：Randal E. Bryant 1973年获得密歇根大学学士学位，随即就读麻省理工学院的研究生院，并在1981年获得计算机博士学位。从1984年至今一直任教于卡内基-梅隆大学，现在是卡内基-梅隆大学计算机学院院长、教授，同时受邀任教于电子与计算机工程学院。他还是ACM院士、IEEE院士和美国国家工程院院士。其研究成果获得过数项大奖，其中包括Semiconductor Research Corporation颁发的两个发明荣誉奖和一个技术成就奖，ACM颁发的Kanellakis理论与实践奖，还有IEEE授予的W. R. G. Baker奖、Emmanuel Piore奖和Phil Kaufman奖。 David R. O'Hallaron 现为Intel匹兹堡实验室主任，卡内基-梅隆大学电子和计算机工程学院副教授，并在维吉尼亚大学（University of Virginia）获得计算机科学的博士学位。他曾获得卡内基-梅隆大学计算机学院颁发的Herbert Simon杰出教学奖，并同Quake项目中其他成员一起获得了高性能计算领域中的最高国际奖项——Gordon Bell奖。

图书目录

出版者的话3
前言节选 4
Preface 7
About the Authors 21
1 A Tour of Computer Systems 35
1.1 Information Is Bits + Context 37
1.2 Programs Are Translated by Other Programs into Different Forms 38
1.3 It Pays to Understand How Compilation Systems Work 40
1.4 Processors Read and Interpret Instructions Stored in Memory 41
1.4.1 Hardware Organization of a System 41
1.4.2 Running the hello Program 44
1.5 Caches Matter 46
1.6 Storage Devices Form a Hierarchy 47
1.7 The Operating System Manages the Hardware 48
1.7.1 Processes 50
1.7.2 Threads 51
1.7.3 Virtual Memory 51
1.7.4 Files 53
1.8 Systems Communicate with Other Systems Using Networks 54
1.9 Important Themes 55
1.9.1 Concurrency and Parallelism 55
1.9.2 The Importance of Abstractions in Computer Systems 58
1.10 Summary 59
Bibliographic Notes 60
Part I Program Structure and Execution
Representing and Manipulating Information 63
2.1 Information Storage 67
2.1.1 Hexadecimal Notation 68
2.1.2 Words 72
2.1.3 Data Sizes 72
2.1.4 Addressing and Byte Ordering 73
2.1.5 Representing Strings 80
2.1.6 Representing Code 81
2.1.7 Introduction to Boolean Algebra 82
2.1.8 Bit-Level Operations in C 85
2.1.9 Logical Operations in C 88
2.1.10 Shift Operations in C 88
2.2 Integer Representations 90
2.2.1 Integral Data Types 91
2.2.2 Unsigned Encodings 92
2.2.3 Two’s-Complement Encodings 94
2.2.4 Conversions Between Signed and Unsigned 99
2.2.5 Signed vs. Unsigned in C 103
2.2.6 Expanding the Bit Representation of a Number 105
2.2.7 Truncating Numbers 109
2.2.8 Advice on Signed vs. Unsigned 110
2.3 Integer Arithmetic 113
2.3.1 Unsigned Addition 113
2.3.2 Two’s-Complement Addition 117
2.3.3 Two’s-Complement Negation 121
2.3.4 Unsigned Multiplication 122
2.3.5 Two’s-Complement Multiplication 123
2.3.6 Multiplying by Constants 126
2.3.7 Dividing by Powers of Two 129
2.3.8 Final Thoughts on Integer Arithmetic 132
2.4 Floating Point 133
2.4.1 Fractional Binary Numbers 134
2.4.2 IEEE Floating-Point Representation 137
2.4.3 Example Numbers 139
2.4.4 Rounding 144
2.4.5 Floating-Point Operations 147
2.4.6 Floating Point in C 148
2.5 Summary 152
Bibliographic Notes 153
Homework Problems 153
Solutions to Practice Problems 168
3 Machine-Level Representation of Programs 187
3.1 A Historical Perspective 190
3.2 Program Encodings 193
3.2.1 Machine-Level Code 194
3.2.2 Code Examples 196
3.2.3 Notes on Formatting 199
3.3 Data Formats 201
3.4 Accessing Information 202
3.4.1 Operand Specifiers 203
3.4.2 Data Movement Instructions 205
3.4.3 Data Movement Example 208
3.5 Arithmetic and Logical Operations 211
3.5.1 Load Effective Address 211
3.5.2 Unary and Binary Operations 212
3.5.3 Shift Operations 213
3.5.4 Discussion 214
3.5.5 Special Arithmetic Operations 216
3.6 Control 219
3.6.1 Condition Codes 219
3.6.2 Accessing the Condition Codes 221
3.6.3 Jump Instructions and Their Encodings 223
3.6.4 Translating Conditional Branches 227
3.6.5 Loops 231
3.6.6 Conditional Move Instructions 240
3.6.7 Switch Statements 247
3.7 Procedures 253
3.7.1 Stack Frame Structure 253
3.7.2 Transferring Control 255
3.7.3 Register Usage Conventions 257
3.7.4 Procedure Example 258
3.7.5 Recursive Procedures 263
3.8 Array Allocation and Access 266
3.8.1 Basic Principles 266
3.8.2 Pointer Arithmetic 267
3.8.3 Nested Arrays 269
3.8.4 Fixed-Size Arrays 271
3.8.5 Variable-Size Arrays 272
3.9 Heterogeneous Data Structures 275
3.9.1 Structures 275
3.9.2 Unions 278
3.9.3 Data Alignment 282
3.10 Putting It Together: Understanding Pointers 286
3.11 Life in the Real World: Using the gdb Debugger 288
3.12 Out-of-Bounds Memory References and Buffer Overflow 290
3.12.1 Thwarting Buffer Overflow Attacks 295
3.13 x86-64: Extending IA32 to 64 Bits 301
3.13.1 History and Motivation for x86-64 302
3.13.2 An Overview of x86-64 304
3.13.3 Accessing Information 307
3.13.4 Control 313
3.13.5 Data Structures 324
3.13.6 Concluding Observations about x86-64 325
3.14 Machine-Level Representations of Floating-Point Programs 326
3.15 Summary 327
Bibliographic Notes 328
Homework Problems 328
Solutions to Practice Problems 342
4 Processor Architecture 367
4.1 The Y86 Instruction Set Architecture 370
4.1.1 Programmer-Visible State 370
4.1.2 Y86 Instructions 371
4.1.3 Instruction Encoding 373
4.1.4 Y86 Exceptions 378
4.1.5 Y86 Programs 379
4.1.6 Some Y86 Instruction Details 384
4.2 Logic Design and the Hardware Control Language HCL 386
4.2.1 Logic Gates 387
4.2.2 Combinational Circuits and HCL Boolean Expressions 388
4.2.3 Word-Level Combinational Circuits and HCL Integer Expressions 389
4.2.4 Set Membership 394
4.2.5 Memory and Clocking 395
4.3 Sequential Y86 Implementations 398
4.3.1 Organizing Processing into Stages 398
4.3.2 SEQ Hardware Structure 409
4.3.3 SEQ Timing 413
4.3.4 SEQ Stage Implementations 417
4.4 General Principles of Pipelining 425
4.4.1 Computational Pipelines 426
4.4.2 A Detailed Look at Pipeline Operation 427
4.4.3 Limitations of Pipelining 428
4.4.4 Pipelining a System with Feedback 432
4.5 Pipelined Y86 Implementations 434
4.5.1 SEQ+: Rearranging the Computation Stages 434
4.5.2 Inserting Pipeline Registers 435
4.5.3 Rearranging and Relabeling Signals 439
4.5.4 Next PC Prediction 440
4.5.5 Pipeline Hazards 442
4.5.6 Avoiding Data Hazards by Stalling 447
4.5.7 Avoiding Data Hazards by Forwarding 449
4.5.8 Load/Use Data Hazards 452
4.5.9 Exception Handling 454
4.5.10 PIPE Stage Implementations 457
4.5.11 Pipeline Control Logic 465
4.5.12 Performance Analysis 478
4.5.13 Unfinished Business 480
4.6 Summary 483
4.6.1 Y86 Simulators 484
Bibliographic Notes 485
Homework Problems 485
Solutions to Practice Problems 491
5 Optimizing Program Performance 507
5.1 Capabilities and Limitations of Optimizing Compilers 510
5.2 Expressing Program Performance 514
5.3 Program Example 516
5.4 Eliminating Loop Inefficiencies 520
5.5 Reducing Procedure Calls 524
5.6 Eliminating Unneeded Memory References 525
5.7 Understanding Modern Processors 530
5.7.1 Overall Operation 531
5.7.2 Functional Unit Performance 534
5.7.3 An Abstract Model of Processor Operation 536
5.8 Loop Unrolling 543
5.9 Enhancing Parallelism 547
5.9.1 Multiple Accumulators 548
5.9.2 Reassociation Transformation 552
5.10 Summary of Results for Optimizing Combining Code 558
5.11 Some Limiting Factors 559
5.11.1 Register Spilling 559
5.11.2 Branch Prediction and Misprediction Penalties 560
5.12 Understanding Memory Performance 565
5.12.1 Load Performance 565
5.12.2 Store Performance 566
5.13 Life in the Real World: Performance Improvement Techniques 573
5.14 Identifying and Eliminating Performance Bottlenecks 574
5.14.1 Program Profiling 574
5.14.2 Using a Profiler to Guide Optimization 576
5.14.3 Amdahl’s Law 579
5.15 Summary 581
Bibliographic Notes 582
Homework Problems 583
Solutions to Practice Problems 586
6 The Memory Hierarchy 593
6.1 Storage Technologies 595
6.1.1 Random-Access Memory 595
6.1.2 Disk Storage 604
6.1.3 Solid State Disks 615
6.1.4 Storage Technology Trends 617
6.2 Locality 620
6.2.1 Locality of References to Program Data 621
6.2.2 Locality of Instruction Fetches 622
6.2.3 Summary of Locality 623
6.3 The Memory Hierarchy 625
6.3.1 Caching in the Memory Hierarchy 626
6.3.2 Summary of Memory Hierarchy Concepts 629
6.4 Cache Memories 630
6.4.1 Generic Cache Memory Organization 631
6.4.2 Direct-Mapped Caches 633
6.4.3 Set Associative Caches 640
6.4.4 Fully Associative Caches 642
6.4.5 Issues with Writes 645
6.4.6 Anatomy of a Real Cache Hierarchy 646
6.4.7 Performance Impact of Cache Parameters 648
6.5 Writing Cache-friendly Code 649
6.6 Putting It Together: The Impact of Caches on Program Performance 654
6.6.1 The Memory Mountain 655
6.6.2 Rearranging Loops to Increase Spatial Locality 659
6.6.3 Exploiting Locality in Your Programs 663
6.7 Summary 663
Bibliographic Notes 664
Homework Problems 665
Solutions to Practice Problems 676
29 Part II Running Programs on a System
7 Linking 687
7.1 Compiler Drivers 689
7.2 Static Linking 691
7.3 Object Files 691
7.4 Relocatable Object Files 692
7.5 Symbols and Symbol Tables 694
7.6 Symbol Resolution 697
7.6.1 How Linkers Resolve Multiply Defined Global Symbols 698
7.6.2 Linking with Static Libraries 701
7.6.3 How Linkers Use Static Libraries to Resolve References 704
7.7 Relocation 706
7.7.1 Relocation Entries 706
7.7.2 Relocating Symbol References 707
7.8 Executable Object Files 712
7.9 Loading Executable Object Files 713
7.10 Dynamic Linking with Shared Libraries 715
7.11 Loading and Linking Shared Libraries from Applications 717
7.12 Position-Independent Code (PIC) 721
7.13 Tools for Manipulating Object Files 724
7.14 Summary 725
Bibliographic Notes 725
Homework Problems 726
Solutions to Practice Problems 732
8 Exceptional Control Flow 735
8.1 Exceptions 737
8.1.1 Exception Handling 738
8.1.2 Classes of Exceptions 740
8.1.3 Exceptions in Linux/IA32 Systems 742
8.2 Processes 746
8.2.1 Logical Control Flow 746
8.2.2 Concurrent Flows 747
8.2.3 Private Address Space 748
8.2.4 User and Kernel Modes 748
8.2.5 Context Switches 750
8.3 System Call Error Handling 751
8.4 Process Control 752
8.4.1 Obtaining Process IDs 753
8.4.2 Creating and Terminating Processes 753
8.4.3 Reaping Child Processes 757
8.4.4 Putting Processes to Sleep 763
8.4.5 Loading and Running Programs 764
8.4.6 Using fork and execve to Run Programs 767
8.5 Signals 770
8.5.1 Signal Terminology 772
8.5.2 Sending Signals 773
8.5.3 Receiving Signals 776
8.5.4 Signal Handling Issues 779
8.5.5 Portable Signal Handling 786
8.5.6 Explicitly Blocking and Unblocking Signals 787
8.5.7 Synchronizing Flows to Avoid Nasty Concurrency Bugs 789
8.6 Nonlocal Jumps 793
8.7 Tools for Manipulating Processes 796
8.8 Summary 797
Bibliographic Notes 797
Homework Problems 798
Solutions to Practice Problems 805
9 Virtual Memory 809
9.1 Physical and Virtual Addressing 811
9.2 Address Spaces 812
9.3 VM as a Tool for Caching 813
9.3.1 DRAM Cache Organization 814
9.3.2 Page Tables 814
9.3.3 Page Hits 816
9.3.4 Page Faults 816
9.3.5 Allocating Pages 817
9.3.6 Locality to the Rescue Again 818
9.4 VM as a Tool for Memory Management 819
9.5 VM as a Tool for Memory Protection 820
9.6 Address Translation 821
9.6.1 Integrating Caches and VM 825
9.6.2 Speeding up Address Translation with a TLB 825
9.6.3 Multi-Level Page Tables 826
9.6.4 Putting It Together: End-to-end Address Translation 828
9.7 Case Study: The Intel Core i7/Linux Memory System 833
9.7.1 Core i7 Address Translation 834
9.7.2 Linux Virtual Memory System 837
9.8 Memory Mapping 841
9.8.1 Shared Objects Revisited 841
9.8.2 The fork Function Revisited 843
9.8.3 The execve Function Revisited 844
9.8.4 User-level Memory Mapping with the mmap Function 844
9.9 Dynamic Memory Allocation 846
9.9.1 The malloc and free Functions 848
9.9.2 Why Dynamic Memory Allocation 850
9.9.3 Allocator Requirements and Goals 851
9.9.4 Fragmentation 853
9.9.5 Implementation Issues 854
9.9.6 Implicit Free Lists 854
9.9.7 Placing Allocated Blocks 856
9.9.8 Splitting Free Blocks 857
9.9.9 Getting Additional Heap Memory 857
9.9.10 Coalescing Free Blocks 858
9.9.11 Coalescing with Boundary Tags 858
9.9.12 Putting It Together: Implementing a Simple Allocator 861
9.9.13 Explicit Free Lists 869
9.9.14 Segregated Free Lists 870
9.10 Garbage Collection 872
9.10.1 Garbage Collector Basics 873
9.10.2 Mark&Sweep Garbage Collectors 874
9.10.3 Conservative Mark&Sweep for C Programs 876
9.11 Common Memory-Related Bugs in C Programs 877
9.11.1 Dereferencing Bad Pointers 877
9.11.2 Reading Uninitialized Memory 877
9.11.3 Allowing Stack Buffer Overflows 878
9.11.4 Assuming that Pointers and the Objects They Point to Are the Same Size 878
9.11.5 Making Off-by-One Errors 879
9.11.6 Referencing a Pointer Instead of the Object It Points to 879
9.11.7 Misunderstanding Pointer Arithmetic 880
9.11.8 Referencing Nonexistent Variables 880
9.11.9 Referencing Data in Free Heap Blocks 881
9.11.10 Introducing Memory Leaks 881
9.12 Summary 882
Bibliographic Notes 882
Homework Problems 883
Solutions to Practice Problems 887
Part III Interaction and Communication Between Programs
10 System-Level I/O 895
10.1 Unix I/O 896
10.2 Opening and Closing Files 897
10.3 Reading and Writing Files 899
10.4 Robust Reading and Writing with the Rio Package 901
10.4.1 Rio Unbuffered Input and Output Functions 901
10.4.2 Rio Buffered Input Functions 902
10.5 Reading File Metadata 907
10.6 Sharing Files 909
10.7 I/O Redirection 911
10.8 Standard I/O 913
10.9 Putting It Together: Which I/O Functions Should I Use 914
10.10 Summary 915
Bibliographic Notes 916
Homework Problems 916
Solutions to Practice Problems 917
11 Network Programming 919
11.1 The Client-Server Programming Model 920
11.2 Networks 921
11.3 The Global IP Internet 925
11.3.1 IP Addresses 927
11.3.2 Internet Domain Names 929
11.3.3 Internet Connections 933
11.4 The Sockets Interface 934
11.4.1 Socket Address Structures 935
11.4.2 The socket Function 936
11.4.3 The connect Function 937
11.4.4 The open_clientfd Function 937
11.4.5 The bind Function 938
11.4.6 The listen Function 939
11.4.7 The open_listenfd Function 939
11.4.8 The accept Function 941
11.4.9 Example Echo Client and Server 942
11.5 Web Servers 945
11.5.1 Web Basics 945
11.5.2 Web Content 946
11.5.3 HTTP Transactions 948
11.5.4 Serving Dynamic Content 950
11.6 Putting It Together: The Tiny Web Server 953
11.7 Summary 961
Bibliographic Notes 962
Homework Problems 962
Solutions to Practice Problems 963
12 Concurrent Programming 967
12.1 Concurrent Programming with Processes 969
12.1.1 A Concurrent Server Based on Processes 970
12.1.2 Pros and Cons of Processes 971
12.2 Concurrent Programming with I/O Multiplexing 973
12.2.1 A Concurrent Event-Driven Server Based on I/O Multiplexing 976
12.2.2 Pros and Cons of I/O Multiplexing 980
12.3 Concurrent Programming with Threads 981
12.3.1 Thread Execution Model 982
12.3.2 Posix Threads 982
12.3.3 Creating Threads 984
12.3.4 Terminating Threads 984
12.3.5 Reaping Terminated Threads 985
12.3.6 Detaching Threads 985
12.3.7 Initializing Threads 986
12.3.8 A Concurrent Server Based on Threads 986
12.4 Shared Variables in Threaded Programs 988
12.4.1 Threads Memory Model 989
12.4.2 Mapping Variables to Memory 990
12.4.3 Shared Variables 990
12.5 Synchronizing Threads with Semaphores 991
12.5.1 Progress Graphs 994
12.5.2 Semaphores 997
12.5.3 Using Semaphores for Mutual Exclusion 998
12.5.4 Using Semaphores to Schedule Shared Resources 1000
12.5.5 Putting It Together: A Concurrent Server Based on Prethreading 1004
12.6 Using Threads for Parallelism 1008
12.7 Other Concurrency Issues 1013
12.7.1 Thread Safety 1013
12.7.2 Reentrancy 1014
12.7.3 Using Existing Library Functions in Threaded Programs 1016
12.7.4 Races 1017
12.7.5 Deadlocks 1019
12.8 Summary 1022
Bibliographic Notes 1023
Homework Problems 1023
Solutions to Practice Problems 1028
A Error Handling 1033
A.1 Error Handling in Unix Systems 1034
A.2 Error-Handling Wrappers 1035
References 1039
Index 1045