Department of Computer Science
University of Crete, in collaboration with
Computer Architecture and VLSI Systems Laboratory,
ICS-FORTH, Crete, Greece
Spring 2008
Dimitris Nikolopoulos
Computers can today be found in a plethora of systems, and an engineer will without a doubt use computers as components throughout his or her career. Computers are not only used as office equipment or servers. An equally important, if not more important, area is embedded systems, e.g. computers used for industrial control, in cars, and in data and telecommunication equipment, where the computer is an integral part of a larger system. In these embedded systems, software and hardware must cooperate to meet performance constraints.
Computer architecture can be defined as the art of designing computers based on engineering principles and quantitative performance evaluation. In this course we study computer architecture with a focus on why computers are built the way they are.
The textbook for this course is Hennessy and Patterson's Computer Architecture: A Quantitative Approach, 4th edition.
Copies of the 3rd edition of the textbook will be handed out, in due time, only to students registered for the course. Lecture notes and slides will also be handed out to registered students only, via the University's print center. These hand-outs will cover any differences between the 3rd and the 4th editions of the textbook.
The course has HY225 as a hard prerequisite. It assumes that you are not merely familiar (at the level of basic understanding) with concepts of computer organization (e.g. datapath, control, assembly language, cache memories), but that you are comfortable with and genuinely interested in these concepts.
After successfully completing this course you will have thoroughly studied the design principles of modern computer systems, and you will have observed the relations between the design of a processor's instruction set, its micro-architecture, compiler technology and, to some extent, applications. You will also be able to evaluate design alternatives against design goals, using quantitative evaluation methods for performance as well as for other important properties, such as power and reliability.
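As a small illustration of what quantitative evaluation of design alternatives can look like, the sketch below applies Amdahl's Law, one of the quantitative principles of design covered in Chapter 1 of the textbook. The function name and the two design options are hypothetical and chosen only for illustration; they are not taken from the course material.

```python
def amdahl_speedup(fraction_enhanced: float, speedup_enhanced: float) -> float:
    """Overall speedup when only a fraction of the execution time is improved.

    fraction_enhanced: share of the original execution time that benefits (0..1).
    speedup_enhanced:  how much faster that share runs after the enhancement.
    """
    if not 0.0 <= fraction_enhanced <= 1.0:
        raise ValueError("fraction_enhanced must be in [0, 1]")
    if speedup_enhanced <= 0.0:
        raise ValueError("speedup_enhanced must be positive")
    # New time = unaffected part + improved part, relative to an old time of 1.
    new_time = (1.0 - fraction_enhanced) + fraction_enhanced / speedup_enhanced
    return 1.0 / new_time


# Two hypothetical design alternatives (made-up numbers):
# A: speed up 40% of the execution time by a factor of 10.
# B: speed up 80% of the execution time by a factor of 2.
option_a = amdahl_speedup(0.4, 10.0)
option_b = amdahl_speedup(0.8, 2.0)

print(f"Option A overall speedup: {option_a:.4f}")
print(f"Option B overall speedup: {option_b:.4f}")
```

With these assumed numbers, the modest improvement applied to a larger share of the execution time (option B) gives the higher overall speedup, which is the kind of comparison the quantitative approach is meant to support.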
The course includes the following tentative list of topics:
This is a tentative schedule, subject to change. Unless otherwise noted, the lectures are held:
Week | Date | Lecture | Topic | Slides | Literature | Problems (self-study and assignments) from textbook | Solutions to practice problems
1 | Mon 18/2 09-11 RA203 | No class
1 | Wed 20/2 09-11 RA203 | 1 | Quantitative principles of design | Instructor notes, textbook companion slides (protected) | Chapter 1, CA:QA, 4th ed. | None | --
1 | Fri 22/2 09-11 RA203 | 2 | Evaluating technology trends. Measuring power, reliability and performance | Instructor notes, textbook companion slides (protected) | Chapter 1, CA:QA, 4th ed. | None | --
2 | Mon 25/2 09-11 RA203 | 3 | Background: Review of pipelines and hazards | Instructor notes, textbook companion slides (protected) | Appendix A, CA:QA, 4th ed., pp. A.1-A20 | Solve practice problems 1.4, 1.5, 1.6 and 1.7 from CA:QA, 4th ed. | Solutions to problems 1.4-1.7
2 | Wed 27/2 09-11 RA203 | 4 | Background: Final notes on pipelining, review of memory hierarchies and caches | Instructor notes, textbook companion slides (protected) | Appendix A, CA:QA, 4th ed., Appendix C, CA:QA, 4th ed. | Solve practice problems 1.8, 1.9, 1.10 and 1.11 from CA:QA, 4th ed. | Solutions to problems 1.8-1.11
2 | Fri 29/2 09-11 RA203 | 5 | Virtual memory, Instruction-level parallelism, I: Unrolling and branch prediction | Instructor notes, textbook companion slides (protected) | Appendix C, Chapter 2, CA:QA, 4th ed., Section 2.1 | Homework 1 | Solution to Homework 1
3 | Mon 3/3 09-11 RA203 | 6 | Instruction-Level Parallelism: Loop Unrolling, branch prediction, dynamic scheduling | See notes and slides of Lecture 5 | Chapter 2, CA:QA, 4th ed., Sections 2.1-2.5 | Solve problems 2.1, 2.2, 2.7 from the textbook; Machine Assignment 1 | Solutions to problems 2.1, 2.2, 2.7
3 | Wed 5/3 09-11 RA203 | 7 | Instruction-Level Parallelism: Dynamic scheduling, speculation, superscalar and VLIW processors | Instructor notes, textbook companion slides (protected) | Chapter 2, CA:QA, 4th ed., Sections 2.6-2.9 | Solve problems 2.3, 2.8 from the textbook | Solutions to problems 2.3, 2.8
3 | Fri 7/3 09-11 RA203 | 8 | Limits to ILP. Thread-Level parallelism: Simultaneous Multithreading | Instructor notes, textbook companion slides (protected) | Chapter 3, CA:QA, 4th ed., Sections 3.1-3.5 | Study problem 3.1 from textbook | --
4 | Mon 10/3 09-11 RA203 | No class
4 | Wed 12/3 09-11 RA203 | No class
4 | Fri 14/3 09-11 RA203 | No class
5 | Mon 17/3 09-11 RA203 | No class | Homework 2 assigned | Solution to Homework 2
5 | Wed 19/3 09-11 RA203 | No class
5 | Fri 21/3 09-11 RA203 | No class
6 | Mon 24/3 09-11 RA203 | No class
6 | Wed 26/3 09-11 RA203 | 9 | Continued discussion of SMT | No new lecture notes/slides | Chapter 3 in CA:QA, 4th edition | Solve problem 3.1 from the textbook (caution, allocate extra time) | Solution to problem 3.1
6 | Fri 28/3 09-11 RA203 | 10 | Introduction to vector processors | Instructor notes (updated), textbook companion slides (protected) | CA:QA, 4th ed., Appendix F, F.1-F.3 on vector processors | Solve exercises F.1, F.2 from Appendix F of the textbook | none
7 | Mon 31/3 09-11 RA203 | 11 | Vector processors | See updated notes from Lecture 10 | CA:QA, 4th ed., Appendix F, F.4-F.5 on vector processors | Solve exercises F.4, F.5 from Appendix F of the textbook | none
7 | Wed 2/4 09-11 RA203 | 12 | Cache and memory hierarchy optimizations | textbook companion slides | CA:QA, 4th ed., Chapter 5, 5.1-5.2 | none | none
7 | Fri 4/4 09-11 RA203 | 13 | Midterm review and compiler optimizations for caches | No new slides | CA:QA, 3rd ed. (note the change), Chapter 5, 5.5 | none | none
8 | Mon 7/4 09-11 RA203 | Midterm Exam
8 | Wed 9/4 09-11 RA203 | No class (due to elections)
8 | Fri 11/4 09-11 RA203 | 14 and 15 | DRAM. Virtual machines | No new slides | CA:QA, 4th ed., Chapter 5 | Solve exercises 5.1, 5.2, 5.3 from the textbook | Solutions to problems 5.1-5.3
9 | Mon 14/4 09-11 RA203 | 16 | Introduction to multiprocessors | textbook companion slides | CA:QA, 4th ed., Chapter 4 | Solve (and discuss with colleagues) exercises 5.13, 5.14, 5.15 and 5.16 | Solutions to problems 5.13-5.16
9 | Wed 16/4 09-11 RA203 | No class
9 | Fri 18/4 09-11 RA203 | No class
Easter Break
10 | Mon 5/5 09-11 RA203 | No class
10 | Wed 7/5 09-11 RA203 | No class
10 | Fri 9/5 09-11 RA203 | 17 | Introduction to coherence and consistency | No new slides | CA:QA, 4th ed., Chapter 4 (theory of coherence and consistency) | none | Machine Assignment 3
11 | Mon 12/5 09-11 RA203 | 18 | Bus-based multiprocessors and snooping cache coherence protocols | No new slides | CA:QA, 4th ed., Chapter 4, 4.2 and 4.3 | none | none
11 | Wed 14/5 09-11 RA203 | 19 | Bus-based multiprocessors and snooping cache coherence protocols (II) | No new slides | No new reading items | none | none
11 | Fri 16/5 09-11 RA203 | 20 | Directory-based coherence protocols and scalable multiprocessors | textbook companion slides | CA:QA, 4th ed., Chapter 4, 4.4 | Study problems 4.1, 4.2, 4.3 from the textbook | none
12 | Mon 19/5 09-11 RA203 | No class
12 | Wed 21/5 09-11 RA203 | No class
12 | Fri 23/5 09-11 RA203 | No class (unscheduled)
13 | Mon 26/5 09-11 RA203 | 21 | Memory consistency protocols | instructor lecture slides | lecture slides | Study problems 4.16, 4.17, 4.18, 4.19 from the textbook | none
13 | Wed 28/5 09-11 RA203 | 22
13 | Fri 30/5 09-11 RA203 | 23, 24 | TBA | Final exam