Syllabus

Week

Date (2010)

Topic

Tasks

1a

27-Sep

Administration

  1. Course topics and logistics

  2. Assignment 1

  3. Lecture: Technology trends (Chapters: 1)

1b

29-Sep

Introduction & Background

  1. Assignment 1 intro

     • What are threads?

     • User vs. kernel threads

  2. Lecture: Convergence of parallel architectures (Chapters: 1)

2a

4-Oct

Understanding Concurrency

  1. Assignment 1 data structures, ADTs, and mechanisms

  2. Paper discussion:

2b

6-Oct

Convergence of Parallel Architectures

  1. Assignment 1 questions

  2. Lecture: Convergence of parallel architectures (Chapters: 1)

3a

11-Oct

Convergence of Parallel Architectures

  1. Lecture: Parallel Programming Examples (Chapters: 2,3)

3b

13-Oct

Parallel Programming

  1. Paper discussion: J. B. Dennis and D. P. Misunas, "A Preliminary Architecture for a Basic Data-Flow Processor," Proc. 2nd Annual Symposium on Computer Architecture, Computer Architecture News, 3, 4 (December 1974), 126-132, ACM.

4a

18-Oct

Parallel Programming

  1. Assignment 1 due

  2. Assignment 2 intro

  3. Lecture: Parallel Programming Examples (Chapters: 2,3)

4b

20-Oct

Workload-driven Performance Evaluation

  1. Paper discussion: The SPLASH-2 Programs: Characterization and Methodological Considerations. Steven Woo, Moriyoshi Ohara, Evan Torrie, Jaswinder Pal Singh and Anoop Gupta. In Proceedings of the 22nd International Symposium on Computer Architecture, June 1995.

5a

21-Oct (Fri)
(replacement for 27-Oct)

Parallel Programming,

Workload-driven Performance Evaluation

  1. Lecture: Workload-driven performance evaluation: Parameter space, metrics, interactions between application and architecture (Chapters: 4)

5b

25-Oct

Time, Ordering, and Memory Consistency

  1. Paper discussion (cont'd): The SPLASH-2 Programs: Characterization and Methodological Considerations. Steven Woo, Moriyoshi Ohara, Evan Torrie, Jaswinder Pal Singh and Anoop Gupta. In Proceedings of the 22nd International Symposium on Computer Architecture, June 1995.

  2. Lecture: Ordering issues, program order, sequential order, coherency vs. consistency, memory consistency models, ensuring coherency, ensuring consistency. (Chapters: 5,9)

6a

1-Nov

Time, Ordering, and Memory Consistency

  1. Assignment 2 due

  2. Assignment 3 intro

  3. Start thinking about project proposals.

  4. Paper discussion: Time, Clocks, and the Ordering of Events in a Distributed System. Leslie Lamport. Communications of the ACM, 21(7), pp. 558-565, July 1978.

6b

8-Nov

Memory consistency models

  1. Lecture: Memory consistency models: sequential consistency, weak consistency, release consistency, entry consistency. (Chapters: 5,9)

7a

10-Nov

Memory consistency models

  1. Paper discussion: Memory Consistency Models. D. Mosberger. ACM Operating Systems Review, 27(1), pp. 18-26, January 1993.

7b

11-Nov (Fri)
(replacement for 3-Nov)

Snoop-Based Shared Memory Multiprocessors

  1. Lecture: Providing consistency in bus-based shared memory multiprocessors. Cache coherency/memory consistency protocols, cache misses categorization. (Chapters: 6)

8a

15-Nov

Distributed Shared Memory Multiprocessors

  1. Assignment 3 due

  2. Project proposals due. Once the instructor approves your proposal, start working on your project.

  3. Paper discussion: Correct Memory Operation of Cache-Based Multiprocessors. C. Scheurich and M. Dubois. Proceedings of the 14th International Symposium on Computer Architecture, June 1987, pp. 234-243.

8b

18-Nov (Fri)
(replacement for 17-Nov)

Distributed Shared Memory Multiprocessors

  1. Lecture: Scalable shared memory machines, distributed shared memory, directory protocols, CC-NUMA, COMA, synchronization issues. (Chapters: 7, 8, 9)

9a

22-Nov

Distributed Shared Memory Multiprocessors

  1. Paper Discussion: An Evaluation of Directory Schemes for Cache Coherence. A. Agarwal, R. Simoni, J. Hennessy, M. Horowitz. ISCA'88.

9b

24-Nov

Software Distributed Shared Memory

  1. Lecture: Software shared memory. Shared Virtual Memory, Instrumentation-based software DSM, Implementation issues. (Chapters: 9)

10a

29-Nov

Software Distributed Shared Memory


  1. Paper discussion: Fine-grain Access Control for Distributed Shared Memory, Ioannis Schoinas, Babak Falsafi, Alvin R. Lebeck, Steven K. Reinhardt, James R. Larus, David A. Wood (The Sixth International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS VI), Oct. 1994).

10b

1-Dec

Message Passing Multiprocessors

  1. Lecture: Node-to-network interfaces in message passing systems, programming considerations. Increasing message throughput. (Chapters: 10, 7.7)

11a

6-Dec

Distributed Shared Memory Multiprocessors

  1. Paper discussion: A Systematic Approach to Host Interface Design for High-Speed Networks. Peter A. Steenkiste. IEEE Computer, Volume 27, Issue 3, pp. 47-57, March 1994.

11b

8-Dec

Message Passing Multiprocessors

  1. Lecture: Node-to-network interfaces in message passing systems, programming considerations. Reducing CPU overhead. (Chapters: 10, 7.7)

12a

13-Dec

Message Passing Multiprocessors

  1. Paper discussion: Software Overhead in Message Passing Layers: Where Does the Time Go? Vijay Karamcheti and Andrew Chien. International Conference on Architectural Support for Programming Languages and Operating Systems. October 1994.

12b

15-Dec

Message Passing Multiprocessors

  1. Lecture: Node-to-network interfaces in message passing systems, programming considerations. Reducing message latency. (Chapters: 10, 7.7)

13a

20-Dec

Cell BE

  1. Paper Discussion: Introduction to the Cell multiprocessor. J. A. Kahle, M. N. Day, H. P. Hofstee, C. R. Johns, T. R. Maeurer, and D. Shippy. IBM Journal of Research and Development. Volume 49, Number 4/5, 2005. Points for discussion.

13b

22-Dec

Anton

  1. Paper Discussion: Millisecond-Scale Molecular Dynamics Simulations on Anton. D. E. Shaw et al. Proceedings of the Conference on High Performance Computing, Networking, Storage and Analysis (SC09), Portland, Oregon, November 14-20, 2009.

14a

22-Dec

Project Presentations

  1. Projects due: Presentations, 15-20 mins/project.

  2. End of HY527. 


Further References

  1. Anton, a special-purpose machine for molecular dynamics simulation. D.E. Shaw et al. Proceedings of the 34th annual international symposium on Computer architecture. June 2007. San Diego, California, USA.

  2. A tightly-coupled processor-network interface. Dana S. Henry and Christopher F. Joerg. Proceedings of the 5th International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS). October 12-15, 1992. Boston, Massachusetts, USA.

  3. DDM--A Cache-Only Memory Architecture. E. Hagersten, A. Landin, and S. Haridi. IEEE Computer, 25, 9 (September 1992), 44-54.

  4. Synchronization and Communication in the T3E Multiprocessor. S. Scott. Proceedings of the Seventh International Conference on Architectural Support for Programming Languages and Operating Systems. October 1996, pp 26-36.

  5. Effects of Communication Latency, Overhead, and Bandwidth in a Cluster Architecture. Richard P. Martin, Amin M. Vahdat, David E. Culler, Thomas E. Anderson. ISCA24, Denver, CO, June 1997.

  6. Autonet: a High-speed, Self-configuring Local Area Network with Point-to-point Links. Michael D. Schroeder, Andrew D. Birrell, Michael Burrows, Hal Murray, Roger M. Needham, Thomas L. Rodeheffer, Edwin H. Satterthwaite, and Charles P. Thacker. Technical Report Number 59, Digital Equipment Corporation, Systems Research Center, 42 pages, 30 April 1990.

  7. Deadlock-Free Message Routing in Multiprocessor Interconnection Networks. William J. Dally and Charles L. Seitz. IEEE Trans. on Computers, Vol. C-36, No. 5, pp. 547-553, May 1987.

  8. User-Space Communication: A Quantitative Study. Soichiro Araki, Angelos Bilas, Cezary Dubnicki, Jan Edler, Koichi Konishi and James Philbin. Supercomputing, November 1998.

  9. Memory Coherence in Shared Virtual Memory Systems. K. Li and P. Hudak. ACM Trans. on Computer Systems, 7, 4 (November 1989), 321-359.

  10. System Area Network Mapping. Brent Chun, Alan Mainwaring, Saul Schleimer, Daniel Wilkerson. SPAA'97, Newport, Rhode Island, June 1997.

  11. TreadMarks: Distributed Shared Memory on Standard Workstations and Operating Systems. Pete Keleher, Alan L. Cox, Sandhya Dwarkadas and Willy Zwaenepoel. In The 1994 Winter USENIX Conference.

  12. The Stanford DASH Multiprocessor. D. Lenoski, J. Laudon, K. Gharachorloo, W. Weber, A. Gupta, J. Hennessy, M. Horowitz and M. Lam. IEEE Computer, 25, 3 (March 1992), 63-79.

  13. Algorithms for Scalable Synchronization on Shared-Memory Multiprocessors. J. M. Mellor-Crummey and M. L. Scott. ACM Trans. on Computer Systems, February 1991.

  14. A New Solution to Coherence Problems in Multicache Systems. L. M. Censier and P. Feautrier. IEEE Transactions on Computers, C-27, 12 (December 1978), 1112-1118.

  15. How to Make a Multiprocessor That Correctly Executes Multiprocess Programs. Leslie Lamport. IEEE Trans. on Computers, Vol. C-28, Number 9, pp. 690-691, September 1979.

  16. Parallel Operation in the Control Data 6600. J. E. Thornton. Fall Joint Computer Conference, vol. 26, pp. 33-40, 1964.

  17. Cramming More Components onto Integrated Circuits. G. Moore. Electronics, pp. 114-117, April 1965.

  18. Validity of the Single-Processor Approach to Achieving Large Scale Computing Capabilities. G. M. Amdahl. AFIPS Conference Proceedings, (April 1967), 483-485.