TCPP View

This view classifies PDC unplugged activities according to the NSF/IEEE-TCPP Curriculum Initiative on Parallel and Distributed Computing (TCPP 2012), which aims to articulate the core PDC topics that an undergraduate computer science student should cover during their undergraduate education. The TCPP report informed the creation of the PD knowledge area in CS2013 and lists several cross-cutting topics. For now, our focus is on identifying unplugged activities that address each of the topics suggested for the core courses (CS1, CS2, Systems, and DS/A); Advanced/Elective courses are not covered. TCPP uses Bloom's Taxonomy to indicate the expected level of coverage of each topic:

Each topic below is preceded by one of these Bloom levels (Know, Comprehend, or Apply). Use the Bloom classification to gauge the intended level of coverage of the topic in a course.

Architecture Topics (11 Activities)

Topic | Suggested Course | Unplugged Activities
Comprehend Parallel Taxonomy: Flynn's taxonomy, data vs. control parallelism, shared/distributed memory | Systems | DesertIslandsAnalogy; SieveOfErastothenes; See All (2)
Know Superscalar (ILP): Describe opportunities for multiple instruction issue and execution (different instructions on different data) | Systems
Know SIMD/Vector (e.g., AVX, GPU, TPU): Describe uses of SIMD/Vector (same operation on multiple data items), e.g., accelerating graphics for games. | Systems | CoinFlip; See All (1)
Know SIMD/Vector energy effects: Saves energy by sharing one instruction over many instruction executions, whether in parallel (SIMD) or pipelined (vector) | Systems
Comprehend Pipelines (Basic Structure and Organization): Describe the basic pipelining process (multiple instructions can execute at the same time), describe stages of instruction execution | Systems | KitchenAnalogy; PipelineSort; See All (2)
Know Pipelines (Data and control hazards): Show examples of how one pipe stage can depend on a result from another, or how delayed branch resolution can start the wrong instructions in a pipe, which can reduce performance | Systems
Know Pipeline Streams (e.g., GPU): Know that stream-based architecture exists in GPUs for graphics | Systems | PipelineSort; See All (1)
Know MIMD: Identify MIMD instances in practice (e.g., multicore, cluster), and know the difference between execution of tasks and threads | Systems | CoinFlip; JigsawPuzzle; See All (4)
Know MultiThreading: Distinguish multithreading from multicore (based on which resources are shared) | Systems
Comprehend Multicore: Describe how cores share resources (cache, memory) and resolve conflicts | Systems | JigsawPuzzle; MatrixAddition; See All (2)
Know Heterogeneous (e.g., GPU, security, AI, low-power cores): Recognize that multicore may not all be the same kind of core (mix of organizations, instruction sets; high-level explanation of benefits and costs) | Systems | DesertIslandsAnalogy; See All (1)
Know Heterogeneous Architecture Energy (small vs. big cores, CPU vs. GPU, accelerators): Know that heterogeneity saves energy by using power-efficient cores when there is sufficient parallelism | Systems
Know Symmetric Multi-Processor (SMP): Explain the concept of a uniform-access shared memory architecture | Systems
Comprehend Buses for shared-memory implementation: Be able to explain issues related to shared memory being a single resource, limited bandwidth and latency, snooping, and scalability | Systems | MatrixAddition; See All (1)
Know Distributed Memory Latency: Know the concept, implications for scaling, and impact on the work/communication ratio needed to achieve speedup. Awareness of aspects of latency in parallel systems and networks that contribute to the idea of time accumulating over dependent parts or distance | Systems | ParallelAddition; LongDistancePhoneCall; See All (3)
Know Distributed Memory Bandwidth: Know the concept, how it limits sharing, and considerations of data movement cost | Systems | LongDistancePhoneCall; See All (1)
Comprehend Caching: Know the cache hierarchies, and that shared caches (vs. private caches) result in coherency and performance issues for software | Systems | KitchenAnalogy; See All (1)
Know Atomicity: CS2: Show awareness of the significance of thread-safe data structures in libraries. Systems: Explain the need for atomic operations and their use in synchronization, both in local and distributed contexts. | CS2/Systems
Know Interrupts and event handling: CS2: Know that event handling is an aspect of graphical user interfaces. Systems: Know that I/O is mostly interrupt-driven. | CS2/Systems
Know Handshaking: Know the difference between synchronous and asynchronous communication protocols, including in the context of multiple processors. | Systems
Know Process ID, System/Node ID: Explain the need for a process identifier to enable interprocess communication, and for a node identifier in initialization of multiprocessor systems | Systems | StabalizingLeaderElection; See All (1)
Know Floating Point Range: Understand that range is limited; implications of infinities | CS1/CS2/Systems
Know Floating Point Precision: How single and double precision floating point numbers impact software performance; basic ways to avoid loss of precision in calculations | CS1/CS2/Systems
Know Floating Point Error Propagation: Understand NaN and Infinity values and how they affect computations and exception handling | CS2
Know Floating Point IEEE 754 standard: Representation, range, precision, rounding, NaN, infinities, subnormals, comparison, effects of casting to other types | CS1/CS2/Systems
Comprehend Instructions Per Cycle: Explain pros and cons of IPC as a performance metric, various pipelined implementations | Systems
Know Performance Benchmarks: Awareness of various benchmarks (such as LINPACK, NAS Parallel Benchmarks) and how they test different aspects of performance in parallel systems | Systems
Comprehend Peak Performance: Understand peak performance and how it is rarely valid for estimating real performance; illustrate fallacies | Systems
Know MIPS/FLOPS: Know the meaning of the terms | Systems
Comprehend Peak Vs. Sustained Performance: Know the difference between peak and sustained performance, how to define and measure each, and different benchmarks (a worked sketch follows this table) | Systems
Comprehend Power, Energy: Be able to explain the difference between power (rate of energy usage) and energy | Systems
Know Hardware limitations for data-bound computation: Comprehend the hardware limitations in the context of big data and its most relevant V's (volume, velocity) | Systems
Know Pressure imposed by data volume: Know the limitations of storage, memories, and file systems when dealing with very large data volumes | Systems
Know Pressure imposed by data velocity: Know the limitations in bandwidth, memory, and processing when processing very fast data streams, including networking and reliability issues | Systems
Know Cost of data movement across memory hierarchies: Comprehend the difference in speed vs. cost (latency, energy) of accessing the different levels of the memory hierarchy, and of changing their sizes, in the context of multiple processors and hierarchies | Systems
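
To make the peak vs. sustained distinction concrete, here is a minimal sketch (not taken from the TCPP report or any listed activity) that computes a theoretical peak FLOP rate from assumed hardware parameters and compares it to a hypothetical measured rate; all numbers are illustrative placeholders.

```python
# Hypothetical hardware parameters -- replace with values for a real machine.
cores = 8                   # number of cores
clock_hz = 3.0e9            # clock rate in Hz
flops_per_cycle = 16        # e.g., 2 FMA units x 4-wide SIMD x 2 flops per FMA

peak_flops = cores * clock_hz * flops_per_cycle       # theoretical peak

# Suppose a benchmark performed 2e12 floating-point operations in 8.5 seconds.
measured_ops = 2.0e12
elapsed_s = 8.5
sustained_flops = measured_ops / elapsed_s

print(f"Peak:      {peak_flops / 1e9:.1f} GFLOP/s")
print(f"Sustained: {sustained_flops / 1e9:.1f} GFLOP/s")
print(f"Fraction of peak: {sustained_flops / peak_flops:.1%}")
```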


Programming Topics (24 Activities)

Topic | Suggested Course | Unplugged Activities
Comprehend Concurrency and Parallelism: Understand that concurrency is an algorithmic property; it exposes the potential for parallelization. If concurrency is present in an algorithm, it can be parallelized; without concurrency there is no scope for parallelization. Concurrency can be present in a sequential program; parallelization takes advantage of concurrency to increase performance. | CS1/CS2/DSA | SweetenJuice; PBJinParallel; See All (2)
Know SIMD: Understand common vector operations including element-by-element operations and reductions | CS2/Systems
Know SIMD (Processor vector extensions): Know examples, e.g., Intel AVX or Power VSX macros | Systems
Apply Shared Memory: Be able to write correct thread-based programs (protecting shared data) and understand how to obtain speedup. | CS2/DSA | ArrayAddition; See All (1)
Know Shared Memory Language Extensions: Know about language extensions for parallel programming. Illustrate with examples from Cilk (spawn/join), Java (Java threads), or other languages. | CS2/DSA | FlowerJoinAnalogy; See All (1)
Comprehend Shared Memory Compiler Directives: Understand what simple directives, such as those of OpenMP, mean (parallel for, concurrent section pragmas); show examples | CS2/DSA
Comprehend Shared Memory Libraries: Know one in detail, and know of the existence of some other example libraries such as Pthreads, Pfunc, Intel's TBB (Threading Building Blocks), Microsoft's TPL (Task Parallel Library), C++ threads, etc. | CS2/DSA
Know Distributed Memory: Know basic notions of messaging among processes, different ways of message passing, and collective operations | Systems/DSA | MessagePassingAcrobats; ByzantineGenerals; See All (5)
Comprehend Client Server and Peer to Peer models: Know the notions of invoking and providing services (e.g., RPC, RMI, web services) and understand these as concurrent processes; know about the network model of distributed computing (e.g., sockets); know that in distributed systems such handshaking interaction is crucial for efficient communication between asynchronous processes | Systems
Know Hybrid: Know the notion of programming over multiple classes of machines simultaneously (CPU, GPU, TPU, etc.) | Systems | DesertIslandsAnalogy; MatrixAddition; See All (2)
Apply Task/thread spawning: Be able to write correct programs with threads, synchronize them (fork-join, producer/consumer, master/worker, etc.), and use dynamic thread creation (in number and possibly recursively) (e.g., Pthreads, Java threads, etc.); builds on the shared memory topic above | CS2/DSA
Know Event-Driven Execution: Know about the need for event-driven execution and possible approaches to implementing it. Know about the notion of causality among events (e.g., remote file access, GUI). These effects may be easier to discuss in the context of distributed systems. | CS2
Comprehend SPMD: Understand how an SPMD program is written and how it executes | CS2 | DesertIslandsAnalogy; See All (1)
Comprehend SPMD Notations: Know of the existence of highly threaded data parallel notations (e.g., CUDA, OpenCL), message passing (e.g., MPI), and some others (e.g., Global Arrays, BSP library) | CS2/DSA
Apply Data parallel: Be able to write a correct data-parallel program for shared-memory machines and get speedup; should do an exercise. | CS2/DSA | ArrayAddition; SieveOfErastothenes; See All (3)
Apply Parallel loops for shared memory: Know, through an example, one way to implement parallel loops, and understand collisions/dependencies across iterations (e.g., OpenMP, Intel's TBB) for shared memory | CS2/DSA
Know Tasks and threads: Understand what it means to create and assign work to threads/processes in a parallel program, and know of at least one way to do that (e.g., OpenMP, Intel TBB, etc.) (0.5 hours) | CS2/DSA/Systems/ProgLang | MoreProcessorsNotAlwaysBetter; CompanyAnalogy; See All (2)
Know MapReduce: Understand how problems can be solved with MapReduce, and how algorithms can be written using map and reduce (a minimal sketch follows this table) | CS2
Know Offloading to accelerators: Know about running parts of applications on accelerators (e.g., GPU, TPU, FPGA) | CS2/Systems
Apply Tasks and Threads: Be able to write parallel programs that create and assign work to threads/processes, in at least one parallel environment (e.g., OpenMP, Intel TBB, pthreads, etc.) | CS2/DSA/Systems
Apply Synchronization: Be able to write shared memory programs with critical regions and producer-consumer communication and get speedup; know the notions of mechanisms for concurrency (monitors, semaphores, etc.) | CS2/DSA/Systems
Apply Critical regions: Be able to write shared memory programs that use critical regions for synchronization (see the lock sketch after this table) | CS2/DSA/Systems | ArrayAddition; SweetenJuice; See All (2)
Apply Producer-consumer: Be able to write shared memory programs that use the producer-consumer pattern to share data and synchronize threads | CS2/DSA/Systems
Comprehend Concurrency Issues: Understand the notions of deadlock (detection, prevention), race conditions (definition), and determinacy/non-determinacy in parallel programs (e.g., if there is a race condition, the correctness of the output may depend on the order of execution) | DSA/Systems | PBJinParallel; See All (1)
Comprehend Deadlock/Livelock: Understand what deadlock and livelock are, and methods for detecting and preventing them; also cast in terms of distributed systems | DSA/Systems | PBJinParallel; OrangeGame; See All (2)
Comprehend Starvation: Understand how starvation (of a thread or process) can occur, in the context of an example (e.g., dining philosophers) | DSA/Systems
Comprehend Race Condition: Know what a race condition is, and how to use synchronization to prevent it | DSA/Systems | SurvivorAnalogy; ArrayAddition; See All (6)
Comprehend Computation Decomposition Strategies: Understand different ways to assign computations to threads or processes | CS2/DSA | PlantingTrees; StuffingEnvelopes; See All (7)
Comprehend Owner Computes Rule: Understand how to assign loop iterations to threads based on which thread/process owns the data element(s) written in an iteration | CS2/DSA
Comprehend Decomposition into Atomic Tasks: Understand how to decompose computations into tasks with communication only at the beginning and end of each task, and assign them to threads/processes | CS2/DSA
Know Decomposition into Map and Reduce tasks: Know that divide and conquer can be expressed and programmed as a Map and Reduce decomposition | CS2
Comprehend Load balancing: Understand the effects of load imbalances on performance, and ways to balance load across threads or processes (1 hour) | DSA/Systems | PlantingTrees; FindOldestPenny; See All (5)
Comprehend Scheduling and mapping: Understand the importance of the programmer, compiler, and/or runtime system mapping and scheduling computations to threads/processes, both statically and dynamically | DSA/Systems | JigsawPuzzle; CompanyAnalogy; See All (2)
Know Effect of timing failures/delay in distributed systems: Understand that a failure in one node can cause a global failure in a distributed system. For example, one could use waiting on a non-terminating program to illustrate a failure scenario in distributed systems (e.g., in the context of consensus). | CS2
Know Data: Understand the impact of data distribution, layout, and locality on performance | DSA | JigsawPuzzle; See All (1)
Know Data layout: Know how to lay out data in memory to improve performance (memory hierarchy in a shared memory parallel system) | DSA/Systems | KitchenAnalogy; See All (1)
Know Data False sharing: Know that for cache-coherent shared memory systems, data is kept coherent in blocks, not individual words, and how to avoid false sharing of a block's data across threads | DSA/ProgLang | KitchenAnalogy; See All (1)
Know Data Energy Impact: Know the energy cost of loading data into memory from secondary storage (and writing modified data back to secondary storage) | Systems
Know Data Representation: Power savings from using smaller data representations (64-bit vs. 32-bit vs. 16-bit floating-point precision, 32-bit vs. 16-bit integers). For example, machine learning on GPUs is driving lower (16-bit) floating-point precision. | DSA/Systems
Know Data locality: Know what spatial and temporal locality are, and how to organize data to take advantage of them | DSA/ProgLang | KitchenAnalogy; See All (1)
Know Performance impact of data movement: Know the performance cost of moving data to secondary storage for big data analysis, and distributed vs. centralized analysis. Be aware of specific mechanisms that take advantage of locality (e.g., in-situ processing) | Systems
Know Structured vs unstructured data: Know the differences and tradeoffs between these data representations | DSA
Know Graph representations and databases: Know of graph representations of data to facilitate graph analysis | DSA
Know Distributed file systems: Be aware of the existence of distributed file systems and common examples of where they are used and why | Systems
Know Performance Tools: Know of tools for runtime monitoring (e.g., gprof, perf, Intel Performance Toolkit, TAU) | DSA/Systems
Comprehend Speedup: Understand how to compute speedup, and what it means | CS2/DSA | ParallelAddition; FindSmallestCard; See All (4)
Comprehend Efficiency: Understand how to compute efficiency, and why it matters | CS2/DSA
Comprehend Parallel Scalability: Understand that speedup and efficiency are single-point measures for a particular problem size and number of processes/threads, and that these metrics change as problem size and/or number of processes/threads vary. Understand that scalability is a metric that measures how speedup varies as problem size and/or number of processes/threads vary. | CS2/DSA
Know Amdahl's Law: Know that speedup is limited by the sequential portion of a parallel program, if problem size is kept fixed (a worked sketch follows this table) | CS2/DSA | ParallelAddition; MoreProcessorsNotAlwaysBetter; See All (3)
Know Gustafson's Law: Understand the idea of weak scaling, where problem size increases as the number of processes/threads increases | CS2/DSA
Know Power-latency tradeoff: Be familiar with the notion that problem decompositions (including their granularity) and active/idle states (e.g., including modulation of CPU frequency) may be exploited to adjust the balance among throughput, latency, and energy consumption. | Systems
Know Energy efficiency vs. load balancing: Be aware that unbalanced work decomposition and communication congestion can prolong computation and reduce energy efficiency. | Systems
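
For the Critical regions, Synchronization, and Race Condition rows, here is a minimal Python sketch (our own illustration, not one of the listed activities): several threads increment a shared counter, and a lock makes the read-modify-write sequence a critical region so the final count is deterministic.

```python
import threading

counter = 0                      # shared data
lock = threading.Lock()          # protects the critical region

def worker(n_increments):
    global counter
    for _ in range(n_increments):
        with lock:               # critical region: only one thread at a time updates counter
            counter += 1

threads = [threading.Thread(target=worker, args=(100_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter)   # always 400000; without the lock, lost updates (a race) may make it smaller
```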

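For the Speedup, Efficiency, and Amdahl's Law rows, a small worked calculation (our own, with made-up numbers) shows how the serial fraction caps speedup and how efficiency falls as processors are added.

```python
def amdahl_speedup(serial_fraction, p):
    """Speedup on p processors when a fraction of the work is inherently serial."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / p)

s = 0.10                                  # assume 10% of the program is serial
for p in (1, 2, 4, 8, 16, 1_000_000):
    sp = amdahl_speedup(s, p)
    print(f"p={p:>7}: speedup={sp:5.2f}, efficiency={sp / p:.3f}")
# As p grows, speedup approaches 1/s = 10, no matter how many processors are added.
```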

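For the MapReduce and "Decomposition into Map and Reduce tasks" rows, this is a minimal sequential sketch of the programming model (an illustration, not a distributed implementation): a map function emits key/value pairs, the pairs are grouped by key, and a reduce function combines the values for each key.

```python
from collections import defaultdict

def map_fn(line):
    """Map: emit (word, 1) for every word in a line."""
    return [(word, 1) for word in line.split()]

def reduce_fn(word, counts):
    """Reduce: combine all counts emitted for one word."""
    return word, sum(counts)

lines = ["the quick brown fox", "the lazy dog", "the fox"]

# Map phase: apply map_fn to every input record.
pairs = [pair for line in lines for pair in map_fn(line)]

# Shuffle phase: group emitted values by key.
groups = defaultdict(list)
for word, count in pairs:
    groups[word].append(count)

# Reduce phase: each group is independent, so these calls could run in parallel.
result = dict(reduce_fn(w, c) for w, c in groups.items())
print(result)   # {'the': 3, 'quick': 1, 'brown': 1, 'fox': 2, 'lazy': 1, 'dog': 1}
```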
Algorithms Topics (24 Activities)

Topic | Suggested Course | Unplugged Activities
Comprehend Concurrency, Asynchrony, Dependencies, and Nondeterminism: Qualitatively understand the notions of concurrency, asynchrony, dependencies, and nondeterminism through one or more everyday examples illustrating simultaneous events with dependencies. The examples can be non-computational in CS1; for example, preheating the oven and preparing the batter for a cake can proceed concurrently and asynchronously to save time, but both these events must finish before baking starts (dependency). Computational examples of the appropriate level can be used in CS2. The goal for the instructor is to develop a baseline of ideas upon which to build the PDC concepts. | CS1/CS2 | StabalizingLeaderElection; ParallelGarbageCollection; See All (4)
Comprehend Asymptotics: Be able to describe upper (big-O) and lower (big-Omega) bounds in the context of PDC, understanding that the functions whose order is being analyzed may have an additional variable related to the number of processing elements in the PDC context. | DSA/CS2
Comprehend Time: Recognize time as a fundamental computational resource that can be influenced by parallelism. | DSA | PlantingTrees; ParallelAddition; See All (6)
Comprehend Work: Understand the definition of computational work and how it is different from time. Be able to observe its impact on complexity measures such as time, speedup, energy consumption, etc. | DSA
Comprehend Space/Memory: Recognize space/memory in the same manner as time | DSA
Comprehend Memory and Communication complexity: Understand that data movement (such as memory accesses or communication) may take more wall-clock time and energy than computation. Understand also that in certain distributed "big data" applications, moving data is cost prohibitive. | DSA
Comprehend Speedup (Scalability): Recognize the use of parallelism either to solve a given problem instance faster or to solve a larger instance in the same time. Understand and be able to explain why there is an upper limit on speedup for a given problem of a given size (Amdahl's law). | DSA | ParallelAddition; FindSmallestCard; See All (8)
Know Efficiency, Scalability, Throughput: Know about the notions of efficiency, strong and weak scaling, and throughput. | DSA
Know Model(s) to abstract from architectural details (for example PRAM (shared memory) and/or completely connected (network)): Understand concurrency basics without the trappings of real systems (routing, data alignment, etc.). Recognize the PRAM as embodying the simplest forms of parallel computation: embarrassingly parallel problems can be sped up easily just by employing many processors. Recognize how a completely connected network abstracts away from routing details. Recognize the difference between the model(s) and real systems. Such a MODEL of CHOICE (MoC) is assumed to be adopted by the instructor, on which PDC concepts are discussed. | DSA
Know Notions from scheduling: Understand how to decompose a problem into tasks | DSA | StuffingEnvelopes; See All (1)
Apply Dependencies: Understand how dependencies constrain the execution order of sub-computations (thereby lifting one from the limited domain of "embarrassing parallelism" to more complex computational structures) and how deleterious race conditions can occur when dependencies among concurrent tasks are not respected. Also understand that there are situations where not respecting the order of certain computations may result in nondeterministic, but acceptable, results. | CS1/CS2/DSA | PlantingTrees; StuffingEnvelopes; See All (6)
Comprehend Task Graphs: See multiple examples of this concrete algorithmic abstraction as a mechanism for exposing inter-task dependencies. These graphs, which are also used in compiler analyses, form the level at which parallelism is exposed and exploited | DSA/SWEng
Know Make/Span: Observe analyses in which makespan is identified with parallel time (basically, time to completion); see the task-graph sketch after this table | DSA
Apply Divide & Conquer (parallel aspects): Recognize that the same structure that enables divide-and-conquer (sequential) algorithms exposes opportunities for parallel computation. Examples include mergesort, numerical integration (trapezoid rule, Simpson's rule), or (at a more advanced level) Strassen's matrix multiply. | CS2/DSA | ParallelAddition; FindSmallestCard; See All (3)
Know Load Balancing: Understand that equitably sharing the computational/communication load among processors benefits the entire algorithm; that is, the most stretched processor determines the performance of the entire algorithm. Use a simple task graph (for example) with a small number of processors to illustrate the idea. | DSA/CS2 | PlantingTrees; SieveOfErastothenes; See All (2)
Comprehend Reduction: Recognize and use the tree structure implicit in applications such as scalar product, mergesort, histogram, MapReduce, etc. | DSA
Apply Synchronization: Recognize that synchronization is necessary for certain algorithms to work correctly. Also recognize that synchronization should be used only when needed, as it entails its own overheads. | CS1/CS2/DSA | SweetenJuice; PBJinParallel; See All (2)
Know Parallel Prefix (Scan): Observe, via examples, this "high-level" algorithmic tool. For instance, polynomial evaluation can be done as a prefix product (of powers of x), then a set of multiplications, followed by a reduction. | DSA
Know Other multi-party communication patterns: Recognize common multi-party communication patterns such as broadcast, gather, scatter, and all-to-all or many-to-many communications, and their use as building blocks for parallel and distributed algorithms. Illustrate using block matrix transpose, shuffle from MapReduce, etc. | CS2/DSA
Comprehend Mutual Exclusion and Conflict Resolution: Understand the need to resolve conflicts among concurrent processes competing for a shared resource. Here the computation may have to grant exclusive access to one process to ensure correctness and/or progress. Be able to identify and mitigate problems due to races. Instructors may provide examples such as (a) selecting an airline seat that may be simultaneously competed for by several passengers, (b) selecting which customer gets an item when more than one tries to buy it simultaneously, (c) mutual exclusion in the context of Java threads, (d) dining philosophers. | CS2/DSA | SweetenJuice; See All (1)
Comprehend Reduction and Broadcast for communication and synchronization: Understand, for example, how recursive doubling can be used for all-to-one reduction, and its dual, one-to-all broadcast, in log(p) steps. The same applies to all-to-all broadcast and all-to-all reduction. Be aware of the synonyms for these operations in the jargon associated with different areas; for example, all-to-all broadcast may be referred to as "gossip" or "total exchange". Recognize that all-to-all broadcast/reduction are synchronizing operations in a distributed (event-driven) environment. | DSA
Comprehend Parallel Prefix (Scan): Understand the structure of at least one simple parallel prefix algorithm, for example on a PRAM-type model. One could consider recursive or iterative approaches (such as those of Ladner-Fischer, Kogge-Stone, Brent-Kung); a scan sketch follows this table | DSA
Know Termination detection: Observe that, unlike the sequential case, processes in parallel and distributed algorithms may not know when the problem has been solved, or even if their part of the problem is solved; so termination has to be addressed explicitly. In some cases (such as reduction, tree algorithms, divide and conquer) it may be possible for a process to terminate on the basis of local information (for example, a node has passed its information to its parent in a reduction tree). In other cases, a global check may be necessary. | DSA
Know Sorting: Observe several sorting algorithms for varied platforms, together with analyses. Parallel merge sort is the simplest example, but equally simple alternatives for rings and meshes might be covered also; more sophisticated algorithms might be covered in more advanced courses | CS2/DSA | NondeterministicSorting; PipelineSort; See All (6)
Know Search: With the help of BFS- or DFS-like parallel search in a tree, graph, or solution space, understand speedup anomalies and the fact that certain algorithms don't lend themselves to parallelization without modifying the semantics of the original problem. | DSA | ParallelGarbageCollection; See All (1)
Know Algorithms for streams: Comprehend the notion of efficient algorithms (e.g., Bloom filters, heavy hitters) and structures (e.g., distributed hash tables) for stream data, and the difficulty of dealing with limited space. | CS1/DSA
Apply Deeper Algorithmic Experience: Experience, through class instruction and an assignment/project, the design, analysis, and implementation aspects of at least one parallel or distributed algorithm of choice in detail. Master PDC algorithmic concepts through a detailed exploration, including recognizing how algorithm design reflects the structure of the computational problem(s) and the PDC model/environment. Possible computational problems to explore include matrix product, MapReduce, sorting, search, convolution, or a graph algorithm of your choice. | CS2/DSA | SieveOfErastothenes; MatrixAddition; See All (2)
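
For the Parallel Prefix (Scan) rows, here is a minimal sketch (our own illustration, simulated sequentially) of a Hillis-Steele style inclusive scan; each round could be executed by all processors at once, and only ceil(log2 n) rounds are needed.

```python
def inclusive_scan(values, op=lambda a, b: a + b):
    """Hillis-Steele style inclusive scan, simulated round by round.

    Every element of a round depends only on the previous round's array,
    so all elements of a round could be computed in parallel.
    """
    x = list(values)
    n = len(x)
    d = 1
    while d < n:
        # In a real parallel setting, all iterations of this comprehension run at once.
        x = [x[i] if i < d else op(x[i - d], x[i]) for i in range(n)]
        d *= 2
    return x

print(inclusive_scan([3, 1, 7, 0, 4, 1, 6, 3]))
# [3, 4, 11, 11, 15, 16, 22, 25] -- prefix sums computed in log2(8) = 3 rounds
```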

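For the Task Graphs, Make/Span, and Load Balancing rows, a small sketch (our own, with a made-up task graph) computes total work and the critical path of a task DAG; even with unlimited processors, the parallel time (makespan) cannot be less than the critical path.

```python
# A tiny task graph: task -> (cost, list of tasks it depends on).
tasks = {
    "a": (2, []),
    "b": (3, ["a"]),
    "c": (1, ["a"]),
    "d": (4, ["b", "c"]),
}

def critical_path(tasks):
    """Longest cost-weighted dependency chain in the DAG."""
    finish = {}
    def earliest_finish(t):
        if t not in finish:
            cost, deps = tasks[t]
            finish[t] = cost + max((earliest_finish(d) for d in deps), default=0)
        return finish[t]
    return max(earliest_finish(t) for t in tasks)

total_work = sum(cost for cost, _ in tasks.values())
print("Total work (time on 1 processor):", total_work)               # 10
print("Critical path (lower bound on makespan):", critical_path(tasks))  # a -> b -> d = 9
```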


Emerging and Pervasive Topics (3 Activities)

Topic | Suggested Course | Unplugged Activities
Know Emerging Distributed Systems: Know about different emerging distributed systems (e.g., Internet of Things, Edge Computing, Cyber-Physical Systems, Software Defined Networks, Social Networking, Web Services, AI, ML, etc.). | CS1/CS2
Comprehend Asynchrony: 1) Understand the cause and effect of asynchrony and how to ensure computational correctness. 2) Understand that asynchrony is a characteristic of modern systems, and understand asynchrony in the PDC context. 3) Be able to utilize a standard coordination strategy to prevent incorrect operation due to uncontrolled concurrency (race conditions). | CS1/CS2/DSA | ParallelGarbageCollection; SweetenJuice; See All (3)
Apply Concurrency and Dependency: 1) Understand that concurrency is an algorithmic property, and the difference between concurrency and parallelism. 2) Understand that sequential dependencies limit the degree of concurrency and hence parallelism. 3) Identify sequential dependencies in algorithms. | CS1/CS2/DSA | MoreProcessorsNotAlwaysBetter; See All (1)
Apply Locality: 1) Understand locality in space and time. 2) Be able to describe one case of locality affecting behavior (e.g., a web cache). 3) Know about some algorithm/program sensitive to some artifact of architectural locality. 4) Implement some programs that are optimized around some form of locality (see the sketch after this table). 5) Be aware of at least two forms of locality. | CS1/CS2/DSA/Systems | KitchenAnalogy; See All (1)
Apply Performance: 1) Understand what to measure and how. 2) Understand that space, time, and energy are fundamental commodities to measure. 3) Be able to implement/tune some algorithms around a performance metric. 4) Comprehend, explore, and analyze speedup, efficiency, and scalability. | CS1/CS2/DSA/Systems
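
For the Apply Locality topic, here is a minimal sketch (our own, not one of the listed activities) that contrasts row-major and column-major traversal of the same 2-D data. The row-order loop typically runs measurably faster because consecutive accesses stay within one row's storage (spatial locality); the effect is much larger in languages with contiguous arrays such as C or NumPy than with Python's nested lists.

```python
import time

N = 2000
# A 2-D array stored as nested lists, indexed matrix[i][j].
matrix = [[1] * N for _ in range(N)]

def sum_row_major(m):
    total = 0
    for i in range(N):          # walk each row left to right: good spatial locality
        for j in range(N):
            total += m[i][j]
    return total

def sum_column_major(m):
    total = 0
    for j in range(N):          # walk down columns: a different row's storage on every access
        for i in range(N):
            total += m[i][j]
    return total

for fn in (sum_row_major, sum_column_major):
    start = time.perf_counter()
    fn(matrix)
    print(f"{fn.__name__}: {time.perf_counter() - start:.3f} s")
```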