TCPP View

This view classifies PDC unplugged activities according to the NSF/IEEE-TCPP Curriculum Initiative on Parallel and Distributed Computing (TCPP 2012), which articulates the core PDC topics that an undergraduate computer science student should cover in the course of their education. The TCPP report informed the creation of the PD knowledge area in CS2013, and it also lists several cross-cutting topics. For now, our focus is on identifying unplugged activities that address each of the topics suggested for Core courses (CS1, CS2, Systems, and DS/A); Advanced/Elective courses are not covered. TCPP uses Bloom's taxonomy to indicate the intended level of coverage of each topic:

Each topic below is preceded by one of these terms (Know, Comprehend, or Apply); use this Bloom classification to gauge the level of topic coverage in a course.

Architecture Topics (9 Activities)

Topic | Suggested Course | Unplugged Activities
Comprehend Parallel Taxonomy: Flynn's taxonomy, data vs. control parallelism, shared/distributed memory (0.5 hours) | Systems | DesertIslandsAnalogy; SieveOfErastothenes; See All (2)
Know Superscalar (ILP): Describe opportunities for multiple instruction issue and execution (0.25-1 hours) | Systems |
Know SIMD/Vector (e.g., SSE, Cray): Describe uses of SIMD/Vector (same operation on multiple data items), e.g., accelerating graphics for games (0.1-0.5 hours) | Systems | CoinFlip; See All (1)
Know Pipelines (single vs. multicycle): Describe the basic pipelining process (multiple instructions can execute at the same time) and the stages of instruction execution (1-2 hours) | Systems | KitchenAnalogy; See All (1)
Know Streams (e.g., GPU): Know that stream-based architecture exists in GPUs for graphics (0.1-0.5 hours) | Systems |
Know MIMD: Identify MIMD instances in practice (e.g., multicore, cluster), and know the difference between execution of tasks and threads (0.1-0.5 hours) | Systems | CoinFlip; JigsawPuzzle; See All (4)
Know Simultaneous MultiThreading: Distinguish SMT from multicore (based on which resources are shared) (0.2-0.5 hours) | Systems |
Comprehend Multicore: Describe how cores share resources (cache, memory) and resolve conflicts (0.5-1 hours) | Systems | JigsawPuzzle; MatrixAddition; See All (2)
Know Heterogeneous (e.g., Cell, on-chip GPU): Recognize that a multicore processor may not have all the same kind of core (0.1-0.5 hours) | Systems | DesertIslandsAnalogy; See All (1)
Comprehend Shared vs. distributed memory (SMP buses): Single resource, limited bandwidth and latency, snooping, scalability issues (0.5-1 hours) | Systems | MatrixAddition; See All (1)
Know Message Passing Latency: Know the concept, implications for scaling, impact on work/communication ratio to achieve speedup (0.2-0.5 hours) | Systems | ParallelAddition; LongDistancePhoneCall; See All (3)
Know Message Passing Bandwidth: Know the concept, how it limits sharing, and considerations of data movement cost (0.1-0.5 hours) | Systems | LongDistancePhoneCall; See All (1)
Comprehend Cache organization: Know that cache hierarchies and shared caches (as opposed to private caches) result in coherency and performance issues for software (0.2-1 hours) | Systems | KitchenAnalogy; See All (1)
Know Floating Point Range: Understand that range is limited, and the implications of infinities | CS1/CS2/Systems |
Know Floating Point Precision: How single- and double-precision floating point numbers impact software performance (0.1-0.5 hours) | CS1/CS2/Systems |
Know Floating Point Error Propagation: Understand NaN and Infinity values and how they affect computations and exception handling (0.1-0.5 hours) | CS2 |
Know Floating Point IEEE 754 standard: Representation, range, precision, rounding, NaN, infinities, subnormals, comparison, effects of casting to other types (0.5-1 hours); see the floating-point sketch after this table | CS1/CS2/Systems |
Comprehend Cycles per Instruction (CPI): Number of clock cycles for instructions; understand the performance of processor implementations, including various pipelined implementations (0.25-1 hours); see the worked CPI example after this table | Systems |
Know Performance Benchmark - Spec Mark: Awareness of pitfalls in relying on averages (different averages can alter perception of which architecture is faster) (0.25-0.5 hours) | Systems |
Comprehend Peak Performance: Understand peak performance and how it is rarely valid for estimating real performance; illustrate fallacies (0.1-0.5 hours) | Systems |
Know MIPS/FLOPS: Understand the meaning of the terms (0.1 hours) | Systems |
Comprehend Peak vs. Sustained Performance: Know the difference between peak and sustained performance, and how to define and measure each with different benchmarks (0.1-0.5 hours) | Systems |
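
The floating-point rows above lend themselves to a live demonstration. Below is a minimal sketch in Python (whose `float` is an IEEE 754 double); the specific constants are ours, chosen only to illustrate limited precision, limited range, and special-value propagation.

```python
import math

# Precision is finite: 0.1 has no exact binary representation,
# so summing it ten times does not yield exactly 1.0.
total = sum([0.1] * 10)
print(total == 1.0)                   # False
print(math.isclose(total, 1.0))       # True: compare with a tolerance instead

# Range is limited: overflow saturates to infinity.
print(1e308 * 10)                     # inf

# Special values propagate through computations.
inf = float("inf")
print(inf - inf)                      # nan
print(float("nan") == float("nan"))   # False: NaN is unequal even to itself
```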

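The CPI row can likewise be backed by a short worked calculation. This is a sketch with a hypothetical instruction mix; the fractions and cycle counts are invented for illustration.

```python
# Hypothetical instruction mix: (fraction of instructions, cycles per class).
mix = {
    "ALU":    (0.50, 1),
    "load":   (0.30, 2),
    "branch": (0.20, 3),
}

# CPI is the weighted average of the per-class cycle counts.
cpi = sum(frac * cycles for frac, cycles in mix.values())
print(cpi)  # 0.5*1 + 0.3*2 + 0.2*3 = 1.7

# Execution time = instruction count * CPI / clock rate.
insts, clock_hz = 1e9, 2e9
print(insts * cpi / clock_hz, "seconds")  # 0.85
```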

Programming Topics (24 Activities)

Topic | Suggested Course | Unplugged Activities
Know SIMD: Understand common vector operations including element-by-element operations and reductions (0.5 hours) | CS2/Systems |
Know SIMD (processor vector extensions): Know examples, e.g., SSE/AltiVec macros | Systems |
Apply Shared Memory: Be able to write correct thread-based programs (protecting shared data) and understand how to obtain speedup (2 hours) | CS2/DSA/ProgLang | ArrayAddition; See All (1)
Know Shared Memory Language Extensions: Know about language extensions for parallel programming; illustrations from Cilk (spawn/join) and Java (Java threads) | CS2/DSA/ProgLang | FlowerJoinAnalogy; See All (1)
Comprehend Shared Memory Compiler Directives: Understand what simple directives, such as those of OpenMP, mean (parallel for, concurrent section); show example pragmas | CS2/DSA/ProgLang |
Comprehend Shared Memory Libraries: Know one in detail, and know of the existence of other example libraries such as Pthreads, Pfunc, Intel's TBB (Threading Building Blocks), Microsoft's TPL (Task Parallel Library), etc. | CS2/DSA/ProgLang |
Comprehend Distributed Memory: Know basic notions of messaging among processes, different ways of message passing, collective operations (1 hour) | DSA/Systems | MessagePassingAcrobats; ByzantineGenerals; See All (5)
Comprehend Client Server: Know notions of invoking and providing services (e.g., RPC, RMI, web services) and understand these as concurrent processes (1 hour) | DSA/Systems |
Know Hybrid: Know the notion of programming over multiple classes of machines simultaneously (CPU, GPU, etc.) (0.5 hours) | Systems | DesertIslandsAnalogy; MatrixAddition; See All (2)
Apply Task/thread spawning: Be able to write correct programs with threads, synchronize (fork-join, producer/consumer, etc.), and use dynamic thread creation (in number and possibly recursively), e.g., Pthreads or Java threads; builds on the shared memory topic above (1 hour) | CS2/DSA |
Comprehend SPMD: Understand how an SPMD program is written and how it executes (1 hour) | CS2/DSA | DesertIslandsAnalogy; See All (1)
Comprehend SPMD Notations: Know of the existence of highly threaded data parallel notations (e.g., CUDA, OpenCL), message passing (e.g., MPI), and some others (e.g., Global Arrays, BSP library) | CS2/DSA |
Apply Data parallel: Be able to write a correct data-parallel program for shared-memory machines and get speedup (should do an exercise); understand the relation between different notations for data parallel: array notations, SPMD, and parallel loops; builds on the shared memory topic above (1 hour) | CS2/DSA/ProgLang | ArrayAddition; SieveOfErastothenes; See All (3)
Apply Parallel loops for shared memory: Know, through an example, one way to implement parallel loops, and understand collisions/dependencies across iterations (e.g., OpenMP, Intel's TBB) | CS2/DSA/ProgLang |
Know Tasks and threads: Understand what it means to create and assign work to threads/processes in a parallel program, and know of at least one way to do that (e.g., OpenMP, Intel TBB) (0.5 hours) | CS2/DSA/Systems/ProgLang | MoreProcessorsNotAlwaysBetter; CompanyAnalogy; See All (2)
Apply Critical regions: Be able to write shared memory programs that use critical regions for synchronization; see the lock sketch after this table | CS2/DSA/Systems | ArrayAddition; SweetenJuice; See All (2)
Apply Producer-consumer: Be able to write shared memory programs that use the producer-consumer pattern to share data and synchronize threads | CS2/DSA/Systems |
Know Monitors: Understand how to use monitors for synchronization | CS2/DSA/Systems |
Comprehend Deadlocks: Understand what a deadlock is, and methods for detecting and preventing them | DSA/Systems | PBJinParallel; OrangeGame; See All (2)
Know Data Races: Know what a data race is, and how to use synchronization to prevent it | DSA/Systems | ArrayAddition; ConcertTickets; See All (5)
Know Tools to detect concurrency defects: Know of the existence of tools to detect race conditions (e.g., Eraser) (0.5 hours) | DSA/Systems |
Comprehend Computation Decomposition Strategies: Understand different ways to assign computations to threads or processes | CS2/DSA | PlantingTrees; StuffingEnvelopes; See All (7)
Comprehend Owner Computes Rule: Understand how to assign loop iterations to threads based on which thread/process owns the data element(s) written in an iteration | CS2/DSA |
Comprehend Decomposition into Atomic Tasks: Understand how to decompose computations into tasks with communication only at the beginning and end of each task, and assign them to threads/processes | CS2/DSA |
Comprehend Load balancing: Understand the effects of load imbalances on performance, and ways to balance load across threads or processes (1 hour) | DSA/Systems | PlantingTrees; FindOldestPenny; See All (5)
Comprehend Static Scheduling and mapping: Understand how to map and schedule computations before runtime | DSA/Systems | JigsawPuzzle; CompanyAnalogy; See All (2)
Comprehend Dynamic Scheduling and mapping: Understand how to map and schedule computations at runtime | DSA/Systems | JigsawPuzzle; CompanyAnalogy; See All (2)
Know Data: Understand the impact of data distribution, layout, and locality on performance; know false sharing and its impact on performance (e.g., in a cyclic mapping in a parallel loop); notion that transfer of data has a fixed cost plus a bit rate (irrespective of transfer from memory or inter-processor) (1 hour) | DSA/ProgLang | JigsawPuzzle; See All (1)
Know Data Distribution: Know what block, cyclic, and block-cyclic data distributions are, and what it means to distribute data across multiple threads/processes | DSA/ProgLang |
Know Data layout: Know how to lay out data in memory to improve performance (memory hierarchy) | DSA/ProgLang | KitchenAnalogy; See All (1)
Know Data locality: Know what spatial and temporal locality are, and how to organize data to take advantage of them | DSA/ProgLang | KitchenAnalogy; See All (1)
Know False sharing: Know that in cache-coherent shared memory systems data is kept coherent in blocks, not individual words, and how to avoid false sharing of a block across threads | DSA/ProgLang | KitchenAnalogy; See All (1)
Know Performance Tools: Know of tools for runtime monitoring (e.g., gprof, Vtune) (0.5 hours) | DSA/Systems |
Comprehend Speedup: Understand how to compute speedup, and what it means | CS2/DSA | ParallelAddition; FindSmallestCard; See All (4)
Comprehend Efficiency: Understand how to compute efficiency, and why it matters | CS2/DSA |
Know Amdahl's Law: Know that speedup is limited by the sequential portion of a parallel program, if problem size is kept fixed; see the worked example after this table | CS2/DSA | ParallelAddition; MoreProcessorsNotAlwaysBetter; See All (3)
Know Gustafson's Law: Understand the idea of weak scaling, where problem size increases as the number of processes/threads increases | CS2/DSA |
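
The Critical regions and Data Races rows describe the same mechanics that the ArrayAddition and SweetenJuice activities act out. Here is a minimal Python sketch of a critical region (our own example, not taken from any listed activity): even under CPython's GIL, `counter += 1` compiles to several bytecodes, so concurrent increments can lose updates unless a lock makes the region atomic.

```python
import threading

counter = 0
lock = threading.Lock()

def deposit(times):
    global counter
    for _ in range(times):
        # Without the lock this read-modify-write is a data race:
        # two threads can read the same old value and lose an update.
        with lock:            # critical region: one thread at a time
            counter += 1

threads = [threading.Thread(target=deposit, args=(100_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter)  # always 400000 with the lock; often less without it
```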

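The Speedup, Efficiency, Amdahl's Law, and Gustafson's Law rows reduce to two short formulas, sketched below with an illustrative 10% serial fraction.

```python
def amdahl_speedup(serial_frac, p):
    """Fixed problem size: speedup is capped at 1/serial_frac."""
    return 1.0 / (serial_frac + (1.0 - serial_frac) / p)

def gustafson_speedup(serial_frac, p):
    """Weak scaling (serial fraction measured on the parallel machine):
    the problem grows with p, so speedup grows nearly linearly."""
    return p - serial_frac * (p - 1)

for p in (2, 8, 64):
    s = amdahl_speedup(0.1, p)   # 10% of the work is inherently sequential
    print(p, round(s, 2),        # speedup
          round(s / p, 2),       # efficiency = speedup / processors
          round(gustafson_speedup(0.1, p), 2))
# Amdahl's speedup approaches 1/0.1 = 10 no matter how many processors are added.
```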

Algorithms Topics (22 Activities)

Topic | Suggested Course | Unplugged Activities
Comprehend Asymptotics: Understand upper (big-O) and lower (big-Omega) bounds; follow elementary big-O analyses, e.g., the O(log n) tree-depth argument for mergesort with unbounded parallelism (1 hour) | DSA |
Comprehend Time: Recognize time as a fundamental computational resource that can be influenced by parallelism (0.33 hours) | DSA | PlantingTrees; ParallelAddition; See All (6)
Comprehend Space/Memory: Recognize space/memory in the same manner as time (0.33 hours) | DSA |
Comprehend Scaling: Recognize the use of parallelism either to solve a given problem instance faster or to solve a larger instance in the same time (strong and weak scaling) (1 hour) | DSA | ParallelAddition; FindSmallestCard; See All (8)
Comprehend/Know Scalability in Algorithms and Architectures: Comprehend via several examples that having access to more processors does not guarantee faster execution --- the notion of inherent sequentiality (0.5 hours) | DSA | MoreProcessorsNotAlwaysBetter; See All (1)
Know PRAM: Recognize the PRAM as embodying the simplest forms of parallel computation: embarrassingly parallel problems can be sped up easily just by employing many processors (1 hour) | DSA |
Know BSP/CILK: Be exposed to higher-level algorithmic abstractions that encapsulate more aspects of real architectures; either BSP or CILK would be a good option to introduce a higher-level programming model and higher-level notions; remark that both of these abstractions have led to programming models (1 hour) | DSA |
Apply Dependencies: Observe how dependencies constrain the execution order of subcomputations --- thereby lifting one from the limited domain of "embarrassing parallelism" to more complex computational structures (0.5 hours) | CS1/CS2/DSA | PlantingTrees; StuffingEnvelopes; See All (6)
Comprehend Task Graphs: See multiple examples of this concrete algorithmic abstraction as a mechanism for exposing inter-task dependencies; these graphs, which are also used in compiler analyses, form the level at which parallelism is exposed and exploited (0.5 hours) | DSA/SWEng |
Comprehend Work: Observe the impact of computational work (e.g., the total number of tasks executed) on complexity measures such as power consumption (0.5 hours) | DSA |
Know Make/Span: Observe analyses in which makespan is identified with parallel time (basically, time to completion) (0.5 hours) | DSA |
Comprehend Divide & Conquer (parallel aspects): Observe, via tree-structured examples such as mergesort or numerical integration (trapezoid rule, Simpson's rule) or (at a more advanced level) Strassen's matrix multiply, how the same structure that enables divide-and-conquer (sequential) algorithms exposes opportunities for parallel computation (1 hour) | CS2/DSA/Algo2 | ParallelAddition; FindSmallestCard; See All (3)
Comprehend Recursion (parallel aspects): Recognize algorithms that, via unfolding, yield tree structures whose subtrees can be computed independently, in parallel (0.5 hours) | CS2/DSA |
Know/Comprehend Reduction (map-reduce): Recognize, and use, the tree structure implicit in scalar product, mergesort, or histogram (equivalent apps) (1 hour); see the reduction sketch after this table | DSA |
Know Dependencies: Understand the impacts of dependencies (0.5 hours) | Systems | PlantingTrees; StuffingEnvelopes; See All (3)
Comprehend/Know Series-parallel composition: Understand how "barrier synchronizations" can be used to enable a simple thread-based abstraction for parallel programming, and the possible penalties (in parallelism) that this transformation incurs | CS2(K)/Systems(C) |
Comprehend/Apply Communication: Understand --- via hands-on experience --- that inter-processor communication is one of the most challenging aspects of PDC (2 hours) | -- | BuildingCommunicationAnalogy; ParallelAddition; See All (6)
Comprehend/Apply Broadcast: Use this important mode of global communication; observe enabling algorithms for various platforms (e.g., recursive doubling) (1 hour) | DSA |
Know/Comprehend Multicast: Recognize other modalities of global communication on a variety of platforms, e.g., rings, 2D meshes, hypercubes, trees (0.5 hours) | DSA |
Comprehend/Apply Scatter/gather: Recognize these informational analogues of map and reduce (0.5 hours) | DSA |
Know Asynchrony: Understand asynchrony as exhibited on a distributed platform, its strengths (no need for synchs) and pitfalls (the danger of race conditions) (0.5 hours) | CS2 | StabalizingLeaderElection; ParallelGarbageCollection; See All (2)
Know Synchronization: Be aware of methods for controlling race conditions (1 hour) | CS2/DSA | SurvivorAnalogy; SweetenJuice; See All (3)
Know Sorting: Observe several sorting algorithms for varied platforms, together with analyses; parallel merge sort is the simplest example, but equally simple alternatives for rings and meshes might be covered also; more sophisticated algorithms might be covered in more advanced courses (1 hour) | CS2/DSA | NondeterministicSorting; OddEvenTranspositionSort; See All (4)
Know Selection: Observe algorithms for finding order statistics, notably min and max; understand that selection can always be accomplished by sorting, but that direct algorithms may be simpler (0.5 hours) | CS2/DSA | ParallelAddition; FindSmallestCard; See All (4)
Comprehend Graph Search Algorithms: Know how to carry out BFS- and DFS-like parallel search in a graph or solution space (1 hour) | DSA | ParallelGarbageCollection; See All (1)
Apply Specialized computations: Master one or two computations from among: matrix product, transposition, convolution, and linear systems; recognize how algorithm design reflects the structure of the computational problems (2 hours) | CS2/DSA | MatrixAddition; See All (1)
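
The Reduction (map-reduce) row, and the ParallelAddition activity it links to, share the same decompose-then-combine shape. Below is a minimal sketch using Python's standard-library process pool; the block decomposition and function names are our own.

```python
from concurrent.futures import ProcessPoolExecutor

def parallel_sum(values, workers=4):
    # Map: one contiguous block per worker (block distribution),
    # each block summed in parallel.
    chunk = -(-len(values) // workers)  # ceiling division
    blocks = [values[i:i + chunk] for i in range(0, len(values), chunk)]
    with ProcessPoolExecutor(max_workers=workers) as pool:
        partials = pool.map(sum, blocks)
    # Reduce: combine the partial results; a tree-shaped combine would take
    # O(log workers) steps instead of the O(workers) loop hidden in sum().
    return sum(partials)

if __name__ == "__main__":  # process pools require this guard on some platforms
    print(parallel_sum(list(range(1_000_000))))  # 499999500000
```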



Cross Cutting and Advanced Topics (8 Activities)

Topic | Suggested Course | Unplugged Activities
Know Why and what is parallel/distributed computing: Know the common issues and differences between parallel and distributed computing; history and applications; microscopic to macroscopic levels of parallelism in current architectures (0.5 hours) | CS1/CS2 |
Comprehend Locality: Understand this as a dominant factor impacting performance --- minimizing cache/memory access latency or inter-processor communication (1 hour) | DSA/Systems | KitchenAnalogy; See All (1)
Know Concurrency: The degree of inherent parallelism in an algorithm, independent of how it is executed on a machine (0.5 hours) | CS2/DSA | MoreProcessorsNotAlwaysBetter; See All (1)
Know Non-determinism: Different execution sequences can lead to different results, hence an algorithm design must either be tolerant of such phenomena or be able to take advantage of them (0.5 hours); see the sketch after this table | DSA/Systems | NondeterministicSorting; SweetenJuice; See All (2)
Know Power Consumption: Know that power consumption is a metric of growing importance, and know its impact on architectural evolution and on the design of algorithms and software (0.5 hours) | DSA/Systems |
Know Fault tolerance: Large-scale parallel/distributed hardware/software systems are prone to components failing, but the system as a whole needs to work (0.5 hours) | Systems | StabalizingLeaderElection; FaultTolerantTokenRing; See All (3)
Know Cluster Computing: Be able to describe a cluster as a popular local-memory architecture with commodity compute nodes and a high-performance interconnection network (0.25 hours) | CS2/DSA/Systems | ByzantineGenerals; See All (1)
Know Cloud/grid Computing: Recognize cloud and grid as shared distributed resources; cloud is distinguished by on-demand, virtualized, service-oriented software and hardware resources (0.25 hours) | Systems |
Know Peer to Peer Computing: Be able to describe a peer-to-peer system and the roles of server and client nodes with distributed data; recognize existing peer-to-peer systems (0.25 hours) | CS1/CS2 |
Know Consistency in Distributed Transactions: Recognize classic consistency problems; know that consistency maintenance is a primary issue in transactions issued concurrently by multiple agents (0.25 hours) | CS1/CS2/Systems | ConcertTickets; ByzantineGenerals; See All (2)
Know Web search: Recognize popular search engines as large distributed processing systems for information gathering that employ distributed hardware to support efficient response to user searches (0.25 hours) | CS1/CS2 |
Know Security in Distributed Systems: Know that distributed systems are more vulnerable to privacy and security threats; distributed attack modes; inherent tension between privacy and security (0.5 hours) | Systems | ByzantineGenerals; See All (1)
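
The Non-determinism row can also be shown in a few lines of Python (our own sketch): with no synchronization, the scheduler chooses the interleaving, so the finish order of the threads can change from run to run. Since `list.append` is atomic in CPython, this is benign nondeterminism rather than a data race.

```python
import threading

def finish(order, name):
    for _ in range(1_000_000):  # enough work that the scheduler may switch threads
        pass
    order.append(name)          # record the order in which threads finish

order = []
threads = [threading.Thread(target=finish, args=(order, i)) for i in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# The interleaving is up to the scheduler, so this can print
# [0, 1, 2, 3] on one run and, e.g., [1, 0, 3, 2] on another.
print(order)
```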