PD_ParallelDecomposition PD_ParallelAlgorithms PD_ParallelArchitecture PD_ParallelPerformance TCPP_Architecture TCPP_Programming TCPP_Algorithms CS2 DSA Systems movement visual

Originally Described by Michelle Moore (Moore2000). In a separate paper (Kitchen1992), the notion of adding numbers in a matrix to illustrate master/worker is alluded to, but not discussed in depth.

No web-link to independent description available. See paper (Moore2000) for additional details.

Other activities by author

(Moore2000): SieveOfErastothenes, OddEvenTranspositionSort, CardSorting


Moore describes how to use Matrix Addition to describe the differences between shared memory vs distributed memory systems. Prior to the start of the exercise, the instructor writes N x N matrices A and B on the board, along with an empty matrix C of the same dimensions.

Shared Memory:

Here, each student represents a thread, with the board representing shared memory. If there are too many students up on the board, many will start bumping into each other, signifying resource contention. Assigning fewer students (say one student per row) will likely lead to less contention.

Distributed Memory:

Here, each student represents a process (sitting at a separate node), and the master-worker pattern is illustrated. The board is the memory of the master node.

Once students are used to the exercise, issues like task granularity and load balancing can be discussed, along with communication overhead:


(Kitchen1992) talks about adding values in a matrix (e.g. each process/thread adds a column) as a way to illustrate the Master-Worker pattern. However, the description does not go into further detail.


The following variations are suggested:

Matrix-Vector Multiplication: The above activity can easily be used to illustrate matrix-vector multiplication by having the master assign each cell to a separate student helper (or subset of cells) along with the common vector. To ensure that students can complete the multiplication in a reasonable about of time, reduce the range of values in matrices A and B (say to 1..5).

Matrix Multiplication: The above activity can easily be used to illustrate matrix multiplication by having the master assign rows in matrix A and B to a separate student helper. To ensure that students can complete the multiplication in a reasonable about of time, reduce the range of values in matrices A and B (say to 1..5).

Heterogeneous Distributed Architectures: For heterogeneous architectures, push desks together to form “multi-core” nodes. Each group of desks has a large piece of paper on it, signifying shared memory.

When the master sends a message, it is sent to the “team leader” at the desk, who then splits the work to his or her team at adjacent desks (representing individual threads). Each person in the team fills out the cells in the matrix assigned to them. Once the paper is completely filled in, the team leader takes the paper, and gives it to the master at the front of the class, who fills in the associated cells in matrix C.

To illustrate a multi-core master node, simply have the student playing the master node select a set of “mini-masters” to stand in front with them, and do an initial split of the matrix (i.e. one row to a “master”). Then each master will send a subset of their assigned cells to student team leaders at desks; the activity then continues as normal.

CS2013 Knowledge Unit Coverage

Parallel Decomposition (Core Tier 1)

2. Identify opportunities to partition a serial program into independent parallel modules. [Familiarity]

Parallel Algorithms (Core Tier 2, Elective)

4. Core Tier 2: Identify independent tasks in a program that may be parallelized. [Usage]

10. Elective: Implement a parallel matrix algorithm. [Usage]

Parallel Architecture (Core Tier 1)

1. Explain the differences between shared and distributed memory. [Familiarity]

Parallel Performance (Elective)

3. Describe how data distribution/layout can affect an algorithm’s communication costs. [Familiarity]

TCPP Topics Coverage

Architecture Topics

Programming Topics

Algorithms Topics


This exercise requires movement. and may be difficult for students who are mobility impaired. However, these students can be included in the distributed portion of the exercise, as they can remain at their desk. Blind students will need Braille cards associated with the numbers that they need add/multiply handed to them. Teammates at the desk can hand them the appropriate cards.


Assessment of this particular exercise is unknown; (Moore2000) discusses the impact of the parallel computing labs in her course. In general, students responded to the labs positively, and felt that labs increased their knowledge significantly.