First Programming Assignment

Due: March 13, 2017

In this homework you will develop three parallel implementations of the n-Queens problem, measure their performance and explain it.

The n-Queens problem is defined as follows:
Find the total number of distinct solutions to place a given number of chess queens on a quadratic chess board with a specific number of squares on one side of a board (n) in a way that no two queens would be able to attack each other. Two queens can attack each other when they are placed on the same horizontal row, vertical column or in one of (2 × (2 × n - 1)) possible diagonals.

For example, on an 8×8 board, there are 92 distinct solutions. The difference between a unique and distinct solution is that distinct solutions allow for symmetrical operations, like rotations and reflections of the board, to be counted as one; so an 8×8 board has 12 unique solutions. The n‐Queens problem has exponential complexity.

The world record at the moment is held for the solution on a 26×26 board implemented on a custom supercomputer using FPGAs at TU Dresden. More information is available here.

You are asked to parallelize an efficient serial implementation of the algorithm, which finds distinct solutions using bit vectors. The code will be distributed to you via the class mailing list. Your team should parallelize the code using three programming models: OpenMP, Cilk, and TBB.

You are free to devise the parallelization strategy yourselves using the aforementioned programming models and you are free to pick the compilation flags of your code. You are not allowed to modify the algorithm but you are free to use any parallel control structures that you see fit (including loops, tasks, vectors, etc.) You will measure your algorithm on chessboards of size 10×10, 12×12 and 15×15. You should try your best to achieve high performance and good scalability of your parallel algorithm! You are welcome to discuss your ideas with the instructor.

You will carry out the assignment on an 8-core x86 SMP (shared-memory multiprocessor) system at FORTH-ICS. Instructions on how to access the machine will be posted soon. Carry out the project in groups of two students. You need to write a report of your project, where you describe:

  1. the parallelization strategy and the parallel control constructs that you used (loops, tasks, threads, etc.) with an explanation of your choices;
  2. any optimizations that you performed to improve parallelism, locality, communication, or synchronization;
  3. a detailed analysis of your experimental results.

While there is no prescribed length of these reports, they should not exceed 8 pages of text, including figures and references.