Tuesday, 18 April 2017

Matrix Multiplication using MapReduce

Matrix Multiplication:


To multiply two matrixes sufficient and necessary condition is "number of columns in matrix A = number of rows in matrix B"


How serial code works:


Loop for each row in matrix A.Loop for each column in matrix B and initialize output matrix C to 0. This loop will run for each row of matrix A.Loop for each column in matrix A.Multiply A[i,k] to B[k,j] and add this value to C[i,j]Return output matrix C


Map-Reduce logic:



 The logic here is to send the calculation part of each output cell of the result matrix to a reducer.  
In the matrix multiplication the first cell of output  (0,0) has multiplication and summation of elements from row 0 of the matrix A and elements from col 0 of matrix B.  To do the computation of value  in the output cell (0,0) of resultant matrix  in a separate reducer we need to use (0,0) as output key of map phase and value should have array of values from row 0 of matrix A and column 0 of matrix B.

This algorithm's output from map phase should be having a <key,value> , 

To multiply an nxn matrix M with an n-element vector v, compute 




v fits into memory
map: output (i, mijvj)

reduce: sum

Implementation

Map
In the map function, each input from the dataset is organised to produce a key value pair such that reducer can do the entire computation of the corresponding output cell. The code is given here.



Reducer
Input to the reducer is the key that corresponds to the output cell of resultant matrix and values required to do the computation of that cell. The code is given here.


Output:

You can see the output here

Experimental Setup:

4-Core Intel I7 @ 3.07GHz 16 GB RAM
Hadoop version - 2.0
Java version - 1.8

How to run?

Follow this link

Why this will be faster:

This implementation will work in many phases, it can have multiple maps and reduce.So this can lead to fast computation in the map and finalise the value in reduce.
Map and reduce work individually so this is faster than the serial code.







No comments:

Post a Comment