Before diving into parallel computing:
Even without any parallel computing techniques, we can make our code faster by making it cache friendly.
Let us see an example:
A classic example of cache-unfriendly code is iterating over a multidimensional array "inside out":
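A minimal sketch of such a loop in C, assuming a square array of `int` stored in row-major order (the names `N`, `x`, and `sum_inside_out` are illustrative, not taken from any particular program):

```c
#include <stddef.h>

#define N 1024 /* hypothetical array dimension, chosen for illustration */

/* Sums every element, but walks the array column by column:
   the inner loop varies the FIRST index, so consecutive accesses
   are N * sizeof(int) bytes apart in memory. */
long sum_inside_out(int x[N][N])
{
    long sum = 0;
    for (size_t i = 0; i < N; i++) {
        for (size_t j = 0; j < N; j++) {
            sum += x[j][i]; /* strided access: a new row on every step */
        }
    }
    return sum;
}
```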
This is cache-inefficient because, when you access a single memory address, a modern CPU loads a whole cache line of neighbouring addresses from main memory. Here the inner loop steps through the rows (the "j" index), so in a row-major layout such as C's, each access to the [j][i] entry lands in a different row, far from the previous access in memory. Only one element of each loaded cache line is used before another line has to be fetched, and for a large array that line is likely to be evicted before the loop comes back to it. If this is changed to the equivalent:
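A minimal sketch of the cache-friendly loop order in C, again assuming a row-major square array of `int` (the names `N`, `x`, and `sum_row_by_row` are illustrative):

```c
#include <stddef.h>

#define N 1024 /* hypothetical array dimension, chosen for illustration */

/* Sums every element row by row: the inner loop varies the LAST index,
   so consecutive accesses touch adjacent memory addresses and each
   loaded cache line is fully used before moving on. */
long sum_row_by_row(int x[N][N])
{
    long sum = 0;
    for (size_t i = 0; i < N; i++) {
        for (size_t j = 0; j < N; j++) {
            sum += x[i][j]; /* contiguous access within row i */
        }
    }
    return sum;
}
```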
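It will typically run much faster, because the inner loop now walks through adjacent addresses and uses every element of a cache line before the next line is fetched.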