I have some Fortran code that I need to modify. It was clearly written without high-performance computing in mind and is not parallelized. Now I would like to get the code "on track" and make it parallel. After a whole afternoon of thinking, I still cannot see where to start. Can anyone help me with how to optimize and parallelize this code? Thank you very much.
Sharp
      DO I=1,N
        DO J=1,N
          XXX = 0.0D+00
          DO K=1,N
            DO L=1,N
              XXX = XXX + C(K,I)*CABM(K,L)*C(L,J)
            ENDDO
          ENDDO
          IF (I.EQ.J) XXX = XXX - 1.0D+00
        ENDDO
      ENDDO
This one's not too hard. For each (I,J) pair you're summing the subexpressions over K and L. The iterations are independent of one another, so each can be done on a separate node and the partial results "reduced" to a single sum.
Perhaps the easiest way would be to parallelize the outermost loop, splitting the I iterations among up to N processors and combining their results.
Hi, yes, I do have an MPI environment. However, I am less familiar with MPI than with OpenMP. After a few days of thinking, I believe I have parallelized the code successfully using OpenMP. And you were right: the best performance I got came from parallelizing the outermost loop. Thank you for your help.
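For anyone finding this thread later, here is a minimal sketch of what parallelizing the outermost loop with OpenMP might look like. It is not the poster's actual code. The posted fragment never stores XXX, so the result array R, the matrix size N = 50, and the random test data are all assumptions added for illustration. The key point is that XXX and the inner loop indices have to be private to each thread. The check at the end relies on the fact that the double sum over K and L is element (I,J) of the product transpose(C)*CABM*C.

```fortran
program omp_outer_loop
  implicit none
  integer, parameter :: n = 50          ! assumed size, for illustration only
  double precision :: c(n,n), cabm(n,n) ! filled with random test data below
  double precision :: r(n,n)            ! hypothetical array to hold each XXX
  double precision :: ref(n,n)          ! serial reference result
  double precision :: xxx
  integer :: i, j, k, l

  call random_number(c)
  call random_number(cabm)

  ! The I iterations are shared out among the threads. XXX and the
  ! loop indices are private, so threads do not overwrite each other.
  !$omp parallel do default(none) shared(c, cabm, r) private(i, j, k, l, xxx)
  do i = 1, n
    do j = 1, n
      xxx = 0.0d0
      do k = 1, n
        do l = 1, n
          xxx = xxx + c(k,i)*cabm(k,l)*c(l,j)
        end do
      end do
      if (i == j) xxx = xxx - 1.0d0
      r(i,j) = xxx                      ! each (i,j) is written by one thread only
    end do
  end do
  !$omp end parallel do

  ! Serial check: the double sum equals (C^T * CABM * C)(i,j),
  ! with 1 subtracted on the diagonal.
  ref = matmul(transpose(c), matmul(cabm, c))
  do i = 1, n
    ref(i,i) = ref(i,i) - 1.0d0
  end do

  if (maxval(abs(r - ref)) > 1.0d-8) error stop "parallel result differs"
  print *, "max abs difference:", maxval(abs(r - ref))
end program omp_outer_loop
```

With gfortran this can be built with `gfortran -fopenmp`. Without that flag the directives are treated as comments and the loop simply runs serially, which is a handy way to compare results.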