Description
OpenMP macro support was recently added so that either OpenMP or OpenACC code can be selected. To do this, the entire loop is now wrapped in a macro call instead of having a macro placed at the start of the loop. This is required because OpenMP requires all GPU parallel loops to be closed, but it causes a problem. When fypp processes the file, it first creates the new line numbers and then places the macro. As a result, every line inside the macro call is reported as the line of the macro call itself. For example, the following chunk of code
#:call GPU_PARALLEL_LOOP(private='[physical_loc,dyn_pres,alpha_rho_IP, alpha_IP,pres_IP,vel_IP,vel_g,vel_norm_IP,r_IP, v_IP,pb_IP,mv_IP,nmom_IP,presb_IP,massv_IP,rho, gamma,pi_inf,Re_K,G_K,Gs,gp,innerp,norm,buf, radial_vector, rotation_velocity, j,k,l,q]')
do i = 1, num_gps
gp = ghost_points(i)
j = gp%loc(1)
k = gp%loc(2)
l = gp%loc(3)
patch_id = ghost_points(i)%ib_patch_id
! Calculate physical location of GP
if (p > 0) then
physical_loc = [x_cc(j), y_cc(k), z_cc(l)]
else
physical_loc = [x_cc(j), y_cc(k), 0._wp]
end if
is processed by fypp into
# 200 "/home/dan/Documents/repos/MFC/src/simulation/m_ibm.fpp"
!$acc parallel loop gang vector default(present) private(physical_loc, dyn_pres, alpha_rho_IP, alpha_IP, pres_IP, vel_IP, vel_g, vel_norm_IP, r_IP, v_IP, pb_IP, mv_IP, nmom_IP, presb_IP, massv_IP, rho, gamma, pi_inf, Re_K, G_K, Gs, gp, innerp, norm, buf, radial_vector, rotation_velocity, j, k, l, q)
# 200 "/home/dan/Documents/repos/MFC/src/simulation/m_ibm.fpp"
do i = 1, num_gps
# 200 "/home/dan/Documents/repos/MFC/src/simulation/m_ibm.fpp"
# 200 "/home/dan/Documents/repos/MFC/src/simulation/m_ibm.fpp"
gp = ghost_points(i)
# 200 "/home/dan/Documents/repos/MFC/src/simulation/m_ibm.fpp"
j = gp%loc(1)
# 200 "/home/dan/Documents/repos/MFC/src/simulation/m_ibm.fpp"
k = gp%loc(2)
# 200 "/home/dan/Documents/repos/MFC/src/simulation/m_ibm.fpp"
l = gp%loc(3)
# 200 "/home/dan/Documents/repos/MFC/src/simulation/m_ibm.fpp"
patch_id = ghost_points(i)%ib_patch_id
# 200 "/home/dan/Documents/repos/MFC/src/simulation/m_ibm.fpp"
# 200 "/home/dan/Documents/repos/MFC/src/simulation/m_ibm.fpp"
! Calculate physical location of GP
# 200 "/home/dan/Documents/repos/MFC/src/simulation/m_ibm.fpp"
if (p > 0) then
# 200 "/home/dan/Documents/repos/MFC/src/simulation/m_ibm.fpp"
physical_loc = [x_cc(j), y_cc(k), z_cc(l)]
# 200 "/home/dan/Documents/repos/MFC/src/simulation/m_ibm.fpp"
else
# 200 "/home/dan/Documents/repos/MFC/src/simulation/m_ibm.fpp"
physical_loc = [x_cc(j), y_cc(k), 0._wp]
# 200 "/home/dan/Documents/repos/MFC/src/simulation/m_ibm.fpp"
end if
# 200 "/home/dan/Documents/repos/MFC/src/simulation/m_ibm.fpp"
Because the same line number (200 in this case) is reported for every line, the compiler attributes any problem to line 200 instead of the actual problematic line. This makes debugging GPU code extremely slow.
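To check how widespread the collapsed numbering is in a processed file, a small script can replay the line markers the way a compiler would and count how many code lines end up attributed to the same source line. This is only a rough sketch, not part of MFC or fypp: the helper names (`attribute_lines`, `find_collapsed`) and the shortened file name in the sample are made up for illustration, and it assumes markers of the form `# <line> "<file>"` as in the output above.

```python
import re
from collections import Counter

# Matches line markers of the form: # 200 "/path/to/file.fpp"
LINE_MARKER = re.compile(r'^#\s+(\d+)\s+"([^"]+)"')


def attribute_lines(preprocessed_text):
    """Return (file, line, code) for every non-blank code line.

    A marker states the source line of the line that follows it;
    each later line (blank or not) advances the count by one until
    the next marker resets it.
    """
    attributed = []
    current_file, next_line = None, None
    for raw in preprocessed_text.splitlines():
        marker = LINE_MARKER.match(raw)
        if marker:
            next_line = int(marker.group(1))
            current_file = marker.group(2)
            continue
        if current_file is not None and raw.strip():
            attributed.append((current_file, next_line, raw.strip()))
        if next_line is not None:
            next_line += 1
    return attributed


def find_collapsed(attributed, threshold=2):
    """Return {(file, line): count} for source lines that at least
    `threshold` code lines were attributed to."""
    counts = Counter((f, n) for f, n, _ in attributed)
    return {key: c for key, c in counts.items() if c >= threshold}


# Hypothetical sample modelled on the output above (path shortened).
sample = '''# 200 "m_ibm.fpp"
!$acc parallel loop
# 200 "m_ibm.fpp"
do i = 1, num_gps
# 200 "m_ibm.fpp"
gp = ghost_points(i)
# 204 "m_ibm.fpp"
x = 1
y = 2
'''

collapsed = find_collapsed(attribute_lines(sample))
```

In the sample, the three lines that follow a `# 200` marker are all attributed to line 200, while the two lines after the `# 204` marker are numbered 204 and 205 as expected. Running the same check on a real processed file would flag each macro call whose body has lost its line numbers.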
A PR should be made that resolves this issue. The PR should allow correct line numbers to be generated from the source code, so that compiler messages point to the actual source lines.