I've read a bit about breakpoints and understand them somewhat, but not well enough to know where to put them in a CUDA application. Can you explain where you would put the first one, and why?
But first, let's see if this helps. In this code:
double_array <<< 40000, 1 >>> (a_d + i * 40000);
the last < is...