Does get_global_id(0) cause much latency?

Honored Contributor

8 years ago

--- Quote Start ---

Are you launching this as a single work-item kernel or as an NDRange kernel? Since you're using get_global_id (or trying to), I presume NDRange. Maybe the design would work better as a single work item kernel. Some code (from the host and kernel) might help to explain.

--- Quote End ---

NDRange. As soon as I add the get_global_id call, the kernel becomes much slower.

It does return the correct global ID when I actually use it.

__attribute__((num_compute_units(2)))
__attribute__((reqd_work_group_size(2, 1, 1)))
__kernel void pointWiseMul(__global float2* restrict d_afCorr, __global float2* restrict d_afPadScn, __global float2* restrict d_afPadTpl, int dataN, float fScale)
{
	int begin = get_global_id(0); // comment out this line and the speed changes dramatically
	for (int iIndx = 0; iIndx < dataN; iIndx++)
	{
		float2 cDat = d_afPadScn[iIndx];
		float2 cKer = d_afPadTpl[iIndx];
		// take the conjugate of the kernel
		cKer.y = -cKer.y;
		float2 cMul = { cDat.x * cKer.x - cDat.y * cKer.y, cDat.y * cKer.x + cDat.x * cKer.y };
		cMul.x = fScale * cMul.x;
		cMul.y = fScale * cMul.y;
		d_afCorr[iIndx] = cMul;
	}
}
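
For comparison, here is a rough plain C sketch (not OpenCL, and not tested on the FPGA) of the same arithmetic arranged so that each work-item handles only the element at its own global ID, instead of every work-item looping over all dataN elements. The names float2_t, point_wise_mul_item and point_wise_mul_ref are made up for this illustration; in a real kernel the gid argument would come from get_global_id(0) and the global size would be set to dataN. Whether this layout is actually faster on a given board is an assumption that would need measuring.

```c
#include <stddef.h>

/* Stand-in for OpenCL's float2 so the sketch compiles as plain C
   (hypothetical type, not part of the original kernel). */
typedef struct { float x, y; } float2_t;

/* Work done by ONE work-item: element `gid` only.
   In the kernel, `gid` would be the value of get_global_id(0). */
static void point_wise_mul_item(float2_t *corr, const float2_t *scn,
                                const float2_t *tpl, size_t gid, float scale)
{
    float2_t d = scn[gid];
    float2_t k = tpl[gid];
    k.y = -k.y; /* take the conjugate of the template value */
    corr[gid].x = scale * (d.x * k.x - d.y * k.y);
    corr[gid].y = scale * (d.y * k.x + d.x * k.y);
}

/* Host-side emulation of an NDRange with global size dataN:
   one call per global ID, no loop inside the work-item itself. */
static void point_wise_mul_ref(float2_t *corr, const float2_t *scn,
                               const float2_t *tpl, size_t dataN, float scale)
{
    for (size_t gid = 0; gid < dataN; gid++)
        point_wise_mul_item(corr, scn, tpl, gid, scale);
}
```

The reference loop can also be used on the host to check the values the kernel writes to d_afCorr.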
