Forum Discussion

New Contributor

5 years ago

Runtime hang with pipes and global memory access

I'm trying to develop a oneAPI application that uses a long-running persistent kernel running independently of the host process and uses short-lived kernels to coordinate messages between the host an...

AustinKnutsonTMobile

New Contributor

5 years ago

I may have stumbled across a solution, but I'm not sure if it's correct. Wrapping the accesses to global memory made by different concurrent kernels with atomic_fence removes the hang. I updated my example repo here: https://github.com/AustinKnutsonSprint/oneapi-timer-kernel-hang/tree/fences
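For readers unfamiliar with the fence pattern described above, here is a minimal host-side sketch in standard C++. It is an analogy, not the code from the linked repo: two threads stand in for the two concurrent kernels, and `SharedBuffer` and `run_handoff` are hypothetical names made up for the illustration. `std::atomic_thread_fence` plays the role that `sycl::atomic_fence(order, scope)` plays in SYCL 2020. Whether this is the right fix for the FPGA hang is still unconfirmed in this thread.

```cpp
#include <atomic>
#include <thread>

// Hypothetical stand-in for a global-memory region shared by two kernels.
struct SharedBuffer {
    int payload = 0;                 // plain, non-atomic data
    std::atomic<bool> ready{false};  // flag the consumer polls
};

// Hands `value` from a producer thread to a consumer thread and returns
// what the consumer observed.
int run_handoff(int value) {
    SharedBuffer buf;
    int seen = -1;

    std::thread producer([&] {
        buf.payload = value;
        // Release fence: the payload write above cannot be ordered
        // after the flag store below.
        std::atomic_thread_fence(std::memory_order_release);
        buf.ready.store(true, std::memory_order_relaxed);
    });

    std::thread consumer([&] {
        while (!buf.ready.load(std::memory_order_relaxed)) {
            std::this_thread::yield();
        }
        // Acquire fence: the payload read below cannot be ordered
        // before the flag load above.
        std::atomic_thread_fence(std::memory_order_acquire);
        seen = buf.payload;
    });

    producer.join();
    consumer.join();
    return seen;
}
```

Calling `run_handoff(42)` returns 42. Without the two fences (and with only relaxed accesses to the flag), the consumer would have no guarantee of seeing the payload write.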

Hopefully someone from Intel can confirm what the underlying problem is and whether this is an appropriate fix.

HRZ

Frequent Contributor

5 years ago

Kernel hangs with pipes pretty much always happen due to one of the following two reasons:

1. The amount of data read from (or written to) a pipe is not equal to the amount written (or read) on the other side. This would result in a hang during software emulation, too.
2. The kernel contains a cycle of pipes, and the compiler reorders the pipe read/write operations in a way that could result in a kernel hang. This will not show up in software emulation.
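Both failure modes can be reproduced with a toy model in plain C++. This is only a sketch under stated assumptions: each kernel is reduced to a fixed sequence of blocking pipe operations, every pipe has one writer and one reader, and the names `Op`, `W`, `R` and `runs_to_completion` are made up for the illustration. It does not model the actual compiler or hardware.

```cpp
#include <cstddef>
#include <vector>

// One blocking pipe operation: a write to, or a read from, pipe `pipe`.
struct Op {
    bool is_write;
    std::size_t pipe;
};

inline Op W(std::size_t pipe) { return Op{true, pipe}; }   // blocking write
inline Op R(std::size_t pipe) { return Op{false, pipe}; }  // blocking read

// Runs every kernel as far as it can go. Returns true if all kernels
// finish, false if the remaining kernels are all blocked (a hang).
bool runs_to_completion(const std::vector<std::vector<Op>>& kernels,
                        std::size_t num_pipes, std::size_t capacity) {
    std::vector<std::size_t> fill(num_pipes, 0);     // items held per pipe
    std::vector<std::size_t> pc(kernels.size(), 0);  // next op per kernel

    bool progressed = true;
    while (progressed) {
        progressed = false;
        for (std::size_t k = 0; k < kernels.size(); ++k) {
            while (pc[k] < kernels[k].size()) {
                const Op& op = kernels[k][pc[k]];
                if (op.is_write && fill[op.pipe] < capacity) {
                    ++fill[op.pipe];   // write succeeds: pipe has room
                } else if (!op.is_write && fill[op.pipe] > 0) {
                    --fill[op.pipe];   // read succeeds: pipe has data
                } else {
                    break;             // blocked: full pipe or empty pipe
                }
                ++pc[k];
                progressed = true;
            }
        }
    }

    for (std::size_t k = 0; k < kernels.size(); ++k) {
        if (pc[k] < kernels[k].size()) return false;  // stuck mid-sequence
    }
    return true;
}
```

With pipes of capacity 1, `runs_to_completion({{W(0), W(0), W(0)}, {R(0), R(0), R(0), R(0)}}, 1, 1)` returns false, because the fourth read never gets data (reason 1). For a two-pipe cycle, `{{W(0), R(1)}, {R(0), W(1)}}` completes, but swapping the first kernel's operations to `{R(1), W(0)}` returns false: each kernel waits on a read that the other can never satisfy (reason 2).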

I am not familiar with oneAPI, but the backend compiler is supposedly the same as the OpenCL compiler. I would assume that, just like in OpenCL, there is some barrier pragma or similar construct that forces the ordering of pipe operations and prevents the compiler from reordering them. The first debugging step in your case would probably be to add such a barrier after every pipe read and write operation in your code, to see whether the hang is the result of operation reordering by the compiler.

P.S. Pipes will never "drop" data; that is why they can cause hangs.
