Forum Discussion

Occasional Contributor

3 years ago

parallel_for very slow in dpc++

Hello, I really need help with this. I am trying to accelerate an algorithm using DPC++. What happens is that the normal (serial) calculation runs 1.5 times faster than the parallel kernel execution. The fo...

amaltaha

Occasional Contributor

3 years ago

Yes, I understand it is all on the CPU. I may have explained this poorly: I meant that the ordinary iterative for loop takes less time than the version that allows for parallelism (parallel_for). Doesn't parallel_for apply parallelism to all the rows in the buffers at the same time? Why is its performance worse? This is my main question. The iterative for loop runs on the host, and the parallel_for runs in the kernel (on the device).

I have tried splitting the input into smaller pieces using parallel_for_work_group, but it gave the same results. The iterative code takes no more than 40 seconds, while the parallel one takes more than 7 minutes.

Thank you!
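
For anyone trying to reproduce this kind of comparison, here is a minimal sketch of how a serial row loop and a row-parallel version could be timed side by side and checked against each other. It is plain standard C++, not SYCL: std::thread stands in for the device kernel, and process_row, run_serial, run_parallel and time_ms are hypothetical names, as is the sum-of-squares workload. None of this is the code from the original post.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical per-row workload (assumption): sum of squares of one row
// of a row-major matrix stored in a flat vector.
double process_row(const std::vector<double>& data, std::size_t row, std::size_t cols) {
    double acc = 0.0;
    for (std::size_t c = 0; c < cols; ++c) {
        const double v = data[row * cols + c];
        acc += v * v;
    }
    return acc;
}

// Serial baseline: one loop over all rows on the calling thread.
std::vector<double> run_serial(const std::vector<double>& data,
                               std::size_t rows, std::size_t cols) {
    std::vector<double> out(rows);
    for (std::size_t r = 0; r < rows; ++r) out[r] = process_row(data, r, cols);
    return out;
}

// Parallel stand-in: rows are split into contiguous chunks, one chunk per thread.
// Each thread writes to a disjoint range of `out`, so no locking is needed.
std::vector<double> run_parallel(const std::vector<double>& data,
                                 std::size_t rows, std::size_t cols, unsigned threads) {
    assert(threads > 0);
    std::vector<double> out(rows);
    std::vector<std::thread> pool;
    const std::size_t chunk = (rows + threads - 1) / threads;  // ceiling division
    for (unsigned t = 0; t < threads; ++t) {
        const std::size_t begin = std::min(rows, t * chunk);
        const std::size_t end = std::min(rows, begin + chunk);
        if (begin == end) continue;  // more threads than rows: nothing left to do
        pool.emplace_back([&data, &out, cols, begin, end] {
            for (std::size_t r = begin; r < end; ++r) out[r] = process_row(data, r, cols);
        });
    }
    for (auto& th : pool) th.join();
    return out;
}

// Wall-clock time of any callable, in milliseconds.
template <typename F>
double time_ms(F&& f) {
    const auto start = std::chrono::steady_clock::now();
    f();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

A call such as time_ms([&] { result = run_parallel(data, rows, cols, 4); }) can be compared with the same call around run_serial, and the two result vectors compared for equality before trusting either number. With a real SYCL queue it may also be worth timing a second run separately from the first, since the first submission can include one-time setup cost, though whether that explains the gap reported here is not something this sketch can tell.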
