Forum Discussion
Altera_Forum
Honored Contributor
9 years ago
2) If you mean that you run the kernel on 350056 work-items, then to estimate the ideal execution time you need to know how many work-items come out of the pipeline per clock cycle. It is not necessarily one; depending on the kernel structure and instructions, it will most probably be lower. I don't have a report on hand right now, but I believe the profiler can give you that information. You also need to know the generated pipeline depth, so you can calculate how long it takes for the first work-item to get through the pipeline. All in all, maybe something like this would do: 1/135.64MHz * 350056 / work-items_per_clock + pipeline_processing_time, where pipeline_processing_time is roughly pipeline_depth / 135.64MHz if nothing stalls.
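To make the arithmetic concrete, here is a rough sketch of that estimate in Python. The function name, the `items_per_clock` and `pipeline_depth` parameters, and the example values 0.5 and 1000 are assumptions for illustration only; the real throughput and depth would have to come from the profiler or the compilation report. It also assumes the pipeline never stalls, so treat the result as a lower bound rather than a prediction.

```python
def ideal_kernel_time(num_work_items, clock_hz, items_per_clock=1.0, pipeline_depth=0):
    """Lower-bound estimate of kernel run time in seconds.

    items_per_clock: work-items leaving the pipeline per clock cycle
                     (assumed value; 1.0 is the best case).
    pipeline_depth:  pipeline stages the first work-item must traverse
                     (assumed value; 0 ignores the fill time).
    """
    if items_per_clock <= 0:
        raise ValueError("items_per_clock must be positive")
    # Steady state: cycles needed to retire all work-items, converted to seconds.
    steady_state = num_work_items / (clock_hz * items_per_clock)
    # Fill time: cycles for the first work-item to reach the pipeline output.
    fill = pipeline_depth / clock_hz
    return steady_state + fill


CLOCK_HZ = 135.64e6      # kernel clock from the post
NUM_WORK_ITEMS = 350056  # work-item count from the post

# Best case: one work-item per clock, fill time ignored (about 2.58 ms).
best_case = ideal_kernel_time(NUM_WORK_ITEMS, CLOCK_HZ)

# Hypothetical case: one work-item every two clocks, 1000-stage pipeline.
slower = ideal_kernel_time(NUM_WORK_ITEMS, CLOCK_HZ, items_per_clock=0.5, pipeline_depth=1000)

print(f"best case:    {best_case * 1e3:.3f} ms")
print(f"hypothetical: {slower * 1e3:.3f} ms")
```

With these numbers the fill time is only a few microseconds, so the throughput term dominates unless the pipeline is very deep or the work-item count is small.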