Forum Discussion
While I was asking around for the best method to measure this, I received some of the information you are looking for. Using a Linux host, a Stratix V A7 device takes approximately 750 ms to be reconfigured by the runtime. Note that this number does not include the time needed to move any buffers that are active in the FPGA, so whenever possible I recommend freeing any buffers allocated in the FPGA before the kernel hardware switchover occurs. Active buffers must be copied up to the host before the hardware is swapped out and restored after the hardware has been replaced, and there is an overhead associated with this. I can't give you a number for that overhead because it depends heavily on your software implementation.
If this reconfiguration time is significant in comparison to the kernel execution time, then you should look at amortizing the cost. Let's say you have a billion data points that move between kernels "A" and "B", and you handle them a million points (work-items) at a time. Instead of calling kernel A followed by kernel B for each million points, you would call kernel A a thousand times to finish off all billion points, followed by kernel B a thousand times to do the same. That way there is only one hardware swap instead of around two thousand (A --> B --> A --> B --> etc...). In situations like these I also try to combine the kernels if possible, since not only do you eliminate the hardware swapping in and out, but you often end up with a more efficient hardware implementation because the same compute unit will encapsulate both kernels.
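To put rough numbers on the amortization argument, here is a small Python sketch that counts hardware swaps for the two launch orders. It is only a back-of-the-envelope model: the function names are made up for illustration, it uses the approximate 750 ms figure quoted above, it assumes a swap happens whenever two consecutive launches use different kernels, and it ignores both the initial programming of the device and any buffer save/restore overhead.

```python
# Rough model of reconfiguration overhead for two kernel launch orders.
# Assumption: one hardware swap occurs whenever the kernel changes between
# consecutive launches. Buffer save/restore time is NOT modelled.

RECONFIG_SECONDS = 0.75  # approximate Stratix V A7 figure from the post


def count_swaps(schedule):
    """Count how many consecutive launches use a different kernel."""
    return sum(1 for prev, cur in zip(schedule, schedule[1:]) if prev != cur)


def interleaved(num_batches):
    """A, B, A, B, ... one A/B pair per batch of work-items."""
    return ["A", "B"] * num_batches


def batched(num_batches):
    """All A launches first, then all B launches."""
    return ["A"] * num_batches + ["B"] * num_batches


total_points = 1_000_000_000
batch_size = 1_000_000
num_batches = total_points // batch_size  # 1000 batches

for name, schedule in (
    ("interleaved", interleaved(num_batches)),
    ("batched", batched(num_batches)),
):
    swaps = count_swaps(schedule)
    print(f"{name}: {swaps} swaps, ~{swaps * RECONFIG_SECONDS:.2f} s reconfiguring")
```

With these assumptions the interleaved order gives 1999 swaps (roughly 25 minutes spent reconfiguring), while the batched order gives a single swap of about 0.75 s. Whether the batched order is practical depends on being able to hold kernel A's results for the whole data set until kernel B runs.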