Forum Discussion
Hi,
Thanks for reaching out to us.
>> I would guess that they are either 4-byte or 8-byte but how can I determine which for any given processor?
For systems based on the IA-32 architecture, classification is performed on 4 bytes. For systems based on other architectures, classification is performed on 8 bytes.
For DPC++, classification is performed on 8 bytes.
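As a rough illustration of why the 4-byte versus 8-byte width matters, here is a minimal sketch in plain C++ (no device code). It assumes a simplified model in which local memory is interleaved across equal-width banks, so a byte address maps to bank (address / width) % banks. The helper names bank_of and same_bank, the bank count of 16, and the mapping itself are assumptions made for illustration only, not documented behaviour of any particular CPU, GPU or FPGA.

```cpp
#include <cstddef>

// Hypothetical model: map a byte address in local memory to a bank index,
// assuming banks are interleaved in units of bank_width bytes.
constexpr std::size_t bank_of(std::size_t byte_addr,
                              std::size_t bank_width,
                              std::size_t num_banks) {
    return (byte_addr / bank_width) % num_banks;
}

// True when two byte addresses fall into the same bank under this model.
constexpr bool same_bank(std::size_t addr_a, std::size_t addr_b,
                         std::size_t bank_width, std::size_t num_banks) {
    return bank_of(addr_a, bank_width, num_banks) ==
           bank_of(addr_b, bank_width, num_banks);
}

// Assumed bank count, chosen only for the example.
constexpr std::size_t kNumBanks = 16;

// Two work-items read the two 4-byte halves of one 8-byte element:
// byte offsets 0 and 4.
// With 4-byte banks the halves land in different banks ...
static_assert(!same_bank(0, 4, /*bank_width=*/4, kNumBanks),
              "4-byte banks: halves are in different banks");
// ... with 8-byte banks they land in the same bank.
static_assert(same_bank(0, 4, /*bank_width=*/8, kNumBanks),
              "8-byte banks: halves share a bank");
```

Under this model the same pair of accesses can hit one bank or two depending only on the width, and addresses wrap around: with 4-byte banks and 16 banks, byte offsets 0 and 64 map to the same bank. Whether two accesses to the same bank actually cost anything on a given device is a separate question that the model does not answer.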
>> It talks about banks in local memory. How can I find out how many of these there are?
You can use the numbanks() memory attribute in your source code to define the number of banks.
For more information you can refer to the below link:
>> However, what happens if two work-items access the same element in local memory (e.g. one reads the top half and the other reads the bottom half)?
Could you please elaborate more on this statement?
Could you please provide us with an example or use case?
Thanks & Regards,
Noorjahan.
Thank you for the reply; that helps a lot. However, it raises a few more questions:
How do I find out about features like intel::numbanks() and intel::bankwidth()?
Is there any reference documentation describing them properly? There are various tutorials, white papers and examples, but I have yet to find any reference documentation.
What, for example, is the applicability of the bank control directives above? They appeared in a paper on optimising FPGA access, so it is safe to assume that they will be effective on FPGAs. I would be very surprised if they had any effect on the CPU, which leaves me doubting whether they actually work on GPUs. (I am trying to program a GPU.)