Hi, thank you for waiting.
I'm still working on getting input from the subject matter expert (SME) at Altera regarding your questions 1 and 3. In the meantime, I can provide a response for Q2 and Q4 based on the current documentation and known issues.
For Q2
- You are right: the hardware bug described is caused by an IP bug in the CCU, and from a software perspective, disabling the snoop filter is the only workaround for this issue. The snoop filter mainly helps under heavy transaction load. In your case, where only a small amount of data is sent (e.g. one cache line), disabling the snoop filter is unlikely to hurt performance. In fact, since the HPS expects fast access for such small transfers, disabling it should be beneficial in your case, improving both latency and reliability.
For Q4
- At this time, to the best of my knowledge, there is no fully validated reference design that demonstrates low-latency cache-injected writes from the FPGA to the HPS L2 cache. That said, I will check with the SMEs to confirm whether there is an internal example or recommended setup that we can share.
Regards,
Boon Khai.