Bug! Quartus Prime 20.1.1 Standard Edition, Cyclone V, using the PCIe example from 16.1.
Problem Details
Error:
Internal Error: Sub-system: VPR20KMAIN, File: /quartus/fitter/vpr20k/altera_arch_common/altera_arch_place_anneal.c, Line: 2744
Internal Error
Stack Trace:
0xdce7a: vpr_qi_jump_to_exit + 0x6f (fitter_vpr20kmain)
0x797f83: vpr_exit_at_line + 0x53 (fitter_vpr20kmain)
0x2ec700: l_initial_low_temp_moves + 0x1cb (fitter_vpr20kmain)
0x6f97f1: l_thread_pool_do_work + 0x41 (fitter_vpr20kmain)
0x2c74fb: l_thread_pool_fn + 0x4e (fitter_vpr20kmain)
0xefe28: l_thread_start_wrapper(void*) + 0x29 (fitter_vpr20kmain)
0x5acc: thr_final_wrapper + 0xc (ccl_thr)
0x3eeef: msg_thread_wrapper(void* (*)(void*), void*) + 0x62 (ccl_msg)
0x9f9c: mem_thread_wrapper(void* (*)(void*), void*) + 0x5c (ccl_mem)
0x8b39: err_thread_wrapper(void* (*)(void*), void*) + 0x27 (ccl_err)
0x5b0f: thr_thread_wrapper + 0x15 (ccl_thr)
0x5df2: thr_thread_begin + 0x46 (ccl_thr)
0x7f9e: start_thread + 0xde (pthread.so.0)
0xfd0af: clone + 0x3f (c.so.6)
End-trace
Executable: quartus
Comment:
Device is very full. Trying to shoehorn it in by forcing most RAM into M10K. Reducing the PCIe DMA buffer from 256K to 128K got me the space that I needed, but now this crash has occurred.
System Information
Platform: linux64
OS name: This is
OS version:
Quartus Prime Information
Address bits: 64
Version: 20.1.1
Build: 720
Edition: Standard Edition
I'd love some support. I'm trying to get PCIe Root Port working in Cyclone V and so far have not found any examples that will fit and meet timing in the part I chose: 5CSXFC5D6F31C7.
I followed an example called cv_soc_rp_simple_design, but it won't fit alongside my logic.
I followed another example that didn't have bus syncs, where I tied everything from PCIe directly to the HPS and the xcvr_reconfig block, but that missed timing by up to 1.8 ns on the 125 MHz path to the main HPS DRAM.
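To put that miss in perspective, here is a quick back-of-the-envelope calculation in Python. It only uses the two numbers above (125 MHz target, 1.8 ns worst-case miss); the function names are made up for illustration and this is not output from the timing analyzer.

```python
def required_period_ns(clock_mhz: float, miss_ns: float) -> float:
    """Period the worst path would actually need, in nanoseconds."""
    target_period_ns = 1000.0 / clock_mhz  # 125 MHz -> 8.0 ns
    return target_period_ns + miss_ns


def achievable_mhz(clock_mhz: float, miss_ns: float) -> float:
    """Clock rate at which the worst path would just meet timing."""
    return 1000.0 / required_period_ns(clock_mhz, miss_ns)


if __name__ == "__main__":
    period = required_period_ns(125.0, 1.8)
    fmax = achievable_mhz(125.0, 1.8)
    print(f"worst path needs {period:.1f} ns, i.e. about {fmax:.0f} MHz")
```

In other words, the worst path needs roughly 9.8 ns against an 8 ns budget, so that design would only close timing at about 102 MHz.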
This failure happens with the former, cv_soc_rp_simple_design/pcie_rp_ed_5csxfc6.qsys, which didn't fit until I trimmed all the unnecessary logic, including the JTAG port, and shrank all the bus retimers to be as small as possible. The last change was shrinking the PCIe DMA buffer from 256 KB to 128 KB, and then this error occurred.
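For a rough sense of what halving the DMA buffer frees up, here is a small Python sketch. It assumes the raw M10K capacity of 10,240 bits per block and ideal packing; the function name is made up for illustration, and the fitter's real usage depends on the width configuration it picks (with byte-wide data only 8,192 bits per block are typically usable).

```python
M10K_RAW_BITS = 10 * 1024  # raw capacity of one M10K block, in bits


def m10k_blocks(buffer_bytes: int, usable_bits_per_block: int = M10K_RAW_BITS) -> int:
    """Lower-bound estimate of the M10K blocks needed for a buffer."""
    total_bits = buffer_bytes * 8
    # Integer ceiling division: a partially used block still costs a whole block.
    return -(-total_bits // usable_bits_per_block)


if __name__ == "__main__":
    for size_kb in (256, 128):
        raw = m10k_blocks(size_kb * 1024)
        byte_wide = m10k_blocks(size_kb * 1024, usable_bits_per_block=8192)
        print(f"{size_kb} KB buffer: at least {raw} blocks "
              f"({byte_wide} if only 8192 bits per block are usable)")
```

By this estimate, going from 256 KB to 128 KB frees on the order of 100 or more M10K blocks.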
If there's a simple way to send the database through the FAEs I've been working with, let me know.
Thanks in advance.
I got it working.
The last hurdle was the MSI interface. I'm not sure why it wasn't working, but restarting the design from scratch with my recently acquired knowledge got everything working.
I've attached my qsys file and socfpga.dtsi. Hopefully it will help others get a jump on things so they don't have to learn everything the hard way as I did.
This design is not optimized for speed, nor is it optimized for space.
NVMe read speed is around 80 MB/s.
NVMe write speed is around 50 MB/s.
Faster drives will do a little better, but even the 4-lane drive I have, which manages over a gigabyte per second in a PC, doesn't do much better than 110 MB/s read. Bandwidth is limited by the ARM memory interface and by the fact that bursting logic at 125 MHz causes timing violations. A burst length of one on the Txs interface has got to slow things down.
Performance is quoted with the 5.11 kernel; the 5.4 kernel is not quite as fast. Still fast enough for an embedded system, though.
Don't forget to enable the fpga, pcie and msi modules in your top-level board dts file.
And don't forget to reserve the first 64K of DRAM as stated above. If you don't, you will get read errors.
There are probably better configurations, and eventually I'll probably try to optimize for size since my logic will need to get bigger on the next project. But for now, it finally works.
Good luck to you all.