N6000/1-PL SmartNIC image deployment error
Hello,
I’ve installed an Intel N6000/1-PL SmartNIC on a Lenovo SR650v2 server with the following stack:
- N6000 SKU1
- CentOS Stream release 8
- OPAE v2.1.1
- kernel 5.15.92-dfl
Server BIOS settings: the card was tested in two slots (1 and 7) with PCIe bifurcation set to x8x8, and fan speed set to maximum.
The server BIOS reports the following warning:
PCIe error recovery has occurred in slot number 1. The adapter may not work correctly.
And dmesg contains:
[22638.864360] intel-m10bmc-sec-update n6000bmc-sec-update.3.auto: SDM trigger failure: 4
[22638.877250] dfl-pci 0000:c5:00.1: enabling device (0140 -> 0142)
[22638.877568] dfl-pci 0000:c5:00.1: PCIE AER unavailable -5.
[22638.890287] dfl-pci 0000:c5:00.2: enabling device (0140 -> 0142)
[22638.890607] dfl-pci 0000:c5:00.2: PCIE AER unavailable -5.
[22638.904091] dfl-pci 0000:c5:00.3: enabling device (0140 -> 0142)
[22638.904377] dfl-pci 0000:c5:00.3: PCIE AER unavailable -5.
[22638.916944] dfl-pci 0000:c5:00.4: enabling device (0140 -> 0142)
[22638.917231] dfl-pci 0000:c5:00.4: PCIE AER unavailable -5.
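To make the dmesg excerpt above easier to scan, here is a minimal Python sketch that pulls out only the lines that look like problems. It is not part of OPAE or the kernel tooling; the line layout it assumes ("[timestamp] driver device: message") and the keyword list are my own assumptions based on the output above.

```python
import re

# Assumed kernel log line shape: "[timestamp] driver device: message"
LINE_RE = re.compile(r"^\[\s*(\d+\.\d+)\]\s+(\S+)\s+(\S+):\s+(.*)$")

# Substrings treated as suspicious (an assumption for this sketch only)
KEYWORDS = ("failure", "unavailable")


def parse_line(line):
    """Split one dmesg line into timestamp, driver, device and message."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    ts, driver, device, msg = m.groups()
    return {"ts": float(ts), "driver": driver, "device": device, "msg": msg}


def suspicious(lines):
    """Return parsed records whose message contains one of KEYWORDS."""
    out = []
    for line in lines:
        rec = parse_line(line)
        if rec and any(k in rec["msg"].lower() for k in KEYWORDS):
            out.append(rec)
    return out


# Three lines copied from the dmesg output above
SAMPLE = """\
[22638.864360] intel-m10bmc-sec-update n6000bmc-sec-update.3.auto: SDM trigger failure: 4
[22638.877250] dfl-pci 0000:c5:00.1: enabling device (0140 -> 0142)
[22638.877568] dfl-pci 0000:c5:00.1: PCIE AER unavailable -5.
"""

if __name__ == "__main__":
    for rec in suspicious(SAMPLE.splitlines()):
        print(rec["driver"], rec["device"], "->", rec["msg"])
```

Run against the full log (for example by piping `dmesg` into a file and reading it), this should leave the "SDM trigger failure" line from the BMC secure-update driver and the "PCIE AER unavailable" lines for functions .1 to .4.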
Trying to deploy an image fails with the error included below.
Otherwise, the PCIe inventory (lspci) and the fpgainfo command seem to work, as shown below.
Any help would be appreciated. Is this a hardware problem, an on-card BMC problem, or a software problem?
fpgasupdate --log-level debug ofs_top_page1_pacsign_user1.bin 0000:C5:00.0
[2024-01-29 05:07:27.46] [DEBUG ] fw file: ofs_top_page1_pacsign_user1.bin
[2024-01-29 05:07:27.46] [DEBUG ] addr: 0000:C5:00.0
[2024-01-29 05:07:27.46] [DEBUG ] hash256: b'e026976389252b8a746943f351e8f149e5f0415f620cd1e0618229eb79e01bb8'
[2024-01-29 05:07:27.46] [DEBUG ] hash384: b'bb04ea12557ce23f2cb75685669d794fb6a06bf7b590430aa8bfdb4c765c6e15ecdb38200e1599aa8a7e52a2958e20db'
[2024-01-29 05:07:27.46] [DEBUG ] file type: Static Region (Update)
[2024-01-29 05:07:27.47] [DEBUG ] found device at 0000:c5:00.3 -tree is
[pci_address(0000:c2:04.0), pci_id(0x8086, 0x347c)] (pcieport)
[pci_address(0000:c5:00.3), pci_id(0x8086, 0xbcce)] (dfl-pci)
[pci_address(0000:c5:00.1), pci_id(0x8086, 0xbcce)] (dfl-pci)
[pci_address(0000:c5:00.4), pci_id(0x8086, 0xbcce)] (dfl-pci)
[pci_address(0000:c5:00.2), pci_id(0x8086, 0xbcce)] (dfl-pci)
[pci_address(0000:c5:00.0), pci_id(0x8086, 0xbcce)] (dfl-pci)
[2024-01-29 05:07:27.47] [DEBUG ] found device at 0000:c5:00.1 -tree is
[pci_address(0000:c2:04.0), pci_id(0x8086, 0x347c)] (pcieport)
[pci_address(0000:c5:00.3), pci_id(0x8086, 0xbcce)] (dfl-pci)
[pci_address(0000:c5:00.1), pci_id(0x8086, 0xbcce)] (dfl-pci)
[pci_address(0000:c5:00.4), pci_id(0x8086, 0xbcce)] (dfl-pci)
[pci_address(0000:c5:00.2), pci_id(0x8086, 0xbcce)] (dfl-pci)
[pci_address(0000:c5:00.0), pci_id(0x8086, 0xbcce)] (dfl-pci)
[2024-01-29 05:07:27.47] [DEBUG ] found device at 0000:c5:00.0 -tree is
[pci_address(0000:c2:04.0), pci_id(0x8086, 0x347c)] (pcieport)
[pci_address(0000:c5:00.3), pci_id(0x8086, 0xbcce)] (dfl-pci)
[pci_address(0000:c5:00.1), pci_id(0x8086, 0xbcce)] (dfl-pci)
[pci_address(0000:c5:00.4), pci_id(0x8086, 0xbcce)] (dfl-pci)
[pci_address(0000:c5:00.2), pci_id(0x8086, 0xbcce)] (dfl-pci)
[pci_address(0000:c5:00.0), pci_id(0x8086, 0xbcce)] (dfl-pci)
[2024-01-29 05:07:27.47] [DEBUG ] found device at 0000:c5:00.4 -tree is
[pci_address(0000:c2:04.0), pci_id(0x8086, 0x347c)] (pcieport)
[pci_address(0000:c5:00.3), pci_id(0x8086, 0xbcce)] (dfl-pci)
[pci_address(0000:c5:00.1), pci_id(0x8086, 0xbcce)] (dfl-pci)
[pci_address(0000:c5:00.4), pci_id(0x8086, 0xbcce)] (dfl-pci)
[pci_address(0000:c5:00.2), pci_id(0x8086, 0xbcce)] (dfl-pci)
[pci_address(0000:c5:00.0), pci_id(0x8086, 0xbcce)] (dfl-pci)
[2024-01-29 05:07:27.47] [DEBUG ] found device at 0000:c5:00.2 -tree is
[pci_address(0000:c2:04.0), pci_id(0x8086, 0x347c)] (pcieport)
[pci_address(0000:c5:00.3), pci_id(0x8086, 0xbcce)] (dfl-pci)
[pci_address(0000:c5:00.1), pci_id(0x8086, 0xbcce)] (dfl-pci)
[pci_address(0000:c5:00.4), pci_id(0x8086, 0xbcce)] (dfl-pci)
[pci_address(0000:c5:00.2), pci_id(0x8086, 0xbcce)] (dfl-pci)
[pci_address(0000:c5:00.0), pci_id(0x8086, 0xbcce)] (dfl-pci)
[2024-01-29 05:07:27.47] [DEBUG ] found device at 0000:c5:00.0 -tree is
[pci_address(0000:c2:04.0), pci_id(0x8086, 0x347c)] (pcieport)
[pci_address(0000:c5:00.3), pci_id(0x8086, 0xbcce)] (dfl-pci)
[pci_address(0000:c5:00.1), pci_id(0x8086, 0xbcce)] (dfl-pci)
[pci_address(0000:c5:00.4), pci_id(0x8086, 0xbcce)] (dfl-pci)
[pci_address(0000:c5:00.2), pci_id(0x8086, 0xbcce)] (dfl-pci)
[pci_address(0000:c5:00.0), pci_id(0x8086, 0xbcce)] (dfl-pci)
[2024-01-29 05:07:27.48] [DEBUG ] could not find: "/sys/class/fpga_region/region0/dfl-fme.0/dfl*.*/*spi*/spi_master/spi*/spi*"
[2024-01-29 05:07:27.48] [DEBUG ] could not find: "/sys/class/fpga_region/region0/dfl-fme.0/dfl*.*/spi_master/spi*/spi*"
[2024-01-29 05:07:27.48] [DEBUG ] could not find: "/sys/class/fpga_region/region0/dfl-fme.0/spi*/spi_master/spi*/spi*"
[2024-01-29 05:07:27.48] [DEBUG ] could not find: "/sys/class/fpga_region/region0/dfl-fme.0/dfl_dev.4/n6000bmc-sec-update.3.auto/*fpga_sec_mgr*/*fpga_sec*"
[2024-01-29 05:07:27.48] [DEBUG ] could not find: "/sys/class/fpga_region/region0/dfl-fme.0/dfl_dev.4/n6000bmc-sec-update.3.auto/fpga_image_load/fpga_image*"
Traceback (most recent call last):
  File "/usr/bin/fpgasupdate", line 33, in <module>
    sys.exit(load_entry_point('opae.admin===1.4.1-', 'console_scripts', 'fpgasupdate')())
  File "/usr/lib/python3.6/site-packages/opae/admin/tools/fpgasupdate.py", line 789, in main
    if pac.upload_dev.find_one(os.path.join('update', 'filename')):
AttributeError: 'NoneType' object has no attribute 'find_one'
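From the debug output, it looks as if fpgasupdate tries several sysfs glob patterns to locate the secure-update node, finds none, and is then left with `upload_dev` set to None. The following minimal Python sketch checks by hand which of those patterns match anything on the system. It is not OPAE code: the patterns are copied from the "could not find" lines above, and the `probe` helper and its names are hypothetical.

```python
import glob
import os

# FME directory taken from the "could not find" debug lines above
FME_DIR = "/sys/class/fpga_region/region0/dfl-fme.0"

# Relative glob patterns copied from the fpgasupdate debug output
PATTERNS = [
    "dfl*.*/*spi*/spi_master/spi*/spi*",
    "dfl*.*/spi_master/spi*/spi*",
    "spi*/spi_master/spi*/spi*",
    "dfl_dev.4/n6000bmc-sec-update.3.auto/*fpga_sec_mgr*/*fpga_sec*",
    "dfl_dev.4/n6000bmc-sec-update.3.auto/fpga_image_load/fpga_image*",
]


def probe(fme_dir=FME_DIR, patterns=PATTERNS):
    """Map each pattern to the sorted list of paths it matches under fme_dir."""
    return {p: sorted(glob.glob(os.path.join(fme_dir, p))) for p in patterns}


def any_match(results):
    """True if at least one pattern matched at least one path."""
    return any(results.values())


if __name__ == "__main__":
    results = probe()
    for pattern, hits in results.items():
        print("MATCH  " if hits else "MISSING", pattern)
        for hit in hits:
            print("   ", hit)
    if not any_match(results):
        print("No secure-update node found under", FME_DIR)
```

If every pattern is reported as MISSING, that would be consistent with the traceback: the driver directory for n6000bmc-sec-update.3.auto exists, but no update interface was registered under it, possibly because of the "SDM trigger failure" seen in dmesg.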
lspci -vt
| +-02.0-[c3-c4]--+-00.0 Intel Corporation Ethernet Controller E810-C for backplane
| | +-00.1 Intel Corporation Ethernet Controller E810-C for backplane
| | +-00.2 Intel Corporation Ethernet Controller E810-C for backplane
| | +-00.3 Intel Corporation Ethernet Controller E810-C for backplane
| | +-00.4 Intel Corporation Ethernet Controller E810-C for backplane
| | +-00.5 Intel Corporation Ethernet Controller E810-C for backplane
| | +-00.6 Intel Corporation Ethernet Controller E810-C for backplane
| | \-00.7 Intel Corporation Ethernet Controller E810-C for backplane
| \-04.0-[c5]--+-00.0 Intel Corporation Device bcce
| +-00.1 Intel Corporation Device bcce
| +-00.2 Intel Corporation Device bcce
| +-00.3 Intel Corporation Device bcce
| \-00.4 Intel Corporation Device bcce
fpgainfo fme
Intel Acceleration Development Platform N6001
Board Management Controller NIOS FW version: 3.14.0
Board Management Controller Build version: 3.14.0
//****** FME ******//
Object Id : 0xEF00000
PCIe s:b:d.f : 0000:C5:00.0
Vendor Id : 0x8086
Device Id : 0xBCCE
SubVendor Id : 0x8086
SubDevice Id : 0x1771
Socket Id : 0x00
Ports Num : 01
Bitstream Id : 0x5010202FAB46E6A
Bitstream Version : 5.0.1
Pr Interface Id : 00bc56cf-9e1f-5bf0-8011-48736ec862c9
Boot Page : user1
Factory Image Info : 801148736ec862c900bc56cf9e1f5bf0
User1 Image Info : 801148736ec862c900bc56cf9e1f5bf0
User2 Image Info : 801148736ec862c900bc56cf9e1f5bf0