NIOSV/g with FPU: inconsistent calculation results

I'm using a NIOSV/g with the FPU enabled in a MAX10 project. The project makes heavy use of floating-point calculations, hence the need for the FPU. I noticed occasional inconsistent results in this program and started debugging, assuming this was a bug in my code. However, I was able to run my code in a simulator and on a different RISCV microcontroller, and everything worked flawlessly. I also disabled the FPU in the NIOSV design, and again the code ran fine.

In order to recreate the problem, I created a basic project with just the NIOSV, some RAM and the JTAG-Uart. I also wrote a tiny C program to stress test the FPU. The results of this show that again, the FPU is producing incorrect results.

I've attached a screenshot of the Platform Designer design. I'm running the design at 75 MHz and the design meets timing requirements.

Here is the code I ran. Note that I have interrupts disabled to be sure this isn't a context-switching issue. I also did not wrap the calculations in a function, so that I could more easily view the individual calculation results in the debugger. This code works as expected when using a soft-FPU. When using the NIOSV FPU, results are inconsistent. I've attached a screenshot of one failed cycle. You can see that a1 and b1 are not equal.

#include <stdint.h>
#include <math.h>
#include "sys/alt_stdio.h"

static void fpuTest(void) {
  int fail_count = 0;
  int iteration = 0;

  while (1) {
    float a0 = (float)iteration * 0.001f;
    float a1 = 1.1f * sinf((float)iteration * 0.1f);
    float a2 = 2.2f / (1.0f + (float)iteration * 0.0001f);
    float a3 = sqrtf(3.3f + (float)iteration);
    float a4 = powf(4.4f + (float)iteration, 1.1f);
    float a5 = logf(5.5f + (float)iteration + 1.0f);
    float a6 = 6.6f * cosf((float)iteration * 0.05f);
    float a7 = 7.7f + tanf((float)iteration * 0.02f);

    float result_a = a0 + a1 + a2 + a3 + a4 + a5 + a6 + a7;

    float b0 = (float)iteration * 0.001f;
    float b1 = 1.1f * sinf((float)iteration * 0.1f);
    float b2 = 2.2f / (1.0f + (float)iteration * 0.0001f);
    float b3 = sqrtf(3.3f + (float)iteration);
    float b4 = powf(4.4f + (float)iteration, 1.1f);
    float b5 = logf(5.5f + (float)iteration + 1.0f);
    float b6 = 6.6f * cosf((float)iteration * 0.05f);
    float b7 = 7.7f + tanf((float)iteration * 0.02f);

    float result_b = b0 + b1 + b2 + b3 + b4 + b5 + b6 + b7;

    // Check if result is consistent (should be identical)
    if (fabsf(result_a - result_b) > 1e-6f) {
      alt_printf("FPU test failed at iteration %x\n", iteration);
      fail_count++;
    }

    iteration++;
  }
}

int main(void) {
  // Make sure interrupts are disabled
  __asm volatile ( "csrc mstatus, 8" );

  fpuTest();

  while (1);

  return 0;
}
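
Since the test above compares only the two sums against a tolerance, a per-term, bit-exact comparison might help narrow down which operation diverges. The following is a minimal sketch, not part of the attached project: the helper names float_bits and first_mismatch are hypothetical, and it assumes the a0..a7 and b0..b7 terms are stored in two arrays.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stdint.h>
#include <string.h>

static_assert(sizeof(float) == sizeof(uint32_t), "float must be 32 bits");

/* Return the raw bit pattern of a float (memcpy avoids aliasing problems). */
static uint32_t float_bits(float f) {
  uint32_t u;
  memcpy(&u, &f, sizeof u);
  return u;
}

/* Compare two arrays of terms bit for bit.
 * Returns the index of the first differing term, or -1 if all match. */
static int first_mismatch(const float *a, const float *b, int n) {
  for (int i = 0; i < n; i++) {
    if (float_bits(a[i]) != float_bits(b[i])) {
      return i;
    }
  }
  return -1;
}
```

With the terms stored as float a[8] and float b[8], a call such as first_mismatch(a, b, 8) would report which term differed first, and float_bits could be printed with the %x format for inspection. Note that a bitwise comparison also treats +0.0f and -0.0f as different.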

Can someone help me investigate what could be wrong here? Could there be an issue in the FPU itself?

Nios Architecture

11 Replies

Mark_H_Intel1
New Contributor
4 months ago
Hi @Broddo
Our patch has passed internal testing, which included a check against your software and Nios V parameterization.
It should become available on our website in roughly a week's time - we will update the thread when it is available.
Thanks
Mark
- Broddo
  Occasional Contributor
  4 months ago
  This is fantastic @Mark_H_Intel1 - thank you!

LYGOOI

New Contributor

5 months ago

Hi @Broddo

Can you share your processor IP Parameter Editor settings?

And which Quartus version is your design based on?

Or, you can zip the design & attach it in your next reply.

Regards,

Liang Yu

Broddo

Occasional Contributor

5 months ago

Thanks for the reply @LYGOOI and apologies for the delay in getting back to you - I was on vacation.

To answer your questions: I'm using Quartus Lite 24.1 and I've pasted the CPU parameters below.

I've attached the test project that builds all of this. I'm running on a custom board (that was previously running a NIOS2 application with no issues). If you want to run it yourself, the only change you'll have to make is the location of the source clock.

For convenience, I've added a Makefile that will build the project and the software - you'll see for yourself.

Here are the CPU parameters:

<module
   name="intel_niosv_g_0"
   kind="intel_niosv_g"
   version="4.0.0"
   enabled="1">
  <parameter name="AUTO_CLK_CLOCK_DOMAIN" value="3" />
  <parameter name="AUTO_CLK_RESET_DOMAIN" value="3" />
  <parameter name="AUTO_DEVICE" value="10M50DAF484C8G" />
  <parameter name="AUTO_DEVICE_SPEEDGRADE" value="8" />
  <parameter name="Blind_Window_Period" value="1000" />
  <parameter name="CLICenabledInterruptMode" value="0" />
  <parameter name="CLICenabledShadowRegisterFiles" value="1" />
  <parameter name="CUSTOM_OP" value="" />
  <parameter name="Default_Timeout_Period" value="255" />
  <parameter name="SUB_OP" value="" />
  <parameter name="alignCLICVectorTable" value="8" />
  <parameter name="basicInterruptMode" value="0" />
  <parameter name="basicShadowRegisterFiles" value="0" />
  <parameter name="clockFrequency" value="75000000" />
  <parameter name="dataCacheSize" value="4096" />
  <parameter name="dataSlaveMapParam"><![CDATA[<address-map><slave name='onchip_flash.data' start='0x0' end='0x160000' type='altera_onchip_flash.data' /><slave name='onchip_memory.s1' start='0x200000' end='0x214000' type='altera_avalon_onchip_memory2.s1' /><slave name='intel_niosv_g_0.dm_agent' start='0x220000' end='0x230000' type='intel_niosv_g.dm_agent' /><slave name='intel_niosv_g_0.timer_sw_agent' start='0x230000' end='0x230040' type='intel_niosv_g.timer_sw_agent' /><slave name='jtag_uart_0.avalon_jtag_slave' start='0x230040' end='0x230048' type='altera_avalon_jtag_uart.avalon_jtag_slave' /><slave name='onchip_flash.csr' start='0x230048' end='0x230050' type='altera_onchip_flash.csr' /></address-map>]]></parameter>
  <parameter name="deviceFamily" value="MAX 10" />
  <parameter name="disableFsqrtFdiv" value="false" />
  <parameter name="dtcm1Base" value="18874368" />
  <parameter name="dtcm1InitFile" value="" />
  <parameter name="dtcm1Size" value="0" />
  <parameter name="dtcm2Base" value="0" />
  <parameter name="dtcm2InitFile" value="" />
  <parameter name="dtcm2Size" value="0" />
  <parameter name="enableBranchPrediction" value="true" />
  <parameter name="enableCLICInterruptEdgeTriggerConfig" value="false" />
  <parameter name="enableCLICInterruptPolarityConfig" value="false" />
  <parameter name="enableCLICSelectiveHardwareVectoring" value="false" />
  <parameter name="enableCoreLevelInterruptController" value="false" />
  <parameter name="enableDebug" value="true" />
  <parameter name="enableDebugReset" value="true" />
  <parameter name="enableECCFull" value="false" />
  <parameter name="enableECCLite" value="false" />
  <parameter name="enableFPU" value="true" />
  <parameter name="enableLockstep" value="false" />
  <parameter name="enableLockstepExtRst" value="false" />
  <parameter name="enableMulDiv" value="true" />
  <parameter name="funct3" value="" />
  <parameter name="funct7_l" value="" />
  <parameter name="funct7_u" value="" />
  <parameter name="hartId" value="0" />
  <parameter name="instCacheSize" value="4096" />
  <parameter name="instSlaveMapParam"><![CDATA[<address-map><slave name='onchip_flash.data' start='0x0' end='0x160000' type='altera_onchip_flash.data' /><slave name='onchip_memory.s1' start='0x200000' end='0x214000' type='altera_avalon_onchip_memory2.s1' /><slave name='intel_niosv_g_0.dm_agent' start='0x220000' end='0x230000' type='intel_niosv_g.dm_agent' /></address-map>]]></parameter>
  <parameter name="itcm1Base" value="19922944" />
  <parameter name="itcm1InitFile" value="" />
  <parameter name="itcm1Size" value="0" />
  <parameter name="itcm2Base" value="0" />
  <parameter name="itcm2InitFile" value="" />
  <parameter name="itcm2Size" value="0" />
  <parameter name="mnemonic" value="" />
  <parameter name="numCLICDebugTriggers" value="0" />
  <parameter name="numCLICLevels" value="2" />
  <parameter name="numCLICPlatformInterrupts" value="16" />
  <parameter name="numCLICPriorities" value="8" />
  <parameter name="opcode" value="" />
  <parameter name="peripheralRegionABase" value="2293760" />
  <parameter name="peripheralRegionASize" value="65536" />
  <parameter name="peripheralRegionBBase" value="67108864" />
  <parameter name="peripheralRegionBSize" value="2097152" />
  <parameter name="resetOffset" value="0" />
  <parameter name="resetSlave" value="onchip_flash.data" />
  <parameter name="useResetReq" value="false" />
 </module>

fpu-test.zip (351 KB)

LYGOOI
New Contributor
5 months ago
Hi @Broddo,
Thanks for the design.
We were able to replicate the same issue. We have found the cause and are currently investigating it more deeply.
As a temporary workaround, please perform the following steps:
1. Disable the caches in Platform Designer (select No Cache for both the instruction and data cache).
2. Enable C++ in the BSP Editor (check the enable_c_plus_plus checkbox).
Regards,
Liang Yu
BoonBengT_Altera
Moderator
5 months ago
Hi @Broddo,

Good day, just following up on the previous clarification.
By any chance, did you manage to try out the workaround?
Hope to hear from you soon.

Best Wishes
BB
Broddo
Occasional Contributor
5 months ago
@BoonBengT_Altera @LYGOOI
My apologies again for the delay in responding. Yes, I can confirm that the workaround suggested by @LYGOOI does address the problem. The FPU is producing consistently correct results now, so thanks for this!
However, disabling caching is a big trade-off for my project, as I'm executing in place from external flash. I'll need to profile this to determine whether FPU without cache is more performant than soft-float with cache. For now, I'm working with the latter and will wait for the fix in a coming release.
Thanks once again,
Broddo
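
The profiling comparison mentioned above (FPU without cache versus soft-float with cache) could be approached by timing the same floating-point kernel in both builds. This is only a rough sketch under stated assumptions: float_kernel and time_kernel are hypothetical names, the workload is a stand-in rather than the real application, and it uses the standard clock() function, which may need to be replaced by a board-specific timer on the target.

```c
#include <time.h>

/* Hypothetical workload: multiply/divide/add mix loosely based on the stress test. */
static float float_kernel(int iterations) {
  float acc = 0.0f;
  for (int i = 0; i < iterations; i++) {
    float x = (float)i;
    acc += 2.2f / (1.0f + x * 0.0001f) + x * 0.001f;
  }
  return acc;
}

/* Run the kernel once, store its result, and return the elapsed CPU time in seconds. */
static double time_kernel(int iterations, float *result) {
  clock_t start = clock();
  *result = float_kernel(iterations);
  clock_t end = clock();
  return (double)(end - start) / CLOCKS_PER_SEC;
}
```

Building this once with the hardware FPU and caches disabled, and once with soft-float and caches enabled, then comparing the returned times for the same iteration count, would give a first-order answer. Returning the result through a pointer keeps the compiler from discarding the computation.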
BoonBengT_Altera
Moderator
5 months ago
Hi @Broddo,

Great! Thanks for confirming that the workaround is working and for filling us in on your plans. As no further clarification is needed on this thread, it will be transitioned to community support for any further questions.

Please log in to https://supporttickets.intel.com/s/?language=en_US, view the details of the desired request, and post a feed/response within the next 15 days to allow me to continue to support you. After 15 days, this thread will be transitioned to community support. The community users will be able to help you with your follow-up questions.
Thank you for the questions; as always, it is a pleasure having you here.

Best Wishes
BB
Mark_H_Intel1
New Contributor
4 months ago
Hi @Broddo
Just to let you know, we have found and fixed the issue. A patch is being prepared and is currently in internal testing.
Thank you very much for the example code, that helped us find the root cause very quickly.
Mark
LYGOOI
New Contributor
4 months ago
Hi @Broddo ,
Here is the patch file for the Nios V FPU inconsistency issue in MAX 10.
While waiting for the KDB article, you can start patching your Quartus software.
Regards,
Liang Yu
quartus-24.1std-0.02std-windows.zip (14.7 MB)
quartus-24.1std-0.02std-linux.zip (12.7 MB)
quartus-24.1std-0.02std-readme.txt (2 KB)
- Broddo
  Occasional Contributor
  3 months ago
  Thanks @LYGOOI
  I'll try the patch today and report back.
