Forum Discussion

Altera_Forum
15 years ago

Floating Point divider, throughput/latency

Hi all,

I'm considering using the floating-point blocks available from Altera in a design I have. Throughput is a top priority, as I'm doing complex image transformation for a high-speed application.

I was trying out the altfp_div megafunction and found that it outputs correct answers only every other clock cycle, possibly worse, as I only tried it with two numbers. Is altfp_div not fully pipelined? I saw this also with latency set to 14.

I have attached figures showing this, where div6rest_result holds the correct values and div6_result_real is the value from altfp_div.

I also did the same for the altfp_inv floating point inverter. It seems to output correct values when receiving data every cycle.

I also attached a zip file with this test, made in Quartus 9.1.

Cheers

Stefan

P.S. A second question: are those megafunctions free to use with the Web Edition?

19 Replies

  • Altera_Forum

    --- Quote Start ---

    If you want an RTL simulation that is consistent with the real device, let the input data change slightly before the edge of the clock that registers it.

    --- Quote End ---

    How do you suggest doing this in synthesizable RTL code?
  • Altera_Forum

    I would expect that Altera is mainly focusing on synthesis performance. I also wouldn't expect issues in the timing analysis.

    At the latest since Altera abandoned the internal simulator in favor of ModelSim, one should expect simulation-proof IP code. Having the timing intentionally set in a test case is only the most obvious case. Normally, the data to altfp_mult would be supplied after the rising clock edge from a register buried deep in the code. I guess some experience of the latter kind motivated your tests? So you should file a service request; I'm curious to hear the answer.

    In the meantime, you could try how many simulation time steps of additional delay in your test bench make the artefact vanish.
  • Altera_Forum

    --- Quote Start ---

    How do you suggest doing this in synthesizable RTL code?

    --- Quote End ---

    I do not suggest doing this in synthesizable RTL code.

    I suggest doing this in the testbench.

    Testbenches are not for synthesis, and it is your decision when to let the input data change.
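
    A minimal sketch of what this could look like in a testbench. It is not taken from the attached project: the entity name, signal names, data width and clock period are all placeholders chosen for illustration. The stimulus is applied on the falling edge, so the data is stable well before the rising edge that registers it.

    ```vhdl
    library ieee;
    use ieee.std_logic_1164.all;
    use ieee.numeric_std.all;

    -- Hypothetical testbench skeleton; all names here are assumptions.
    entity tb_stimulus_sketch is
    end entity;

    architecture sim of tb_stimulus_sketch is
      constant CLK_PERIOD : time := 10 ns;
      signal clk  : std_logic := '0';
      signal data : std_logic_vector(31 downto 0) := (others => '0');
      signal done : boolean := false;
    begin
      -- Clock generator; stops once the stimulus process sets 'done'.
      clk_gen : process
      begin
        while not done loop
          clk <= '0';
          wait for CLK_PERIOD / 2;
          clk <= '1';
          wait for CLK_PERIOD / 2;
        end loop;
        wait;
      end process;

      -- Stimulus: change the data on the falling edge, half a period
      -- before the rising edge on which the device under test samples it.
      stimulus : process
      begin
        for i in 1 to 8 loop
          wait until falling_edge(clk);
          data <= std_logic_vector(to_unsigned(i, data'length));
        end loop;
        wait until rising_edge(clk);
        done <= true;
        wait;
      end process;

      -- The device under test would be instantiated here,
      -- with 'clk' and 'data' connected to its ports.
    end architecture;
    ```

    Using the falling edge is only one option; a small fixed delay after the rising edge would also separate the data change from the clock edge in simulation time.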
  • Altera_Forum

    --- Quote Start ---

    I would expect that Altera is mainly focusing on synthesis performance. I also wouldn't expect issues in the timing analysis.

    At the latest since Altera abandoned the internal simulator in favor of ModelSim, one should expect simulation-proof IP code. Having the timing intentionally set in a test case is only the most obvious case. Normally, the data to altfp_mult would be supplied after the rising clock edge from a register buried deep in the code. I guess some experience of the latter kind motivated your tests? So you should file a service request; I'm curious to hear the answer.

    In the meantime, you could try how many simulation time steps of additional delay in your test bench make the artefact vanish.

    --- Quote End ---

    I don't think this is an artifact.

    The edge of the clock and the data change simultaneously.

    How should the simulator decide which is the actual input of the flip flop in an RTL simulation?

    In a timing simulation the situation is different. There you can check for setup and hold times.

    I think this is one of the cases in which the result of the simulation depends on the simulator.

    In general, letting data and clock edge change at the same time is asking for trouble.
  • Altera_Forum

    --- Quote Start ---

    I don't think this is an artifact.

    The edge of the clock and the data change simultaneously.

    How should the simulator decide which is the actual input of the flip flop in an RTL simulation?

    In a timing simulation the situation is different. There you can check for setup and hold times.

    I think this is one of the cases in which the result of the simulation depends on the simulator.

    In general, letting data and clock edge change at the same time is asking for trouble.

    --- Quote End ---

    Well, this is actually wrong.

    Synthesis and clock tree generation will take care of this. Gate-level simulation is used to verify that the clock tree is buffered correctly and that the design does not break setup/hold times. Running in max/min corners will assure you don't break the setup/hold times.

    If you were hand-placing the block in the FPGA, you would use falling->rising edge clocking, or you would use it to protect your design against metastability between the analog world and the FPGA or when crossing clock domains...
  • Altera_Forum

    This is a delta-delay issue, as FvM mentioned. I added 1ps to the rising_edge and the divider works.

    I'm installing version 10.1sp1 of the tools to see if this is the same there...
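
    For readers unfamiliar with delta-delay races, here is a generic sketch of how one can arise in a functional simulation. Whether the altfp_div simulation model contains a structure like this is an assumption and has not been confirmed here; the entity and all signal names are made up. The clock is passed through a plain signal assignment, which costs one delta cycle, so the second register is clocked in the same delta in which the first register's output updates.

    ```vhdl
    library ieee;
    use ieee.std_logic_1164.all;

    -- Hypothetical illustration of a delta-delay race; not altfp_div code.
    entity delta_race_sketch is
    end entity;

    architecture sim of delta_race_sketch is
      signal clk     : std_logic := '0';
      signal clk_dly : std_logic := '0';  -- same clock, one delta cycle later
      signal d       : std_logic := '0';
      signal q1      : std_logic := '0';  -- first stage, clocked by clk
      signal q2      : std_logic := '0';  -- second stage, clocked by clk_dly
      signal done    : boolean := false;
    begin
      -- 10 ns clock with rising edges at 5 ns, 15 ns, ...
      clk_gen : process
      begin
        while not done loop
          wait for 5 ns;
          clk <= not clk;
        end loop;
        wait;
      end process;

      -- A concurrent assignment delays the clock by one delta cycle.
      clk_dly <= clk;

      stage1 : process (clk)
      begin
        if rising_edge(clk) then
          q1 <= d;  -- q1 updates one delta after the clk edge
        end if;
      end process;

      stage2 : process (clk_dly)
      begin
        if rising_edge(clk_dly) then
          q2 <= q1;  -- runs in the delta where q1 has already changed
        end if;
      end process;

      check : process
      begin
        wait until falling_edge(clk);
        d <= '1';
        wait until rising_edge(clk);  -- one clock edge after d changed
        wait for 1 ns;
        -- Two registers should need two edges, but q2 already follows d.
        report "q1=" & std_logic'image(q1) & " q2=" & std_logic'image(q2);
        done <= true;
        wait;
      end process;
    end architecture;
    ```

    A small real-time delay on the data, such as the 1ps above, moves the data change past every delta cycle of the clock edge, which would explain why it hides this kind of race.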
  • Altera_Forum

    --- Quote Start ---

    How should the simulator decide which is the actual input of the flip flop in an RTL simulation?

    --- Quote End ---

    VHDL has clear rules for it. Actually, most of the VHDL rules for "order of execution" are only relevant for simulation. In synthesis, logic delays clear up most possible doubts; if timing closure can be achieved, everything is fine.

    Please note that the testbench code

    wait until rising_edge(clk);
    data <= xxx;

    is simply identical to the behavioral code for a register in the data path. So if altfp_div fails with the testbench, it must be expected to fail in functional simulation with a simple register, too. This means you can't perform a functional simulation, as the original poster mentioned.
  • Altera_Forum

    I tested this with altfp_div generated in 10.1sp1 and ModelSim 6.6d. It fails there as well. I will try to send a service request to Altera and see what they think.

    Thanks all for the help.

    Cheers

    Stefan