Forum Discussion

Altera_Forum
Honored Contributor
10 years ago

Quartus complaint: My LOOP is too big? But the code is from the book, what's wrong?

Hello friends. I'm trying to implement floating-point addition and subtraction. My guide is the book "Computer Arithmetic and Verilog HDL Fundamentals" by Cavanagh. Inside the module he uses the following code to align the exponents:

always @ (oper_1 or oper_2)
begin
   exp_a = oper_1 [31:24];
   exp_b = oper_2 [31:24];
   fract_a = oper_1 [23:0];
   fract_b = oper_2 [23:0];

   // bias exponents
   exp_a_bias = exp_a + 8'b0111_1111;
   exp_b_bias = exp_b + 8'b0111_1111;

   // align fractions
   if (exp_a_bias < exp_b_bias)
      ctrl_align = exp_b_bias - exp_a_bias;
   while (ctrl_align)
   begin
      fract_a = fract_a >> 1;
      exp_a_bias = exp_a_bias + 1;
      ctrl_align = ctrl_align - 1;
   end

   if (exp_b_bias < exp_a_bias)
      ctrl_align = exp_a_bias - exp_b_bias;
   while (ctrl_align)
   begin
      fract_b = fract_b >> 1;
      exp_b_bias = exp_b_bias + 1;
      ctrl_align = ctrl_align - 1;
   end

Quartus II gives me the following error:

Error (10119): Verilog HDL Loop Statement error at ADD_SUB_FLO.v(40): loop with non-constant loop condition must terminate within 250 iterations

I searched, and it seems that because Quartus cannot determine how large "ctrl_align" can get, it won't synthesize the loop. The Quartus site says I can edit the .qsf file, but I couldn't find any line like this in my file:

set_global_assignment -name VERILOG_NON_CONSTANT_LOOP_LIMIT 300

Moreover, I'm rather surprised that the book teaches this step with code that is not synthesizable. I would also like to actually fix the problem rather than work around it. What can I do?
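
The error comes from the two "while (ctrl_align)" loops: a synthesis tool has to unroll a loop into hardware, and here the iteration count depends on run-time data. A loop-free alternative is to shift by the exponent difference in one step, since shifting right by one bit n times gives the same result as a single right shift by n. The following is a minimal sketch under assumptions, not the book's code: the module name "align_fractions", the output "exp_aligned", and the helper "shift_amt" are hypothetical, while the field positions and the bias constant are copied from the snippet above.

```verilog
// Hypothetical module: loop-free exponent alignment.
// Field widths follow the posted snippet; all names are illustrative.
module align_fractions (
    input  wire [31:0] oper_1,
    input  wire [31:0] oper_2,
    output reg  [23:0] fract_a,
    output reg  [23:0] fract_b,
    output reg  [7:0]  exp_aligned   // common exponent after alignment
);
    reg [7:0] exp_a_bias, exp_b_bias;
    reg [7:0] shift_amt;

    always @(*) begin
        // bias exponents, as in the posted snippet
        exp_a_bias = oper_1[31:24] + 8'b0111_1111;
        exp_b_bias = oper_2[31:24] + 8'b0111_1111;

        // defaults, so every variable is assigned on every path
        fract_a     = oper_1[23:0];
        fract_b     = oper_2[23:0];
        exp_aligned = exp_a_bias;
        shift_amt   = 8'd0;

        if (exp_a_bias < exp_b_bias) begin
            // one variable shift replaces the first while loop
            shift_amt   = exp_b_bias - exp_a_bias;
            fract_a     = oper_1[23:0] >> shift_amt;
            exp_aligned = exp_b_bias;
        end else if (exp_b_bias < exp_a_bias) begin
            // one variable shift replaces the second while loop
            shift_amt   = exp_a_bias - exp_b_bias;
            fract_b     = oper_2[23:0] >> shift_amt;
            // exp_aligned already holds exp_a_bias
        end
    end
endmodule
```

A variable shift like this is normally synthesized as a barrel shifter (a multiplexer network), and in Verilog a logical right shift of a 24-bit value by 24 or more simply yields zero. The defaults at the top of the block also give "shift_amt" a value on every path, which the posted code does not do for "ctrl_align" when an "if" condition is false.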

15 Replies

  • Altera_Forum

    --- Quote Start ---

    I think floating point is mandatory in this case. I need iterative divisions, and the divisions will produce fractional numbers of different sizes, for example 2.9 - 2.1899. The exponents must be aligned before any operation, so they have to vary with the difference in input size, and this is exactly what floating point is about.

    --- Quote End ---

    In that case, stick with the Altera floating-point IP cores rather than trying to write your own floating-point arithmetic.

    That is, of course, if there is no chance you can redesign the algorithm to fit an FPGA. If you want lots of floating point (and especially division, which is particularly expensive in an FPGA), why not use a DSP?
  • Altera_Forum

    --- Quote Start ---

    In that case, stick with the Altera floating-point IP cores rather than trying to write your own floating-point arithmetic.

    That is, of course, if there is no chance you can redesign the algorithm to fit an FPGA. If you want lots of floating point (and especially division, which is particularly expensive in an FPGA), why not use a DSP?

    --- Quote End ---

    Tricky, sorry for the question, but is the DSP you mentioned an accessory chip on the development board, or is it the NIOS-II?

    In theory, I would need about 57,600 processing elements (240 x 240 pixels from a black-and-white picture). Each processing element calculates an equation like this:

    http://www.alteraforum.com/forum/attachment.php?attachmentid=11768&stc=1

    where A is an integer from 0 to 255 and 1 pixel = 1 processing element. (B, X and Y start as random numbers and are updated by adding the results of the other pixels' PEs, but that's a different story.) I expected to implement this in hardware somehow.

    In fact, since this is a school project, the requirements are pretty flexible, so I have the freedom to use any technology (but I will have to explain the reason later).
  • Altera_Forum

    A DSP chip is a processor designed for digital signal processing. You can get ones that work in fixed or floating point: https://en.wikipedia.org/wiki/digital_signal_processor

    If you try to make 57600 PEs in a single FPGA, you're doomed to failure. For a start, your image (240x240) will not arrive in parallel. You will get the data as a streaming set of pixels. Then you need to calculate the results in series.

    Also, the algorithm looks about ready for some redesign, and I don't understand why it cannot be fixed point. If you know the width of the inputs, then you can work out the max width of U.
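
    To illustrate the width bookkeeping mentioned above, here is a small sketch. The equation from the attachment is not reproduced in this thread, so the expression a*b + x is purely a stand-in, and the module name "fixed_width_demo" and the widths of "b" and "x" are assumptions; only the 0 to 255 range of A comes from the discussion.

    ```verilog
    // Hypothetical example: width bookkeeping for a fixed-point expression.
    // The expression a*b + x is a stand-in, not the equation from the thread.
    module fixed_width_demo (
        input  wire [7:0]  a,   // unsigned, 0..255
        input  wire [7:0]  b,   // assumed 8-bit unsigned
        input  wire [15:0] x,   // assumed 16-bit unsigned
        output wire [16:0] u    // 17 bits cover the worst case
    );
        // 8-bit * 8-bit needs at most 16 bits: 255 * 255 = 65025 < 2^16.
        wire [15:0] prod = a * b;

        // 16-bit + 16-bit needs at most 17 bits: 65025 + 65535 = 130560 < 2^17.
        assign u = {1'b0, prod} + {1'b0, x};
    endmodule
    ```

    The same reasoning extends to longer expressions: each multiplication adds the operand widths, and each addition of two equally wide values adds one bit.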
  • Altera_Forum

    --- Quote Start ---

    A DSP chip is a processor designed for digital signal processing. You can get ones that work in fixed or floating point: https://en.wikipedia.org/wiki/digital_signal_processor

    If you try to make 57600 PEs in a single FPGA, you're doomed to failure. For a start, your image (240x240) will not arrive in parallel. You will get the data as a streaming set of pixels. Then you need to calculate the results in series.

    Also, the algorithm looks about ready for some redesign, and I don't understand why it cannot be fixed point. If you know the width of the inputs, then you can work out the max width of U.

    --- Quote End ---

    Tricky, thanks as always for your support. 57600 PEs might be an exaggeration (I could actually implement only 1,000 pixels for the sake of simplicity), but the way the data gets to the PEs is actually not relevant (according to the specs of my project). The processing must start once each piece of data is in "position". On the other hand, I'm thinking of reformulating my project to use the resources of my DE1-SOC board. Do you think I could use NIOS-II and C or assembly language to compute the equation above on (let's say) 1,000 independent, parallel cores (in a way that makes sense)?
  • Altera_Forum

    I don't think that you need a DSP or other processor to perform the calculation. You can also write HDL code that processes data sequentially, using block RAM to hold the array elements. You have to set up a state machine that controls the calculation flow.

    I presume that the parallel cores idea doesn't fit any FPGA hardware available to you.
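
    A rough sketch of the sequential approach described above: one shared processing element steps through an array held in on-chip memory under the control of a small state machine. Everything here is an assumption for illustration, including the module name "seq_pixel_proc", the parameters, the 8-bit data width, and the placeholder operation standing in for the real equation. Loading the array and reading results back out are omitted.

    ```verilog
    // Hypothetical sketch: a single processing element walking over an array
    // stored in memory. Names, widths, and the operation are illustrative.
    module seq_pixel_proc #(
        parameter N_PIXELS = 1000,   // number of array elements (assumed)
        parameter ADDR_W   = 10      // enough bits to address N_PIXELS
    ) (
        input  wire clk,
        input  wire rst,
        input  wire start,
        output reg  done             // one-cycle pulse after the last element
    );
        // Array storage; whether this maps to block RAM depends on the
        // tool and on matching its recommended memory coding template.
        reg [7:0] mem [0:N_PIXELS-1];

        localparam IDLE = 2'd0, READ = 2'd1, CALC = 2'd2, WRITE = 2'd3;
        reg [1:0]        state;
        reg [ADDR_W-1:0] addr;
        reg [7:0]        pixel;
        reg [7:0]        result;

        always @(posedge clk) begin
            if (rst) begin
                state <= IDLE;
                addr  <= 0;
                done  <= 1'b0;
            end else begin
                case (state)
                    IDLE: begin
                        done <= 1'b0;
                        if (start) begin
                            addr  <= 0;
                            state <= READ;
                        end
                    end
                    READ: begin
                        pixel <= mem[addr];       // registered read
                        state <= CALC;
                    end
                    CALC: begin
                        result <= pixel + 8'd1;   // placeholder for the real equation
                        state  <= WRITE;
                    end
                    WRITE: begin
                        mem[addr] <= result;
                        if (addr == N_PIXELS - 1) begin
                            done  <= 1'b1;
                            state <= IDLE;
                        end else begin
                            addr  <= addr + 1'b1;
                            state <= READ;
                        end
                    end
                endcase
            end
        end
    endmodule
    ```

    With this structure each element takes three clock cycles, so the total time grows with the array size while the logic cost stays that of a single processing element.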