Forum Discussion
12 Replies
- Altera_Forum
Honored Contributor
A valid or done bit is only meaningful for sequentially operating units. The FP MegaFunctions in question are fully pipelined, emitting a new result every clock cycle. The input data only has to be valid for one clock cycle; you do have to know the pipeline delay, however.
- Altera_Forum
Honored Contributor
It's quite easy just to store a valid bit in parallel to the floating-point units if you need it for other modules.
- Altera_Forum
Honored Contributor
It would have been smart thinking from the Altera guys if they had provisioned the 'valid' pipeline inside the building block. It would make for a much cleaner design as we don't have to add the glue logic mentioned by Tricky.
While we're at it: can we have a separate clock enable for every stage too? I have a dataflow-based development environment, but because all of Altera's building blocks use a global clock enable, I'll be stuck when I need more advanced functionality (or a pipeline greater than 1).
- Altera_Forum
Honored Contributor
Could anyone give a small code example showing the glue logic for the parallel valid bit that is being discussed?
- Altera_Forum
Honored Contributor
The "parallel" valid bit chain is simply a shift register (that is, a number of cascaded D flip-flops). Its delay (number of stages) is equal to the pipeline delay of the respective IP block.
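To make the idea concrete, here is a minimal behavioural sketch in Python rather than HDL. It only models the valid-bit chain cycle by cycle; in a real design the same thing would be a few lines of VHDL or Verilog clocked alongside the IP. The class name `ValidDelayLine` and the latency of 3 are assumptions chosen for illustration, not anything taken from the Altera IP.

```python
from collections import deque


class ValidDelayLine:
    """Behavioural model of a valid-bit shift register (hypothetical name).

    One entry per flip-flop; the number of stages should match the
    pipeline latency of the block the valid bit travels alongside.
    """

    def __init__(self, latency):
        if latency < 1:
            raise ValueError("latency must be at least 1")
        # All flip-flops are cleared at reset, so nothing is valid yet.
        self.stages = deque([0] * latency, maxlen=latency)

    def clock(self, valid_in):
        """Model one clock cycle.

        Returns the valid bit seen at the output during this cycle,
        then shifts valid_in into the first stage on the clock edge.
        """
        valid_out = self.stages[-1]
        # appendleft on a full deque drops the oldest (rightmost) bit.
        self.stages.appendleft(1 if valid_in else 0)
        return valid_out


LATENCY = 3  # assumed value; use the latency of your IP configuration
line = ValidDelayLine(LATENCY)
pattern = [1, 0, 1, 1, 0, 0, 0, 0]  # input valid bits, one per cycle
outputs = [line.clock(v) for v in pattern]
print(outputs)  # the input pattern delayed by LATENCY cycles
```

The output valid bit is simply the input valid bit delayed by the same number of cycles as the data, so downstream logic can tell which results correspond to real inputs.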
I agree that Altera could have added it as an option, but as mentioned above, it won't be of any use in the standard application, where a continuous data stream is fed to the IP.
- Altera_Forum
Honored Contributor
So, as I understand it:
1. Instantiate the IP in a module.
2. Also build a shift register that implements the latency delay of the IP.
3. The shift register holds a '1' for the 'valid' bit, which is shifted successively and finally given as output.
Right? Also, what is the buffer capacity of the IPs? If I keep supplying new data every clock cycle, how long can I go before I have to stall the input data to keep the IP from giving wrong outputs?
- Altera_Forum
Honored Contributor
If the design is fully pipelined there's no buffer.
It can process a new input every clock cycle. After an initial delay, the IP provides an output every clock cycle.
- Altera_Forum
Honored Contributor
But there are two different delays for an IP:
the delay between the first input and the first output, and the delay between subsequent outputs, assuming inputs are supplied every clock cycle. For example, for the exponential core there is a latency of 17 clock cycles between the first input and output, but subsequent outputs appear at intervals of 6 clock cycles (not every clock cycle), assuming new input data is supplied every clock cycle. Hence I was thinking there will be a point where the buffer, or whatever the mechanism inside the IP is, will be overflowed by the input data. Am I correct in my understanding? Thanks.
- Altera_Forum
Honored Contributor
Can you tell me where you read about the 6-cycle delay between subsequent inputs and results?
I've been reading the "Floating-Point Megafunctions User Guide" and still haven't found anything. Thanks.
- Altera_Forum
Honored Contributor
I assumed the same myself after reading the IP documentation. (Page 35 of the Floating-Point Megafunctions User Guide gives the details for the floating-point exponential IP.) But after instantiating the IP and running a testbench on it, I found that it initially takes 17 clock cycles to produce the first output and thereafter only 6 clock cycles. Try it: just instantiate the IP and run a testbench that constantly supplies input data. Let me know what you find.
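For comparison with whatever the testbench shows, here is a small Python sketch of what "fully pipelined" would predict: one input accepted per cycle, first result after the latency, then one result per cycle. This is only an idealised behavioural model, not a simulation of the actual exponential core; the function name `run_pipelined`, the use of `math.exp` as a stand-in, and the test inputs are all assumptions. The latency of 17 is the figure quoted in this thread.

```python
import math
from collections import deque


def run_pipelined(inputs, latency, func):
    """Feed one input per cycle into an idealised fully pipelined unit.

    Returns a list of (cycle, result) pairs, where cycle is the clock
    cycle in which the result is visible at the output.
    """
    # One slot per pipeline stage; None marks an empty (invalid) slot.
    pipe = deque([None] * latency, maxlen=latency)
    results = []
    for cycle in range(len(inputs) + latency):
        out = pipe[-1]  # value in the last stage during this cycle
        if out is not None:
            results.append((cycle, out))
        x = inputs[cycle] if cycle < len(inputs) else None
        # Shift on the clock edge; the oldest slot falls off the end.
        pipe.appendleft(None if x is None else func(x))
    return results


LATENCY = 17  # latency quoted in the thread for the exponential core
results = run_pipelined([0.0, 0.5, 1.0, 1.5], LATENCY, math.exp)

cycles = [c for c, _ in results]
first_latency = cycles[0]
intervals = [b - a for a, b in zip(cycles, cycles[1:])]
print("first output at cycle", first_latency)
print("cycles between outputs", intervals)
```

If a real simulation shows results spaced several cycles apart while inputs are supplied every cycle, then either the core is not fully pipelined in that configuration or the testbench is not actually presenting a new input on every clock; the model above cannot tell you which.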