Altera_Forum
Honored Contributor
14 years ago

the waitrequest of avalon

Hi:

The output should be registered; many papers say so. But I find that registering the waitrequest output delays it by a cycle, which makes the register timing difficult to design. Using combinational logic would solve the problem, but I am not sure whether it is reliable.

thank you!

8 Replies

  • Altera_Forum
    Honored Contributor

    --- Quote Start ---

    Hi:

    The output should be registered; many papers say so. But I find that registering the waitrequest output delays it by a cycle, which makes the register timing difficult to design. Using combinational logic would solve the problem, but I am not sure whether it is reliable.

    thank you!

    --- Quote End ---

    There are really only two kinds of hardware interface for custom slaves (ignoring Altera-provided IP for DDR controllers and the like):

    1) The slave is slow.

    In this case, leave waitrequest asserted until the transaction occurs (read or write detected as asserted). This allows you to use both input and output registers on all signals. The latency does not really matter, since the device was slow anyway.

    2) The slave is fast.

    E.g., a block of on-chip registers or RAM.

    For this case you simply leave waitrequest deasserted. There's really no problem accepting a write or read transaction on every clock and pipelining the transaction. Write address, byte-enables, and data get routed through pipeline registers into the RAM or control registers blocks. Read data is then delivered a few pipeline cycles later with the assertion of readdatavalid. Multiple back-to-back data phases are no problem.

    These are the two approaches that give you the best timing results, since both can use input and output registers.
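
A cycle-level sketch of the "fast" case may help make the pipelining concrete. It is a behavioral model in Python, not HDL, and not taken from the thread: the class name `PipelinedRamSlave`, the `read_latency` parameter, and the `clock()` method are hypothetical names chosen for illustration. It assumes the master never asserts read and write in the same cycle.

```python
from collections import deque


class PipelinedRamSlave:
    """Behavioral model of a slave that never asserts waitrequest.

    Hypothetical illustration: reads are accepted on every clock and the
    data comes back `read_latency` clocks later with readdatavalid.
    """

    def __init__(self, depth=16, read_latency=2):
        self.mem = [0] * depth
        # One slot per pipeline stage; None means "no read in this stage".
        self.pipe = deque([None] * read_latency, maxlen=read_latency)
        self.waitrequest = False  # held low: every command is accepted

    def clock(self, read=False, write=False, address=0, writedata=0):
        """Advance one clock edge; return (readdatavalid, readdata)."""
        leaving = self.pipe[0]  # oldest stage reaches the output now
        if write:
            self.mem[address] = writedata
        # A new read enters the pipeline; appending drops the oldest stage.
        self.pipe.append(self.mem[address] if read else None)
        return (leaving is not None, leaving)


def demo():
    slave = PipelinedRamSlave(depth=16, read_latency=2)
    for addr in range(4):
        slave.clock(write=True, address=addr, writedata=10 + addr)
    results = []
    for cycle in range(4 + 2):  # four back-to-back reads, then drain
        issuing = cycle < 4
        valid, data = slave.clock(read=issuing, address=cycle if issuing else 0)
        if valid:
            results.append((cycle, data))
    return results


print(demo())
```

Four reads issued on consecutive clocks come back on four consecutive clocks, two cycles after they were issued, and waitrequest never changes state.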

    The examples shown in the Avalon-MM spec would require combinatorial paths to implement. They are not very good examples.

    TimeQuest timing analysis can be used to analyze the timing of your component designs.

    Cheers,

    Dave
  • Altera_Forum
    Honored Contributor

    --- Quote Start ---

    There are really only two kinds of hardware interface for custom slaves (ignoring Altera-provided IP for DDR controllers and the like):

    1) The slave is slow.

    <snip>

    2) The slave is fast.

    <snip>

    --- Quote End ---

    Your classification of interfaces is incorrect and unnecessary. An Avalon component needs to have a wait request or it must accept commands on every clock cycle. This is spelled out in the Avalon spec and has nothing to do with a slave that is 'fast' or 'slow'.

    As an example where your classification breaks down, consider a simple fifo inside the slave. The fifo exists because the output at certain times cannot keep up with the input. The input side can accept new data on every clock cycle (so it would be 'fast' by your classification), but could fill up so it would need to have a wait request output which is connected to the fifo full signal. Having the wait request would make it 'slow' by your classification. So this example would be both 'fast' and 'slow' according to you. Your classification of 'fast' and 'slow' is meaningless and does nothing to define how to implement wait request.

    --- Quote Start ---

    The examples shown in the Avalon-MM spec would require combinatorial paths to implement. They are not very good examples.

    --- Quote End ---

    Not true.

    Kevin Jennings
  • Altera_Forum
    Honored Contributor

    --- Quote Start ---

    Your classification of interfaces is incorrect and unnecessary. An Avalon component needs to have a wait request or it must accept commands on every clock cycle. This is spelled out in the Avalon spec and has nothing to do with a slave that is 'fast' or 'slow'.

    --- Quote End ---

    The fast and slow classification referred to how fast a component can deliver data to the Avalon interface, and how to implement the waitrequest control for both styles. You point out a slight variation, so I'll address that.

    The implementation detail that requires separation into these two classifications is that the device should have both input and output registers. These registers are there to cut the timing path between the fabric and your custom component.

    1) Slow devices.

    If a device cannot deliver data onto an Avalon bus at a sustained rate comparable to the Avalon clock rate, then it's slow. For example, an LCD or other off-chip device. Slow devices can start with waitrequest active.

    The input registers on the read/write controls delay those signals into the component by one clock. If waitrequest was deasserted, then the component would have already accepted one transaction. Given that the component waitrequest output is also registered, then by the time it asserts the response handshake, it may have had to accept two transactions. By starting out with waitrequest active, the control FSM can deassert the control for a single clock, and then process the transaction.

    2) Fast devices.

    A fast device such as a RAM, registers, or an interface with a FIFO can generally start with waitrequest deasserted. RAM or registers can handle writes on every clock, and deliver data after a pipeline delay, so their waitrequest signals can generally remain low. During reset, their waitrequest should be asserted (per the Avalon specification recommendations).

    FIFO devices need to make use of their almost-full flags when determining the state of waitrequest. If there have been no transactions for a while, and the transaction FIFO has drained, then waitrequest can be deasserted. For example, a write-burst can occur on the Avalon bus and be write-posted to the FIFO. Those transactions can then drain from the FIFO to the external device, e.g. an SRAM, at a slower rate. If another write or read transaction occurs on the Avalon bus before the FIFO has been drained, then the control FSM can either leave waitrequest asserted until it is completely finished with the previous transaction, or it can use the FIFO almost-full flag to determine whether to accept more transactions into the FIFO.

    The key aspect of the FIFO is that its flags can be used to accommodate the pipeline delay due to the input/output registers.
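
The interaction between the FIFO flags and the register delay can be sketched with a small cycle-level model. This is a Python behavioral model under stated assumptions, not HDL from the thread: the function `run_fifo` and its parameters are hypothetical. It assumes one input register on the write path, one output register on waitrequest, a master that writes whenever it is allowed to, and a FIFO that can pop and push on the same clock.

```python
def run_fifo(depth, threshold, cycles=64, drain_period=4):
    """Return how many accepted writes were lost to FIFO overflow.

    `threshold` is the occupancy at which the flag feeding waitrequest
    goes high: `depth` models the full flag, `depth - 2` an almost-full flag.
    """
    count = 0        # FIFO occupancy
    in_reg = False   # input register: a write accepted on the previous clock
    wait_out = True  # output register driving waitrequest (asserted in reset)
    overflow = 0
    for t in range(cycles):
        accept = not wait_out                    # master writes when allowed
        pop = count > 0 and t % drain_period == 0
        flag = count >= threshold                # what the control logic sees
        if pop:
            count -= 1
        if in_reg:                               # delayed write reaches the FIFO
            if count < depth:
                count += 1
            else:
                overflow += 1                    # a write that was already accepted
        in_reg, wait_out = accept, flag          # both registers update together
    return overflow


print("full flag, registered path:       ", run_fifo(depth=8, threshold=8))
print("almost-full (depth - 2), same path:", run_fifo(depth=8, threshold=6))
```

In this model the two register stages let up to two more writes through after the flag goes high, so a margin of two entries is enough to avoid overflow, while the plain full flag is not. The result depends on the registers being present; it says nothing about a design that wires the flag straight to the bus.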

    --- Quote Start ---

    As an example where your classification breaks down, consider a simple fifo inside the slave. The fifo exists because the output at certain times cannot keep up with the input. The input side can accept new data on every clock cycle (so it would be 'fast' by your classification), but could fill up so it would need to have a wait request output which is connected to the fifo full signal. Having the wait request would make it 'slow' by your classification. So this example would be both 'fast' and 'slow' according to you. Your classification of 'fast' and 'slow' is meaningless and does nothing to define how to implement wait request.

    --- Quote End ---

    I hope the above comments clarify the classification.

    Cheers,

    Dave
  • Altera_Forum
    Honored Contributor

    --- Quote Start ---

    The fast and slow classification referred to how fast a component can deliver data to the Avalon interface, and how to implement the waitrequest control for both styles. You point out a slight variation, so I'll address that.

    --- Quote End ---

    As I already pointed out, your classification into 'fast' and 'slow' is not relevant at all in implementing the logic for wait request. Wait request is asserted under the following conditions:

    - Input interface (i.e. responds to 'write'): Set active when the slave is not able to receive any more input.

    - Output interface (i.e. responds to 'read'): Set active when the slave is not able to produce any more output.

    End of story.

    --- Quote Start ---

    The implementation detail that requires separation into these two classifications is that the device should have both input and output registers. These registers are there to cut the timing path between the fabric and your custom component.

    --- Quote End ---

    These input and output registers can be there in either a 'fast' or 'slow' device. They are totally irrelevant to wait request; they do not alter the fundamental definition of when wait request should be asserted as mentioned above.

    --- Quote Start ---

    1) Slow devices.

    The input registers on the read/write controls delay those signals into the component by one clock. If waitrequest was deasserted, then the component would have already accepted one transaction. Given that the component waitrequest output is also registered, then by the time it asserts the response handshake, it may have had to accept two transactions. By starting out with waitrequest active, the control FSM can deassert the control for a single clock, and then process the transaction.

    --- Quote End ---

    You're describing the operation of a one deep FIFO and calling them 'input registers'...I don't think you realize that, but you are.

    --- Quote Start ---

    2) Fast devices.

    A fast device such as a RAM, registers, or an interface with a FIFO can generally start with waitrequest deasserted. RAM or registers can handle writes on every clock, and deliver data after a pipeline delay, so their waitrequest signals can generally remain low. During reset, their waitrequest should be asserted (per the Avalon specification recommendations).

    --- Quote End ---

    Not relevant...see previous definition of the implementation logic for wait request.

    --- Quote Start ---

    FIFO devices need to make use of their almost-full flags when determining the state of waitrequest.

    --- Quote End ---

    FIFOs do not need to make use of the almost-full (or almost empty) when determining the state of wait request. Full and empty represent wait request directly, no 'almost' required. Try it and see.

    The only reason one would have for using 'almost' flags is so that one could take those flags and delay them by a clock cycle to set wait request. By doing so though you show that you've missed the point that the flags already are the outputs of a flip flop and can be used directly.

    --- Quote Start ---

    If there have been no transactions for a while, and the transaction FIFO has drained, then waitrequest can be deasserted.

    --- Quote End ---

    Yes...FIFO full will not be set, wait request = FIFO Full therefore wait request will not be set...no logic is needed in between FIFO Full and wait request.
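
The direct connection described here can be sketched as a cycle-level model. It is a Python behavioral model, not HDL from the thread: `run_direct` and its parameters are hypothetical names. The key assumption is that there are no extra registers between the bus and the FIFO, so an accepted write is pushed on the same clock edge that accepts it, and the full flag is derived from the registered occupancy.

```python
def run_direct(depth=4, cycles=32, drain_period=3):
    """Return (peak occupancy, number of stalled cycles)."""
    count = 0
    peak = 0
    stalled = 0
    for t in range(cycles):
        waitrequest = count == depth   # full flag wired straight to the bus
        pop = count > 0 and t % drain_period == 0
        push = not waitrequest         # master has a write pending every clock
        if not push:
            stalled += 1               # master holds its write for another cycle
        count += int(push) - int(pop)
        peak = max(peak, count)
    return peak, stalled


print(run_direct())
```

Under that assumption the occupancy can never exceed `depth`, because a push only happens in a cycle where the flag was low, and the master is simply stalled whenever the FIFO is full.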

    --- Quote Start ---

    For example, a write-burst can occur on the Avalon bus and be write-posted to the FIFO. Those transactions can then drain from the FIFO to the external device, e.g. an SRAM, at a slower rate. If another write or read transaction occurs on the Avalon bus before the FIFO has been drained, then the control FSM can either leave waitrequest asserted until it is completely finished with the previous transaction, or it can use the FIFO almost-full flag to determine whether to accept more transactions into the FIFO.

    --- Quote End ---

    You can choose to overcomplicate the logic for wait request as much as you'd like...just so long as you don't undercomplicate it and accept a command that you can't process.

    Kevin Jennings
  • Altera_Forum
    Honored Contributor

    --- Quote Start ---

    These input and output registers can be there in either a 'fast' or 'slow' device. They are totally irrelevant to wait request; they do not alter the fundamental definition of when wait request should be asserted as mentioned above.

    --- Quote End ---

    My recommendation is to always add input and output registers to the Avalon bus, irrespective of whether I consider the device fast or slow. These registers then affect how you implement waitrequest, because you have to deal with the pipeline latency: the input read and write signals are delayed one clock before any control FSM can see them, and the waitrequest signal is delayed by one more clock before it appears back on the Avalon bus, i.e., there can be two complete periods during which the read/write signals are asserted. If waitrequest is deasserted, those two transactions are accepted into the pipeline; if waitrequest is asserted, no transactions are accepted.
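
The two-clock latency can be counted with a short behavioral model. This is a Python sketch, not HDL from the thread, and the function `accepted_writes` is a hypothetical name. It assumes one input register on write, one output register on waitrequest, a master that holds write high on every clock, and a slave that can take exactly one transaction and then stays busy for the rest of the run.

```python
def accepted_writes(start_with_wait, cycles=8):
    """Count the transfers the master completes before the slave stalls it."""
    write_reg = False            # registered copy of the bus `write`
    wait_out = start_with_wait   # registered `waitrequest` seen on the bus
    taken = False                # control FSM has claimed its one transaction
    accepted = 0
    for _ in range(cycles):
        write_bus = True         # master always has another write pending
        if write_bus and not wait_out:
            accepted += 1        # transfer completes on this clock edge
        # The control logic only ever sees the registered `write`.
        if start_with_wait:
            # Open the gate for exactly one clock, then close it again.
            next_wait = not (write_reg and not taken)
        else:
            # Gate starts open; close it once a write has been seen.
            next_wait = write_reg or taken
        taken = taken or write_reg
        write_reg, wait_out = write_bus, next_wait
    return accepted


print("waitrequest starts deasserted:", accepted_writes(False))
print("waitrequest starts asserted:  ", accepted_writes(True))
```

In this model, starting with waitrequest deasserted lets two writes through before the registered waitrequest reaches the bus, while starting with it asserted and opening the gate for a single clock lets exactly one through.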

    Clearly you understand how to implement Avalon components. The original poster was asking for advice regarding whether or not to implement a combinatorial interface as shown in the Avalon specification.

    If you like, we can both come up with some example slave interfaces and TimeQuest analyses to show why registers and FIFOs improve timing, and then post them to the Altera wiki.

    Cheers,

    Dave