Forum Discussion

Altera_Forum
Honored Contributor
13 years ago

CRC Calculation - VHDL

Hello to all forum members!!!

I'd be glad to get your suggestions for solving my problem.

I want to make the CRC calculation as fast as possible (without using the MegaWizard function).

To make the calculation as fast as possible, I used variables rather than signals. The VHDL code performs roughly 250 XOR operations (5 operations on every byte, and the data telegram consists of 53 bytes). In simulation (ModelSim) I got the right results: the result is ready after one clock cycle!

Of course, that is not comparable to the performance of a real FPGA.

I ran the Quartus TimeQuest analysis with a 50 MHz oscillator. The results were really bad. The items that failed are:

  • Report Setup Summary

  • Fmax is about 20 MHz

And now it's question time :) How can the setup time be affected if I didn't use the clock during the calculation (before adding the CRC it was OK)? As far as I know, setup time is the parameter that defines how long the data has to be stable before the clock edge. How can I increase the Fmax of the design? (Before the CRC, Fmax was ~120 MHz.)

Thanks for any suggestions!!!

Y.
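
For reference, the arithmetic behind those two numbers can be written out as a simplified sketch. It ignores clock skew and uncertainty, and the symbols T_clk, t_path and slack are introduced here only for illustration. Registers launch data on one clock edge and capture it on the next, so all the combinational logic between them (here, the whole single-cycle CRC XOR tree) has to settle within one clock period:

```latex
T_{clk} = \frac{1}{50\ \text{MHz}} = 20\ \text{ns}, \qquad
t_{path} \approx \frac{1}{F_{max}} = \frac{1}{20\ \text{MHz}} = 50\ \text{ns}, \qquad
\text{slack} \approx T_{clk} - t_{path} = 20\ \text{ns} - 50\ \text{ns} = -30\ \text{ns}
```

A negative slack is reported as a setup failure even though the added logic contains no clocked elements of its own.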

17 Replies

  • Altera_Forum

    --- Quote Start ---

    CRC16 can be reduced to the following C:

    uint32_t
    crc_step(uint32_t crc, uint32_t byte_val)
    {
        uint32_t t = crc ^ (byte_val & 0xff);
        t = (t ^ t << 4) & 0xff;
        return crc >> 8 ^ t << 8 ^ t << 3 ^ t >> 4;
    }
    

    Which can trivially be converted to VHDL:

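            -- Assumes crc_in is 16 bits, data/t1/t2 are 8 bits, crc_out is 32 bits.
            t1 <= crc_in(7 downto 0) xor data(7 downto 0);
            t2 <= t1 xor (t1(3 downto 0) & B"0000");
            -- "&" binds tighter than "xor", so the 16-bit XOR chain is parenthesized
            -- before the upper 16 zero bits are prepended.
            crc_out <= X"0000" & ((X"00" & crc_in(15 downto 8)) xor (t2 & X"00")
                    xor (B"00000" & t2 & B"000") xor (X"000" & t2(7 downto 4)));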
    
    Which is 4 levels of XOR.

    As I said earlier, if you really need to generate the CRC of a 53 byte buffer in parallel every clock (I can't imagine why!) then you probably need to make use of the linearity of CRC calculations.

    Basically, if you CRC random data, then change a single bit, the difference in the CRC is independent of the original data.

    So, for a fixed length packet, you can easily determine which CRC bits each input bit changes and xor those values for every set bit onto the CRC for an all-zero pattern.
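
    The linearity argument can be checked in a few lines of C. This is only a sketch: the names PKT_LEN, crc_serial, bit_effect, crc_linear_init and crc_linear, the init parameter, and the MSB-first bit numbering inside each byte are assumptions made for the example; crc_step is the function shown above.

    ```c
    #include <stddef.h>
    #include <stdint.h>
    #include <string.h>

    #define PKT_LEN 53  /* fixed packet length from the thread */

    /* One byte-wide CRC update, as given above. */
    static uint32_t crc_step(uint32_t crc, uint32_t byte_val)
    {
        uint32_t t = crc ^ (byte_val & 0xff);
        t = (t ^ t << 4) & 0xff;
        return crc >> 8 ^ t << 8 ^ t << 3 ^ t >> 4;
    }

    /* Ordinary byte-by-byte CRC over one fixed-length packet. */
    static uint16_t crc_serial(const uint8_t *pkt, uint16_t init)
    {
        uint32_t crc = init;
        for (size_t i = 0; i < PKT_LEN; i++)
            crc = crc_step(crc, pkt[i]);
        return (uint16_t)crc;
    }

    static uint16_t crc_of_zeros;              /* CRC of the all-zero packet   */
    static uint16_t bit_effect[PKT_LEN * 8];   /* CRC bits toggled by each bit */

    /* Hypothetical setup step: record which CRC bits each input bit changes. */
    static void crc_linear_init(uint16_t init)
    {
        uint8_t pkt[PKT_LEN];
        memset(pkt, 0, sizeof pkt);
        crc_of_zeros = crc_serial(pkt, init);
        for (size_t i = 0; i < PKT_LEN * 8; i++) {
            pkt[i / 8] = (uint8_t)(0x80u >> (i % 8));   /* set exactly one bit */
            bit_effect[i] = crc_serial(pkt, init) ^ crc_of_zeros;
            pkt[i / 8] = 0;
        }
    }

    /* XOR the recorded effect of every set bit onto the all-zero CRC. */
    static uint16_t crc_linear(const uint8_t *pkt)
    {
        uint16_t crc = crc_of_zeros;
        for (size_t i = 0; i < PKT_LEN * 8; i++)
            if (pkt[i / 8] & (0x80u >> (i % 8)))
                crc ^= bit_effect[i];
        return crc;
    }
    ```

    In hardware, each CRC output bit would become one XOR tree over the input bits whose bit_effect entry has that output bit set.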

    --- Quote End ---

    Thank you for the response :)

    My goal is to send data packets over an RS-485 link. I send a packet every 10 ms. As I said before, a packet consists of 53 data bytes and 2 checksum bytes. Previously I used the MegaWizard to compute the checksum, and it was done in one clock cycle. I want to replace the checksum with a CRC; that's the reason for the CRC calculation. From what I have read in various posts, a CRC is suitable for data packets this long. Am I right?
  • Altera_Forum

    53 bytes isn't long!

    Just feed in one byte per clock sometime in the 10ms window
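
    As a software model of that suggestion, the sketch below applies the crc_step function quoted earlier in the thread once per byte, which is what a byte-per-clock register update would do in hardware. The helper names crc16_buffer and crc16_bitwise, the init parameter, and the reflected polynomial 0x8408 used by the bit-serial reference (which is what crc_step appears to implement) are assumptions for illustration, not something stated in the thread. At the 50 MHz clock from the first post, 53 steps take 53 × 20 ns = 1.06 µs, a tiny part of the 10 ms window.

    ```c
    #include <stddef.h>
    #include <stdint.h>

    /* One byte-wide CRC update, as quoted earlier in the thread. */
    static uint32_t crc_step(uint32_t crc, uint32_t byte_val)
    {
        uint32_t t = crc ^ (byte_val & 0xff);
        t = (t ^ t << 4) & 0xff;
        return crc >> 8 ^ t << 8 ^ t << 3 ^ t >> 4;
    }

    /* Hypothetical helper: one crc_step per byte (one per clock in hardware). */
    static uint16_t crc16_buffer(const uint8_t *buf, size_t len, uint16_t init)
    {
        uint32_t crc = init;
        for (size_t i = 0; i < len; i++)
            crc = crc_step(crc, buf[i]);
        return (uint16_t)crc;
    }

    /* Bit-serial reference, assuming the reflected polynomial 0x8408. */
    static uint16_t crc16_bitwise(const uint8_t *buf, size_t len, uint16_t init)
    {
        uint16_t crc = init;
        for (size_t i = 0; i < len; i++) {
            crc ^= buf[i];
            for (int b = 0; b < 8; b++)
                crc = (crc & 1) ? (uint16_t)((crc >> 1) ^ 0x8408)
                                : (uint16_t)(crc >> 1);
        }
        return crc;
    }
    ```

    Comparing the two functions on the same buffer is a quick way to confirm that the byte-wide step matches the polynomial you actually intend to use.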
  • Altera_Forum

    How many bits do you receive in one clock cycle? I would compute the CRC over that number of bits in your case.

    Let's say you receive 8 bits per clock cycle. This would result in much smaller logic and would easily meet your timing target.
  • Altera_Forum

    --- Quote Start ---

    How many bits do you receive in one clock cycle? I would compute the CRC over that number of bits in your case.

    Let's say you receive 8 bits per clock cycle. This would result in much smaller logic and would easily meet your timing target.

    --- Quote End ---

    The number of bits depends on which side you look at: transmitter or receiver.

    * During transmission I get the whole packet in one clock cycle. I use one more cycle for the CRC calculation and then enable the transmission. After that I wait for the next 10 ms to transmit another packet.

    * During reception I get the bytes one by one.

    If I split the calculation into steps (one step = one byte), I need something like 53 cycles to get the result.

    How can I use pipelining for this?

    Or maybe there is another method to get the right result with a minimum number of operations.

    Thanks
  • Altera_Forum

    There is no point pipelining it - that would only be relevant if you were trying to process a packet every clock (with a 53 clock delay before the CRC was available).

    Surely you can do the CRC as part of the tx dma? Then send the (inverted) CRC register at the end of the normal data buffer.
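
    A software sketch of that idea: update the CRC register as each byte leaves the buffer, then append the inverted register after the data. The initial value 0xFFFF, sending the low CRC byte first, and the names frame_build and frame_check are assumptions here (they follow the convention of the 16-bit FCS in RFC 1662, not anything stated in this thread); crc_step is the function quoted earlier. With the inverted CRC appended, a receiver that runs the same update over all 55 bytes ends with a fixed register value (0xF0B8 for this step function), so it needs no separate comparison against the received CRC bytes.

    ```c
    #include <stddef.h>
    #include <stdint.h>

    #define DATA_LEN  53               /* payload bytes, from the thread     */
    #define FRAME_LEN (DATA_LEN + 2)   /* payload plus two CRC bytes         */
    #define CRC_INIT  0xFFFFu          /* assumed initial register value     */
    #define CRC_GOOD  0xF0B8u          /* register value after a good frame  */

    /* One byte-wide CRC update, as quoted earlier in the thread. */
    static uint32_t crc_step(uint32_t crc, uint32_t byte_val)
    {
        uint32_t t = crc ^ (byte_val & 0xff);
        t = (t ^ t << 4) & 0xff;
        return crc >> 8 ^ t << 8 ^ t << 3 ^ t >> 4;
    }

    /* Hypothetical transmit side: CRC is updated as each byte is copied out. */
    static void frame_build(const uint8_t *data, uint8_t *frame)
    {
        uint32_t crc = CRC_INIT;
        for (size_t i = 0; i < DATA_LEN; i++) {
            frame[i] = data[i];
            crc = crc_step(crc, data[i]);
        }
        crc = ~crc & 0xFFFFu;                          /* inverted CRC register */
        frame[DATA_LEN]     = (uint8_t)(crc & 0xFF);   /* low byte first (assumed) */
        frame[DATA_LEN + 1] = (uint8_t)(crc >> 8);
    }

    /* Hypothetical receive side: returns 1 if the frame passes, 0 otherwise. */
    static int frame_check(const uint8_t *frame)
    {
        uint32_t crc = CRC_INIT;
        for (size_t i = 0; i < FRAME_LEN; i++)
            crc = crc_step(crc, frame[i]);
        return crc == CRC_GOOD;
    }
    ```

    On the receive side this matches the byte-by-byte arrival described above: one update per received byte, and a single constant compare after the last one.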
  • Altera_Forum

    dsl already posted some code showing how to implement your CRC for 8 bits per clock cycle.

    If you need more inspiration for your implementation, I recommend reading the user guide for Altera's CRC Compiler MegaCore Function:

    http://www.altera.com/products/ip/communications/additional_functions_comm/m-alt-crc-compiler.html

    And/or check Altera's Advanced Synthesis Cookbook. Chapter 12 covers this, and some source code is even available:

    http://www.altera.com/literature/manual/stx_cookbook.pdf
  • Altera_Forum

    Thanks, I'll check the cookbook...

    I'm trying not to use the IP cores because they make life easier :)