VHDL and Transposed FIR

Question

Hello guys, I'm implementing a transposed FIR filter in VHDL in which only the sums are pipelined. The code I have written so far (based on this (http://www.google.it/url?sa=t&rct=j&q=&esrc=s&source=web&cd=3&cad=rja&uact=8&ved=0cdaqfjac&url=http%3a%2f%2fwww4.hcmut.edu.vn%2f~hoangtrang%2flecture%2520note%2fdsp%2520on%2520fpga%2fdsp_fpga_ch%252010%2520-%2520fir%2520filter%2520design.pdf&ei=a76eu6uummnp4qt88yggba&usg=afqjcnhr62rzy5-5rvzgccd0nxqz9xaoow&bvm=bv.68911936,d.bge)) is the one below:
LIBRARY lpm; 
    USE lpm.lpm_components.ALL;
LIBRARY ieee;
    USE ieee.std_logic_1164.ALL;
    USE ieee.numeric_std.all;
    
ENTITY transposed_fir_test IS
GENERIC (W1 : INTEGER := 16; -- Input bit width
            W2 : INTEGER := 32; -- Multiplier bit width 
            W3 : INTEGER := 35; -- Adder width
            W4 : INTEGER := 16; -- Output bit width
            L : INTEGER := 15;  -- Filter length
            Mpipe : INTEGER := 0-- Pipeline steps of multiplier
);
PORT ( clk : IN STD_LOGIC;
         Load_x : IN STD_LOGIC;
         x_in : IN STD_LOGIC_VECTOR(W1-1 DOWNTO 0);
         c_in : IN STD_LOGIC_VECTOR(W1-1 DOWNTO 0);
         y_out : OUT STD_LOGIC_VECTOR(W4-1 DOWNTO 0));
END transposed_fir_test;
ARCHITECTURE fpga OF transposed_fir_test IS
    SUBTYPE N1BIT IS STD_LOGIC_VECTOR(W1-1 DOWNTO 0);
    SUBTYPE N2BIT IS STD_LOGIC_VECTOR(W2-1 DOWNTO 0);
    SUBTYPE N3BIT IS STD_LOGIC_VECTOR(W3-1 DOWNTO 0);
    TYPE ARRAY_N1BIT IS ARRAY (0 TO L-1) OF N1BIT;
    TYPE ARRAY_N2BIT IS ARRAY (0 TO L-1) OF N2BIT;
    TYPE ARRAY_N3BIT IS ARRAY (0 TO L-1) OF N3BIT;
    SIGNAL x : N1BIT;
    SIGNAL y : N3BIT;
    SIGNAL c : ARRAY_N1BIT; -- Coefficient array
    SIGNAL p : ARRAY_N2BIT; -- Product array
    SIGNAL a : ARRAY_N3BIT; -- Adder array
    BEGIN

    Load: PROCESS ------> Load data or coefficient
    BEGIN
        WAIT UNTIL clk = '1';
        IF (Load_x = '0') THEN
            c(L-1) <= c_in; -- Store coefficient in register
            FOR I IN L-2 DOWNTO 0 LOOP -- Coefficients shift one
                c(I) <= c(I+1);
            END LOOP;
        ELSE
            x <= x_in; -- Get one data sample at a time
        END IF;
    END PROCESS Load;

    SOP: PROCESS (clk) ------> Compute sum-of-products
    BEGIN
        IF rising_edge(clk) THEN
            FOR I IN 0 TO L-2 LOOP -- Compute the transposed filter adds
                a(I) <= std_logic_vector(signed(p(I)) + signed(a(I+1)));
            END LOOP;
            a(L-1) <= std_logic_vector(resize(signed(p(L-1)), W3)); -- First tap has only a register
        END IF;
        y <= a(0);
    END PROCESS SOP;

    -- Instantiate L pipelined multipliers
    MulGen: FOR I IN 0 TO L-1 GENERATE
        Muls: lpm_mult -- Multiply p(i) = c(i) * x;
            GENERIC MAP (LPM_WIDTHA => W1, LPM_WIDTHB => W1,
                         LPM_PIPELINE => Mpipe,
                         LPM_REPRESENTATION => "SIGNED",
                         LPM_WIDTHP => W2,
                         LPM_WIDTHS => W2)
            PORT MAP (dataa => x,
                      datab => c(I), result => p(I));
    END GENERATE;

    y_out <= y(W3-1 DOWNTO W3-W4);
END fpga;
By the way, using a delta stimulus of 32767 and this set of coefficients:

[32437,31463,29888,27779,25230,22351,19269,16118,13036,10157,7608,5500,3924,2950,2621]

I get the following output:

[X,X,X,X,X,1576,1396,1204,1007,814,634,475,343,245,184,163,0]

So it seems I am missing the first output samples of the filter, and the output is scaled. Any suggestions? Thanks!

altera_forum · Answer

To get the initial values right, the accumulators need to start at zero. You are also scaling by discarding 19 bits. Is that what you wanted?
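To make the 19-bit discard concrete, here is a minimal simulation-only sketch, assuming the generics from the question (W3 = 35, W4 = 16). The entity name scale_check is made up for this illustration. It multiplies the stimulus 32767 by the coefficient 25230 and keeps bits W3-1 downto W3-W4, which reproduces the 1576 seen in the posted output.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical test entity, not part of the posted design
entity scale_check is
end entity scale_check;

architecture sim of scale_check is
    constant W3 : integer := 35; -- adder width, as in the question
    constant W4 : integer := 16; -- output width, as in the question
begin
    check : process
        variable acc : signed(W3-1 downto 0);
        variable y   : signed(W4-1 downto 0);
    begin
        -- 32767 * 25230 = 826711410, sign-extended to the adder width
        acc := resize(to_signed(32767, 16) * to_signed(25230, 16), W3);
        -- Same slice as y_out: drops the W3-W4 = 19 least significant bits
        y := acc(W3-1 downto W3-W4);
        assert to_integer(y) = 1576
            report "truncated value differs from 1576" severity failure;
        report "y = " & integer'image(to_integer(y));
        wait; -- stop the process
    end process check;
end architecture sim;
```

Since 826711410 / 2^19 is about 1576.8, the slice truncates it to 1576 rather than rounding.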

altera_forum · Answer

--- Quote Start ---

To get the initial values right, the accumulators need to start at zero. You are also scaling by discarding 19 bits. Is that what you wanted?

--- Quote End ---

Yep, those X's are related to the initial values of the C matrix. Since I only need 16 bits, I take only the MSBs at the output, and the magnitude is OK. I was a little confused because the very first values at the filter output were not shown, due to the missing initialization of the C matrix.

One question: at the moment the filter coefficients are loaded on the same clock as the input signal. I would like to control the coefficient loading with a different clock instead (I want to set the coefficients via NIOS PIOs). My idea is to use a first PIO to drive c_in and a second PIO to latch c_in (so it acts like the clock in the previous code). I ended up with this code:

-- This is a generic FIR filter generator
-- It uses W1 bit data/coefficients bits
LIBRARY lpm; -- Using predefined packages
    USE lpm.lpm_components.ALL;
LIBRARY ieee;
    USE ieee.std_logic_1164.ALL;
    USE ieee.numeric_std.all;
    
ENTITY transposed_fir_test IS ------> Interface
GENERIC (W1 : INTEGER := 16; -- Input bit width
            W2 : INTEGER := 32; -- Multiplier bit width 2*W1
            W3 : INTEGER := 35; -- Adder width = W2+log2(L)-1
            W4 : INTEGER := 16; -- Output bit width
            L : INTEGER := 15;  -- Filter length
            Mpipe : INTEGER := 0-- Pipeline steps of multiplier
);
PORT ( clk : IN STD_LOGIC;
         rst : IN STD_LOGIC;
         Load_x : IN STD_LOGIC;
         x_in : IN STD_LOGIC_VECTOR(W1-1 DOWNTO 0);
         c_in : IN STD_LOGIC_VECTOR(W1-1 DOWNTO 0);
         wr_clk : IN STD_LOGIC;
         y_out : OUT STD_LOGIC_VECTOR(W4-1 DOWNTO 0));
END transposed_fir_test;
ARCHITECTURE fpga OF transposed_fir_test IS
    SUBTYPE N1BIT IS STD_LOGIC_VECTOR(W1-1 DOWNTO 0);
    SUBTYPE N2BIT IS STD_LOGIC_VECTOR(W2-1 DOWNTO 0);
    SUBTYPE N3BIT IS STD_LOGIC_VECTOR(W3-1 DOWNTO 0);
    TYPE ARRAY_N1BIT IS ARRAY (0 TO L-1) OF N1BIT;
    TYPE ARRAY_N2BIT IS ARRAY (0 TO L-1) OF N2BIT;
    TYPE ARRAY_N3BIT IS ARRAY (0 TO L-1) OF N3BIT;
    SIGNAL x : N1BIT;
    SIGNAL y : N3BIT;
    SIGNAL c : ARRAY_N1BIT; -- Coefficient array
    SIGNAL p : ARRAY_N2BIT; -- Product array
    SIGNAL a : ARRAY_N3BIT; -- Adder array
    BEGIN
        x <= x_in;

        Load: PROCESS (wr_clk) ------> Load coefficients
        BEGIN
            IF rising_edge(wr_clk) THEN
                IF (Load_x = '0') THEN
                    c(L-1) <= c_in; -- Store coefficient in register
                    FOR I IN L-2 DOWNTO 0 LOOP -- Coefficients shift one
                        c(I) <= c(I+1);
                    END LOOP;
                ELSE
                    NULL; -- Nothing to do: x is driven directly from x_in
                END IF;
            END IF;
        END PROCESS Load;

        SOP: PROCESS (clk) ------> Compute sum-of-products
        BEGIN
            IF rising_edge(clk) THEN
                FOR I IN 0 TO L-2 LOOP -- Compute the transposed filter adds
                    a(I) <= std_logic_vector(signed(p(I)) + signed(a(I+1)));
                END LOOP;
                a(L-1) <= std_logic_vector(resize(signed(p(L-1)), W3)); -- First tap has only a register
            END IF;
            y <= a(0);
        END PROCESS SOP;

        -- Instantiate L pipelined multipliers
        MulGen: FOR I IN 0 TO L-1 GENERATE
            Muls: lpm_mult -- Multiply p(i) = c(i) * x;
                GENERIC MAP (LPM_WIDTHA => W1, LPM_WIDTHB => W1,
                             LPM_PIPELINE => Mpipe,
                             LPM_REPRESENTATION => "SIGNED",
                             LPM_WIDTHP => W2,
                             LPM_WIDTHS => W2)
                PORT MAP (dataa => x,
                          datab => c(I), result => p(I));
        END GENERATE;

        y_out <= y(W3-1 DOWNTO W3-W4);
        
END fpga;

Any suggestions? What can I do if I want to initialize the C matrix to zeros on reset? Thanks!
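One common alternative to clocking the coefficient registers from a PIO bit is to keep everything on clk and turn the PIO edge into a one-cycle enable. The sketch below only illustrates that idea: the entity pio_strobe_sync and its ports pio_wr and wr_pulse are hypothetical names, and it assumes c_in is stable well before pio_wr rises.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical helper, not part of the posted design
entity pio_strobe_sync is
    port (clk      : in  std_logic;  -- filter clock
          pio_wr   : in  std_logic;  -- PIO bit raised by software (assumed name)
          wr_pulse : out std_logic); -- high for one clk cycle per rising edge of pio_wr
end entity pio_strobe_sync;

architecture rtl of pio_strobe_sync is
    -- sync(0) and sync(1) form a two-flop synchroniser,
    -- sync(2) is a delayed copy used for edge detection
    signal sync : std_logic_vector(2 downto 0) := (others => '0');
begin
    process (clk)
    begin
        if rising_edge(clk) then
            sync <= sync(1 downto 0) & pio_wr;
        end if;
    end process;

    -- Rising edge: synchronised value is '1' now and was '0' one cycle ago
    wr_pulse <= sync(1) and not sync(2);
end architecture rtl;
```

wr_pulse could then act as a clock enable inside a process clocked by clk, in place of the rising_edge(wr_clk) test.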

altera_forum · Answer

To initialise, use := (others => (others => '0'));

Your dividing by 2^19 may not be right for unity gain. Your output will be scaled somewhere above or below unity.
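As a rough check of the unity-gain remark, the simulation-only sketch below sums the coefficient set posted in the question and compares it with 2^19. The entity name gain_check is hypothetical, and the sketch assumes the output is taken as y(W3-1 DOWNTO W3-W4), so that the DC gain is the coefficient sum divided by 2^19.

```vhdl
-- Hypothetical test entity, not part of the posted design
entity gain_check is
end entity gain_check;

architecture sim of gain_check is
    type int_array is array (natural range <>) of integer;
    -- Coefficient set quoted in the question
    constant COEFFS : int_array :=
        (32437, 31463, 29888, 27779, 25230, 22351, 19269, 16118,
         13036, 10157, 7608, 5500, 3924, 2950, 2621);
begin
    check : process
        variable acc : integer := 0;
    begin
        for i in COEFFS'range loop
            acc := acc + COEFFS(i);
        end loop;
        assert acc = 250331
            report "unexpected coefficient sum" severity failure;
        report "coefficient sum = " & integer'image(acc);
        -- DC gain when 19 LSBs are discarded
        report "DC gain = " & real'image(real(acc) / 2.0**19);
        wait; -- stop the process
    end process check;
end architecture sim;
```

The sum is 250331, so with this coefficient set the DC gain comes out near 0.48 rather than 1. Discarding 18 bits instead would bring it to about 0.95, provided the accumulated sum still fits in the bits that are kept.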

altera_forum · Answer

Thank you kaz! I have never understood whether that kind of initialization also holds at power-up. Could you confirm it?

altera_forum · Answer

--- Quote Start ---

Thank you kaz!

I have never understood whether that kind of initialization also holds at power-up. Could you confirm it?

--- Quote End ---

Yes, it is supported: the initial values are applied to the registers at power-up.
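Putting the two answers together, here is a minimal sketch of a coefficient shift register that gets its zeros both from a declaration initial value (power-up) and from a synchronous reset. The entity coeff_regs, the active-high rst and load ports and the c_first output are assumptions made for this illustration and are not part of the posted design.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical stand-alone version of the coefficient registers
entity coeff_regs is
    generic (W1 : integer := 16;  -- coefficient width
             L  : integer := 15); -- filter length
    port (clk     : in  std_logic;
          rst     : in  std_logic; -- synchronous, active high (assumed)
          load    : in  std_logic; -- shift in one coefficient (assumed)
          c_in    : in  std_logic_vector(W1-1 downto 0);
          c_first : out std_logic_vector(W1-1 downto 0));
end entity coeff_regs;

architecture rtl of coeff_regs is
    subtype N1BIT is std_logic_vector(W1-1 downto 0);
    type ARRAY_N1BIT is array (0 to L-1) of N1BIT;
    -- Initial value: covers power-up and the start of simulation
    signal c : ARRAY_N1BIT := (others => (others => '0'));
begin
    process (clk)
    begin
        if rising_edge(clk) then
            if rst = '1' then
                -- Same aggregate clears the array at run time
                c <= (others => (others => '0'));
            elsif load = '1' then
                c(L-1) <= c_in;
                for i in L-2 downto 0 loop
                    c(i) <= c(i+1);
                end loop;
            end if;
        end if;
    end process;

    c_first <= c(0);
end architecture rtl;
```

The same initial-value aggregate could be applied to the adder array a in the filter, so that the first output samples are no longer X in simulation.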
