Altera_Forum
Honored Contributor
14 years ago
Is it possible to make the code faster via pipelining?
Hi everyone, I'm trying to implement both the TEA and XTEA algorithms in order to compare them. I have a complete, working VHDL implementation of TEA, as follows:
library ieee;
use ieee.std_logic_1164.all;
use ieee.std_logic_unsigned.all;

entity TEA_en is
    port(
        clock          : in  std_logic;                       -- clock input
        input_data     : in  std_logic_vector(63 downto 0);   -- 64-bit input block
        key            : in  std_logic_vector(127 downto 0);  -- 128-bit secret key
        encrypted_data : out std_logic_vector(63 downto 0)    -- output/encrypted data
    );
end entity TEA_en;

architecture behave of TEA_en is
    -- declare signals
    signal Key0, Key1, Key2, Key3 : std_logic_vector(31 downto 0);
    signal count : integer := 0;
begin
    -- separate key into four 32-bit parts
    Key0 <= key(127 downto 96);
    Key1 <= key(95 downto 64);
    Key2 <= key(63 downto 32);
    Key3 <= key(31 downto 0);

    process(clock)
        -- declare and initialize constants and variables
        constant delta : std_logic_vector(31 downto 0) := x"9e3779b9";
        variable sum   : std_logic_vector(31 downto 0) := x"00000000";
        variable Zeq, Yeq, Z, Y : std_logic_vector(31 downto 0);
    begin
        if rising_edge(clock) then
            if count < 1 then  -- first clock: separate input data into two parts
                Z := input_data(63 downto 32);  -- part 1 (32 bits)
                Y := input_data(31 downto 0);   -- part 2 (32 bits)
            end if;
            if count < 32 then
                -- encryption routine: one full TEA round per clock
                sum := sum + delta;
                -- calculate Y
                Zeq := ((Z(27 downto 0) & "0000") + Key0) xor    -- Z shifted left 4 bits, plus Key0
                       (Z + sum) xor                             -- Z plus sum
                       (("00000" & Z(31 downto 5)) + Key1);      -- Z shifted right 5 bits, plus Key1
                Y := Y + Zeq;
                -- calculate Z (uses the Y value updated just above)
                Yeq := ((Y(27 downto 0) & "0000") + Key2) xor    -- Y shifted left 4 bits, plus Key2
                       (Y + sum) xor                             -- Y plus sum
                       (("00000" & Y(31 downto 5)) + Key3);      -- Y shifted right 5 bits, plus Key3
                Z := Z + Yeq;
                -- output encrypted data
                encrypted_data <= Y & Z;
            end if;
            count <= count + 1;  -- increase value of count
        end if;
    end process;
end architecture behave;
With the code above, the design runs at a clock period of 15 ns and gives the correct results. Is there any way to make it run at a shorter clock period? How can I implement pipelining, or another method, to improve the speed of operation? I tried breaking up the Zeq and Yeq calculations into intermediate signals to imitate pipelining, but it no longer gives the correct results. The code is:
signal pipeline_0,pipeline_1,pipeline_2,pipeline_3,pipeline_4,pipeline_5: std_logic_vector(31 downto 0);
pipeline_0<=((Z(27 downto 0) & "0000")+Key0);
pipeline_1<=(Z+ sum);
pipeline_2<=(("00000" & Z(31 downto 5))+Key1);
pipeline_3<=( (Y(27 downto 0) & "0000")+Key2);
pipeline_4<=(Y+ sum);
pipeline_5<=(("00000" & Y(31 downto 5))+Key3);
Zeq:= pipeline_0 xor pipeline_1 xor pipeline_2;
Yeq:= pipeline_3 xor pipeline_4 xor pipeline_5;
Any suggestions? Is there any way to improve the resource usage and speed of operation, or am I doing anything redundant? Any experts out there, please do help. Thanks in advance!
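A note on the attempt above, assuming the pipeline_* assignments sit inside the clocked process: a signal assignment only takes effect after the process suspends, so Zeq and Yeq read the pipeline_* values from the previous clock edge, while the variables Z, Y and sum have already been updated. The terms and the values they are added to then belong to different rounds. One way to keep them aligned is to give each step its own clock cycle. The following is a minimal, untested sketch of that idea, not a definitive implementation. The entity name tea_en_multicycle, the start/done handshake and the phase state machine are all hypothetical, and ieee.numeric_std is swapped in for std_logic_unsigned.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical entity: the start/done ports are assumptions, not part of the original design.
entity tea_en_multicycle is
    port(
        clock          : in  std_logic;
        start          : in  std_logic;                      -- assumed: high for one clock to load a block
        input_data     : in  std_logic_vector(63 downto 0);
        key            : in  std_logic_vector(127 downto 0);
        encrypted_data : out std_logic_vector(63 downto 0);  -- valid only while done = '1'
        done           : out std_logic := '0'                -- assumed: high after 32 rounds
    );
end entity tea_en_multicycle;

architecture rtl of tea_en_multicycle is
    constant DELTA : unsigned(31 downto 0) := x"9e3779b9";
    type phase_t is (idle, y_terms, y_update, z_terms, z_update);
    signal phase          : phase_t := idle;
    signal y, z, sum      : unsigned(31 downto 0) := (others => '0');
    signal t0, t1, t2     : unsigned(31 downto 0) := (others => '0');
    signal k0, k1, k2, k3 : unsigned(31 downto 0);
    signal round          : integer range 0 to 31 := 0;
begin
    k0 <= unsigned(key(127 downto 96));
    k1 <= unsigned(key(95 downto 64));
    k2 <= unsigned(key(63 downto 32));
    k3 <= unsigned(key(31 downto 0));

    process(clock)
    begin
        if rising_edge(clock) then
            case phase is
                when idle =>
                    if start = '1' then
                        -- same split as the original: upper half to Z, lower half to Y
                        z     <= unsigned(input_data(63 downto 32));
                        y     <= unsigned(input_data(31 downto 0));
                        sum   <= DELTA;      -- sum value for the first round
                        round <= 0;
                        done  <= '0';
                        phase <= y_terms;
                    end if;
                when y_terms =>              -- register the three terms built from Z
                    t0    <= shift_left(z, 4) + k0;
                    t1    <= z + sum;
                    t2    <= shift_right(z, 5) + k1;
                    phase <= y_update;
                when y_update =>             -- combine the registered terms into Y
                    y     <= y + (t0 xor t1 xor t2);
                    phase <= z_terms;
                when z_terms =>              -- register the three terms built from the new Y
                    t0    <= shift_left(y, 4) + k2;
                    t1    <= y + sum;
                    t2    <= shift_right(y, 5) + k3;
                    phase <= z_update;
                when z_update =>             -- combine into Z, then advance the round
                    z <= z + (t0 xor t1 xor t2);
                    if round = 31 then
                        done  <= '1';
                        phase <= idle;
                    else
                        round <= round + 1;
                        sum   <= sum + DELTA;
                        phase <= y_terms;
                    end if;
            end case;
        end if;
    end process;

    encrypted_data <= std_logic_vector(y) & std_logic_vector(z);
end architecture rtl;
```

Each clock now has roughly one 32-bit adder in its longest path instead of several in series, so the achievable clock period should drop, but a round takes four clocks (128 per block instead of 32). Whether that is a net gain for a single block depends on the clock period the timing analysis actually reports. Because each half-round depends on the result of the previous one, raising throughput would likely need the rounds unrolled with registers between them and several blocks fed through back to back.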