--- Quote Start ---
Without seeing the code, I cannot really say a lot.
But long compilation times for individual modules (e.g. >30 mins) are likely caused by a problem in the code, and the easiest way to kill the compile is to have a large RAM that gets inferred as registers instead of RAM blocks because the behaviour is incorrect.
Can you post your code?
--- Quote End ---
Basically, the top module looks like this:
module KC(clk, reset, in, out);
input clk;
input reset;
input [575:0] in;
output [511:0] out;
wire [1599:0] r0, r1, r2, r3, r4, r5, r6, r7, r8, r9, r10,r11;
wire [1599:0] r12,r13,r14,r15,r16,r17,r18,r19,r20,r21,r22,r23;
assign out = r23[1599:1599-511];
mKC round0 (clk, reset, 8'h01, {in, 1024'b0}, r0);
mKC round1 (clk, reset, 8'h32, r0, r1);
mKC round2 (clk, reset, 8'hba, r1, r2);
.
.
.
.
.
.
.
// basically there are 23 identical instantiations like the ones above
endmodule
Each mKC module contains several logical expressions, and its output is registered.
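To make that structure concrete, here is a rough sketch of what one such stage looks like in shape. This is not the actual mKC code: the port name rc and the XOR placeholder are made up for illustration, and the real combinational logic is more involved. Only the port order and widths follow the instantiations above.

```verilog
// Illustrative sketch only: the real round logic is NOT shown here.
// Port order matches the top-level instantiations:
//   (clk, reset, 8-bit constant, 1600-bit input, 1600-bit output)
module mKC(clk, reset, rc, in, out);
  input           clk;
  input           reset;
  input  [7:0]    rc;    // hypothetical name for the 8-bit per-round constant
  input  [1599:0] in;
  output [1599:0] out;
  reg    [1599:0] out;

  wire   [1599:0] next;

  // Placeholder for the "several logical expressions":
  // here the constant is simply XORed into the low 8 bits.
  assign next = in ^ {1592'b0, rc};

  // Registered output: one 1600-bit register per stage.
  always @(posedge clk) begin
    if (reset)
      out <= 1600'b0;
    else
      out <= next;
  end
endmodule
```

With this shape, every stage adds one clock cycle of latency and a 1600-bit register, so chaining the stages r0 through r23 gives a fully unrolled pipeline rather than a single round that is reused.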