--- Quote Start ---
I would really like to see the C code that adds the numbers so I can compare the software cycle count that you have with the cycle count that I get running the same code on my processor that directly executes the C code without compiling to a native instruction set. Will you please attach the code to this thread? The whole object of the design is to minimize the number of cycles so it fits right in with what you are doing.
--- Quote End ---
The code generates two sets of random numbers and then just adds them up. For the "sub-processor" that does the adding, you need three submodules: an adding module, an interface, and a top-level system. The adding module is where the generated numbers are added, the interface controls the input and output, and the top-level system is the whole design, which contains the interface and the adding module.
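Roughly, the software side looks like the sketch below. This is not the exact code I ran, just a minimal version of the idea. The set size N, the 0..255 value range, the element-wise interpretation of "add it up", and the function names fill_random and add_sets are all assumptions made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define N 8 /* hypothetical set size; pick whatever your test uses */

/* Fill buf with n pseudo-random values in 0..255 (range is an assumption). */
void fill_random(uint32_t *buf, size_t n)
{
    assert(buf != NULL);
    for (size_t i = 0; i < n; i++)
        buf[i] = (uint32_t)(rand() % 256);
}

/* Element-wise add of the two sets: out[i] = a[i] + b[i].
 * This loop is the part you would count cycles for. */
void add_sets(const uint32_t *a, const uint32_t *b, uint32_t *out, size_t n)
{
    assert(a != NULL && b != NULL && out != NULL);
    for (size_t i = 0; i < n; i++)
        out[i] = a[i] + b[i];
}
```

To use it, call srand() with a fixed seed, fill two arrays of N elements with fill_random, then pass them to add_sets along with a third array for the results. A fixed seed keeps the inputs the same from run to run, so cycle counts on different processors are compared on identical data.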