Forum Discussion
Altera_Forum
Honored Contributor
16 years ago
You could also try rearranging the matrix and vector to check whether the implementation works on rows faster than on columns. Try using the transpose of BB in place of AA and the transpose of AA in place of BB, then transpose the product. Mathematically, result = AA * BB = (BB' * AA')'. See http://en.wikipedia.org/wiki/matrix_multiplication#common_properties. Depending on how your matrices are stored, transposes can be "free" in FPGAs.
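To check the identity before touching any hardware, here is a minimal software sketch in plain Python. It is not the FPGA implementation: the helper names (transpose, matmul) and the small example matrices are made up for illustration, and only the names AA and BB come from the post above.

```python
def transpose(m):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    """Naive product of two list-of-rows matrices (rows of a times columns of b)."""
    b_cols = transpose(b)  # rows of b_cols are the columns of b
    return [[sum(x * y for x, y in zip(row, col)) for col in b_cols] for row in a]


# Hypothetical example data: AA is 2x3, BB is 3x2.
AA = [[1, 2, 3],
      [4, 5, 6]]
BB = [[7, 8],
      [9, 10],
      [11, 12]]

# Direct product.
direct = matmul(AA, BB)

# Same product computed as (BB' * AA')'.
via_transposes = transpose(matmul(transpose(BB), transpose(AA)))

print(direct)
print(via_transposes)
```

Both prints should show the same matrix. Whether the rearranged form is actually faster depends on how your implementation streams rows and columns, so you would still have to measure it on your design.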