Attached below is a quick, simplified diagram that may explain this better.
The NIOS and SGDMA store operand data in the on-chip RAM, and the "BIGNUM" block implements custom instructions for add/subtract/multiply/divide operations on those operands.
The purpose of the dual port memory is to keep all the x32 masters on one port, and x1024 masters on the other.
Use of the custom instructions in C code might end up looking like:
bignum_t dst, srcA, srcB;
...
...
BIGNUM_ADD(&dst, &srcA, &srcB);
BIGNUM_MPY(&dst, &srcA, &srcB);
...
...
This is similar to what Daixiwen mentioned with a bank of 1024-bit registers, except that you're using RAM (with its associated latencies) instead.
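To make the BIGNUM_ADD call above a bit more concrete, here is a rough host-side software model of what the add might compute. Everything in it is an assumption for illustration: the layout of bignum_t (32 little-endian 32-bit words), the helper name bignum_add_model, and the carry return value are not taken from the actual design. On the target, the macro would instead hand the operand addresses to the custom instruction rather than loop in software.

```c
#include <stdint.h>

/* Assumption: a 1024-bit operand is stored as 32 words of 32 bits,
 * least significant word first. The real layout may differ. */
#define BIGNUM_WORDS 32

typedef struct {
    uint32_t w[BIGNUM_WORDS];
} bignum_t;

/* Hypothetical software model of the add: dst = a + b (mod 2^1024).
 * Returns the carry out of bit 1023. dst may alias a or b, because
 * each word is read before the matching word of dst is written. */
static uint32_t bignum_add_model(bignum_t *dst, const bignum_t *a, const bignum_t *b)
{
    uint64_t carry = 0;
    for (int i = 0; i < BIGNUM_WORDS; i++) {
        uint64_t sum = (uint64_t)a->w[i] + b->w[i] + carry;
        dst->w[i] = (uint32_t)sum;   /* keep the low 32 bits */
        carry = sum >> 32;           /* propagate the carry (0 or 1) */
    }
    return (uint32_t)carry;
}

/* Stand-in for the hardware macro so the calling code looks the same. */
#define BIGNUM_ADD(dst, a, b) bignum_add_model((dst), (a), (b))
```

A model like this is mainly useful for checking results from the hardware block, since the calling code stays identical in both cases.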
The diagram stays the same if you use a bursting master and external memory (SDRAM, DDR), except that you would not have the BIGNUM component implement a 1024-bit-wide master; x64 or x128 is more manageable, depending on your memories.
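As a rough guide to what the narrower master costs, the number of data beats needed to move one 1024-bit operand is just the operand width divided by the master width. The helper below is a hypothetical name for illustration, and it ignores burst setup latency, maximum burst length limits, and memory refresh.

```c
#include <stdint.h>

#define BIGNUM_BITS 1024u

/* Data beats needed to transfer one operand over a master of the
 * given width in bits (rounded up). Assumes a nonzero width. */
static uint32_t bignum_beats(uint32_t master_width_bits)
{
    return (BIGNUM_BITS + master_width_bits - 1u) / master_width_bits;
}
```

So a x64 master needs 16 beats per operand and a x128 master needs 8, compared with a single beat for the x1024 on-chip port.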