An update:
When the arrays are defined like this:
int value[2048]; //source array
int dest[2048]; //destination array
and I run memcpy(dest, value, 2048*4), the memcpy speed is high: 446 Mbytes/s.
Also, the compile flag -Ofast gives a faster copy than -O1, as expected.
- - - - - -
My design is based on the fpga_fft example from RocketBoards, where DMA transfers data from the FPGA into the HPS's DRAM.
The memory space for these data (*value) is defined using mmap:
volatile unsigned int *value;
volatile unsigned int dest[2048*4];
#define result_base (FFT_SUB_DATA_BASE + (int)mappedBase + (FFT_SUB_DATA_SPAN/2))
- - - - - -
In main:
// We need a pointer to the LW_BRIDGE from the software's point of view,
// so we need to open a file.
/* Open /dev/mem */
if ((mem = open("/dev/mem", O_RDWR | O_SYNC)) == -1)
fprintf(stderr, "Cannot open /dev/mem\n"), exit(1);
// now map it into lw bridge space:
mappedBase = mmap(0, 0x1f0000, PROT_READ | PROT_WRITE, MAP_SHARED, mem, ALT_LWFPGASLVS_OFST);
if (mappedBase == (void *)-1) {
printf("Memory map failed. error %i\n", (int)mappedBase);
perror("mmap");
}
Run DMA and wait for completion
...
...
// And when the DMA is finished the data is available:
value = (unsigned int *)((int)result_base);
- - - - - -
Now, when running memcpy(dest, value, 2048*4), the speed is slow: only 42 Mbytes/s, and the compiler does not respond as expected to the -O compiler flags, i.e. -Ofast is slower than -O1.
It seems that using mmap really slows down the access to memory. Is it possible to speed this up?
Any help would be greatly appreciated!
Thanks,