In your code, none of the buffers or variables are implemented as Block RAMs. If they were, the compiler would explicitly report, for each buffer, the number of reads and writes, the replication factor required to support those accesses, and the number of Block RAMs used. Instead, the report says the Block RAMs are used to implement the "state" of the kernel. If you check the "System viewer", you will see that the latency of the pipeline in the consumer kernel keeps increasing as you increase the unroll factor. This means the pipeline is getting longer, so more Block RAMs are required to hold the state of the variables as they move through it.
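
To make the link between pipeline latency and state storage concrete, here is a rough first-order model in C. It is only a sketch and not how the compiler actually allocates resources: the function names, the idea of a single `live_bits` figure, and the `block_bits` capacity are all assumptions for illustration. The real compiler may keep short delays in registers and only move longer ones into Block RAMs.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical first-order model, not the compiler's actual algorithm:
 * a value that stays live across the pipeline has to be carried forward
 * one stage per cycle, so the storage needed grows with the latency. */
size_t pipeline_state_bits(size_t latency_cycles, size_t live_bits)
{
    return latency_cycles * live_bits;
}

/* Number of memory blocks needed to hold that state, rounded up.
 * block_bits is the capacity of one Block RAM on the target device
 * (an assumed parameter; look it up for your FPGA family). */
size_t blocks_for_state(size_t state_bits, size_t block_bits)
{
    assert(block_bits > 0);
    return (state_bits + block_bits - 1) / block_bits;
}
```

Under this model, if a larger unroll factor doubles the pipeline latency while the number of live bits stays the same, the estimated state storage doubles as well, which matches the trend you see in the report.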