Your code snippet is missing the channel definitions, so I defined the channels manually and ran a test on the kernel. From what I can see in the area report, the Block RAMs in the consumer kernel are used to hold the "state" of the kernel's variables. There is no direct relationship between the Block RAM consumption for holding variable state and the unroll factor. These Block RAMs are there to allow correct pipelining, and how many are needed depends on pipeline length, pipeline complexity, the number and scope of variables in the loop(s), the target operating frequency, and likely other factors. The 16 Block RAMs used in the sink kernel also serve as buffers between the kernel and the external memory interface.
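
For anyone trying to reproduce this, here is a minimal sketch of what the missing channel definitions might look like, assuming the channels extension from the Intel FPGA SDK for OpenCL. The channel name `data_ch`, the `int` element type, the depth of 16, and both kernel bodies are hypothetical placeholders and are not taken from your code:

```c
// Enable the Intel FPGA channels extension (assumes Intel FPGA SDK for OpenCL).
#pragma OPENCL EXTENSION cl_intel_channels : enable

// Hypothetical channel: the name, element type and depth are placeholders.
channel int data_ch __attribute__((depth(16)));

// Hypothetical producer: streams n values from global memory into the channel.
__kernel void producer(__global const int *restrict in, int n) {
    for (int i = 0; i < n; i++) {
        write_channel_intel(data_ch, in[i]);
    }
}

// Hypothetical consumer: reads the same n values back out of the channel.
__kernel void consumer(__global int *restrict out, int n) {
    for (int i = 0; i < n; i++) {
        out[i] = read_channel_intel(data_ch);
    }
}
```

On older Altera-branded releases of the SDK, the equivalent names are `cl_altera_channels`, `write_channel_altera` and `read_channel_altera`. The channel has to be declared at file scope so that both kernels can see it.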