num_compute_units(N) is a basic compiler feature for NDRange kernels, and it is fully supported in the emulator. You do not need to change anything in the host or kernel code other than adding the attribute to the kernel. The compiler automatically handles pipeline replication and the distribution of work-groups across the compute units. Functionally, the emulator behaves exactly as the actual hardware does. However, do not expect a speed-up in the emulator from this attribute, since the emulator is not timing-accurate.
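
As a rough illustration, the attribute goes directly in front of the kernel definition. This is a minimal sketch: the kernel name `vec_add`, its arguments, and the value 4 are made up for the example and are not taken from your code.

```c
// Hypothetical NDRange kernel; the name, arguments and the value 4 are illustrative only.
// The attribute asks the compiler to replicate this kernel's pipeline into 4 compute units.
__attribute__((num_compute_units(4)))
__kernel void vec_add(__global const float *restrict a,
                      __global const float *restrict b,
                      __global float *restrict c)
{
    // Each work-item handles one element; work-groups are spread over
    // the compute units without any change to the kernel body.
    size_t gid = get_global_id(0);
    c[gid] = a[gid] + b[gid];
}
```

The host side should stay as it is: the same clEnqueueNDRangeKernel call launches the kernel whether it is built for the emulator or for hardware, and the results should match with or without the attribute.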