The primary input to an NCO is the frequency word (i.e. the phase increment value). The additional modulation input is added to it internally, so you can move away from the centre frequency set by the frequency word. For example, if your frequency word generates a 1 MHz centre, you can set the modulation input to a positive or negative offset from that word to move off centre instead of changing the frequency word directly (thus your mod input gets modulated onto the centre frequency).
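To make the frequency word plus modulation input idea concrete, here is a minimal software sketch of a phase-accumulator NCO. The 32-bit accumulator width, the 100 MHz clock and the names `freq_word`, `output_freq` and `nco` are assumptions chosen for illustration, not taken from any particular NCO core.

```python
import math

ACC_BITS = 32    # assumed phase accumulator width
F_CLK = 100e6    # assumed clock rate in Hz


def freq_word(f_out, f_clk=F_CLK, acc_bits=ACC_BITS):
    """Phase increment (frequency word) that gives roughly f_out."""
    return round(f_out * (1 << acc_bits) / f_clk)


def output_freq(word, f_clk=F_CLK, acc_bits=ACC_BITS):
    """Output frequency produced by a given phase increment."""
    return word * f_clk / (1 << acc_bits)


def nco(centre_word, mod_words, acc_bits=ACC_BITS):
    """Return one sine sample per entry of mod_words.

    Each clock, the modulation value is added to the centre frequency
    word before the sum is accumulated into the phase.
    """
    modulus = 1 << acc_bits
    phase = 0
    samples = []
    for mod in mod_words:
        samples.append(math.sin(2 * math.pi * phase / modulus))
        # Wrap the accumulator the way a fixed-width register would.
        phase = (phase + centre_word + mod) % modulus
    return samples


centre = freq_word(1e6)    # 1 MHz centre frequency
offset = freq_word(10e3)   # 10 kHz deviation, as a phase increment

# Frequencies reached by leaving the frequency word alone and
# driving only the modulation input.
print(output_freq(centre))
print(output_freq(centre + offset))
print(output_freq(centre - offset))
```

In this sketch, feeding a constant `offset` on the modulation input gives exactly the same samples as reprogramming the frequency word to `centre + offset`; the benefit of the separate input is that it can change every clock while the centre word stays fixed.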
However, many people modulate by multiplying the NCO's centre-frequency output by the input instead. Either way works.
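A rough sketch of the multiplying approach, assuming it means multiplying the NCO's output samples by the input signal sample by sample. The function names, the 100 MHz clock and the 10 kHz test tone are hypothetical.

```python
import math


def tone(n_samples, freq, f_clk):
    """Cosine samples at freq, one per clock (stands in for an NCO output)."""
    return [math.cos(2 * math.pi * freq * n / f_clk) for n in range(n_samples)]


def multiply(carrier_samples, input_samples):
    """Modulate by multiplying the carrier with the input, sample by sample."""
    return [c * x for c, x in zip(carrier_samples, input_samples)]


F_CLK = 100e6      # assumed clock rate in Hz
F_CENTRE = 1e6     # NCO centre frequency
F_INPUT = 10e3     # assumed test tone on the input

carrier = tone(1000, F_CENTRE, F_CLK)
message = tone(1000, F_INPUT, F_CLK)
modulated = multiply(carrier, message)
```

With a single-tone input, the product contains components at the centre frequency plus and minus the input frequency, so the input again ends up around the centre frequency, but the frequency word itself is never touched.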