--- Quote Start ---
WHEN RUN =>
CASE P(1 DOWNTO 0) IS
WHEN "00" =>
WHEN "11" =>
WHEN "01" =>
P <= P + A;
WHEN "10" =>
P <= P + S;
END CASE;
P(0) <= P(1);
P(1) <= P(2);
P(2) <= P(3);
P(3) <= P(4);
P(4) <= P(5);
P(5) <= P(6);
P(6) <= P(7);
P(7) <= P(8);
P(8) <= P(9);
P(9) <= P(10);
P(10) <= P(11);
P(11) <= P(12);
P(12) <= P(13);
P(13) <= P(14);
P(14) <= P(15);
P(15) <= P(16);
Count <= Count + 1;
IF Count > 8 THEN
state <= STOP;
END IF;
--- Quote End ---
I have not looked at your code in great detail, but are you aware that in your RUN state (quoted above), the assignments to signal P made in the case statement (P <= P + A; or P <= P + S;) will be overridden by the shift register assignments that follow (P(0) <= P(1); etc.)? Only bit P(16), which has no later assignment, keeps the result of the addition.
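Signal assignments only take effect when the process suspends, so when a signal is assigned more than once in the same pass, the last assignment wins. The shift assignments also read the old value of P, not the sum.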
I am not sure whether this is relevant to your algorithm, but you might like to take a look.
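
If the intent is to add first and then shift the result, one possible way around this is to do the addition in a variable (which updates immediately) and make a single assignment to P per clock. The following is only a sketch under assumptions: I am guessing that P, A and S are 17-bit signed values from ieee.numeric_std, and the entity name, the load and P_init ports, and the variable name sum are all made up for illustration. The Count/state handling from your code is left out.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical entity: name, ports and widths are assumptions, not taken from the original design.
entity booth_step is
  port (
    clk    : in  std_logic;
    load   : in  std_logic;               -- assumed control input: load P_init into P
    P_init : in  signed(16 downto 0);
    A      : in  signed(16 downto 0);
    S      : in  signed(16 downto 0);
    P_out  : out signed(16 downto 0)
  );
end entity booth_step;

architecture rtl of booth_step is
  signal P : signed(16 downto 0) := (others => '0');
begin
  process (clk)
    -- A variable is updated immediately, unlike a signal.
    variable sum : signed(16 downto 0);
  begin
    if rising_edge(clk) then
      if load = '1' then
        P <= P_init;
      else
        sum := P;                         -- default: no addition ("00" and "11")
        case P(1 downto 0) is
          when "01"   => sum := P + A;
          when "10"   => sum := P + S;
          when others => null;
        end case;
        -- Single assignment to P: arithmetic shift right of the sum by one bit.
        P <= shift_right(sum, 1);
      end if;
    end if;
  end process;

  P_out <= P;
end architecture rtl;
```

Note that shift_right on a signed value sign-extends, so P(16) is refilled with the sign bit of the sum. Whether that is what your algorithm needs for the top bit is something you would have to check against your design.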