Direct CP violation in two-body $D\to PP$ and $D\to V\!P$ decays is studied within the framework of the topological amplitude approach for tree amplitudes and the QCD factorization approach for penguin amplitudes. It is the interference between tree and {\it long-distance} penguin and penguin-exchange amplitudes that pushes the CP asymmetry difference between the $D^0\to K^+K^-$ and $D^0\to\pi^+\pi^-$ modes up to the per mille level.
Using the same mechanism, we find that CP asymmetry can also reach the $10^{-3}$ level in many of the $D\to V\!P$ channels and is negligibly small in the rest. There are six golden modes that have sufficiently large branching fractions and direct CP violation at the per mille level. In particular, the direct CP asymmetry difference between $D^0\to K^+K^{*-}$ and $D^0\to\pi^+\rho^-$ is predicted to be $(-1.61\pm0.33)\times 10^{-3}$, very similar to its counterpart in the $P\!P$ sector. LHCb's observation of the CP asymmetry difference can be explained within the standard model without the need for new physics. The key lies in the long-distance penguin topologies (penguin and penguin-exchange) arising
from final-state rescattering.