To improve the detection accuracy and recall of objects in non-natural images, we focus on detecting low-quality, small targets in remote sensing video data. On this basis, we propose a detection model that uses a deconvolution network and hyper features to control the quality of convolutional features, which we call the Quality Deconvolution Single Shot Detector (QDSSD). By annotating video data from Jilin Satellite No. 1 frame by frame, we construct the CSU-RSI-Video dataset, in which each image frame contains no fewer than 30 targets; the dataset has been released for other researchers to experiment with. For small-target detection, we enrich feature information by gradually fusing low-level detail features into the upper layers and deconvolving the high-level feature maps, yielding stable, detailed features for detection. Experimental results show that the improved QDSSD network detects low-quality small targets more reliably than baseline models and performs best when many small targets appear close together. On the CSU-RSI-Video dataset, the QDSSD model achieves an mAP of 0.90227 for a single target, which remains superior in accuracy to the You Only Look Once (YOLO) model.
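To illustrate the coarse-to-fine fusion idea described above (deconvolving high-level feature maps and adding low-level detail features), here is a minimal PyTorch sketch. The module name `DeconvFusion`, the channel widths, and the element-wise sum are illustrative assumptions, not the paper's exact QDSSD architecture.

```python
import torch
import torch.nn as nn


class DeconvFusion(nn.Module):
    """Fuse a coarse, high-level feature map into a finer, low-level one.

    The high-level map is upsampled with a transposed convolution
    (deconvolution) and combined with the detail-rich low-level map,
    mirroring the coarse-to-fine enrichment described in the abstract.
    Channel sizes and the element-wise sum are assumptions for this sketch.
    """

    def __init__(self, low_channels: int, high_channels: int, out_channels: int = 256):
        super().__init__()
        # Deconvolve (2x upsample) the coarse, high-level features.
        self.deconv = nn.ConvTranspose2d(high_channels, out_channels,
                                         kernel_size=2, stride=2)
        # Project the low-level detail features to the same channel width.
        self.lateral = nn.Conv2d(low_channels, out_channels, kernel_size=1)
        # Smooth the fused map before it feeds a detection head.
        self.smooth = nn.Sequential(
            nn.Conv2d(out_channels, out_channels, kernel_size=3, padding=1),
            nn.BatchNorm2d(out_channels),
            nn.ReLU(inplace=True),
        )

    def forward(self, low: torch.Tensor, high: torch.Tensor) -> torch.Tensor:
        up = self.deconv(high)           # coarse map upsampled to low's resolution
        fused = self.lateral(low) + up   # add low-level detail to high-level semantics
        return self.smooth(fused)


if __name__ == "__main__":
    # Example: fuse a hypothetical SSD-style 19x19 map into a 38x38 map.
    low = torch.randn(1, 512, 38, 38)
    high = torch.randn(1, 1024, 19, 19)
    fused = DeconvFusion(low_channels=512, high_channels=1024)(low, high)
    print(fused.shape)  # torch.Size([1, 256, 38, 38])
```

In this sketch, the fused 38x38 map would then be passed to a detection head, which is consistent with the abstract's claim that the deconvolved, detail-enriched features are what drive small-target detection.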