In recent years, autonomous driving technology has developed at an unprecedented pace and attracted worldwide attention. It aims to provide convenient and intelligent travel solutions. Detecting and tracking dynamic targets in urban traffic scenes is essential to autonomous driving research, because high-level driving decisions (such as obstacle avoidance, overtaking, and car following) all depend on recognizing and tracking moving targets. Vehicle detection that fuses millimeter-wave radar with machine vision is an important part of environment perception for autonomous driving, since multi-sensor fusion lets the strengths of each sensor complement the others. This paper improves on existing radar-vision fusion methods for vehicle detection and proposes a target tracking method. For radar data processing, a novel filtering method based on hierarchical clustering is proposed, which effectively extracts moving radar targets and excludes invalid ones. For visual data processing, an adaptive vehicle detection method based on depth of field is proposed. Regions of interest (ROIs) for potential targets are generated from the radar data, and real vehicle targets are verified through vehicle shadow detection and a support vector machine (SVM) classifier. Finally, a target tracking method based on the extended Kalman filter (EKF) and the kernelized correlation filter (KCF) is proposed. Radar targets and image targets are tracked separately and then fused, so that the geometry and motion of each vehicle are estimated effectively, which greatly improves the accuracy and robustness of the detection system. Practical tests were compared against existing methods in different road environments (urban trunk roads, urban expressways, park roads, etc.) and under a variety of adverse weather and lighting conditions (strong light, weak light, overcast days, snowy days, etc.). The results show that the proposed algorithm achieves better accuracy and robustness, with an effective detection distance of more than 100 m, demonstrating the advantages of combining radar and vision for target detection.
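
The radar filtering step can be illustrated with a minimal sketch. The abstract states only that the filter is based on hierarchical clustering, so everything below is an assumption made for illustration: the `(x, y, speed)` layout of a radar return, the single-linkage merge rule, the function names, and the thresholds `min_speed`, `max_gap`, and `min_points` are all hypothetical and are not taken from the paper.

```python
import math


def single_linkage_clusters(points, max_gap):
    """Agglomerative (bottom-up) hierarchical clustering.

    Starts with one cluster per point and repeatedly merges the two
    closest clusters, stopping once the smallest single-linkage
    distance exceeds max_gap (metres, assumed).
    """
    clusters = [[p] for p in points]

    def linkage(a, b):
        # Single linkage: distance between the closest pair of members,
        # measured on the (x, y) position only.
        return min(math.dist(p[:2], q[:2]) for p in a for q in b)

    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = linkage(clusters[i], clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        if d > max_gap:
            break
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


def filter_radar_targets(returns, min_speed=0.5, max_gap=2.0, min_points=2):
    """Keep moving returns, cluster them, and drop sparse clusters.

    returns: list of (x, y, speed) tuples (hypothetical layout).
    Thresholds are illustrative values, not the paper's.
    Yields one (x, y, speed) centroid per surviving cluster.
    """
    # Discard returns that are (nearly) stationary.
    moving = [r for r in returns if abs(r[2]) >= min_speed]
    targets = []
    for cluster in single_linkage_clusters(moving, max_gap):
        if len(cluster) < min_points:
            continue  # isolated return, treated here as invalid
        n = len(cluster)
        targets.append(tuple(sum(r[k] for r in cluster) / n for k in range(3)))
    return targets


# Hypothetical radar frame: three returns from one vehicle, one
# stationary return, and one isolated moving return.
frame = [
    (0.0, 20.0, 8.0),
    (0.5, 20.5, 8.2),
    (1.0, 20.0, 7.8),
    (-3.5, 45.0, 0.0),
    (10.0, 60.0, 5.0),
]
targets = filter_radar_targets(frame)
```

In this sketch the three nearby returns collapse into a single target, while the stationary and isolated returns are discarded. The pairwise merge loop is cubic in the number of returns, which is acceptable for the few dozen returns of a typical radar frame but would call for a library implementation on larger inputs.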