Most model reconstruction and indoor scene recovery methods with Kinect, involve either depth or color images, and merely combine them. These approaches do not make full use of Kinect data, and are not robust and accurate enough in many use cases. In order to solve this problem, this paper proposes a new method in which Kinect is calibrated and epipolar constraints from matching the sequence of color images are combined with point-to-plane constraints in ICP registration to improve the accuracy and robustness. As ICP registers clouds frame by frame, the error will inevitably accumulate. So a four-points coplanar method is applied to optimize the Kinect position and orientation, and make the reconstructed model more precise. Model and indoor scene experiments were designed to demonstrate that the proposed method is effective. Results show that it is more robust, even in a scene that KinectFusion fails at tracking and modeling. The registration accuracy of point clouds accords with Kinect observation accuracy.