Abstract:
Objectives The quality of training samples affects the effectiveness of the training phase and, in turn, the overall classification accuracy in the testing phase. The representativeness, or typicality, of training samples reflects their quality to some extent. Currently popular deep learning methods in particular require thousands or even millions of training samples, so reducing the number of training samples they need has become an important problem. From a practical standpoint, acquiring so many training samples is also very expensive. We therefore propose a method that reduces the number of training samples as far as possible by selecting them according to their representativeness.
Methods We propose selecting training samples based on an oblique factor model. This model relaxes the requirement of the orthogonal factor model that the common factors be mutually independent, and can therefore better describe real-world data.
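As a minimal sketch of this distinction, in standard factor-analysis notation (the symbols here are assumed for illustration and are not taken from the paper), an observed vector $x$ with mean $\mu$, loading matrix $\Lambda$, common factors $f$, and specific factors $\varepsilon$ can be written as

\[
x = \mu + \Lambda f + \varepsilon, \qquad \operatorname{Cov}(\varepsilon) = \Psi \ \text{(diagonal)}, \qquad \operatorname{Cov}(f, \varepsilon) = 0 .
\]

The two models then differ only in the covariance assumed for the common factors:

\[
\text{orthogonal: } \operatorname{Cov}(f) = I \;\Rightarrow\; \Sigma = \Lambda \Lambda^{\top} + \Psi,
\qquad
\text{oblique: } \operatorname{Cov}(f) = \Phi \;\Rightarrow\; \Sigma = \Lambda \Phi \Lambda^{\top} + \Psi ,
\]

where $\Phi$ is a factor correlation matrix with unit diagonal whose off-diagonal entries may be nonzero. Setting $\Phi = I$ recovers the orthogonal model as a special case.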
Results Experimental results show that the proposed method is feasible and effective. It selects more representative training samples than selection based on the orthogonal factor model and achieves better overall classification accuracy and stability. The selected samples are more widely dispersed and more reasonably distributed, and the overall classification accuracy improves by about 3% on average.
Conclusions The proposed method provides theoretical support for optimizing data acquisition and can also guide effective data acquisition in practical applications.