Two Algorithms with High Breakdown Points Applied in Linear Regression EIV Model

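  • Summary: Mixed least squares and total least squares (mixed LS-TLS) is a rigorous method for solving the linear regression errors-in-variables (EIV) model with fixed columns, and combining it with M-estimation can further increase its robustness. However, the M-estimation result depends on the initial value and easily converges to a wrong solution. To address this problem, two robust estimation algorithms from the Gauss-Markov model are extended to the EIV model, giving two algorithms with high breakdown points: the weighted total least median of squares (WTLMS) method and the weighted total least trimmed squares (WTLTS) method. The equivariance properties and breakdown points of the two algorithms are analyzed, a formula for evaluating the standard deviation of unit weight is given, and the parameter estimates are obtained by the resampling method and the feasible set algorithm, respectively. Unlike existing high-breakdown-point algorithms, the proposed algorithms account for fixed columns in the coefficient matrix and place fewer restrictions on the stochastic model. Experiments with simulated and real data verify that both algorithms give robust and reliable estimates from observations heavily contaminated by gross errors.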

     

    Abstract: Objectives: The linear regression model is a basic model in geodesy. To account for a coefficient matrix that contains fixed columns, the mixed least squares and total least squares (mixed LS-TLS) method is used. However, it is easily contaminated by outliers. The M-estimator results depend on the initial value and are prone to converging to a wrong solution. To increase robustness, we propose two algorithms with high breakdown points for linear regression EIV models, namely the weighted total least median of squares (WTLMS) method and the weighted total least trimmed squares (WTLTS) method. Methods: The two algorithms are extensions of traditional algorithms and use a more general stochastic model. Their breakdown points are near 50%, and both algorithms have two equivariance properties: scale equivariance and affine equivariance. The estimation formula of variance components is given. Since their objective functions are not differentiable, WTLMS and WTLTS obtain their solutions in the EIV model by the resampling algorithm and the feasible set algorithm, respectively. Results: (1) The M-estimator result deviates heavily from the true line, whereas the two proposed algorithms obtain results close to the true values. They perform significantly better than the M-estimator in terms of root mean square error (RMSE) and standard deviation (STD). The efficiency of the two algorithms is not high, but it can be further improved by using their results as the initial value of the M-estimator. The breakdown points of the two algorithms are close to 50% in the real data, which indicates strong robustness. (2) In the experiment with lidar data, the proposed methods perform better than the M-estimator. Conclusions: The two proposed algorithms are highly robust, but their complexity is high and their efficiency is not ideal. Future work will focus on finding a simpler solution with higher efficiency.
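
The resampling idea behind a least-median-of-squares estimator can be illustrated on a straight-line fit. The sketch below is not the paper's WTLMS algorithm: it is a simplified, unweighted variant that assumes equal error variances in x and y, treats the intercept column as error-free (a fixed column), and uses squared orthogonal distances as the total-least-squares residuals. The function names `orthogonal_sq_residuals` and `total_lms_line`, the trial count, and the demo data are all hypothetical choices made for illustration.

```python
import random
import statistics


def orthogonal_sq_residuals(a, b, xs, ys):
    """Squared orthogonal distances from each point to the line y = a + b*x."""
    denom = 1.0 + b * b
    return [(y - a - b * x) ** 2 / denom for x, y in zip(xs, ys)]


def total_lms_line(xs, ys, n_trials=500, seed=0):
    """Resampling search (illustrative, unweighted): fit a line through random
    point pairs and keep the candidate whose median squared orthogonal
    residual over all points is smallest."""
    rng = random.Random(seed)
    n = len(xs)
    best = None  # (median squared residual, intercept, slope)
    for _ in range(n_trials):
        i, j = rng.sample(range(n), 2)
        if xs[i] == xs[j]:
            continue  # a vertical candidate cannot be written as y = a + b*x
        b = (ys[j] - ys[i]) / (xs[j] - xs[i])
        a = ys[i] - b * xs[i]
        med = statistics.median(orthogonal_sq_residuals(a, b, xs, ys))
        if best is None or med < best[0]:
            best = (med, a, b)
    if best is None:
        raise ValueError("no usable point pair was sampled")
    return best[1], best[2]


# Demo data (made up): 12 noise-free inliers on y = 1 + 2x and 8 gross
# outliers on y = 50 - x, i.e. 40% contamination.
xs = [float(i) for i in range(20)]
ys = [1.0 + 2.0 * x if x < 12 else 50.0 - x for x in xs]

a_hat, b_hat = total_lms_line(xs, ys)
print(a_hat, b_hat)
```

Because more than half of the demo points lie exactly on the true line, any candidate drawn from two inliers has a zero median residual, so the search recovers the line despite the contamination. The inliers are kept noise-free only to make the demo deterministic; with noisy data the selected candidate would be approximate, and a weighted formulation such as the one the abstract describes would be needed for a general stochastic model.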

     
