基于非线性高斯-赫尔默特模型的结构总体最小二乘法

文章信息

吕志鹏, 隋立芬

LÜ Zhipeng, SUI Lifen

A Structured Total Least Squares Method Based on Nonlinear Gauss-Helmert Model

武汉大学学报·信息科学版, 2019, 44(12): 1808-1815

Geomatics and Information Science of Wuhan University, 2019, 44(12): 1808-1815

http://dx.doi.org/10.13203/j.whugis20180104

文章历史

收稿日期: 2018-09-30

Abstract

PDF

Figures

Tables

吕志鹏, 隋立芬. 基于非线性高斯-赫尔默特模型的结构总体最小二乘法[J]. 武汉大学学报·信息科学版, 2019, 44(12): 1808-1815.

LÜ Zhipeng, SUI Lifen. A Structured Total Least Squares Method Based on Nonlinear Gauss-Helmert Model[J]. Geomatics and Information Science of Wuhan University, 2019, 44(12): 1808-1815.

基于非线性高斯-赫尔默特模型的结构总体最小二乘法

吕志鹏¹ , 隋立芬¹

1. 战略支援部队信息工程大学地理空间信息学院, 河南郑州, 450001

收稿日期：2018-09-30

项目资助：国家自然科学基金（41674016，41274016）

第一作者：吕志鹏, 博士生, 主要从事空间大地测量数据处理方面的研究。E-mail:lvzhipeng1989@qq.com.

通讯作者：隋立芬, 博士, 教授。suilifen@163.com.

摘要：变量误差（error-in-variables，EIV）模型的系数矩阵存在结构特征的情况，并且这种结构特征可以扩展到观测向量中。首先采用变量投影法将系数矩阵的增广矩阵展开成仿射矩阵形式，提取系数矩阵和观测向量中的随机量，并将EIV模型表示为非线性高斯-赫尔默特模型，然后利用非线性最小二乘原理推导了一种结构总体最小二乘法。该算法统一了普通的结构总体最小二乘法、结构数据最小二乘法以及最小二乘法。将该算法应用到真实算例和模拟算例中，两个算例结果表明，该算法与已有能够解决EIV模型结构特征的结构或加权总体最小二乘法估计结果一致，验证了该算法的有效性。同时，该算法对结构特征的提取方式简单、规律性强且易于编程实现；且在算法设计中，把结构总体最小二乘问题转换为附有参数的条件平差问题，即将其纳入到最小二乘平差理论体系，便于其扩展应用。同时对平面拟合问题的误差估计特性进行了定性分析，由分析可知参数的相对大小对估计误差的一致性有直接影响，这说明EIV模型下系数矩阵和观测向量中随机量的估计误差与真误差的一致性关系相对复杂。

关键词：结构特征 Partial EIV 非线性高斯-赫尔默特模型误差估计总体最小二乘

A Structured Total Least Squares Method Based on Nonlinear Gauss-Helmert Model

LÜ Zhipeng¹ , SUI Lifen¹

1. Institute of Geographical Spatial Information, Information Engineering University, Strategic Support Force, Zhengzhou 450001, China

Foundation support: The National Natural Science Foundation of China, Nos. 41674016, 41274016

First author: LÜ Zhipeng, PhD candidate, specializes in the theories and methods of spatial geodetic data processing. E-mail:lvzhipeng E-mail:lvzhipeng1989@qq.com.

Corresponding author: SUI Lifen, PhD, professor. E-mail: suilifen@163.com.

Abstract: The coefficient matrix of errors-in-variables(EIV) model may contain structural features, and this situation can be extended to the observation vector. Here, we define the the augmented matrix of the coefficient matrix that consists of the coefficient matrix and the observation vector. Firstly, we reformulate the EIV model as the nonlinear Gauss-Helmert model. The augmented matrix is expanded as an affine function form, and then the random elements in it are extracted by variable projection method. Finally, we derive a novel algorithm for the structural total least squares (STLS) problem and its first-order approximation precision estimator based on the nonlinear least squares (NLS) theory. The proposed algorithm unifies the general STLS method, the structural data least squares (SDLS) method and the least squares (LS) method. Furthermore, this algorithm is applied to a real example and a simulation example, e. g. straight line fitting and plane fitting. The results of the two examples show that the proposed algorithm is consistent with the existing structural or weighted total least squares methods which can solve the structure problem of the EIV model, thus the results verifies the effectiveness of the proposed algorithm. The method of extracting structural features in this paper is simple in the concept and easy in the implementation. And the STLS problem is transformed into a condition adjustment problem with parameters, which is incorporated into the least squares adjustment theory system to facilitate its extended application. Additionally, the characteristic of the estimated errors of plane fitting is analyzed qualitatively. It can be seen from the analysis that the relative size of the parameters has a direct impact on the consistency of the estimated errors, which indicates that the consistency between the estimated errors and the true errors is relatively complex in the EIV model.

Key words: structure feature Partial EIV nonlinear Gauss-Helmert model error estimation total least squares

变量误差（error-in-variables，EIV）模型同时考虑了系数矩阵和观测向量的误差。Goulub等^[1]首次提出了总体最小二乘(total least squares，TLS)法，并给出了EIV模型的奇异值分解算法。TLS估计是EIV模型下的无偏估计，并且在观测个数趋于无穷大时，估计结果具有弱一致性和渐进无偏性的统计特征^[2]。针对数据处理的不同情况，众多学者对TLS估计进行了丰富和发展。在观测误差不等权的情况下，文献[3-6]提出了加权总体最小二乘(weighted total least squares，WTLS)法。考虑到系数矩阵的非线性特征，文献[7]提出了非线性TLS法。结合各种等式或不等式约束条件，文献[8-9]提出了约束TLS法。目前，测绘领域对TLS及其扩展方法的应用集中于直线拟合^[10]、基准变换^[11]、地面解析摄影测量^[12]、摄影测量空间后方交会^[13]、应变参数反演^[14]、自回归分析^[15]等问题。而在这些应用中，EIV模型的系数矩阵存在结构特征。通常可以采用基于矩阵正交分解的混合最小二乘和总体最小二乘(least squares-total least squares，LS-TLS)法^[2]，或各种结构总体最小二乘(structural total least squares，STLS)法^[16-18]进行求解。Xu等^[19]提出了部分变量误差（Partial EIV）模型，可以视为混合LS-TLS法或者STLS法的理论模型。

Teunissen^[20]在研究基准变换问题时提出了非线性高斯-赫尔墨特（Gauss-Helmert, GH）模型，这一模型同时考虑了公共点在原坐标系下和目标坐标系下的观测误差。非线性GH模型可以等价于回归分析中的EIV模型，文献[21]将EIV模型表示为非线性GH模型，然后在参数和随机误差近似值处对非线性GH模型进行线性化，应用LS法得到非线性GH模型下的WTLS解。因此可认为基于非线性GH模型的TLS法采用非线性LS法处理TLS问题，推导过程更易理解。文献[22]将混合LS-TLS法的平差模型表示为非线性GH模型形式，提出了适用于系数矩阵包含固定列、固定量和随机量的混合LS-TLS法。但该方法的实现过程仍采用非线性拉格朗日乘数法^[3-9]。

EIV模型的结构特征不仅存在于系数矩阵中，而且可以扩展到观测向量中，即系数矩阵和观测向量存在固定量和随机量以及不同位置的随机量之间存在线性关系。本文在已有研究的基础上，考虑了系数矩阵和观测向量的结构特征以及随机量之间的线性关系，将EIV模型表示为非线性GH模型，采用非线性最小二乘理论，推导了一种STLS法，并给出参数估值和误差估值的一阶近似精度。通过真实算例和模拟算例验证了本文算法的有效性，并进一步对平面拟合问题的误差估计特性进行了定性分析，说明了参数对误差估值可靠性的影响性质。本文算法统一了普通的结构总体最小二乘法、结构数据最小二乘法以及最小二乘法，同时考虑了系数矩阵和观测向量的完整结构，并采用变量投影法^[18]给出了系数矩阵和观测向量一致化的结构特征提取方法。相较其他算法，本文算法更具一般性。

1 Partial EIV模型

EIV模型同时考虑了系数矩阵误差和观测向量误差，其形式如下：

$ \mathit{\boldsymbol{y}} - {\mathit{\boldsymbol{e}}_y} - \left( {\mathit{\boldsymbol{A}} - {\mathit{\boldsymbol{E}}_A}} \right)\mathit{\boldsymbol{x}} = 0, m > n = {\rm{rank}}\left( \mathit{\boldsymbol{A}} \right) $

(1)

式中，$\mathit{\boldsymbol{y}} \in {{\bf{R}}^{m \times 1}}$为观测向量；e_y为其误差向量；$ \mathit{\boldsymbol{A}} \in {{\bf{R}}^{m \times n}}$为系数矩阵；E_A为其误差矩阵；$\mathit{\boldsymbol{x}} \in {{\bf{R}}^{n \times 1}} $为参数向量。对于Partial EIV模型而言，其系数矩阵的增广矩阵$\mathit{\boldsymbol{C}} = \left[ {\begin{array}{*{20}{l}} \mathit{\boldsymbol{A}}&\mathit{\boldsymbol{y}} \end{array}} \right] $具有某种结构特征。因而平差后的误差矩阵$\Delta \mathit{\boldsymbol{C}} = \left[ {\begin{array}{*{20}{c}} {{\mathit{\boldsymbol{E}}_A}}&{{\mathit{\boldsymbol{e}}_y}} \end{array}} \right]$应具有相同的结构特征。假定C仅存在仿射结构，即不考虑随机量之间的非线性关系，同时设C中独立随机量组成的向量为$ \mathit{\boldsymbol{p}} \in {{\bf{R}}^{{n_p}}}$，n_p为独立随机量个数。故采用变量投影法将C表示成如下仿射函数形式^[18]:

$ \mathit{\boldsymbol{C}} = \mathit{\boldsymbol{S}}(\mathit{\boldsymbol{p}}) = {\mathit{\boldsymbol{S}}_0} + \sum\limits_{t = 1}^{{n_p}} {{\mathit{\boldsymbol{S}}_t}} {\mathit{\boldsymbol{p}}_t}, {\mathit{\boldsymbol{S}}_0}, {\mathit{\boldsymbol{S}}_t} \in {{\bf{R}}^{m \times (n + 1)}} $

(2)

式中，S₀为固定量仿射结构矩阵；S₀中零元素为C中固定量；S_i为随机量p_i对应的仿射结构矩阵。此时可将STLS问题的数学模型表示为：

$ \left\{ {\begin{array}{*{20}{l}} {f\left( {\Delta \mathit{\boldsymbol{p}}, \mathit{\boldsymbol{x}}} \right) = \mathit{\boldsymbol{S}}\left( {\mathit{\boldsymbol{p}} - \Delta \mathit{\boldsymbol{p}}} \right)\left[ {\begin{array}{*{20}{c}} \mathit{\boldsymbol{x}}\\ { - 1} \end{array}} \right] = \mathit{\boldsymbol{S}}\left( {\mathit{\boldsymbol{p}} - \Delta \mathit{\boldsymbol{p}}} \right){\mathit{\boldsymbol{x}}_{{\rm{ext}}}} = 0}\\ {\Delta \mathit{\boldsymbol{p}} \sim \left( {0, \sigma _0^2{\mathit{\boldsymbol{P}}^{ - 1}}} \right)} \end{array}} \right. $

(3)

式中，${\mathit{\boldsymbol{x}}_{{\rm{ext}}}} = \left[ {\begin{array}{*{20}{c}} \mathit{\boldsymbol{x}}\\ { - 1} \end{array}} \right] $为x的增广向量；Δp为p的改正数；P为p的权阵。

将STLS问题的函数模型进一步变形为：

$ \begin{align} & \mathit{\pmb{S}} \left( \mathit{\pmb{p}} -\Delta \mathit{\pmb{p}} \right){{\mathit{\pmb{x}} }_{\text{ext}}}=0\Leftrightarrow \underset{t=1}{\overset{{{n}_{p}}}{\mathop{\Sigma }}}\,{{\mathit{\pmb{S}} }_{t}}\Delta {{\mathit{\pmb{p}} }_{t}}{{\mathit{\pmb{x}} }_{\text{ext}}}=\mathit{\pmb{S}} \left( \mathit{\pmb{p}} \right){{\mathit{\pmb{x}} }_{\text{ext}}} \\ & \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \Leftrightarrow \underset{t=1}{\overset{{{n}_{p}}}{\mathop{\Sigma }}}\,{{\mathit{\pmb{S}} }_{t}}{{\mathit{\pmb{x}} }_{\text{ext}}}\Delta {{\mathit{\pmb{p}} }_{t}}=\mathit{\pmb{S}} \left( \mathit{\pmb{p}} \right){{\mathit{\pmb{x}} }_{\text{ext}}} \\ & \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \Leftrightarrow \mathit{\pmb{y}} - \mathit{\pmb{Ax}} + \mathit{\pmb{G}} \left( \mathit{\pmb{x}} \right)\Delta \mathit{\pmb{p}} =0 \\ \end{align} $

(4)

式中，

$ \begin{array}{*{20}{l}} {\mathit{\boldsymbol{G}}\left( \mathit{\boldsymbol{x}} \right) = \left[ {\begin{array}{*{20}{c}} {{\mathit{\boldsymbol{S}}_1}{\mathit{\boldsymbol{x}}_{{\rm{ext}}}}}&{{\mathit{\boldsymbol{S}}_2}{\mathit{\boldsymbol{x}}_{{\rm{ext}}}}}& \ldots &{{\mathit{\boldsymbol{S}}_{{n_p}}}{\mathit{\boldsymbol{x}}_{{\rm{ext}}}}} \end{array}} \right] = }\\ {\;\;\;\;\;\left( {\mathit{\boldsymbol{x}}_{{\rm{ext}}}^{\rm{T}} \otimes {\mathit{\boldsymbol{I}}_m}} \right)\mathit{\boldsymbol{\psi }} = \left[ {\begin{array}{*{20}{c}} {{\mathit{\boldsymbol{x}}^{\rm{T}}} \otimes {\mathit{\boldsymbol{I}}_m}}&{ - {\mathit{\boldsymbol{I}}_m}} \end{array}} \right]\mathit{\boldsymbol{\psi }}} \end{array} $

(5)

式中，⊗为克罗内克乘积; 定义ψ为结构矩阵，$ \mathit{\boldsymbol{\psi }} = \left[ {\begin{array}{*{20}{c}} {{\rm{vec}}\left( {{\mathit{\boldsymbol{S}}_1}} \right)}&{{\rm{vec}}\left( {{\mathit{\boldsymbol{S}}_2}} \right)}& \ldots &{{\rm{vec}}\left( {{\mathit{\boldsymbol{S}}_{{\mathit{\boldsymbol{n}}_p}}}} \right)} \end{array}} \right]$; vec(·)为矩阵向量化算子。当系数矩阵和观测向量中随机量不存在相同元素时，有$ \Delta \mathit{\boldsymbol{p}} = \left[ {\begin{array}{*{20}{l}} {{\mathit{\boldsymbol{e}}_A}}\\ {{\mathit{\boldsymbol{e}}_y}} \end{array}} \right]$，其中e_A为系数矩阵中随机量组成的误差向量。同时ψ是一个分块对角矩阵，即${\mathit{\boldsymbol{\psi }}_S} = {\mathop{\rm diag}\nolimits} \left( {{\mathit{\boldsymbol{C}}_A}\quad {\mathit{\boldsymbol{I}}_m}} \right) $，其中C_A为系数矩阵中随机量对应的结构矩阵，其列数等于系数矩阵中随机量的个数k。当系数矩阵存在固定列，并假设固定列位于系数矩阵前j列，此时$ \begin{equation} C_{A}=\left[\begin{array}{c} {0_{m j \times k}} \\ {C_{A_{k}}} \end{array}\right] \end{equation}$，其中$ {0_{mj \times k}}$为mj×k的零矩阵，${\mathit{\boldsymbol{C}}_{{A_k}}} $为非固定列中随机量对应的结构矩阵。

2 STLS估计 2.1 STLS算法

Partial EIV模型本质上是一种非线性模型，令参数和系数矩阵的增广矩阵中随机误差近似值分别为$ {\mathit{\boldsymbol{\widehat x}}^i}$和$\Delta {{\mathit{\boldsymbol{\hat p}}}^i} $，对式(4)进行一阶泰勒展开：

$ \begin{array}{l} f\left( {\Delta \mathit{\boldsymbol{p}}, \mathit{\boldsymbol{x}}} \right) \approx f\left( {\Delta {{\mathit{\boldsymbol{\hat p}}}^i}, {{\mathit{\boldsymbol{\hat x}}}^i}} \right) + \mathit{\boldsymbol{G}}\left( {{{\mathit{\boldsymbol{\hat x}}}^i}} \right)\left( {\Delta \mathit{\boldsymbol{p}} - \Delta {{\mathit{\boldsymbol{\hat p}}}^i}} \right) - {{\mathit{\boldsymbol{\hat A}}}^i}\left( {\mathit{\boldsymbol{x}} - {{\mathit{\boldsymbol{\hat x}}}^i}} \right) = \\ \;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\mathit{\boldsymbol{G}}\left( {{{\mathit{\boldsymbol{\hat x}}}^i}} \right)\Delta \mathit{\boldsymbol{p}} - {{\mathit{\boldsymbol{\hat A}}}^i}\mathit{\boldsymbol{x}} + \left( {\mathit{\boldsymbol{y}} - \mathit{\boldsymbol{A}}{{\mathit{\boldsymbol{\hat x}}}^i} + \mathit{\boldsymbol{G}}\left( {{{\mathit{\boldsymbol{\hat x}}}^i}} \right)\Delta {{\mathit{\boldsymbol{\hat p}}}^i}} \right) - \mathit{\boldsymbol{G}}\left( {{{\mathit{\boldsymbol{\hat x}}}^i}} \right)\Delta {{\mathit{\boldsymbol{\hat p}}}^i} + {{\mathit{\boldsymbol{\hat A}}}^i}{{\mathit{\boldsymbol{\hat x}}}^i} = \\ \;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\mathit{\boldsymbol{G}}\left( {{{\mathit{\boldsymbol{\hat x}}}^i}} \right)\Delta \mathit{\boldsymbol{p}} - {{\mathit{\boldsymbol{\hat A}}}^i}\mathit{\boldsymbol{x}} + \left( {\mathit{\boldsymbol{y}} - \mathit{\boldsymbol{\hat E}}_\mathit{A}^i{{\mathit{\boldsymbol{\hat x}}}^i}} \right) = \\ \;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\mathit{\boldsymbol{G}}\left( {{{\mathit{\boldsymbol{\hat x}}}^i}} \right)\Delta \mathit{\boldsymbol{p}} - {{\mathit{\boldsymbol{\hat A}}}^i}\mathit{\boldsymbol{x}} + {\mathit{\boldsymbol{\omega }}^i} = 0 \end{array} $

(6)

式中，${{\mathit{\boldsymbol{\hat A}}}^i} = \mathit{\boldsymbol{A}} - \mathit{\boldsymbol{\hat E}}_A^i $；${\mathit{\boldsymbol{\omega }}^i} = \mathit{\boldsymbol{y}} - \mathit{\boldsymbol{\hat E}}_A^i{{\mathit{\boldsymbol{\hat x}}}^i} $; $ \mathit{\boldsymbol{\hat E}}_A^i$为E_A的i次迭代估值。式(6)可以视为经典平差中附加参数的条件平差模型。在此采用两步法对x和Δp进行求解。令${\mathit{\boldsymbol{V}}_f} = \mathit{\boldsymbol{G}}\left( {{{\mathit{\boldsymbol{\hat x}}}^i}} \right)\Delta \mathit{\boldsymbol{p}} $，$ {\mathit{\boldsymbol{V}}_f} \sim (0, \sigma _0^2\mathit{\boldsymbol{G}}\left( {{{\mathit{\boldsymbol{\hat x}}}^i}} \right) \cdot $ $\left. {{\mathit{\boldsymbol{P}}^{ - 1}}{\mathit{\boldsymbol{G}}^{\rm{T}}}\left( {{{\mathit{\boldsymbol{\hat x}}}^i}} \right)} \right) $，故式(6)可变形为经典平差中的参数平差模型：

$ \left\{ {\begin{array}{*{20}{l}} {{\mathit{\boldsymbol{V}}_f} = {{\mathit{\boldsymbol{\hat A}}}^i}\mathit{\boldsymbol{x}} - {\mathit{\boldsymbol{\omega }}^\mathit{i}}}\\ {{\mathit{\boldsymbol{V}}_f} \sim \left( {0,\sigma _0^2\mathit{\boldsymbol{G}}\left( {{{\mathit{\boldsymbol{\hat x}}}^i}} \right){\mathit{\boldsymbol{P}}^{ - 1}}{\mathit{\boldsymbol{G}}^{\rm{T}}}\left( {{{\mathit{\boldsymbol{\hat x}}}^i}} \right)} \right)} \end{array}} \right. $

(7)

根据参数平差法可得：

$ \mathit{\boldsymbol{x}} = {\left[ {{{\mathit{\boldsymbol{\hat A}}}^{i{\rm{T}}}}{{\left( {\mathit{\boldsymbol{Q}}_{{V_f}}^i} \right)}^{ - 1}}{{\mathit{\boldsymbol{\hat A}}}^i}} \right]^{ - 1}}{{\mathit{\boldsymbol{\hat A}}}^{i{\rm{T}}}}{\left( {\mathit{\boldsymbol{Q}}_{{V_f}}^i} \right)^{ - 1}}{\mathit{\boldsymbol{\omega }}^i} $

(8)

式中，$ \mathit{\boldsymbol{Q}}_{{V_f}}^i = \mathit{\boldsymbol{G}}\left( {{{\hat x}^i}} \right){\mathit{\boldsymbol{P}}^{ - 1}}{\mathit{\boldsymbol{G}}^{\rm{T}}}\left( {{{\mathit{\boldsymbol{\hat x}}}^i}} \right)$。

已知x的估值${\mathit{\boldsymbol{\widehat x}}^i} $后，式(6)可变形为经典平差中的条件平差模型：

$ \left\{ {\begin{array}{*{20}{l}} {\mathit{\boldsymbol{G}}\left( {{{\mathit{\boldsymbol{\hat x}}}^i}} \right)\Delta \mathit{\boldsymbol{p}} + \left( {\mathit{\boldsymbol{y}} - \mathit{\boldsymbol{A}}{{\mathit{\boldsymbol{\hat x}}}^i}} \right) = 0}\\ {\Delta \mathit{\boldsymbol{p}} \sim \left( {0,\sigma _0^2{\mathit{\boldsymbol{P}}^{ - 1}}} \right)} \end{array}} \right. $

(9)

根据条件平差法可得：

$ \Delta \mathit{\boldsymbol{p}} = - {\mathit{\boldsymbol{P}}^{ - 1}}{\mathit{\boldsymbol{G}}^{\rm{T}}}\left( {{{\mathit{\boldsymbol{\hat x}}}^i}} \right){\left( {\mathit{\boldsymbol{Q}}_{{V_f}}^i} \right)^{ - 1}}\left( {\mathit{\boldsymbol{y}} - \mathit{\boldsymbol{A}}{{\mathit{\boldsymbol{\hat x}}}^i}} \right) $

(10)

根据以上推导，可得STLS法的迭代过程如下。

1）初始值：假设A无误差，通过LS估计求得参数的初值${\mathit{\boldsymbol{\widehat x}}^0} $。

2) 由$ {\mathit{\boldsymbol{\widehat x}}^i}$通过式(5)可得$\mathit{\boldsymbol{G}}\left( {{{\mathit{\boldsymbol{\hat x}}}^i}} \right) $，进而通过式(10)求得$ \Delta {{\mathit{\boldsymbol{\hat p}}}^{i + 1}}$。

3) 由$ \Delta {{\mathit{\boldsymbol{\hat p}}}^{i + 1}}$重构$\mathit{\boldsymbol{\hat E}}_A^{i + 1} $，即$\mathit{\boldsymbol{\hat E}}_A^{i + 1} = \sum {\mathit{\boldsymbol{S}}_t^A} {\left( {\Delta \mathit{\boldsymbol{\hat p}}_t^A} \right)^{i + 1}} $，其中$ {\left( {\Delta \mathit{\boldsymbol{\hat p}}_t^A} \right)^{i + 1}}$和${\mathit{\boldsymbol{S}}_t^A} $分别为系数矩阵随机量的改正数及其对应的仿射矩阵，进一步得${{\mathit{\boldsymbol{\hat A}}}^{i + 1}} = \mathit{\boldsymbol{A}} - \mathit{\boldsymbol{\hat E}}_A^{i + 1} $。最终通过式(8)可得估计参数$ {{\mathit{\boldsymbol{\hat x}}}^{i + 1}}$。

4) 重复步骤2)、3)，直到满足条件$ \left\| {{{\mathit{\boldsymbol{\hat x}}}^{i + 1}} - {{\mathit{\boldsymbol{\hat x}}}^i}} \right\| < \varepsilon $，其中ε为参数迭代收敛精度。

2.2 精度评定

TLS估计是EIV模型下的最优无偏估计，当观测个数趋近于无穷时，估计结果具有渐进无偏性和弱一致性，同时EIV模型本质上是非线性模型，因而一般情况下只能获得TLS估计的有偏单位权中误差^[4]以及参数估值和误差估值的一阶近似精度^[19]。

根据TLS原理，上述STLS法的单位权方差可以表示为：

$ {{\hat \sigma }^2} = {\rm{\Delta }}{{\mathit{\boldsymbol{\hat p}}}^{\rm{T}}}\mathit{\boldsymbol{P}}{\rm{\Delta }}\mathit{\boldsymbol{\hat p}}/r $

(11)

式中，r为自由度，即$ r = m - n$。

对于参数估值的精度，由参数平差模型式(7)和参数估值迭代表达式(8)，根据协因数传播律可得：

$ {\mathit{\boldsymbol{Q}}_{\hat x\hat x}} = {\left[ {{{\mathit{\boldsymbol{\hat A}}}^i}^{\rm{T}}{{(Q_{{V_f}}^i)}^{ - 1}}{{\mathit{\boldsymbol{\hat A}}}^i}} \right]^{ - 1}} $

(12)

对于随机误差估值的精度，由条件平差模型式(9)和随机误差估值迭代表达式(10)，根据协因数传播律可得：

$ {\mathit{\boldsymbol{Q}}_{{\rm{\Delta }}\hat p{\rm{\Delta }}\hat p}} = {\mathit{\boldsymbol{P}}^{ - 1}}{\mathit{\boldsymbol{G}}^{\rm{T}}}\left( {{{\hat x}^i}} \right){(\mathit{\boldsymbol{Q}}_{{V_f}}^i)^{ - 1}}\mathit{\boldsymbol{G}}\left( {{{\hat x}^i}} \right){\mathit{\boldsymbol{P}}^{ - 1}} $

(13)

3 算例与分析 3.1 算例1（直线拟合问题）

本文选取了直线拟合的算例^{[4-6, 23]}，称为算例1。此算例给出10组观测点，它们的坐标和对应的权值P_x、P_y见表 1。将其用于拟合直线的截距ξ₁和斜率ξ₂，对应的EIV模型如下：

表 1 观测点和对应的权 Tab. 1 Observed Points and Corresponding Weights

点号	x	y	P_x	P_y
1	0.0	5.9	1 000.0	1.0
2	0.9	5.4	1 000.0	1.8
3	1.8	4.4	500.0	4.0
4	2.6	4.6	800.0	8.0
5	3.3	3.5	200.0	20.0
6	4.4	3.7	80.0	20.0
7	5.2	2.8	60.0	70.0
8	6.1	2.8	20.0	70.0
9	6.5	2.4	1.8	100.0
10	7.4	1.5	1.0	500.0

表选项

$ \left[ {\begin{array}{*{20}{c}} {{y_1}}\\ {{y_2}}\\ \vdots \\ {{y_{10}}} \end{array}} \right] - \left[ {\begin{array}{*{20}{c}} {{e_{{y_1}}}}\\ {{e_{{y_2}}}}\\ \vdots \\ {{e_{{y_{10}}}}} \end{array}} \right] = \left( {\left[ {\begin{array}{*{20}{c}} {\begin{array}{*{20}{c}} 1\\ 1\\ \vdots \\ 1 \end{array}}&{\begin{array}{*{20}{c}} {{x_1}}\\ {{x_2}}\\ \vdots \\ {{x_{10}}} \end{array}} \end{array}} \right] - \left[ {\begin{array}{*{20}{c}} {\begin{array}{*{20}{c}} 0\\ 0\\ \vdots \\ 0 \end{array}}&{\begin{array}{*{20}{c}} {{e_{{x_1}}}}\\ {{e_{{x_2}}}}\\ \vdots \\ {{e_{{x_{10}}}}} \end{array}} \end{array}} \right]} \right) \cdot \left[ {\begin{array}{*{20}{c}} {{\xi _1}}\\ {{\xi _2}} \end{array}} \right] $

(14)

由式(14)可知，系数矩阵中第一列为固定值，故本质上直线拟合是STLS问题。分别采用本文算法、文献[6]、文献[8]和文献[23]中相关算法进行计算，各种算法参数收敛阈值设为1×10^-10，最终得到的各种算法的参数估值、验后单位权方差、参数估值精度均一致(见表 2)。

表 2 不同方法所得结果 Tab. 2 Results of Different Methods

估计方法	${{\hat \xi }_1} $	${{\hat \xi }_2} $	$ {{\hat \sigma }_0}$	$\hat \sigma _{{{\hat \xi }_1}}^2 $	$\hat \sigma _{{{\hat \xi }_2}}^2 $
文献[6]	5.479 910 224	－0.480 533 407	1.217 905 641	0.129 058 064	0.004 987 223
文献[8]	5.479 910 224	－0.480 533 407	1.217 905 641	0.129 058 064	0.004 987 223
文献[23]	5.479 910 224	－0.480 533 407	1.217 905 641	0.129 058 064	0.004 987 223
本文算法	5.479 910 224	－0.480 533 407	1.217 905 641	0.129 058 064	0.004 987 223

表选项

3.2 算例2（约束STLS问题）

针对系数矩阵存在固定列、随机量和固定量的情况，设计算例说明本文算法的特点。在三维空间直角坐标系O-XYZ中存在平面Z=ζ₁+ζ₂X+ζ₃Y，其中ζ₁、ζ₂、ζ₃的真值分别为1、8、9。给定区间[－10, 10]，X和Y分别取上述区间中的整数，生成121组平面坐标，将其依次代入上述平面方程得到相应的坐标分量Z，形成121组空间坐标。同时建立线性约束方程25=ζ₁+7.5ζ₂-4^*ζ₃，其中4^*表示其存在随机误差。故算例2对应的EIV模型如下：

$ \left[ {\begin{array}{*{20}{c}} {{Z_1}}\\ {{Z_2}}\\ \vdots \\ {{Z_{121}}}\\ {{Z_{122}}} \end{array}} \right] - \left[ {\begin{array}{*{20}{c}} {{e_{{Z_1}}}}\\ {{e_{{Z_2}}}}\\ \vdots \\ {{e_{{Z_{121}}}}}\\ 0 \end{array}} \right] = \left( {\left[ {\begin{array}{*{20}{c}} 1&{{X_1}}&{{Y_1}}\\ 1&{{X_2}}&{{Y_2}}\\ \vdots & \vdots & \vdots \\ 1&{{X_{121}}}&{{Y_{121}}}\\ 1&{{X_{122}}}&{{Y_{122}}} \end{array}} \right] - \left[ {\begin{array}{*{20}{c}} 0&{{e_{{X_1}}}}&0\\ 0&0&{{e_{{Y_2}}}}\\ \vdots & \vdots & \vdots \\ 0&{{e_{{X_{121}}}}}&0\\ 0&0&{{e_{{Y_{122}}}}} \end{array}} \right]} \right) \cdot \left[ {\begin{array}{*{20}{c}} {{\zeta _1}}\\ {{\zeta _2}}\\ {{\zeta _3}} \end{array}} \right] $

(15)

为了说明本文算法的有效性，避免模拟数据偶然性的影响，以上述系数矩阵和观测向量中随机量的真值为基础，假设各随机量等精度不相关(其他误差特性情况与此相同)，分别模拟不同的噪声水平σ，并生成10 000组随机误差，计算本文算法和LS法的参数估值，并得到各种统计量，其中均方根误差（root mean square, RMS）：

$ \text{RM}{{\text{S}}_{{{\zeta }_{i}}}}=\sqrt{\sum\limits_{j=1}^{10000}{{{\left( \hat{\zeta }_{i}^{j}-{{{\tilde{\zeta }}}_{i}} \right)}^{2}}}/10\ 000},i=1,2,3 $

(16)

平均值为：

$ {{\operatorname{MEAN}}_{{{\xi }_{i}}}}={{{\bar{\zeta }}}_{i}}=\frac{1}{10\ 000}\sum\limits_{j=1}^{10\ 000}{\hat{\zeta }_{i}^{j}},i=1,2,3 $

(17)

标准差为：

$ \text{ST}{{\text{D}}_{{{\xi }_{i}}}}=\sqrt{\sum\limits_{j=1}^{10\ 000}{{{\left( \hat{\zeta }_{i}^{j}-{{{\bar{\zeta }}}_{i}} \right)}^{2}}}/10\ 000},i=1,2,3 $

(18)

最大值为：

$ \text{MA}{{\text{X}}_{{{\zeta }_{i}}}}=\text{max}\left( \hat{\zeta }_{i}^{j} \right),i=1,2,3,j=1\sim 10\ 000 $

(19)

最小值为：

$ \text{MI}{{\text{N}}_{{{\zeta }_{i}}}}=\text{min}\left( \hat{\zeta }_{i}^{j} \right),i=1,2,3,j=1\sim 10\ 000 $

(20)

式中，${\hat \zeta } $和$ {\bar \zeta }$分别为参数估值及其真值。各参数估值统计量结果如图 1所示(横坐标为随机量的噪声水平)，图 2给出了σ=1时LS法和STLS法10 000次模拟实验所得参数估值真误差的统计直方图。令σ=0.5，计算了4组不同参数下随机量的模拟误差和估计误差，绘于图 3(横坐标中1~61、62~122对应系数矩阵中第二、三列各随机量，123~243对应观测向量中各随机量)。

图 1 LS法和STLS法参数估值统计量 Fig. 1 Statistics of Estimated Parameters from LS and STLS

图选项

图 2 LS和STLS参数估值真误差统计直方图(σ=1) Fig. 2 Histograms of Estimated Parameter Errors from LS and STLS(σ=1)

图选项

图 3 不同参数下随机量模拟误差和估计误差 Fig. 3 Simulated and Estimated Random Errors in Different Parameters

图选项

3.3 算例分析

由算例1可知，本文算法与已有的WTLS法、STLS法或者混合LS-TLS法的参数估计结果相同，验证了本文算法的有效性。WTLS法需要遵循一系列定权规则，如系数矩阵中固定值需设置零权，当系数矩阵固定值较多或者结构性较强时，权矩阵构造将相当复杂；进一步，随着观测方程的增加，估计参数将增多，影响计算效率。而其他STLS法将系数矩阵中的随机量提取为最优化随机向量，与本文算法的计算效率相当。同时，本文考虑了系数矩阵和观测向量的完整结构，并采用变量投影法给出了系数矩阵和观测向量一致化的结构特征提取方法，该方法规律性很强，易于编程实现。

算例2给出了一个系数矩阵含有固定列、固定量和随机量的空间平面拟合问题，同时设计了一个特殊的约束方程。由图 1可知，由于LS法忽略了系数矩阵中的随机量，所以LS法得到的参数估值各项统计量显著大于本文算法，并且参数估值各项统计量随噪声水平的上升增加速度也更快。STLS法得到的参数估值的均值并未随噪声水平的上升而发生显著变化，一直维持在0附近，最大误差和最小误差关于0对称。而LS法得到的参数估值的最大误差和最小误差并无此对称特征，且最大误差随着噪声水平的上升趋势性也发生变化。对比图 2可知，STLS法得到的参数估值真误差的统计直方图更趋近于正态分布，而LS法得到的参数估值真误差的统计直方图整体向横轴负方向偏移，均值为负数，最小误差绝对值大于最大误差，与正态分布的一致性较差，说明LS法并不是EIV模型的最优估计方法。图 3给出了4组不同参数下随机量的模拟误差和估计误差，对图 3中模拟误差和估计误差一致性出现的位置进行统计，具体见表 3。

表 3 模拟误差和估计误差一致性出现的位置统计 Tab. 3 Consistency Between Simulated Errors and Estimated Random Errors

统计项	参数
统计项	(1, 0.1, 0.1)	(1, 0.1, 10)	(10, 10, 10)	(10, 10, 1)
误差一致性出现的位置	观测向量中随机量	系数矩阵中与ζ₃对应列中的随机量	系数矩阵中全部随机量	系数矩阵中与ζ₂对应列中的随机量

表选项

由式(10)可知，系数矩阵和观测向量中的随机量估计误差与观测权阵、平差系统的几何结构、参数、随机量观测误差等相关。在等权情况下，系数矩阵的增广矩阵中第k列所对应的估计误差为：

$ {\rm{\Delta }}{\mathit{\boldsymbol{p}}_k} = {\hat x_k}{\mathit{\boldsymbol{\psi }}_k}\mathit{\boldsymbol{Q}}_V^{ - 1}\left( {\mathit{\boldsymbol{y}} - \mathit{\boldsymbol{A}}\hat x} \right) $

(21)

式中，ψk为ψ中与${\mathit{\boldsymbol{\widehat x}}_k} $对应的子矩阵。算例2中${\hat x_{{\rm{ext}}}} = {\left[ {\begin{array}{*{20}{c}} {\hat x}&{ - 1} \end{array}} \right]^{\rm{T}}} = {\left[ {\begin{array}{*{20}{c}} {{{\hat \zeta }_1}}&{{{\hat \zeta }_2}}&{{{\hat \zeta }_3}}&{ - 1} \end{array}} \right]^{\rm{T}}} $。不难看出，${{{\hat \zeta }_1}} $对应列中的随机量无误差，-1对应列为观测向量。同时，假设系数矩阵的增广矩阵不存在结构特征，即${\mathit{\boldsymbol{\psi }}_k} = {\mathit{\boldsymbol{I}}_m} $，式(21)进一步简化为：

$ {\rm{\Delta }}{\mathit{\boldsymbol{p}}_k} = {{\hat x}_k}{({\mathit{\boldsymbol{Q}}_{{V_f}}})^{ - 1}}\left( {\mathit{\boldsymbol{y}} - \mathit{\boldsymbol{A}}\hat x} \right) $

(22)

式（22）说明，系数矩阵的增广矩阵中各列对应的估计误差成比例，比例系数即为参数。由上述分析可知，估计误差存在公共项${\mathit{\boldsymbol{Q}}_{{V_f}}}^{ - 1}\left( {\mathit{\boldsymbol{y}} - \mathit{\boldsymbol{A}}\hat x} \right) $，在其基础上，根据随机量的观测权阵、结构特征以及所对应的参数对估计误差进行调整，其中参数可以视为估计误差的尺度因子。估计误差的这种相关性说明，虽然系数矩阵中的随机量可以视为观测量，但是其在平差系统中并不构成冗余观测，故TLS法得到的估计误差并不能完全反映观测误差。

对表 3进行分析可知，参数的相对大小对估计误差的一致性有直接影响，即在系数矩阵的增广矩阵中，当某列所对应的参数绝对值相对其他列大时，该列中的随机量估计误差与模拟误差更一致，其中观测向量中随机量对应的参数恒为－1。

4 结语

EIV模型下的TLS估计具有弱一致性和渐进无偏性。在实际应用中，测量问题对应的EIV模型多存在结构特征，对于这种情况可以采用两类方法解决：①通过设计特殊结构的权阵，采用WTLS法求解；②采用STLS法或者混合LS-TLS法求解。本文考虑了EIV模型的结构特征及随机量之间的线性关系，将EIV模型表示为非线性GH模型，采用非线性最小二乘原理推导了一种STLS法。

本文采用变量投影法将系数矩阵的增广矩阵展开成仿射函数形式，考虑了系数矩阵和观测向量的结构特征，可以用于自回归问题的求解。同时变量投影法对EIV模型结构特征的描述简洁，规律性强，易于编程实现。

本文将STLS问题转换为附加参数的条件平差问题进行求解，因而可以将TLS问题纳入到最小二乘平差理论体系中，有利于TLS方法的扩展，例如总体最小二乘方差分量估计。

EIV模型下观测量(随机量)的概念得到了广义化，系数矩阵中的元素同样也是一种观测量。通过算例2可知，固定量的概念也可以进行广义化，固定量不仅可以存在于系数矩阵中，同时也可以存在于观测向量中。①当观测向量中部分元素为固定量时，可以将其表示为约束方程，但不同于一般的约束方程，此时要求约束方程中的系数至少有一个存在随机性，以保证式(7)中V_f协因数矩阵的正定性；②当观测向量为固定列时，平差问题变成结构数据最小二乘(structural data least squares，SDLS)平差，此时本文算法相当于SDLS法；③当系数矩阵中元素全部为固定量时，Partial EIV模型转化为灰色模型，此时本文算法相当于LS法。因此本文算法统一了普通的STLS法、SDLS法以及LS法，更具有一般性。

TLS法得到EIV模型下参数的渐进无偏估计，但是TLS法得到的系数矩阵和观测向量中随机量的估计误差与真误差的一致性关系相对复杂。本文在等权条件下对平面拟合问题中估计误差与模拟误差的一致性问题进行了定性分析，并得到了有益的结论。估计误差与真误差的一致性对于基于TLS法的抗差估计问题^[24]以及预测回归问题^[25]至关重要，一致的估计误差是抗差有效性的基础，也是预测可靠性的前提。

参考文献

[1]	Goulub G H, van Loan C. An Analysis of the Total Least Squares Problem[J]. SIAM Journal on Numerical Analysis, 1980, 17(6): 883-893. DOI:10.1137/0717073

[2]	van Huffel S, van Dewalle J. The Total Least- Squares Problem: Computational Aspects and Analysis[M]. Philadelphia: Society for Industrial and Applied Mathematics, 1991.

[3]	Fang Xing. Weighted Total Least Squares: Necessary and Sufficient Conditions, Fixed and Random Parameters[J]. Journal of Geodesy, 2013, 87(8): 733-749. DOI:10.1007/s00190-013-0643-2

[4]	Shen Yunzhong, Li Bofeng, Chen Yi. An Itera- tive Solution of Weighted Total Least Squares Adjustment[J]. Journal of Geodesy, 2011, 85(4): 229-238. DOI:10.1007/s00190-010-0431-1

[5]	Amiri-Sinkooei A R, Jazaeri S. Weighted Total Least Squares Formulated by Standard Least Squares Theory[J]. Journal of Geodetic Science, 2012, 2(2): 113-124.

[6]	Jazeri S, Amiri-Sinkooei A R, Sharifi M A. Iterative Algorithm for Weighted Total Least Squares Adjustment[J]. Survey Review, 2014, 46(334): 19-27. DOI:10.1179/1752270613Y.0000000052

[7]	Hu Chuan, Chen Yi. An Iterative Algorithm for Nonlinear Total Least Squares Adjustment[J]. Acta Geodaetica et Cartographica Sinica, 2014, 43(7): 668-674. (胡川, 陈义. 非线性整体最小平差迭代算法[J]. 测绘学报, 2014, 43(7): 668-674. )

[8]	Fang Xing. A Structured and Constrained Total Least-Squares Solution with Cross-Covariances[J]. Studia Geophysica et Geodaetica, 2014, 58(1): 1-16. DOI:10.1007/s11200-012-0671-z

[9]	Zeng Wenxian, Liu Jingnan, Yao Yibin. On Partial Errors-in-Variables Models with Inequality Constraints of Parameters and Variables[J]. Journal of Geodesy, 2015, 89(2): 111-119. DOI:10.1007/s00190-014-0775-z

[10]	Yao Yibin, Huang Shuhua, Kong Jian, et al. Total Least Squares Algorithm for Fitting Spatial Straight Lines[J]. Geomatics and Information Science of Wuhan University, 2014, 39(5): 571-574. (姚宜斌, 黄书华, 孔建, 等. 空间直线拟合的整体最小二乘算法[J]. 武汉大学学报·信息科学版, 2014, 39(5): 571-574. )

[11]	Wei Erhu, Yin Zhixiang, Li Guangwen, et al. On 3D Coordinate Transformations with Virtual Observation Method[J]. Geomatics and Information Science of Wuhan University, 2014, 39(2): 152-156. (魏二虎, 殷志祥, 李广文, 等. 虚拟观测值法在三维坐标转换中的应用研究[J]. 武汉大学学报·信息科学版, 2014, 39(2): 152-156. )

[12]	Ma Youqing, Liu Shaochuang, Wei Shiyan, et al. Terrestrial Analytical Photogrammetry with Weighted Total Least Squares[J]. Geomatics and Information Science of Wuhan University, 2015, 40(5): 594-598. (马友青, 刘少创, 魏士俨, 等. 加权总体最小二乘的地面解析摄影测量算法[J]. 武汉大学学报·信息科学版, 2015, 40(5): 594-598. )

[13]	Chen Yi, Lu Jue, Zheng Bo. Application of Total Least Squares to Space Resection[J]. Geomatics and Information Science of Wuhan University, 2008, 33(12): 1 271-1 274. (陈义, 陆珏, 郑波. 总体最小二乘方法在空间后方交会中的应用[J]. 武汉大学学报·信息科学版, 2008, 33(12): 1 271-1 274. )

[14]	Wang Leyang, Xu Guangyu. The Effect of the Random Coefficient Matrix on Adjustment of the Inversion of Crust Strain Parameters Model[J]. Geomatics and Information Science of Wuhan University, 2017, 42(10): 1 453-1 460. (王乐洋, 许光煜. 系数矩阵误差对地壳应变参数反演的影响[J]. 武汉大学学报·信息科学版, 2017, 42(10): 1 453-1 460. )

[15]	Yao Yibin, Huang Shuhua, Chen Jiajun. A New Method of TLS to Solving the Autoregressive Model Parameter[J]. Geomatics and Information Science of Wuhan University, 2014, 39(12): 1 463-1 466. (姚宜斌, 黄书华, 陈家君. 求解自回归模型参数的整体最小二乘新方法[J]. 武汉大学学报·信息科学版, 2014, 39(12): 1 463-1 466. )

[16]	Rosen J B, Park H, Glick J. Total Least Norm Formulation and Solution for Structured Problems[J]. SIAM Journal on Matrix Analysis & Applications, 1996, 17(1): 110-126.

[17]	de Moor B. Total Least Squares for Affinely Structured Matrices and the Noisy Realization Problem[J]. IEEE Transactions on Signal Processing, 1994, 42(11): 3 104-3 113. DOI:10.1109/78.330370

[18]	Markovsky I, van Huffel S, Pintelon R. Block- Toeplitz/Hankel Structured Total Least Squares[J]. SIAM Journal on Matrix Analysis & Applications, 2005, 26(4): 1 083-1 099.

[19]	Xu Peiliang, Liu Jiangnan, Shi Chuang. Total Least Squares Adjustment in Partial Errors-in-Variables Models: Algorithm and Statistical Analysis[J]. Journal of Geodesy, 2012, 86(8): 661-675. DOI:10.1007/s00190-012-0552-9

[20]	Teunissen P J G. The Geometry of Geodetic Inverse Linear Mapping and Nonlinear Adjustment[M]. Delft: Netherlands Geodetic Commission, 1985.

[21]	Snow K. Topics in Total Least-Squares Adjustment within the Errors-in-Variables Model: Singular Cofactor Matrices and Priori Information[D]. Columbus: The Ohio State University, 2012

[22]	Fang Xing, Zeng Wenxian, Liu Jingnan, et al. Mixed LS-TLS Estimation Based on Nonlinear Gauss-Helmert Model[J]. Acta Geodaetica et Cartographica Sinica, 2016, 45(3): 291-296. (方兴, 曾文宪, 刘经南, 等. 基于非线性高斯-赫尔默特模型的混合整体最小二乘估计[J]. 测绘学报, 2016, 45(3): 291-296. )

[23]	Wang Leyang, Xu Guangyu, Wen Guisen. A Method for Partial EIV Model with Correlated Observation[J]. Acta Geodaetica et Cartographica Sinica, 2017, 46(8): 978-987. (王乐洋, 许光煜, 温贵森. 一种相关观测的Partial EIV模型求解方法[J]. 测绘学报, 2017, 46(8): 978-987. )

[24]	Lu Jue, Chen Yi, Li Bofeng, et al. Robust Total Least Squares with Reweighting Iteration for Three-Dimensional Similarity Transformation[J]. Survey Review, 2014, 46(334): 28-36. DOI:10.1179/1752270613Y.0000000050

[25]	Li Bofeng, Shen Yunzhong, Li Weixiao. The Seamless Model for Three-Dimensional Datum Transformation[J]. Science China: Earth Science, 2012, 55(12): 2 099-2 108. DOI:10.1007/s11430-012-4418-z