Computationally efficient trajectory tracking control of AUVs with nonlinear model predictive control using neuralbased dynamics modeling
Copyright © The Korean Society of Marine Engineering
This is an Open Access article distributed under the terms of the Creative Commons Attribution NonCommercial License (http://creativecommons.org/licenses/bync/3.0), which permits unrestricted noncommercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
Abstract
A computationally efficient nonlinear model predictive control (MPC) scheme is presented for the problem of trajectory tracking control, in the 3D space, of autonomous underwater vehicles (AUVs) under constrained speeds and thruster forces, and possibly in the presence of disturbances and actuator limitations. The proposed scheme considers AUV dynamics with four degrees of maneuverability and uses an accurate modeling of the discretetime dynamics of the system using a suitable neural network that enables efficient state propagations and MPC cost gradient computations owing to the parallel computation structure of the network, thereby allowing a more efficient solution of the constrained nonlinear MPC optimization problem using a sequential quadratic programmingbased approach. This also paves way for MPC optimizations over larger time horizons which may be necessary under certain situations. The effectiveness of the proposed scheme is verified with extensive simulations covering various scenarios including the ones that deal with the presence of random and nonrandom timevarying disturbances and, additionally, the condition of underactuation of the vehicle due to the failure of an actuator.
Keywords:
Trajectory tracking, Nonlinear model predictive control, Autonomous underwater vehicles, Underactuated AUV, Computationally efficient NMPC1. Introduction
Over the past several years, there has been an increasing academic, technological and commercial interest in unmanned autonomous systems operating in various environments – on ground and water, in air and space and under the sea [1]. An autonomous underwater vehicle (AUV) is a computer controlled robotic vehicle equipped with a propulsion system enabling its maneuvering in the threedimensional (3D) space under the sea. Like other unmanned autonomous systems, AUVs have extensive scientific, military and commercial applications such as mapping and imaging of underwater landscapes, deepsea exploration of marine life, inspection of underwater infrastructure, detection and clearance of pollutants and mines, undersea search and rescue and so on (see, e.g., [2][5]). Behind the successful design and operation of AUVs are various enabling technologies in the field of sensing, propulsion, communications, guidance, navigation and control [6].
Target or trajectory tracking ability is very crucial for most AUV applications and this requires sophisticated navigation and control schemes since an AUV and the environment that it operates in are characterized by highdimensional nonlinear dynamics, often uncertain, constrained and underactuated, and usually unpredictable oceanic disturbances. Research on reliable control of AUVs is extensive and continuing [7]. A survey of early works on AUV control can be found in [8]. Traditional control design for AUVs includes various analytical design approaches employing nonlinear feedback [9], gainscheduled or parameterized linear feedback [10], inputoutput feedback linearization [11], and so on. To deal with control challenges owing to the highly nonlinear dynamics involving parameter uncertainties and disturbances, techniques such as sliding mode control [12] and adaptive control [13] have been tried and continue to be explored [14][16]. Alternative approaches that have been extensively explored include the more computationoriented techniques such as those using fuzzy logic [17], and neural networks [18][19].
Control approaches explored in the works mentioned above do not directly deal with state and input constraints whose satisfaction is usually crucial in trajectory tracking applications. Model predictive control (MPC) is a promising computationoriented control approach that handles constraints in a systematic way [20]. This approach is based on solving an openloop control optimization problem online in real time and, with the advancements in computing hardware speed, the technology has become applicable for an increasing variety of linear and nonlinear control applications including trajectory planning and tracking applications. Theoretical analysis of the method has also matured over the years [21]. MPC has been used for trajectory tracking control of marine vehicles in several earlier works [22][28]. Nonlinear MPC is used together with an unscented Kalman filter for positioning of ships in [22]. In [23], the authors present a combination of MPC and genetic algorithms to achieve dynamic trajectory tracking. Similarly, a Lyapunovbased MPC is explored for robust trajectory tracking of AUVs in [24]. An eventtriggered nonlinear MPC is presented in [25] for trajectory tracking by an underactuated ship with reduced computational burden and a similar approach is used for AUVs in the presence of disturbances in [26]. In these schemes, the MPC optimization is carried out only when the vehicle’s deviation from the desired state is outside a chosen bound. In [27], a robust nonlinear MPC is designed to achieve robust trajectory tracking of underactuated AUVs in the presence of disturbance inputs.
While MPC offers a remarkable control performance, it is computationally demanding, which is particularly significant for nonlinear systems. So, a fast and reliable algorithm to solve the nonlinear MPC problem is always a necessity [28]. Since MPC is primarily envisaged in the discretetime (DT) domain, an accurate DT model of the system is required, and is not readily obtained for a nonlinear system. Most MPC schemes use DT models obtained using a simple Euler or RungeKutta (RK)based discretization, which requires a very small sampling period to be considered. This limits the horizon length over which the MPC optimization can be carried out in real time, and it can be a drawback in certain situations in trajectory tracking.
In this work, we present a trajectory tracking nonlinear MPC scheme that is based on an accurate modeling of the DT dynamics of the AUV and therefore allows an MPC optimization over a larger time horizon. In particular, we consider an AUV with four degrees of maneuverability and model its DT dynamics for a suitable sampling time with a feedforward neural network (NN). A network with a single hidden layer is chosen so that nested or sequential computations can be avoided during cost gradient computations. Such a model can be obtained for sampling intervals significantly larger than those that can be used with the approximate methods. Following [29], we present a sequential quadratic programming (sQP) algorithm to solve the resulting NMPC problem for trajectory tracking. The sQP algorithm ensures feasibility after each iteration and uses a simple trust region constraint to achieve convergence.
We verify the performance of the proposed tracking control scheme through extensive nonlinear simulations for an AUV mission in ideal and nonideal scenarios. Nonideal conditions include the presence of external disturbances and/or the failure of the control component along the transverse direction. Simulations show the effectiveness of the proposed approach.
Notations: I_{n} denotes an identity matrix of size n, 0_{m×n} denotes an m×n matrix of all zeros, and 1_{n} represents a ndimensional vector of all ones. If any subscript is omitted, the dimension should be clear from the context. For a vector v, diag(v) represents a diagonal matrix with the elements of v along the diagonal, and with entities Δ_{1}, Δ_{2}, …, diag(Δ_{1}, Δ_{2}, …) represents a (block) diagonal matrix with Δ_{1}, Δ_{2}, … along the diagonal. For a vector $$ \mathbf{v},{\Vert \mathbf{v}\Vert}_{\mathit{Q}}^{2}$$ with a symmetric matrix Q represents the quadratic form $$ {\mathbf{v}}^{T}Q\mathbf{v}$$. For a DT signal x(k), x(k + ik) denotes the future signal value x(k + i) predicted at time k.
2. System Description
2.1 Mathematical Model of AUV Motion
The motion of an AUV, like that of any marine vehicle, is conveniently described in 6 degrees of freedom (DOF) with its pose specified by its position p=(x, y, z) and its orientation θ=(ϕ, θ, ψ) (in terms of Euler angles) in an Earthfixed coordinate frame X_{E}Y_{E}Z_{E} and its linear and angular velocities v=(u, v, w) and ω=(p, q, r) specified in a bodyfixed reference frame X_{B}Y_{B}Z_{B} as illustrated in Figure 1. The origin O of the bodyfixed frame is chosen at the vehicle’s center of gravity (CG) and its axes are chosen to coincide with the three principal axes of inertia – longitudinal, transverse and normal. So, the linear velocity components u, v and w correspond to surge, sway and heave speeds and the angular velocity components p, q and r correspond to roll, pitch and yaw speeds.
The kinematic relationship between the pose vectors and the velocity vectors are given by
$$ \dot{\mathbf{p}}={\mathit{J}}_{1}\left(\mathbf{\theta}\right)\mathbf{v},\dot{\mathbf{\theta}}={\mathit{J}}_{2}\left(\mathbf{\theta}\right)\mathbf{\omega}$$
where J_{1} (θ) and J_{2} (θ) are 3×3 transformation matrices with elements depending on ϕ, θ, ψ (see, e.g., [30] for the details).
In this paper, we consider underwater vehicles equipped with metacentric restoring forces preventing roll and pitch motions so that the roll and pitch angles ϕ and θ and the angular speeds p and q are negligible and close to zero. Under this assumption, the reducedorder kinematic relationship is described by
$$$\dot{\mathbf{p}}={\mathit{J}}_{1}\left(\psi \right)\mathbf{v}=\left[\begin{array}{ccc}\mathrm{c}\mathrm{o}\mathrm{s}\psi & \mathrm{s}\mathrm{i}\mathrm{n}\psi & 0\\ \mathrm{s}\mathrm{i}\mathrm{n}\psi & \mathrm{}\mathrm{}\mathrm{}\mathrm{c}\mathrm{o}\mathrm{s}\psi & 0\\ 0& 0& 1\end{array}\right]\mathbf{v},\dot{\psi}=r$$$  (1) 
Further, the dynamics of the vehicle are specified in terms of the combined velocity vector υ=(v, r) by the relationship
$$$\mathit{M}\dot{\mathit{\upsilon}}+\mathit{C}\left(\mathbf{\upsilon}\right)\mathbf{\upsilon}+\mathit{D}\left(\mathbf{\upsilon}\right)\mathbf{\upsilon}+\mathbf{g}=\mathbf{\tau}+\mathbf{d}$$$  (2) 
where M is the matrix of mass/inertia components (including the added mass/inertia), C(υ) is the matrix of Coriolis terms (including that of added mass/inertia), D(υ) is the hydrodynamic damping matrix, g is the vector of gravitational forces and moments, τ is the control input vector comprising forces and torques and d is the vector of disturbance forces/torques, mainly due to water currents. Assuming the AUV body to be symmetric about the xy and xz planes, and that it has a slightly positive buoyancy, we have,
$$$\begin{array}{l}=\mathrm{d}\mathrm{i}\mathrm{a}\mathrm{g}\left({m}_{X},{m}_{Y},{m}_{Z},{I}_{Z}\right),\\ \left(\mathbf{\upsilon}\right)=\left[\begin{array}{cccc}0& 0& 0& {m}_{Y}v\\ 0& 0& 0& {m}_{X}u\\ 0& 0& 0& 0\\ {m}_{Y}v& {m}_{X}u& 0& 0\end{array}\right]\\ \mathit{D}\left(\mathbf{\upsilon}\right)=\begin{array}{l}\mathrm{d}\mathrm{i}\mathrm{a}\mathrm{g}\left({X}_{u},{Y}_{v},{Z}_{w},{N}_{r}\right)\\ \mathrm{d}\mathrm{i}\mathrm{a}\mathrm{g}\left({X}_{u\leftu\right}\leftu\right,{Y}_{v\leftv\right}\leftv\right,{Z}_{w\leftw\right}\leftw\right,{N}_{r\leftr\right}\leftr\right\right)\end{array}\\ \mathbf{g}={\left[00BW0\right]}^{T}\\ \mathbf{\tau}={\left[{\tau}_{X}{\tau}_{Y}{\tau}_{Z}{\tau}_{N}\right]}^{T}\\ \mathbf{d}={\left[{d}_{X}{d}_{Y}{d}_{Z}{d}_{N}\right]}^{T}\end{array}$$$ 
where W and B are the weight and the buoyancy of the vehicle, and, m_{X}, m_{Y}, m_{Z}, I_{Z} are mass/inertia terms (including added mass/inertia), X_{u}, Y_{v}, Z_{w}, N_{r} and X_{uu} , Y_{vv} , Z_{ww} , N_{rr} are hydrodynamic damping terms, τ_{X}, τ_{Y}, τ_{Z}, τ_{N} are the control forces/torque generated by the thrusters and d_{X}, d_{Y}, d_{Z}, d_{N} are disturbance forces/torque effective along/about the surge, sway, heave and yaw directions.
Thus, combining Equations (1) and (2), we can write the mathematical description of the AUV motion in the 3D space in the form of the following nonlinear statespace model
$$$\dot{\mathbf{x}}=\mathit{f}\left(\mathbf{x},\mathbf{u},\mathbf{d}\right),\mathbf{y}=C\mathbf{x}$$$  (3) 
where $$ \mathbf{x}={\left[xyz\psi uvwr\right]}^{T}\in {\mathbb{R}}^{8}$$ is the state vector, $$ \mathbf{u}=\mathbf{\tau}\in {\mathbb{R}}^{4}$$ is the control input vector, $$ \mathbf{d}\in {\mathbb{R}}^{4}$$ is the disturbance input vector, and
$$$\mathit{f}\left(\mathbf{x},\mathbf{u},\mathbf{d}\right)=\left[\begin{array}{c}u\mathrm{cos}\psi v\mathrm{sin}\psi \\ u\mathrm{sin}\psi +v\mathrm{cos}\psi \\ w\\ r\\ \frac{1}{{m}_{X}}({m}_{Y}vr+{(X}_{u}+{X}_{u\leftu\right}\leftu\right)u+{\tau}_{X}+{d}_{X})\\ \frac{1}{{m}_{Y}}({m}_{X}ur+{(Y}_{v}+{Y}_{v\leftv\right}\leftv\right)v+{\tau}_{Y}+{d}_{Y})\\ \frac{1}{{m}_{Z}}(WB+{(Z}_{w}+{Z}_{w\leftw\right}\leftw\right)w+{\tau}_{Z}+{d}_{Z})\\ \frac{1}{{I}_{Z}}\left(\right({m}_{X}{m}_{Y})uv+{(N}_{r}+{N}_{r\leftr\right}\leftr\right)r+{\tau}_{N}+{d}_{N})\end{array}\right]$$$ 
Further, the output vector $$ ={\left[xyz\psi \right]}^{T}\in {\mathbb{R}}^{4}$$ so that $$ =\left[{I}_{4}{0}_{4\times 4}\right]$$.
Remark 1: The motion of an AVU described by (3) has 4 DOF and is supported by control inputs along the 4 directions. The AUV can still be maneuvered if the sway input component τ_{Y} is unavailable or the corresponding actuator is faulty. A vehicle in such a situation is considered to be underactuated. Several works on AUV trajectory tracking control have focused on underactuated AUVs (e.g., [27])
2.2 Problem Description
We wish to control the motion of an AUV for missions that require tracking a constant target point or tracking a specified timevarying reference trajectory in some optimal way while satisfying all requirements including system constraints.
The motion of an AUV is usually constrained by limitations on the values of control inputs and state components; that is, the limits on the thruster forces and torque that can be generated and the limits on the velocities that are tolerated in each direction. Considering that the control input components and the velocity components are bounded in their absolute values, we express the constraints in collective forms mentioned below:
$$$\mathbf{u}=\mathbf{\tau}\in \mathbb{U}=\left\{\mathbf{u}\in {\mathbb{R}}^{4}\u2223\begin{array}{c}{u}_{1}\left\le {\widehat{\tau}}_{X},{u}_{2}\right\le {\widehat{\tau}}_{Y},\\ {u}_{3}\left\le {\widehat{\tau}}_{Z},{u}_{4}\right\le {\widehat{\tau}}_{N}\end{array}\right\}$$$  (4a) 
$$$\mathbf{x}\in \mathbb{X}=\left\{\mathbf{x}\in {\mathbb{R}}^{8}\u2223\begin{array}{c}{x}_{5}\left\le \widehat{u},{x}_{6}\right\le \widehat{v},\\ {x}_{7}\left\le \widehat{w},{x}_{8}\right\le \widehat{r}\end{array}\right\}$$$  (4b) 
Constraints on the position components of the state may exist but they mostly depend on the mission and the environment and may be specified dynamically during the mission.
When a target reference point is given, it is the desired final position p_{d} which the vehicle is supposed to eventually reach. The reference trajectory and the final time may not be specified in this scenario.
When a target reference trajectory is given, it is specified in the form of a timevarying position vector $$ {\mathbf{p}}_{d}\left(t\right)=\left({x}_{d}\left(t\right),{y}_{d}\left(t\right),{z}_{d}\left(t\right)\right)$$ which is a smooth and continuous function of time. The other components of the desired state $$ {\mathbf{x}}_{d}\left(t\right)$$ can be readily obtained by first finding $$ {\psi}_{d}\left(t\right)=\mathrm{atan}\left(\frac{{y}_{d}\left(t\right)}{{x}_{d}\left(t\right)}\right),$$, and then finding the derivatives: $$ {\mathbf{v}}_{d}\left(t\right)={\mathit{J}}_{1}^{1}\left(\psi \right){\dot{\mathbf{p}}}_{d}\left(t\right)$$ and $$ {r}_{d}\left(t\right)={\dot{\psi}}_{d}\left(t\right)$$.
Here, we assume that the target reference point or reference trajectory is feasible in the sense that it can be tracked by the AUV while satisfying all constraints. This is theoretically so if the desired position trajectory is a part of a feasible solution to the system model (3) satisfying all constraints.
Given a constant target position p_{d} (t) = p_{d} or a timevarying reference trajectory p_{d} (t), the primary control objective is to drive the AUV as close as possible to the reference point or trajectory in some best possible way, particularly in the sense of minimizing a quadratic function of the position error
$$${\int}_{t={t}_{0}}^{{t}_{f}}{\mathbf{\Delta}\mathbf{p}}^{T}\left(t\right){Q}_{p}\mathbf{\Delta}\mathbf{p}\left(t\right)dt,\mathbf{\Delta}\mathbf{p}\left(t\right)={\mathbf{p}\left(t\right)\mathbf{p}}_{d}\left(t\right)$$$  (5) 
over the specified or some reasonable time horizon [t_{0}, t_{f}]. The control scheme should ensure that all system constraints are satisfied at all times. Further, it is also expected to show a reasonably good performance even when there are disturbances and parameter uncertainties or some minor faults in the system model.
A control scheme designed to achieve the abovementioned control objective needs to be aware of the desired reference trajectory and the vehicle state at all times. It is assumed that the AUV receives tracking mission information through suitable communications from some external agency or it is capable of generating the reference trajectory itself. Further, it is supported by an appropriate navigation system with several sensors whose readings can be used to estimate the actual state of the AUV. Navigation schemes for autonomously operating systems have been well researched and newer and more promising techniques continue to be explored (see, e.g., [31], [32]).
3. AUV Trajectory Tracking with NMPC using NeuralModeled System Dynamics
3.1 Basic NMPC Algorithm
Among various control approaches, MPC is perhaps one of the best suited control schemes to achieve the control goal mentioned in Section 2.2.3 for a constrained AUV system. MPC uses the system dynamics model, usually in the DT framework, to numerically optimize the control input over some time horizon so as to minimize a suitably defined cost function over that horizon. This optimization is done regularly in real time, usually at every discrete time step to update the control input to be applied to the system.
Consider a DT version of model (3), viz.,
$$$\mathbf{x}\left(k+1\right)=\mathit{\varphi}\left(\mathbf{x}\left(k\right),\mathbf{u}\left(k\right),\mathbf{d}\left(k\right)\right),\mathbf{y}\left(k\right)=C\mathbf{x}\left(k\right)$$$  (6) 
Here, x(k), u(k), d(k) and y(k) represent the values of the signals at the discrete sampling time $$ kT,k\in \mathbb{Z}$$, where T is the sampling period. An MPC scheme meant to achieve the desired trajectory tracking by the AUV uses a cost function of the form
$$$J\left(\mathbf{x}\left(k\right),U\left(k\right),k\right)={\sum}_{i=0}^{N1}\mathcal{l}\left(\mathbf{x}\left(k+ik\right),\mathbf{u}\left(k+ik\right),k+i\right)$$$  (7) 
with $$ \mathcal{l}\left(\mathbf{x},\mathbf{u},k\right)={\Vert \mathbf{x}{\mathbf{x}}_{d}\left(k\right)\Vert}_{Q}^{2}+{\Vert \mathbf{u}{\mathbf{u}}_{d}\left(k\right)\Vert}_{R}^{2}$$
where Q and R are positive (semi)definite matrices defining the stage cost function l(., ., .) and
$$$U\left(k\right)={\left[\mathbf{u}{\left(kk\right)}^{T}\mathbf{u}{\left(k+1k\right)}^{T}\dots \dots \mathbf{u}{\left(k+N1k\right)}^{T}\right]}^{T}$$$ 
is the stacked vector of inputs to be optimized. The MPC optimization problem to be solved at each time step k is stated as
$$$\underset{U\left(k\right)}{\mathrm{minimize}}J\left(\mathbf{x}\left(k\right),U\left(k\right),k\right)$$$  (8) 
such that
$$$\begin{array}{l}\mathbf{x}\left(kk\right)=\mathbf{x}\left(k\right)\\ \begin{array}{c}\mathbf{x}\left(k+i+1k\right)=\mathit{\varphi}\left(\mathbf{x}\left(k+ik\right),\mathbf{u}\left(k+ik\right),\mathbf{d}\left(k+ik\right)\right),\\ i=1,..,N1\end{array}\\ {M}_{\mathbf{x}}\mathbf{}\mathbf{x}\left(k+ik\right)\le 1,i=\mathrm{1,2},..,N\\ {M}_{\mathbf{u}}\mathbf{}\mathbf{u}\left(k+ik\right)\le 1,i=\mathrm{0,1},..,N1\end{array}$$$ 
where matrices M_{x} and M_{u} in the last two constraints are so chosen that they represent $$ \mathbf{x}\left(k+ik\right)\in \mathbb{X}$$ and $$ \mathbf{u}\left(k+ik\right)\in \mathbb{U}$$.
A basic MPC algorithm based on problem (8) is outlined below.
Algorithm 1: Online MPC algorithm: At each time step k: i) Measure or estimate the state of the system, x(k). ii) Solve problem (8) to obtain an optimal U*(k). iii) Apply the control input u*(kk) to the AUV. 
The cost function (7) used in the MPC algorithm can be made to approximate (5) with the choices Q=C^{T}C and R=0. Still, there are various difficulties in successfully achieving the control objective by implementing Algorithm 1. These include theoretical and practical concerns related to stability, robustness and implementation.
3.2 Stability
If the time horizon [t_{0}, t_{f}] of the reference trajectory is significant, covering it entirely in (7) may be computationally prohibitive. It is desired that a repeated solution of (8) with a modest value of N in (7) as outlined in Algorithm 1 ensures the asymptotic or exponential convergence of the tracking error Δx(k) = x(k)  x_{d}(k) to the origin or to a set around it.
Let us consider the disturbancefree dynamics
$$$\mathbf{x}\left(k+1\right)={\mathit{\varphi}}_{\mathit{o}}\left(\mathbf{x}\left(k\right),\mathbf{u}\left(k\right)\right)$$$  (9) 
where ϕ_{o} (x, u) = ϕ(x, u, 0).We make the following assumption about the reference trajectory.
Assumption 1: The desired reference trajectory (x_{d}(k), u_{d}(k)) satisfies the DT state dynamics in (9) :
$$${\mathbf{x}}_{d}\left(k+1\right)={\mathit{\varphi}}_{\mathit{o}}\left({\mathbf{x}}_{d}\left(k\right),{\mathbf{u}}_{d}\left(k\right)\right)$$$ 
Under Assumption 1, the tracking error dynamics are given by
$$$\mathrm{\Delta}\mathbf{x}\left(k+1\right)={\mathit{\varphi}}_{\mathit{o}}\left({\mathbf{x}}_{d}\left(k\right)+\mathrm{\Delta}\mathbf{x}\left(k\right),{\mathbf{u}}_{d}\left(k\right)+\mathrm{\Delta}\mathbf{u}\left(k\right)\right){\mathit{\varphi}}_{\mathit{o}}\left({\mathbf{x}}_{d}\left(k\right),{\mathbf{u}}_{d}\left(k\right)\right)$$$ 
where Δu(k) = u(k)  u_{d}(k). A local linearization of the error dynamics about a reference point r=(x_{d}(k), u_{d}(k)) in the desired trajectory results in
$$$\mathrm{\Delta}\mathbf{x}\left(k+1\right)={A}_{\mathbf{r}}\mathrm{\Delta}\mathbf{x}\left(k\right)+{B}_{\mathbf{r}}\mathrm{\Delta}\mathbf{u}\left(k\right)+\vartheta \left(\mathrm{\Delta}\mathbf{x}\left(k\right),\mathrm{\Delta}\mathbf{u}\left(k\right)\right)$$$ 
where A$$ {A}_{\mathbf{r}}={\left.\frac{\partial {\mathit{\varphi}}_{\mathit{o}}}{\partial \mathbf{x}}\right}_{\mathbf{r}},{B}_{\mathbf{r}}={\left.\frac{\partial {\mathit{\varphi}}_{\mathit{o}}}{\partial \mathbf{u}}\right}_{\mathbf{r}}$$ and $$ \vartheta (.,.)$$ represents the higher order terms.
If (A_{r}, B_{r}) is stabilizable, for a Δx(k) sufficiently close to the origin, there exists a state feedback gain matrix K_{r} such that with u(k) = u_{d}(k) + K_{r}Δx(k), the tracking error satisfies the contractivity property
$$${\Vert \mathrm{\Delta}\mathbf{x}\left(k+1\right)\Vert}_{{P}_{\mathbf{r}}}^{2}<\rho {\Vert \mathrm{\Delta}\mathbf{x}\left(k\right)\Vert}_{{P}_{\mathbf{r}}}^{2}$$$  (10) 
with ρ ∈ (0, 1) for some positive definite Lyapunov matrix P_{r} [21]. This fact can be used to construct terminal state feedback controllers together with associated invariant sets to be used as terminal state constraints in the online MPC optimization problem to guarantee stability. For a class of tracking problems, a parameterized state feedback gain and the associated invariant set may be computed and used for a range of desired stateinput pairs [29]. However, computing terminal statefeedback gains and the associated invariant sets is usually computationally burdensome. As explored in [21], under the local stabilizability condition (which the AUV system satisfies), Algorithm 1 with a positive definite matrix Q in (7) and a sufficiently large horizon length N ensures that the tracking error exponentially converges to the origin. However, the theoretical bound on the required horizon length N can be quite conservative.
3.3 Handling Disturbances and Uncertainties
The AUV motion model includes the disturbance term d(t) which is used to mainly model the effect of ocean currents. This creates a difficulty in the evaluation of the cost function and in ensuring the constraints in (8). Since the bounds on the disturbance components can be estimated, we can use a robust approach to cost evaluation and constraint satisfaction. Potential robust approaches include the tubebased approach [33], disturbancefeedbackbased approach (e.g., [34]), and so on. The latter deals with linear timevarying systems, and, as we shall see in Section 3.4, when problem (8) is solved using the sQP approach, nonlinear dynamics constraints are linearized about the existing stateinput trajectories effectively resulting in an MPC problem for a lineartimevarying system.
The robust approaches, however, usually lead to conservative results while significantly increasing the computational burden. In our problem, the disturbance input is an input disturbance that enters the system in the same way as the control input. It can be estimated using an appropriate filter and its effect countered effectively using the control input. Since the disturbance is slowly timevarying, it can be assumed to remain constant through the prediction horizon. Slightly tightened input constraints can be used to account for potential variations in the disturbance values.
Since the system model that we use is based on several simplifying assumptions and the model uses a number of parameters whose values may only be approximately known, there are other potential sources of uncertainties. However, the inherent robustness of the MPC approach due to its receding horizon implementation can be trusted to handle the effects of these uncertainties.
3.4 Numerical Implementation
The desired performance of the control scheme outlined in Algorithm 1 is achieved only under the condition that the DT dynamics model is accurate and that the nonlinear optimization problem can be numerically solved within a fraction of the sampling period. An exact equivalent DT model of a continuoustime nonlinear system is not readily available. Numerical approximations using Euler difference methods or RK methods are often used in applications. However, these numerical approximations are accurate only for small sampling intervals. Since we envisage the possibility of considering optimizations over longer horizon lengths, we consider a more accurate modeling of the DT dynamics using alternative approximations. A suitable approximator in this context is a singlelayer feedforward neuralnetwork (ffNN) which does not involve nested nonlinear function evaluations. This simplifies and speeds up future state computations and also provides a simple analytical expression for gradient evaluations when solving the numerical optimization problem.
In the continuoustime state equation, the function f(x, u, d) is such that the input vectors u and d always appear together and none of the component functions depend on position variables x, y and z. So, we can assume a single input vector u representing the combined input and also omit x, y, z as variable inputs to the DT nonlinear dynamics function. Under these considerations, defining a vector $$ \mathbf{\chi}={\left[\psi uvwr\right]}^{T},$$ the DT state equation can be written as
$$$\mathbf{x}\left(k+1\right)=\stackrel{}{A}\mathbf{}\mathbf{x}\left(k\right)+{\mathit{\varphi}}_{\mathit{n}}\left(\mathbf{\chi}\left(k\right),\mathbf{u}\left(k\right)\right)$$$  (11) 
where $$ \stackrel{}{A}==\mathrm{d}\mathrm{i}\mathrm{a}\mathrm{g}\left({I}_{3},{0}_{5\times 5}\right)$$ and $$ {\mathit{\varphi}}_{\mathit{n}}\left(\mathbf{\chi}\left(k\right),\mathbf{u}\left(k\right)\right)$$ represents the nonlinear part of DT dynamics. Clearly, $$ {\mathit{\varphi}}_{\mathit{n}}\left(\mathbf{\chi}\mathbf{},\mathbf{u}\right)$$ is a continuous function of its arguments and it can be approximated to any desired accuracy by a ffNN with a single hidden layer of a sufficiently large number of neurons. Let it be approximated by a network with η neurons in the hidden layer:
$$${\mathit{\varphi}}_{\mathit{n}}\left(\mathbf{\chi},\mathbf{u}\right)={W}_{o}\mathit{\phi}\left({W}_{\chi}\mathbf{\chi}+{W}_{u}\mathbf{u}+\mathbf{b}\right)+\mathbf{e}\left(\mathbf{\chi},\mathbf{u}\right)$$$ 
where $$ {W}_{\chi}{\in \mathbb{R}}^{\eta \times 5}$$ and $$ {W}_{u}{\in \mathbb{R}}^{\eta \times 4}$$ are the hidden layer weights for network inputs χ and $$ \mathbf{u},\mathbf{}\mathbf{b}\in {\mathbb{R}}^{\eta}$$ is the bias vector and $$ {W}_{o}{\in \mathbb{R}}^{8\times \eta}$$ is the output layer weight matrix. The vector function $$ \mathit{\phi}:{\mathbb{R}}^{\eta}\to {\mathbb{R}}^{\eta}$$ is a diagonal operator that applies an identical continuous nonlinearity φ_{i}(.) = φ(.), usually a sigmoidal function (e.g., tanh(.) function) to each component of the argument of φ(.). e(χ, u) represents the approximation error and is assumed to be small.
The ffNN representing the nonlinear DT dynamics function ϕ_{n} (χ, u) needs to be suitably trained by generating a rich set of inputoutput data samples through numerical simulations. Input data samples are chosen randomly from within the bounds considered and the corresponding output samples are computed using highly accurate numerical integration.
Having obtained a sufficiently accurate neuralbased model of nonlinear DT dynamics, we follow the sQPbased algorithm mentioned in [29] to numerically solve the MPC optimization problem (8) where we replace the nonlinear dynamics constraint by the neuralbased model
$$$\mathbf{x}\left(k+i+1k\right)=\stackrel{}{A}\mathbf{x}\left(k+ik\right){+W}_{o}\mathit{\phi}\left(\mathbf{z}\left(k+ik\right)\right)$$$  (12) 
where $$ \mathbf{z}\left(k+ik\right)={W}_{\chi}G\mathbf{x}\left(k+ik\right)+{W}_{u}\mathbf{u}\left(k+ik\right)+\mathbf{b}$$ with $$ G=\left[{0}_{5\times 3}{I}_{5}\right]$$. Here, we have assumed that the approximation error is negligible. The algorithm is based on the linearization of the nonlinear equation in (12) about the existing feasible state and input trajectories. Given an existing feasible input sequence
$$$\stackrel{}{U}\left(k\right)={\left[\stackrel{}{\mathbf{u}}{\left(kk\right)}^{T}\stackrel{}{\mathbf{u}}{\left(k+1k\right)}^{T}\dots \dots \stackrel{}{\mathbf{u}}{\left(k+N1k\right)}^{T}\right]}^{T}$$$ 
the corresponding state sequence
$$$\stackrel{}{X}\left(k\right)={\left[\stackrel{}{\mathbf{x}}{\left(kk\right)}^{T}\stackrel{}{\mathbf{x}}{\left(k+1k\right)}^{T}\dots \dots \stackrel{}{\mathbf{x}}{\left(k+Nk\right)}^{\mathit{T}}\right]}^{T}$$$ 
can be immediately obtained using (12). Linearizing (12) using the fistorder Taylor expansion of the function φ(.) about the existing stateinput trajectories gives us
$$$\begin{array}{c}\mathbf{x}\left(k+i+1k\right)=A\left(k+ik\right)\mathit{}\mathbf{x}\left(k+ik\right)\\ +B\left(k+ik\right)\mathbf{}\mathbf{u}\left(k+ik\right)+\mathbf{h}\left(k+ik\right)\end{array}$$$  (13) 
where
$$$\begin{array}{l}A\left(k+ik\right)=\stackrel{}{A}+{W}_{o}\mathit{}\mathrm{d}\mathrm{i}\mathrm{a}\mathrm{g}\left({\mathit{\phi}}^{\mathit{\text{'}}}\left(\stackrel{}{\mathbf{z}}\left(k+ik\right)\right)\right){W}_{\chi}G\\ B\left(k+ik\right)={W}_{o}\mathit{}\mathrm{d}\mathrm{i}\mathrm{a}\mathrm{g}\left({\mathit{\phi}}^{\mathit{\text{'}}}\left(\stackrel{}{\mathbf{z}}\left(k+ik\right)\right)\right){W}_{u}\\ \mathbf{h}\left(k+ik\right)=\begin{array}{l}{W}_{o}\mathit{\phi}\left(\stackrel{}{\mathbf{z}}\left(k+ik\right)\right)\left(A\left(k+ik\right)\stackrel{}{A}\right)\times \\ \stackrel{}{\mathbf{x}}\left(k+ik\right)B\left(k+ik\right)\stackrel{}{\mathbf{u}}\left(k+ik\right)\end{array}\end{array}$$$ 
Here, $$ {\mathit{\phi}}^{\mathit{\text{'}}}\left(.\right)={\left[{\phi}_{1}^{\text{'}}\left({\stackrel{}{z}}_{1}\right)\dots {\phi}_{\eta}^{\text{'}}\left({\stackrel{}{z}}_{\eta}\right)\right]}^{T}$$ is a vector with componentwise derivatives.
With the linearized version (13) of the dynamics constraint for i = 0, 1, .., N  1, the optimization problem (8), can be expressed as a QP problem
$$$\begin{array}{l}\underset{U\left(k\right)}{\mathrm{minimize}}{\Vert {\left[\mathbf{x}{\left(k\right)}^{T}\mathbf{}\mathbf{}\mathbf{}\mathbf{}U{\left(k\right)}^{T}\right]}^{T}\Vert}_{\mathbf{H}}^{2}\\ {\mathrm{s}\mathrm{u}\mathrm{c}\mathrm{h}\mathrm{}\mathrm{t}\mathrm{h}\mathrm{a}\mathrm{t}\mathit{}\mathit{}\mathit{}\mathbf{M}\left[\mathbf{x}{\left(k\right)}^{T}\mathbf{}\mathbf{}\mathbf{}\mathbf{}U{\left(k\right)}^{T}\right]}^{T}\le 1\end{array}$$$  (14) 
with appropriately defined matrices H and M (See, e.g., [29], [34] for the details.
In the following, we briefly outline the sQPbased algorithm to solve problem (14).
Algorithm 2: sQPbased solution of neuralmodeled NMPC problem

The approach outlined in Algorithm 2 is also referred to as feasibilityperturbed sequential QP approach [35] since the feasibility is ensured through nonlinear propagation after each iteration. The additional constraint used in Step (iii) is a trust region constraint imposed to ensure that the new computed vector U(k) does not deviate too much from $$ \stackrel{}{U}\left(k\right)$$ so that the convergence of the solution can be ensured. A proper initialization procedure starting with a small N is usually required to obtain a feasible solution at time k = 0. With a proper initialization and a suitable trust region constraint, the algorithm gives a nearoptimal solution within a few iterations.
4. Numerical Simulations
4.1 AUV Details
Extensive simulations are carried out to assess the performance of the proposed approach. Parameters specifying the dynamics of the AUV and the constraints applicable on speeds and thruster actions are mentioned in Table 1.
4.2 AUV DT Dynamics Modeling with NNs
We obtain two ffNNs – NN1 and NN2 to represent nonlinear DT dynamics for two different sampling periods – T_{1} = 1s and T_{2} = 0.25s. Both are trained with about 10000 samples of inputoutput data. Input data samples are selected randomly from the set 1.1($$ 1.1(\left[\pi ,\pi \right]\times G\mathbb{X}\times \mathbb{U}$$) where the set [π, π] is the range chosen for the yaw angle ψ. The factor 1.1 is used to provide some margin at the constraint boundaries. Outputs corresponding to input samples are obtained using the ode45 solver in MATLAB. Networks NN1 and NN2 are designed with η_{1} = 48 and η_{2} = 40 hidden layer neurons which use the tanh() activation function. After sufficient training, their mean squared error (MSE) performances are found to be of the order 10^{5} and 10^{6} respectively.
Table 2 gives a comparison, in terms of accuracy and computational efficiency, of model NN2 with the popular RK4 method (which is used, e.g., in [22], [26]). Here, the model accuracy is indicated by the MSE over all training and test samples, and, the computational efficiency is expressed in terms of the ratio of the average model function evaluation time over the average time taken by the accurate ode45 method in the same machine. While NN2 is expectedly more accurate for this step size, what is significant is its comparative computational advantage which should remain significant (for both function evaluation and gradient evaluation) even for small step sizes for which the accuracy advantage may not be significant. Note that the RK4 method is not stable with the time step size h = 1.
4.3 AUV Mission Simulation with NMPC
For tracking control performance evaluation, we consider an AUV mission comprising predefined timevarying trajectory tracking and free setpoint tracking segments. In particular, the following phases are considered in the mission:
 1) Descending phase: Trajectory tracking phase 1 (100s)
$$ {x}_{d}\left(t\right)=2+0.2t,{y}_{d}\left(t\right)=1+0.05t,$$
$$ {z}_{d}\left(t\right)=2220{e}^{\frac{{\left({x}_{d}\left(t\right)2\right)}^{2}+{\left({y}_{d}\left(t\right)1\right)}^{2}}{25}}$$  2) Free setpoint tracking phase (No fixed target time)
$$ \left({x}_{d},{y}_{d},{z}_{d}\right)=\left(\mathrm{60,10,30}\right)$$  3) Object inspection phase (Trajectory tracking phase 2 (800s))
$$ {x}_{d}\left(t\right)=7110\mathrm{cos}\left(0.05(t{t}_{2})\right)+0.01(t{t}_{2})$$
$$ {y}_{d}\left(t\right)=10+8\mathrm{sin}\left(0.05\right(t{t}_{2}\left)\right),$$
$$ {z}_{d}\left(t\right)=300.025(t{t}_{2})$$
where t_2 is the time when Phase 2 is completed.
We assume that the AUV is initially at rest at (0, 0, 0).
We first consider the case without disturbances in the simulation model. In the MPC cost function, we use the cost matrix Q = diag(I_{4}, 0.1I_{4} ) in all phases, and since the desired inputs are not specified, we mildly penalize the actual inputs with R = 10^{5}I_{4} in Phases 1 and 3 and R = 10^{3}I_{4} in Phase 2. We also look for a horizon length N that is sufficient to ensure that the predicted terminal state is close to the desired pose throughout the mission. It is found that with model NN1 (T_{1} = 1s), a horizon length of N = 10 is generally sufficient when the initial state is not very far from the corresponding reference point. In Phase 2, the target point is initially not close but to avoid using a large horizon length, we consider a virtual reference converging to the target point (60, 10, 30) and update the virtual reference in real time.
Figure 2 shows the reference and actual AUV position trajectories that we achieve with the MPC scheme using model NN1 in the disturbancefree scenario. It can be seen that the MPC is able to drive the AUV closely along the desired trajectory.
Figure 3 shows the position and velocity components of the AUV state and the thruster forces in x, y, z directions. It can be seen that the zdirection thruster reaches the limit during Phase 1 when the decent is steep. Similarly, the surge and sway speed limits are reached during Phase 2. Figure 4(a) shows the position tracking errors during Phase 1 and 3, and it can be seen from the figure that the tracking error is within ±0.025m in each dimension. Note that the tracking error is not relevant in Phase 2.
In order to assess the impact of the accuracy of the dynamics model, we also simulate the performance of the MPC scheme with model NN2. Since NN2 uses a sampling time of 0.25s, a horizon length of 10s would require N = 40. However, since the input can be updated quickly, we consider N = 10 as earlier. The tracking response is similar to the one obtained with NN1. However, as we can see in Figure 4(b), which presents the position tracking errors during Phase 1 and Phase 3, the tracking errors are smaller with NN2 than with NN1.
This can be expected since model NN2 is at least one order of magnitude more accurate (in terms of MSE values) than NN1 and it also updates the control actions 4 times more frequently.
We next consider a scenario with disturbance inputs. Disturbances are mainly due to water currents inside the sea and are slowing varying in time but there can be other sources too. We consider a disturbance vector of the following form:
$$$\mathbf{d}=\left[\begin{array}{c}8\mathrm{sin}\left(0.1t\right)+2{v}_{1}\\ 8\mathrm{cos}(0.1t0.3)+{2v}_{2}\\ 8\mathrm{cos}(0.1t+0.8)+2{v}_{3}\\ 2\mathrm{sin}\left(0.1t+0.5\right)+0.05{v}_{4}\end{array}\right]$$$ 
where $$ {v}_{1},{v}_{2},{v}_{3},{v}_{4}$$ are zeromean Gaussian random variables with unit variance.
Figure 5 shows the tracking errors in Phases 1 and 3 in the presence of disturbance inputs. The errors in Part (a) are for the scheme with NN1 and those in Part (b) are for that with NN2. Note that the transient phase is not shown. Evidently, the error magnitudes in the presence of disturbances are larger by a factor of about 5 (compare with Figure 4). Also, the error magnitudes are lower by about 50% with NN2 than with NN1 for reasons mentioned above. Nevertheless, the MPC scheme is able to limit the tracking errors within about 10cm in each direction when NN1 is used and within about 5cm when NN2 is used.
In this part, we explore the performance of the MPC scheme in a scenario with a faulty sway (transverse direction) actuator. Even without a fault, the sway control considered in our scheme is limited because of the limitation in sway speed and swaydirection thrust. When sway control is completely absent, the AUV is considered to be underactuated and its maneuvering ability may be restricted in some situations. We consider this situation in the presence of disturbances as mentioned in Section 4.3.2. Figure 6 shows the state and control components evolving under the MPC scheme (using NN1) in this scenario. It can be observed that while the control inputs (except τ_Y, which is zero) and the velocities react and fluctuate to correct the effect of the disturbances that affect the system, position components vary rather smoothly with time.
The plots in Figure 7 show 3D position errors $$ (e=\sqrt{{\left(x{x}_{d}\right)}^{2}+{\left(y{y}_{d}\right)}^{2}+{\left(z{z}_{d}\right)}^{2}})$$ during Phase 3 in the three different scenarios considered (in Sections 4.3.1~4.3.3).
The errors in the first figure are obtained with model NN1 and those in the second figure are obtained with model NN2. Clearly, the presence of disturbances increases the tracking error and the actuator fault further increases it. And, because of higher accuracy and smaller sampling time, with model NN2, errors are smaller by a factor of about 2.
Solving the nonlinear MPC problem is computationally demanding. Algorithm 2 requires a repeated solution of a QP problem at every time step. It is found that, except at the first step of every phase, the algorithm converges in up to 4 or 5 steps. The initial step requires the search for an initial feasible solution which may take 5 to 10 or more steps depending on the horizon length. We use the QP solver qpOASES [36] in MATLAB in a Windows machine with Intel i7 1.8 GHz processor and 24 GB RAM for the computations. It is found that the computations (QP solving and other preparatory computations) to be made after the measurement of the state at each step to obtain the optimal solution take, on the average, about 19ms when NN1 is used and about 15ms when NN2 is used for a horizon length of N = 10. In the absence of sway control, since we have a fewer number of optimization variables, the corresponding average computation times are slightly smaller – 15ms and 12ms respectively. Clearly, these computation times are small fractions of the respective sampling times and therefore they do not adversely affect the implementability of the control scheme.
5. Conclusion
The problem of 3D trajectory tracking of AUVs under 4 or 3 degrees of maneuverability was addressed with an effective and efficient nonlinear MPC scheme that uses a suitable modeling of the DT dynamics of the system using a ffNN. An accurate NNbased DT dynamics model simplifies the online state propagation and cost gradient computations when solving the nonlinear MPC optimization problem with a sequence of QPs. Realistic numerical simulations have shown the effectiveness of the approach in various situations including the presence of random and nonrandom disturbances and/or the lack of maneuverability along the sway direction.
Acknowledgments
This paper was supported by Education and Research promotion program of KOREATECH.
Author Contributions
All relevant contributions are by the Corresponding author.
References
 T. Fossen, K. Y. Pettersen, and H. Nijmeijer, Sensing and Control for Autonomous Vehicles: Applications to Land, Water and Air Vehicles, Lecture Notes in Control and Information Sciences, vol. 474, 2017. [https://doi.org/10.1007/9783319553726]
 H. Singh, C. Roman, O. Pizarro, R. Eustice, and A. Can, “Towards highresolution imaging from underwater vehicles, ” The International Journal of Robotics Research, vol. 26, no. 1, pp. 5574, 2007. [https://doi.org/10.1177/0278364907074473]
 D. Sward, J. Monk, and N. Barrett, “A systematic review of remotely operated vehicle surveys for visually assessing fish assemblages, ” Frontiers in Marine Science, vol. 6, 2019. [https://doi.org/10.3389/fmars.2019.00134]
 L. Wang, D. Zhu, W. Pang, and Y. Zhang, “A survey of underwater search for multitarget using multiAUV: Task allocation, path planning, and formation control, ” Ocean Engineering, vol. 278, 114393, 2023. [https://doi.org/10.1016/j.oceaneng.2023.114393]
 R. B. Wynn, et al., “Autonomous Underwater Vehicles (AUVs): Their past, present and future contributions to the advancement of marine geoscience, ” Marine Geology, vol. 352, pp. 451468, 2014. [https://doi.org/10.1016/j.margeo.2014.03.012]
 A. Wibisono, M. J. Piran, H. K. Song, and B. M. Lee, “A survey on unmanned underwater vehicles: Challenges, enabling technologies, and future research directions, ” Sensors, vol. 23, no. 17, p. 7321, 2023. [https://doi.org/10.3390/s23177321]
 A. Bashir, S. Khan, N. Iqbal, S. Bashmal, S. Ullah, Fayyaz, and M. Usman, “A review of the various control algorithms for trajectory control of unmanned underwater vehicles, ” Sustainability, vol. 15, no. 20, p. 14691, 2023. [https://doi.org/10.3390/su152014691]
 J. Yuh, “Design and control of autonomous underwater robots: A survey, ” Autonomous Robots, vol. 8, pp. 724, 2000. [https://doi.org/10.1023/A:1008984701078]
 K. Y. Pettersen and O. Egeland, “Timevarying exponential stabilization of the position and attitude of an underactuated autonomous underwater vehicle, ” IEEE Transactions on Automatic Control, vol. 44, no. 1, pp. 112115, 1999. [https://doi.org/10.1109/9.739086]
 C. Silvestre and A. Pascoal, “Control of the INFANTE AUV using gain scheduled static output feedback, ” Control Engineering Practice, vol. 12, no. 12, pp. 15011509, 2004. [https://doi.org/10.1016/j.conengprac.2004.02.012]
 C. Paliotta, E. Lefeber, K. Y. Pettersen, J. Pinto, M. Costa and J. T. de F. B. de Sousa, “Trajectory tracking and path following for underactuated marine vehicles, ” IEEE Transactions on Control Systems Technology, vol. 27, no. 4, pp. 14231437, 2019. [https://doi.org/10.1109/TCST.2018.2834518]
 M. MatNoh, R. MohdMokhtar, M. R. Arshad, Z. M. Zain, and Q. Khan, “Review of sliding mode control application in autonomous underwater vehicles, ” Indian Journal of Geo Marine Sciences, vol. 48, no. 7, pp. 973984, 2019.
 J. Nie, J. Yuh, E. Kardash, and T. I. Fossen, “Onboard sensorbased adaptive control of small UUVs in very shallow water, ” International Journal of Adaptive Control and Signal Processing, vol. 14, no. 4, pp. 441452, 2000. [https://doi.org/10.1002/10991115(200006)14:4<441::AIDACS565>3.0.CO;2M]
 J. Guerrero, J. Torres, V. Creuze, and A. Chemori, “Trajectory tracking for autonomous underwater vehicle: An adaptive approach, ” Ocean Engineering, vol. 172, pp. 511522, 2019. [https://doi.org/10.1016/j.oceaneng.2018.12.027]
 Z. Yan, M. Wang, and J. Xu, “Robust adaptive sliding mode control of underactuated autonomous underwater vehicles with uncertain dynamics, ” Ocean Engineering, vol. 173, pp. 802809, 2019. [https://doi.org/10.1016/j.oceaneng.2019.01.008]
 L. Qiao and W. Zhang, “Trajectory tracking control of AUVs via adaptive fast nonsingular integral terminal sliding mode control, ” IEEE Transactions on Industrial Informatics, vol. 16, no. 2, pp. 12481258, 2020. [https://doi.org/10.1109/TII.2019.2949007]
 B. Sun, D. Zhu, and S. Yang, “An optimized fuzzy control algorithm for threedimensional AUV path planning, ” International Journal of Fuzzy Systems, vol. 20, no. 5, pp. 114, 2017. [https://doi.org/10.1007/s4081501704031]
 B. S. Park, “Neural networkbased tracking control of underactuated autonomous underwater vehicles with model uncertainties, ” Journal of Dynamic Systems, Measurement, and Control, vol. 137, no. 2, p. 021004, 2015. [https://doi.org/10.1115/1.4027919]
 J. Li, J. Du, and C. L. P. Chen, “Commandfiltered robust adaptive NN control with the prescribed performance for the 3D trajectory tracking of underactuated AUVs, ” IEEE Transactions on Neural Networks and Learning Systems, vol. 33, no. 11, pp. 65456557, 2022. [https://doi.org/10.1109/TNNLS.2021.3082407]
 J. M. Maciejowski, Predictive Control with Constraints, Prentice Hall, Harlow, England, 2002.
 J. Köhler, M. A. Müller, and F. Allgöwer, “Nonlinear reference tracking: An economic model predictive control perspective, ” IEEE Transactions on Automatic Control, vol. 64, no. 1, pp. 254269, 2019. [https://doi.org/10.1109/TAC.2018.2800789]
 A. Jayasiri, A. Nandan, S. Imtiaz, D. Spencer, S. Islam, and S. Ahmed, “Dynamic positioning of vessels using a UKFbased observer and an NMPCbased controller, ” IEEE Transactions on Automation Science and Engineering, vol. 14, no. 4, pp. 17781785, 2017. [https://doi.org/10.1109/TASE.2017.2698923]
 W. Gan, D. Zhu, and D. Ji, “QPSOmodel predictive controlbased approach to dynamic trajectory tracking control for unmanned underwater vehicles, ” Ocean Engineering, vol. 158, pp. 208220, 2018. [https://doi.org/10.1016/j.oceaneng.2018.03.078]
 C. Shen, Y. Shi, and B. Buckham, “Trajectory tracking control of an autonomous underwater vehicle using Lyapunovbased model predictive control, ” IEEE Transactions on Industrial Electronics, vol. 65, no. 7, pp. 57965805, 2018. [https://doi.org/10.1109/TIE.2017.2779442]
 C. Liu, Q. Hu, X. Wang, and J. Yin, “Eventtriggeredbased nonlinear model predictive control for trajectory tracking of underactuated ship with multiobstacle avoidance, ” Ocean Engineering, vol. 253, 111278, 2022. [https://doi.org/10.1016/j.oceaneng.2022.111278]
 W. Zhang, Q. Wang, W. Wu, X. Du, Y. Zhang, and P. Han, “Eventtrigger NMPC for 3D trajectory tracking of UUV with external disturbances, ” Ocean Engineering, vol. 283, 115050, 2023. [https://doi.org/10.1016/j.oceaneng.2023.115050]
 S. HeshmatiAlamdari, A. Nikou, and D. V. Dimarogonas, “Robust trajectory tracking control for underactuated autonomous underwater vehicles in uncertain environments, ” IEEE Transactions on Automation Science and Engineering, vol. 18, no. 3, pp. 12881301, 2021. [https://doi.org/10.1109/TASE.2020.3001183]
 C. Shen, B. Buckham, and Y. Shi, “Modified C/GMRES algorithm for fast nonlinear model predictive tracking control of AUVs, ” IEEE Transactions on Control Systems Technology, vol. 25, no. 5, pp. 18961904, 2017. [https://doi.org/10.1109/TCST.2016.2628803]
 A. Gautam and Y. C. Soh, “Stabilizing model predictive control using parameterdependent dynamic policy for nonlinear systems modeled with neural networks, ” Journal of Process Control, vol. 36, pp. 1121, 2015. [https://doi.org/10.1016/j.jprocont.2015.09.003]
 T. I. Fossen, Guidance and Control of Ocean Vehicles, John Willey & Sons, 1994.
 P. A. Miller, J. A. Farrell, Y. Zhao, and V. Djapic, “Autonomous underwater vehicle navigation, ” IEEE Journal of Oceanic Engineering, vol. 35, no. 3, pp. 663678, 2010. [https://doi.org/10.1109/JOE.2010.2052691]
 Y. S. Han, M. J. Kim, H. I. Seo, and D. H. Seo, “Direct orientation estimation through inertial odometry based on a deep transformer model, ” Journal of Advanced Marine Engineering and Technology, vol. 48, no. 2, pp. 96106, 2024. [https://doi.org/10.5916/jamet.2024.48.2.96]
 D. Q. Mayne, E. C. Kerrigan, E. V. Wyk, and P. Falugi, “Tubebased robust nonlinear model predictive control, ” International Journal of Robust and Nonlinear Control, vol. 21, no. 11, pp. 13411353, 2011. [https://doi.org/10.1002/rnc.1758]
 A. Gautam, Y. C. Chu, and Y. C. Soh, “Optimized dynamic policy for receding horizon control of linear timevarying systems with bounded disturbances, ” IEEE Transactions on Automatic Control, vol. 57, no. 4, pp. 973988, 2012. [https://doi.org/10.1109/TAC.2011.2170109]
 S. J. Wright and M. J. Tenny, “A feasible trustregion sequential quadratic programming algorithm, ” SIAM Journal of Optimization, vol. 14, no. 4, pp. 10741105, 2004. [https://doi.org/10.1137/S1052623402413227]
 H. J. Ferreau, C. Kirches, A. Potschka, H. G. Bock, and M. Diehl, “A parametric activeset algorithm for quadratic programming, ” Mathematical Programming Computation, vol. 6, no. 4, pp. 327363, 2014. [https://doi.org/10.1007/s1253201400711]