First published on Friday, May 16, 2025 and last modified on Friday, May 16, 2025 by François Chaplais.
Institute for Systems Theory and Automatic Control, University of Stuttgart, 70550 Stuttgart, Germany. J. Venkatasubramanian thanks the International Max Planck Research School for Intelligent Systems (IMPRS-IS) for supporting her.
Institute for Dynamic Systems and Control, ETH Zürich, CH-8092 Zürich, Switzerland.
Institute for Systems Theory and Automatic Control, University of Stuttgart, 70550 Stuttgart, Germany. F. Allgöwer thanks the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) for funding under Germany's Excellence Strategy - EXC 2075 - 390740016 and under grant 468094890, and acknowledges the support of the Stuttgart Center for Simulation Science (SimTech).
In this paper, we introduce a targeted exploration strategy for the non-asymptotic, finite-time case. The proposed strategy is applicable to uncertain linear time-invariant systems subject to sub-Gaussian disturbances. As the main result, the proposed approach provides a priori guarantees, ensuring that the optimized exploration inputs achieve a desired accuracy of the model parameters. The technical derivation of the strategy (i) leverages existing non-asymptotic identification bounds with self-normalized martingales, (ii) utilizes spectral lines to predict the effect of sinusoidal excitation, and (iii) explicitly accounts for the spectral transient error and parametric uncertainty. A numerical example illustrates how the finite exploration time influences the required exploration energy.
System identification bridges control theory and machine learning by providing methods to model dynamical systems from data, enabling better prediction and control [1]. Within this field, optimal experiment design and targeted exploration [2] develop methods to optimally excite dynamical systems in order to reduce model uncertainty, thereby (i) attaining a model with a desired accuracy [3, 4], or (ii) ensuring the feasibility of a subsequent robust control design [5, 6, 7, 8, 9]. Theoretical results on optimal experiment design rely primarily on asymptotic bounds on the identification error, which are valid in the limit as the amount of data approaches infinity [1]. In practice, however, only finite data from experiments are available, and the reliability of the model-based controller depends significantly on the quality of the data used for system identification. Recently, numerous results have been developed for the non-asymptotic (finite-sample) analysis of system identification [10, 11, 12] and subsequent control [13, 14, 15]; an overview of these methods is provided in [16]. In general, these non-asymptotic results rely on the excitation provided by random inputs to ensure learning, and do not optimize the inputs for excitation. Recent works [17, 18] consider targeted exploration with periodic/sinusoidal inputs, adopting a frequency-domain analysis for parameter estimation. However, neither approach explicitly accounts for transient effects [1] during input design or analysis. Hence, their guarantees are contingent on the dissipation of the transient error and the attainment of a steady-state response. This leads to prolonged experiment durations, which may be impractical for many real-world applications.
In this paper, we design a targeted exploration strategy that ensures a desired error bound on the estimated parameters with high probability by utilizing non-asymptotic, data-dependent uncertainty bounds [13, 11]. In particular, we derive a priori guarantees on the error bound that can be certified before exploration. To shape and reduce the uncertainty in a targeted manner, we consider multi-sine exploration inputs at selected frequencies with optimized amplitudes. As our main contribution, we derive sufficient conditions directly in terms of the exploration inputs that ensure a desired parameter error bound with high probability through finite-time exploration. In particular, we explicitly account for the transient error due to the input and for the effect of the disturbances in the design of targeted exploration. We use these sufficient conditions to formulate linear matrix inequalities (LMIs) for exploration, ensuring the desired parameter error bound. This leads to a targeted exploration design with minimal input energy based on a semidefinite program (SDP).
Outline: We first describe the setting and define the exploration objective in Section 2. We then summarize data-dependent non-asymptotic identification bounds from [13, 11] in Section 3. In Section 4, we derive sufficient conditions on the exploration data. In Section 5, we derive sufficient conditions on the exploration inputs by leveraging the theory of spectral lines [18] and explicitly accounting for the transient error due to the input in the exploration design. Finally, in Section 6, we provide a numerical example that highlights the effect of the finite experiment time on the required input energy.
Notation: The transpose (conjugate transpose) of a matrix \( A \in \mathbb{R}^{n \times m} (\mathbb{C}^{n \times m})\) is denoted by \( A^\top\) (\( A^\mathsf{H}\) ). The positive semi-definiteness (definiteness) of a matrix \( A \in \mathbb{C}^{n \times n}\) is denoted by \( A = A^\mathsf{H} \succeq (\succ) 0\) . The Kronecker product operator is denoted by \( \otimes\) . The operator \( \mathrm{vec}(A)\) stacks the columns of \( A\) to form a vector. The operator \( \textnormal{diag}(A_1,\dots,A_n)\) creates a block diagonal matrix by aligning the matrices \( A_1,\dots,A_n\) along the diagonal. The Euclidean norm for a vector \( x\in\mathbb{R}^n\) is denoted by \( \|x\|=\sqrt{x^\top x}\) . Given a matrix \( P \in \mathbb{R}^{n \times n},\, P \succ 0\) , the weighted Euclidean norm is given by \( \|x\|_P=\sqrt{x^\top P x}\) . The largest singular value of a matrix \( A\in\mathbb{C}^{m\times n}\) is denoted by \( \|A\|\) . \( \mathbb{E}[X]\) denotes the expected value of a random variable \( X\) . \( \mathbb{P}[E]\) denotes the probability of an event \( E\) occurring.
Consider a discrete-time linear time-invariant dynamical system of the form
\( x_{k+1} = A_\mathrm{tr} x_k + B_\mathrm{tr} u_k + w_k, \)   (1)
where \( k \in \mathbb{N}\) denotes time, \( x_k \in \mathbb{R}^{n_\mathrm{x}}\) is the state, \( u_k \in \mathbb{R}^{n_\mathrm{u}}\) is the control input, and \( w_k \in \mathbb{R}^{n_\mathrm{x}}\) is the disturbance. It is assumed that the state \( x_k\) is directly measurable. To simplify the exposition, we assume that the initial state \( x_0\) is zero. The true system parameters \( A_\mathrm{tr}\) , \( B_\mathrm{tr}\) are initially uncertain, and informative data must be collected through exploration to enhance the accuracy of the parameter estimates. Henceforth, we denote \( \phi_k=[x_k^\top, u_k^\top]^\top \in \mathbb{R}^{n_\phi}\) , where \( n_\phi=n_\mathrm{x}+n_\mathrm{u}\) . The system (1) can be re-written in terms of the unknown parameter \( \theta_\mathrm{tr}=\textnormal{vec}([A_\mathrm{tr},B_\mathrm{tr}]) \in \mathbb{R}^{n_\theta}\) as
\( x_{k+1} = (\phi_k^\top \otimes I_{n_\mathrm{x}})\theta_\mathrm{tr} + w_k, \)   (2)
where \( n_\theta = n_\mathrm{x}n_\phi\) . We consider a filtration of \( \sigma\) -algebras \( \{\mathcal{F}_k \}_{k \geq 0}\) such that \( \phi_k\) is \( \mathcal{F}_{k-1}\) -measurable and \( w_k\) is \( \mathcal{F}_k\) -measurable.
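For illustration (not part of the original text; the matrices and dimensions are arbitrary choices), the following Python sketch simulates (1) and verifies the equivalent vectorized form (2), which rests on the identity \( (\phi_k^\top \otimes I_{n_\mathrm{x}})\mathrm{vec}([A,B]) = [A,\, B]\,\phi_k\) :

```python
import numpy as np

rng = np.random.default_rng(0)
nx, nu = 2, 1
A = np.array([[0.8, 0.1], [0.0, 0.7]])        # Schur stable (illustrative)
B = np.array([[0.0], [1.0]])
theta = np.hstack([A, B]).flatten(order="F")  # vec([A, B]), column-major

x = np.zeros(nx)                              # x_0 = 0, as assumed
for k in range(50):
    u = rng.standard_normal(nu)
    w = 0.01 * rng.standard_normal(nx)        # sub-Gaussian (here: Gaussian)
    phi = np.concatenate([x, u])              # phi_k = [x_k; u_k]
    # (1): x_{k+1} = A x_k + B u_k + w_k
    x_next = A @ x + B.flatten() * u + w
    # (2): x_{k+1} = (phi_k^T kron I_nx) theta + w_k
    x_next_vec = np.kron(phi.reshape(1, -1), np.eye(nx)) @ theta + w
    assert np.allclose(x_next, x_next_vec)
    x = x_next
```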
Definition 1
(Sub-Gaussian random vector [19, Definition 2]) A random vector \( w_k \in \mathbb{R}^{n_\mathrm{x}}\) with mean \( \mathbb{E}[w_k]=\mu\) is called sub-Gaussian with proxy variance \( \Sigma \succeq 0\) , i.e., \( w_k \sim \mathsf{subG}(\mu,\Sigma)\) , if \( \forall \lambda \in \mathbb{R}^{n_\mathrm{x}}\) ,
Assumption 1
The disturbance \( w_k\) is conditionally sub-Gaussian with zero mean and known proxy variance \( \sigma_\mathrm{w}^2I_{n_\mathrm{x}}\) [13], i.e., \( w_k \sim \mathsf{subG}(0,\sigma_\mathrm{w}^2I_{n_\mathrm{x}})\) .
Sub-Gaussian distributions encompass Gaussian, uniform, and other light-tailed distributions. We further assume that some prior knowledge of the dynamics is available.
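As a concrete instance of Assumption 1, a scalar disturbance uniformly distributed on \( [-a,a]\) is sub-Gaussian: by Hoeffding's lemma, a zero-mean variable bounded in \( [-a,a]\) admits the proxy variance \( a^2\) . Since the exact moment generating function of this uniform distribution is \( \sinh(\lambda a)/(\lambda a)\) , the defining bound can be verified deterministically (illustrative Python; the value of \( a\) and the \( \lambda\) grid are arbitrary choices):

```python
import numpy as np

a = 0.3                       # uniform disturbance on [-a, a], zero mean
sigma2 = a**2                 # sub-Gaussian proxy variance via Hoeffding's lemma
# lambda grid avoiding 0 (the MGF ratio has a removable singularity there)
lams = np.concatenate([np.linspace(-15, -0.01, 300), np.linspace(0.01, 15, 300)])

mgf = np.sinh(lams * a) / (lams * a)      # exact E[exp(lambda * w)]
bound = np.exp(lams**2 * sigma2 / 2.0)    # sub-Gaussian upper bound
assert np.all(mgf <= bound + 1e-12)
```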
Assumption 2
The system matrix \( A_\mathrm{tr}\) is Schur stable and the pair \( (A_\mathrm{tr},B_\mathrm{tr})\) is controllable.
Assumption 3
The parameters \( \theta_\mathrm{tr}=\mathrm{vec}([A_\mathrm{tr},B_\mathrm{tr}]) \in \mathbb{R}^{n_\theta}\) lie in a known set \( \mathbf{\Theta}_0\) , i.e., \( \theta_\mathrm{tr} \in \mathbf{\Theta}_0\) , where
(3)
with an estimate \( \hat{\theta}_0\) and for some \( D_0 \succ 0\) .
Exploration goal: Since the true system parameters \( \theta_\mathrm{tr} = \text{vec}([A_\mathrm{tr}, B_\mathrm{tr}])\) are not precisely known, informative data need to be gathered through exploration. Our primary goal is to design exploration inputs that excite the system, over a fixed, user-chosen horizon \( T \in \mathbb{N}\) , so as to obtain an estimate \( \hat{\theta}_T=\text{vec}([\hat{A}_T,\hat{B}_T])\) that satisfies
(4)
Here, \( D_\mathrm{des} \succ 0\) is a user-defined matrix characterizing how closely \( \hat{\theta}_T\) should approximate the true parameters \( \theta_\mathrm{tr}\) . The exploration inputs are computed such that they excite the system sufficiently with minimal input energy, based on the initial parameter estimates (cf. Assumption 3). Denote \( U=[u_0^\top,\dots,u_{T-1}^\top]^\top \in \mathbb{R}^{T n_\mathrm{u}}\) . Bounding the input energy by a constant \( T \gamma_\mathrm{e}^2\geq 0\) can be equivalently written as
\( \|U\|^2 = \sum_{k=0}^{T-1}\|u_k\|^2 \leq T\gamma_\mathrm{e}^2. \)   (5)
In this section, we focus on quantifying the uncertainty associated with the unknown parameters \( \theta_\mathrm{tr}=\textnormal{vec}([A_{\mathrm{tr}},B_{\mathrm{tr}}])\) . For this purpose, we first outline regularized least squares estimation, and then summarize existing results by [13, 11] that provide a data-dependent uncertainty bound on the uncertain parameters.
Given observed data \( \mathcal{D}_{T+1}=\{x_k,u_k\}_{k=0}^{T}\) of length \( T+1\) , we denote
(6)
Given (6), the regularized least squares estimate is given by
(7)
with the regularization constant \( \lambda > 0\) and the excitation
(8)
Using (7), the estimate \( \hat{\theta}_{T}\) can be expressed as
(9)
(10)
Thus, the regularized least squares error satisfies
(11)
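Since the displayed equations are not reproduced here, the following Python sketch assumes the standard ridge-regression form of (7), \( [\hat{A}_T, \hat{B}_T] = X_+\Phi^\top(\Phi\Phi^\top+\lambda I)^{-1}\) , with \( \Phi\) collecting \( \phi_0,\dots,\phi_{T-1}\) and \( X_+\) collecting \( x_1,\dots,x_T\) as columns; the matrices and dimensions are illustrative only, not taken from the paper:

```python
import numpy as np

rng = np.random.default_rng(1)
nx, nu, T, lam, sigma_w = 2, 1, 2000, 1.0, 0.01
A = np.array([[0.8, 0.1], [0.0, 0.7]])
B = np.array([[0.0], [1.0]])

Phi = np.zeros((nx + nu, T))     # columns phi_0, ..., phi_{T-1}
Xp = np.zeros((nx, T))           # columns x_1, ..., x_T
x = np.zeros(nx)
for k in range(T):
    u = rng.standard_normal(nu)  # random excitation, for this sketch only
    Phi[:, k] = np.concatenate([x, u])
    x = A @ x + B.flatten() * u + sigma_w * rng.standard_normal(nx)
    Xp[:, k] = x

D_bar = lam * np.eye(nx + nu) + Phi @ Phi.T      # regularized excitation
AB_hat = Xp @ Phi.T @ np.linalg.inv(D_bar)       # ridge estimate [A_hat, B_hat]
err = np.linalg.norm(AB_hat - np.hstack([A, B]))
assert err < 0.1   # accurate for T = 2000 and small noise
```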
Self-normalized martingales: The sequence \( \{S_T \}_{ T \geq 0}\) with \( S_T=(\Phi\otimes I_{n_\mathrm{x}}) W\) is a martingale with respect to \( \{\mathcal{F}_T \}_{T=0}^\infty\) (cf. (1) and [20]). This sequence is crucial for the construction of confidence ellipsoids for \( \theta_\mathrm{tr}\) . In [13, Theorem 1], a `self-normalized bound' for the martingale \( \{S_T \}_{T \geq 0}\) is derived under the assumption of scalar disturbances \( w_k\) , i.e., \( n_\mathrm{x}=1\) . The following lemma generalizes the bound in [13, Theorem 1] to vector-valued disturbances, i.e., \( n_\mathrm{x} > 1\) , by utilizing covering techniques as in [11, Proposition 8.2] or [16].
Lemma 1
(Self-normalized bound for vector processes) Let Assumption 1 hold. For any \( \delta \in (0,1)\) , with probability at least \( 1-\delta\) , for all \( T \in \mathbb{N}\) ,
(12)
The proof of Lemma 1 may be found in [11, Proposition 8.2] or [16]. In order to derive the confidence ellipsoid, we first derive a bound on \( \theta_\mathrm{tr}\) of the form \( \|\theta_\mathrm{tr}\| \leq \bar{\theta}\) . By applying the Schur complement twice to the condition in (3), we get \( (\hat{\theta}_0-\theta_\mathrm{tr})(\hat{\theta}_0-\theta_\mathrm{tr})^\top \preceq (D_0 \otimes I_{n_\mathrm{x}})^{-1}\) , from which we have
(13)
Using the triangle inequality, we have
(14)
The following theorem utilizes the result in Lemma 1 to derive a data-dependent uncertainty bound in the form of a confidence ellipsoid centered at the estimate \( \hat{\theta}_{T}\) for the uncertain parameters \( \theta_\mathrm{tr}\) .
Theorem 1
[13, Theorem 2] Let Assumptions 1 and 3 hold. Then, for any \( \delta \in (0,1)\) , \( \mathbb{P}[\theta_\mathrm{tr} \in \Theta_T, \forall T] \geq 1-\delta\) , where
(15)
We utilize this data-dependent uncertainty bound to design a targeted exploration strategy that achieves the exploration goal provided in (4).
Given the uncertainty bound in Theorem 1, the following theorem presents conditions on the exploration data that imply the exploration goal (4).
Theorem 2
Let Assumptions 1 and 3 hold. Suppose \( \Phi\) satisfies
(16)
with \( \bar{D}_T\) according to (7). Then, the estimate \( \hat{\theta}_{T}\) computed as in (9) satisfies the exploration goal (4) with probability at least \( 1-\delta\) .
Proof
By applying the Schur complement twice to the condition in (15), we have
(17)
with probability at least \( 1-\delta\) . Inequality (16) can be written as
(18)
(19)
By inserting (18) in (17), we get
(20)
with probability at least \( 1-\delta\) . Finally, applying the Schur complement twice to (20) yields the exploration goal (4).
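The repeated Schur-complement arguments rest on the standard equivalence: for \( Q \succ 0\) , the block matrix \( \begin{bmatrix} P & S \\ S^\top & Q\end{bmatrix} \succeq 0\) if and only if \( P - SQ^{-1}S^\top \succeq 0\) . A quick numerical sanity check (illustrative Python; the matrices are arbitrary):

```python
import numpy as np

def psd(M, tol=1e-9):
    """Check positive semidefiniteness of a symmetric matrix."""
    return np.min(np.linalg.eigvalsh(M)) >= -tol

P = np.array([[2.0, 0.3], [0.3, 1.5]])
Q = np.array([[1.0, 0.2], [0.2, 2.0]])   # positive definite
S = np.array([[0.5, 0.1], [0.0, 0.4]])

M = np.block([[P, S], [S.T, Q]])
schur = P - S @ np.linalg.inv(Q) @ S.T
# equivalence: M PSD  <=>  Schur complement PSD (given Q > 0)
assert psd(M) == psd(schur)

# scaling S up destroys semidefiniteness of both simultaneously
M_bad = np.block([[P, 10 * S], [10 * S.T, Q]])
schur_bad = P - 100 * S @ np.linalg.inv(Q) @ S.T
assert (not psd(M_bad)) and (not psd(schur_bad))
```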
In contrast to standard asymptotic bounds, e.g., [6, 9], Theorem 2 depends additionally on the term \( (R(\bar{D}_T)^\frac{1}{2} + \lambda^\frac{1}{2} \bar{\theta})^2\) which is nonconvex in \( \bar{D}_T\) . Hence, in the following lemma we derive a sufficient condition that is convex in \( \bar{D}_T\) .
Lemma 2
For any \( \bar{D}_T = \bar{D}_T^\top \succ 0\) , we have
(21)
where
(22)
Proof
We first derive an upper bound on the term \( R(\bar{D}_T)\) as follows:
(23)
Finally, by Young’s inequality [21], we have that
which yields Inequality (21).
Proposition 1
Let Assumptions 1 and 3 hold. Suppose \( \Phi\) satisfies
(24)
(25)
Then, the estimate \( \hat{\theta}_{T}\) computed as in (9) satisfies the exploration goal (4) with probability at least \( 1-\delta\) .
Proof
Starting from (24), we have
which yields condition (16) in Theorem 2. Hence, the exploration goal (4) is achieved with probability at least \( 1-\delta\) .
Note that Inequality (24) depends quadratically on \( \Phi\) , specifically on the finite excitation term \( \Phi \Phi^\top\) [18]. This further depends on the amplitudes of the inputs \( U_\mathrm{e}\) and the disturbances. Additionally, the linear mapping from the input sequence to the state sequence is uncertain due to the uncertainty in the true dynamics. These issues are addressed in the next section.
In this section, we determine sufficient conditions for exploration in terms of the spectral content of \( \Phi\) in order to pose the exploration problem directly in terms of the inputs. To this end, we first define the amplitude of a spectral line.
Definition 2
(Amplitude of a spectral line [18]) Given a sequence \( \{\phi_k\}_{k=0}^{T-1}\) , the amplitude of the spectral line of the sequence \( \bar{\phi}(\omega_0)\) at a frequency \( \omega_0 \in \Omega_T:=\{0,1/T,\dots,(T-1)/T\}\) is given by
(26)
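Under the common convention that the spectral line is the \( 1/T\) -normalized DFT coefficient, \( \bar{\phi}(\omega_0) = \frac{1}{T}\sum_{k=0}^{T-1}\phi_k e^{-j2\pi\omega_0 k}\) (an assumed convention here, since the displayed formula (26) is not reproduced in this version), a cosine of amplitude \( a\) at a grid frequency \( \omega_0 \in (0, 1/2)\) produces a spectral line of amplitude \( a/2\) at \( \omega_0\) and none at other grid frequencies. Illustrative Python for a scalar sequence:

```python
import numpy as np

def spectral_line(phi, omega0):
    """1/T-normalized DFT coefficient at frequency omega0 (cycles/sample)."""
    T = phi.shape[0]
    k = np.arange(T)
    return np.mean(phi * np.exp(-2j * np.pi * omega0 * k))

T, m, a = 100, 10, 0.7
omega0 = m / T                                     # grid frequency in Omega_T
u = a * np.cos(2 * np.pi * omega0 * np.arange(T))  # single-tone input

assert abs(spectral_line(u, omega0) - a / 2) < 1e-10
assert abs(spectral_line(u, 2 * omega0)) < 1e-10   # no line at other frequencies
```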
In what follows, we first establish the exploration strategy in Section 5.1 and then provide a bound on the excitation \( \Phi \Phi^\top\) in terms of spectral information in Section 5.2. We then explicitly account for the transient error due to the input and the effect of the disturbances in Section 5.3. In order to derive conditions linear in the decision variables, we provide a convex relaxation procedure in Section 5.4 and an upper bound on the excitation in Section 5.5. Finally, in Section 5.6, we account for parametric uncertainty and obtain an SDP for exploration whose solution provides exploration inputs that ensure the exploration goal.
The exploration input sequence takes the form
(27)
where \( \bar{u}(\omega_i) \in \mathbb{R}^{n_\mathrm{u}}\) are the amplitudes of the sinusoidal inputs at \( L \in \mathbb{N}\) distinct selected frequencies \( \omega_i \in \Omega_T\) . The amplitude of the spectral line of the sequence \( \{u_k\}_{k=0}^{T-1}\) at frequency \( \omega_i\) is \( \bar{u}(\omega_i)\) . Denote \( U_\mathrm{e}=\mathrm{diag}(\bar{u}(\omega_1),\dots,\bar{u}(\omega_L)) \in \mathbb{R}^{Ln_\mathrm{u} \times L}\) . By the Parseval–Plancherel identity, bounding the average input energy by a constant \( \gamma^2_\mathrm{e}\) (5) can be equivalently written as
(28)
where \( \mathbf{1}_{L} \in \mathbb{R}^{L\times 1}\) is a vector of ones, and the bound \( \gamma_\mathrm{e} \geq 0\) should be kept small. Using the Schur complement, this criterion is equivalent to the following LMI:
(29)
The uncertainty bound on the parameters of the system evolving under the exploration input sequence (27) can be computed from the spectral content of the observed state \( x_k\) and the applied input \( u_k\) . Given \( u_k\) as in (27), the sequences \( \{x_k\}_{k=0}^{T-1}\) and \( \{u_k\}_{k=0}^{T-1}\) have \( L\) spectral lines at the distinct frequencies \( \omega_i \in \Omega_T\) , \( i=1,\dots,L\) , with amplitudes \( \bar{x}(\omega_i)\) and \( \bar{u}(\omega_i)\) , respectively. Recall that \( \phi_k=[x_k^\top, u_k^\top]^\top\) . From (1), we have
(30)
where \( \bar{x}_{\mathrm{u,d}}(\omega_i)\) denotes the asymptotic effect of the input, \( \bar{x}_{\mathrm{u,t}}(\omega_i)\) denotes the transient error and \( \bar{x}_{\mathrm{w}}(\omega_i)\) denotes the effect of the disturbance. Furthermore, we have
(31)
(32)
Further, \( \bar{u}(\omega_i)\) can be written as
(33)
Note that for all \( \omega_i \in \Omega_T\) , we have
(34)
Using (30), we have
(35)
where
(36)
We compactly define
(37)
where \( \ast \in \{\mathrm{w};\mathrm{u};\mathrm{u,d};\mathrm{u,t} \}\) , which satisfies
(38)
(39)
The following lemma provides a lower bound on \( \Phi \Phi^\top\) in terms of the spectral lines of \( \phi_k\) .
Lemma 3
For any \( \epsilon \in (0,1)\) , \( \phi_k\) satisfies
(40)
Proof
From the Parseval-Plancherel identity, we have
For any \( \epsilon >0\) , applying Young’s inequality [21] twice gives
which yields Inequality (40).
The result in Lemma 3 depends on the transient error due to the input and the effect of the disturbance, which is unknown. Therefore, in what follows, we derive suitable bounds.
Due to bounded inputs and strict stability, the transient error \( \bar{x}_\mathrm{u,t}(\omega_i)\) decays uniformly to zero as \( T \to \infty\) . Let us denote the impulse response of the system with respect to the input \( u_k\) as \( G(i)=A_\mathrm{tr}^{i-1}B_\mathrm{tr}\) . The following lemma provides a deterministic bound on the transient error for finite \( T\) .
Lemma 4
[1, Theorem 2.1] Let Assumption 2 hold. We have
(41)
where \( G_\mathrm{tr} = \sum_{i=1}^\infty i \|G(i)\| < \infty\) .
Since \( A_\mathrm{tr}\) is Schur stable (cf. Assumption 2), there exist constants \( C \geq 1\) and \( \rho \in (0,1)\) such that \( \|A_\mathrm{tr}^k\| \leq C \rho^k\) , \( \forall k \in \mathbb{N}\) . From the convergence of arithmetico-geometric series, we have
(42)
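Numerically, with a normal (here diagonal) \( A\) one may take \( C=1\) and \( \rho\) equal to the spectral radius, and check the truncated series \( \sum_{i\geq 1} i\|A^{i-1}B\|\) against \( C\|B\|\sum_{i\geq 1} i\rho^{i-1} = C\|B\|/(1-\rho)^2\) , using the arithmetico-geometric identity \( \sum_{i\geq 1} i\rho^{i-1} = (1-\rho)^{-2}\) (illustrative Python; the specific matrices are arbitrary):

```python
import numpy as np

A = np.diag([0.9, 0.5])            # normal, Schur stable: ||A^k|| = 0.9^k
B = np.array([[0.0], [1.0]])
C, rho = 1.0, 0.9                  # ||A^k|| <= C * rho^k, with equality here

# truncated G_tr = sum_{i>=1} i * ||A^{i-1} B||  (spectral norm)
G_tr = sum(i * np.linalg.norm(np.linalg.matrix_power(A, i - 1) @ B, 2)
           for i in range(1, 2000))

bound = C * np.linalg.norm(B, 2) / (1 - rho) ** 2   # C ||B|| / (1 - rho)^2
assert G_tr <= bound
```

Here \( A^{i-1}B = [0;\,0.5^{i-1}]\) , so the truncated series converges to \( \sum_i i\,0.5^{i-1} = 4\) , well below the bound \( 100\) .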
Using (30) and (32), we can write \( \bar{\Phi}_\mathrm{w}\) as
(43)
Given that \( W \sim \mathsf{subG}_{Tn_\mathrm{x}}(\sigma_\mathrm{w}^2I_{Tn_\mathrm{x}})\) with zero mean, [19, Corollary 1] implies
(44)
Lemma 5
The matrices \( \bar{\Phi}_\mathrm{u,t}\) and \( \bar{\Phi}_\mathrm{w}\) satisfy
(45)
(46)
Proof
Using (41) from Lemma 4 and (42), we have
which yields (45). By using the sub-multiplicativity property of the operator norm, we have
with probability at least \( 1-\delta\) , which yields (46).
The following proposition provides a sufficient condition linear in \( U_\mathrm{e}\) .
Proposition 2
Let Assumptions 1, 2 and 3 hold. Suppose matrices \( U_\mathrm{e}\) , \( \tilde{U}\) and \( \Phi\) satisfy
(47)
Then, an estimate \( \hat{\theta}_T\) computed as in (7) upon the application of the input (27) satisfies the exploration goal (4) with probability at least \( 1-2 \delta\) .
Proof
The proof is divided into two parts. In the first part, we show that condition (47) implies condition (24) in Proposition 1. In the second part, we derive joint probabilistic bounds for the Inequality (47).
Part I. From [8, Lemma 12], we have the following convex relaxation:
(48)
Starting from (47), we have
(49)
Multiplying (49) by \( T\) yields (24) in Proposition 1.
Part II. We have
(50)
wherein the penultimate inequality follows from De Morgan’s law.
In Part I, assuming \( \theta_\mathrm{tr} \in \Theta_T\) (15) and \( \|W\|^2 \leq \gamma_\mathrm{w}\) (44), Inequality (47) implies (4). Since both \( \theta_\mathrm{tr} \in \Theta_T\) and \( \|W\|^2 \leq \gamma_\mathrm{w}\) hold jointly with probability at least \( 1-2\delta\) as shown in Part II, Inequality (47) implies (4) with probability at least \( 1-2\delta\) .
Note that Inequality (47) in Proposition 2 requires an upper bound on the exploration trajectory data \( \bar{D}_T\) (cf. (7)) in order to derive an LMI for exploration.
In the following lemma, we provide an upper bound on the term \( \log(\det(\bar{D}_T))\) .
Lemma 6
Let \( S_{\textnormal{energy-bound-1}}(\gamma_\mathrm{e},U_\mathrm{e}) \succeq 0\) in (29) and \( \|W\|^2 \leq \gamma_\mathrm{w}\) hold. Then, the matrix \( \bar{D}_T\) as in (7) satisfies
(51)
(52)
Proof
With \( L=T\) , and using (31) and (32), we have
Similar to (46), using the sub-multiplicativity property of the operator norm, (28), and (34), we have
(53)
(54)
By the Parseval-Plancheral identity, Young’s inequality [21], Inequalities (46), (53), and with \( L=T\) , we have
Hence, we have
which yields (51).
Recall \( C_1\) from (22). Henceforth, we define
(55)
Proposition 3
Let Assumptions 1, 2 and 3 hold. Suppose matrices \( U_\mathrm{e}\) , \( \tilde{U}\) satisfy
(56)
Then, an estimate \( \hat{\theta}_T\) computed as in (7) upon the application of the input (27) satisfies the exploration goal (4) with probability at least \( 1-2 \delta\) .
Proof
Using \( C_2\) and \( C_3\) (55), and inserting Inequality (51) in (56) yields condition (47) in Proposition 2.
The result in Proposition 3 depends on the transfer matrix \( V_\mathrm{tr}\) , which depends on the true dynamics \( A_\mathrm{tr},\,B_\mathrm{tr}\) and is hence uncertain. Therefore, in what follows, we derive suitable bounds.
Denote
(57)
(58)
where \( \hat{V}\) is computed using the initial estimates \( \hat{\theta}_0=\mathrm{vec}([\hat{A}_0,\hat{B}_0])\) (cf. Assumption 3). We compute a matrix \( \tilde{\Gamma}_\mathrm{V} \succeq 0\) using \( \theta_\mathrm{tr} \in \mathbf{\Theta}_0\) such that
(59)
Since Inequality (56) in Proposition 3 is nonlinear in the decision variable \( \gamma_\mathrm{e}\) , we impose an upper bound of the form \( \gamma_\mathrm{e}^2 \leq \bar{\gamma}\) , where \( \bar{\gamma}\) is fixed. Using the Schur complement, this criterion is equivalent to
(60)
(61)
(62)
(63)
The following theorem provides a sufficient condition linear in \( U_\mathrm{e}\) , which ensures the exploration goal (4).
Theorem 3
Let Assumptions 1, 2 and 3 hold. Suppose the following SDP is feasible:
(64)
where \( S_{\textnormal{energy-bound-1}}\) , \( S_\textnormal{energy-bound-2}\) and \( S_\textnormal{exp}\) are defined in (29), (60) and (62), respectively. Then, an estimate \( \hat{\theta}_T\) computed as in (7) upon the application of the input (27) satisfies the exploration goal (4) with probability at least \( 1-2\delta\) .
Proof
By using (57), Inequality (59) can be written as
(65)
By using the matrix S-lemma [22], Inequality (61) holds for all \( V_\mathrm{tr}\) satisfying (65) if \( S_\textnormal{exp}(\epsilon, \lambda, \tau, U_\mathrm{e}, \tilde{U}, \gamma_\mathrm{e}, \bar{\gamma}, \Gamma_\mathrm{u}, \Gamma_\mathrm{w}, \tilde{\Gamma}_\mathrm{V}, \hat{V}, D_\mathrm{des}) \succeq 0\) (62) holds with \( \tau \geq 0\) . Inequality (61) can be written as
Inserting the inequality \( \gamma_\mathrm{e}^2 \leq \bar{\gamma}\) , i.e., \( S_\textnormal{energy-bound-2}(\gamma_\mathrm{e},\bar{\gamma}) \succeq 0\) , yields condition (56) in Proposition 3. Hence, an estimate \( \hat{\theta}_T\) computed as in (7) upon the application of the solution of Problem (64), \( U_\mathrm{e}\) , satisfies the exploration goal (4) with probability at least \( 1-2\delta\) .
A solution of (64) yields \( U_\mathrm{e}=\mathrm{diag}(\bar{u}(\omega_1),\dots,\bar{u}(\omega_L))\) , i.e., the exploration input, which guarantees the desired uncertainty bound \( D_\mathrm{des}\) (4). In LMI (62), the terms \( \Gamma_\mathrm{u}\) , \( \Gamma_\mathrm{w}\) , \( C_3\) and \( G_\mathrm{tr}\) comprise the constants \( \|A_\mathrm{w}\|\) , \( \|A_\mathrm{u}\|\) , \( \|B_\mathrm{tr}\|\) , \( C\) and \( \rho\) , which are uncertain since they depend on the true dynamics \( A_\mathrm{tr}, B_\mathrm{tr}\) . Given \( \theta_\mathrm{tr} \in \mathbf{\Theta}_0\) , such constants can be computed using robust control tools or scenario optimization [9, Appendices A, C]. Furthermore, the matrix \( \tilde{\Gamma}_\mathrm{V} \succeq 0\) may be computed using robust control methods, cf. [9, Appendices A, C], since (59) is an LMI. Note that imposing the upper bound \( \bar{\gamma} \geq \gamma_\mathrm{e}^2\) and the convex relaxation procedure introduce suboptimality in the solution of (64). This suboptimality can be reduced by solving (64) repeatedly, re-computing \( \tilde{U}\) and \( \bar{\gamma}\) for the next iteration as
(66)
where \( U_\mathrm{e}^*, \gamma_\mathrm{e}^*\) denote the solution from the previous iteration; the iteration is terminated once \( \tilde{U}\) and \( \bar{\gamma}\) no longer change. Since the previous optimal solution \( U_\mathrm{e}^*\) remains feasible in the next iteration, \( \gamma_\mathrm{e}\) is guaranteed to be non-increasing. From LMI (62), it can be inferred that \( \gamma_\mathrm{e}^2\) scales as \( \frac{1}{T}\) . As \( D_\mathrm{des}\) and \( T\) grow proportionally, \( U_\mathrm{e}\) remains nearly unchanged. The proposed targeted exploration strategy is outlined in Algorithm 1. Note that the proposed algorithm directly ensures the desired accuracy of the parameters and requires no iterative experiments.
In this section, we demonstrate the applicability of the exploration strategy through a numerical example. Numerical simulations were performed in MATLAB using CVX [23] in conjunction with the default SDP solver SDPT3. We consider a linear system (1) with
(67)
which belongs to a class of systems identified as ‘hard to learn’ [24]. In our simulations, we set the desired error bound as \( D_\mathrm{des}^{-1}=10^4 I_{n_\phi}\) (cf. (4)). Furthermore, we select \( L=5\) frequencies \( \omega_i\in\{0,0.1,0.2,0.3,0.4\}\) and the initial estimate as \( \hat{\theta}_0=\theta_\mathrm{tr}+(4 \times 10^{-4})\mathbf{1}_{n_\mathrm{x}n_\phi}\) with initial uncertainty level \( D_0 = 10^3 I_{n_\phi}\) . We set \( \epsilon=0.5\) , \( \lambda =1\) , \( \delta = 0.05\) and \( \sigma_\mathrm{w}=0.01\) . In what follows, we study the effect of the finite duration \( T\) of the experiment on the average input energy required.
Non-asymptotic behaviour: In the asymptotic case where \( T \to \infty\) [9], the required input energy \( T\gamma_\mathrm{e}^2\) scales linearly with the desired accuracy \( \|D_\mathrm{des}\|\) . In this example, we study the effect of the exploration time \( T\) and the transient error decay on the required input energy \( \gamma_\mathrm{e}^2\) . For this purpose, we run six trials for the following exploration times \( T \in \{ 10^{10},10^{11},10^{12},10^{13},10^{14},10^{15}\}\) . For each trial, we scale \( D_\mathrm{des}\) proportionately, yielding \( D_\mathrm{des} \in \{10^0,10^1,10^2,10^3,10^4,10^5 \}\) , respectively. Each trial comprises executing Algorithm 1 to obtain the exploration inputs (27) and the average input energy \( \gamma_\mathrm{e}^2\) . In Figure 1, we see that for \( T \gg 10^{12}\) , we approximately recover this linear relationship, while for \( T < 10^{11}\) , the transient effects are significant and slightly more input energy is required. This highlights the role of transient effects in finite-time experiment design, indicating the need for careful input design to mitigate transient effects even for large times \( T\) .
[1] System Identification: Theory for the User, 2nd ed., Prentice Hall PTR, 1999.
[2] Optimal experiment design for open and closed-loop system identification, Communications in Information and Systems, 11(3):197–224, 2011.
[3] Input design via LMIs admitting frequency-wise model specifications in confidence regions, IEEE Transactions on Automatic Control, 50(10):1534–1549, 2005.
[4] Robust optimal identification experiment design for multisine excitation, Automatica, 125:109431, 2021.
[5] Identification and control: Joint input design and \( H_\infty\) state feedback with ellipsoidal parametric uncertainty via LMIs, Automatica, 44(2):543–551, 2008.
[6] Robust exploration in linear quadratic reinforcement learning, Advances in Neural Information Processing Systems, pp. 15310–15320, 2019.
[7] Learning robust LQ-controllers using application oriented exploration, IEEE Control Systems Letters, 4(1):19–24, 2019.
[8] Robust targeted exploration for systems with non-stochastic disturbances, arXiv preprint arXiv:2412.20426, 2024.
[9] Sequential learning and control: Targeted exploration for robust performance, IEEE Transactions on Automatic Control, pp. 1–16, 2024, doi: 10.1109/TAC.2024.3430088.
[10] Learning without mixing: Towards a sharp analysis of linear system identification, Conference on Learning Theory, pp. 439–473, PMLR, 2018.
[11] Near optimal finite time identification of arbitrary linear dynamical systems, International Conference on Machine Learning, pp. 5610–5618, PMLR, 2019.
[12] Finite sample analysis of stochastic system identification, Proc. 58th IEEE Conference on Decision and Control (CDC), pp. 3648–3654, 2019.
[13] Improved algorithms for linear stochastic bandits, Advances in Neural Information Processing Systems, vol. 24, 2011.
[14] On the sample complexity of the linear quadratic regulator, Foundations of Computational Mathematics, pp. 1–47, 2019.
[15] Certainty equivalence is efficient for linear quadratic control, Advances in Neural Information Processing Systems, vol. 32, 2019.
[16] Statistical learning theory for control: A finite-sample perspective, IEEE Control Systems Magazine, 43(6):67–97, 2023.
[17] Active learning for identification of linear dynamical systems, Conference on Learning Theory, pp. 3487–3582, PMLR, 2020.
[18] Accurate parameter estimation for safety-critical systems with unmodeled dynamics, Artificial Intelligence, 316:103857, 2023, doi: 10.1016/j.artint.2023.103857.
[19] Stochastic Model Predictive Control for Sub-Gaussian Noise, arXiv preprint arXiv:2503.08795, 2025.
[20] Probability with Martingales, Cambridge University Press, 1991.
[21] LMI Properties and Applications in Systems, Stability, and Control Theory, arXiv preprint arXiv:1903.08599, 2019.
[22] From Noisy Data to Feedback Controllers: Nonconservative Design via a Matrix S-Lemma, IEEE Transactions on Automatic Control, 67(1):162–175, 2022, doi: 10.1109/TAC.2020.3047577.
[23] CVX: MATLAB software for disciplined convex programming, March 2014.
[24] Linear systems can be hard to learn, Proc. 60th IEEE Conference on Decision and Control (CDC), pp. 2903–2910, 2021.