
First published on Friday, May 16, 2025 and last modified on Friday, May 16, 2025 by François Chaplais.

Beyond Asymptotics: Targeted exploration with finite-sample guarantees
arXiv
Published version: 10.48550/arXiv.2504.02380

Janani Venkatasubramanian, Institute for Systems Theory and Automatic Control, University of Stuttgart, 70550 Stuttgart, Germany. J. Venkatasubramanian thanks the International Max Planck Research School for Intelligent Systems (IMPRS-IS) for its support.

Johannes Köhler, Institute for Dynamic Systems and Control, ETH Zürich, CH-8092 Zürich, Switzerland.

Frank Allgöwer, Institute for Systems Theory and Automatic Control, University of Stuttgart, 70550 Stuttgart, Germany. F. Allgöwer thanks the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) for funding this work under Germany’s Excellence Strategy - EXC 2075 - 390740016 and under grant 468094890. F. Allgöwer acknowledges the support of the Stuttgart Center for Simulation Science (SimTech).

Abstract

In this paper, we introduce a targeted exploration strategy for the non-asymptotic, finite-time case. The proposed strategy is applicable to uncertain linear time-invariant systems subject to sub-Gaussian disturbances. As the main result, the proposed approach provides a priori guarantees, ensuring that the optimized exploration inputs achieve a desired accuracy of the model parameters. The technical derivation of the strategy (i) leverages existing non-asymptotic identification bounds based on self-normalized martingales, (ii) utilizes spectral lines to predict the effect of sinusoidal excitation, and (iii) effectively accounts for the spectral transient error and parametric uncertainty. A numerical example illustrates how the finite exploration time influences the required exploration energy.

1 Introduction

System identification bridges control theory and machine learning by providing methods to model dynamical systems from data, enabling better prediction and control [1]. Within this field, optimal experiment design and targeted exploration [2] develop methods to optimally excite dynamical systems to reduce model uncertainty, thereby (i) attaining a model with desired accuracy [3, 4], or (ii) ensuring the feasibility of a subsequent robust control design [5, 6, 7, 8, 9]. Theoretical results on optimal experiment design rely primarily on asymptotic bounds for identification error, which are valid in the limit as the amount of data approaches infinity [1]. However, in practice we only have finite data from experiments and the reliability of the model-based controller depends significantly on the quality of the data used for system identification. Recently, numerous results have been developed for non-asymptotic (finite-sample) analysis of system identification [10, 11, 12] and subsequent control [13, 14, 15]. An overview of these methods is provided in [16]. In general, these non-asymptotic results depend on excitation from applying random inputs to ensure learning, and do not optimize the inputs for excitation. Recent works [17, 18] consider targeted exploration with periodic/sinusoidal inputs, adopting a frequency-domain analysis for parameter estimation. However, neither approach explicitly accounts for transient effects [1] during input design or analysis. Hence their guarantees are contingent on the dissipation of transient error and the attainment of steady-state response. This leads to prolonged experiment durations, which may be impractical for many real-world applications.

In this paper, we design a targeted exploration strategy that ensures a desired error bound on the estimated parameters with high probability by utilizing a non-asymptotic data-dependent uncertainty bound [13, 11]. In particular, we derive a priori guarantees on the error bound that can be ensured before exploration. To shape and reduce uncertainty in a targeted manner, we consider multi-sine exploration inputs in selected frequencies and optimized amplitudes. As our main contribution, we derive sufficient conditions directly in terms of the exploration inputs that ensure a desired parameter error bound with high probability through finite-time exploration. In particular, we explicitly account for transient error due to the input and the effect of the disturbances in the design of targeted exploration. We use these sufficient conditions to formulate linear matrix inequalities (LMIs) for exploration, ensuring the desired parameter error bound. This leads to a targeted exploration design with minimal input energy based on a semidefinite program (SDP).

Outline: We first provide the setting and define the exploration objective in Section 2. We then summarize data-dependent non-asymptotic identification bounds from [13, 11] in Section 3. In Section 4, we derive sufficient conditions on the exploration data. In Section 5 we derive sufficient conditions on the exploration inputs by leveraging the theory of spectral lines [18] and explicitly accounting for transient error due to the input in the exploration design. Finally, in Section 6, we provide a numerical example to highlight the effect of the finite time of the experiment on the required input energy.

Notation: The transpose (conjugate transpose) of a matrix \( A \in \mathbb{R}^{n \times m} (\mathbb{C}^{n \times m})\) is denoted by \( A^\top\) (\( A^\mathsf{H}\) ). The positive semi-definiteness (definiteness) of a matrix \( A \in \mathbb{C}^{n \times n}\) is denoted by \( A = A^\mathsf{H} \succeq (\succ) 0\) . The Kronecker product operator is denoted by \( \otimes\) . The operator \( \mathrm{vec}(A)\) stacks the columns of \( A\) to form a vector. The operator \( \textnormal{diag}(A_1,\dots,A_n)\) creates a block diagonal matrix by aligning the matrices \( A_1,\dots,A_n\) along the diagonal. The Euclidean norm for a vector \( x\in\mathbb{R}^n\) is denoted by \( \|x\|=\sqrt{x^\top x}\) . Given a matrix \( P \in \mathbb{R}^{n \times n},\, P \succ 0\) , the weighted Euclidean norm is given by \( \|x\|_P=\sqrt{x^\top P x}\) . The largest singular value of a matrix \( A\in\mathbb{C}^{m\times n}\) is denoted by \( \|A\|\) . \( \mathbb{E}[X]\) denotes the expected value of a random variable \( X\) . \( \mathbb{P}[E]\) denotes the probability of an event \( E\) occurring.

2 Problem Statement

Consider a discrete-time linear time-invariant dynamical system of the form

\[ \begin{equation} x_{k+1}=A_{\mathrm{tr}}x_k+B_{\mathrm{tr}}u_k+w_k, \end{equation} \]

(1)

where \( k \in \mathbb{N}\) denotes time, \( x_k \in \mathbb{R}^{n_\mathrm{x}}\) is the state, \( u_k \in \mathbb{R}^{n_\mathrm{u}}\) is the control input, and \( w_k \in \mathbb{R}^{n_\mathrm{x}}\) is the disturbance. It is assumed that the state \( x_k\) is directly measurable. To simplify the exposition, we assume that the initial state \( x_0\) is zero. The true system parameters \( A_\mathrm{tr}\) , \( B_\mathrm{tr}\) are initially uncertain, and informative data must be collected through exploration to enhance the accuracy of the parameter estimates. Henceforth, we denote \( \phi_k=[x_k^\top\,, u_k^\top]^\top \in \mathbb{R}^{n_\phi}\) where \( n_\phi=n_\mathrm{x}+n_\mathrm{u}\) . The system (1) can be re-written in terms of the unknown parameter \( \theta_\mathrm{tr}=\textnormal{vec}([A_\mathrm{tr},B_\mathrm{tr}]) \in \mathbb{R}^{n_\theta}\) as

\[ \begin{align} x_{k+1}=(\phi_k^\top \otimes I_{n_\mathrm{x}})\theta_\mathrm{tr} + w_k \end{align} \]

(2)

where \( n_\theta = n_\mathrm{x}n_\phi\) . We consider a filtration of \( \sigma\) -algebras \( \{\mathcal{F}_k \}_{k \geq 0}\) such that \( \phi_k\) is \( \mathcal{F}_{k-1}\) -measurable and \( w_k\) is \( \mathcal{F}_k\) -measurable.
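The vectorized form (2) can be checked numerically: the identity \( (\phi_k^\top \otimes I_{n_\mathrm{x}})\mathrm{vec}([A_\mathrm{tr},B_\mathrm{tr}]) = A_\mathrm{tr}x_k + B_\mathrm{tr}u_k\) follows from the standard vectorization rule. A minimal Python sketch with illustrative matrices and signals (not the paper's numerical example):

```python
import numpy as np

# Verify the vectorized form (2) against the state-space form (1) for one
# step. A_tr, B_tr, x, u, w below are illustrative values.
rng = np.random.default_rng(0)
nx, nu = 2, 1
A_tr = np.array([[0.8, 0.1], [0.0, 0.7]])
B_tr = np.array([[0.0], [1.0]])
# theta_tr = vec([A_tr, B_tr]) stacks the columns of [A_tr, B_tr]
theta_tr = np.concatenate([A_tr, B_tr], axis=1).flatten(order="F")

x = rng.standard_normal(nx)
u = rng.standard_normal(nu)
w = rng.standard_normal(nx)
phi = np.concatenate([x, u])  # phi_k = [x_k; u_k]

x_next_ss = A_tr @ x + B_tr @ u + w                    # form (1)
x_next_vec = np.kron(phi, np.eye(nx)) @ theta_tr + w   # form (2)
```

Both expressions produce the same successor state, confirming the re-parameterization.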

Definition 1

(Sub-Gaussian random vector [19, Definition 2]) A random vector \( w_k \in \mathbb{R}^{n_\mathrm{x}}\) with mean \( \mathbb{E}[w_k]=\mu\) is called sub-Gaussian with proxy variance \( \Sigma \succeq 0\) , i.e., \( w_k \sim \mathsf{subG}(\mu,\Sigma)\) , if \( \forall \lambda \in \mathbb{R}^{n_\mathrm{x}}\) ,

\[ \begin{align*} \mathbb{E}\left[ \exp (\lambda^\top (w_k-\mu))|\mathcal{F}_{k-1}\right] \leq \exp \left( \frac{\|\lambda\|_{\Sigma}^2}{2} \right). \end{align*} \]

Assumption 1

The disturbance \( w_k\) is conditionally sub-Gaussian with zero mean and known proxy variance \( \sigma_\mathrm{w}^2I_{n_\mathrm{x}}\) [13], i.e., \( w_k \sim \mathsf{subG}(0,\sigma_\mathrm{w}^2I_{n_\mathrm{x}})\) .

Sub-Gaussian distributions encompass Gaussian, uniform, and other distributions with light tails. We assume that some prior knowledge of the dynamics is available.

Assumption 2

The system matrix \( A_\mathrm{tr}\) is Schur stable and the pair \( (A_\mathrm{tr},B_\mathrm{tr})\) is controllable.

Assumption 3

The parameters \( \theta_\mathrm{tr}=\mathrm{vec}([A_\mathrm{tr},B_\mathrm{tr}]) \in \mathbb{R}^{n_\theta}\) lie in a known set \( \mathbf{\Theta}_0\) , i.e., \( \theta_\mathrm{tr} \in \mathbf{\Theta}_0\) , where

\[ \begin{align} \mathbf{\Theta}_0:=\left\{\theta:(\hat{\theta}_0-\theta)^\top (D_0 \otimes I_{n_\mathrm{x}}) (\hat{\theta}_0-\theta) \leq 1 \right\}, \end{align} \]

(3)

with an estimate \( \hat{\theta}_0\) and for some \( D_0 \succ 0\) .

Exploration goal: Since the true system parameters \( \theta_\mathrm{tr} = \text{vec}([A_\mathrm{tr}, B_\mathrm{tr}])\) are not precisely known, informative data must be gathered through exploration. Our primary goal is to design exploration inputs that excite the system, over a fixed user-chosen time \( T \in \mathbb{N}\) , so as to obtain an estimate \( \hat{\theta}_T=\text{vec}([\hat{A}_T,\hat{B}_T])\) that satisfies

\[ \begin{align} (\theta_\mathrm{tr}-\hat{\theta}_T)^\top \left( D_\mathrm{des} \otimes I_\mathrm{n_x} \right)(\theta_\mathrm{tr}-\hat{\theta}_T)\leq 1. \end{align} \]

(4)

Here, \( D_\mathrm{des} \succ 0\) is a user-defined matrix characterizing how closely \( \hat{\theta}_T\) approximates the true parameters \( \theta_\mathrm{tr}\) . The exploration inputs are computed such that they excite the system sufficiently with minimal input energy, based on the initial parameter estimate (cf. Assumption 3). Denote \( U=[u_0^\top,\dots,u_{T-1}^\top]^\top \in \mathbb{R}^{T n_\mathrm{u}}\) . Bounding the input energy by a constant \( T \gamma_\mathrm{e}^2\geq 0\) can be equivalently written as

\[ \begin{align} \sum_{k=0}^{T-1}\|u_k\|^2 = \|U\|^2\leq T \gamma_\mathrm{e}^2. \end{align} \]

(5)

3 Preliminaries: Non-asymptotic uncertainty bound

In this section, we focus on quantifying the uncertainty associated with the unknown parameters \( \theta_\mathrm{tr}=\textnormal{vec}([A_{\mathrm{tr}},B_{\mathrm{tr}}])\) . For this purpose, we first outline regularized least squares estimation, and then summarize existing results by [13, 11] that provide a data-dependent uncertainty bound on the uncertain parameters.

Given observed data \( \mathcal{D}_{T+1}=\{x_k,u_k\}_{k=0}^{T}\) of length \( T+1\) , we denote

\[ \begin{align} \nonumber \Phi & = [\phi_0,\dots,\phi_{T-1}] \in \mathbb{R}^{n_\phi\times T}, \end{align} \]

(6)

\[ \begin{align} X & = [x_1^\top,\dots, x_T^\top]^\top \in \mathbb{R}^{Tn_\mathrm{x} \times 1},\\ W & = [w_0^\top,\dots, w_{T-1}^\top]^\top \in \mathbb{R}^{Tn_\mathrm{x} \times 1}. \\\end{align} \]

Given (6), the regularized least squares estimate is given by

\[ \begin{align} \hat{\theta}_{T}=\underbrace{( (D_T \otimes I_{n_\mathrm{x}})+\lambda I_{n_\phi n_\mathrm{x}})^{-1}}_{=:\bar{D}_T^{-1}} (\Phi\otimes I_{n_\mathrm{x}}) X, \end{align} \]

(7)

with the regularization constant \( \lambda > 0\) and the excitation

\[ \begin{align} D_T= \Phi \Phi^\top. \end{align} \]

(8)
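To make the estimator (7)-(8) concrete, the following Python sketch forms \( \bar{D}_T\) and the regularized least-squares estimate from simulated data; the system, the random inputs, and the noiseless setting are illustrative assumptions, not the paper's numerical example.

```python
import numpy as np

# Regularized least-squares estimate (7)-(8) on simulated data.
rng = np.random.default_rng(1)
nx, nu, T, lam = 2, 1, 200, 1e-6
A_tr = np.array([[0.8, 0.1], [0.0, 0.7]])
B_tr = np.array([[0.0], [1.0]])
theta_tr = np.concatenate([A_tr, B_tr], axis=1).flatten(order="F")

# Simulate (1) with exciting random inputs (noiseless here, to isolate
# the estimator's behavior).
x = np.zeros(nx)
Phi_cols, X_rows = [], []
for k in range(T):
    u = rng.standard_normal(nu)
    phi = np.concatenate([x, u])          # phi_k = [x_k; u_k]
    x = A_tr @ x + B_tr @ u
    Phi_cols.append(phi)
    X_rows.append(x)
Phi = np.stack(Phi_cols, axis=1)          # n_phi x T, as in (6)
X = np.concatenate(X_rows)                # stacked x_1, ..., x_T

D_T = Phi @ Phi.T                         # excitation (8)
n_theta = theta_tr.size
Dbar = np.kron(D_T, np.eye(nx)) + lam * np.eye(n_theta)
theta_hat = np.linalg.solve(Dbar, np.kron(Phi, np.eye(nx)) @ X)   # (7)
```

With persistently exciting inputs and a small regularization constant, the estimate recovers \( \theta_\mathrm{tr}\) up to the regularization bias \( -\lambda\bar{D}_T^{-1}\theta_\mathrm{tr}\) in (11).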

Using (7), the estimate \( \hat{\theta}_{T}\) can be expressed as

\[ \begin{align} \nonumber \hat{\theta}_{T} \overset{(2)}{=}& \bar{D}_T^{-1} (\Phi\otimes I_{n_\mathrm{x}})((\Phi^\top \otimes I_{n_\mathrm{x}})\theta_\mathrm{tr} + W) \end{align} \]

(9)

\[ \begin{align} =&\bar{D}_T^{-1} ((D_T \otimes I_{n_\mathrm{x}}) \theta_\mathrm{tr} + (\Phi \otimes I_{n_\mathrm{x}})W). \\\end{align} \]

(10)

Thus, the regularized least squares error satisfies

\[ \begin{align} \hat{\theta}_{T}-\theta_\mathrm{tr}=& -\lambda \bar{D}_T^{-1} \theta_\mathrm{tr} + \bar{D}_T^{-1} \underbrace{(\Phi\otimes I_{n_\mathrm{x}}) W}_{=:S_T}. \end{align} \]

(11)

Self-normalized martingales: The sequence \( \{S_T \}_{ T \geq 0}\) with \( S_T=(\Phi\otimes I_{n_\mathrm{x}}) W\) is a martingale with respect to \( \{\mathcal{F}_T \}_{T=0}^\infty\) (cf. (1) and [20]). This sequence is crucial for the construction of confidence ellipsoids for \( \theta_\mathrm{tr}\) . In [13, Theorem 1], a 'self-normalized bound' for the martingale \( \{S_T \}_{T \geq 0}\) is derived under the assumption of scalar disturbances \( w_k\) , i.e., \( n_\mathrm{x}=1\) . The following lemma generalizes the bound in [13, Theorem 1] to vector-valued disturbances, i.e., \( n_\mathrm{x} > 1\) , by utilizing covering techniques as in [11, Proposition 8.2] or [16].

Lemma 1

(Self-normalized bound for vector processes) Let Assumption 1 hold. For any \( \delta \in (0,1)\) , with probability at least \( 1-\delta\) , for all \( T \in \mathbb{N}\) ,

\[ \begin{align} \|S_T\|_{\bar{D}_T^{-1}}^2 \leq \underbrace{8 \sigma_\mathrm{w}^2 \log \left( \frac{5^{n_\mathrm{x}} \det (\bar{D}_T)^{\frac{1}{2}}}{\delta \det(\lambda I_{n_\mathrm{x} n_\phi})^{\frac{1}{2}}} \right)}_{:=R(\bar{D}_T)}. \end{align} \]

(12)

The proof of Lemma 1 may be found in [11, Proposition 8.2] or [16]. To construct the confidence ellipsoid, we first derive a bound on \( \theta_\mathrm{tr}\) of the form \( \|\theta_\mathrm{tr}\| \leq \bar{\theta}\) . By applying the Schur complement twice to the condition in (3), we obtain \( (\hat{\theta}_0-\theta_\mathrm{tr})(\hat{\theta}_0-\theta_\mathrm{tr})^\top \preceq (D_0 \otimes I_{n_\mathrm{x}})^{-1}\) , from which we have

\[ \begin{align} \|\hat{\theta}_0-\theta_\mathrm{tr}\| \leq \|D_0^{-\frac{1}{2}}\|. \end{align} \]

(13)

Using the triangle inequality, we have

\[ \begin{align} \|\theta_\mathrm{tr}\| & \leq \|\hat{\theta}_0\| + \|\hat{\theta}_0-\theta_\mathrm{tr}\| \overset{(13)}{\leq} \|\hat{\theta}_0\| + \|D_0^{-\frac{1}{2}}\|=: \bar{\theta}. \end{align} \]

(14)

The following theorem utilizes the result in Lemma 1 to derive a data-dependent uncertainty bound in the form of a confidence ellipsoid centered at the estimate \( \hat{\theta}_{T}\) for the uncertain parameters \( \theta_\mathrm{tr}\) .

Theorem 1

[13, Theorem 2] Let Assumptions 1 and 3 hold. Then, for any \( \delta \in (0,1)\) , \( \mathbb{P}[\theta_\mathrm{tr} \in \Theta_T, \forall T \in \mathbb{N}] \geq 1-\delta\) where

\[ \begin{align} \Theta_T =\left\{ \theta: \, \|\hat{\theta}_{T} - \theta \|_{\bar{D}_T} \leq R(\bar{D}_T)^\frac{1}{2} + \lambda^\frac{1}{2} \bar{\theta} \right\}. \end{align} \]

(15)
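The radius of the confidence ellipsoid (15) combines \( R(\bar{D}_T)\) from (12) with \( \bar{\theta}\) from (14) and can be evaluated directly from data. A minimal Python sketch; the noise level, excitation, and \( \bar{\theta}\) below are illustrative assumptions:

```python
import numpy as np

# Evaluate the ellipsoid radius R(Dbar_T)^{1/2} + lam^{1/2} * theta_bar.
sigma_w, lam, delta = 0.1, 1.0, 0.05
nx, nphi = 2, 3
n_theta = nx * nphi

rng = np.random.default_rng(2)
M = rng.standard_normal((nphi, nphi))
D_T = 50.0 * M @ M.T                          # excitation Phi Phi^T (illustrative)
Dbar = np.kron(D_T, np.eye(nx)) + lam * np.eye(n_theta)

# R(Dbar_T) from (12), via log-determinants for numerical stability
_, logdet_Dbar = np.linalg.slogdet(Dbar)
R = 8 * sigma_w**2 * (nx * np.log(5) + 0.5 * logdet_Dbar
                      - np.log(delta) - 0.5 * n_theta * np.log(lam))

theta_bar = 1.5                               # ||theta_hat_0|| + ||D_0^{-1/2}|| from (14)
radius = np.sqrt(R) + np.sqrt(lam) * theta_bar   # radius in (15)
```

Note that the radius grows only logarithmically with the excitation \( \det(\bar{D}_T)\) , while the ellipsoid shrinks linearly through \( \bar{D}_T\) itself.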

We utilize this data-dependent uncertainty bound to design a targeted exploration strategy that achieves the exploration goal provided in (4).

4 Data-dependent sufficient conditions for targeted exploration

Given the uncertainty bound in Theorem 1, the following theorem presents conditions on the exploration data that imply the exploration goal (4).

Theorem 2

Let Assumptions 1 and 3 hold. Suppose \( \Phi\) satisfies

\[ \begin{align} \Phi \Phi^\top + \lambda I_{n_\phi}-\left(R(\bar{D}_T)^\frac{1}{2} + \lambda^\frac{1}{2} \bar{\theta} \right)^2 D_\mathrm{des} \succeq 0 \end{align} \]

(16)

with \( \bar{D}_T\) according to (7). Then, the estimate \( \hat{\theta}_{T}\) computed as in (7) satisfies the exploration goal (4) with probability at least \( 1-\delta\) .

Proof

By applying the Schur complement twice to the condition in (15), we have

\[ \begin{align} \nonumber &(\hat{\theta}_T-\theta_\mathrm{tr}) (\hat{\theta}_T-\theta_\mathrm{tr})^\top \end{align} \]

(17)

\[ \begin{align} \preceq& \left(R(\bar{D}_T)^\frac{1}{2} + \lambda^\frac{1}{2} \bar{\theta} \right)^2 \bar{D}_T^{-1}\\ \overset{(7)}{=}& \left(R(\bar{D}_T)^\frac{1}{2} + \lambda^\frac{1}{2} \bar{\theta} \right)^2 ((\Phi \Phi^\top) \otimes I_{n_\mathrm{x}} + \lambda I_{n_\phi n_\mathrm{x}})^{-1} \\\end{align} \]

with probability at least \( 1-\delta\) . Inequality (16) can be written as

\[ \begin{align} \nonumber & \left((\Phi \Phi^\top) \otimes I_{n_\mathrm{x}} + \lambda I_{n_\phi n_\mathrm{x}}\right)^{-1} \end{align} \]

(18)

\[ \begin{align} \preceq & \left(R(\bar{D}_T)^\frac{1}{2} + \lambda^\frac{1}{2} \bar{\theta} \right)^{-2} (D_\mathrm{des} \otimes I_{n_\mathrm{x}})^{-1}. \\\end{align} \]

(19)

By inserting (18) in (17), we get

\[ \begin{align} (\theta_\mathrm{tr}-\hat{\theta}_T) (\theta_\mathrm{tr}-\hat{\theta}_T)^\top \preceq (D_\mathrm{des}\otimes I_{n_\mathrm{x}})^{-1} \end{align} \]

(20)

with probability at least \( 1-\delta\) . Finally, applying the Schur complement twice to (20) yields the exploration goal (4).

In contrast to standard asymptotic bounds, e.g., [6, 9], Theorem 2 depends additionally on the term \( (R(\bar{D}_T)^\frac{1}{2} + \lambda^\frac{1}{2} \bar{\theta})^2\) which is nonconvex in \( \bar{D}_T\) . Hence, in the following lemma we derive a sufficient condition that is convex in \( \bar{D}_T\) .

Lemma 2

For any \( \bar{D}_T \in \mathbb{R}^{n_\theta \times n_\theta}\) with \( \bar{D}_T \succ 0\) , we have

\[ \begin{align} \left(R(\bar{D}_T)^\frac{1}{2} + \lambda^\frac{1}{2} \bar{\theta} \right)^2 \leq \left(2C_1 + 8 \sigma_\mathrm{w}^2 \log(\det(\bar{D}_T)) + 2 \lambda \bar{\theta}^2\right) \end{align} \]

(21)

where

\[ \begin{align} C_1:=4 \sigma_\mathrm{w}^2 \left( \log \left( \frac{5^{2 n_\mathrm{x}}}{\delta^{2}}\right) -n_\theta\log (\lambda) \right). \end{align} \]

(22)

Proof

We first derive an upper bound on the term \( R(\bar{D}_T)\) as follows:

\[ \begin{align} \nonumber & R(\bar{D}_T)\overset{(12)}{=} 4 \sigma_\mathrm{w}^2 \log \left( \frac{5^{2 n_\mathrm{x}} \det (\bar{D}_T)}{\delta^2 \det(\lambda I_{n_\theta})} \right) \end{align} \]

(23)

\[ \begin{align} = & 4 \sigma_\mathrm{w}^2 \left( \log \left( \frac{5^{2 n_\mathrm{x}}}{\delta^{2}}\right) - \log (\det(\lambda I_{n_\theta})) + \log(\det(\bar{D}_T)) \right) \\ \leq & \underbrace{4 \sigma_\mathrm{w}^2 \left( \log \left( \frac{5^{2 n_\mathrm{x}}}{\delta^{2}}\right) -n_\theta\log (\lambda) \right)}_{C_1} + 4 \sigma_\mathrm{w}^2 \log(\det(\bar{D}_T)). \\\end{align} \]

Finally, by Young’s inequality [21], we have that

\[ \begin{align*} \nonumber \left(R(\bar{D}_T)^\frac{1}{2} + \lambda^\frac{1}{2} \bar{\theta} \right)^2 \leq & \left( 2 R(\bar{D}_T) + 2 \lambda \bar{\theta}^2 \right) \\ \overset{(23)}{\leq} & \left( 2C_1 +8 \sigma_\mathrm{w}^2 \log(\det(\bar{D}_T)) + 2 \lambda \bar{\theta}^2 \right) \end{align*} \]

which yields Inequality (21).
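The bound (21) can be checked numerically: since \( R(\bar{D}_T)=C_1+4\sigma_\mathrm{w}^2\log\det(\bar{D}_T)\) by (23), Young's inequality yields the right-hand side of (21). A minimal Python sketch with illustrative values:

```python
import numpy as np

# Check the convex upper bound (21) for a positive definite Dbar_T with
# R(Dbar_T) >= 0. All numerical values are illustrative.
rng = np.random.default_rng(3)
sigma_w, lam, delta, theta_bar = 0.1, 1.0, 0.05, 1.5
nx, n_theta = 2, 6

M = rng.standard_normal((n_theta, n_theta))
Dbar = M @ M.T + 10.0 * np.eye(n_theta)       # some PD Dbar_T

_, logdet = np.linalg.slogdet(Dbar)
C1 = 4 * sigma_w**2 * (np.log(5**(2 * nx) / delta**2) - n_theta * np.log(lam))
R = C1 + 4 * sigma_w**2 * logdet              # equality case of (23)
lhs = (np.sqrt(R) + np.sqrt(lam) * theta_bar) ** 2
rhs = 2 * C1 + 8 * sigma_w**2 * logdet + 2 * lam * theta_bar**2
```

The left-hand side never exceeds the right-hand side, and the gap is exactly the slack of Young's inequality \( (a+b)^2\leq 2a^2+2b^2\) .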

Proposition 1

Let Assumptions 1 and 3 hold. Suppose \( \Phi\) satisfies

\[ \begin{align} \nonumber \Phi \Phi^\top + \lambda I_{n_\phi}& \end{align} \]

(24)

\[ \begin{align} -\left( 2C_1 + 8 \sigma_\mathrm{w}^2 \log(\det(\bar{D}_T)) + 2 \lambda \bar{\theta}^2 \right) D_\mathrm{des} &\succeq 0. \\\end{align} \]

(25)

Then, the estimate \( \hat{\theta}_{T}\) computed as in (7) satisfies the exploration goal (4) with probability at least \( 1-\delta\) .

Proof

Starting from (24), we have

\[ \begin{align} 0 \preceq & \Phi \Phi^\top + \lambda I_{n_\phi}\\ \nonumber & -\left( 2C_1 + 8 \sigma_\mathrm{w}^2 \log(\det(\bar{D}_T)) + 2 \lambda \bar{\theta}^2 \right) D_\mathrm{des}\\ \nonumber \overset{(21)}{\preceq}& \Phi \Phi^\top + \lambda I_{n_\phi} - \left(R(\bar{D}_T)^\frac{1}{2} + \lambda^\frac{1}{2} \bar{\theta} \right)^2 D_\mathrm{des}, \\\end{align} \]

which yields condition (16) in Theorem 2. Hence, the exploration goal (4) is achieved with probability at least \( 1-\delta\) .

Note that Inequality (24) depends quadratically on \( \Phi\) , specifically on the finite excitation term \( \Phi \Phi^\top\) [18]. This further depends on the amplitudes of the inputs \( U_\mathrm{e}\) and the disturbances. Additionally, the linear mapping from the input sequence to the state sequence is uncertain due to the uncertainty in the true dynamics. These issues are addressed in the next section.

5 Sufficient conditions for targeted exploration based on spectral lines

In this section, we determine sufficient conditions for exploration in terms of the spectral content of \( \Phi\) in order to pose the exploration problem directly in terms of the inputs. To this end, we first define the amplitude of a spectral line.

Definition 2

(Amplitude of a spectral line [18]) Given a sequence \( \{\phi_k\}_{k=0}^{T-1}\) , the amplitude \( \bar{\phi}(\omega_0)\) of the spectral line of the sequence at a frequency \( \omega_0 \in \Omega_T:=\{0,1/T,\dots,(T-1)/T\}\) is given by

\[ \begin{align} \bar{\phi}(\omega_0):=\frac{1}{T}\sum_{k=0}^{T-1}\phi_k e^{-j2\pi \omega_0 k}. \end{align} \]

(26)
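Since \( \Omega_T\) is the DFT frequency grid, the spectral line (26) at \( \omega_0=m/T\) is simply the \( m\) -th DFT coefficient divided by \( T\) . A minimal sketch with a scalar sequence and illustrative values:

```python
import numpy as np

# Spectral line (26) via the DFT: bar{phi}(m/T) = FFT(phi)[m] / T.
# A complex exponential at bin m has spectral-line amplitude exactly 1
# there and 0 at every other frequency in Omega_T.
T, m = 64, 5
k = np.arange(T)
phi = np.exp(1j * 2 * np.pi * m * k / T)   # sequence with a single spectral line

spectral_lines = np.fft.fft(phi) / T       # bar{phi}(m'/T) for all m' in Omega_T
```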

In what follows, we first establish the exploration strategy in Section 5.1 and then provide a bound on the excitation \( \Phi \Phi^\top\) in terms of spectral information in Section 5.2. We then explicitly account for transient error due to the input and the effect of disturbances in Section 5.3. In order to derive conditions linear in the decision variables, we provide a convex relaxation procedure in Section 5.4 and an upper bound on excitation in Section 5.5. Finally, we account for parametric uncertainty and obtain an SDP for exploration which provides us with exploration inputs that ensure the exploration goal in Section 5.6.

5.1 Exploration strategy

The exploration input sequence takes the form

\[ \begin{align} u_k=\sum_{i=1}^{L} \bar{u}(\omega_i) \cos(2 \pi \omega_i k), ~\, k=0,\dots,T-1 \end{align} \]

(27)

where \( \bar{u}(\omega_i) \in \mathbb{R}^{n_\mathrm{u}}\) are the amplitudes of the sinusoidal inputs at \( L \in \mathbb{N}\) distinct selected frequencies \( \omega_i \in \Omega_T\) . The amplitude of the spectral line of the sequence \( \{u_k\}_{k=0}^{T-1}\) at frequency \( \omega_i\) is \( \bar{u}(\omega_i)\) . Denote \( U_\mathrm{e}=\mathrm{diag}(\bar{u}(\omega_1),\dots,\bar{u}(\omega_L)) \in \mathbb{R}^{Ln_\mathrm{u} \times L}\) . By the Parseval-Plancherel identity, bounding the average input energy by the constant \( \gamma^2_\mathrm{e}\) from (5) can be equivalently written as

\[ \begin{align} \frac{1}{T}\|U\|^2 =\sum_{i=1}^L \|\bar{u}(\omega_i)\|^2= \mathbf{1}_{L}^\top U_\mathrm{e}^\top U_\mathrm{e} \mathbf{1}_{L} \leq \gamma_\mathrm{e}^2 \end{align} \]

(28)

where \( \mathbf{1}_{L} \in \mathbb{R}^{L\times 1}\) is a vector of ones, and the bound \( \gamma_\mathrm{e} \geq 0\) is desired to be small. Using the Schur complement, this criterion is equivalent to the following LMI:

\[ \begin{equation} S_{\textnormal{energy-bound-1}}(\gamma_\mathrm{e},U_\mathrm{e})\mathrm{:=} \begin{bmatrix} \gamma_\mathrm{e} & \mathbf{1}_{L}^\top U_\mathrm{e}^\top \\ U_\mathrm{e} \mathbf{1}_{L} & \gamma_\mathrm{e} I \end{bmatrix} \succeq 0. \end{equation} \]

(29)
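The Schur-complement step behind (29) can be verified numerically: for \( \gamma_\mathrm{e}>0\) , the LMI holds if and only if \( \|U_\mathrm{e}\mathbf{1}_{L}\|\leq\gamma_\mathrm{e}\) . A Python sketch with illustrative amplitudes:

```python
import numpy as np

# Build S_energy-bound-1 from (29) and check positive semidefiniteness.
L, nu = 3, 2
U_e = np.zeros((L * nu, L))
amps = [np.array([0.3, -0.2]), np.array([0.1, 0.4]), np.array([-0.5, 0.2])]
for i, a in enumerate(amps):
    U_e[i * nu:(i + 1) * nu, i] = a           # block-diagonal amplitude matrix

v = U_e @ np.ones(L)                          # U_e 1_L
gamma_e = 1.1 * np.linalg.norm(v)             # energy bound satisfied with margin

S = np.block([[np.array([[gamma_e]]), v[None, :]],
              [v[:, None], gamma_e * np.eye(L * nu)]])
min_eig = np.linalg.eigvalsh(S).min()         # >= 0 iff ||v|| <= gamma_e
```

The eigenvalues of \( S\) are \( \gamma_\mathrm{e}\pm\|U_\mathrm{e}\mathbf{1}_L\|\) and \( \gamma_\mathrm{e}\) , so the smallest one equals the slack in the norm bound.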

5.2 Bound on excitation based on spectral lines

The uncertainty bound on the parameters of the system evolving under the exploration input sequence (27) can be computed from the spectral content of the observed state \( x_k\) and the applied input \( u_k\) . Given \( u_k\) as in (27), the sequences \( x_k\) and \( u_k\) for \( k=0,\dots,T-1\) have \( L\) spectral lines at the distinct frequencies \( \omega_i \in \Omega_T\) , \( i=1,\dots,L\) , with amplitudes \( \bar{x}(\omega_i)\) and \( \bar{u}(\omega_i)\) , respectively. Recall \( \phi_k=[x_k^\top\,u_k^\top]^\top\) . From (1), we have

\[ \begin{align} \bar{x}(\omega_i) & = \underbrace{\left[ (e^{j\omega_i}I-A_\mathrm{tr})^{-1}B_\mathrm{tr}\right] \bar{u}(\omega_i)}_{\bar{x}_\mathrm{u,d}(\omega_i)} + \bar{x}_\mathrm{u,t}(\omega_i) + \bar{x}_\mathrm{w}(\omega_i), \end{align} \]

(30)

where \( \bar{x}_{\mathrm{u,d}}(\omega_i)\) denotes the asymptotic effect of the input, \( \bar{x}_{\mathrm{u,t}}(\omega_i)\) denotes the transient error and \( \bar{x}_{\mathrm{w}}(\omega_i)\) denotes the effect of the disturbance. Furthermore, we have

\[ \begin{align} \nonumber \bar{x}_\mathrm{u}(\omega_i) =& \bar{x}_\mathrm{u,d}(\omega_i)+\bar{x}_\mathrm{u,t}(\omega_i)\\ \nonumber =&\frac{1}{T}\sum_{k=0}^{T-1} \left( \sum_{l=0}^k A_\mathrm{tr}^{k-l}B_\mathrm{tr} u_l \right) e^{-j2 \pi \omega_i k}\\ \nonumber =& \underbrace{\frac{1}{T}\left(\begin{bmatrix} e^{-j2 \pi \omega_i 0} & \cdots & e^{-j2 \pi \omega_i (T-1)} \end{bmatrix} \otimes I_{n_\mathrm{x}} \right)}_{=:F_{x,\omega_i}}\\ &\times \underbrace{\begin{bmatrix} B_\mathrm{tr} & 0 & \cdots & 0\\ A_\mathrm{tr}B_\mathrm{tr} & B_\mathrm{tr} & \ddots & \vdots \\ \vdots & \ddots & \ddots & 0 \\ A_\mathrm{tr}^{T-1}B_\mathrm{tr} & \cdots & A_\mathrm{tr}B_\mathrm{tr} & B_\mathrm{tr} \end{bmatrix}}_{A_\mathrm{u}} U, \end{align} \]

(31)

\[ \begin{align} \nonumber \bar{x}_\mathrm{w}(\omega_i)=&\frac{1}{T}\sum_{k=0}^{T-1} \left( \sum_{l=0}^k A_\mathrm{tr}^{k-l} w_l \right) e^{-j2 \pi \omega_i k}\\ =&F_{x,\omega_i} \underbrace{\begin{bmatrix} I & 0 & \cdots & 0\\ A_\mathrm{tr} & I & \ddots & \vdots \\ \vdots & \ddots & \ddots & 0 \\ A_\mathrm{tr}^{T-1} & \cdots & A_\mathrm{tr} & I \end{bmatrix}}_{A_\mathrm{w}} W. \end{align} \]

(32)

Further, \( \bar{u}(\omega_i)\) can be written as

\[ \begin{align} \bar{u}(\omega_i)=\underbrace{\frac{1}{T}\left(\begin{bmatrix} e^{-j2 \pi \omega_i 0} & \cdots & e^{-j2 \pi \omega_i (T-1)} \end{bmatrix} \otimes I_{n_\mathrm{u}} \right)}_{=:F_{u,\omega_i}}U. \\\end{align} \]

(33)

Note that for all \( \omega_i \in \Omega_T\) , we have

\[ \begin{align} \|F_{x,\omega_i}\|^2=\|F_{u,\omega_i}\|^2 = \frac{1}{T}. \end{align} \]

(34)

Using (30), we have

\[ \begin{align} \bar{\phi}(\omega_i)&=\underbrace{\left[\bar{x}_\mathrm{u,d}(\omega_i) \atop \bar{u}(\omega_i)\right]}_{\bar{\phi}_\mathrm{u,d}(\omega_i)}+\underbrace{\left[\bar{x}_\mathrm{u,t}(\omega_i) \atop 0\right]}_{\bar{\phi}_\mathrm{u,t}(\omega_i)}+\underbrace{\left[\bar{x}_\mathrm{w}(\omega_i) \atop 0\right]}_{\bar{\phi}_\mathrm{w}(\omega_i)}, \\\end{align} \]

(35)

where

\[ \begin{align} \bar{\phi}_\mathrm{u,d}(\omega_i)=\underbrace{\begin{bmatrix} (e^{j \omega_i}I - A_\mathrm{tr})^{-1}B_\mathrm{tr} \\ I_{n_\mathrm{u}} \end{bmatrix}}_{=:V_i} \bar{u}(\omega_i). \\\end{align} \]

(36)

We compactly define

\[ \begin{align} \bar{\Phi}_\ast& =[\bar{\phi}_\ast (\omega_1), \dots ,\bar{\phi}_\ast (\omega_L)] \in \mathbb{C}^{n_\phi \times L}, \\\end{align} \]

(37)

where \( \ast \in \{\mathrm{w};\mathrm{u};\mathrm{u,d};\mathrm{u,t} \}\) , which satisfies

\[ \begin{align} \nonumber \bar{\Phi} & = \underbrace{V_\mathrm{tr}U_\mathrm{e} + \bar{\Phi}_\mathrm{u,t}}_{=:\bar{\Phi}_\mathrm{u}} + \bar{\Phi}_\mathrm{w}, \end{align} \]

(38)

\[ \begin{align} V_{\mathrm{tr}}&:=[V_{1},\cdots,V_{L}] \in \mathbb{C}^{n_\mathrm{\phi}\times n_\mathrm{u}L}. \\\end{align} \]

(39)

The following lemma provides a lower bound on \( \Phi \Phi^\top\) in terms of the spectral lines of \( \phi_k\) .

Lemma 3

For any \( \epsilon \in (0,1)\) , \( \phi_k\) satisfies

\[ \begin{align} \frac{1}{T}\Phi \Phi^\top\succeq & (1-\epsilon)V_\mathrm{tr}U_\mathrm{e}U_\mathrm{e}^\top V_\mathrm{tr}^\mathsf{H}-\tfrac{2(1-\epsilon)}{\epsilon}(\bar{\Phi}_\mathrm{u,t}\bar{\Phi}_\mathrm{u,t}^\mathsf{H} +\bar{\Phi}_\mathrm{w}\bar{\Phi}_\mathrm{w}^\mathsf{H}). \end{align} \]

(40)

Proof

From the Parseval-Plancherel identity, we have

\[ \begin{align*} \nonumber \frac{1}{T}\Phi \Phi^\top &\overset{(26)}{=} \sum_{i=1}^{T} \bar{\phi}(\omega_i) \bar{\phi}(\omega_i)^\mathsf{H}\succeq \sum_{i=1}^{L} \bar{\phi}(\omega_i) \bar{\phi}(\omega_i)^\mathsf{H}\\ \nonumber & \overset{(38)}{=}(V_\mathrm{tr}U_\mathrm{e}+\bar{\Phi}_\mathrm{u,t}+\bar{\Phi}_\mathrm{w}) (V_\mathrm{tr}U_\mathrm{e}+\bar{\Phi}_\mathrm{u,t}+\bar{\Phi}_\mathrm{w})^\mathsf{H}. \end{align*} \]

For any \( \epsilon >0\) , applying Young’s inequality [21] twice gives

\[ \begin{align*} \nonumber & (V_\mathrm{tr}U_\mathrm{e}+\bar{\Phi}_\mathrm{u,t}+\bar{\Phi}_\mathrm{w}) (V_\mathrm{tr}U_\mathrm{e}+\bar{\Phi}_\mathrm{u,t}+\bar{\Phi}_\mathrm{w})^\mathsf{H} \\ \nonumber \succeq & (1-\epsilon)V_\mathrm{tr}U_\mathrm{e}U_\mathrm{e}^\top V_\mathrm{tr}^\mathsf{H}-\left(\tfrac{1-\epsilon}{\epsilon}\right)(\bar{\Phi}_\mathrm{u,t}+\bar{\Phi}_\mathrm{w}) (\bar{\Phi}_\mathrm{u,t}+\bar{\Phi}_\mathrm{w})^\mathsf{H} \\ \succeq & (1-\epsilon)V_\mathrm{tr}U_\mathrm{e}U_\mathrm{e}^\top V_\mathrm{tr}^\mathsf{H}-2\left(\tfrac{1-\epsilon}{\epsilon}\right)(\bar{\Phi}_\mathrm{u,t}\bar{\Phi}_\mathrm{u,t}^\mathsf{H} +\bar{\Phi}_\mathrm{w}\bar{\Phi}_\mathrm{w}^\mathsf{H}), \end{align*} \]

which yields Inequality (40).
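The matrix Young-type step in this proof, \( (X+Y)(X+Y)^\mathsf{H} \succeq (1-\epsilon)XX^\mathsf{H}-\tfrac{1-\epsilon}{\epsilon}YY^\mathsf{H}\) for \( \epsilon\in(0,1)\) , can be checked numerically; the complex matrices below are random illustrative data:

```python
import numpy as np

# Numerical check of the matrix Young-type inequality used in Lemma 3.
rng = np.random.default_rng(4)
n, L, eps = 4, 3, 0.3
X = rng.standard_normal((n, L)) + 1j * rng.standard_normal((n, L))
Y = rng.standard_normal((n, L)) + 1j * rng.standard_normal((n, L))

lhs = (X + Y) @ (X + Y).conj().T
rhs = (1 - eps) * X @ X.conj().T - ((1 - eps) / eps) * Y @ Y.conj().T
min_eig = np.linalg.eigvalsh(lhs - rhs).min()   # >= 0 up to round-off
```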

The result in Lemma 3 depends on the transient error due to the input and the effect of the disturbance, which is unknown. Therefore, in what follows, we derive suitable bounds.

5.3 Bounds on transient error and the effect of disturbance

Due to bounded inputs and strict stability, the transient error \( \bar{x}_\mathrm{u,t}(\omega_i)\) decays uniformly to zero as \( T \to \infty\) . Let us denote the impulse response of the system with respect to the input \( u_k\) as \( G(i)=A_\mathrm{tr}^{i-1}B_\mathrm{tr}\) . The following lemma provides a deterministic bound on the transient error for finite \( T\) .

Lemma 4

[1, Theorem 2.1] Let Assumption 2 hold. We have

\[ \begin{align} \sup_{\omega_i \in \Omega_T}\|\bar{x}_{\mathrm{u,t}}(\omega_i)\| \overset{(28)}{\leq} \frac{2 \gamma_\mathrm{e} G_\mathrm{tr}}{\sqrt{T}}\end{align} \]

(41)

where \( G_\mathrm{tr} = \sum_{i=1}^\infty i \|G(i)\| < \infty\) .

Since \( A_\mathrm{tr}\) is Schur stable (cf. Assumption 2), there exist constants \( C \geq 1\) and \( \rho \in (0,1)\) such that \( \|A_\mathrm{tr}^k\| \leq C \rho^k\) , \( \forall k \in \mathbb{N}\) . From the convergence of arithmetico-geometric series, we have

\[ \begin{align}G_\mathrm{tr}=\sum_{k=1}^\infty k \|A_\mathrm{tr}^k B_\mathrm{tr}\| \leq \|B_\mathrm{tr}\| \sum_{k=1}^\infty k C\rho^k = \dfrac{\|B_\mathrm{tr}\| C\rho}{(1-\rho)^2}. \end{align} \]

(42)
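The closed-form bound (42) is easy to check numerically. For a symmetric Schur-stable \( A_\mathrm{tr}\) , \( \|A_\mathrm{tr}^k\|=\rho^k\) holds with \( C=1\) , where \( \rho\) is the spectral radius; the matrices below are illustrative:

```python
import numpy as np

# Compare the truncated sum defining G_tr in (42) with its closed form.
A_tr = np.diag([0.8, 0.6])
B_tr = np.array([[0.0], [1.0]])
C, rho = 1.0, 0.8                 # ||A_tr^k|| <= C * rho^k with C = 1 here

# Truncated sum sum_{k>=1} k ||A_tr^k B_tr|| (the tail is negligible here)
G_tr = sum(k * np.linalg.norm(np.linalg.matrix_power(A_tr, k) @ B_tr, 2)
           for k in range(1, 500))
bound = np.linalg.norm(B_tr, 2) * C * rho / (1 - rho) ** 2
```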

Using (30) and (32), we can write \( \bar{\Phi}_\mathrm{w}\) as

\[ \begin{align} \bar{\Phi}_\mathrm{w}=\begin{bmatrix} F_{x,\omega_1}A_\mathrm{w}W&\cdots &F_{x,\omega_L}A_\mathrm{w}W\\ 0 & \cdots & 0 \end{bmatrix}. \end{align} \]

(43)

Given that \( W \sim \mathsf{subG}_{Tn_\mathrm{x}}(\sigma_\mathrm{w}^2I_{Tn_\mathrm{x}})\) with zero mean, [19, Corollary 1] implies

\[ \begin{align} \mathbb{P} \bigg[ \|W\|^2 \leq \underbrace{\sigma_\mathrm{w}^2 \left((1 + \log 4)T n_\mathrm{x} + 4 \log \delta^{-1}\right)}_{=:\gamma_\mathrm{w}}\bigg] \geq 1-\delta. \end{align} \]

(44)
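The high-probability bound (44) can be sanity-checked by Monte Carlo for Gaussian disturbances, which are sub-Gaussian with matching proxy variance; the dimensions and \( \sigma_\mathrm{w}\) below are illustrative:

```python
import numpy as np

# Empirical check of P[||W||^2 <= gamma_w] >= 1 - delta for Gaussian W.
rng = np.random.default_rng(5)
T, nx, sigma_w, delta = 50, 2, 0.1, 0.05
gamma_w = sigma_w**2 * ((1 + np.log(4)) * T * nx + 4 * np.log(1 / delta))

n_trials = 2000
W = sigma_w * rng.standard_normal((n_trials, T * nx))
frac = np.mean(np.sum(W**2, axis=1) <= gamma_w)   # empirical coverage
```

For Gaussian disturbances the bound is quite conservative: \( \mathbb{E}\|W\|^2=\sigma_\mathrm{w}^2Tn_\mathrm{x}\) , while \( \gamma_\mathrm{w}\) exceeds this mean by more than a factor of two.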

Lemma 5

The matrices \( \bar{\Phi}_\mathrm{u,t}\) and \( \bar{\Phi}_\mathrm{w}\) satisfy

\[ \begin{align} \bar{\Phi}_\mathrm{u,t} \bar{\Phi}_\mathrm{u,t}^\mathsf{H} \preceq\frac{ 4 L \gamma_\mathrm{e}^2}{T} \left(\dfrac{\|B_\mathrm{tr}\| C\rho}{(1-\rho)^2}\right)^2I_{n_\phi} = : \Gamma_\mathrm{u}(\gamma_\mathrm{e}^2), \end{align} \]

(45)

\[ \begin{align} \mathbb{P} \left[\bar{\Phi}_\mathrm{w}\bar{\Phi}_\mathrm{w}^\mathsf{H} \preceq \left(\tfrac{L}{T}\|A_\mathrm{w}\|^2 \gamma_\mathrm{w}\right) I_{n_\phi}=:\Gamma_\mathrm{w}\right] \geq 1-\delta. \end{align} \]

(46)

Proof

Using (41) from Lemma 4 and (42), we have

\[ \begin{align*} \nonumber \bar{\Phi}_\mathrm{u,t}\bar{\Phi}_\mathrm{u,t}^\mathsf{H} & \preceq \left(\sum_{i=1}^L \max_{\omega_i \in \Omega_T}\|\bar{x}_{\mathrm{u,t}}(\omega_i)\|^2 \right)I_{n_\phi}\overset{(41)}{\preceq} \frac{ 4 L \gamma_\mathrm{e}^2 G_\mathrm{tr}^2}{T} I_{n_\phi}\\ & \overset{(42)}{\preceq} \frac{ 4 L \gamma_\mathrm{e}^2}{T} \left(\dfrac{\|B_\mathrm{tr}\| C\rho}{(1-\rho)^2}\right)^2I_{n_\phi}, \end{align*} \]

which yields (45). By using the sub-multiplicativity property of the operator norm, we have

\[ \begin{align} \bar{\Phi}_\mathrm{w}\bar{\Phi}_\mathrm{w}^\mathsf{H} \overset{(43)}{\preceq} & \left(\max_{\omega_i \in \Omega_T} \sqrt{L} \|F_{x,\omega_i} A_\mathrm{w} W\|\right)^2 I_{n_\phi} \\ \nonumber \overset{(34)}{\preceq} & \left( \tfrac{L}{T} \|A_\mathrm{w}\|^2 \|W\|^2\right) I_{n_\phi} \overset{ (44)}{\preceq} \left(\tfrac{L}{T} \|A_\mathrm{w}\|^2 \gamma_\mathrm{w}\right) I_{n_\phi}, \\\end{align} \]

with probability at least \( 1-\delta\) , which yields (46).
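The key matrix step in the proof of Lemma 5 — bounding a Gram matrix in the semidefinite order by the sum of squared column norms — can be checked numerically. A minimal sketch with a random complex matrix (dimensions illustrative):

```python
import numpy as np

# Check of the matrix bound used in Lemma 5: for Phi = [x_1, ..., x_L]
# (complex columns),
#   Phi Phi^H = sum_i x_i x_i^H  <=  (sum_i ||x_i||^2) * I
# in the semidefinite order, since each x x^H <= ||x||^2 * I.
rng = np.random.default_rng(1)
n_phi, L = 6, 4                     # illustrative dimensions
Phi = rng.standard_normal((n_phi, L)) + 1j * rng.standard_normal((n_phi, L))

gram = Phi @ Phi.conj().T
bound = sum(np.linalg.norm(Phi[:, i])**2 for i in range(L)) * np.eye(n_phi)

# Smallest eigenvalue of (bound - gram) must be nonnegative (up to roundoff).
min_eig = np.linalg.eigvalsh(bound - gram).min()
print(min_eig >= -1e-10)
```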

5.4 Convex relaxation

The following proposition provides a sufficient condition linear in \( U_\mathrm{e}\) .

Proposition 2

Let Assumptions 1, 2 and 3 hold. Suppose matrices \( U_\mathrm{e}\) , \( \tilde{U}\) and \( \Phi\) satisfy

\[ \begin{align} \nonumber (1-\epsilon) V_\mathrm{tr}( U_\mathrm{e} \tilde{U}^\top + \tilde{U} U_\mathrm{e}^\top - \tilde{U} \tilde{U}^\top) V_\mathrm{tr}^\mathsf{H} & \end{align} \]

(47)

\[ \begin{align} - \tfrac{2(1-\epsilon)}{\epsilon} \Gamma_\mathrm{u}(\gamma_\mathrm{e}^2)+ \tfrac{\lambda}{T} I_{n_\phi} - \tfrac{2(1-\epsilon)}{\epsilon} \Gamma_\mathrm{w}& \\ -\tfrac{1}{T} \big( 2C_1 + 8 \sigma_\mathrm{w}^2 \log(\det(\bar{D}_T)) + 2 \lambda \bar{\theta}^2 \big) D_\mathrm{des} & \succeq 0. \\\end{align} \]

Then, an estimate \( \hat{\theta}_T\) computed as in (7) upon the application of the input (27) satisfies the exploration goal (4) with probability at least \( 1-2 \delta\) .

Proof

The proof is divided into two parts. In the first part, we show that condition (47) implies condition (24) in Proposition 1. In the second part, we establish the joint probability with which the bounds used in the first part hold.

Part I. From [8, Lemma 12], we have the following convex relaxation:

\[ \begin{align} V_\mathrm{tr} U_\mathrm{e} U_\mathrm{e}^\top {V_\mathrm{tr}}^\mathsf{H} \succeq V_\mathrm{tr}\left( U_\mathrm{e} \tilde{U}^\top + \tilde{U} U_\mathrm{e}^\top - \tilde{U} \tilde{U}^\top \right)V_\mathrm{tr}^\mathsf{H}. \end{align} \]

(48)

Starting from (47), we have

\[ \begin{align} & 0 \end{align} \]

(49)

\[ \begin{align} \preceq &(1-\epsilon) V_\mathrm{tr}\left( U_\mathrm{e} \tilde{U}^\top + \tilde{U} U_\mathrm{e}^\top - \tilde{U} \tilde{U}^\top \right) V_\mathrm{tr}^\mathsf{H}\\ \nonumber & - \tfrac{2(1-\epsilon)}{\epsilon} \Gamma_\mathrm{u}(\gamma_\mathrm{e}^2) - \tfrac{2(1-\epsilon)}{\epsilon} \Gamma_\mathrm{w}+ \tfrac{\lambda}{T} I_{n_\phi}\\ \nonumber & -\tfrac{1}{T} \big( 2C_1 + 8 \sigma_\mathrm{w}^2 \log(\det(\bar{D}_T)) + 2 \lambda \bar{\theta}^2 \big) D_\mathrm{des}\\ \nonumber \overset{(48)}{\preceq}& (1-\epsilon) V_\mathrm{tr} U_\mathrm{e} U_\mathrm{e}^\top V_\mathrm{tr}^\mathsf{H} - \tfrac{2(1-\epsilon)}{\epsilon} \Gamma_\mathrm{u}(\gamma_\mathrm{e}^2)- \tfrac{2(1-\epsilon)}{\epsilon} \Gamma_\mathrm{w}\\ \nonumber & + \tfrac{\lambda}{T} I_{n_\phi} -\tfrac{1}{T}\left( 2C_1 + 8 \sigma_\mathrm{w}^2 \log(\det(\bar{D}_T)) + 2 \lambda \bar{\theta}^2 \right) D_\mathrm{des}\\ \nonumber \overset{(45), \atop (46)}{\preceq} & (1-\epsilon) V_\mathrm{tr} U_\mathrm{e} U_\mathrm{e}^\top V_\mathrm{tr}^\mathsf{H} - \tfrac{2(1-\epsilon)}{\epsilon} \bar{\Phi}_\mathrm{u,t}\bar{\Phi}_\mathrm{u,t}^\mathsf{H}- \tfrac{2(1-\epsilon)}{\epsilon} \bar{\Phi}_\mathrm{w}\bar{\Phi}_\mathrm{w}^\mathsf{H} \\ \nonumber & + \tfrac{\lambda}{T} I_{n_\phi} -\tfrac{1}{T}\left( 2C_1 + 8 \sigma_\mathrm{w}^2 \log(\det(\bar{D}_T)) + 2 \lambda \bar{\theta}^2 \right) D_\mathrm{des}\\ \nonumber \overset{(40)}{\preceq}& \tfrac{1}{T} (\Phi \Phi^\top + \lambda I_{n_\phi} -\left( 2C_1 + 8 \sigma_\mathrm{w}^2 \log(\det(\bar{D}_T)) + 2 \lambda \bar{\theta}^2 \right) D_\mathrm{des}). \\\end{align} \]

Multiplying (49) by \( T\) yields (24) in Proposition 1.

Part II. We have

\[ \begin{align} \nonumber & \mathbb{P}\left[ \|\hat{\theta}_T - \theta_\mathrm{tr}\|_{(D_\mathrm{des}\otimes I_{n_\mathrm{x}})} \leq 1 \right] \end{align} \]

(50)

\[ \begin{align} \geq &\mathbb{P}\left[(\hat{\theta}_\mathrm{tr} \in \Theta_T)\cap (\|W\|^2\leq \gamma_\mathrm{w})\right]\\ \geq & 1 - \mathbb{P}\left[\hat{\theta}_\mathrm{tr} \notin \Theta_T\right] - \mathbb{P}\left[\|W\|^2\nleq \gamma_\mathrm{w}\right] \overset{(15), \atop (44)}{\geq} 1-2\delta, \\\end{align} \]

wherein the penultimate inequality follows from De Morgan’s law and the union bound.

In Part I, assuming \( \hat{\theta}_\mathrm{tr} \in \Theta_T\) (15) and \( \|W\|^2 \leq \gamma_\mathrm{w}\) (44), Inequality (47) implies (4). Since both \( \hat{\theta}_\mathrm{tr} \in \Theta_T\) and \( \|W\|^2 \leq \gamma_\mathrm{w}\) hold jointly with probability at least \( 1-2\delta\) as shown in Part II, Inequality (47) implies (4) with probability at least \( 1-2\delta\) .
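The convex relaxation (48) used in Part I can be verified numerically: the gap between the two sides equals \( V(U-\tilde{U})(U-\tilde{U})^\top V^\mathsf{H} \succeq 0\), for any linearization point \( \tilde{U}\). A minimal sketch with random matrices (dimensions illustrative):

```python
import numpy as np

# Check of the linearization (48): since (U - U_tilde)(U - U_tilde)^T >= 0,
#   U U^T >= U U_tilde^T + U_tilde U^T - U_tilde U_tilde^T,
# and congruence with any V preserves the semidefinite ordering.
rng = np.random.default_rng(2)
n, m = 5, 4
U = rng.standard_normal((n, m))
U_tilde = rng.standard_normal((n, m))     # arbitrary linearization point
V = rng.standard_normal((3, n))           # stand-in for V_tr

lhs = V @ (U @ U.T) @ V.T
rhs = V @ (U @ U_tilde.T + U_tilde @ U.T - U_tilde @ U_tilde.T) @ V.T
min_eig = np.linalg.eigvalsh(lhs - rhs).min()
print(min_eig >= -1e-10)
```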

Note that Inequality (47) in Proposition 2 involves the term \( \log(\det(\bar{D}_T))\), which depends on the exploration trajectory data \( \bar{D}_T\) (cf. (7)); an upper bound on this term is required to obtain an LMI for exploration.

5.5 Upper bound on excitation and \( \bar{D}_T\)

In the following lemma, we provide an upper bound on the term \( \log(\det(\bar{D}_T))\) .

Lemma 6

Let \( S_{\textnormal{energy-bound-1}}(\gamma_\mathrm{e},U_\mathrm{e}) \succeq 0\) in (29) and \( \|W\|^2 \leq \gamma_\mathrm{w}\) hold. Then, the matrix \( \bar{D}_T\) as in (7) satisfies

\[ \begin{align} \nonumber & \log(\det(\bar{D}_T)) \end{align} \]

(51)

\[ \begin{align} \leq &\, n_\theta \log (2T\|A_\mathrm{w}\|^2 \gamma_\mathrm{w} + 2T^2 (\|A_\mathrm{u}\|^2+1) \gamma_\mathrm{e}^2 + \lambda). \\\end{align} \]

(52)

Proof

With \( L=T\) , and using (31) and (32), we have

\[ \begin{align} \bar{\Phi}_\mathrm{u}&= \begin{bmatrix}F_{x,\omega_1}A_\mathrm{u}U&\cdots&F_{x,\omega_{T}}A_\mathrm{u}U \\ F_{u,\omega_1}U&\cdots&F_{u,\omega_{T}}U \end{bmatrix},\\ \nonumber \bar{\Phi}_\mathrm{w}&=\begin{bmatrix}F_{x,\omega_1}A_\mathrm{w}W & \cdots & F_{x,\omega_{T}}A_\mathrm{w}W \\ 0 & \cdots & 0\end{bmatrix}. \\\end{align} \]

Similarly to (46), using the sub-multiplicativity of the operator norm together with (28) and (34), we have

\[ \begin{align} \nonumber \bar{\Phi}_\mathrm{u} \bar{\Phi}_\mathrm{u}^\mathsf{H} &\preceq \max_{\omega_i \in \Omega_T} T\left( \|F_{x,\omega_i} A_\mathrm{u} U\|^2 + \|F_{u,\omega_i} U\|^2 \right) I_{n_\phi} \end{align} \]

(53)

\[ \begin{align} &\preceq T \left(\|A_\mathrm{u}\|^2+1 \right)\gamma_\mathrm{e}^2 I_{n_\phi}. \\\end{align} \]

(54)

By the Parseval–Plancherel identity, Young’s inequality [21], Inequalities (46), (53), and with \( L=T\) , we have

\[ \begin{align*} \nonumber \Phi \Phi^\top &= T \sum_{i=1}^{T}\bar{\phi}(\omega_i) \bar{\phi}(\omega_i)^\mathsf{H}= T \left(\bar{\Phi}_\mathrm{u}+\bar{\Phi}_\mathrm{w}\right) \left(\bar{\Phi}_\mathrm{u}+\bar{\Phi}_\mathrm{w}\right)^\mathsf{H}\\ \nonumber & \preceq 2T\left(\bar{\Phi}_\mathrm{u}\bar{\Phi}_\mathrm{u}^\mathsf{H} + \bar{\Phi}_\mathrm{w}\bar{\Phi}_\mathrm{w}^\mathsf{H}\right)\\ & \overset{(46), \atop (53)}{\preceq} 2T(\|A_\mathrm{w}\|^2 \gamma_\mathrm{w} + T (\|A_\mathrm{u}\|^2+1) \gamma_\mathrm{e}^2)I_{n_\phi}. \end{align*} \]

Hence, we have

\[ \begin{align*} \nonumber \bar{D}_T&\overset{(7)}{=}(\Phi \Phi^\top)\otimes I_{n_\mathrm{x}} + \lambda I_{n_\theta}\\ &\preceq \left( 2T\|A_\mathrm{w}\|^2 \gamma_\mathrm{w} + 2T^2 (\|A_\mathrm{u}\|^2+1) \gamma_\mathrm{e}^2 + \lambda \right) I_{n_\theta}, \end{align*} \]

which yields (51).
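The final log-det step — bounding \( \log\det\) of a matrix by its dimension times the logarithm of an eigenvalue bound — can be checked on a random instance. A minimal sketch (all values illustrative):

```python
import numpy as np

# Check of the log-det step in Lemma 6: if 0 <= M <= c*I (n x n), then
#   log det(M + lam*I) <= n * log(c + lam),
# since every eigenvalue of M + lam*I is at most c + lam.
rng = np.random.default_rng(3)
n, lam = 6, 1.0
X = rng.standard_normal((n, n))
M = X @ X.T                                # random PSD matrix
c = np.linalg.eigvalsh(M).max()            # so that M <= c*I holds

sign, logdet = np.linalg.slogdet(M + lam * np.eye(n))
bound = n * np.log(c + lam)
print(logdet <= bound + 1e-10)
```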

Recall \( C_1\) from (22). Henceforth, we define

\[ \begin{align} C_2:=&\left( 2C_1 + 2 \lambda \bar{\theta}^2 \right), \end{align} \]

(55)

\[ \begin{align} C_3(\gamma_\mathrm{e}^2):=&8 \sigma_\mathrm{w}^2 n_\theta\log (2T\|A_\mathrm{w}\|^2 \gamma_\mathrm{w} + 2T^2 (\|A_\mathrm{u}\|^2+1) \gamma_\mathrm{e}^2 + \lambda). \\\end{align} \]

Proposition 3

Let Assumptions 1, 2 and 3 hold. Suppose matrices \( U_\mathrm{e}\) , \( \tilde{U}\) satisfy

\[ \begin{align} (1-\epsilon) V_\mathrm{tr}\left( U_\mathrm{e} \tilde{U}^\top + \tilde{U} U_\mathrm{e}^\top - \tilde{U} \tilde{U}^\top \right) V_\mathrm{tr}^\mathsf{H}+ \tfrac{\lambda}{T} I_{n_\phi} & \end{align} \]

(56)

\[ \begin{align} - \tfrac{2(1-\epsilon)}{\epsilon} \Gamma_\mathrm{u}(\gamma_\mathrm{e}^2) - \tfrac{2(1-\epsilon)}{\epsilon} \Gamma_\mathrm{w} -\tfrac{1}{T}\left(C_2 + C_3 (\gamma_\mathrm{e}^2) \right) D_\mathrm{des}&\succeq 0. \\\end{align} \]

Then, an estimate \( \hat{\theta}_T\) computed as in (7) upon the application of the input (27) satisfies the exploration goal (4) with probability at least \( 1-2 \delta\) .

Proof

Using \( C_2\) and \( C_3\) (55), and inserting Inequality (51) in (56) yields condition (47) in Proposition 2.

The result in Proposition 3 depends on the transfer matrix \( V_\mathrm{tr}\) , which depends on the true dynamics \( A_\mathrm{tr},\,B_\mathrm{tr}\) and is hence uncertain. Therefore, in what follows, we derive suitable bounds.

5.6 Bounds on transfer matrices and Exploration SDP

Denote

\[ \begin{align} \tilde{V}&=V_\mathrm{tr}-\hat{V}, \end{align} \]

(57)

\[ \begin{align} \hat{V}&=[\hat{V}_{1},\cdots,\hat{V}_{L}] \in \mathbb{C}^{n_\mathrm{\phi}\times Ln_\mathrm{u}}, \end{align} \]

(58)

where \( \hat{V}\) is computed using the initial estimates \( \hat{\theta}_0=\mathrm{vec}([\hat{A}_0,\hat{B}_0])\) (cf. Assumption 3). We compute a matrix \( \tilde{\Gamma}_\mathrm{V} \succeq 0\) using \( \theta_\mathrm{tr} \in \mathbf{\Theta}_0\) such that

\[ \begin{align} \tilde{V} \tilde{V}^\mathsf{H} \preceq \tilde{\Gamma}_\mathrm{V}. \end{align} \]

(59)

Since Inequality (56) in Proposition 3 is nonlinear in the decision variable \( \gamma_\mathrm{e}\) , we impose an upper bound of the form \( \gamma_\mathrm{e}^2 \leq \bar{\gamma}\) , where \( \bar{\gamma}\) is fixed. Using the Schur complement, this criterion is equivalent to

\[ \begin{align} S_\textnormal{energy-bound-2}(\gamma_\mathrm{e},\bar{\gamma})=\begin{bmatrix} \bar{\gamma} & \gamma_\mathrm{e}\\ \gamma_\mathrm{e} & 1 \end{bmatrix} \succeq 0. \end{align} \]

(60)
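The Schur-complement equivalence behind (60) — the \( 2\times 2\) matrix is positive semidefinite if and only if \( \gamma_\mathrm{e}^2 \leq \bar{\gamma}\) — can be verified directly on small instances (values illustrative):

```python
import numpy as np

# Check of the Schur-complement reformulation (60): for fixed gamma_bar,
#   [[gamma_bar, gamma_e], [gamma_e, 1]] >= 0   iff   gamma_e^2 <= gamma_bar.
gamma_bar = 4.0
checks = []
for gamma_e in [0.0, 1.0, 2.0, 2.5]:
    S = np.array([[gamma_bar, gamma_e], [gamma_e, 1.0]])
    psd = np.linalg.eigvalsh(S).min() >= -1e-12
    checks.append(psd == (gamma_e**2 <= gamma_bar))
print(all(checks))
```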

Furthermore, with \( \gamma_\mathrm{e}^2\) replaced by its upper bound \( \bar{\gamma}\) , condition (56) can be written as the quadratic matrix inequality

\[ \begin{align} \begin{bmatrix} V_\mathrm{tr}^\mathsf{H} \\ I \end{bmatrix}^\mathsf{H} \begin{bmatrix} (1-\epsilon) \left( U_\mathrm{e} \tilde{U}^\top + \tilde{U} U_\mathrm{e}^\top - \tilde{U} \tilde{U}^\top \right) & 0 \\ 0 & - \tfrac{2(1-\epsilon)}{\epsilon} \Gamma_\mathrm{u}(\bar{\gamma}) - \tfrac{2(1-\epsilon)}{\epsilon} \Gamma_\mathrm{w}+ \tfrac{\lambda}{T} I_{n_\phi} -\tfrac{1}{T}\left( C_2 + C_3(\bar{\gamma}) \right) D_\mathrm{des} \end{bmatrix} \begin{bmatrix} V_\mathrm{tr}^\mathsf{H} \\ I \end{bmatrix} \succeq 0 \end{align} \]

(61)

To ensure (61) for all \( V_\mathrm{tr}\) consistent with the uncertainty bound (59), we define, for a multiplier \( \tau \geq 0\) ,

\[ \begin{align} \nonumber &S_\textnormal{exp}(\epsilon, \lambda, \tau, U_\mathrm{e}, \tilde{U}, \gamma_\mathrm{e}, \bar{\gamma}, \Gamma_\mathrm{u}, \Gamma_\mathrm{w}, \tilde{\Gamma}_\mathrm{v}, \hat{V}, D_\mathrm{des}):= \end{align} \]

(62)

\[ \begin{align} &\begin{bmatrix} (1-\epsilon) \left( U_\mathrm{e} \tilde{U}^\top + \tilde{U} U_\mathrm{e}^\top - \tilde{U} \tilde{U}^\top \right) & 0 \\ 0 & - \tfrac{2(1-\epsilon)}{\epsilon} \Gamma_\mathrm{u}(\bar{\gamma}) - \tfrac{2(1-\epsilon)}{\epsilon} \Gamma_\mathrm{w}+ \tfrac{\lambda}{T} I_{n_\phi} -\tfrac{1}{T}\left( C_2 + C_3(\bar{\gamma}) \right) D_\mathrm{des} \end{bmatrix} - \tau \begin{bmatrix} -I & \hat{V}^\mathsf{H}\\ \hat{V} & \tilde{\Gamma}_\mathrm{v} -\hat{V}\hat{V}^\mathsf{H} \end{bmatrix} \succeq 0 \\\end{align} \]

(63)

The following theorem provides a sufficient condition linear in \( U_\mathrm{e}\) , which ensures the exploration goal (4).

Theorem 3

Let Assumptions 1, 2 and 3 hold. Suppose the following SDP is feasible:

\[ \begin{align} \underset{U_\mathrm{e},\gamma_\mathrm{e}, \atop {\tau \geq 0}}{\inf} & \; \gamma_\mathrm{e} \end{align} \]

(64)

\[ \begin{align} \mathrm{s.t. }& \; S_{\textnormal{energy-bound-1}}(\gamma_\mathrm{e},U_\mathrm{e})\succeq 0\\ \nonumber & \; S_\textnormal{energy-bound-2}(\gamma_\mathrm{e},\bar{\gamma}) \succeq 0\\ & \; S_\textnormal{exp}(\epsilon, \lambda, \tau, U_\mathrm{e}, \tilde{U}, \gamma_\mathrm{e}, \bar{\gamma}, \Gamma_\mathrm{u}, \Gamma_\mathrm{w}, \tilde{\Gamma}_\mathrm{v}, \hat{V}, D_\mathrm{des}) \succeq 0\nonumber \\\end{align} \]

where \( S_{\textnormal{energy-bound-1}}\) , \( S_\textnormal{energy-bound-2}\) and \( S_\textnormal{exp}\) are defined in (29), (60) and (62), respectively. Then, an estimate \( \hat{\theta}_T\) computed as in (7) upon the application of the input (27) satisfies the exploration goal (4) with probability at least \( 1-2\delta\) .

Proof

By using (57), Inequality (59) can be written as

\[ \begin{align} \begin{bmatrix} V_\mathrm{tr}^\mathsf{H} \\ I \end{bmatrix}^\mathsf{H} \begin{bmatrix} -I & \hat{V}^\mathsf{H}\\ \hat{V} & \tilde{\Gamma}_\mathrm{v} -\hat{V}\hat{V}^\mathsf{H} \end{bmatrix} \begin{bmatrix} V_\mathrm{tr}^\mathsf{H} \\ I \end{bmatrix} \succeq 0. \end{align} \]

(65)

By using the matrix S-lemma [22], Inequality (61) holds for all \( V_\mathrm{tr}\) satisfying (65) if \( S_\textnormal{exp}(\epsilon, \lambda, \tau, U_\mathrm{e}, \tilde{U}, \gamma_\mathrm{e}, \bar{\gamma}, \Gamma_\mathrm{u}, \Gamma_\mathrm{w}, \tilde{\Gamma}_\mathrm{v}, \hat{V}, D_\mathrm{des}) \succeq 0\) (62) holds with \( \tau \geq 0\) . Inequality (61) can be written as

\[ \begin{align} (1-\epsilon) V_\mathrm{tr}\left( U_\mathrm{e} \tilde{U}^\top + \tilde{U} U_\mathrm{e}^\top - \tilde{U} \tilde{U}^\top \right) V_\mathrm{tr}^\mathsf{H}+ \tfrac{\lambda}{T} I_{n_\phi} &\\ - \tfrac{2(1-\epsilon)}{\epsilon} \Gamma_\mathrm{u}(\bar{\gamma}) - \tfrac{2(1-\epsilon)}{\epsilon} \Gamma_\mathrm{w} -\tfrac{1}{T}\left(C_2 + C_3 (\bar{\gamma}) \right) D_\mathrm{des}&\succeq 0. \\\end{align} \]

Inserting the inequality \( \gamma_\mathrm{e}^2 \leq \bar{\gamma}\) , i.e., \( S_\textnormal{energy-bound-2}(\gamma_\mathrm{e},\bar{\gamma}) \succeq 0\) , yields condition (56) in Proposition 3. Hence, an estimate \( \hat{\theta}_T\) computed as in (7) upon the application of the solution of Problem (64), \( U_\mathrm{e}\) , satisfies the exploration goal (4) with probability at least \( 1-2\delta\) .

A solution of (64) yields \( U_\mathrm{e}=\mathrm{diag}(\bar{u}(\omega_1),\dots,\bar{u}(\omega_L))\) , i.e., the exploration input, which guarantees the desired uncertainty bound \( D_\mathrm{des}\) (4). In LMI (62), the terms \( \Gamma_\mathrm{u}\) , \( \Gamma_\mathrm{w}\) , \( C_3\) and \( G_\mathrm{tr}\) comprise the constants \( \|A_\mathrm{w}\|\) , \( \|A_\mathrm{u}\|\) , \( \|B_\mathrm{tr}\|\) , \( C\) and \( \rho\) , which are uncertain since they depend on the true dynamics \( A_\mathrm{tr}, B_\mathrm{tr}\) . Given \( \theta_\mathrm{tr} \in \Theta_0\) , these constants can be computed using robust control tools or scenario optimization [9, Appendices A, C]. Furthermore, the matrix \( \tilde{\Gamma}_\mathrm{V} \succ 0\) may be computed using robust control methods, cf. [9, Appendices A, C], since (59) is an LMI. Note that imposing the upper bound \( \bar{\gamma} \geq \gamma_\mathrm{e}^2\) and the convex relaxation procedure introduce suboptimality in the solution of (64). This suboptimality can be reduced by solving (64) repeatedly, until \( \tilde{U}\) and \( \bar{\gamma}\) no longer change, updating them for the next iteration as

\[ \begin{align} \tilde{U}=U_\mathrm{e}^*,\;\;\; \bar{\gamma}= (\gamma_\mathrm{e}^*)^2, \end{align} \]

(66)

where \( U_\mathrm{e}^*, \gamma_\mathrm{e}^*\) denote the solution of the previous iteration. Since the previous optimal solution \( U_\mathrm{e}^*\) remains feasible in the next iteration, \( \gamma_\mathrm{e}\) is guaranteed to be non-increasing. From LMI (62), it can be inferred that \( \gamma_\mathrm{e}^2\) scales as \( \frac{1}{T}\) ; hence, when \( D_\mathrm{des}\) and \( T\) grow proportionally, \( U_\mathrm{e}\) remains nearly unchanged. The proposed targeted exploration strategy is outlined in Algorithm 1. Note that the algorithm directly ensures the desired accuracy of the parameters and requires no iterative experiments.
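The non-increasing behaviour of \( \gamma_\mathrm{e}\) under this iteration can be illustrated on a scalar toy problem that mimics the successive convexification: minimize \( \gamma = u^2\) subject to an excitation requirement \( v\,u^2 \geq d\) , with \( u^2\) replaced by its linearization \( 2\tilde{u}u - \tilde{u}^2\) around the previous iterate. This is only an analogy to (64)–(66); the scalar data \( v, d\) and the initial candidate are hypothetical.

```python
# Toy illustration of the iterative refinement (66): each convexified step
# has the closed-form minimizer given by the active linearized constraint,
#   v*(2*u_tilde*u - u_tilde**2) >= d  =>  u = (d + v*u_tilde^2)/(2*v*u_tilde).
# The resulting gamma = u^2 is non-increasing and converges to d / v.
v, d = 2.0, 8.0
u_tilde = 10.0                      # conservative initial candidate
gammas = []
for _ in range(30):
    u = (d + v * u_tilde**2) / (2 * v * u_tilde)   # convexified-step minimizer
    gammas.append(u**2)
    u_tilde = u
print(gammas[-1])   # approaches d / v = 4.0
```

Each iterate stays feasible for the next linearized problem, which is the same mechanism that makes \( \gamma_\mathrm{e}\) non-increasing across iterations of (64).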


Algorithm 1 Targeted exploration
1. Specify the exploration length \( T\) , frequencies \( \omega_i,\,i=1,...,L\) , initial estimates \( \hat{A}_0,\,\hat{B}_0\) with uncertainty level \( D_0\) , the desired accuracy of the parameters \( D_\mathrm{des}\) , the constants \( \epsilon, \lambda\) , and the probability level \( \delta\) .
2. Compute \( \hat{V}\) (58) using the initial estimates and \( \tilde{\Gamma}_\mathrm{v}\) (59). Compute the constants \( C_1\) (22), \( \|A_\mathrm{w}\|\) , \( \|A_\mathrm{u}\|\) , \( \|B_\mathrm{tr}\|\) , \( C\) and \( \rho\) for \( C_2\) and \( C_3\) (55), \( \Gamma_\mathrm{u}\) (45), \( \Gamma_\mathrm{w}\) (46) and \( G_\mathrm{tr}\) (42).
3. Select initial candidates \( \tilde{U}\) and \( \bar{\gamma}\) (66).
4. Set a tolerance \( \mathrm{tol}>0\) .
5. while \( \lvert \frac{\gamma_\mathrm{e} - \gamma_\mathrm{e}^*}{\gamma_\mathrm{e}}\rvert \geq \mathrm{tol}\) do
6. Solve the optimization problem (64).
7. Update \( \tilde{U}\) and \( \bar{\gamma}\) (66).
8. end while
9. Apply the exploration input (27) for \( k=0,...,T-1\) .
10. Compute the parameter estimate \( \hat{\theta}_T\) (7).

6 Numerical Example

In this section, we demonstrate the applicability of the exploration strategy with a numerical example. Simulations were performed in MATLAB using CVX [23] with the default SDP solver SDPT3. We consider a linear system (1) with

\[ \begin{align} A_\mathrm{tr}=\begin{bmatrix} 0.49 & 0.49 & 0 \\ 0 & 0.49 & 0.49 \\ 0 & 0 & 0.49 \end{bmatrix},\; B_\mathrm{tr}=\begin{bmatrix} 0 \\ 0 \\ 0.49 \end{bmatrix} \end{align} \]

(67)

which belongs to a class of systems identified as ‘hard to learn’ [24]. In our simulations, we set the desired error bound as \( D_\mathrm{des}^{-1}=10^4 I_{n_\phi}\) (cf. (4)). Furthermore, we select \( L=5\) frequencies \( \omega_i\in\{0,0.1,0.2,0.3,0.4\}\) and the initial estimate as \( \hat{\theta}_0=\theta_\mathrm{tr}+(4 \times 10^{-4})\mathbf{1}_{n_\mathrm{x}n_\phi}\) with initial uncertainty level \( D_0 = 10^3 I_{n_\phi}\) . We set \( \epsilon=0.5\) , \( \lambda =1\) , \( \delta = 0.05\) and \( \sigma_\mathrm{w}=0.01\) . In what follows, we study the effect of the finite duration \( T\) of the experiment on the average input energy required.
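As a complementary illustration (not a reproduction of the paper’s SDP-based design), the example system (67) can be simulated and identified with a simple regularized least-squares estimator. The sketch below interprets the listed frequencies as rad/sample and adds a small random dither to the multisine input; the dither is used here only to keep this toy regression well conditioned and is not part of the paper’s input design.

```python
import numpy as np

# Simulate the 'hard to learn' example system (67) under a multisine input
# at the L = 5 frequencies from the text (plus a small dither), then recover
# [A, B] by ridge-regularized least squares, a stand-in for estimator (7).
rng = np.random.default_rng(4)
A = np.array([[0.49, 0.49, 0.00],
              [0.00, 0.49, 0.49],
              [0.00, 0.00, 0.49]])
B = np.array([0.0, 0.0, 0.49])
T, lam, sigma_w = 10_000, 1.0, 0.01
freqs = [0.0, 0.1, 0.2, 0.3, 0.4]

x = np.zeros(3)
regressors, targets = [], []
for k in range(T):
    u = sum(np.cos(w * k) for w in freqs) + 0.3 * rng.standard_normal()
    x_next = A @ x + B * u + sigma_w * rng.standard_normal(3)
    regressors.append(np.concatenate([x, [u]]))   # phi_k = [x_k; u_k]
    targets.append(x_next)
    x = x_next
Phi = np.array(regressors)        # T x 4 regressor matrix
Y = np.array(targets)             # T x 3 one-step-ahead states

# ridge least squares: theta_hat^T = (Phi^T Phi + lam*I)^{-1} Phi^T Y
theta_hat = np.linalg.solve(Phi.T @ Phi + lam * np.eye(4), Phi.T @ Y).T
A_hat, B_hat = theta_hat[:, :3], theta_hat[:, 3]
err = max(np.linalg.norm(A_hat - A), np.linalg.norm(B_hat - B))
print(err < 0.05)   # small estimation error
```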

Non-asymptotic behaviour: In the asymptotic case where \( T \to \infty\) [9], the required input energy \( T\gamma_\mathrm{e}^2\) scales linearly with the desired accuracy \( \|D_\mathrm{des}\|\) . Here, we study the effect of the exploration time \( T\) and of the transient error decay on the required average input energy \( \gamma_\mathrm{e}^2\) . For this purpose, we run six trials with exploration times \( T \in \{ 10^{10},10^{11},10^{12},10^{13},10^{14},10^{15}\}\) , scaling \( D_\mathrm{des}\) proportionally, yielding \( D_\mathrm{des} \in \{10^0,10^1,10^2,10^3,10^4,10^5 \}\) , respectively. Each trial executes Algorithm 1 to obtain the exploration inputs (27) and the average input energy \( \gamma_\mathrm{e}^2\) . Figure 1 shows that for \( T \gg 10^{12}\) , the linear relationship is approximately recovered, whereas for \( T < 10^{11}\) , transient effects are significant and slightly more input energy is required. This highlights the role of transient effects in finite-time experiment design and indicates the need for careful input design to mitigate them, even for large exploration times \( T\) .

Figure 1. Illustration of (a) the exploration input energy \( T\gamma_\mathrm{e}^2\) and (b) the average input energy \( \gamma_\mathrm{e}^2\) as functions of the exploration time \( T\) , for the initial uncertainty level \( \|D_0\|=10^3\) and \( \frac{\|D_\mathrm{des}\|}{T}=10^{-10}\) , i.e., the desired accuracy increases linearly with the exploration time \( T\) .

Appendix

References

[1] Lennart Ljung, System Identification: Theory for the User, 2nd ed., Prentice Hall PTR, 1999.

[2] Xavier Bombois, Michel Gevers, Roland Hildebrand, and Gabriel Solari, “Optimal experiment design for open and closed-loop system identification,” Communications in Information and Systems, vol. 11, no. 3, pp. 197–224, 2011.

[3] Henrik Jansson and Håkan Hjalmarsson, “Input design via LMIs admitting frequency-wise model specifications in confidence regions,” IEEE Transactions on Automatic Control, vol. 50, no. 10, pp. 1534–1549, 2005.

[4] Xavier Bombois, Federico Morelli, Håkan Hjalmarsson, Laurent Bako, and Kévin Colin, “Robust optimal identification experiment design for multisine excitation,” Automatica, vol. 125, 109431, 2021.

[5] Märta Barenthin and Håkan Hjalmarsson, “Identification and control: Joint input design and \( H_\infty\) state feedback with ellipsoidal parametric uncertainty via LMIs,” Automatica, vol. 44, no. 2, pp. 543–551, 2008.

[6] Jack Umenberger, Mina Ferizbegovic, Thomas B. Schön, and Håkan Hjalmarsson, “Robust exploration in linear quadratic reinforcement learning,” in Advances in Neural Information Processing Systems, 2019, pp. 15310–15320.

[7] Mina Ferizbegovic, Jack Umenberger, Håkan Hjalmarsson, and Thomas B. Schön, “Learning robust LQ-controllers using application oriented exploration,” IEEE Control Systems Letters, vol. 4, no. 1, pp. 19–24, 2019.

[8] Janani Venkatasubramanian, Johannes Köhler, Mark Cannon, and Frank Allgöwer, “Robust targeted exploration for systems with non-stochastic disturbances,” arXiv preprint arXiv:2412.20426, 2024.

[9] Janani Venkatasubramanian, Johannes Köhler, Julian Berberich, and Frank Allgöwer, “Sequential learning and control: Targeted exploration for robust performance,” IEEE Transactions on Automatic Control, pp. 1–16, 2024, doi: 10.1109/TAC.2024.3430088.

[10] Max Simchowitz, Horia Mania, Stephen Tu, Michael I. Jordan, and Benjamin Recht, “Learning without mixing: Towards a sharp analysis of linear system identification,” in Conference on Learning Theory, PMLR, 2018, pp. 439–473.

[11] Tuhin Sarkar and Alexander Rakhlin, “Near optimal finite time identification of arbitrary linear dynamical systems,” in International Conference on Machine Learning, PMLR, 2019, pp. 5610–5618.

[12] Anastasios Tsiamis and George J. Pappas, “Finite sample analysis of stochastic system identification,” in Proc. 58th IEEE Conference on Decision and Control (CDC), 2019, pp. 3648–3654.

[13] Yasin Abbasi-Yadkori, Dávid Pál, and Csaba Szepesvári, “Improved algorithms for linear stochastic bandits,” in Advances in Neural Information Processing Systems, vol. 24, 2011.

[14] Sarah Dean, Horia Mania, Nikolai Matni, Benjamin Recht, and Stephen Tu, “On the sample complexity of the linear quadratic regulator,” Foundations of Computational Mathematics, pp. 1–47, 2019.

[15] Horia Mania, Stephen Tu, and Benjamin Recht, “Certainty equivalence is efficient for linear quadratic control,” in Advances in Neural Information Processing Systems, vol. 32, 2019.

[16] Anastasios Tsiamis, Ingvar Ziemann, Nikolai Matni, and George J. Pappas, “Statistical learning theory for control: A finite-sample perspective,” IEEE Control Systems Magazine, vol. 43, no. 6, pp. 67–97, 2023.

[17] Andrew Wagenmaker and Kevin Jamieson, “Active learning for identification of linear dynamical systems,” in Conference on Learning Theory, PMLR, 2020, pp. 3487–3582.

[18] Arnab Sarker, Peter Fisher, Joseph E. Gaudio, and Anuradha M. Annaswamy, “Accurate parameter estimation for safety-critical systems with unmodeled dynamics,” Artificial Intelligence, vol. 316, 103857, 2023, doi: 10.1016/j.artint.2023.103857.

[19] Yunke Ao, Johannes Köhler, Manish Prajapat, Yarden As, Melanie Zeilinger, Philipp Fürnstahl, and Andreas Krause, “Stochastic model predictive control for sub-Gaussian noise,” arXiv preprint arXiv:2503.08795, 2025.

[20] David Williams, Probability with Martingales, Cambridge University Press, 1991.

[21] Ryan James Caverly and James Richard Forbes, “LMI properties and applications in systems, stability, and control theory,” arXiv preprint arXiv:1903.08599, 2019.

[22] Henk J. van Waarde, M. Kanat Camlibel, and Mehran Mesbahi, “From noisy data to feedback controllers: Nonconservative design via a matrix S-lemma,” IEEE Transactions on Automatic Control, vol. 67, no. 1, pp. 162–175, 2022, doi: 10.1109/TAC.2020.3047577.

[23] Michael Grant and Stephen Boyd, CVX: MATLAB software for disciplined convex programming, March 2014.

[24] Anastasios Tsiamis and George J. Pappas, “Linear systems can be hard to learn,” in Proc. 60th IEEE Conference on Decision and Control (CDC), 2021, pp. 2903–2910.