更改

添加13,381字节 、 2021年7月15日 (四) 15:40
Hyperlink to queueing networks now jumps directly to the section, making it easier for readers to obtain information
This article describes '''Lyapunov optimization''' for [[dynamical system]]s. It gives an example application to [[optimal control]] in [[Queueing_theory#Queueing_networks|queueing networks]].

==Introduction==

Lyapunov optimization refers to the use of a [[Lyapunov function]] to optimally control a dynamical system. Lyapunov functions are used extensively in control theory to ensure different forms of system stability. The state of a system at a particular time is often described by a multi-dimensional vector. A Lyapunov function is a nonnegative scalar measure of this multi-dimensional state. Typically, the function is defined to grow large when the system moves towards undesirable states. System stability is achieved by taking control actions that make the Lyapunov function drift in the negative direction towards zero.

Lyapunov drift is central to the study of optimal control in queueing networks. A typical goal is to stabilize all network queues while optimizing some performance objective, such as minimizing average energy or maximizing average throughput. Minimizing the drift of a quadratic Lyapunov function leads to the
[[backpressure routing]] algorithm for network stability, also called the ''max-weight algorithm''.<ref name=tass-radio-nets>L. Tassiulas and A. Ephremides,
"[https://drum.lib.umd.edu/bitstream/handle/1903/5346/TR_92-129.pdf?sequence=1 Stability Properties of Constrained Queueing Systems and Scheduling Policies for Maximum Throughput in Multihop Radio Networks], ''IEEE Transactions on Automatic Control'', vol. 37, no. 12, pp. 1936-1948, Dec. 1992.</ref><ref name = tass-server-allocation> L. Tassiulas and A. Ephremides, "[https://drum.lib.umd.edu/bitstream/handle/1903/5345/TR_92-128.pdf?sequence=1&isAllowed=y Dynamic Server Allocation to Parallel Queues with Randomly Varying Connectivity]," IEEE Transactions on Information Theory, vol. 39, no. 2, pp. 466-478, March 1993.</ref>
Adding a weighted penalty term to the Lyapunov drift and minimizing the sum leads to the [[Drift plus penalty|drift-plus-penalty algorithm]] for joint network stability and penalty minimization.<ref name=neely-fairness-infocom05> M. J. Neely, E. Modiano, and C. Li, "[https://www.researchgate.net/profile/Chih_Ping_Li/publication/221242398_Fairness_and_Optimal_Stochastic_Control_for_Heterogeneous_Networks/links/0a85e532265750bfc3000000.pdf Fairness and Optimal Stochastic Control for Heterogeneous Networks]," Proc. IEEE INFOCOM, March 2005.</ref><ref name=now> L. Georgiadis, M. J. Neely, and L. Tassiulas, "[https://www.nowpublishers.com/article/DownloadSummary/NET-001 Resource Allocation and Cross-Layer Control in Wireless Networks]," ''Foundations and Trends in Networking'', vol. 1, no. 1, pp. 1-149, 2006.</ref><ref name = sno-text>M. J. Neely. ''[https://www.morganclaypool.com/doi/abs/10.2200/s00271ed1v01y201006cnt007 Stochastic Network Optimization with Application to Communication and Queueing Systems],'' Morgan & Claypool, 2010.</ref> The drift-plus-penalty procedure can also be used to compute solutions to [[Convex optimization|convex programs]] and [[Linear programming|linear programs]].<ref name =neely-dcdis>M. J. Neely, "[http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.422.2447&rep=rep1&type=pdf Distributed and Secure Computation of Convex Programs over a Network of Connected Processors]," DCDIS Conf, Guelph, Ontario, July 2005</ref>

==Lyapunov drift for queueing networks==

Consider a queueing network that evolves in discrete time with normalized time slots <math>t \in \{0, 1, 2, \ldots\}.</math> Suppose there are <math>N</math> queues in the network, and define the vector of queue backlogs at time <math>t</math> by:

:<math> Q(t) = (Q_1(t), \ldots, Q_N(t))</math>

===Quadratic Lyapunov functions===

For each slot <math>t,</math> define:

:<math>L(t) = \frac{1}{2}\sum_{i=1}^N Q_i(t)^2 </math>

This function is a scalar measure of the total queue backlog in the network. It is called ''quadratic Lyapunov function'' on the queue state. Define the ''Lyapunov drift'' as the change in this function from one slot to the next:

:<math>\Delta L(t) = L(t+1) - L(t)</math>

===Bounding the Lyapunov drift===

Suppose the queue backlogs change over time according to the following equation:

:<math>Q_i(t+1) = \max \left \{ Q_i(t) + a_i(t) - b_i(t), 0 \right \}</math>

where <math>a_i(t)</math> and <math>b_i(t)</math> are arrivals and service opportunities, respectively, in queue <math>i</math> on slot <math>t.</math> This equation can be used to compute a bound on the Lyapunov drift for any slot t:

:<math>Q_i(t+1)^2 = \left ( \max \left \{ Q_i(t) + a_i(t) - b_i(t), 0 \right \} \right )^2 \leqslant \left (Q_i(t) + a_i(t) - b_i(t) \right)^2</math>

Rearranging this inequality, summing over all <math>i,</math> and dividing by 2 leads to:

:<math>\Delta L(t) \leqslant B(t) + \sum_{i=1}^N Q_i(t) (a_i(t) - b_i(t)) \qquad (Eq. 1)</math>

where:

:<math>B(t) = \frac{1}{2}\sum_{i=1}^N \left (a_i(t) - b_i(t) \right )^2</math>

Suppose the second moments of arrivals and service in each queue are bounded, so that there is a finite constant <math>B>0</math> such that for all <math>t</math> and all possible queue vectors <math>Q(t)</math> the following property holds:

:<math>\mathbb{E}[B(t) | Q(t)] \leqslant B</math>

Taking conditional expectations of (Eq. 1) leads to the following bound on the ''conditional expected Lyapunov drift'':

:<math>\mathbb{E}[\Delta L(t) | Q(t)] \leqslant B + \sum_{i=1}^N Q_i(t)\mathbb{E} [a_i(t) - b_i(t) | Q(t)] \qquad (Eq. 2)</math>

===A basic Lyapunov drift theorem===

In many cases, the network can be controlled so that the difference between arrivals and service at each queue satisfies the following property for some real number <math>\varepsilon>0</math>:

:<math>\mathbb{E}[a_i(t) - b_i(t) | Q(t)] \leqslant -\varepsilon</math>

If the above holds for the same epsilon for all queues <math>i,</math> all slots <math>t,</math> and all possible vectors <math>Q(t),</math> then (Eq. 2) reduces to the drift condition used in the following Lyapunov drift theorem. The theorem below can be viewed as a variation on [[Foster's theorem]] for [[Markov chains]]. However, it does not require a Markov chain structure.

:'''Theorem (Lyapunov Drift).'''<ref name=sno-text/><ref name=leonardi>E. Leonardi, M. Mellia, F. Neri, and M. Ajmone Marsan, "[https://www.researchgate.net/profile/Emilio_Leonardi/publication/221244643_Bounds_on_Average_Delays_and_Queue_Size_Averages_and_Variances_in_Input-Queued_Cell-Based_Switches/links/546c9c490cf21e510f63ec2d/Bounds-on-Average-Delays-and-Queue-Size-Averages-and-Variances-in-Input-Queued-Cell-Based-Switches.pdf Bounds on Average Delays and Queue Size Averages and Variances in Input-Queued Cell-Based Switches]", Proc. IEEE INFOCOM, 2001.</ref> Suppose there are constants <math>B\geqslant 0, \varepsilon>0</math> such that for all <math>t</math> and all possible vectors <math>Q(t)</math> the conditional Lyapunov drift satisfies:
::<math>\mathbb{E}[\Delta L(t)|Q(t)] \leqslant B - \varepsilon \sum_{i=1}^N Q_i(t).</math>
:Then for all slots <math>t>0</math> the time average queue size in the network satisfies:
::<math>\frac{1}{t}\sum_{\tau=0}^{t-1} \sum_{i=1}^N \mathbb{E}[Q_i(\tau)] \leqslant \frac{B}{\varepsilon } + \frac{\mathbb{E}[L(0)]}{\varepsilon t}.</math>

'''Proof.''' Taking expectations of both sides of the drift inequality and using the law of iterated expectations yields:

:<math>\mathbb{E}[\Delta L(t)] \leqslant B - \varepsilon \sum_{i=1}^N \mathbb{E}[Q_i(t)]</math>

Summing the above expression over <math>\tau \in \{0, 1, \ldots, t-1\}</math> and using the law of telescoping sums gives:

:<math>\mathbb{E}[L(t)] - \mathbb{E}[L(0)] \leqslant Bt - \varepsilon \sum_{\tau=0}^{t-1}\sum_{i=1}^N \mathbb{E}[Q_i(\tau)]</math>

Using the fact that <math>L(t)</math> is non-negative and rearranging the terms in the above expression proves the result.

==Lyapunov optimization for queueing networks==

Consider the same queueing network as in the above section. Now define <math>p(t)</math> as a ''network penalty'' incurred on slot <math>t.</math> Suppose the goal is to stabilize the queueing network while minimizing the time average of <math>p(t).</math> For example, to stabilize the network while minimizing time average power, <math>p(t)</math> can be defined as the total power incurred by the network on slot t.<ref name = neely-energy-it>M. J. Neely, "[http://www-bcf.usc.edu/~mjneely/pdf_papers/neely-energy-it.pdf Energy Optimal Control for Time Varying Wireless Networks]," IEEE Transactions on Information Theory, vol. 52, no. 7, pp. 2915-2934, July 2006.</ref> To treat problems of maximizing the time average of some desirable ''reward'' <math>r(t),</math> the penalty can be defined <math>p(t) = -r(t).</math> This is useful for maximizing network throughput utility subject to stability.<ref name=neely-fairness-infocom05/>

To stabilize the network while minimizing the time average of the penalty <math>p(t),</math> network algorithms can be designed to make control actions that greedily minimize a bound on the following [[Drift plus penalty|drift-plus-penalty expression]] on each slot <math>t</math>:<ref name="sno-text"/>

:<math> \Delta L(t) + Vp(t)</math>

where <math>V</math> is a non-negative weight that is chosen as desired to affect a performance tradeoff. A key feature of this approach is that it typically does not require knowledge of the probabilities of the random network events (such as random job arrivals or channel realizations). Choosing <math>V=0</math> reduces to minimizing a bound on the drift every slot and, for routing in multi-hop queueing networks, reduces to the [[backpressure routing]] algorithm developed by Tassiulas and Ephremides.<ref name=tass-radio-nets/><ref name=tass-server-allocation/> Using <math>V>0</math> and defining <math>p(t)</math> as the network power use on slot <math>t</math> leads to the [[Drift plus penalty|drift-plus-penalty algorithm]] for minimizing average power subject to network stability developed by Neely.<ref name=neely-energy-it/> Using <math>V>0</math> and using <math>p(t)</math> as the negative of an admission control utility metric leads to the drift-plus-penalty algorithm for joint flow control and network routing developed by Neely, Modiano, and Li.<ref name=neely-fairness-infocom05/>

A generalization of the Lyapunov drift theorem of the previous section is important in this context. For simplicity of exposition, assume <math>p(t)</math> is bounded from below:

:<math>p(t) \geqslant p_{\min} \quad \forall t \in \{0, 1, 2, ...\}</math>

For example, the above is satisfied with <math>p_{\min} = 0</math> in cases when the penalty <math>p(t)</math> is always non-negative. Let <math>p^*</math> represent a desired target for the time average of <math>p(t).</math> Let <math>V</math> be a parameter used to weight the importance of meeting the target. The following theorem shows that if a drift-plus-penalty condition is met, then the time average penalty is at most O(1/V) above the desired target, while average queue size is O(V). The <math>V</math> parameter can be tuned to make time average penalty as close to (or below) the target as desired, with a corresponding queue size tradeoff.

:'''Theorem (Lyapunov Optimization).''' Suppose there are constants <math>\varepsilon >0, V, B \geqslant 0,</math> and <math>p^*</math> such that for all <math>t</math> and all possible vectors <math>Q(t)</math> the following drift-plus-penalty condition holds:
::<math>\mathbb{E}[\Delta L(t) + Vp(t) | Q(t)] \leqslant B + Vp^* - \varepsilon \sum_{i=1}^NQ_i(t)</math>
:Then for all <math>t>0</math> the time average penalty and time average queue sizes satisfy:
::<math>\frac{1}{t}\sum_{\tau=0}^{t-1} \mathbb{E}[p(\tau)] \leqslant p^* + \frac{B}{V} + \frac{\mathbb{E}[L(0)]}{Vt}</math>
::<math>\frac{1}{t}\sum_{\tau=0}^{t-1} \sum_{i=1}^N \mathbb{E}[Q_i(\tau)] \leqslant \frac{B + V(p^* - p_{\min})}{\varepsilon} + \frac{\mathbb{E}[L(0)]}{\varepsilon t} </math>

'''Proof.''' Taking expectations of both sides of the posited drift-plus-penalty and using the law of iterated expectations we have:

:<math>\mathbb{E}[\Delta L(t)] + V \mathbb{E}[p(t)] \leqslant B + Vp^* - \varepsilon \sum_{i=1}^N \mathbb{E}[Q_i(t)]</math>

Summing the above over the first <math>t</math> slots and using the law of telescoping sums gives:

:<math>\begin{align}
\mathbb{E}[L(t)] - \mathbb{E}[L(0)] + V\sum_{\tau=0}^{t-1}\mathbb{E}[p(\tau)] &\leqslant (B+Vp^*)t - \varepsilon \sum_{\tau=0}^{t-1}\sum_{i=1}^N \mathbb{E}[Q_i(\tau)] \\
- \mathbb{E}[L(0)] + V\sum_{\tau=0}^{t-1}\mathbb{E}[p(\tau)] &\leqslant (B+Vp^*)t && \text{Since } L(t), Q_i(t) \geqslant 0 \\
V\sum_{\tau=0}^{t-1}\mathbb{E}[p(\tau)] &\leqslant p^* Vt + Bt + \mathbb{E}[L(0)]
\end{align}</math>

Dividing by <math>Vt</math> and rearranging terms proves the time average penalty bound. A similar argument proves the time average queue size bound.

==Related links==
* [[Drift plus penalty]]
* [[Backpressure routing]]
* [[Lyapunov function]]
* [[Foster's theorem]]
* [[Control-Lyapunov function]]

==References==
{{Reflist|30em}}

==Primary Sources==
*M. J. Neely. ''Stochastic Network Optimization with Application to Communication and Queueing Systems'', Morgan & Claypool, 2010.

[[Category:Networking algorithms]]
[[Category:Queueing theory]]