Yesterday my wife asked me why \(0! = 1\), and I realized that, off the top of my head, I only had two answers: (1) it's defined that way, and (2) the Gamma function at one is one. Well, as you might imagine, (1) was unsatisfying to her and (2) involved esoteric math that only nerds want to know about. So I began to wonder what elementary arguments there are that \(0! = 1\). After a few minutes on Google, I decided to record them here. Note that I'm including everything I can think of for completeness. Some arguments directly or indirectly reduce to a convention, others are conceptual; some are harder, some are easier; and so on.
1. Argument from definition: we define \(0! = 1\).
2. Argument from empty product convention: we define \(n! = \prod_{k = 1}^n k\) and note that this implies \(0!\) is the empty product, which is 1 by convention. (See the footnote below.)
3. Argument from recursive definition: we note that \(n! = \frac{(n+1)!}{n+1}\) and put \(n = 0\).
4. First argument from combinatorics: \(n!\) is the number of ways to arrange \(n\) things and there is exactly one way to arrange zero things: do nothing.
5. Second argument from combinatorics: \(_nC_p\), where \(n \geq p\), is the number of ways to choose \(p\) things from a set of \(n\) things and is computed by the formula \[_nC_p = \frac{n!}{p!(n-p)!} .\] There is only one way to choose all \(n\) things from a set of \(n\) things. Thus, for the formula to work in this case (\(p = n\)), it must be that \(\frac{n!}{n!0!}= 1 \), so \(0! = 1\).
6. Argument from the Gamma function: let \(\text{Re}(z) >0\) and define \(\Gamma(z) = \int_0^\infty t^{z-1}e^{-t}dt\). Note that, for all integers \(n > 1\), \(\Gamma(n) = (n-1)!\). Next, for all \(z\) in the right half of the complex plane, define \(z! = \Gamma(z+1)\). Since \(\Gamma(1) = 1\), it follows that \(0! = 1\).
7. Argument from bijective mappings: this is really just a more esoteric rephrasing of argument 4. Define \(n!\) as the number of bijective mappings between two sets with \(n\) elements each. For \(n = 0\) this is the number of bijective mappings from the empty set to itself. There is exactly one such mapping and therefore \(0! = 1\).
Footnote: why is the empty product equal to one? Let \(A\) be a finite set of positive numbers and define \(P_A = \prod_{a\in A}a\). Then \(\ln P_A = \sum_{a\in A}\ln a\). Now let \(A\) be the empty set. By convention the empty sum is zero, so \(\ln P_A = 0\), hence \(P_A = 1\).
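If you'd rather let the computer weigh in, here's a quick Python sanity check (nothing beyond the standard library; the loop just runs argument 3's recursion downward):

```python
import math

# Arguments 1 and 6: the standard library and the Gamma function agree that 0! = 1.
print(math.factorial(0))   # 1
print(math.gamma(1))       # 1.0, since Gamma(n) = (n-1)! for positive integers n

# Argument 3: run n! = (n+1)!/(n+1) downward from 5! and watch it land on 1 at n = 0.
fact = math.factorial(5)
for n in range(4, -1, -1):
    fact = fact // (n + 1)   # (n+1)!/(n+1) = n!
    print(n, fact)           # ends with "0 1"
```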
Saturday, September 2, 2017
Fourier Transforms Redux
A while ago I wrote a blog post on developing some intuition behind the Fourier transform. Recently a Math Stack Exchange question came up about it, which I answered. Not to toot my own horn (readers should recognize such clear false modesty from me by now...), but I think I was able to write a fairly good answer which also brought the moment generating function into focus. Here is a link to the post.
Monday, August 21, 2017
The frequencies of the frequency domain: \(f\), \(\omega\), \(s\), and \(z\)
This is a note for myself (and anyone else interested) on the relationship between the four major frequencies \(f\), \(\omega\), \(s\), and \(z\).
We learn very early on in our education that periodic functions are natural representations of many kinds of motion: a bouncing ball, a rotating wheel, and so on. We next learn that complicated periodic functions can be represented as linear combinations of simple periodic functions (sines and cosines) via Fourier series. Finally, we extend the countable linear combination of the Fourier series to an uncountable "linear combination" using the Fourier transform and, writing the sine/cosine mix in exponential form as \(e^{j\omega t}\), we find a way to represent even aperiodic functions in a basis of sinusoids.
1. The physical basis of frequency
The most basic physical interpretation of frequency is the number of cycles of a pure sinusoid per unit time. It's the same count whether you measure peak-to-peak, trough-to-trough, or between any two corresponding points with the same spacing. The units that make the most sense for this number are cycles/second, which we call Hertz or Hz. That is,
\(f\) is the physical frequency of the oscillation and always has units of Hz = cycles/second.
2. \(\omega\): physical frequency with different units
Using the physical units of Hz is great until we need to make calculations. For instance, if we started a sinusoid of frequency \(f\) at \(t = 0\) and wanted to know how many cycles had passed at time \(t = T\), including the nearest fraction of a cycle, we would use the simple formula
$$N_{cycles} = f[Hz] \times T[sec]. $$
As an example, at a frequency of 60 Hz (60 cycles/second) and time of 1.3 sec, we would have \(60 \times 1.3 = 78\) cycles. Partial cycles are allowed too; at 1.36 sec of a 60 Hz sinusoid we have 81.6 cycles which have passed--81 full cycles and 60% of a full cycle.
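In Python the cycle-count arithmetic looks like this (a throwaway sketch; the numbers are the ones from the example above):

```python
f = 60.0                     # Hz = cycles/second
T = 1.36                     # elapsed time in seconds
n_cycles = f * T             # (cycles/second) * seconds = cycles
full_cycles = int(n_cycles)  # 81 complete cycles
fraction = n_cycles - full_cycles
print(n_cycles, full_cycles, fraction)   # ~81.6, 81, ~0.6 (up to float roundoff)
```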
Now suppose we want to look at the actual value of the sinusoid (up to scaling by the magnitude) instead of just the fraction of the cycle. That is, we want an argument \(x(t)\) to put in so that, at time \(t = T\), \(\sin(x(T))\) gives the value that a sine wave of frequency \(f\) has at that time.
We can't use \(x(t) = ft\) because after a full cycle (starting at zero), when \(ft = 1\), the sine wave is again zero, but \(\sin(1) \neq 0\). We need to modify the argument function to reflect this. You can convince yourself that
$$x(t) = 2\pi ft$$
is what we want. Adding the \(2\pi\) can get messy however. If I need the fourth derivative of \(\sin(2\pi ft)\) for some reason, I get an explosion of \(2\pi\)'s. So we choose to define
$$\omega = 2\pi f$$
as a way of simplifying the algebra. We know the fourth derivative of \(\sin(\omega t)\) is \(\omega^4\sin(\omega t)\).
Another nice thing about defining \(\omega\) like this is that it allows us to measure the frequency in terms of angles (in fact it's called the angular frequency). I actually didn't understand this in my basic courses and just followed the math, but it's the crux of the issue: sinusoids are circular functions in the sense that the \((x,y)\)-coordinates of the point on the unit circle at angle \(\theta\) are exactly \((\cos\theta,\sin\theta)\) (measuring positive angles counter-clockwise). We can express these angles in terms of radians and find that there are exactly \(2\pi\) radians in a single rotation about the unit circle, or, equivalently, in a single cycle of a pure sinusoid. That is,
1 cycle \(= 2\pi\) radians.
Recall that \(f\) has units of cycles per second, so it follows that \(\omega = 2\pi f\) must have units of radians per second. That is, \(\omega\) measures the rate at which the angle of the \((x,y)\) coordinate changes in time. To summarize:
\(\omega\) measures the angular rate of the oscillation and always has units of rad/sec. \(\omega\) is always larger than \(f\) by a factor of \(2\pi \approx 6\).
So if you're designing a filter with a 60 Hz rolloff point, don't put a pole at \(\omega = 60\)! Put it at \(\omega = 2\pi\cdot 60\) (i.e., at \(f = 60\)). If you're making or using a Bode plot, make sure you use \(f\) or \(\omega\) as appropriate for the design requirements!
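To make the pitfall concrete, here's a hedged sketch (Python with numpy/scipy assumed available) that places the pole of a first-order low-pass at \(\omega_c = 2\pi\cdot 60\) rad/sec and checks that the response really is about 3 dB down at 60 Hz, while a pole mistakenly placed at 60 rad/sec puts the corner near 9.5 Hz:

```python
import numpy as np
from scipy import signal

f_c = 60.0                 # the design spec, in Hz
w_c = 2 * np.pi * f_c      # convert to rad/sec before touching the transfer function

# First-order low-pass H(s) = w_c / (s + w_c); its corner sits at w_c rad/sec.
H = signal.TransferFunction([w_c], [1.0, w_c])
w, mag, phase = signal.bode(H, w=[2 * np.pi * 60.0])   # evaluate at f = 60 Hz
print(mag[0])              # about -3 dB, as intended

# The common mistake: a pole at "60" with no 2*pi, i.e. a corner at 60 rad/sec (~9.5 Hz).
H_wrong = signal.TransferFunction([60.0], [1.0, 60.0])
w, mag, phase = signal.bode(H_wrong, w=[2 * np.pi * 60.0])
print(mag[0])              # roughly -16 dB at 60 Hz: the filter rolls off far too early
```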
3. \(s\): The Laplace Frequency
The first thing we typically do when we run into a linear ODE is take the Laplace transform. The LT is essentially a one-sided Fourier transform (negative times aren't included, since negative time makes no sense here) that allows the frequency to be any complex number. I've blogged about how the FT is just the continuous set of coefficients for expressing a function in a basis of sines and cosines (that is, the FT is a continuous analog of Fourier series), and the LT is the same thing, but now using \(s = \sigma + j\omega\) instead of just \(j\omega\). This means we now include damped sines and cosines (or sines and cosines with an envelope, if you want to think about it that way) in the basis set.
Then a funny thing happens: in control, we always set \(\sigma = 0\).
Which makes the LT the same as the FT! (assuming the function is zero for \(t < 0\)). This means that the ubiquitous \(s\) factors found in control are no different than \(j\omega\). I once saw this referred to as the "Joukowski Substitution", but I've lost the source of that name and can't find it again. To controls engineers it's no more than a basic rule:
$$s = j \omega.$$
Since \(j\) is a unitless constant (square root of minus one), \(s\) also has units of rad/sec, just like \(\omega\). It's worth remembering that \(s\) doesn't have units of Hz however; I've seen more than one textbook get that wrong!
4. \(z\): The Discrete Frequency
Of course, modern controllers are not often implemented in continuous time (a tragedy, but alas, we'll save that rant for another post). The invention of the microprocessor basically killed analog controllers for all but a few special cases. Computers think in digital time and chop continuous time up into discrete instants to do so. To see how this works, consider the continuous-time differential equation
$$\dot{x} = ax.$$
We want to sample it at a rate of \(f_s\) samples/sec, called the sampling frequency. Its reciprocal, \(T_s = 1/f_s\) sec/sample, is called the sampling time. If we suppose we know \(x\) at time \(t = nT_s\) (\(n\) samples from the time we turned the computer on), then we can integrate the equation from \(t_0 = nT_s\) to \(t = (n+1)T_s\). For any \(t\) and \(t_0\) the solution is
$$x(t) = e^{a(t-t_0)}x(t_0),$$
thus, putting the discrete time in the result, we get
$$x[n+1] = e^{aT_s}x[n]. $$
We know we can represent the first equation using the LT variable \(s\) via \((s-a)X(s) = x_0\). It also turns out there's a discrete \(Z\)-transform which turns the shift operator \(qx[n] = x[n+1]\) into a complex variable \(z\): \(x[n+1] = zx[n]\). The full substitution turns out to be \((z-e^{aT_s})X(z) = zx_0\). The pole is in the same ultimate place regardless of which transformation we use (\(z\) and \(s\) are both complex variables, so \(X\) has to be the same, up to a change of coordinates), so we conclude that the continuous pole \(s = a\) corresponds to the discrete pole \(z = e^{aT_s}\), or
$$ z = e^{sT_s}. $$
We should be able to recover the continuous time version as \(T_s \rightarrow 0\), so let's do that. We have
$$X(e^{sT_s}) = \frac{e^{sT_s}}{e^{sT_s}-e^{aT_s}}x_0 = \frac{x_0}{1- e^{aT_s-sT_s}} \approx \frac{x_0/T_s}{s-a},$$
where the scaling by \(T_s\) washes away in the continuum limit of the inverse transform. The important part is that the pole is in the right place.
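Here's a quick numerical check of the pole mapping (a sketch in Python with numpy; the system \(\dot{x} = ax\) and the numbers are just illustrative):

```python
import numpy as np

a = -2.0      # continuous pole s = a (stable)
Ts = 0.01     # sampling time, seconds
x0 = 1.0

# Discrete pole from the mapping z = exp(s*Ts), evaluated at s = a.
z_pole = np.exp(a * Ts)

# March the discrete recursion x[n+1] = exp(a*Ts) x[n] ...
N = 500
x_disc = x0 * z_pole ** np.arange(N)

# ... and compare against the continuous solution x(t) = exp(a t) x0 at the sample instants.
t = np.arange(N) * Ts
x_cont = x0 * np.exp(a * t)
print(np.max(np.abs(x_disc - x_cont)))   # ~1e-16: the discretization is exact at the samples
```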
Sunday, July 31, 2016
The Gaussian Moment Integrals
In this blog I want to document a useful technique for evaluating integrals of the form
$$I_n(a) = \int_{-\infty}^{\infty}x^n e^{-ax^2}\ dx,$$
for any integer \(n\). Recall the Gaussian integral is given by
$$I_0(1) = \int_{-\infty}^{\infty}e^{-x^2}\ dx = \sqrt{\pi},$$
from which we easily deduce
$$ I_0(a) = \sqrt{\frac{\pi}{a}}.$$
Now note that for odd \(n\) the integrand is anti-symmetric, so the integral over the real line vanishes, i.e.
$$I_n(a) = 0, \ \ n\text{ odd}. $$
To see this analytically, note that for \(n = 1\)
$$ x e^{-ax^2} = -\frac{1}{2a}\frac{d}{dx}e^{-ax^2}, $$
so the fundamental theorem of calculus gives \(I_1(a) = 0\). The higher odd moments then vanish as well, since \(I_{n+2}(a) = -\partial I_n(a)/\partial a\), so they all inherit the zero from \(I_1\) by induction.
Wonderful!
For even \(n\) we apply the trick commonly known as differentiation under the integral sign, or Feynman integration, though Prof. Feynman did not originate it: we differentiate a known integral with respect to a parameter of the integrand enough times to generate the integrand we actually want, exchanging the order of differentiation and integration. Here each \(\partial/\partial a\) applied to \(e^{-ax^2}\) brings down a factor of \(-x^2\), thus
$$ \int_{-\infty}^{\infty}x^n e^{-ax^2}\ dx = (-1)^{n/2}\int_{-\infty}^{\infty}\frac{\partial^{n/2}}{\partial a^{n/2}}e^{-ax^2}\ dx = (-1)^{n/2}\frac{\partial^{n/2}}{\partial a^{n/2}}\int_{-\infty}^{\infty}e^{-ax^2}\ dx,$$
from which we obtain
$$I_n(a) = (-1)^{n/2}\sqrt{\pi}\frac{\partial^{n/2}}{\partial a^{n/2}}a^{-1/2} = \frac{\sqrt{\pi}\,(n-1)!!}{2^{n/2}a^{(n+1)/2}}, \ \ n\text{ even},$$
where we define
$$k!! = 1\cdot 3\cdot 5 \cdots (k-2)\cdot k, \ \ k\text{ odd}.$$
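A numerical spot-check of the even-moment formula (a Python sketch assuming scipy is installed; the values of \(a\) and \(n\) are arbitrary):

```python
import numpy as np
from scipy import integrate
from scipy.special import factorial2

def I_numeric(n, a):
    # Brute-force evaluation of the moment integral over the real line.
    val, _ = integrate.quad(lambda x: x**n * np.exp(-a * x**2), -np.inf, np.inf)
    return val

def I_formula(n, a):
    # Closed form for even n: sqrt(pi) (n-1)!! / (2^(n/2) a^((n+1)/2)).
    return np.sqrt(np.pi) * factorial2(n - 1) / (2 ** (n / 2) * a ** ((n + 1) / 2))

a = 1.7
for n in [2, 4, 6]:
    print(n, I_numeric(n, a), I_formula(n, a))   # the two columns agree
for n in [1, 3, 5]:
    print(n, I_numeric(n, a))                    # the odd moments vanish (up to roundoff)
```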
Exercise:
Let \(\nu\) be a random white-noise variable which is Gaussian distributed, i.e.
$$\nu \sim \frac{1}{\sqrt{2\pi}\sigma}e^{-\nu^2/(2\sigma^2)}.$$
Show that the RMS value of \(\nu\) is \(\sigma\). In other words, show
$$\sqrt{\bar{\nu^2}} = \sigma.$$
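If you want to check the answer empirically before (or after) doing the integral, here's a quick Monte Carlo sketch in Python (numpy assumed; the value of \(\sigma\) and the sample size are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
sigma = 2.5
nu = rng.normal(0.0, sigma, size=1_000_000)  # draws of the zero-mean Gaussian noise

rms = np.sqrt(np.mean(nu**2))                # square root of the mean-square value
print(rms, sigma)                            # agree to a few parts per thousand
```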
Thursday, June 2, 2016
Weekly blog 4: Derivations of the Fourier Transform
I swear I'll get to the discussion of first and second order systems in a bit, but before leaving the topic of transforms altogether, I wanted to write another blog on the Fourier transform. You see, the Fourier transform is special to me because, despite the fact it is really easy to understand intuitively, no one ever gave me this intuition and I had to build it on my own. That's really a shame, since I believe intuition can be a much more agile tool than rigor, especially for people who are practitioners rather than just theorists, and so I wanted to take some time to document my own intuition for the Fourier transform hoping that someone else might find it useful.
1. Sinusoidal Representations of Functions
Most people know that functions can be represented and approximated by other functions in a methodical way. For instance, the Taylor series approximates a function by a series of polynomials with the right coefficients, which in the Taylor case are given by the derivatives evaluated at the expansion point. Most people also know that series of sines and cosines can do the job for periodic functions. In particular, the set of functions
$$S = \left\{\sin\frac{2\pi n t }{T}\right\}_{n\in\mathbb{N}}$$
is orthogonal with respect to the inner product
$$(f,g) = \frac{1}{T}\int_{-T/2}^{T/2}f(t) g(t)\ dt,$$
and thus, for a suitable function \(f\) on \([0,T]\), we can write an approximation consisting of all sines, i.e., find \(a_0,a_1,a_2,\dots,a_N\) such that
$$f(t) \approx \sum_{n=0}^Na_n\sin\frac{2\pi n t }{T},\ \ t\in [0,T].$$
The same is true of cosines. Notice that these are all sines and cosines whose frequencies \(\omega_n = 2\pi n/T\) are discrete harmonics of the fundamental frequency \(\omega_1 = 2\pi/T\).
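Here's a small numerical illustration of the projection idea (a Python sketch with numpy; I use the symmetric interval \([-T/2,T/2]\) where the inner product above is defined, and an odd test function so that sines alone suffice). Each coefficient is the inner product of \(f\) with the corresponding sine, divided by the norm squared of that sine (which is \(1/2\) here):

```python
import numpy as np

T = 2.0
t = np.linspace(-T / 2, T / 2, 4001)
dt = t[1] - t[0]
f = t * (T / 2 - np.abs(t))            # an odd test function on [-T/2, T/2]

N = 25
approx = np.zeros_like(t)
for n in range(1, N + 1):
    s_n = np.sin(2 * np.pi * n * t / T)
    # (f, s_n) = (1/T) * integral, and (s_n, s_n) = 1/2, so a_n = 2 (f, s_n).
    a_n = 2.0 * np.sum(f * s_n) * dt / T
    approx += a_n * s_n

print(np.max(np.abs(f - approx)), np.max(np.abs(f)))   # error ~1e-4 against a peak of 0.25
```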
2. The Spring Analogy
The trick I keep in mind when thinking about the Fourier transform is simple: I know that these sines and cosines arise as solutions for simple harmonic oscillators (say ideal, massless, infinitely elastic, undamped springs) which each satisfy the equation
$$\ddot{x} = -\omega^2_nx.$$
Actually, we can even write these solutions together, in an exponential representation given by \(e^{\pm j\omega_nt}\). What the Fourier series of a function essentially says is that the function may be approximated by a weighted sum of solutions to simple harmonic oscillator equations. This is equivalent to weighting either the harmonics of a single spring or the fundamentals of a (possibly infinite) series of springs whose natural frequencies are all harmonics of each other.
Now we imagine including not just positive harmonics but (mathematically possible) negative harmonics of an infinite series of springs. We write
$$f(t) \approx \sum_{n=-\infty}^{\infty}a_nS_n, $$
where \(S_n\) is any choice for representing the solution to the spring equation. Finally, we imagine an uncountably infinite (stay with me) number of springs which have natural frequencies at every number from \(-\infty\) to \(\infty\) and go ahead and use \(S_n = e^{j\omega_n t}\) where now \(n\in\mathbb{R}\). We can drop the subscripts on the frequencies (since now there's no point) and turn the sum into an integral. Since each frequency needs a unique coefficient (to determine its weighting in the overall sum) the sequence \(a_n\) becomes a function \(a(\omega)\) and we may write
$$f(t) = \frac{1}{2\pi}\int_{-\infty}^{\infty}a(\omega)e^{j\omega t}\ d\omega.$$
(The factor of \(1/2\pi\) is just a normalization constant that keeps this formula consistent with the inverse given below.)
\(a(\omega)\) still represents the weighting of the springs, but using a continuous index now instead of a discrete index, and just as \(a_n\) as a sequence represented the function by the weighting given to a series of springs whose solutions were understood, so too does \(a(\omega)\). The difference however is that while \(a_n\) approximately represented \(f\) using a set number of harmonics, \(a(\omega)\) gives an exact representation in terms of all possible frequencies. Thus we say that \(a(\omega)\) is the frequency representation of \(f(t)\), and can be thought of as the function which assigns a weighting to the set of all possible springs so that their combined behavior imitates \(f(t)\) exactly. This representation is one-to-one with the set of all continuous, integrable, bounded functions \(f(t)\). In fact there is an inverse:
$$a(\omega) = \int_{-\infty}^{\infty}f(t)e^{-j\omega t}\ dt.$$
Now where have I seen that before?
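As a sanity check that these two formulas really are inverses of each other, here's a crude numerical sketch (Python with numpy; the Gaussian \(f(t) = e^{-t^2}\) is just a convenient test signal whose transform, \(\sqrt{\pi}e^{-\omega^2/4}\), is known in closed form):

```python
import numpy as np

t = np.linspace(-10, 10, 4001)
dt = t[1] - t[0]
f = np.exp(-t**2)                        # test signal with a known transform

# Forward: a(w) = integral of f(t) exp(-j w t) dt, done as a plain Riemann sum.
w = np.linspace(-10, 10, 2001)
a = np.array([np.sum(f * np.exp(-1j * wk * t)) * dt for wk in w])
print(np.max(np.abs(a - np.sqrt(np.pi) * np.exp(-w**2 / 4))))   # tiny

# Inverse: f(t) = (1/2pi) * integral of a(w) exp(+j w t) dw, checked at a few times.
dw = w[1] - w[0]
for tk in [0.0, 0.5, 1.0]:
    f_rec = np.sum(a * np.exp(1j * w * tk)) * dw / (2 * np.pi)
    print(tk, f_rec.real, np.exp(-tk**2))   # the last two columns agree
```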
Weekly Blog 3: More About Integral Transforms
Welcome to the 3rd installment of the weekly controls blog! I was going to do a blog on first and second-order systems, but wow, those Laplace transforms last time got my blood pumping, so I'm going to extend the discussion on integral transforms a bit! I've also recently become interested in applications of integral equations, Fredholm theory, and fractional calculus to control, and I figure recounting a bit of common knowledge on integral transforms is a good way to dive into this.
Of course, I won't do much more on integral transforms in the broad scope; there are 500-page books for that. Instead I'll work a little bit more with the Laplace transform
$$(\mathcal{L}f)(s) = \int_{0}^{\infty}e^{-st}f(t)\ dt,$$
and the Fourier transform
$$(\mathcal{F}f)(\omega) = \int_{-\infty}^{\infty}e^{-j\omega t}f(t)\ dt,$$
where, since I'm an engineer, I prefer to annoy mathematicians by defining \(j^2 = -1\).
1. Laplace Transform of a Delay
We've seen the widely used result \(\mathcal{L}\dot{y}(t) = sY(s)\) in the previous blog. This can also be seen as the Laplace transform of the operator \(d/dt\) which gives us the operator equation
$$ \mathcal{L}\frac{d}{dt} = s,$$
or, recursively
$$ \mathcal{L}\frac{d^n}{dt^n} = s^n.$$
We proved this by taking an arbitrary sample function, in this case \(y\), with known Laplace transform \(Y(s)\), and computing the Laplace transform of its derivative directly.
Now let \(D_\tau\) be the delay operator which is defined as the operator for which
$$D_\tau f(t) = f(t-\tau).$$
That is, \(D_\tau\) simply delays an arbitrary function \(f\) by some amount of time \(\tau\). The question of this section will be to compute \(\mathcal{L}D_\tau\). We do this again by using a sample function, say \(f\), so that we have
$$ \mathcal{L}D_\tau f = \int_{0}^{\infty}e^{-st}D_\tau f(t)\ dt = \int_{0}^{\infty}e^{-st}f(t-\tau)\ dt.$$
Letting \(\sigma(t) = t-\tau\) we find \(d\sigma = dt\) and \(\sigma(0) = -\tau,\sigma(\infty)=\infty\). Then
$$\int_{0}^{\infty}e^{-st}f(t-\tau)\ dt = \int_{-\tau}^{\infty}e^{-s(\sigma+\tau)}f(\sigma)\ d\sigma = e^{-s\tau}\left(\int_{-\tau}^{0}e^{-s\sigma}f(\sigma)\ d\sigma + \int_{0}^{\infty}e^{-s\sigma}f(\sigma)\ d\sigma \right).$$
As before we define \(F(s) = \mathcal{L}f\). Now we need only take care of that bizarre term with the delay in the integral bounds. We have a couple of options: most people require the delayed signal \(f(t-\tau)\) to be zero for \(0\leq t\leq\tau\) (i.e., \(f = 0\) on \([-\tau, 0]\)), which you will find makes the integral above work out to only \(F(s)\) without the extra term, but this feels ad hoc to me. We can equivalently simply define \(f = 0\) for all \(t < 0\), but again, why should we? We obviously need to do something. For now let's just say
$$\int_{-\tau}^{0}e^{-s\sigma}f(\sigma)\ d\sigma + F(s) \approx F(s)$$
but keep that issue in the back of our minds. We thus have
$$\mathcal{L}(D_\tau) = e^{-s\tau}.$$
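Here's a quick numerical check of the delay rule (a Python sketch assuming scipy; I take \(f(t) = e^{-t}\) for \(t\geq 0\) and adopt the \(f = 0\) for \(t < 0\) convention we just grumbled about, then compare both sides at a few real values of \(s\)):

```python
import numpy as np
from scipy import integrate

tau = 0.7

def f(t):
    # Sample function: e^{-t} for t >= 0, zero for t < 0.
    return np.exp(-t) if t >= 0 else 0.0

def F(s):
    # Known transform of f: L{e^{-t}} = 1/(s+1).
    return 1.0 / (s + 1.0)

for s in [0.5, 1.0, 2.0]:
    g = lambda t: np.exp(-s * t) * f(t - tau)
    # Split the integral at the kink t = tau so quad converges cleanly.
    lhs = integrate.quad(g, 0.0, tau)[0] + integrate.quad(g, tau, np.inf)[0]
    rhs = np.exp(-s * tau) * F(s)
    print(s, lhs, rhs)   # the columns match: L{D_tau f} = e^{-s tau} F(s)
```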
2. Fourier Transform of the Derivative and Neper Frequency
I'm finished with the Laplace transform for now, though I make no promises of never returning to it again... Let's go ahead and figure out what the Fourier transform of the derivative is. Say \(y\) is our sample function whose Fourier transform is \(Y(\omega)\). Then
$$(\mathcal{F}\dot{y})(\omega) = \int_{-\infty}^{\infty}e^{-j\omega t}\dot{y}(t)\ dt = e^{-j\omega t}y(t)\Big|_{-\infty}^{\infty} + j\omega\int_{-\infty}^{\infty}e^{-j\omega t}y(t)\ dt,$$
so, assuming again that the surface term vanishes,
$$(\mathcal{F}\dot{y})(\omega) = j\omega Y(\omega),$$
or
$$\mathcal{F}\frac{d}{dt} = j\omega.$$
This in particular allows us to easily map linear systems from the \(s\)-domain to the frequency domain by sending \(s\mapsto j\omega\). This is the so-called Joukowski substitution, though it is unclear whether or not Joukowski was the first one to notice this fact.
Although the Joukowski substitution is the most typically used way of mapping into the frequency domain, it should be noted for completeness that it assumes the signal is a pure, non-attenuating oscillation. Consider for instance the signal
$$y(t) = \sin(\omega t)$$
The Joukowski substitution works (in the complexified case) and we have no problem, but what about
$$y(t) = A(t)\sin(\omega t),$$
where \(A\rightarrow 0\) as \(t\rightarrow \infty\)? In this case the Joukowski substitution is not the correct map to the frequency domain. Instead we must introduce the Neper Frequency, \(\sigma\), which keeps track of the signal attenuation, and send \(s\mapsto \sigma+j\omega\) to get into the frequency domain. In practice however, the frequency domain representation of a function is regarded as being consistent with a series of ideal simple harmonic oscillators being used to represent a function and the Neper frequency rarely comes into play when analyzing systems at a purely mathematical level.
Saturday, May 28, 2016
(Late) Weekly Blog 2: Transfer functions and Their Compositions
First: only one week into my project and I have already failed to reach my intended goal as the post slated for last week never happened. I apologize for this but also am undeterred in my commitment to continue this blog!
In order to make up for the post I missed, I'll deliver two by this Sunday, of which this will be the first. The topic of this blog will be a really easy topic, but also something which is totally essential to SISO control: transfer functions. I'll also leave you guys with an open question I've been mulling over from a book on "open problems in control theory".
1. The Laplace Transform
To talk about transfer functions we need to understand a few Laplace transforms. Laplace transforms are a specific case of the more general idea of integral transforms, which are essentially any linear transformation of the form
$$F(s) = \int_{x\in X} k(s,x) f(x)dx,$$
where \(f\) is the input, \(F\) is the output, and \(k(s,x)\) is a function called the kernel of the transformation. The kernel, along with the choice of the set the integral is taken over, are the elements which define the specific transformation. While the theory of general integral transforms is extensive, control theorists are most concerned with either the Laplace or Fourier transforms, and of these two, mostly the Laplace transform. The Laplace transform is given by
$$F(s) = \mathcal{L}f(t) = \int_{0}^{\infty}e^{-st} f(t)dt,$$
and itself has a long and interesting history in the theory of functions, but for our purposes it is simply a way of solving differential equations by turning them into algebraic equations. It's actually easy to see how this happens. Let's suppose \(y(t)\) is a time-domain function whose Laplace transform is denoted \(Y(s)\). We want to find the Laplace transform of \(\dot{y}(t)\). This is
$$\int_{0}^{\infty}e^{-st} \dot{y}(t)dt = e^{-st}y(t)\Big|_0^\infty + s\int_{0}^{\infty}e^{-st} y(t)dt = sY(s),$$
assuming the surface term vanishes (which requires \(y(0) = 0\) and suitable decay as \(t\rightarrow\infty\)). Applying this argument recursively yields the important result
$$\mathcal{L}y^{(n)}(t) = s^nY(s),$$
which we shall use in the next section.
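Here's a small numerical check of this rule (a Python sketch assuming scipy; the sample function \(y(t) = te^{-t}\) is chosen so that \(y(0) = 0\) and the surface term really does vanish):

```python
import numpy as np
from scipy import integrate

def y(t):
    return t * np.exp(-t)           # y(0) = 0, so the surface term vanishes

def ydot(t):
    return (1.0 - t) * np.exp(-t)   # derivative of t e^{-t}

def Y(s):
    return 1.0 / (s + 1.0) ** 2     # known transform of t e^{-t}

for s in [0.5, 1.0, 3.0]:
    lhs, _ = integrate.quad(lambda t: np.exp(-s * t) * ydot(t), 0.0, np.inf)
    print(s, lhs, s * Y(s))         # the columns match: L{dy/dt} = s Y(s)
```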
2. Transfer Functions
Control theory has been said to have emerged from two strains of engineering heritage: electrical engineering and mechanics. Electrical engineering is formulated in terms of input-output relationships for black-box systems. A signal \(u\) is fed into the box and a response \(y\) is output. To the electrical engineer, the objective of feedback control is to change the input signal to achieve the desired output. Mechanics, on the other hand, is formulated in terms of differential equations. To a mechanical engineer, the objective of feedback control is to find a forcing term for the equation which produces the desired solution. The Laplace transform gives us a way to represent the differential equation for a system as an input-output relation--so long as the equation is linear (you can look at Blog 1 to find out how to approximate a nonlinear system by a linear one). Let
$$a_ny^{(n)}+a_{n-1}y^{(n-1)}+\cdots+a_1\dot{y}+a_{0}y = b_mu^{(m)}+b_{m-1}u^{(m-1)}+\cdots+b_1\dot{u}+b_{0}u$$
be our model of the system. Applying the Laplace transformation we have
$$\begin{aligned}\mathcal{L}a_ny^{(n)}+\mathcal{L}a_{n-1}y^{(n-1)}+\cdots+&\mathcal{L}a_1\dot{y}+\mathcal{L}a_{0}y\\&= \mathcal{L}b_mu^{(m)}+\mathcal{L}b_{m-1}u^{(m-1)}+\cdots+\mathcal{L}b_1\dot{u}+\mathcal{L}b_{0}u.\\\end{aligned}$$
Whose LHS is
$$\begin{aligned}a_ns^nY(s) + a_{n-1}s^{n-1}Y(s) +\cdots &+ a_{1}sY(s) + a_{0}Y(s)\\ = &(a_ns^n+a_{n-1}s^{n-1}+\cdots +a_1s +a_0)Y(s),\\\end{aligned}$$
and whose RHS is
$$\begin{aligned}b_ms^mU(s) + b_{m-1}s^{m-1}U(s) +\cdots &+ b_{1}sU(s) + b_{0}U(s)\\ = &(b_ms^m+b_{m-1}s^{m-1}+\cdots +b_1s +b_0)U(s).\\\end{aligned}$$
Putting these together we have
$$\frac{Y(s)}{U(s)} = \frac{b_ms^m+b_{m-1}s^{m-1}+\cdots +b_1s +b_0}{a_ns^n+a_{n-1}s^{n-1}+\cdots +a_1s +a_0}.$$
We typically denote the fraction \(Y(s)/U(s)\) as a single function, something like \(H(s)\), called the transfer function. The transfer function can be used to determine virtually every significant thing about the controller and system, from stability to rise/settling times, overshoots, gain and phase margins, etc. In fact, without using transfer functions there's no way of easily understanding what is known as "classical" control theory.
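To make this concrete, here's a hedged sketch (Python with scipy; the coefficients are made up) that packages an ODE's coefficients into a transfer function and pulls out a step response:

```python
from scipy import signal

# Made-up second-order plant:  2 y'' + 3 y' + 5 y = 4 u' + 1 u
a = [2.0, 3.0, 5.0]    # denominator coefficients a_n, ..., a_0
b = [4.0, 1.0]         # numerator coefficients  b_m, ..., b_0

H = signal.TransferFunction(b, a)   # H(s) = (4s + 1) / (2s^2 + 3s + 5)

print(H.poles)         # roots of the denominator; stability lives here
t, y = signal.step(H)  # response y(t) to a unit step input u
print(y[-1])           # settles near b_0/a_0 = 0.2, the DC gain H(0)
```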
3. An Open Question
So now that I've described transfer functions, I'll leave you with an open question. Suppose we have a transfer function \(G\); find transfer functions \(G_0\) and \(H\) for which
$$ G = G_0\circ H.$$
It has been shown by Fernandez and Martinez-Garcia (G. Fernandez, "Preservation of SPR functions and stabilization by substitutions in SISO plants," IEEE Transactions on Automatic Control, vol. 44, no. 11, pp. 2171-2174, 1999; G. Fernandez and J. Alvarez, "On the preservation of stability in families of polynomials via substitutions," Int. J. of Robust and Nonlinear Control, vol. 10, no. 8, pp. 671-685, 2000) that controlling \(G\) is equivalent to controlling \(G_0\) with the controller \(K(s)\) replaced by \(K(H(s))\). This is one of those interesting problems in classical control that piques my interest. I've been working on it a bit and might be announcing a few results soon ;-)