Continuous Injective Function of Uniform is Uniform Distribution
Continuous random variables
A random variable \(X:S\mapsto{\mathbb R}\) is continuous if its support \(X(S)\) contains an interval of real numbers, or more precisely if its probability law can be described in terms of a nonnegative real function \(f_X\) (the density mass function, more commonly called the probability density function) in such a way that the probability that \(X\) lies in any (Borel) set \(A\subset{\mathbb R}\) can be computed as
\[P(X\in A)=\int_A f_X(x)dx\,.\]
Introduction (histogram, 1000 observations)
    set.seed(1)
    x <- rexp(1000)
    hist(x, probability = TRUE)
Introduction (histogram, 1000000 observations)
    set.seed(1)
    x <- rexp(1000000)
    hist(x, nclass = 250, probability = TRUE, border = "white", col = "blue")
Introduction (histogram and density)
    hist(x, nclass = 250, probability = TRUE, border = "white", col = "blue")
    # overlay the density e^{-x} on the histogram
    t <- seq(0, 15, by = .1)
    points(t, dexp(t), type = "l", col = "blue", lwd = 2)
Introduction (probability of an interval, \(P(2<X<4)\))
    # shade the area under the density between 2 and 4 (drawn on the current plot)
    cord.x <- c(2, seq(2, 4, 0.01), 4)
    cord.y <- c(0, dexp(seq(2, 4, 0.01)), 0)
    polygon(cord.x, cord.y, col = "skyblue")
    points(t, dexp(t), type = "l", col = "blue", lwd = 2)
Density mass function and cdf
Every density mass function \(f:{\mathbb R}\mapsto{\mathbb R}\) satisfies:
1. \(f(x)\geq 0\) for all \(x\in{\mathbb R}\);
2. \(\int_{-\infty}^{+\infty} f(x)dx=1\).
If \(X\) is a continuous r.v. and \(f_X\) its associated density mass function, the probability that \(X\) lies in any (Borel) set \(A\subset{\mathbb R}\) is \[P(X\in A)=\int_A f_X(x)dx\,.\] When \(A=[a,b]\), we have \(P(a\leq X\leq b)=\int_a^b f_X(x)dx\,.\)
Properties of continuous r.v.s
- \(P(X=a)=\int_a^a f_X(x)dx=0\) for any \(a\in{\mathbb R}\);
 - \(P(a\leq X\leq b)=P(a<X\leq b)=P(a\leq X<b)=P(a<X<b)\).
 
Example
\[f_X(x)=\left\{\begin{array}{ll}e^{-x}&\textrm{ if }x\geq 0\\0&\textrm{ if }x<0\end{array}\right.\]
\[P(2<X<4)=\int_2^4 e^{-x}dx=-(e^{-4}-e^{-2})=0.117\,.\]
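The computation above can be checked numerically. The following is a minimal sketch using base R's `integrate`; the helper name `f` is ours and does not come from the notes.

```r
# Density of the example: e^{-x} for x >= 0, 0 otherwise
f <- function(x) ifelse(x >= 0, exp(-x), 0)

# Total mass should be (numerically) 1
total <- integrate(f, lower = 0, upper = Inf)$value

# P(2 < X < 4) as the area under f between 2 and 4
p_24 <- integrate(f, lower = 2, upper = 4)$value

# Closed form obtained in the example
p_exact <- exp(-2) - exp(-4)

c(total = total, numeric = p_24, exact = p_exact)
```

Both values of the probability should agree with the rounded figure 0.117 up to numerical error.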
Cumulative distribution function, cdf
The cumulative distribution function (cdf) of r.v. \(X\) evaluated at \(x\in{\mathbb R}\) is the probability that \(X\) is not greater than \(x\), \[F_X(x)=P(X\leq x)=\int_{-\infty}^x f_X(t)dt.\]
Properties of the cdf of a continuous random variable
- \(\lim\limits_{x\rightarrow-\infty}F(x)=0\);
 -               \(\lim\limits_{x\rightarrow+\infty}F(x)=1\);
 - \(F\) is nondecreasing;
 - \(F\) is continuous.
 
The probability that \(X\) lies in the interval \([a,b]\) is computed in terms of its cdf as \[P(a\leq X\leq b)=F_X(b)-F_X(a)\,.\]
Relationship between density mass function and cdf
- The cdf is a primitive of the density mass function, \(F_X(x)=\int_{-\infty}^x f_X(t)dt\).
 - The density mass function is the derivative of the cdf, \(f_X(x)=F'_X(x)\).
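Both directions of this relationship can be illustrated numerically for the density \(e^{-x}\), \(x\geq 0\), which base R provides as `dexp` (density) and `pexp` (cdf). This is only a sketch; the step `h` and the evaluation points are arbitrary choices of ours.

```r
x <- c(0.5, 1, 2, 4)   # evaluation points (arbitrary)
h <- 1e-6              # finite-difference step (arbitrary)

# A central finite difference of the cdf approximates the density
deriv_cdf <- (pexp(x + h) - pexp(x - h)) / (2 * h)

# Integrating the density up to x recovers the cdf
int_pdf <- sapply(x, function(b) integrate(dexp, lower = 0, upper = b)$value)

cbind(x, deriv_cdf, density = dexp(x), int_pdf, cdf = pexp(x))
```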
 
Example
\[F_X(x)=\left\{\begin{array}{ll}1-e^{-x}&\textrm{ if }x\geq 0\\0&\textrm{ if }x<0\end{array}\right.\]
\[P(2<X<4)=F_X(4)-F_X(2)=(1-e^{-4})-(1-e^{-2})=0.117\]
    t <- seq(-1, 10, by = .1)
    plot(t, pexp(t), type = "l", col = "blue", lwd = 2)
    abline(h = 1, lty = 2)
Mean, variance, and quantiles
Mean or expectation
The mean or expectation of \(X\) is defined as \[{\mathbb E}[X]=\int_{-\infty}^{+\infty }xf_X(x)dx\,.\]
Properties of the mean
For any real numbers \(a,b\in{\mathbb R}\), any function \(g:{\mathbb R}\mapsto{\mathbb R}\), and r.v. \(X\),
- \({\mathbb E}[aX+b]=a{\mathbb E}[X]+b\);
 - \({\mathbb E}[g(X)]=\int_{-\infty}^{+\infty} g(x)f_X(x)dx\);
 - \({\mathbb E}[(X-{\mathbb E}[X])^2]=\min_{x\in{\mathbb R}}{\mathbb E}[(X-x)^2]\).
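These properties can be checked numerically for the density \(e^{-x}\), \(x\geq 0\) (`dexp` in R). The sketch below is ours: the helper names `expect` and `msd` and the constants `a`, `b` are illustrative assumptions, not part of the notes.

```r
# E[g(X)] as the integral of g(x) f_X(x), with f_X(x) = e^{-x} on x >= 0
expect <- function(g) {
  integrate(function(x) g(x) * dexp(x), lower = 0, upper = Inf)$value
}

EX <- expect(function(x) x)              # E[X]
a <- 3; b <- -2                          # arbitrary constants
lhs <- expect(function(x) a * x + b)     # E[aX + b]
rhs <- a * EX + b                        # a E[X] + b

# Mean squared distance to a point, minimised numerically over that point
msd <- function(pt) expect(function(x) (x - pt)^2)
best <- optimize(msd, interval = c(0, 5))$minimum

c(EX = EX, lhs = lhs, rhs = rhs, minimiser = best)
```

The numerical minimiser should land close to \({\mathbb E}[X]\), in line with the last property.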
 
Variance
The variance is a measure of the scatter of the distribution of r.v. \(X\).
It is the expected squared distance of \(X\) to its mean, \[{\rm Var}[X]={\mathbb E}[(X-{\mathbb E}[X])^2]\,.\]
Properties of the variance
- \({\rm Var}[X]\geq 0\);
 - \({\rm Var}[X]={\mathbb E}[X^2]-{\mathbb E}[X]^2\);
 - \({\rm Var}[aX+b]=a^2{\rm Var}[X]\), for any \(a,b\in{\mathbb R}\).
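The sample versions of these identities hold exactly (up to rounding), which gives a quick sanity check. This is a sketch on simulated data; the constants `a` and `b` are arbitrary.

```r
set.seed(1)
x <- rexp(10000)        # sample from the density e^{-x}, x >= 0
a <- -2; b <- 5         # arbitrary constants

# Sample analogue of Var[aX + b] = a^2 Var[X]
var_lin <- var(a * x + b)
var_scaled <- a^2 * var(x)

# Sample analogue of Var[X] = E[X^2] - E[X]^2 (divisor n instead of n - 1)
n <- length(x)
var_pop <- mean(x^2) - mean(x)^2
var_check <- var(x) * (n - 1) / n

c(var_lin = var_lin, var_scaled = var_scaled,
  var_pop = var_pop, var_check = var_check)
```

Note that `var` in R uses the divisor \(n-1\), hence the rescaling in `var_check`.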
 
The standard deviation of \(X\) is the (positive) square root of its variance, \[\sigma_X=\sqrt{{\rm Var}[X]}\,.\]
Example
\(X\) with the previous density.
\({\mathbb E}[X]=\int_{-\infty}^{+\infty} xf_X(x)dx=\int_{0}^{+\infty} xe^{-x}dx=[-xe^{-x}]^{+\infty}_0+\int_0^{+\infty}e^{-x}dx=1.\)
\({\mathbb E}[X^2]=\int_{-\infty}^{+\infty} x^2f_X(x)dx=\int_{0}^{+\infty} x^2e^{-x}dx=2\int_0^{+\infty}xe^{-x}dx=2.\)
\({\rm Var}[X]={\mathbb E}[X^2]-{\mathbb E}[X]^2=1.\)
    set.seed(1)
    x <- rexp(10000)
    mean(x)
    ## [1] 0.9983612
    var(x)
    ## [1] 1.031541
Median
The median is the most central value with respect to the distribution of a random variable \(X\) in the sense that \[F_X({\rm Me}_X)=P(X\leq{\rm Me}_X)=1/2\,.\]
Example
Solving \(F_X({\rm Me}_X)=1/2\) gives \(1-e^{-{\rm Me}_X}=1/2\), so \({\rm Me}_X=-\log(1/2)=\log(2)=0.693.\)
Properties of the median
- \({\rm Me}_{aX+b}=a{\rm Me}_X+b\), for any \(a,b\in{\mathbb R}\);
 - \({\rm Me}_{g(X)}=g({\rm Me}_X)\) if \(g\) is monotone;
 - \({\mathbb E}|X-{\rm Me}_X|=\min_{x\in{\mathbb R}}{\mathbb E}|X-x|\).
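A sample-based illustration of these properties, as a sketch under our own choices: an odd sample size (so that the sample median is one of the observations), the monotone map \(g(x)=\sqrt{x}\), and arbitrary constants `a`, `b`.

```r
set.seed(1)
x <- rexp(10001)          # odd sample size: the sample median is an observation
a <- -2; b <- 5           # arbitrary constants

# Linear map: median(aX + b) versus a * median(X) + b
med_lin <- c(median(a * x + b), a * median(x) + b)

# Monotone map g = sqrt: median(g(X)) versus g(median(X))
med_mono <- c(median(sqrt(x)), sqrt(median(x)))

# Mean absolute distance to a point, minimised numerically over that point
mad_to <- function(pt) mean(abs(x - pt))
best <- optimize(mad_to, interval = c(0, 5))$minimum

rbind(med_lin, med_mono)
c(minimiser = best, sample_median = median(x))
```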
 
Quantiles*
For \(0<\alpha<1\), the \(\alpha\)-quantile of random variable \(X\) is a number \(q_\alpha\) such that \[F_X(q_\alpha)=P(X\leq q_\alpha)=\alpha\,.\]
The quantile function of random variable \(X\) is defined as \[F^{-1}_X(\alpha)=\inf\{x:\,F_X(x)\geq\alpha\}.\] A quantile function defined like this has the following properties:
- \(\lim\limits_{\alpha\downarrow 0}F^{-1}_X(\alpha)=\inf X(S)\);
 - \(\lim\limits_{\alpha\uparrow 1}F^{-1}_X(\alpha)=\sup X(S)\);
 - nondecreasing;
 - left-continuous.
 
Example
\(X\) with the previous density. If \(F_X(x)=1-e^{-x}=y\), then \(x=-\log(1-y)\), so
\[F_X^{-1}(y)=-\log(1-y)\,,\quad 0<y<1\,.\] A random variable with the distribution of \(X\) takes a value greater (or less) than \({\rm Me}_{X}=0.693\) with probability \(1/2\), and a value greater than \(F_X^{-1}(0.25)=-\log(0.75)=0.288\) with probability \(0.75\).
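The closed-form quantile function of this example can be compared with R's built-in `qexp`. A minimal sketch follows; the function name `q_inv` is ours.

```r
# Quantile function of the example, obtained by inverting F(x) = 1 - e^{-x}
q_inv <- function(alpha) -log(1 - alpha)

alpha <- c(0.25, 0.5, 0.75)

# Compare with R's exponential quantile function (rate = 1 by default)
cbind(alpha, closed_form = q_inv(alpha), qexp = qexp(alpha))

# Plugging the quantiles back into the cdf returns alpha
pexp(q_inv(alpha))
```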
    median(x)
    ## [1] 0.6946537
    quantile(x, 0.25)
    ##       25% 
    ## 0.2810167
Uniform distribution
A Uniform random variable in the interval \([a,b]\) represents a number selected at random in that interval, in such a way that the probability that it lies in any subinterval of \([a,b]\) is proportional to the width of the subinterval. \[X\sim{\rm U}(a,b)\] \[f_X(x)=\left\{\begin{array}{cl}\frac{1}{b-a}&\textrm{ if }a\leq x\leq b\\0&\textrm{ otherwise}\end{array}\right..\]
\[F_X(x)=\left\{\begin{array}{cl}0&\textrm{ if }x<a\\\frac{x-a}{b-a}&\textrm{ if }a\leq x\leq b\\1&\textrm{ if }x>b\end{array}\right..\]
\[{\mathbb E}[X]=\frac{a+b}{2}\quad;\quad{\rm Var}[X]=\frac{(b-a)^2}{12}\]
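The two formulas can be checked by numerical integration of the uniform density. This is a sketch; the endpoints `a = 2` and `b = 5` are illustrative values of ours.

```r
a <- 2; b <- 5   # illustrative endpoints, not taken from the notes

# First two moments by numerical integration of the U(a, b) density
m1 <- integrate(function(x) x * dunif(x, min = a, max = b), a, b)$value
m2 <- integrate(function(x) x^2 * dunif(x, min = a, max = b), a, b)$value

c(mean_numeric = m1, mean_formula = (a + b) / 2,
  var_numeric = m2 - m1^2, var_formula = (b - a)^2 / 12)
```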
Uniform density mass function dunif(min=0,max=1)
    x <- seq(-1, 2, by = .01)
    plot(x, dunif(x), type = "l", lwd = 2)
Uniform random observations runif(min=0,max=1)
    set.seed(1)
    y <- runif(1000)
    hist(y)
Uniform cdf punif(min=0,max=1)
    x <- seq(-1, 2, by = .01)
    plot(x, punif(x), type = "l", lwd = 2)
Uniform empirical cumulative distribution function
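A minimal sketch of how the empirical cdf of a uniform sample could be computed with base R's `ecdf` and compared with the true cdf. The sample size and the grid are our own choices.

```r
set.seed(1)
y <- runif(1000)      # sample from U(0, 1)

# ecdf() returns a step function: Fn(t) = proportion of observations <= t
Fn <- ecdf(y)

# Largest gap between the empirical and the true cdf on a grid
grid <- seq(0, 1, by = 0.01)
max_gap <- max(abs(Fn(grid) - punif(grid)))
max_gap
```

Calling `plot(Fn)` draws the step function, and `curve(punif(x), add = TRUE)` overlays the true cdf on it.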
                      
Uniform quantile function qunif(min=0,max=1)
    x <- seq(0, 1, by = .01)
    plot(x, qunif(x), type = "l", lwd = 2)
Transformations of a random variable
If \(X\) is a random variable and \(g:{\mathbb R}\mapsto{\mathbb R}\) a function, then \(Y=g(X)\) is a random variable.
If \(X\) is continuous and \(g\) is continuous and strictly increasing, \[F_{Y}(y)=P(Y\leq y)=P(g(X)\leq y)=P(X\leq g^{-1}(y))=F_X(g^{-1}(y))\,,\] where \(g^{-1}\) is the inverse function of \(g\), that is, \(g^{-1}(y)=x\) if \(g(x)=y\).
In general, if \(g\) is injective (one-to-one) and differentiable, \[f_Y(y)=f_X(x)\left|\frac{dx}{dy}\right|\,,\] where \(x=g^{-1}(y)\).
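To see the change-of-variable formula at work, here is a sketch for a transformation of our own choosing, \(Y=e^X\) with \(X\sim{\rm U}(0,1)\), so that \(x=\log y\), \(dx/dy=1/y\), and the support of \(Y\) is \((1,e)\). All function names are hypothetical helpers.

```r
# X ~ U(0, 1) and Y = g(X) with g(x) = exp(x), increasing and differentiable
g_inv <- function(y) log(y)          # x = g^{-1}(y)
dx_dy <- function(y) 1 / y           # derivative of g^{-1}

# Density of Y from f_Y(y) = f_X(x) |dx/dy|
f_Y <- function(y) dunif(g_inv(y)) * abs(dx_dy(y))

# It should integrate to 1 over the support (1, e)
total <- integrate(f_Y, lower = 1, upper = exp(1))$value

# cdf of Y at y0 two ways: F_X(g^{-1}(y0)) and the integral of f_Y
y0 <- 2
cdf_formula <- punif(g_inv(y0))
cdf_integral <- integrate(f_Y, lower = 1, upper = y0)$value

c(total = total, cdf_formula = cdf_formula, cdf_integral = cdf_integral)
```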
Example
Consider \(X\sim{\rm U}(0,1)\), determine the density mass function of \(Y=-\log(1-X)\).
Clearly the support of \(Y\) is \((0,+\infty)\). For \(y>0\), \[\begin{multline*} F_{Y}(y)=P(Y\leq y)=P(-\log(1-X)\leq y)=P(\log(1-X)\geq -y)\\=P(1-X\geq e^{-y})=P(-X\geq e^{-y}-1)=P(X\leq 1-e^{-y})=1-e^{-y}\,. \end{multline*}\]
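\[F_Y(y)=\left\{\begin{array}{ll}1-e^{-y}&\textrm{ if }y\geq 0\\0&\textrm{ if }y<0\end{array}\right.\,.\]
Differentiating the cdf gives the density mass function, \[f_Y(y)=F'_Y(y)=\left\{\begin{array}{ll}e^{-y}&\textrm{ if }y\geq 0\\0&\textrm{ if }y<0\end{array}\right.\,,\] which is the density used in the previous examples.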
Inverse transform method for simulation
If \(X\sim{\rm U}(0,1)\), then \(F^{-1}(X)\) is a random variable with cdf \(F\).
\[P(F^{-1}(X)\leq x)=P(X\leq F(x))=F_X(F(x))=F(x)\]
Example
Observe that if \(F(x)=1-e^{-x}\) for \(x\geq 0\), then \(F^{-1}(x)=-\log(1-x)\). The cdf of \(Y=-\log(1-X)\) is \(F\) and we can use this to simulate from such a distribution.
    set.seed(1)
    x <- runif(10000)
    hist(-log(1 - x), probability = TRUE)
Exponential distribution exp(rate=1)
If \(X\sim{\mathcal P}(\lambda)\) represents the number of events that occur in a given time period (independently and with a constant rate of \(\lambda\) events per unit of time), then the time between two consecutive events follows an Exponential distribution with parameter \(\lambda\).
\[X_t\equiv\text{'number of events in [0,t]'}\] \[T\equiv\text{'time until first event occurs'}\] \[X_t\sim{\mathcal P}(\lambda t)\] Take \(t>0\), \[F_T(t)=P(T\leq t)=1-P(T>t)=1-P(X_t=0)=1-e^{-\lambda t}\,.\]
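The link between the Poisson count and the waiting time can be checked with `dpois` and `pexp`. This is a sketch; the rate `lambda = 2` and the time points are illustrative values of ours.

```r
lambda <- 2                  # illustrative rate, not from the notes
t <- c(0.1, 0.5, 1, 3)       # illustrative time points

# P(T > t) = P(no events in [0, t]) = P(X_t = 0), with X_t ~ Poisson(lambda * t)
p_no_event <- dpois(0, lambda * t)

# Resulting cdf of T, next to the closed form and R's exponential cdf
cbind(t,
      from_poisson = 1 - p_no_event,
      closed_form = 1 - exp(-lambda * t),
      pexp = pexp(t, rate = lambda))
```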
\(T\sim{\rm Exp}(\lambda)\)
- cdf \[F_T(t)=\left\{\begin{array}{ll}1-e^{-\lambda t}&\textrm{ if }t\geq 0\\0&\textrm{ if }t<0\end{array}\right.\]
 - density \[f_T(t)=\left\{\begin{array}{ll}\lambda e^{-\lambda t}&\textrm{ if }t\geq 0\\0&\textrm{ if }t<0\end{array}\right.\]
 
\[{\mathbb E}[T]=\lambda^{-1}\quad;\quad{\rm Var}[T]=\lambda^{-2}\]
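A numerical check of the mean and variance, sketched with `integrate` for an illustrative rate `lambda = 2` of our choosing.

```r
lambda <- 2      # illustrative rate

# First two moments of Exp(lambda) by numerical integration
m1 <- integrate(function(t) t * dexp(t, rate = lambda), 0, Inf)$value
m2 <- integrate(function(t) t^2 * dexp(t, rate = lambda), 0, Inf)$value

c(mean = m1, mean_formula = 1 / lambda,
  var = m2 - m1^2, var_formula = 1 / lambda^2)
```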
Lack of memory property of the exponential distribution
If \(T\sim{\rm Exp}(\lambda)\) and \(t_1,t_2>0\), then
\[P(T>t_1+t_2|T>t_1)=P(T>t_2)\,.\]
Proof: \[\begin{multline*} P(T>t_1+t_2|T>t_1)=\frac{P((T>t_1+t_2)\cap(T>t_1))}{P(T>t_1)}=\frac{P(T>t_1+t_2)}{P(T>t_1)}\\=\frac{1-F_T(t_1+t_2)}{1-F_T(t_1)} =\frac{e^{-\lambda(t_1+t_2)}}{e^{-\lambda t_1}}=e^{-\lambda t_2}=P(T>t_2)\,. \end{multline*}\]
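The lack of memory property can also be verified numerically. In this sketch the values of `lambda`, `t1` and `t2` are arbitrary, and `surv` is a helper of ours for \(P(T>t)\).

```r
lambda <- 0.5; t1 <- 2; t2 <- 3     # illustrative values

# Survival function P(T > t) of Exp(lambda)
surv <- function(t) pexp(t, rate = lambda, lower.tail = FALSE)

cond <- surv(t1 + t2) / surv(t1)     # P(T > t1 + t2 | T > t1)
uncond <- surv(t2)                   # P(T > t2)

c(conditional = cond, unconditional = uncond)
```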
Normal distribution
Random variable \(X\) follows a normal distribution with mean \(\mu\) and standard deviation \(\sigma\), \(X\sim{\rm N}(\mu,\sigma)\) if its density mass function is \[f_X(x)=\frac{1}{\sigma\sqrt{2\pi}}e^{-\frac{(x-\mu)^2}{2\sigma^2}}\,,\quad x\in{\mathbb R}\,.\]
    dnorm(x, mean = mu, sd = sigma)
We refer to \(Z\sim{\rm N}(0,1)\) as a standard normal random variable, \[f_Z(x)=\phi(x)=\frac{1}{\sqrt{2\pi}}e^{-x^2/2}\,,\quad x\in{\mathbb R}\,.\]
Normal density (location shift) dnorm(mean=0,sd=1)
    x <- seq(-3.5, 5.5, by = .01)
    plot(x, dnorm(x), type = "l", lwd = 2)
    points(x, dnorm(x, mean = 1, sd = 1), type = "l", lwd = 2, col = "red")
Normal density (scale shift) dnorm(mean=0,sd=1)
    x <- seq(-6, 6, by = .01)
    plot(x, dnorm(x), type = "l", lwd = 2)
    points(x, dnorm(x, mean = 0, sd = 2), type = "l", lwd = 2, col = "red")
Normal cdf pnorm(mean=0,sd=1)
There is no closed-form analytic expression for the cdf of a normal r.v.; it is evaluated numerically.
If \(Z\sim{\rm N}(0,1)\), \(F_Z(x)=P(Z\leq x)=\int_{-\infty}^x \phi(t)dt=\Phi(x)\).
    x <- seq(-3.5, 3.5, by = .01)
    plot(x, pnorm(x), type = "l", lwd = 2)
    abline(h = c(0.025, 0.5, 0.975), v = c(-1.96, 0, 1.96))
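Since \(\Phi\) has to be evaluated numerically, one way to see what `pnorm` computes is to integrate the density directly. The sketch below uses `integrate`; the helper `Phi_num` is ours and is only meant as an illustration, not as a replacement for `pnorm`.

```r
# Phi(x) as the integral of the standard normal density up to x
Phi_num <- function(x) integrate(dnorm, lower = -Inf, upper = x)$value

z <- c(-1.96, 0, 1.96)
phi_vals <- sapply(z, Phi_num)

cbind(z, numeric = phi_vals, pnorm = pnorm(z))
```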
A linear transformation of a normal random variable is normal
If \(X\sim{\rm N}(\mu,\sigma)\) and \(a,b\in{\mathbb R}\), \[aX+b\sim{\rm N}(a\mu+b,|a|\sigma)\,.\]
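A quick check of this result with `pnorm`, sketched for illustrative values of our choosing. A negative coefficient `a` is used on purpose, to show why the standard deviation involves \(|a|\).

```r
mu <- 2; sigma <- 3      # illustrative parameters of X
a <- -2; b <- 1          # illustrative coefficients; a < 0 exercises |a|
y <- 0

# P(aX + b <= y) computed directly on X (a < 0 flips the inequality)
p_direct <- pnorm((y - b) / a, mean = mu, sd = sigma, lower.tail = FALSE)

# Same probability from aX + b ~ N(a * mu + b, |a| * sigma)
p_formula <- pnorm(y, mean = a * mu + b, sd = abs(a) * sigma)

c(direct = p_direct, formula = p_formula)
```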
Standardization
Among all linear transformations of a normal r.v., the most relevant is the standardization: if \(X\sim{\rm N}(\mu,\sigma)\), \[\frac{X-\mu}{\sigma}\sim{\rm N}(0,1)\,.\]
Examples
If \(X\sim{\rm N}(\mu=2,\sigma=3)\), compute:
- \(P(X\leq 4)\)
    pnorm(4, mean = 2, sd = 3)
    ## [1] 0.7475075
- \(P(X\leq 4)=P((X-2)/3\leq (4-2)/3)=\Phi(2/3)\)
    pnorm(2/3)
    ## [1] 0.7475075
Normal approximation to the Binomial distribution (DeMoivre-Laplace limit theorem)
For \(0<p<1\) and \(r\in\{0,1,2,\ldots,n\}\) with \((r-np)/\sqrt{np(1-p)}\) bounded as \(n\) grows, \[\frac{\sqrt{2\pi np(1-p)}{n\choose r}p^r(1-p)^{n-r}}{e^{-(r-np)^2/(2np(1-p))}}\stackrel{n\rightarrow+\infty}{\longrightarrow} 1\]
Consequence:
If \(X\sim{\rm B}(n,p)\), then for any \(a<b\), we have \[P\left(a\leq \frac{X-np}{\sqrt{np(1-p)}}\leq b\right)\stackrel{n\rightarrow+\infty}{\longrightarrow}\Phi(b)-\Phi(a)\] The approximation is good for values of \(n\) satisfying \(np(1-p)\geq 10\).
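The ratio in the limit theorem can be evaluated for increasing \(n\). This sketch takes \(p=0.5\) and \(r=np\) (the centre of the distribution), which are our own illustrative choices.

```r
p <- 0.5
n <- c(10, 100, 1000, 10000)
r <- n * p                                   # evaluate at the centre r = np

# Ratio from the limit theorem; it should approach 1 as n grows
npq <- n * p * (1 - p)
ratio <- sqrt(2 * pi * npq) * dbinom(r, size = n, prob = p) /
  exp(-(r - n * p)^2 / (2 * npq))

cbind(n, ratio)
```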
Example
\(X\sim{\rm B}(n=40,p=0.5)\)
- \(P(X=20)\)
    dbinom(20, size = 40, prob = 0.5)
    ## [1] 0.1253707
- \(P(X=20)=P(19.5\leq X\leq 20.5)\), approximated with \({\rm N}(20,\sqrt{10})\)
    pnorm(20.5, mean = 20, sd = sqrt(10)) - pnorm(19.5, mean = 20, sd = sqrt(10))
    ## [1] 0.1256329
    dnorm(20, mean = 20, sd = sqrt(10))
    ## [1] 0.1261566
    set.seed(2)
    x <- rbinom(10000, size = 40, prob = .5)
    hist(x, breaks = seq(-0.5, 40.5, 1), probability = TRUE)
    t <- seq(0, 40, by = .01)
    points(t, dnorm(t, mean = 20, sd = sqrt(10)), type = "l")
Continuous distributions in R

| Distributions | R command |
|---|---|
| Uniform, \({\rm U}(a,b)\) | unif(min=0,max=1) |
| Exponential, \({\rm Exp}(\lambda)\) | exp(rate=1) |
| Normal, \({\rm N}(\mu,\sigma)\) | norm(mean=0,sd=1) |
| Gamma, \({\rm Gamma}(k,\lambda)\) | gamma(shape,rate=1) |
| Beta, \({\rm Beta}(\alpha,\beta)\) | beta(shape1,shape2) |
| Chi-square, \(\chi^2_n\) | chisq(df) |
| Student's \(t\), \(t_n\) | t(df) |
| Fisher's \(F\), \(F_{n_1,n_2}\) | f(df1,df2) |

| Functions | R prefix |
|---|---|
| density | d |
| cdf | p |
| quantile function | q |
| random numbers | r |
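The function names are obtained by gluing a prefix to a command from the first table. As a sketch, here are the four functions for a Gamma distribution with illustrative parameters `shape = 2`, `rate = 1` (our choice):

```r
# Prefix + command, illustrated with Gamma(shape = 2, rate = 1)
d <- dgamma(1.5, shape = 2, rate = 1)       # density at 1.5
p <- pgamma(1.5, shape = 2, rate = 1)       # cdf at 1.5, P(X <= 1.5)
q <- qgamma(p, shape = 2, rate = 1)         # quantile function undoes the cdf
set.seed(1)
x <- rgamma(5, shape = 2, rate = 1)         # five random observations

c(density = d, cdf = p, quantile = q)
```

Here `q` should return the starting point 1.5, since the quantile function inverts the cdf.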
Source: http://www.est.uc3m.es/icascos/eng/probability_notes/continuous-random-variables.html