An Introduction to the Exponential Distribution in Probability Theory and Using std::exponential_distribution in C++11

Source: Internet. Editor: 程序博客网. Time: 2024/05/16 09:04

Exponential distribution: in deep learning, we often want a probability distribution with a sharp point at x = 0. To accomplish this, we can use the exponential distribution:

$$p(x; \lambda) = \lambda \, \mathbf{1}_{x \ge 0} \exp(-\lambda x)$$

The exponential distribution uses the indicator function $\mathbf{1}_{x \ge 0}$ to assign probability zero to all negative values of x.

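Exponential distribution: in probability theory and statistics, the exponential distribution is a continuous probability distribution. It can be used to model the time between independent random events, such as the interval between passengers arriving at an airport, between calls reaching a customer service center, or between new articles appearing on Chinese Wikipedia.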

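The probability density function of an exponential distribution is:

$$f(x; \lambda) = \begin{cases} \lambda e^{-\lambda x}, & x \ge 0 \\ 0, & x < 0 \end{cases}$$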

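Here λ > 0 is the parameter of the distribution, often called the rate parameter: the number of times the event occurs per unit time. The support of the exponential distribution is [0, ∞). If a random variable X follows an exponential distribution, we write X ~ Exponential(λ).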
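The density can be written down directly in code. The following is a minimal sketch; the function name exponential_pdf is an illustrative choice, not part of the standard library, and the branch on x < 0 plays the role of the indicator function.

```cpp
#include <cassert>
#include <cmath>

// Density of Exponential(lambda) at x:
// lambda * exp(-lambda * x) for x >= 0, and 0 for x < 0.
// exponential_pdf is a hypothetical helper name used for illustration.
double exponential_pdf(double x, double lambda) {
    assert(lambda > 0.0);       // the rate parameter must be positive
    if (x < 0.0) return 0.0;    // indicator 1_{x >= 0}
    return lambda * std::exp(-lambda * x);
}
```

Note that the density at x = 0 equals λ itself, so it can exceed 1; a density is not a probability.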

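The cumulative distribution function of the exponential distribution can be written as:

$$F(x; \lambda) = \begin{cases} 1 - e^{-\lambda x}, & x \ge 0 \\ 0, & x < 0 \end{cases}$$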
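As a rough sketch, the cumulative distribution function 1 − exp(−λx) for x ≥ 0 gives the probability that the waiting time is at most x, and a difference of two CDF values gives the probability of falling in an interval. The helper names below are assumptions made for illustration.

```cpp
#include <cassert>
#include <cmath>

// CDF of Exponential(lambda): 1 - exp(-lambda * x) for x >= 0, else 0.
// std::expm1 computes exp(t) - 1 accurately for small |t|.
double exponential_cdf(double x, double lambda) {
    assert(lambda > 0.0);
    if (x < 0.0) return 0.0;
    return -std::expm1(-lambda * x);
}

// P(a < X <= b) for X ~ Exponential(lambda).
double exponential_interval_prob(double a, double b, double lambda) {
    assert(a <= b);
    return exponential_cdf(b, lambda) - exponential_cdf(a, lambda);
}
```

For example, with λ = 2 the median is ln(2)/2, because the CDF reaches 0.5 at that point.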

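The expected value of a random variable X with rate parameter λ is E[X] = 1/λ. For example, if you receive 2 phone calls per hour on average, the expected wait for each call is half an hour. The variance of X is D[X] = 1/λ², and its coefficient of variation is V[X] = 1.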
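These moments can be checked empirically by sampling. The sketch below assumes a fixed-seed std::mt19937 engine; the struct SampleMoments and the function sample_exponential_moments are hypothetical names introduced only for this illustration.

```cpp
#include <cassert>
#include <cmath>
#include <random>

// Sample mean and unbiased sample variance (illustrative helper type).
struct SampleMoments {
    double mean;
    double variance;
};

// Draw n values from Exponential(lambda) and estimate mean and variance.
SampleMoments sample_exponential_moments(double lambda, int n, unsigned seed) {
    assert(lambda > 0.0 && n > 1);
    std::mt19937 gen(seed);
    std::exponential_distribution<double> dist(lambda);
    double sum = 0.0;
    double sum_sq = 0.0;
    for (int i = 0; i < n; ++i) {
        const double x = dist(gen);
        sum += x;
        sum_sq += x * x;
    }
    const double mean = sum / n;
    const double variance = (sum_sq - n * mean * mean) / (n - 1);
    return {mean, variance};
}
```

With λ = 2 and a few hundred thousand samples, the estimates should land near 1/λ = 0.5 and 1/λ² = 0.25, so the ratio of standard deviation to mean should be close to 1.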

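The following briefly introduces the Laplace distribution, the Dirac distribution, the empirical distribution, and mixture distributions: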

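Laplace distribution: a closely related probability distribution is the Laplace distribution, which allows us to place a sharp peak of probability mass at an arbitrary point μ:

$$\text{Laplace}(x; \mu, \gamma) = \frac{1}{2\gamma} \exp\left(-\frac{|x - \mu|}{\gamma}\right)$$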

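Laplace distribution: in probability theory and statistics, the Laplace distribution is a continuous probability distribution named after Pierre-Simon Laplace. Because it can be viewed as two exponential distributions (with an additional location parameter) spliced together back to back, it is also called the double exponential distribution. The difference between two independent, identically distributed exponential random variables follows a Laplace distribution, as does a Brownian motion evaluated at an exponentially distributed random time.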
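The statement about the difference of two exponential variables suggests a simple way to draw Laplace samples using only the standard library. This is a sketch under the assumption that both variables share the same rate λ, which yields a Laplace distribution centered at 0 with scale 1/λ; the function name laplace_by_difference is made up for this example.

```cpp
#include <cassert>
#include <random>
#include <vector>

// Draw n samples of (A - B) with A, B independent Exponential(lambda).
// The result follows a Laplace distribution with location 0 and scale 1/lambda.
std::vector<double> laplace_by_difference(double lambda, int n, unsigned seed) {
    assert(lambda > 0.0 && n > 0);
    std::mt19937 gen(seed);
    std::exponential_distribution<double> expo(lambda);
    std::vector<double> out;
    out.reserve(n);
    for (int i = 0; i < n; ++i) {
        const double a = expo(gen);  // separate statements fix the draw order
        const double b = expo(gen);
        out.push_back(a - b);
    }
    return out;
}
```

The sample mean should be near 0 and the mean absolute value near 1/λ, the scale of the resulting Laplace distribution.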

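If the probability density function of a random variable is:

$$f(x \mid \mu, b) = \frac{1}{2b} \exp\left(-\frac{|x - \mu|}{b}\right)$$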

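then it follows a Laplace distribution. Here μ is the location parameter and b > 0 is the scale parameter. If μ = 0 and b = 1, the positive half-line is exactly an exponential distribution scaled by 1/2.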

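The probability density function of the Laplace distribution is reminiscent of the normal distribution. However, the normal density is expressed in terms of the squared difference from the mean μ, whereas the Laplace density is expressed in terms of the absolute difference from the mean. Consequently, the Laplace distribution has heavier tails than the normal distribution.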
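To make the tail comparison concrete, here is a small sketch of both densities. The helper names laplace_pdf and normal_pdf are illustrative; the Laplace density uses the standard form exp(−|x − μ|/b)/(2b), and the normal density is parameterized by its mean and standard deviation.

```cpp
#include <cassert>
#include <cmath>

// Laplace density with location mu and scale b > 0.
double laplace_pdf(double x, double mu, double b) {
    assert(b > 0.0);
    return std::exp(-std::fabs(x - mu) / b) / (2.0 * b);
}

// Normal density with mean mu and standard deviation sigma > 0.
double normal_pdf(double x, double mu, double sigma) {
    assert(sigma > 0.0);
    const double kPi = 3.14159265358979323846;
    const double z = (x - mu) / sigma;
    return std::exp(-0.5 * z * z) / (sigma * std::sqrt(2.0 * kPi));
}
```

A Laplace distribution with b = 1 has variance 2b² = 2. Comparing it with a normal distribution of the same variance (σ = √2) far from the center, for instance at x = 5, the Laplace density is noticeably larger, which is what "heavier tails" means. With μ = 0 and b = 1, the Laplace density at any x ≥ 0 equals one half of the Exponential(1) density exp(−x).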

Boost 1.64.0 provides an implementation of the Laplace distribution in math/distributions/laplace.hpp; the corresponding class is laplace_distribution.

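Dirac distribution: in some cases, we want all the mass in a probability distribution to cluster around a single point. This can be accomplished by defining the probability density function with the Dirac delta function δ(x): p(x) = δ(x − μ)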

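The Dirac delta function is defined to be zero everywhere except at 0, yet it integrates to 1. Unlike an ordinary function, it does not associate a real-valued output with each value of x. It is a different kind of mathematical object called a generalized function, which is defined in terms of its properties when integrated. We can think of the Dirac delta function as the limit point of a series of functions that put less and less density on all points other than 0.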

By defining p(x) to be the δ function shifted by −μ, we obtain an infinitely narrow and infinitely high peak of probability mass at x = μ.
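One way to get a feel for the "limit of a series of functions" view is to replace δ(x − μ) with a narrow Gaussian of standard deviation σ and integrate it against a test function f. The sketch below picks Gaussians as the approximating family and the midpoint rule as the integrator; both are assumptions made for illustration, and gaussian_pdf and smeared_value are hypothetical names.

```cpp
#include <cassert>
#include <cmath>
#include <functional>

// Gaussian density with mean mu and standard deviation sigma > 0.
double gaussian_pdf(double x, double mu, double sigma) {
    assert(sigma > 0.0);
    const double kPi = 3.14159265358979323846;
    const double z = (x - mu) / sigma;
    return std::exp(-0.5 * z * z) / (sigma * std::sqrt(2.0 * kPi));
}

// Midpoint-rule approximation of the integral of f(x) * N(x; mu, sigma^2)
// over [mu - 10 sigma, mu + 10 sigma]. As sigma shrinks, the result
// approaches f(mu), mimicking integration against delta(x - mu).
double smeared_value(const std::function<double(double)>& f,
                     double mu, double sigma, int steps = 20000) {
    assert(steps > 0);
    const double lo = mu - 10.0 * sigma;
    const double width = 20.0 * sigma / steps;
    double total = 0.0;
    for (int i = 0; i < steps; ++i) {
        const double x = lo + (i + 0.5) * width;
        total += f(x) * gaussian_pdf(x, mu, sigma) * width;
    }
    return total;
}
```

For f(x) = cos(x), the exact smeared value is cos(μ)·exp(−σ²/2), so shrinking σ toward 0 drives the result toward cos(μ), which is the value the δ function would pick out.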

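Dirac δ function: in science and mathematics, the Dirac δ function, or simply the δ function, is a generalized function or distribution defined on the real line. It is zero everywhere except at zero, and its integral over the entire domain equals 1. The δ function is sometimes regarded as an infinitely high, infinitely thin spike at the origin with total area 1, and in physics it represents the density of an idealized point mass or point charge.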

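From a purely mathematical viewpoint, the Dirac δ function is not strictly a function, because any function defined on the extended real line that is zero everywhere except at a single point must have total integral zero. The δ function only makes sense as a mathematical object when it appears inside an integral. From this perspective, it can usually be manipulated as though it were an ordinary function. The formal rules it obeys are part of operational calculus, a standard tool in physics and engineering.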

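The graph of the δ function is usually thought of as following the whole x-axis and the positive y-axis. Despite its name, the δ function is not truly a function, at least not an ordinary one with values in the real numbers.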

The Dirac distribution often appears as a component of an empirical distribution:

$$\hat{p}(x) = \frac{1}{m} \sum_{i=1}^{m} \delta(x - x^{(i)})$$

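The empirical distribution puts probability mass 1/m on each of the m points x^(1), …, x^(m) that form a given data set or collection of samples. The Dirac delta function is only necessary to define the empirical distribution over continuous random variables. For discrete random variables, the situation is simpler: the empirical distribution can be defined as a Multinoulli distribution in which the probability of each possible input value is simply the empirical frequency of that value in the training set.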
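For the discrete case, the empirical frequencies are just normalized counts. The following sketch assumes integer-valued observations; the function name empirical_frequencies is hypothetical.

```cpp
#include <cassert>
#include <map>
#include <vector>

// Empirical frequency of each distinct value in the sample:
// count(value) / total number of observations.
std::map<int, double> empirical_frequencies(const std::vector<int>& samples) {
    assert(!samples.empty());
    std::map<int, double> freq;
    for (int value : samples) {
        freq[value] += 1.0;               // count occurrences
    }
    for (auto& entry : freq) {
        entry.second /= samples.size();   // normalize counts to frequencies
    }
    return freq;
}
```

The returned frequencies sum to 1 and can serve as the parameters of the Multinoulli distribution described above.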

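When we train a model on a training set, we can view the empirical distribution formed from that training set as specifying the distribution that we sample from. Another important perspective on the empirical distribution is that it is the probability density that maximizes the likelihood of the training data.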

Empirical distribution function: In statistics, an empirical distribution function is the distribution function associated with the empirical measure of a sample. This cumulative distribution function is a step function that jumps up by 1/n at each of the n data points. Its value at any specified value of the measured variable is the fraction of observations of the measured variable that are less than or equal to the specified value.

The empirical distribution function is an estimate of the cumulative distribution function that generated the points in the sample. It converges with probability 1 to that underlying distribution, according to the Glivenko–Cantelli theorem. A number of results exist to quantify the rate of convergence of the empirical distribution function to the underlying cumulative distribution function.
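A minimal sketch of the empirical distribution function follows: it sorts the sample once and then answers each query by counting the observations less than or equal to the requested value. The class name EmpiricalCdf is an assumption made for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <utility>
#include <vector>

// Step function that jumps by 1/n at each of the n data points.
class EmpiricalCdf {
public:
    explicit EmpiricalCdf(std::vector<double> samples)
        : sorted_(std::move(samples)) {
        assert(!sorted_.empty());
        std::sort(sorted_.begin(), sorted_.end());
    }

    // Fraction of observations that are <= x.
    double operator()(double x) const {
        const auto it = std::upper_bound(sorted_.begin(), sorted_.end(), x);
        return static_cast<double>(it - sorted_.begin()) / sorted_.size();
    }

private:
    std::vector<double> sorted_;
};
```

Using std::upper_bound makes each query logarithmic in the sample size, and repeated data points simply produce a larger jump at that value.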

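Mixtures of distributions: it is also common to define new probability distributions by combining simpler ones. One general way of combining distributions is to construct a mixture distribution, which is made up of several component distributions. On each trial, the choice of which component distribution generates the sample is determined by sampling a component identity from a Multinoulli distribution:

$$P(x) = \sum_i P(c = i) \, P(x \mid c = i)$$

Here P(c) is the Multinoulli distribution over component identities.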
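The two-stage sampling procedure can be sketched with the standard library: std::discrete_distribution plays the role of the Multinoulli distribution over components, and the chosen component then produces the value. Using exponential components here is an assumption made to keep the example short, and sample_exponential_mixture is a hypothetical name.

```cpp
#include <cassert>
#include <random>
#include <vector>

// Draw one value from a mixture of exponential components.
// weights[i] is P(c = i); rates[i] is the rate of component i.
double sample_exponential_mixture(const std::vector<double>& weights,
                                  const std::vector<double>& rates,
                                  std::mt19937& gen) {
    assert(!weights.empty() && weights.size() == rates.size());
    // Step 1: pick the component identity c from the Multinoulli distribution.
    // (Rebuilt on every call for simplicity; reuse it in performance-critical code.)
    std::discrete_distribution<int> pick(weights.begin(), weights.end());
    const int c = pick(gen);
    // Step 2: sample from the chosen component.
    std::exponential_distribution<double> component(rates[c]);
    return component(gen);
}
```

The mean of such a mixture is the weighted sum of the component means, so weights {0.3, 0.7} with rates {1, 4} give 0.3·1 + 0.7·0.25 = 0.475.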

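A very powerful and common type of mixture model is the Gaussian mixture model, in which the components p(x | c = i) are Gaussians. Each component has its own parameters: a mean μ^(i) and a covariance matrix Σ^(i). Some mixtures can have more constraints. As with a single Gaussian distribution, a Gaussian mixture model may constrain the covariance matrix of each component to be diagonal or isotropic (a scalar times the identity matrix).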

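In addition to the means and covariances, the parameters of a Gaussian mixture model specify the prior probability α_i = P(c = i) given to each component i. The word "prior" indicates that it expresses the model's beliefs about c before it has observed x. By comparison, P(c | x) is a posterior probability, because it is computed after observing x. A Gaussian mixture model is a universal approximator of densities, in the sense that any smooth density can be approximated to arbitrary accuracy by a Gaussian mixture model with enough components.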
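The relationship between the prior α_i and the posterior P(c | x) follows from Bayes' rule: P(c = i | x) = α_i p(x | c = i) / Σ_j α_j p(x | c = j). The sketch below restricts itself to one-dimensional components so that the covariance reduces to a standard deviation; that simplification and all names in the code are assumptions for illustration.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// One component of a 1-D Gaussian mixture (illustrative type).
struct GaussianComponent {
    double alpha;   // prior probability P(c = i)
    double mean;
    double stddev;  // must be positive
};

double normal_pdf(double x, double mean, double stddev) {
    assert(stddev > 0.0);
    const double kPi = 3.14159265358979323846;
    const double z = (x - mean) / stddev;
    return std::exp(-0.5 * z * z) / (stddev * std::sqrt(2.0 * kPi));
}

// Mixture density p(x) = sum_i alpha_i * N(x; mean_i, stddev_i^2).
double gmm_pdf(const std::vector<GaussianComponent>& gmm, double x) {
    double p = 0.0;
    for (const auto& c : gmm) {
        p += c.alpha * normal_pdf(x, c.mean, c.stddev);
    }
    return p;
}

// Posterior P(c = i | x) for every component, via Bayes' rule.
std::vector<double> gmm_posterior(const std::vector<GaussianComponent>& gmm, double x) {
    const double px = gmm_pdf(gmm, x);
    assert(px > 0.0);
    std::vector<double> post;
    post.reserve(gmm.size());
    for (const auto& c : gmm) {
        post.push_back(c.alpha * normal_pdf(x, c.mean, c.stddev) / px);
    }
    return post;
}
```

For two equally weighted components centered at −1 and +1 with equal spread, the posterior at x = 0 is split evenly, while an observation at x = 1 shifts belief toward the second component.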

Mixture distribution: In probability and statistics, a mixture distribution is the probability distribution of a random variable that is derived from a collection of other random variables as follows: first, a random variable is selected by chance from the collection according to given probabilities of selection, and then the value of the selected random variable is realized. The underlying random variables may be random real numbers, or they may be random vectors (each having the same dimension), in which case the mixture distribution is a multivariate distribution.

The material above is excerpted from the Chinese edition of Deep Learning (《深度学习中文版》) and from Wikipedia.

The following introduces the exponential distribution class template std::exponential_distribution in C++11:

C++11 introduced the exponential distribution class template std::exponential_distribution. The exponential distribution is a continuous probability distribution; the class produces random non-negative floating-point values and can be used to model the time between independent random events.

std::exponential_distribution: Random number distribution that produces floating-point values according to an exponential distribution, which is described by the following probability density function: $p(x \mid \lambda) = \lambda e^{-\lambda x},\ x > 0$

This distribution produces random numbers where each value represents the interval between two random events that are independent but statistically defined by a constant average rate of occurrence (its lambda, λ). The distribution parameter, lambda, is set on construction. To produce a random value following this distribution, call its member function operator(). Its analogous discrete distribution is the geometric_distribution.

The test code for std::exponential_distribution below was copied from other articles; see the corresponding references for details:

    #include "exponential_distribution.hpp"
    #include <iostream>
    #include <random>
    #include <string>
    #include <chrono>
    #include <thread>
    #include <iomanip>
    #include <map>

    ///////////////////////////////////////////////////////////
    // reference: http://www.cplusplus.com/reference/random/exponential_distribution/
    int test_exponential_distribution_1()
    {
    {
        const int nrolls = 10000;  // number of experiments
        const int nstars = 100;    // maximum number of stars to distribute
        const int nintervals = 10; // number of intervals

        std::default_random_engine generator;
        std::exponential_distribution<double> distribution(3.5);

        int p[nintervals] = {};

        for (int i = 0; i < nrolls; ++i) {
            double number = distribution(generator);
            if (number < 1.0) ++p[int(nintervals * number)];
        }

        std::cout << "exponential_distribution (3.5):" << std::endl;
        std::cout << std::fixed; std::cout.precision(1);

        for (int i = 0; i < nintervals; ++i) {
            std::cout << float(i) / nintervals << "-" << float(i + 1) / nintervals << ": ";
            std::cout << std::string(p[i] * nstars / nrolls, '*') << std::endl;
        }
    }

    { // (1) exponential_distribution::exponential_distribution: Construct exponential distribution,
      //   Constructs an exponential_distribution object, adopting the distribution parameters specified either by lambda or by object parm.
      // (2) exponential_distribution::lambda: Returns the parameter lambda (λ) associated with the exponential_distribution.
      //   This parameter represents the number of times the random events are observed by interval, on average.
      //   This parameter is set on construction.
      // (3) exponential_distribution::max: Maximum value
      //   Returns the least upper bound of the range of values potentially returned by member operator().
      // (4) exponential_distribution::min: Minimum value
      //   Returns the greatest lower bound of the range of values potentially returned by member operator(),
      //   which for exponential_distribution is always zero.
      // (5) exponential_distribution::operator(): Generate random number
      //   Returns a new random number that follows the distribution's parameters associated to the object (version 1)
      //   or those specified by parm (version 2).

        // construct a trivial random generator engine from a time-based seed:
        // (the cast makes the narrowing from the clock's tick count explicit)
        unsigned seed = static_cast<unsigned>(std::chrono::system_clock::now().time_since_epoch().count());
        std::default_random_engine generator(seed);

        std::exponential_distribution<double> distribution(1.0);

        std::cout << "ten beeps, spread by 1 second, on average: " << std::endl;
        for (int i = 0; i < 10; ++i) {
            double number = distribution(generator);
            std::chrono::duration<double> period(number);
            // sleeps a fixed second; pass `period` instead to wait the sampled interval
            std::this_thread::sleep_for(std::chrono::seconds(1)/*period*/);
            std::cout << "beep!" << std::endl;
        }

        std::cout << "lambda: " << distribution.lambda() << std::endl;
        std::cout << "max: " << distribution.max() << std::endl;
        std::cout << "min: " << distribution.min() << std::endl;
    }

    { // exponential_distribution::param: Distribution parameters
      //   The first version(1) returns an object with the parameters currently associated with the distribution object.
      //   The second version(2) associates the parameters in object parm to the distribution object.
        std::default_random_engine generator;
        std::exponential_distribution<double> d1(0.8);
        std::exponential_distribution<double> d2(d1.param());

        // print two independent values:
        std::cout << d1(generator) << std::endl;
        std::cout << d2(generator) << std::endl;
    }

    { // exponential_distribution::reset: Resets the distribution, so that subsequent uses of the object do not depend on values already produced by it.
      //   This function may have no effect if the library implementation for this distribution class produces independent values.
        std::default_random_engine generator;
        std::exponential_distribution<double> distribution(1.0);

        // print two independent values:
        std::cout << distribution(generator) << std::endl;
        distribution.reset();
        std::cout << distribution(generator) << std::endl;
    }

        return 0;
    }

    //////////////////////////////////////////////////////////////////
    // reference: http://en.cppreference.com/w/cpp/numeric/random/exponential_distribution
    int test_exponential_distribution_2()
    {
        std::random_device rd;
        std::mt19937 gen(rd());

        // if particles decay once per second on average,
        // how much time, in seconds, until the next one?
        std::exponential_distribution<> d(1);

        std::map<int, int> hist;
        for (int n = 0; n < 10000; ++n) {
            // bucket the samples into half-second bins
            ++hist[static_cast<int>(2 * d(gen))];
        }
        for (const auto& p : hist) {
            std::cout << std::fixed << std::setprecision(1)
                << p.first / 2.0 << '-' << (p.first + 1) / 2.0 <<
                ' ' << std::string(p.second / 200, '*') << '\n';
        }

        return 0;
    }

    //////////////////////////////////////////////////////////////
    // reference: https://msdn.microsoft.com/en-us/library/bb982914.aspx
    static void test(const double l, const int s)
    {
        // uncomment to use a non-deterministic generator
        //    std::random_device gen;
        std::mt19937 gen(1701);

        std::exponential_distribution<> distr(l);

        std::cout << std::endl;
        std::cout << "min() == " << distr.min() << std::endl;
        std::cout << "max() == " << distr.max() << std::endl;
        std::cout << "lambda() == " << std::fixed << std::setw(11) << std::setprecision(10) << distr.lambda() << std::endl;

        // generate the distribution as a histogram
        std::map<double, int> histogram;
        for (int i = 0; i < s; ++i) {
            ++histogram[distr(gen)];
        }

        // print results
        std::cout << "Distribution for " << s << " samples:" << std::endl;
        int counter = 0;
        for (const auto& elem : histogram) {
            std::cout << std::fixed << std::setw(11) << ++counter << ": "
                << std::setw(14) << std::setprecision(10) << elem.first << std::endl;
        }
        std::cout << std::endl;
    }

    int test_exponential_distribution_3()
    {
        double l_dist = 0.5;
        int samples = 10;

        std::cout << "Use CTRL-Z to bypass data entry and run using default values." << std::endl;
        std::cout << "Enter a floating point value for the 'lambda' distribution parameter (must be greater than zero): ";
        //std::cin >> l_dist;
        std::cout << "Enter an integer value for the sample count: ";
        //std::cin >> samples;

        test(l_dist, samples);

        return 0;
    }

GitHub：https://github.com/fengbingchun/Messy_Test
