Bayesian and frequentist reasoning in plain English
Source: Internet. Time: 2024/06/02 06:55
Here is how I would explain the basic difference to my grandma:
I have misplaced my phone somewhere in the home. I can use the phone locator on the base of the instrument to locate the phone and when I press the phone locator the phone starts beeping.
Problem: Which area of my home should I search?
Frequentist Reasoning:
I can hear the phone beeping. I also have a mental model which helps me identify the area from which the sound is coming. Therefore, upon hearing the beep, I infer the area of my home I must search to locate the phone.
Bayesian Reasoning:
I can hear the phone beeping. Now, apart from a mental model which helps me identify the area from which the sound is coming, I also know the locations where I have misplaced the phone in the past. So, I combine my inferences from the beeps with my prior information about where I have misplaced the phone in the past to identify an area I must search to locate the phone.
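The two ways of reasoning above can be sketched as a small calculation. This is only an illustration: the room names, the `likelihood` values (how well the beep matches each room) and the `prior` values (how often the phone was misplaced there before) are invented numbers, not something taken from the answer.

```python
# Hypothetical rooms and numbers, chosen only for illustration.
rooms = ["kitchen", "bedroom", "living room"]

# Mental model of the sound: P(beep sounds like this | phone is in the room).
likelihood = {"kitchen": 0.5, "bedroom": 0.3, "living room": 0.2}

# Past experience: how often the phone was misplaced in each room.
prior = {"kitchen": 0.1, "bedroom": 0.2, "living room": 0.7}

# Search guided by the sound model alone.
best_by_sound = max(rooms, key=lambda r: likelihood[r])

# Bayesian search: weight the sound model by the prior, then normalise.
unnormalised = {r: likelihood[r] * prior[r] for r in rooms}
total = sum(unnormalised.values())
posterior = {r: unnormalised[r] / total for r in rooms}
best_by_posterior = max(rooms, key=lambda r: posterior[r])

print(best_by_sound)       # kitchen
print(best_by_posterior)   # living room
```

With these made-up numbers the beep alone points to the kitchen, but the history of losing the phone in the living room is strong enough to change where the search starts.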
Tongue firmly in cheek:
A Bayesian defines a "probability" in exactly the same way that most non-statisticians do - namely an indication of the plausibility of a proposition or a situation. If you ask him a question, he will give you a direct answer assigning probabilities describing the plausibilities of the possible outcomes for the particular situation (and state his prior assumptions).
A Frequentist is someone that believes probabilities represent long run frequencies with which events occur; if needs be, he will invent a fictitious population from which your particular situation could be considered a random sample so that he can meaningfully talk about long run frequencies. If you ask him a question about a particular situation, he will not give a direct answer, but instead make a statement about this (possibly imaginary) population. Many non-frequentist statisticians will be easily confused by the answer and interpret it as Bayesian probability about the particular situation.
However, it is important to note that most Frequentist methods have a Bayesian equivalent that in most circumstances will give essentially the same result; the difference is largely a matter of philosophy, and in practice it is a matter of "horses for courses".
As you may have guessed, I am a Bayesian and an engineer. ;o)
Very crudely I would say that:
Frequentist: Sampling is infinite and decision rules can be sharp. Data are a repeatable random sample - there is a frequency. Underlying parameters are fixed i.e. they remain constant during this repeatable sampling process.
Bayesian: Unknown quantities are treated probabilistically and the state of the world can always be updated. Data are observed from the realised sample. Parameters are unknown and described probabilistically. It is the data which are fixed.
There is a brilliant blog post which gives an in-depth example of how a Bayesian and a Frequentist would tackle the same problem. Why not answer the problem for yourself and then check?
The problem (taken from Panos Ipeirotis' blog):
You have a coin that when flipped ends up head with probability p and ends up tail with probability 1-p. (The value of p is unknown.)
Trying to estimate p, you flip the coin 100 times. It ends up head 71 times.
Then you have to decide on the following event: "In the next two tosses we will get two heads in a row."
Would you bet that the event will happen or that it will not happen?
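For readers who want to try the numbers, here is one possible way to set up the calculation; it is a sketch and not a reproduction of the blog post's own solution. The first estimate simply plugs the observed frequency in for p. The second assumes a uniform Beta(1, 1) prior on p, which is my assumption and not part of the problem statement.

```python
from fractions import Fraction

heads, flips = 71, 100

# Plug-in estimate: treat the observed frequency as p and square it.
p_hat = Fraction(heads, flips)
plug_in = p_hat ** 2

# Bayesian estimate under an ASSUMED uniform Beta(1, 1) prior on p.
# The posterior is Beta(a, b) with a = heads + 1 and b = tails + 1, and
# the chance of two heads in a row is E[p^2] = a(a+1) / ((a+b)(a+b+1)).
a = heads + 1
b = flips - heads + 1
posterior_predictive = Fraction(a * (a + 1), (a + b) * (a + b + 1))

print(float(plug_in))                # 0.5041
print(float(posterior_predictive))   # roughly 0.5003
```

Under these assumptions both numbers sit just above one half, so both calculations narrowly favour betting that the event happens; a different prior could move the Bayesian figure to the other side.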
Just a little bit of fun...
A Bayesian is one who, vaguely expecting a horse, and catching a glimpse of a donkey, strongly believes he has seen a mule.
From this site:
http://www2.isye.gatech.edu/~brani/isyebayes/jokes.html
and from the same site, a nice essay...
"An Intuitive Explanation of Bayes' Theorem"
http://yudkowsky.net/rational/bayes
Let us say a man rolls a six sided die and it has outcomes 1, 2, 3, 4, 5, or 6. Furthermore, he says that if it lands on a 3, he'll give you a free text book.
Then informally:
The Frequentist would say that each outcome has an equal 1 in 6 chance of occurring. She views probability as being derived from long run frequency distributions.
The Bayesian, however, would say: hang on a second, I know that man, he's David Blaine, a famous trickster! I have a feeling he's up to something. I'm going to say that there's only a 1% chance of it landing on a 3, BUT I'll re-evaluate that belief and change it the more times he rolls the die. If I see the other numbers come up equally often, then I'll iteratively increase the chance from 1% to something slightly higher; otherwise I'll reduce it even further. She views probability as degrees of belief in a proposition.
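The Bayesian's "re-evaluate as he rolls" can be sketched with a Beta prior on the chance of a 3. The function name `updated_belief` and the prior pseudo-counts are hypothetical; they were picked only so that the starting belief is the 1% from the story.

```python
def updated_belief(threes, rolls, prior_a=0.1, prior_b=9.9):
    """Posterior mean of P(die shows 3) after `rolls` rolls with `threes` threes.

    The prior Beta(prior_a, prior_b) has mean prior_a / (prior_a + prior_b);
    the defaults are hypothetical and encode the sceptical 1% starting belief.
    """
    return (prior_a + threes) / (prior_a + prior_b + rolls)

print(updated_belief(0, 0))     # the sceptical starting belief, about 1%
print(updated_belief(10, 60))   # threes as often as on a fair die: belief rises
print(updated_belief(0, 60))    # no threes at all: belief drops further
```

A weak prior like this one is overwhelmed quickly by data; larger pseudo-counts would make the Bayesian more stubborn about her suspicion.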
The Bayesian is asked to make bets, which may include anything from which fly will crawl up a wall faster to which medicine will save most lives, or which prisoners should go to jail. He has a big box with a handle. He knows that if he puts absolutely everything he knows into the box, including his personal opinion, and turns the handle, it will make the best possible decision for him.
The frequentist is asked to write reports. He has a big black book of rules. If the situation he is asked to make a report on is covered by his rulebook, he can follow the rules and write a report so carefully worded that it is wrong, at worst, one time in 100 (or one time in 20, or one time in whatever the specification for his report says).
The frequentist knows (because he has written reports on it) that the Bayesian sometimes makes bets that, in the worst case, when his personal opinion is wrong, could turn out badly. The frequentist also knows (for the same reason) that if he bets against the Bayesian every time he differs from him, then, over the long run, he will lose.
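The "wrong, at worst, one time in 20" guarantee can be illustrated by simulation. The sketch below assumes a made-up world (normal data with a known standard deviation) and the textbook 95% interval for a mean; none of these specifics come from the answer.

```python
import random

random.seed(0)  # fixed seed so the sketch is repeatable

TRUE_MEAN, SIGMA, N = 10.0, 2.0, 25   # hypothetical "true" world
Z_95 = 1.96                            # two-sided 95% normal quantile
TRIALS = 5000

misses = 0
for _ in range(TRIALS):
    sample = [random.gauss(TRUE_MEAN, SIGMA) for _ in range(N)]
    centre = sum(sample) / N
    half_width = Z_95 * SIGMA / N ** 0.5   # sigma is assumed known
    # A "wrong report" is an interval that fails to contain the true mean.
    if not (centre - half_width <= TRUE_MEAN <= centre + half_width):
        misses += 1

error_rate = misses / TRIALS
print(error_rate)   # should land close to 0.05, about one report in 20
```

Note what the guarantee is about: the long run of reports written by following the rule, not the chance that any single report is right.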
In plain english, I would say that Bayesian and Frequentist reasoning are distinguished by two different ways of answering the question:
What is probability?
Most differences will essentially boil down to how each answers this question, for it basically defines the domain of valid applications of the theory. Now you can't really give either answer in "plain english" without generating further questions. For me the answer is (as you could probably guess)
probability is logic
my "non-plain english" reason for this is that the calculus of propositions is a special case of the calculus of probabilities, if we represent truth by 1 and falsehood by 0.
For the frequentist reasoning, we have the answer:
probability is frequency
although I'm not sure "frequency" is a plain english term in the way it is used here - perhaps "proportion" is a better word. I wanted to add into the frequentist answer that the probability of an event is thought to be a real, measurable (observable?) quantity, which exists independently of the person/object who is calculating it. But I couldn't do this in a "plain english" way.
So perhaps a "plain english" version of one of the differences could be that frequentist reasoning is an attempt at reasoning from "absolute" probabilities, whereas bayesian reasoning is an attempt at reasoning from "relative" probabilities.
Another difference is that frequentist foundations are more vague in how you translate the real world problem into the abstract mathematics of the theory. A good example is the use of "random variables" in the theory - they have a precise definition in the abstract world of mathematics, but there is no unambiguous procedure one can use to decide if some observed quantity is or isn't a "random variable".
In the bayesian way of reasoning, the notion of a "random variable" is not necessary. A probability distribution is assigned to a quantity because it is unknown - which means that it cannot be deduced logically from the information we have. This provides at once a simple connection between the observable quantity and the theory - as "being unknown" is unambiguous.
You can also see in the above example a further difference in these two ways of thinking - "random" vs "unknown". "Randomness" is phrased in such a way that it seems like a property of the actual quantity. Conversely, "being unknown" depends on which person you are asking about that quantity - hence it is a property of the statistician doing the analysis. This gives rise to the "objective" versus "subjective" adjectives often attached to each theory. It is easy to show that "randomness" cannot be a property of some standard examples, by simply asking two frequentists who are given different information about the same quantity to decide if it's "random". One is the usual Bernoulli Urn: frequentist 1 is blindfolded while drawing, whereas frequentist 2 is standing over the urn, watching frequentist 1 draw the balls from the urn. If the declaration of "randomness" is a property of the balls in the urn, then it cannot depend on the different knowledge of frequentist 1 and 2 - and hence the two frequentists should give the same declaration of "random" or "not random".
In reality, I think much of the philosophy surrounding the issue is just grandstanding. That's not to dismiss the debate, but it is a word of caution. Sometimes, practical matters take priority - I'll give an example below.
Also, you could just as easily argue that there are more than two approaches:
- Neyman-Pearson ('frequentist')
- Likelihood-based approaches
- Fully Bayesian
A senior colleague recently reminded me that "many people in common language talk about frequentist and Bayesian. I think a more valid distinction is likelihood-based and frequentist. Both maximum likelihood and Bayesian methods adhere to the likelihood principle whereas frequentist methods don't."
I'll start off with a very simple practical example:
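We have a patient. The patient is either healthy (H) or sick (S). We will perform a test on the patient, and the result will either be Positive (+) or Negative (-). If the patient is sick, they will always get a Positive result. We'll call this the correct (C) result and say that P(+ | S) = 1, or in other words P(C | S) = 1. If the patient is healthy, the test will be correct, i.e. Negative, 95% of the time, and the rest are false positives: P(- | H) = 0.95 and P(+ | H) = 0.05, or in other words P(C | H) = 0.95.
So, the test is either 100% accurate or 95% accurate, depending on whether the patient is sick or healthy. Taken together, this means the test is at least 95% accurate.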
So far so good. Those are the statements that would be made by a frequentist. Those statements are quite simple to understand and are true. There's no need to waffle about a 'frequentist interpretation'.
But, things get interesting when you try to turn things around. Given the test result, what can you learn about the health of the patient? Given a negative test result, the patient is obviously healthy, as there are no false negatives.
But we must also consider the case where the test is positive. Was the test positive because the patient was actually sick, or was it a false positive? This is where the frequentist and Bayesian diverge. Everybody will agree that this cannot be answered at the moment. The frequentist will refuse to answer. The Bayesian will be prepared to give you an answer, but you'll have to give the Bayesian a prior first - i.e. tell it what proportion of the patients are sick.
To recap, the following statements are true:
- For healthy patients, the test is very accurate.
- For sick patients, the test is very accurate.
If you are satisfied with statements such as that, then you are using frequentist interpretations. This might change from project to project, depending on what sort of problems you're looking at.
But you might want to make different statements and answer the following question:
- For those patients that got a positive test result, how accurate is the test?
This requires a prior and a Bayesian approach. Note also that this is the only question of interest to the doctor. The doctor will say "I know that the patients will either get a positive result or a negative result. I also know that the negative result means the patient is healthy and can be sent home. The only patients that interest me now are those that got a positive result -- are they sick?"
To summarize: In examples such as this, the Bayesian will agree with everything said by the frequentist. But the Bayesian will argue that the frequentist's statements, while true, are not very useful; and will argue that the useful questions can only be answered with a prior.
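The doctor's question can be answered with Bayes' rule once a prior is supplied. In the sketch below the accuracy figures are the ones from the example (sick patients always test positive, healthy patients test negative 95% of the time); the function name and the three prevalence values are hypothetical and only show how much the answer depends on the prior.

```python
def prob_sick_given_positive(prevalence,
                             p_pos_given_sick=1.0,
                             p_pos_given_healthy=0.05):
    """Bayes' rule for P(S | +), given the prior proportion of sick patients."""
    sick_and_positive = prevalence * p_pos_given_sick
    healthy_and_positive = (1.0 - prevalence) * p_pos_given_healthy
    return sick_and_positive / (sick_and_positive + healthy_and_positive)

# Hypothetical priors: the proportion of patients who are sick.
for prevalence in (0.001, 0.05, 0.5):
    print(prevalence, round(prob_sick_given_positive(prevalence), 3))
```

If only 1 patient in 1000 is sick, a positive result still leaves only about a 2% chance of sickness, even though the test is "at least 95% accurate"; if half the patients are sick, the same result means about 95%.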
A frequentist will consider each possible value of the parameter (H or S) in turn and ask "if the parameter is equal to this value, what is the probability of my test being correct?"
A Bayesian will instead consider each possible observed value (+ or -) in turn and ask "If I imagine I have just observed that value, what does that tell me about the conditional probability of H-versus-S?"
For sick patients, the test is NOT very accurate.
you forget the NOT? – agstudy, Jan 6 at 23:44
Schools of thought in Probability Theory
This question about drawing inferences about an individual bowl player when you have two data sets - other players' results, and the new player's results, is a good spontaneous example of the difference which my answer tries to address in plain English.
Bayesian and frequentist statistics are compatible in that they can be understood as two limiting cases of assessing the probability of future events based on past events and an assumed model, if one admits that in the limit of a very large number of observations, no uncertainty about the system remains, and that in this sense a very large number of observations is equal to knowing the parameters of the model.
Assume we have made some observations, e.g., the outcome of 10 coin flips. In Bayesian statistics, you start from what you have observed and then you assess the probability of future observations or model parameters. In frequentist statistics, you start from an idea (hypothesis) of what is true by assuming scenarios of a large number of observations that have been made, e.g., the coin is unbiased and gives 50% heads up if you throw it many, many times. Based on these scenarios of a large number of observations (=hypothesis), you assess the frequency of making observations like the one you did, i.e., the frequency of different outcomes of 10 coin flips. It is only then that you take your actual outcome, compare it to the frequency of possible outcomes, and decide whether the outcome belongs to those that are expected to occur with high frequency. If this is the case you conclude that the observation made does not contradict your scenarios (=hypothesis). Otherwise, you conclude that the observation made is incompatible with your scenarios, and you reject the hypothesis.
Thus Bayesian statistics starts from what has been observed and assesses possible future outcomes. Frequentist statistics starts with an abstract experiment of what would be observed if one assumes something, and only then compares the outcomes of the abstract experiment with what was actually observed. Otherwise the two approaches are compatible. They both assess the probability of future observations based on some observations made or hypothesized.
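The frequentist procedure for the 10 coin flips can be sketched as an exact binomial test. The function name and the two example outcomes are mine; where to draw the line between "high frequency" and "low frequency" (5% is a common convention) is a choice the analyst has to make and is not fixed by the text.

```python
from math import comb

def two_sided_p_value(heads, flips, p_null=0.5):
    """Exact binomial test: total probability, under the hypothesis, of all
    outcomes that are no more likely than the one actually observed."""
    def pmf(k):
        return comb(flips, k) * p_null ** k * (1 - p_null) ** (flips - k)
    observed = pmf(heads)
    # The small tolerance guards against floating-point ties.
    return sum(pmf(k) for k in range(flips + 1) if pmf(k) <= observed + 1e-12)

print(two_sided_p_value(9, 10))   # 22/1024, about 0.021: rare for a fair coin
print(two_sided_p_value(6, 10))   # 772/1024, about 0.754: quite ordinary
```

Nine heads in ten flips belongs to the outcomes a fair coin rarely produces, so the hypothesis would usually be rejected; six heads in ten gives no reason to doubt it.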
I started to write this up in a more formal way:
Positioning Bayesian inference as a particular application of frequentist inference and vice versa. figshare.
http://dx.doi.org/10.6084/m9.figshare.867707
The manuscript is new. If you happen to read it, and have comments, please let me know.
Somewhat OT, but here are two poems I wrote (one about each school of thought):
For Bayesians and for frequentists
I would say that they look at probability in different ways. The Bayesian is subjective and uses prior beliefs to define a prior probability distribution on the possible values of the unknown parameters. So he relies on a theory of probability like de Finetti's. The frequentist sees probability as something that has to do with a limiting frequency based on an observed proportion. This is in line with the theory of probability as developed by Kolmogorov and von Mises.
A frequentist does parametric inference using just the likelihood function. A Bayesian takes that, multiplies it by a prior, and normalizes it to get the posterior distribution that he uses for inference.
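The recipe "likelihood times prior, then normalize" can be sketched on a grid of candidate parameter values. The data (7 heads in 10 flips) and the prior (a shape that mildly favours a fair coin) are hypothetical and chosen only to make the two answers visibly different.

```python
# Grid approximation of "posterior = likelihood x prior, normalised".
# Data and prior are hypothetical: 7 heads in 10 flips, and a prior that
# mildly favours a fair coin.
heads, flips = 7, 10
grid = [i / 100 for i in range(101)]          # candidate values of p

likelihood = [p ** heads * (1 - p) ** (flips - heads) for p in grid]
prior = [p * (1 - p) for p in grid]           # unnormalised Beta(2, 2) shape

# Likelihood-only point estimate: the p that maximises the likelihood.
mle = grid[max(range(len(grid)), key=lambda i: likelihood[i])]

# Bayesian: multiply by the prior, then normalise so the weights sum to 1.
unnormalised = [l * q for l, q in zip(likelihood, prior)]
total = sum(unnormalised)
posterior = [u / total for u in unnormalised]
posterior_mean = sum(p * w for p, w in zip(grid, posterior))

print(mle)              # 0.7
print(posterior_mean)   # pulled from 0.7 towards 0.5 by the prior
```

With a flat prior the posterior would have the same shape as the likelihood, which is one way to see why the two approaches so often give similar answers.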
I have studied an exciting example like this. Take a look at this picture:
What did you see?
If you said that this is maybe a half-black, half-white dog, you are a frequentist.
If you see a black dog, you are a Bayesian: this is based on your available knowledge about dogs, namely that a half-black, half-white dog rarely exists.
So, most of us are Bayesians; we just don't realize it.
A male cat and a female cat are penned up in a steel chamber, along with enough food and water for 70 days.
A Frequentist would say the average gestation period for felines is 66 days, the female was in heat when the cats were penned up, and once in heat she will mate repeatedly for 4 to 7 days. Since there were likely many acts of propagation and enough subsequent time for gestation, the odds are, when the box is opened on day 70, there's a litter of newborn kittens.
A Bayesian would say, I heard some serious Marvin Gaye coming from the box on day 1 and then this morning I heard many kitten-like sounds coming from the box. So without knowing much about cat reproduction, the odds are, when the box is opened on day 70, there's a litter of newborn kittens.