Statistics: Introductory Basic Concepts - Descriptive Statistics: Charts and Graphs (Personal Notes)
Source: Internet. Editor: 程序博客网. Time: 2024/04/29 22:23
Graphically, the center of a distribution is located at the median of the distribution.
The spread of a distribution refers to the variability of the data.
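As a rough illustration of center and spread, the sketch below computes the median together with two common measures of variability (the range and the sample standard deviation). The scores are hypothetical and are not data from these notes; the choice of range and standard deviation as spread measures is also an assumption, since the notes only speak of "variability".

```python
import statistics

# Hypothetical sample of test scores (not data from the original notes).
scores = [52, 58, 61, 64, 67, 70, 73, 75, 78, 84, 91]

center = statistics.median(scores)        # middle value of the sorted data
data_range = max(scores) - min(scores)    # simplest measure of spread
std_dev = statistics.stdev(scores)        # sample standard deviation

print(f"median = {center}, range = {data_range}, std dev = {std_dev:.1f}")
```

A wider range or a larger standard deviation indicates that the observations are more spread out around the center.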
Shape
The shape of a distribution is described by the following characteristics.
- Symmetry. When it is graphed, a symmetric distribution can be divided at the center so that each half is a mirror image of the other.
- Number of peaks. Distributions can have few or many peaks. Distributions with one clear peak are called unimodal, and distributions with two clear peaks are called bimodal. When a symmetric distribution has a single peak at the center, it is referred to as bell-shaped.
- Skewness. When they are displayed graphically, some distributions have many more observations on one side of the graph than the other. Distributions with fewer observations on the right (a long tail toward higher values) are said to be skewed right; distributions with fewer observations on the left (a long tail toward lower values) are said to be skewed left.
- Uniform. When the observations in a set of data are equally spread across the range of the distribution, the distribution is called a uniform distribution. A uniform distribution has no clear peaks.
Here are some examples of distributions and shapes.
[Figure: six example histograms over the values 0 to 9. Top row: bell-shaped; skewed right; non-symmetric, bimodal. Bottom row: uniform; skewed left; symmetric, bimodal.]
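The direction of skew described above can also be checked numerically. The following is a minimal sketch using the moment-based skewness coefficient (third central moment divided by the 1.5 power of the second); this formula is a standard one but is not taken from these notes, and the function name `skewness` and the three small data sets are hypothetical.

```python
def skewness(data):
    """Moment-based skewness: positive = skewed right, negative = skewed left."""
    n = len(data)
    mean = sum(data) / n
    m2 = sum((x - mean) ** 2 for x in data) / n   # second central moment
    m3 = sum((x - mean) ** 3 for x in data) / n   # third central moment
    if m2 == 0:
        return 0.0                                # all values identical
    return m3 / m2 ** 1.5

# Hypothetical data sets on the 0-9 scale used in the example figure.
right_skewed = [0, 0, 0, 1, 1, 2, 3, 5, 9]        # long tail toward high values
left_skewed = [9 - x for x in right_skewed]       # mirror image of the above
symmetric = [1, 2, 2, 3, 3, 3, 4, 4, 5]           # single peak at the center

for name, data in [("right", right_skewed),
                   ("left", left_skewed),
                   ("symmetric", symmetric)]:
    print(f"{name:9s} skewness = {skewness(data):+.2f}")
```

A coefficient near zero is consistent with symmetry, but it does not by itself tell you the number of peaks; that still has to be read from the graph.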
Frequency vs. Cumulative Frequency
In a data set, the cumulative frequency for a value x is the total number of scores that are less than or equal to x. The charts below illustrate the difference between frequency and cumulative frequency. Both charts show scores for a test administered to 300 students.
[Figure: frequency chart (left) and cumulative frequency chart (right) for the test scores]
In the chart on the left, column height shows frequency - the number of students in each test score grouping. For example, about 30 students received a test score between 51 and 60.
In the chart on the right, column height shows cumulative frequency - the number of students up to and including each test score. The chart on the right is a cumulative frequency chart. It shows that 30 students received a test score of at most 50; 60 students received a score of at most 60; 120 students received a score of at most 70; and so on.
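The relationship between the two charts can be sketched as a running total. In the example below, the first three group counts follow from the figures quoted above (30, 60 and 120 students at or below 50, 60 and 70); the remaining counts are hypothetical, chosen only so that the total is 300.

```python
from itertools import accumulate

# Upper bound of each score group and the number of students in that group.
# Only the first three counts come from the text; the rest are assumptions.
upper_bounds = [50, 60, 70, 80, 90, 100]
frequency = [30, 30, 60, 90, 60, 30]

# Cumulative frequency = running total of the group frequencies.
cumulative = list(accumulate(frequency))

for bound, f, c in zip(upper_bounds, frequency, cumulative):
    print(f"score <= {bound:3d}: frequency = {f:3d}, cumulative = {c:3d}")
```

The last cumulative value always equals the total number of observations, which is a quick sanity check on any cumulative frequency chart.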
Absolute vs. Relative Frequency
[Figure: cumulative frequency chart with the Y axis in percentages]
Frequency counts can be measured in terms of absolute numbers or relative numbers (e.g., proportions or percentages). The chart to the right duplicates the cumulative frequency chart above, except that it expresses the counts in terms of percentages rather than absolute numbers.
Note that the columns in the chart have the same shape, whether the Y axis is labeled with actual frequency counts or with percentages. If we had used proportions instead of percentages, the shape would remain the same.
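Converting absolute counts to relative ones is just a division by the total, which is why the shape does not change. A minimal sketch follows; the first three cumulative counts are the ones quoted earlier in these notes, and the rest are hypothetical values that sum to 300 students.

```python
# Cumulative counts for 300 students (values after the third are assumptions).
cumulative_counts = [30, 60, 120, 210, 270, 300]
total = cumulative_counts[-1]

proportions = [c / total for c in cumulative_counts]        # values in 0..1
percentages = [100 * c / total for c in cumulative_counts]  # values in 0..100

for count, prop, pct in zip(cumulative_counts, proportions, percentages):
    print(f"count = {count:3d}  proportion = {prop:.2f}  percentage = {pct:.0f}%")
```

Every column is scaled by the same constant, so the ratio between any two columns is identical in all three versions of the chart.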
Discrete vs. Continuous Variables
[Figure: cumulative percentage curve with a continuous X axis of test scores]
Each of the previous cumulative charts has used a discrete variable on the X axis (i.e., the horizontal axis). The chart to the right duplicates the previous cumulative charts, except that it uses a continuous variable for the test scores on the X axis.
Let's work through an example to understand how to read this cumulative frequency plot. Specifically, let's find the median. Follow the grid line to the right from the Y axis at 50%. This line intersects the curve over the X axis at a test score of about 73. This means that half of the students received a test score of at most 73, and half received a test score of at least 73. Thus, the median is 73.
You can use the same process to find the cumulative percentage associated with any other test score. For example, what percentage of students received a test score of 64 or less? From the graph, you can see that about 25% of students received a score of 64 or less.
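Reading a value off a cumulative curve amounts to interpolating between known points. The sketch below assumes the curve is a straight line between neighbouring points; the function name `score_at_percent` and the points themselves are hypothetical, so the results only roughly match the chart described above.

```python
def score_at_percent(points, target):
    """Return the score at which the cumulative percentage reaches `target`.

    `points` is a list of (score, cumulative_percent) pairs sorted by score.
    The curve is assumed to be linear between neighbouring points.
    """
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if y0 <= target <= y1:
            if y1 == y0:          # flat segment: any score in it qualifies
                return x0
            return x0 + (target - y0) * (x1 - x0) / (y1 - y0)
    raise ValueError("target is outside the range of the curve")

# Hypothetical (score, cumulative %) points; not read from the original chart.
curve = [(40, 0), (50, 10), (60, 20), (70, 40), (80, 70), (90, 90), (100, 100)]

median = score_at_percent(curve, 50)
print(f"median is about {median:.0f}")
```

The same function can be called with 25 or 75 to read off the first and third quartiles.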
Test Your Understanding
Problem 1
Below, the cumulative frequency plot shows height (in inches) of college basketball players.
What is the interquartile range?
Q1 is the 25th percentile of the data set, which corresponds to a height of about 71 inches. Q3 is read off the plot in the same way at 75%, and the interquartile range is Q3 minus Q1.
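If the raw heights were available instead of only the plot, the quartiles could be computed directly. The sketch below uses Python's `statistics.quantiles`; the heights are hypothetical and are not the data behind the plot in this problem, so the numbers it prints are not the answer to Problem 1.

```python
import statistics

# Hypothetical player heights in inches (NOT the data from the problem).
heights = [68, 69, 70, 71, 71, 72, 73, 73, 74, 75, 76, 76, 77, 78, 80]

# n=4 splits the data into quartiles and returns the three cut points.
q1, q2, q3 = statistics.quantiles(heights, n=4)
iqr = q3 - q1   # interquartile range: spread of the middle 50% of the data

print(f"Q1 = {q1}, median = {q2}, Q3 = {q3}, IQR = {iqr}")
```

Different quartile conventions (for example the `method="inclusive"` option of `statistics.quantiles`) can give slightly different cut points on small samples.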