Chapter 2--Summarizing and Graphing Data
Our ultimate goal in this chapter is not only to obtain a table or graph, but also to analyze the data and understand what it tells us.
General characteristics of data
1. Center--a representative or average value that indicates where the middle of the data set is located
2. Variation--a measure of the amount that the data values vary
3. Distribution--the nature or shape of the spread of the data over the range of values such as bell-shaped, uniform, or skewed
4. Outliers--sample values that lie very far away from the vast majority of the other sample values
5. Time--changing characteristics of the data over time
2-2 Frequency Distributions
Key Concept--When working with large data sets, it is often helpful to organize and summarize the data by constructing a table called a frequency distribution. A frequency distribution helps us to understand the nature of the distribution of a data set.
A frequency distribution lists data values along with their corresponding frequencies. The classes must be mutually exclusive (they must not overlap). The frequency for a particular class is the number of original values that fall into that class.
Reasons for constructing a frequency distribution:
1. Large data sets can be summarized.
2. We gain some insight into the nature of the data.
3. We have a basis for constructing graphs.
The goal of a frequency distribution is to make a table that will quickly reveal the shape of the data.
Definitions:
1. Lower class limits are the smallest numbers that can belong to the different classes.
2. Upper class limits are the largest numbers that can belong to the different classes.
3. Class boundaries are the numbers used to separate the classes, but without the gaps created by class limits. They split the difference between the end of one class and the beginning of the next class.
4. Class midpoints are the values in the middle of the class.
Class midpoint = (upper class limit + lower class limit) / 2
5. Class width is the difference between two consecutive lower class limits or two consecutive lower class boundaries.
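The definitions above can be sketched in a few lines of Python. This is only an illustration: the class limits below are hypothetical, and the function names are assumptions made for this example, not part of the notes.

```python
# Hypothetical classes given as (lower limit, upper limit) pairs; not from the notes.
class_limits = [(60, 69), (70, 79), (80, 89), (90, 99)]

def class_width(limits):
    """Difference between two consecutive lower class limits."""
    return limits[1][0] - limits[0][0]

def class_midpoints(limits):
    """(upper limit + lower limit) / 2 for each class."""
    return [(lower + upper) / 2 for lower, upper in limits]

def class_boundaries(limits):
    """Split the gap between one class's upper limit and the next class's lower limit."""
    half_gap = (limits[1][0] - limits[0][1]) / 2   # here (70 - 69) / 2 = 0.5
    return [limits[0][0] - half_gap] + [upper + half_gap for _, upper in limits]

width = class_width(class_limits)
midpoints = class_midpoints(class_limits)
boundaries = class_boundaries(class_limits)

print("width:", width)
print("midpoints:", midpoints)
print("boundaries:", boundaries)
```

Note that the class width (10) is not the upper limit minus the lower limit of one class (69 - 60 = 9), which is why the definition uses consecutive lower limits.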
Procedure for constructing a frequency distribution
1. Determine the number of classes--between 5 and 20
The square root of the number of data values is often used as an estimate for the number of classes.
2. Calculate class width
Class width ≈ (maximum - minimum) / number of classes; round the result to get a convenient number
3. Choose either the minimum data value or a convenient value below the minimum data value as the first lower class limit.
4. Using the first lower class limit and the class width, list the other lower class limits.
5. List the lower class limits in a vertical column and then enter the upper class limits.
6. Tally the data values and then find the total frequency for each class.
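The six steps above can be sketched as follows. This is a minimal illustration under stated assumptions: the data set is hypothetical, the values are whole numbers (so each upper limit is one less than the next lower limit), and the class width is rounded up, which is one reasonable reading of "round to a convenient number".

```python
import math

# Hypothetical sample of 20 whole-number scores, used only for illustration.
data = [62, 65, 68, 71, 73, 74, 75, 77, 78, 80,
        81, 82, 84, 85, 86, 88, 90, 93, 95, 99]

def suggested_width(values, num_classes):
    """Step 2: (maximum - minimum) / number of classes, rounded up (an assumption)."""
    return math.ceil((max(values) - min(values)) / num_classes)

def frequency_distribution(values, first_lower, width, num_classes):
    """Steps 3-6: return (lower limit, upper limit, frequency) for each class."""
    rows = []
    for i in range(num_classes):
        lower = first_lower + i * width      # step 4: successive lower class limits
        upper = lower + width - 1            # step 5: upper limits for whole-number data
        frequency = sum(lower <= v <= upper for v in values)  # step 6: tally
        rows.append((lower, upper, frequency))
    return rows

num_classes = 5                              # step 1: chosen between 5 and 20
width = suggested_width(data, num_classes)   # (99 - 62) / 5 = 7.4, rounded up to 8
# Step 3: 60 is a convenient value below the minimum data value of 62.
table = frequency_distribution(data, first_lower=60, width=width, num_classes=num_classes)

for lower, upper, frequency in table:
    print(f"{lower}-{upper}: {frequency}")
```

A quick check worth making after step 6 is that the class frequencies add up to the number of data values; if they do not, some value fell outside every class.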
The relative frequency distribution is a variation of the frequency distribution. Instead of frequencies, you use relative frequencies or percents.
Relative frequency = class frequency / total frequency
Percentage frequency = (class frequency / total frequency) x 100%
The sum of the relative frequencies must be 1 or very close to 1.
The cumulative frequency for a class is the sum of the frequencies for that class and all previous classes.
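As a small worked example of the relative, percentage, and cumulative frequencies described above, the sketch below starts from a hypothetical list of class frequencies (the numbers are made up for illustration).

```python
from itertools import accumulate

# Hypothetical class frequencies, listed in class order.
frequencies = [2, 5, 5, 5, 3]
total = sum(frequencies)

# Relative frequency = class frequency / total frequency
relative = [f / total for f in frequencies]

# Percentage frequency = (class frequency / total frequency) x 100%
percentage = [100 * f / total for f in frequencies]

# Cumulative frequency = frequency of this class plus all previous classes
cumulative = list(accumulate(frequencies))

print("relative:  ", relative)
print("percentage:", percentage)
print("cumulative:", cumulative)
print("sum of relative frequencies:", sum(relative))
```

The last cumulative frequency always equals the total frequency, and the relative frequencies sum to 1 up to rounding error.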
Critical Thinking: Interpreting Frequency Distributions
A frequency distribution can reveal some important characteristics of the data. For example, it can help us decide whether the data have an approximately normal distribution.
Normal distribution
1. The frequencies start low, then increase to one or two high frequencies, then decrease to a low frequency.
2. The distribution is approximately symmetric.
The presence of gaps can show that we have data from two or more different populations.
Frequency distributions can be used to summarize qualitative data.
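For qualitative data the classes are simply the categories. A minimal sketch, assuming a hypothetical list of survey responses, uses Python's standard `collections.Counter` to do the tally:

```python
from collections import Counter

# Hypothetical survey responses (qualitative data); not from the notes.
responses = ["car", "bus", "car", "walk", "bike", "car", "bus", "walk", "car"]

# Each distinct category becomes a class; its count is the class frequency.
frequency_table = Counter(responses)

for category, count in frequency_table.most_common():
    print(f"{category:<6}{count}")
```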
2-3 Histograms
A frequency distribution is a tool for summarizing a large set of data and determining the distribution of the data. A histogram can be constructed from a frequency distribution.
A histogram is a graph consisting of bars of equal width drawn adjacent to each other. The horizontal scale represents classes of quantitative data values and the vertical scale represents frequencies. The heights of the bars correspond to the frequency values. On the horizontal scale, use class boundaries or class midpoints. On the vertical scale, use class frequencies or relative frequencies. If relative frequencies are used, it is called a relative frequency histogram.
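To see how a histogram follows directly from a frequency distribution, here is a rough text-only sketch. The frequency distribution is hypothetical, and the bars are drawn horizontally (one row per class) only because that is easier to print; a real histogram has the classes on the horizontal scale and adjacent vertical bars.

```python
# Hypothetical frequency distribution: (class label, frequency).
classes = [("60-67", 2), ("68-75", 5), ("76-83", 5), ("84-91", 5), ("92-99", 3)]

def text_histogram(rows, symbol="#"):
    """Return one line per class; the bar length equals the class frequency."""
    return [f"{label:>7} | {symbol * freq} ({freq})" for label, freq in rows]

histogram_lines = text_histogram(classes)
print("\n".join(histogram_lines))
```

Replacing each frequency with its relative frequency would give the text analogue of a relative frequency histogram with the same shape.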
What to look for in a histogram: central or typical value, extent of spread or variation, general shape, location and number of peaks, and presence of gaps or outliers.
Normal distribution is bell-shaped.
Characteristics of a bell-shaped distribution
1. The frequencies increase to a maximum and then decrease.
2. The distribution is approximately symmetric.
You can often use a histogram to see if data is normal.
2-4 Statistical Graphs
The objective of this section is to identify a suitable graph for representing a data set. The graph should be effective in revealing the important characteristics of the data.
A frequency polygon uses line segments connected to points located directly above class midpoint values. The vertical axis uses frequencies. A relative frequency polygon uses relative frequencies on the vertical axis.
An ogive is a line graph that depicts cumulative frequencies. You use class boundaries along the horizontal scale and cumulative frequencies along the vertical scale.
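The points that a frequency polygon and an ogive connect can be computed from a frequency distribution. The sketch below uses hypothetical classes; starting the ogive at a cumulative frequency of 0 at the first lower class boundary is a common convention assumed here, not something stated in the notes.

```python
from itertools import accumulate

# Hypothetical classes as (lower limit, upper limit, frequency).
classes = [(60, 69, 3), (70, 79, 6), (80, 89, 7), (90, 99, 4)]

# Frequency polygon: one point per class at (class midpoint, frequency).
polygon_points = [((lower + upper) / 2, freq) for lower, upper, freq in classes]

# Ogive: cumulative frequency plotted above each upper class boundary,
# starting (by assumption) from 0 at the first lower class boundary.
half_gap = (classes[1][0] - classes[0][1]) / 2
cumulative = list(accumulate(freq for _, _, freq in classes))
ogive_points = [(classes[0][0] - half_gap, 0)] + [
    (upper + half_gap, running_total)
    for (_, upper, _), running_total in zip(classes, cumulative)
]

print("polygon:", polygon_points)
print("ogive:  ", ogive_points)
```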
A dotplot consists of a graph in which each data value is plotted as a point or dot along a scale of values. Dots representing equal values are stacked. A dotplot is a simple way to display quantitative data when the data set is reasonably small. Dotplots convey information about a representative or typical value in the data set, the extent to which the data values spread out, the nature of the distributions of values along the number line, and the presence of unusual values in the data set.
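A dotplot is simple enough to sketch in plain text. The example below assumes a small hypothetical data set of whole numbers and stacks one `*` per data value above a number line; the function name is made up for this illustration.

```python
from collections import Counter

# Hypothetical small data set of whole numbers.
values = [3, 5, 5, 6, 6, 6, 7, 9]

def text_dotplot(data):
    """Return text rows: stacked dots on top, the scale of values on the last row."""
    counts = Counter(data)
    scale = range(min(data), max(data) + 1)
    tallest = max(counts.values())
    rows = []
    # Build from the top of the tallest stack down to the bottom row of dots.
    for level in range(tallest, 0, -1):
        row = "".join(" * " if counts[v] >= level else "   " for v in scale)
        rows.append(row.rstrip())
    rows.append("".join(f"{v:^3}" for v in scale).rstrip())
    return rows

dotplot_rows = text_dotplot(values)
print("\n".join(dotplot_rows))
```

Equal values stack (6 appears three times), and values with no dots (4 and 8) remain on the scale so gaps stay visible.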
A stemplot or stem-and-leaf plot is an effective and compact way to summarize quantitative data. The stem is the first part of the number and the leaf is the last part of the number. Each numerical value is divided into two parts. The leading digit(s) become the stem and the trailing digit becomes the leaf. The stems are located along the vertical axis and the leaf values are stacked against each other along the horizontal axis.
Advantages: we can see the distribution of the data while retaining all the information in the original list, and a stemplot is a quick way to sort data.
Use stemplots when you have a small to moderate set of data.
Stemplots are useful in getting a sense of a typical value for the data set and how spread out the values in the data set are. They also show outliers. You usually want between 5 and 20 stems in a stemplot.
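The stem-and-leaf split described above can be sketched as follows, assuming hypothetical non-negative whole numbers whose last digit is the leaf. The example data give only four stems so that the output stays short.

```python
from collections import defaultdict

# Hypothetical two-digit data values; not from the notes.
values = [62, 65, 68, 71, 73, 74, 75, 77, 80, 81, 84, 88, 90, 93]

def stemplot(data):
    """Stem = leading digit(s), leaf = trailing digit; leaves sorted within each stem."""
    leaves = defaultdict(list)
    for value in sorted(data):
        stem, leaf = divmod(value, 10)
        leaves[stem].append(leaf)
    # Include stems with no leaves so gaps in the data stay visible.
    return [
        f"{stem} | {''.join(str(leaf) for leaf in leaves.get(stem, []))}".rstrip()
        for stem in range(min(leaves), max(leaves) + 1)
    ]

stemplot_rows = stemplot(values)
print("\n".join(stemplot_rows))
```

Because every leaf is kept, the original data values can be read back from the plot, and reading the rows top to bottom gives the data in sorted order.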
A bar graph uses bars of equal width to show frequencies of categories of qualitative data. The vertical scale represents frequencies or relative frequencies.
A multiple bar graph has two or more sets of bars, and is used to compare two or more data sets.
A Pareto chart is a bar graph for qualitative data with the bars arranged in descending order according to frequencies.
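The only step that distinguishes a Pareto chart from an ordinary bar graph is the sort. A minimal sketch, with hypothetical category counts and text bars standing in for drawn ones:

```python
# Hypothetical complaint counts by category (qualitative data).
complaints = {"billing": 12, "delivery": 30, "quality": 21, "other": 5}

# Pareto order: categories sorted from highest to lowest frequency.
pareto_order = sorted(complaints.items(), key=lambda item: item[1], reverse=True)

for category, count in pareto_order:
    print(f"{category:<9}{'#' * count} ({count})")
```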
A pie chart is a graph that shows qualitative data as slices of a circle, in which the size of each slice is proportional to the frequency count for the category. Pie charts are most effective when there are not too many different categories. It is difficult to compare category proportions using pie charts. Look for categories that form large and small proportions of the data set.
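Since each slice is proportional to its category's frequency, the central angle of a slice is its relative frequency times 360 degrees. A short worked example with hypothetical counts:

```python
# Hypothetical frequency counts for three categories.
counts = {"A": 10, "B": 6, "C": 4}
total = sum(counts.values())

# Central angle of each slice = (frequency / total) x 360 degrees.
slice_degrees = {category: 360 * count / total for category, count in counts.items()}

for category, degrees in slice_degrees.items():
    print(f"{category}: {degrees} degrees")
```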
A scatterplot or scatter diagram is a plot of (x,y) quantitative data with a horizontal x-axis and a vertical y-axis. The pattern of the plotted points is often helpful in determining whether there is a relationship between the two variables.
A time-series graph is a graph of time-series data, which are quantitative data that have been collected at different points in time.
2-5 Critical Thinking: Bad graphs
Key Concept: Some graphs are bad because they contain errors and others are bad because they are misleading. It is important to recognize bad graphs and why they are bad.
Two ways graphs can be misleading:
1. Nonzero axis: Look at the axis. The vertical scale should begin at 0.
2. Pictographs--drawings of objects--can be misleading when three-dimensional objects are used, because doubling every dimension of an object makes its volume eight times as large.
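Both distortions can be put into numbers. The sketch below uses hypothetical values (55 and 50, with a vertical axis starting at 45) to compare the true ratio of two quantities with the ratio of the bar heights a reader actually sees, and then shows how scaling a pictograph object inflates its area and volume.

```python
def apparent_ratio(value_a, value_b, axis_start=0.0):
    """Ratio of the drawn bar heights when the vertical axis starts at axis_start."""
    return (value_a - axis_start) / (value_b - axis_start)

# Hypothetical values: 55 versus 50.
true_ratio = apparent_ratio(55, 50)                        # axis starts at 0
truncated_ratio = apparent_ratio(55, 50, axis_start=45)    # axis starts at 45

# Pictograph: a quantity doubles, so every dimension of the object is doubled.
scale = 2
area_factor = scale ** 2      # a two-dimensional drawing looks 4 times as large
volume_factor = scale ** 3    # a three-dimensional object looks 8 times as large

print("true ratio:", true_ratio)
print("ratio seen with truncated axis:", truncated_ratio)
print("area factor:", area_factor, "volume factor:", volume_factor)
```

With the axis starting at 45, a 10% difference is drawn as one bar twice the height of the other.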