The representation of data
The data collected for statistical analysis are usually called a data set or a sample.
We can collect the data in tables and we can represent them on diagrams. We come across many such diagrams in newspapers and on television.

[Figure 1: The changes in the area of woods in a country. Vertical axis: thousand hectares, \latex{ 0 } to \latex{ 1600 }; horizontal axis: years \latex{ 1955 } to \latex{ 2005 }.]
For example, the table and the diagram in Figure 1 show the changes in the area of woods in a country from \latex{ 1955 } to \latex{ 2005 }.
We can follow the changes easily and compare the different data well if we represent them as curves in a coordinate system (Figure 2).
In many cases the information that can be read from a diagram could otherwise be communicated only with several tables or a long text. An image is not only more illustrative but also conveys more information.
The following column charts show the changes in the population of a country in Central Europe in certain age groups (Figure 3). Curves and column charts are useful when we are interested in how the data change or how they relate to each other.

[Figure: The changes in the natural increase (or decrease). Curves: number of live-births and number of deaths (thousand people; axis ticks \latex{ 100 } to \latex{ 130 }). Years shown: \latex{ 1960;\, 1970;\, 1980;\, 1990;\, 1991;\, 1992;\, 2001;\, 2002;\, 2003;\, 2004;\, 2005;\, 2006 }. Labelled values of the natural increase (+) or natural decrease (-), in the order they appear: \latex{ +44,936;\, +31,622;\, +3,318;\, -19,981;\, -17,606;\, -27,057;\, -33,136;\, -36,029;\, -41,176;\, -37,355;\, -38,236;\, -31,650 }.]

[Figure: The changes in the number of new-borns (column chart; axis ticks \latex{ 50 } to \latex{ 200 }). Years shown: \latex{ 1948;\, 1960;\, 1970;\, 1980;\, 1990;\, 2001;\, 2004;\, 2006 }. Column labels, in the order they appear: \latex{ 192;\, 146;\, 152;\, 149;\, 97;\, 97;\, 100;\, 95 } thousand.]

Figure 3
[Figure: The age pyramid of the population of a country in Central Europe (\latex{ 2006 }). The bars show the number of men and the number of women in five-year age groups from \latex{ -4 } (up to \latex{ 4 } years) to \latex{ 90- } (\latex{ 90 } years and over), with the excess number of men or of women marked; axis ticks \latex{ 100 } to \latex{ 400 } on each side.]
Example 1
Albert collected \latex{ 20 } points in his Mathematics test, while Bruce collected \latex{ 25 } points in his Physics test. Who has the better test result if the maximum score was \latex{ 25 } points for the Mathematics test and \latex{ 50 } points for the Physics test?
Solution
Although Bruce collected more points than Albert, it is only \latex{ 50 }% of the total, while Albert's result is \latex{ 80 }% of the total, so Albert has the better test result.
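The comparison in Example 1 can be checked with a short calculation. The following is only a sketch in Python; the function name `percentage` and the variable names are chosen here for illustration.

```python
def percentage(points, maximum):
    """Return the score as a percentage of the maximum attainable points."""
    return 100 * points / maximum


# Data from Example 1: points collected and maximum points of each test.
albert = percentage(20, 25)  # Mathematics test
bruce = percentage(25, 50)   # Physics test

print(f"Albert: {albert:.0f}%")
print(f"Bruce: {bruce:.0f}%")

# The better result is the one with the larger percentage, not the larger score.
print("Better result:", "Albert" if albert > bruce else "Bruce")
```

The scores themselves (\latex{ 20 } and \latex{ 25 }) are not compared directly, because they were measured against different maximum values.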
It often happens that comparing amounts measured on different scales does not give adequate information; in such cases comparing the percentages or ratios gives a more realistic picture.
If we are interested in the proportion of data compared to the whole, then it is worth using a pie chart (or a three-dimensional pie chart) to represent them, where the circle is divided in the ratio of the corresponding data (Figure 4).
Figures 5 and 6 illustrate when it is more helpful to use a column chart and when to use a pie chart (or a three-dimensional pie chart):

[Figure 5: Salaries based on the qualification in \latex{ 2006 } in a Central European country. Column chart; vertical axis: thousand EUR/person/year, \latex{ 0 } to \latex{ 400 }.]

[Figure 6: The distribution of the number of employed people, based on the highest level of qualification. Pie chart: secondary education 28.9%; GCSE 34.7%; 8 classes 15.1%; university, college 20.8%; 0–7 classes 0.5%.]
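Dividing the circle in the ratio of the data means that each sector gets a central angle proportional to its share of the whole. A minimal sketch in Python, using the percentages shown in Figure 6 (the variable names are illustrative only):

```python
# Shares of employed people by highest qualification, in percent (Figure 6).
shares = {
    "secondary education": 28.9,
    "GCSE": 34.7,
    "8 classes": 15.1,
    "university, college": 20.8,
    "0-7 classes": 0.5,
}

total = sum(shares.values())

# Each sector's central angle is the same fraction of 360 degrees
# as the data value is of the total.
angles = {label: 360 * value / total for label, value in shares.items()}

for label, angle in angles.items():
    print(f"{label}: {angle:.1f} degrees")
```

The angles add up to \latex{ 360^{\circ} }, so the sectors fill the circle exactly.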
The number of occurrences of each data value is called its frequency. The frequencies can be represented on a frequency diagram, in other words on a histogram.
The data and their frequencies together constitute a frequency distribution.
We often classify data; we count how many data values fall into each class and thus we get the frequency of the classes.
The result of a survey on the foreign language knowledge of citizens older than \latex{ 14 } in a Central European country is as follows:

| The number of languages spoken | Number of people |
| --- | --- |
| Does not speak a foreign language | \latex{ 5,603 } thousand |
| Speaks \latex{ 1 } foreign language | \latex{ 1,483 } thousand |
| Speaks \latex{ 2 } foreign languages | \latex{ 906 } thousand |
| Speaks \latex{ 3 } or more foreign languages | \latex{ 247 } thousand |

[Figure 7: Histogram. Horizontal axis: the number of languages spoken; vertical axis: number of people (thousand people), \latex{ 0 } to \latex{ 6,000 }.]
If we give the data as a proportion of the whole population older than \latex{ 14 }, i.e. we calculate what percentage of the people older than \latex{ 14 } speak \latex{ 0;\, 1;\, 2;\, 3 } or more foreign languages, then we get the relative frequency of the data, which often makes the data easier to compare. The relative frequencies can be represented on a histogram too.
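The relative frequencies for the language survey can be calculated as follows. This is a sketch in Python that assumes the four categories of the table together cover the whole population older than \latex{ 14 }; the dictionary keys are shortened labels chosen here.

```python
# Frequencies from the language survey, in thousand people.
frequencies = {
    "no foreign language": 5603,
    "1 foreign language": 1483,
    "2 foreign languages": 906,
    "3 or more foreign languages": 247,
}

# Assumption: the four categories add up to the whole surveyed population.
total = sum(frequencies.values())

# Relative frequency = frequency divided by the total number of data.
relative = {label: count / total for label, count in frequencies.items()}

for label, share in relative.items():
    print(f"{label}: {share:.1%}")
```

The relative frequencies always add up to \latex{ 1 } (that is, \latex{ 100 }%), whatever the size of the population.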
Example 2
A class wrote a test in Mathematics. The maximum attainable score was \latex{ 50 } points. The scores the students got:
\latex{ 18;\, 22;\, 37;\, 42;\, 48;\, 50;\, 32;\, 38;\, 26;\, 40; }
\latex{ 42;\, 43;\, 45;\, 35;\, 34;\, 36;\, 39;\, 40;\, 34;\, 33 }.
The teacher followed this pattern for the grades:
\latex{A: 43-50; \; B:36-42; \; C:29-35; \;D:22-28; \; F:0-21.}
Give the frequency distribution of the grades and represent them on a column chart.
Solution
The frequency distribution is shown in Figure 9; the column chart is shown in Figure 8.

| Grade | Frequency |
| --- | --- |
| \latex{ F } | \latex{ 1 } |
| \latex{ D } | \latex{ 2 } |
| \latex{ C } | \latex{ 5 } |
| \latex{ B } | \latex{ 8 } |
| \latex{ A } | \latex{ 4 } |

Figure 9
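The classification in Example 2 can be reproduced with a short program. The sketch below, in Python, sorts each score into its grade using the teacher's limits and counts the frequency of each grade; the names `GRADE_LIMITS` and `grade` are illustrative only.

```python
from collections import Counter

# The scores of the students from Example 2.
scores = [18, 22, 37, 42, 48, 50, 32, 38, 26, 40,
          42, 43, 45, 35, 34, 36, 39, 40, 34, 33]

# Lowest score of each grade, listed from the best grade down.
GRADE_LIMITS = [("A", 43), ("B", 36), ("C", 29), ("D", 22), ("F", 0)]


def grade(score):
    """Return the grade of a score between 0 and 50."""
    for letter, lowest in GRADE_LIMITS:
        if score >= lowest:
            return letter
    raise ValueError(f"score out of range: {score}")


# Frequency of each class (grade).
frequency = Counter(grade(score) for score in scores)

# A simple text version of the column chart: one '#' per student.
for letter in "FDCBA":
    print(letter, frequency[letter], "#" * frequency[letter])
```

The frequencies add up to \latex{ 20 }, the number of students who wrote the test.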

Exercises
{{exercise_number}}. A survey was conducted in a class regarding how many times the students had gone to the theatre in the past year. The result is shown in the column chart.
Read and tabulate the frequency distribution of the data.

[Column chart; axes: number of students (\latex{ 0 } to \latex{ 8 }) and occasions (\latex{ 0 } to \latex{ 8 }).]
{{exercise_number}}. Represent the area of the continents and the population on a column chart.

| Continent | Area (\latex{ 1,000\, km^2 }) | Population (million people) |
| --- | --- | --- |
| Europe | \latex{ 10,508 } | \latex{ 732 } |
| Asia | \latex{ 44,411 } | \latex{ 3,969 } |
| Africa | \latex{ 30,319 } | \latex{ 924 } |
| North America | \latex{ 21,515 } | \latex{ 332 } |
| Central and South America | \latex{ 20,566 } | \latex{ 566 } |
| Australia and Oceania | \latex{ 8,510 } | \latex{ 34 } |
| Antarctica | \latex{ 13,328 } | \latex{ - } |
| Total of the world | \latex{ 149,157 } | \latex{ 6,555 } |
{{exercise_number}}. The table shows data on the regions of Hungary. Create a diagram based on the data.

| | Central Hungary | Central Transdanubia | Western Transdanubia | Southern Transdanubia | Northern Hungary | Northern Great Plain | Southern Great Plain |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Area (%) | \latex{ 7.4 } | \latex{ 12.1 } | \latex{ 12.0 } | \latex{ 15.2 } | \latex{ 14.4 } | \latex{ 19.1 } | \latex{ 19.8 } |
| Population (%) | \latex{ 28.3 } | \latex{ 11.0 } | \latex{ 9.8 } | \latex{ 9.7 } | \latex{ 12.7 } | \latex{ 15.1 } | \latex{ 13.4 } |
| Gross national product (GNP) (%) | \latex{ 41.6 } | \latex{ 10.0 } | \latex{ 10.3 } | \latex{ 7.8 } | \latex{ 8.8 } | \latex{ 10.6 } | \latex{ 10.9 } |
| Unemployed people (%) | \latex{ 15.2 } | \latex{ 10.5 } | \latex{ 6.9 } | \latex{ 11.8 } | \latex{ 19.2 } | \latex{ 22.2 } | \latex{ 14.2 } |
| Foreign capital investment (%) | \latex{ 64.0 } | \latex{ 6.8 } | \latex{ 8.9 } | \latex{ 3.3 } | \latex{ 7.7 } | \latex{ 4.7 } | \latex{ 4.6 } |
| Investment (%) | \latex{ 39.4 } | \latex{ 12.8 } | \latex{ 11.3 } | \latex{ 6.9 } | \latex{ 9.5 } | \latex{ 11.4 } | \latex{ 8.7 } |
{{exercise_number}}. The club is buying basketball shoes for the members of a girls' basketball team. The foot length of each girl was measured in centimetres, accurate to the millimetre, and the following results were obtained:
\latex{ 23.2;\, 26.3;\, 28.0;\, 25.1;\, 25.8;\, 24.9;\, 24.2;\, 25.4;\, 25.9;\, 26.1;\, 24.4;\, 24.9;\, 23.6;\, 23.4 }.
Give the frequency distribution of each size and represent them on a column chart, if the size conversion table of basketball shoes is as follows:

| European size | Length of foot (cm) |
| --- | --- |
| \latex{ 36.0 } | \latex{ 22.5 } |
| \latex{ 36.5 } | \latex{ 23.0 } |
| \latex{ 37.5 } | \latex{ 23.5 } |
| \latex{ 38.0 } | \latex{ 24.0 } |
| \latex{ 38.5 } | \latex{ 24.5 } |
| \latex{ 39.0 } | \latex{ 25.0 } |
| \latex{ 40.0 } | \latex{ 25.5 } |
| \latex{ 40.5 } | \latex{ 26.0 } |
| \latex{ 41.0 } | \latex{ 26.5 } |
| \latex{ 42.0 } | \latex{ 27.0 } |
| \latex{ 42.5 } | \latex{ 27.5 } |
| \latex{ 43.0 } | \latex{ 28.0 } |
{{exercise_number}}. Observe how many hours a day you spend on the following activities: sleeping, school, homework, entertainment, eating and other. Represent the data on a column chart and on a pie chart.
Collect data on the number of students of the class who on average sleep less than \latex{ 4 } hours, \latex{ 4 } to \latex{ 6 } hours, \latex{ 6 } to \latex{ 8 } hours, more than \latex{ 8 } hours a day. Represent the frequency distribution on a column chart.
{{exercise_number}}. Collect and represent data on a histogram regarding how many students from the class go to school on foot, by bicycle, by public transport or by car.
{{exercise_number}}. Create a frequency table showing how many times each letter occurs on the first page of the first lesson of your history book. Represent it on a histogram. Based on this, determine which letters are the most common in your language. In a word-guessing game the player has to figure out a word and may guess \latex{ 6 } letters; every guessed letter that appears in the word is revealed. Which letters would you guess? Why is the chance of a hit small?
{{exercise_number}}. Experiment with two coins. Flip the two coins at once \latex{ 30 } times in a row and take notes on how many times \latex{ 0 } heads, \latex{ 1 } head or \latex{ 2 } heads appear on the two coins. Represent the result on a column chart. Choose the value appearing most often.
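The two-coin experiment can also be imitated on a computer and compared with the hand-flipped results. The following Python sketch is only an illustration: it assumes fair coins, and the seed value \latex{ 2006 } is an arbitrary choice that makes the run repeatable.

```python
import random
from collections import Counter


def flip_two_coins(rng):
    """Flip two fair coins and return the number of heads (0, 1 or 2)."""
    return rng.randint(0, 1) + rng.randint(0, 1)


# Assumption: a fixed, arbitrary seed so that the run can be repeated.
rng = random.Random(2006)

# Frequency of 0, 1 and 2 heads in 30 flips of the two coins.
heads_counts = Counter(flip_two_coins(rng) for _ in range(30))

# A simple text version of the column chart: one '#' per occurrence.
for heads in (0, 1, 2):
    print(heads, "#" * heads_counts[heads])

# The value appearing most often.
most_common_value = heads_counts.most_common(1)[0][0]
print("Most frequent number of heads:", most_common_value)
```

A simulation does not replace the real experiment; a different seed, like a different series of real flips, generally gives different frequencies.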
Puzzle
In a quiz show one answer has to be chosen out of four possible answers. The audience is allowed to help the player. The question was as follows: “Whose comedy is the musical My Fair Lady based on?”
- SHAKESPEARE
- G. B. SHAW
- OSCAR WILDE
- NEIL SIMON
In the diagram of the votes the columns show the votes given for answers \latex{ A, B, C, D } respectively. Which answer shall we choose?

[Column chart of the votes; vertical axis: %, \latex{ 0 } to \latex{ 50 }; columns: \latex{ A }, \latex{ B }, \latex{ C }, \latex{ D }.]




