010. Frequency Distributions

Example 2.2. As the resident statistician for Pigs & People Airline, you have been asked by the chairman of the board to collect and organize data regarding flight operations. You are primarily interested in the daily values for two variables: (1) number of passengers and (2) number of miles flown, rounded to the nearest tenth. You are able to obtain these data from daily flight records for the past 50 days, and you have recorded this information in Table 2.1 and Table 2.2. However, in this raw form it is unlikely that the chairman could gain any useful knowledge regarding operations.

Table 2.1 – Raw Data on the Number of Passengers for P&P Airlines

 68   71   77   83   79   72   74   57   67   69
 50   60   70   66   76   70   84   59   75   94
 65   72   85   79   71   83   84   74   82   97
 77   73   78   93   95   78   81   79   90   83
 80   84   91  101   86   93   92  102   80   69

Table 2.2 – Raw Data on Miles Flown for P&P Airlines

569.3  420.4  468.5  443.9  403.7  519.7  518.7  445.3  459.0  373.4
493.7  505.7  453.7  397.1  463.9  618.3  493.3  477.0  380.0  423.7
391.0  553.5  513.7  330.0  419.8  370.7  544.1  470.0  361.9  483.8
405.7  550.6  504.6  343.3  497.9  453.3  604.3  473.3  393.9  478.4
437.9  320.4  473.3  359.3  568.2  450.0  413.4  469.3  383.7  469.1

There are several ways to organize data for presentation.

1. A frequency distribution (frequency table) provides some order to the data by dividing them into classes and recording the number of observations in each class. Table 2.3 presents the frequency distribution for the daily number of passengers over the last 50 days.

Each class has a lower boundary and an upper boundary. The exact limits of these boundaries are quite important. It is essential that class boundaries do not overlap.

Table 2.3 – Frequency Distribution for Passengers

Class (passengers)    Frequency (days)
50 to 59               3
60 to 69               7
70 to 79              18
80 to 89              12
90 to 99               8
100 to 109             2
Total                 50
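
As a rough check, the tally in Table 2.3 can be reproduced from the raw data in Table 2.1. The sketch below is only illustrative: the names `passengers`, `CLASS_WIDTH` and `passenger_frequencies` are not from the text, and it assumes integer data with classes of equal width 10 starting at a multiple of 10.

```python
from collections import Counter

# Daily passenger counts from Table 2.1 (50 days).
passengers = [
    68, 71, 77, 83, 79, 72, 74, 57, 67, 69,
    50, 60, 70, 66, 76, 70, 84, 59, 75, 94,
    65, 72, 85, 79, 71, 83, 84, 74, 82, 97,
    77, 73, 78, 93, 95, 78, 81, 79, 90, 83,
    80, 84, 91, 101, 86, 93, 92, 102, 80, 69,
]

CLASS_WIDTH = 10  # assumption: each class covers ten integers, e.g. 50 to 59


def passenger_frequencies(values, width=CLASS_WIDTH):
    """Return {(lower, upper): frequency} for integer-valued data."""
    # Integer division maps every value to the lower boundary of its class,
    # so each value falls into exactly one class.
    tally = Counter((v // width) * width for v in values)
    return {(lo, lo + width - 1): tally[lo] for lo in sorted(tally)}


passenger_table = passenger_frequencies(passengers)
for (lo, hi), freq in passenger_table.items():
    print(f"{lo} to {hi}: {freq}")
print("Total:", sum(passenger_table.values()))
```

Classes that contain no observations would simply be missing from the result. That does not happen with these data, but a more careful version would list empty classes with a frequency of zero.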

Boundaries such as

50 to 60
60 to 70
70 to 80

are confusing because adjacent classes overlap: it is unclear whether an observation of 60 belongs to the first class or the second. Because the number of passengers is a discrete variable, the boundaries used in Table 2.3 (50 to 59, 60 to 69, and so on) pose no problem: fractional values such as 59.9 are impossible. Miles flown, on the other hand, is a continuous variable, since it is possible to fly a fraction of a mile. It would be improper to set the boundaries as

300 to 349
350 to 399
400 to 449

since it is unclear in which class observations such as 349.9 or 399.9 should be tallied. The frequency distribution for miles flown might instead appear as in Table 2.4. The chairman can now detect a pattern in flight operations that is not apparent from the raw data in Table 2.2. For example, P&P never flew over 650 miles on any of the 50 days examined. They flew between 450 and 500 miles more often than any other distance. On 26 of the 50 days examined, total mileage was between 400 and 500 miles.

Table 2.4 – Frequency Distribution for Miles Flown

Class (miles)         Frequency (days)
300 and under 350      3
350 and under 400      9
400 and under 450      9
450 and under 500     17
500 and under 550      6
550 and under 600      4
600 and under 650      2
Total                 50
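
The "lower bound and under upper bound" convention of Table 2.4 means that each class includes its lower bound and excludes its upper bound, so a value such as 349.9 or 450.0 lands in exactly one class. A minimal sketch of that tally follows; the names `miles`, `LOWER_BOUNDS`, `UPPER_LIMIT` and `miles_frequencies` are illustrative and not from the text.

```python
from bisect import bisect_right

# Daily miles flown from Table 2.2 (50 days).
miles = [
    569.3, 420.4, 468.5, 443.9, 403.7, 519.7, 518.7, 445.3, 459.0, 373.4,
    493.7, 505.7, 453.7, 397.1, 463.9, 618.3, 493.3, 477.0, 380.0, 423.7,
    391.0, 553.5, 513.7, 330.0, 419.8, 370.7, 544.1, 470.0, 361.9, 483.8,
    405.7, 550.6, 504.6, 343.3, 497.9, 453.3, 604.3, 473.3, 393.9, 478.4,
    437.9, 320.4, 473.3, 359.3, 568.2, 450.0, 413.4, 469.3, 383.7, 469.1,
]

LOWER_BOUNDS = [300, 350, 400, 450, 500, 550, 600]  # class lower boundaries
UPPER_LIMIT = 650                                   # upper boundary of the last class


def miles_frequencies(values, lower_bounds=LOWER_BOUNDS, upper_limit=UPPER_LIMIT):
    """Count values per class, where class i is [lower_bounds[i], next bound)."""
    counts = [0] * len(lower_bounds)
    for v in values:
        if not (lower_bounds[0] <= v < upper_limit):
            raise ValueError(f"{v} lies outside every class")
        # bisect_right finds how many lower bounds are <= v, so a value equal
        # to a boundary (e.g. 450.0) is counted in the class that starts there.
        counts[bisect_right(lower_bounds, v) - 1] += 1
    return counts


miles_counts = miles_frequencies(miles)
uppers = LOWER_BOUNDS[1:] + [UPPER_LIMIT]
for lo, hi, freq in zip(LOWER_BOUNDS, uppers, miles_counts):
    print(f"{lo} and under {hi}: {freq}")
print("Days between 400 and 500 miles:", miles_counts[2] + miles_counts[3])
```

The last line adds the 400 and under 450 class to the 450 and under 500 class, which is the count behind the statement about days with total mileage between 400 and 500 miles.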

2. The number of classes in a frequency table is somewhat arbitrary. In general, a statistical table should have between 5 and 20 classes. Too few classes would not reveal any details about the data; too many would prove as confusing as the list of raw data itself.

There is a simple rule to approximate the number of classes:

$2^C \geq N$, (2.1)

or

$C = 1 + 3.322 \log_{10} N$, (2.2)

where C is the number of classes and N is the number of observations.

In Example 2.2 for P&P the number of observations is N = 50. Thus,

$2^C \geq 50$.

Solving for C, it can be found that $2^6 = 64$, which exceeds N. This rule suggests that there should be six classes in the frequency table.

This rule should not be taken as the final determining factor. For convenience, more classes or fewer classes may be used.
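
Both rules of thumb are easy to evaluate for any sample size. The sketch below is an illustration only: the function names `classes_rule_2_1` and `classes_rule_2_2` are made up here, and rule (2.2) is taken to use the base-10 logarithm.

```python
import math


def classes_rule_2_1(n):
    """Smallest integer C >= 1 with 2**C >= n, as in rule (2.1)."""
    c = 1
    while 2 ** c < n:
        c += 1
    return c


def classes_rule_2_2(n):
    """C = 1 + 3.322 * log10(n), as in rule (2.2); the result is not rounded."""
    return 1 + 3.322 * math.log10(n)


n_obs = 50  # number of observations in Example 2.2
print("Rule (2.1):", classes_rule_2_1(n_obs))
print("Rule (2.2):", round(classes_rule_2_2(n_obs), 2))
```

For N = 50 rule (2.1) gives six classes, and rule (2.2) gives a value between six and seven, so the two rules point to roughly the same number of classes.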
