Formula for the statistical average. Types of averages

In statistics, various types of averages are used, which are divided into two large classes:

Power means (harmonic mean, geometric mean, arithmetic mean, quadratic mean, cubic mean);

Structural means (mode, median).

To calculate power means, all available values of the characteristic must be used. The mode and the median are determined only by the structure of the distribution, which is why they are called structural, or positional, averages. The median and mode are often used as the average characteristic in populations where calculating a power mean is impossible or impractical.

The most common type of average is the arithmetic mean. The arithmetic mean is the value of a characteristic that each unit of the population would have if the total sum of all values of the characteristic were distributed evenly among all units. It is calculated by summing all values of the varying characteristic and dividing the result by the total number of units in the population. For example, five workers fulfilled an order for the production of parts: the first made 5 parts, the second 7, the third 4, the fourth 10, and the fifth 12. Since each value occurs only once in the source data, the average output of one worker is determined with the simple arithmetic mean formula:

$$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n},$$

i.e. in our example the average output of one worker is

$$\bar{x} = \frac{5 + 7 + 4 + 10 + 12}{5} = \frac{38}{5} = 7.6 \text{ parts}.$$
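The worker example can be checked with a few lines of Python. This is only an illustrative sketch; the function name `simple_arithmetic_mean` is an assumption and does not come from the source.

```python
# Output of the five workers, taken from the example in the text.
parts_made = [5, 7, 4, 10, 12]


def simple_arithmetic_mean(values):
    """Sum of all values divided by the number of units."""
    return sum(values) / len(values)


average_output = simple_arithmetic_mean(parts_made)
print(average_output)  # 7.6
```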

Along with the simple arithmetic mean, the weighted arithmetic mean is studied. For example, let us calculate the average age of students in a group of 20 people whose ages range from 18 to 22 years, where $x_i$ are the variants of the characteristic being averaged and $f_i$ is the frequency, which shows how many times the i-th value occurs in the population (Table 5.1).

Table 5.1

Average age of students

The average age is found with the weighted arithmetic mean formula:

$$\bar{x} = \frac{\sum x_i f_i}{\sum f_i}$$


There is a rule for choosing the weighted arithmetic mean: if there is a data series on two indicators, for one of which the average must be calculated, and the numerical values of the denominator of its logical formula are known while the values of the numerator are unknown but can be found as the product of these indicators, then the average should be calculated using the weighted arithmetic mean formula.

In some cases the nature of the initial statistical data is such that calculating the arithmetic mean loses its meaning, and the only suitable generalizing indicator is another type of average, the harmonic mean. Now that electronic computing is widespread, the computational properties of the arithmetic mean have lost their relevance for calculating general statistical indicators, and the harmonic mean, which can also be simple or weighted, has acquired great practical importance. If the numerical values of the numerator of the logical formula are known, and the values of the denominator are unknown but can be found as the quotient of one indicator divided by the other, then the average is calculated using the weighted harmonic mean formula.

For example, suppose a car covered the first 210 km at a speed of 70 km/h and the remaining 150 km at 75 km/h. The average speed over the entire 360 km journey cannot be found with the arithmetic mean formula. The variants are the speeds on the individual sections, $x_1 = 70$ km/h and $x_2 = 75$ km/h, and the weights $f_i$ are the corresponding sections of the path, so the products of the variants and the weights have neither physical nor economic meaning. What does have meaning is the quotient of each section of the path divided by the corresponding speed, i.e. the time spent on each section, $f_i / x_i$. If the sections of the path are denoted by $f_i$, then the entire path is $\sum f_i$ and the time spent on the entire path is $\sum f_i / x_i$. The average speed is then the entire path divided by the total time spent:

$$\bar{x} = \frac{\sum f_i}{\sum \dfrac{f_i}{x_i}}.$$

In our example we get:

$$\bar{x} = \frac{210 + 150}{\dfrac{210}{70} + \dfrac{150}{75}} = \frac{360}{5} = 72 \text{ km/h}.$$
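The speed example can be reproduced as follows. This is a minimal sketch; the helper name `weighted_harmonic_mean` is illustrative and not part of the source.

```python
def weighted_harmonic_mean(values, weights):
    """Total weight divided by the sum of the weight/value quotients."""
    return sum(weights) / sum(w / x for x, w in zip(values, weights))


speeds_kmh = [70, 75]      # variants x_i from the text
distances_km = [210, 150]  # weights f_i from the text

average_speed = weighted_harmonic_mean(speeds_kmh, distances_km)
print(average_speed)  # 72.0

# For comparison: the arithmetic mean weighted by distance gives a
# slightly higher figure, which does not reproduce the total travel time.
naive_speed = sum(x * f for x, f in zip(speeds_kmh, distances_km)) / sum(distances_km)
```

With the harmonic result, 360 km at 72 km/h takes exactly the 5 hours actually spent on the road, which is the quantity the average has to preserve here.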

If, when using the harmonic mean, the weights of all variants ($f$) are equal, then instead of the weighted form the simple (unweighted) harmonic mean can be used:

$$\bar{x} = \frac{n}{\sum \dfrac{1}{x_i}},$$

where $x_i$ are the individual variants and $n$ is the number of variants of the characteristic being averaged. In the speed example, the simple harmonic mean could be applied if the path segments traveled at different speeds were equal.

Any average value must be calculated so that when it replaces each variant of the averaged characteristic, the value of some final, general indicator that is associated with the averaged indicator does not change. Thus, when replacing actual speeds on individual sections of the route with their average value (average speed), the total distance should not change.

The form (formula) of the average is determined by the nature (mechanism) of the relationship between this final indicator and the averaged one. The final indicator, whose value must not change when the variants are replaced by their average, is therefore called the defining indicator. To derive the formula for the average, one writes and solves an equation based on the relationship between the averaged indicator and the defining one. The equation is constructed by replacing the variants of the characteristic (indicator) being averaged with their average value.

In addition to the arithmetic mean and the harmonic mean, other types (forms) of the mean are used in statistics. They are all special cases of the power mean. If all types of power means are calculated for the same data, their values turn out to be different; the majorant rule for means applies here: as the exponent of the mean increases, the mean itself increases. The formulas most frequently used in practical research to calculate the various types of power means are presented in Table 5.2.

Table 5.2

Types of power means


The geometric mean is used when there are n growth coefficients; the individual values of the characteristic are then, as a rule, relative values of dynamics constructed as chain values, i.e. as the ratio of each level of the time series to the previous level. The mean thus characterizes the average growth rate. The simple geometric mean is calculated by the formula

$$\bar{x} = \sqrt[n]{x_1 \cdot x_2 \cdots x_n} = \sqrt[n]{\prod x_i}.$$

The weighted geometric mean has the following form:

$$\bar{x} = \sqrt[\sum f_i]{\prod x_i^{f_i}}.$$

The above formulas are identical, but one is applied for current coefficients or growth rates, and the second is applied for absolute values ​​of series levels.

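The quadratic mean (root mean square) is used in calculations involving quadratic functions and serves to measure the degree of fluctuation of individual values of a characteristic around the arithmetic mean in a distribution series. It is calculated by the formula

$$\bar{x} = \sqrt{\frac{\sum x_i^2}{n}}.$$

The weighted quadratic mean is calculated by the formula

$$\bar{x} = \sqrt{\frac{\sum x_i^2 f_i}{\sum f_i}}.$$

The cubic mean is used in calculations involving cubic functions and is calculated by the formula

$$\bar{x} = \sqrt[3]{\frac{\sum x_i^3}{n}};$$

the weighted cubic mean:

$$\bar{x} = \sqrt[3]{\frac{\sum x_i^3 f_i}{\sum f_i}}.$$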

All the averages discussed above can be presented as one general formula:

$$\bar{x} = \sqrt[k]{\frac{\sum x_i^k}{n}},$$

where $\bar{x}$ is the average value; $x_i$ is an individual value; $n$ is the number of units of the population being studied; $k$ is the exponent that determines the type of average.

For the same source data, the larger k is in the general power mean formula, the larger the average. It follows that there is a regular relationship between the values of the power means:

$$\bar{x}_{harm} \le \bar{x}_{geom} \le \bar{x}_{arith} \le \bar{x}_{quad} \le \bar{x}_{cub}.$$
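The ordering of the power means can be verified numerically. The sketch below assumes strictly positive data and treats k = 0 as the geometric mean (the limit of the general formula as k tends to 0); the function name `power_mean` is illustrative.

```python
import math


def power_mean(values, k):
    """Power mean of order k for positive values; k = 0 gives the geometric mean."""
    n = len(values)
    if k == 0:
        # Limit of the general formula as k -> 0.
        return math.exp(sum(math.log(x) for x in values) / n)
    return (sum(x ** k for x in values) / n) ** (1 / k)


data = [5, 7, 4, 10, 12]   # the worker outputs from the earlier example
orders = [-1, 0, 1, 2, 3]  # harmonic, geometric, arithmetic, quadratic, cubic
means = [power_mean(data, k) for k in orders]

for k, value in zip(orders, means):
    print(f"k = {k:>2}: {value:.3f}")
```

The printed values increase from the harmonic mean to the cubic mean, with the arithmetic mean (k = 1) equal to 7.6.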

The averages described above give a generalized idea of the population being studied, and from this point of view their theoretical, applied and educational significance is indisputable. However, the average may not coincide with any of the values that actually occur. Therefore, in addition to the averages considered, statistical analysis also uses the values of specific variants that occupy a definite position in the ordered (ranked) series of values of the characteristic. Among these, the most commonly used are the structural, or descriptive, averages: the mode (Mo) and the median (Me).

The mode is the value of a characteristic that occurs most often in a given population. In a variation series, the mode is the most frequently occurring value of the ranked series, that is, the variant with the highest frequency. The mode can be used to determine which stores are visited most often or the most common price of a product. It shows the size of the characteristic typical of a significant part of the population and, for an interval series, is determined by the formula

$$Mo = x_0 + h \cdot \frac{f_m - f_{m-1}}{(f_m - f_{m-1}) + (f_m - f_{m+1})},$$

where $x_0$ is the lower limit of the modal interval; $h$ is the interval size; $f_m$ is the frequency of the modal interval; $f_{m-1}$ is the frequency of the previous interval; $f_{m+1}$ is the frequency of the next interval.
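The interval formula for the mode can be sketched in Python. The distribution below is hypothetical (it is not from the source), the intervals are assumed to have equal width, and a missing neighbouring interval is assumed to have frequency 0.

```python
def interval_mode(lower_bounds, width, freqs):
    """Mode of an equal-width interval series by the interpolation formula."""
    m = max(range(len(freqs)), key=freqs.__getitem__)       # index of the modal interval
    f_m = freqs[m]
    f_prev = freqs[m - 1] if m > 0 else 0                   # assumption: 0 if no previous interval
    f_next = freqs[m + 1] if m + 1 < len(freqs) else 0      # assumption: 0 if no next interval
    return lower_bounds[m] + width * (f_m - f_prev) / ((f_m - f_prev) + (f_m - f_next))


# Hypothetical series: intervals 10-20, 20-30, 30-40, 40-50.
lower_bounds = [10, 20, 30, 40]
freqs = [4, 10, 6, 2]

mode_value = interval_mode(lower_bounds, 10, freqs)
print(mode_value)  # 26.0
```

Here the modal interval is 20-30, so Mo = 20 + 10 · (10 − 4) / ((10 − 4) + (10 − 6)) = 26.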

The median is the variant located in the center of the ranked series. The median divides the series into two equal parts so that there is the same number of population units on either side of it. One half of the units has a value of the varying characteristic smaller than the median, and the other half a value greater than it. In other words, the median is the element whose value is greater than or equal to one half of the elements of the distribution series and at the same time less than or equal to the other half. The median gives a general idea of where the values of the characteristic are concentrated, in other words, where their center is.

The descriptive nature of the median is manifested in the fact that it characterizes the quantitative limit of the values of the varying characteristic possessed by half of the units in the population. Finding the median of a discrete variation series is straightforward. If all units of the series are given serial numbers, then for an odd number of members n the serial number of the median variant is (n + 1) / 2. If the number of members is even, the median is the average of the two variants with serial numbers n / 2 and n / 2 + 1.
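The rule for serial numbers translates directly into code. This is a small sketch with an illustrative function name; serial numbers in the text start at 1, while Python indices start at 0.

```python
def median_of_ranked_series(values):
    """Median of a discrete series using the (n + 1) / 2 and n / 2, n / 2 + 1 rules."""
    ranked = sorted(values)
    n = len(ranked)
    if n % 2 == 1:
        # Odd n: the variant with serial number (n + 1) / 2.
        return ranked[(n + 1) // 2 - 1]
    # Even n: average of the variants with serial numbers n / 2 and n / 2 + 1.
    return (ranked[n // 2 - 1] + ranked[n // 2]) / 2


odd_median = median_of_ranked_series([5, 7, 4, 10, 12])
even_median = median_of_ranked_series([5, 7, 4, 10])
print(odd_median)   # 7
print(even_median)  # 6.0
```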

When determining the median of an interval variation series, first determine the interval in which it lies (the median interval). This is the first interval whose accumulated sum of frequencies is equal to or exceeds half the sum of all frequencies of the series. The median of an interval variation series is calculated using the formula

$$Me = x_0 + h \cdot \frac{\frac{1}{2}\sum f - S_{m-1}}{f_m},$$

where $x_0$ is the lower limit of the median interval; $h$ is the interval size; $f_m$ is the frequency of the median interval; $\sum f$ is the number of members of the series; $S_{m-1}$ is the accumulated sum of frequencies of the intervals preceding the median one.
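The two steps (locating the median interval, then interpolating inside it) can be sketched as follows. The distribution is hypothetical and the function name is illustrative; equal interval widths are assumed.

```python
def interval_median(lower_bounds, width, freqs):
    """Median of an equal-width interval series by linear interpolation."""
    half = sum(freqs) / 2
    cumulative = 0  # accumulated frequencies before the current interval
    for x0, f_m in zip(lower_bounds, freqs):
        # The median interval is the first one whose accumulated
        # frequency reaches or exceeds half of the total.
        if cumulative + f_m >= half:
            return x0 + width * (half - cumulative) / f_m
        cumulative += f_m
    raise ValueError("the series is empty")


# Hypothetical series: intervals 10-20, 20-30, 30-40, 40-50.
lower_bounds = [10, 20, 30, 40]
freqs = [4, 10, 6, 2]

median_value = interval_median(lower_bounds, 10, freqs)
print(median_value)  # 27.0
```

Half of the 22 observations is 11; the accumulated frequency reaches 14 in the interval 20-30, so Me = 20 + 10 · (11 − 4) / 10 = 27.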

Along with the median, other values of variants that occupy a definite position in the ranked series are used to characterize the structure of the population more fully. These include quartiles and deciles. Quartiles divide the series, by the sum of frequencies, into 4 equal parts, and deciles into 10 equal parts. There are three quartiles and nine deciles.

The median and mode, unlike the arithmetic mean, do not eliminate individual differences in the values ​​of a variable characteristic and therefore are additional and very important characteristics of the statistical population. In practice, they are often used instead of the average or along with it. It is especially advisable to calculate the median and mode in cases where the population under study contains a certain number of units with a very large or very small value of the varying characteristic. These values ​​of the options, which are not very characteristic of the population, while influencing the value of the arithmetic mean, do not affect the values ​​of the median and mode, which makes the latter very valuable indicators for economic and statistical analysis.
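The sensitivity of the arithmetic mean to extreme values, and the stability of the median, can be illustrated with a short sketch. The second data set is hypothetical: one value of the worker example is replaced by an extreme one.

```python
from statistics import mean, median

typical = [4, 5, 7, 10, 12]
with_outlier = [4, 5, 7, 10, 120]  # hypothetical extreme value in place of 12

print(mean(typical), median(typical))
print(mean(with_outlier), median(with_outlier))
```

The mean rises sharply when the extreme value is introduced, while the median stays at 7.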

5.1. The concept of average

The average value is a general indicator characterizing the typical level of a phenomenon. It expresses the value of a characteristic per unit of the population.

The average always generalizes the quantitative variation of a characteristic, i.e. in averages the individual differences between units of the population caused by random circumstances cancel out. In contrast to the average, the absolute value characterizing the level of a characteristic for an individual unit does not allow the values of the characteristic to be compared across units belonging to different populations. For instance, to compare the level of pay at two enterprises, it is not enough to compare two employees from the different enterprises: the pay of the workers selected may not be typical of those enterprises. Comparing the size of the wage funds does not help either, because the number of employees is not taken into account and it is therefore impossible to determine where the level of pay is higher. Ultimately, only averages can be compared, i.e. how much one employee earns on average at each enterprise. Hence the need to calculate the average as a generalizing characteristic of the population.

Calculating the average is one of the common generalization techniques; the average reflects what is common (typical) to all units of the population being studied, while ignoring the differences between individual units. In every phenomenon and its development there is a combination of chance and necessity. When averages are calculated, the law of large numbers causes the random component to cancel out, so it is possible to abstract from the unimportant features of the phenomenon and from the quantitative values of the characteristic in each specific case. The scientific value of averages as generalizing characteristics of populations lies in this ability to abstract from the randomness of individual values and fluctuations.

In order for the average to be truly representative, it must be calculated taking into account certain principles.

Let us dwell on some general principles for the use of averages.
1. The average must be determined for populations consisting of qualitatively homogeneous units.
2. The average must be calculated for a population consisting of a sufficiently large number of units.
3. The average must be calculated for a population whose units are in a normal, natural state.
4. The average should be calculated taking into account the economic content of the indicator under study.

5.2. Types of averages and methods for calculating them

Let us now consider the types of average values, features of their calculation and areas of application. Average values ​​are divided into two large classes: power averages, structural averages.

Power means include the best-known and most frequently used types, such as the geometric mean, the arithmetic mean and the quadratic mean.

The mode and the median are considered structural averages.

Let us focus on power means. Depending on how the source data are presented, power means can be simple or weighted. The simple mean is calculated from ungrouped data and has the following general form:

$$\bar{X} = \sqrt[m]{\frac{\sum X_i^m}{n}},$$

where $X_i$ is the variant (value) of the characteristic being averaged;
$m$ is the exponent of the mean;
$n$ is the number of variants.

The weighted mean is calculated from grouped data and has the general form

$$\bar{X} = \sqrt[m]{\frac{\sum X_i^m f_i}{\sum f_i}},$$

where $X_i$ is the variant (value) of the characteristic being averaged or the midpoint of the interval in which the variant is measured;
$m$ is the exponent of the mean;
$f_i$ is the frequency showing how many times the i-th value of the averaged characteristic occurs.

Let us give as an example the calculation of the average age of students in a group of 20 people:


We calculate the average age using the simple average formula:

Let's group the source data. We get the following distribution series:

As a result of grouping, we obtain a new indicator - frequency, indicating the number of students aged X years. Therefore, the average age of the students in the group will be calculated using the weighted average formula:
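The effect of grouping can be sketched in Python. The 20 ages below are hypothetical (the original data are not preserved in the text); they only illustrate that the weighted mean of the grouped series equals the simple mean of the ungrouped data.

```python
from collections import Counter

# Hypothetical ages of 20 students, between 18 and 22 years.
ages = [18, 19, 19, 20, 20, 20, 20, 21, 21, 22,
        18, 19, 20, 20, 21, 21, 22, 19, 20, 20]

# Simple mean over the ungrouped data.
simple_mean = sum(ages) / len(ages)

# Grouping: each distinct age X with its frequency f.
frequencies = Counter(ages)
weighted_mean = sum(x * f for x, f in frequencies.items()) / sum(frequencies.values())

print(sorted(frequencies.items()))
print(simple_mean, weighted_mean)
```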

The general formulas for calculating power means contain an exponent (m). Depending on the value it takes, the following types of power means are distinguished:
harmonic mean, if m = -1;
geometric mean, if m → 0;
arithmetic mean, if m = 1;
quadratic mean (root mean square), if m = 2;
cubic mean, if m = 3.

Formulas for power means are given in Table 4.4.

If all types of means are calculated for the same initial data, their values turn out to be different. The majorant rule for means applies here: as the exponent m increases, the corresponding mean also increases:

$$\bar{X}_{harm} \le \bar{X}_{geom} \le \bar{X}_{arith} \le \bar{X}_{quad} \le \bar{X}_{cub}.$$

In statistical practice, arithmetic means and harmonic weighted means are used more often than other types of weighted averages.

Table 5.1

Types of power means

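| Kind of power mean | Exponent (m) | Simple formula | Weighted formula |
|---|---|---|---|
| Harmonic | -1 | $\bar{X} = \dfrac{n}{\sum \frac{1}{X_i}}$ | $\bar{X} = \dfrac{\sum f_i}{\sum \frac{f_i}{X_i}}$ |
| Geometric | 0 | $\bar{X} = \sqrt[n]{\prod X_i}$ | $\bar{X} = \sqrt[\sum f_i]{\prod X_i^{f_i}}$ |
| Arithmetic | 1 | $\bar{X} = \dfrac{\sum X_i}{n}$ | $\bar{X} = \dfrac{\sum X_i f_i}{\sum f_i}$ |
| Quadratic | 2 | $\bar{X} = \sqrt{\dfrac{\sum X_i^2}{n}}$ | $\bar{X} = \sqrt{\dfrac{\sum X_i^2 f_i}{\sum f_i}}$ |
| Cubic | 3 | $\bar{X} = \sqrt[3]{\dfrac{\sum X_i^3}{n}}$ | $\bar{X} = \sqrt[3]{\dfrac{\sum X_i^3 f_i}{\sum f_i}}$ |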

The harmonic mean has a more complex structure than the arithmetic mean. It is used when the weights are not the units of the population (the carriers of the characteristic) but the products of these units and the values of the characteristic (i.e. m = Xf). The simple harmonic mean should be used, for example, to determine the average expenditure of labor, time or materials per unit of output or per part across two (three, four, etc.) enterprises or workers engaged in manufacturing the same type of product, part or item.

The main requirement for the formula used to calculate the average is that every stage of the calculation has a real, meaningful justification; the resulting average should replace the individual values of the characteristic for each object without breaking the connection between the individual and summary indicators. In other words, the average must be calculated so that when each individual value of the averaged indicator is replaced by the average, some final summary indicator connected in one way or another with the averaged one remains unchanged. This summary indicator is called the defining indicator, since the nature of its relationship with the individual values determines the specific formula for calculating the average. Let us demonstrate this rule using the example of the geometric mean.

The geometric mean formula

$$\bar{X} = \sqrt[n]{X_1 \cdot X_2 \cdots X_n}$$

is used most often when calculating the average of individual relative values of dynamics.

The geometric mean is used if a sequence of chain relative values of dynamics is given, indicating, for example, the growth in production compared with the level of the previous year: $i_1, i_2, i_3, \ldots, i_n$. Obviously, the volume of production in the last year is determined by its initial level ($q_0$) and the subsequent growth over the years:

$$q_n = q_0 \times i_1 \times i_2 \times \ldots \times i_n.$$

Taking $q_n$ as the defining indicator and replacing the individual values of the dynamics indicators with their average $\bar{i}$, we arrive at the relation

$$q_n = q_0 \times \bar{i}^{\,n}.$$

From here

$$\bar{i} = \sqrt[n]{\frac{q_n}{q_0}} = \sqrt[n]{i_1 \times i_2 \times \ldots \times i_n}.$$
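The defining-indicator argument can be checked numerically: replacing every growth coefficient by the geometric mean must leave the final level unchanged. The growth coefficients and the initial level below are hypothetical.

```python
import math

growth = [1.10, 1.20, 0.95]  # hypothetical chain growth coefficients i_1..i_n
q0 = 100.0                   # hypothetical initial production level
n = len(growth)

# Defining indicator: the final level after n periods.
qn = q0 * math.prod(growth)

# Geometric mean of the coefficients, i.e. the n-th root of q_n / q_0.
mean_growth = (qn / q0) ** (1 / n)

# Replacing each coefficient by the mean reproduces the final level.
qn_from_mean = q0 * mean_growth ** n
print(qn, qn_from_mean, mean_growth)
```

The arithmetic mean of the same coefficients would be slightly larger and would overstate the final level, which is why it does not fit this defining indicator.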

5.3. Structural averages

A special type of average, the structural average, is used to study the internal structure of the distribution series of the values of a characteristic, and also to estimate the average (of the power type) when it cannot be calculated from the available statistical data (for example, if in the example considered there were no data on either the volume of production or the amount of costs by group of enterprises).

The indicators most often used as structural averages are the mode, the most frequently repeated value of the characteristic, and the median, the value of the characteristic that divides the ordered sequence of its values into two equal parts. As a result, for one half of the units in the population the value of the characteristic does not exceed the median level, and for the other half it is not less than it.

If the characteristic being studied takes discrete values, calculating the mode and median presents no particular difficulty. If the data on the values of characteristic X are presented as ordered intervals of its variation (an interval series), the calculation of the mode and median becomes somewhat more complicated. Since the median divides the entire population into two equal parts, it falls into one of the intervals of characteristic X. The value of the median within this median interval is found by interpolation:

$$Me = X_{Me} + h_{Me} \cdot \frac{\dfrac{\sum m}{2} - S_{Me-1}}{m_{Me}},$$

where $X_{Me}$ is the lower limit of the median interval;
$h_{Me}$ is its width;
$\sum m / 2$ is half of the total number of observations or half the volume of the indicator used as the weight in the formulas for calculating the average (in absolute or relative terms);
$S_{Me-1}$ is the sum of observations (or the volume of the weighting characteristic) accumulated before the beginning of the median interval;
$m_{Me}$ is the number of observations or the volume of the weighting characteristic in the median interval (also in absolute or relative terms).

In our example, even three median values ​​can be obtained - based on the number of enterprises, production volume and total production costs:

Thus, in half of the enterprises the cost per unit of production exceeds 125.19 thousand rubles, half of the total volume of products is produced with a cost per product of more than 124.79 thousand rubles. and 50% of the total costs are formed when the cost of one product is above 125.07 thousand rubles. Note also that there is a certain tendency towards an increase in cost, since Me 2 = 124.79 thousand rubles, and the average level is 123.15 thousand rubles.

When calculating the modal value of a characteristic from the data of an interval series, it is necessary to make sure that the intervals are of equal width, since the measure of how often the values of characteristic X repeat depends on this. For an interval series with equal intervals, the mode is determined as

$$Mo = X_{Mo} + h \cdot \frac{m_{Mo} - m_{Mo-1}}{(m_{Mo} - m_{Mo-1}) + (m_{Mo} - m_{Mo+1})},$$

where $X_{Mo}$ is the lower limit of the modal interval;
$m_{Mo}$ is the number of observations or the volume of the weighting characteristic in the modal interval (in absolute or relative terms);
$m_{Mo-1}$ is the same for the interval preceding the modal one;
$m_{Mo+1}$ is the same for the interval following the modal one;
$h$ is the width of the interval of variation of the characteristic in the groups.

For our example, we can calculate three modal values ​​based on the characteristics of the number of enterprises, the volume of products and the amount of costs. In all three cases, the modal interval is the same, since for the same interval the number of enterprises, the volume of production, and the total amount of production costs are greatest:

Thus, most often there are enterprises with a cost level of 126.75 thousand rubles, most often products are produced with a cost level of 126.69 thousand rubles, and most often production costs are explained by a cost level of 123.73 thousand rubles.

5.4. Variation indicators

The specific conditions in which each of the studied objects finds itself, as well as the features of their own development (social, economic, etc.), are expressed by the corresponding numerical levels of statistical indicators. Thus variation, i.e. the discrepancy between the levels of the same indicator in different objects, is objective in nature and helps to understand the essence of the phenomenon being studied.

There are several methods used to measure variation in statistics.

The simplest is to calculate the range of variation H as the difference between the maximum ($X_{max}$) and minimum ($X_{min}$) observed values of the characteristic:

$$H = X_{max} - X_{min}.$$

However, the range of variation shows only the extreme values ​​of the trait. The repeatability of intermediate values ​​is not taken into account here.

More rigorous characteristics are the indicators of variability relative to the average level of the characteristic. The simplest indicator of this type is the average linear deviation L, the arithmetic mean of the absolute deviations of the characteristic from its average level:

$$L = \frac{\sum |X_i - \bar{X}|}{n}.$$

When individual values of X are repeated, the weighted arithmetic mean formula is used:

$$L = \frac{\sum |X_i - \bar{X}| f_i}{\sum f_i}.$$

(Recall that the algebraic sum of deviations from the average level is zero.)
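The range and the average linear deviation can be computed as follows. The sketch reuses the worker outputs from the earlier example as illustrative data; the variable names are assumptions.

```python
data = [5, 7, 4, 10, 12]  # illustrative data (worker outputs from the earlier example)
n = len(data)
mean_value = sum(data) / n

# Range of variation: H = X_max - X_min.
variation_range = max(data) - min(data)

# Average linear deviation: mean of the absolute deviations from the average.
linear_deviation = sum(abs(x - mean_value) for x in data) / n

# The signed deviations cancel out, which is why absolute values are needed.
signed_sum = sum(x - mean_value for x in data)

print(variation_range)             # 8
print(round(linear_deviation, 2))  # 2.72
```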

The average linear deviation is widely used in practice. With its help, for example, the composition of the workforce, the rhythm of production and the uniformity of material supplies are analyzed, and systems of material incentives are developed. Unfortunately, this indicator complicates probabilistic calculations and makes it harder to apply the methods of mathematical statistics. Therefore, in statistical research the indicator most often used to measure variation is the variance.

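The variance of the characteristic ($s^2$) is determined on the basis of the quadratic power mean:

$$s^2 = \frac{\sum (X_i - \bar{X})^2}{n} \quad \text{or, for grouped data,} \quad s^2 = \frac{\sum (X_i - \bar{X})^2 f_i}{\sum f_i}.$$

The indicator $s = \sqrt{s^2}$ is called the standard deviation.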

In the general theory of statistics, the dispersion indicator is an estimate of the probability theory indicator of the same name and (as the sum of squared deviations) an estimate of the dispersion in mathematical statistics, which makes it possible to use the provisions of these theoretical disciplines for the analysis of socio-economic processes.

If variation is estimated from a small number of observations taken from an unlimited population, then the average value of the characteristic is itself determined with some error, and the calculated variance turns out to be biased downward. To obtain an unbiased estimate, the sample variance obtained with the formulas given earlier must be multiplied by n / (n - 1). As a result, with a small number of observations (< 30) it is recommended to calculate the variance of the characteristic using the formula

$$s^2 = \frac{\sum (X_i - \bar{X})^2}{n - 1}.$$

Usually, already for n > 15–20, the discrepancy between the biased and unbiased estimates becomes insignificant. For the same reason, bias is usually not taken into account in the formula for adding variances.
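The relationship between the biased and unbiased estimates can be checked with Python's standard `statistics` module, where `pvariance` divides by n and `variance` divides by n - 1. The data are the illustrative worker outputs used earlier.

```python
from statistics import pvariance, variance

data = [5, 7, 4, 10, 12]  # illustrative small sample
n = len(data)

biased = pvariance(data)   # sum of squared deviations divided by n
unbiased = variance(data)  # sum of squared deviations divided by n - 1

# Multiplying the biased estimate by n / (n - 1) gives the unbiased one.
corrected = biased * n / (n - 1)

print(biased, unbiased, corrected)
```

For n = 5 the correction factor is 1.25, a noticeable difference; for n = 20 it is already only about 1.05.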

If several samples are taken from the general population and the average value of a characteristic is determined each time, the problem arises of assessing the variability of the averages. The variance of the average value can be estimated from just one sample observation using the formula

$$s_{\bar{X}}^2 = \frac{s^2}{n},$$

where n is the sample size and $s^2$ is the variance of the characteristic calculated from the sample data.

The quantity $s_{\bar{X}} = \sqrt{s^2 / n}$ is called the average sampling error and characterizes the deviation of the sample average of characteristic X from its true average value. The average error is used to assess the reliability of the results of sample observation.

Relative dispersion indicators. To characterize the measure of variability of the characteristic being studied, indicators of variability are calculated in relative values. They make it possible to compare the nature of dispersion in different distributions (different units of observation of the same characteristic in two populations, with different average values, when comparing populations of different names). The calculation of indicators of the relative dispersion measure is carried out as the ratio of the absolute dispersion indicator to the arithmetic mean, multiplied by 100%.

1. The oscillation coefficient reflects the relative fluctuation of the extreme values of the characteristic around the average:

$$V_H = \frac{H}{\bar{X}} \cdot 100\%.$$

2. The relative linear deviation characterizes the share of the average absolute deviation in the average value:

$$V_L = \frac{L}{\bar{X}} \cdot 100\%.$$

3. The coefficient of variation

$$V_s = \frac{s}{\bar{X}} \cdot 100\%$$

is the most common measure of variability used to assess how typical the average is.

In statistics, populations with a coefficient of variation greater than 30–35% are considered heterogeneous.

This method of assessing variation also has a significant drawback. Suppose, for example, that an original population of workers with an average length of service of 15 years and a standard deviation of s = 10 years "grows older" by another 15 years. Now $\bar{X}$ = 30 years, and the standard deviation is still 10. The previously heterogeneous population (10/15 × 100 = 66.7%) thus turns out to be quite homogeneous over time (10/30 × 100 = 33.3%).
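The drawback can be reproduced with a small sketch. The four lengths of service are hypothetical values chosen so that the mean is 15 years and the standard deviation is 10 years, as in the text; `pstdev` from the standard `statistics` module divides by n.

```python
from statistics import mean, pstdev


def coefficient_of_variation(values):
    """Standard deviation as a percentage of the arithmetic mean."""
    return pstdev(values) / mean(values) * 100


experience = [5, 25, 5, 25]                  # hypothetical: mean 15, standard deviation 10
fifteen_years_later = [x + 15 for x in experience]

cv_before = coefficient_of_variation(experience)
cv_after = coefficient_of_variation(fifteen_years_later)

print(round(cv_before, 1))  # 66.7
print(round(cv_after, 1))   # 33.3
```

Adding the same constant to every value leaves the standard deviation at 10 but doubles the mean, so the coefficient is halved although the spread of the data has not changed.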

Boyarsky A.Ya. Theoretical Studies in Statistics: Collection of Scientific Works. Moscow: Statistika, 1974. pp. 19–57.


Lecture 5. Average values

The concept of average in statistics

Arithmetic mean and its properties

Other types of power averages

Mode and median

Quartiles and deciles

Average values ​​are widely used in statistics. Average values ​​characterize the qualitative indicators of commercial activity: distribution costs, profit, profitability, etc.

The average is one of the common generalization techniques. A correct understanding of the essence of the average determines its special significance in a market economy, where the average, through the individual and random, makes it possible to identify the general and essential and to reveal the trends and patterns of economic development.

Average values are generalizing indicators that express the effect of the general conditions and patterns of the phenomenon being studied.

The average value (in statistics) is a general indicator characterizing the typical size or level of social phenomena per unit of the population, all other things being equal.

The method of averages can be used to solve the following main tasks:

1. Characteristics of the level of development of phenomena.

2. Comparison of two or more levels.

3. Study of the interrelations of socio-economic phenomena.

4. Analysis of the location of socio-economic phenomena in space.

Statistical averages are calculated on the basis of mass data from correctly statistically organized mass observation (continuous and selective). In this case, the statistical average will be objective and typical if it is calculated from mass data for a qualitatively homogeneous population (mass phenomena). For example, if you calculate the average wage in cooperatives and state-owned enterprises, and extend the result to the entire population, then the average is fictitious, since it is calculated for a heterogeneous population, and such an average loses all meaning.

With the help of the average, differences in the value of a characteristic that arise for one reason or another in individual units of observation are smoothed out. For example, the average output of a salesperson depends on many reasons: qualifications, length of service, age, form of service, health, etc.

The essence of the average lies in the fact that it cancels out the deviations of the characteristic values ​​of individual units of the population caused by the action of random factors, and takes into account changes caused by the action of basic factors. This allows the average to reflect the typical level of the trait and abstract from the individual characteristics inherent in individual units.

The average value is a reflection of the values ​​of the characteristic being studied, therefore, it is measured in the same dimension as the given characteristic.

Each average value characterizes the population under study according to any one characteristic. In order to obtain a complete and comprehensive picture of the population being studied according to a number of essential characteristics, in general it is extremely important to have a system of average values ​​that can describe the phenomenon from different angles.

There are different averages:

Arithmetic mean;

Geometric mean;

Harmonic mean;

Mean square;

Average chronological.


Department of Statistics

COURSE WORK

THEORY OF STATISTICS

On the topic: Average values

Completed by: Group number: STP - 72

Yunusova Gulnazia Chamilevna

Checked by: Serga Lyudmila Konstantinovna


Introduction

1. The essence of average values, general principles of application

2. Types of average values ​​and scope of their application

2.1 Power averages

2.1.1 Arithmetic mean

2.1.2 Harmonic mean value

2.1.3 Geometric mean value

2.1.4 Root mean square value

2.2. Structural averages

2.2.1 Median

3. Basic methodological requirements for the correct calculation of average values

Conclusion

List of used literature


Introduction

The history of the practical use of averages goes back tens of centuries. The main purpose of calculating the average was to study the proportions between values. The importance of calculating average values ​​has increased in connection with the development of probability theory and mathematical statistics. Solving many theoretical and practical problems would be impossible without calculating the average and assessing the variability of individual values ​​of a characteristic.

Scientists from different fields have sought to define the average. For example, the outstanding French mathematician A. L. Cauchy (1789 - 1857) held that the average of several quantities is a new quantity that lies between the smallest and the largest of the quantities under consideration.

However, the Belgian statistician A. Quetelet (1796 - 1874) should be considered the creator of the theory of averages. He made an attempt to determine the nature of average values ​​and the patterns manifested in them. According to Quetelet, constant causes act equally (constantly) on every phenomenon under study. It is they who make these phenomena similar to each other and create patterns common to all of them.

A consequence of A. Quetelet's teaching about general and individual causes was the identification of average values ​​as the main technique of statistical analysis. He emphasized that statistical averages are not just a measure of mathematical measurement, but a category of objective reality. He identified the typical, really existing average with the true value, deviations from which can only be random.

A clear expression of the stated view of the average is his theory of the “average man,” i.e. a person of average height, weight, strength, average chest volume, lung capacity, average visual acuity and normal complexion. The average characterizes the “true” type of a person; all deviations from this type indicate ugliness or disease.

The views of A. Quetelet were further developed in the works of the German statistician W. Lexis (1837 - 1914).

Another version of the idealistic theory of averages is based on the philosophy of Machism. Its founder was the English statistician A. Bowley (1869 - 1957). He saw averages as a way to describe the quantitative characteristics of a phenomenon as simply as possible. Defining the meaning of averages or, as he puts it, “their function,” Bowley brings to the fore the Machian principle of economy of thought. Thus, he wrote that the function of averages is clear: it is to express a complex group using a few simple numbers. The mind cannot grasp millions of statistical data at once; they must be grouped, simplified, and reduced to averages.

Another follower of A. Quetelet was the Italian statistician C. Gini (1884 - 1965), author of the major monograph “Average Values”. C. Gini criticized the definition of the average given by the Soviet statistician A. Ya. Boyarsky and formulated his own: “The average of several quantities is the result of actions performed according to a certain rule on the given quantities, and represents either one of the given quantities, which is no more and no less than all the others (real or effective average), or some new value intermediate between the smallest and largest of the given values (countable average).”

In this course work we will consider in detail the main problems of the theory of averages. In the first chapter we will reveal the essence of average values ​​and general principles of application. In the second chapter we will consider the types of averages and the scope of their application using specific examples. The third chapter will discuss the basic methodological requirements for calculating average values.


1. The essence of average values, general principles of application

Average values are among the most common generalizing statistical indicators. Their purpose is to characterize with a single number a statistical population consisting of a large number of units. Average values are closely related to the law of large numbers. The essence of this relationship is that, with a large number of observations, random deviations from the general pattern cancel each other out, and the statistical regularity shows up more clearly in the average.

The average value is a general indicator characterizing the typical level of a phenomenon in specific conditions of place and time. It expresses the level of a characteristic typical for each unit of the population.

The average is an objective characteristic only for homogeneous phenomena. Averages for heterogeneous populations are called sweeping and can only be used in combination with partial averages of homogeneous populations.

The average is used in statistical studies to assess the current level of a phenomenon, to compare several populations with each other on the same basis, to study the dynamics of the development of the phenomenon being studied over time, to study the interrelations of phenomena.

Averages are widely used in various planning, forecasting, and financial calculations.

The main significance of average values lies in their generalizing function, i.e. replacing many different individual values of a characteristic with one average value that characterizes the entire set of phenomena. Everyone knows the developmental features of modern people that show up, among other things, in the greater height of sons compared with fathers, and of daughters compared with mothers, at the same age. But how can this phenomenon be measured?

In different families, the ratios between the heights of the older and younger generations vary widely. Not every son is taller than his father, and not every daughter is taller than her mother. But if you measure the heights of many thousands of individuals, then the average heights of sons and fathers, daughters and mothers let you establish both the fact of acceleration itself and the typical average increase in height over one generation.

To produce the same quantity of goods of a certain type and quality, different producers (factories, firms) spend unequal amounts of labor and material resources. But the market averages these costs, and the cost of the product is determined by the average consumption of resources for production.

The weather at a given point on the globe on the same day in different years can be very different. For example, in St. Petersburg on March 31, the air temperature over more than a hundred years of observations ranged from -20.1° in 1883 to +12.24° in 1920. The fluctuations are approximately the same on other days of the year. Individual weather data for any single year cannot give an idea of the climate of St. Petersburg. Climate characteristics are average weather characteristics over a long period: air temperature, humidity, wind speed, amount of precipitation, number of hours of sunshine per week, month and year, etc.

If the average value generalizes qualitatively homogeneous values of a characteristic, then it is a typical measure of that characteristic in the given population. Thus, we can talk about measuring the typical height of Russian girls born in 1973 when they reach the age of 20. Another typical measure would be the average milk yield from black-and-white cows in the first year of lactation at a feeding rate of 12.5 feed units per day.

However, it is incorrect to reduce the role of average values ​​only to the characteristics of typical values ​​of characteristics in populations homogeneous for a given characteristic. In practice, much more often, modern statistics use average values ​​that generalize clearly heterogeneous phenomena, such as, for example, the yield of all grain crops throughout Russia. Or consider such an average as the average consumption of meat per capita: after all, among this population there are children under one year old who do not consume meat at all, and vegetarians, and northerners, and southerners, miners, athletes and pensioners. The atypicality of such an average indicator as the average national income produced per capita is even clearer.

The average national income per capita, the average grain yield throughout the country, the average consumption of various food products - these are the characteristics of the state as a single national economic system, these are the so-called system averages.

System averages can characterize both spatial or object systems that exist simultaneously (state, industry, region, planet Earth, etc.) and dynamic systems extended over time (year, decade, season, etc.).

An example of a system average characterizing a period of time is the average air temperature in St. Petersburg for 1992, equal to +6.3°. This average generalizes the extremely heterogeneous temperatures of frosty winter days and nights, hot summer days, spring and autumn. 1992 was a warm year; its average temperature is not typical for St. Petersburg. As the typical average annual air temperature in the city, one should use the long-term average, say, for the 30 years from 1963 to 1992, which is +5.05°. This average is a typical average, since it generalizes homogeneous values: average annual temperatures at the same geographical location, varying over 30 years from +2.90° in 1976 to +7.44° in 1989.

General theory of statistics: lecture notes Konik Nina Vladimirovna

2. Types of averages


In statistics, various types of averages are used, which are divided into two large classes:

1) power means (harmonic mean, geometric mean, arithmetic mean, quadratic mean, cubic mean);

2) structural averages (mode, median).

To calculate power means, it is necessary to use all available values of the characteristic. The mode and median are determined only by the structure of the distribution; therefore, they are called structural, or positional, averages. The median and mode are often used as an average characteristic in populations where calculating a power mean is impossible or impractical.

The most common type of average is the arithmetic mean. The arithmetic mean is the value of a characteristic that each unit of the population would have if the total sum of all values of the characteristic were distributed evenly among all units of the population. In the general case, it is calculated by summing all values of the varying characteristic and dividing the sum by the total number of units in the population. For example, five workers fulfilled an order for the manufacture of parts: the first produced 5 parts, the second 7, the third 4, the fourth 10, and the fifth 12. Since each value occurs only once in the source data, the average output of one worker is found with the simple arithmetic mean formula:

$$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n},$$

i.e. in our example, the average output of one worker is

$$\bar{x} = \frac{5 + 7 + 4 + 10 + 12}{5} = \frac{38}{5} = 7.6 \text{ parts}.$$
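The worker example can be checked with a few lines of Python. This is only a minimal sketch; the function and variable names are illustrative and not part of the original text.

```python
def arithmetic_mean(values):
    """Simple arithmetic mean: the sum of the values divided by their count."""
    if not values:
        raise ValueError("at least one value is required")
    return sum(values) / len(values)


# Parts produced by each of the five workers in the example.
parts_per_worker = [5, 7, 4, 10, 12]

average_output = arithmetic_mean(parts_per_worker)
print(average_output)  # 38 parts / 5 workers
```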

Along with the simple arithmetic mean, the weighted arithmetic mean is used. For example, let us calculate the average age of students in a group of 20 people whose ages range from 18 to 22 years, where $x_i$ are the values of the characteristic being averaged and $f_i$ is the frequency, which shows how many times the i-th value occurs in the population.

The weighted arithmetic mean formula is:

$$\bar{x} = \frac{\sum x_i f_i}{\sum f_i}$$
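The table with the actual age distribution is not reproduced in this text, so the sketch below uses hypothetical frequencies for a group of 20 students aged 18 to 22. It only illustrates how the weighted arithmetic mean (sum of value times frequency, divided by the sum of frequencies) would be computed; the counts are an assumption, not the source data.

```python
def weighted_arithmetic_mean(values, frequencies):
    """Weighted arithmetic mean: sum(x_i * f_i) / sum(f_i)."""
    if len(values) != len(frequencies):
        raise ValueError("values and frequencies must have the same length")
    total_frequency = sum(frequencies)
    if total_frequency == 0:
        raise ValueError("the sum of frequencies must be positive")
    weighted_sum = sum(x * f for x, f in zip(values, frequencies))
    return weighted_sum / total_frequency


ages = [18, 19, 20, 21, 22]
# HYPOTHETICAL frequencies (number of students of each age); they sum to 20.
student_counts = [2, 4, 8, 4, 2]

average_age = weighted_arithmetic_mean(ages, student_counts)
print(average_age)
```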

There is a rule for choosing the weighted arithmetic mean: if there are data on two interrelated indicators, for one of which the average must be calculated, and the values of the denominator of its logical formula are known while the values of the numerator are unknown but can be found as the product of these indicators, then the average should be calculated using the weighted arithmetic mean formula.

In some cases, the nature of the initial data is such that calculating the arithmetic mean loses its meaning, and the only suitable generalizing indicator is another type of mean, the harmonic mean. With the widespread use of electronic computing, the computational properties of the arithmetic mean have lost their relevance for calculating general statistical indicators. The harmonic mean, which can also be simple or weighted, has acquired great practical importance. If the values of the numerator of the logical formula are known but the values of the denominator are not, the average is calculated using the weighted harmonic mean formula.

If, when using the harmonic mean, the weights of all values ($f_i$) are equal, then instead of the weighted form you can use the simple (unweighted) harmonic mean:

$$\bar{x}_{harm} = \frac{n}{\sum_{i=1}^{n} \frac{1}{x_i}},$$

where $x_i$ are the individual values;

n – number of values of the characteristic being averaged.

For example, simple harmonic mean can be applied to speed if the path segments covered at different speeds are equal.
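A minimal sketch of the speed case, assuming two equal path segments driven at hypothetical speeds of 60 and 40 km/h (the numbers and names are illustrative only). It computes the simple harmonic mean and compares it with the average speed obtained directly as total distance divided by total time.

```python
import math


def harmonic_mean(values):
    """Simple harmonic mean: n / sum(1 / x_i). Requires positive values."""
    if not values or any(x <= 0 for x in values):
        raise ValueError("values must be non-empty and strictly positive")
    return len(values) / sum(1 / x for x in values)


speeds_kmh = [60, 40]   # hypothetical speeds on two segments
segment_km = 120        # hypothetical length of EACH segment (equal segments)

average_speed = harmonic_mean(speeds_kmh)

# Cross-check: total distance divided by total travel time.
total_time_h = sum(segment_km / v for v in speeds_kmh)
direct_average = len(speeds_kmh) * segment_km / total_time_h

print(average_speed, direct_average)
print(math.isclose(average_speed, direct_average))
```

Note that the plain arithmetic mean of the two speeds (50 km/h) would overstate the result, because more time is spent on the slower segment.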

Any average value must be calculated so that when it replaces each variant of the averaged characteristic, the value of some final, general indicator that is associated with the averaged indicator does not change. Thus, when replacing actual speeds on individual sections of the path with their average value (average speed), the total distance should not change.

The average formula is determined by the nature (mechanism) of the relationship between this final indicator and the averaged indicator. Therefore, the final indicator, the value of which should not change when replacing the options with their average value, is called the determining indicator. To derive the formula for the average, you need to create and solve an equation using the relationship between the averaged indicator and the determining one. This equation is constructed by replacing the variants of the characteristic (indicator) being averaged with their average value.
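As an illustration of this procedure, the speed example can be written out as an equation. The symbols $v_i$ (speed on segment i), $t_i$ (time spent on it) and $s_i$ (its length) are introduced here for the sketch and do not appear in the original text; the determining indicator is the total distance S.

```latex
\begin{align*}
  % Determining indicator: total distance as a function of the speeds.
  S &= \sum_i v_i t_i \\
  % Replace every v_i with the average \bar{v}; S must not change.
  \sum_i \bar{v}\, t_i &= \sum_i v_i t_i
  \quad\Longrightarrow\quad
  \bar{v} = \frac{\sum_i v_i t_i}{\sum_i t_i} \\
  % If only the segment lengths s_i = v_i t_i are known, substitute t_i = s_i / v_i.
  \bar{v} &= \frac{\sum_i s_i}{\sum_i \dfrac{s_i}{v_i}} \\
  % For n equal segments (s_i = s) the lengths cancel.
  \bar{v} &= \frac{n}{\sum_i \dfrac{1}{v_i}}
\end{align*}
```

With known times the result is a weighted arithmetic mean; with known segment lengths it is a weighted harmonic mean, which reduces to the simple harmonic mean when the segments are equal.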

In addition to the arithmetic mean and the harmonic mean, other types (forms) of the mean are used in statistics. All of them are special cases of the power mean. If all types of power means are calculated for the same data, their values will generally differ; the rule of majorance of means applies here: as the exponent of the mean increases, the mean itself increases.

The geometric mean is used when there are n growth coefficients, where the individual values of the characteristic are, as a rule, relative measures of dynamics built as chain values, i.e. as the ratio of each level of a time series to the previous level. The mean thus characterizes the average growth rate. The simple geometric mean is calculated using the formula:

$$\bar{x}_{geom} = \sqrt[n]{x_1 \cdot x_2 \cdots x_n} = \sqrt[n]{\prod_{i=1}^{n} x_i}$$

The weighted geometric mean formula is as follows:

$$\bar{x}_{geom} = \sqrt[\sum f_i]{\prod x_i^{f_i}}$$

The above formulas are identical, but one is applied for current coefficients or growth rates, and the second is applied for absolute values ​​of series levels.
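A short sketch of the average growth rate, using a hypothetical time series (the levels below are invented for illustration). It computes the geometric mean of the chain growth coefficients and, for comparison, the same quantity obtained from the first and last absolute levels of the series.

```python
import math


def geometric_mean(values):
    """Simple geometric mean: n-th root of the product of n positive values."""
    if not values or any(x <= 0 for x in values):
        raise ValueError("values must be non-empty and strictly positive")
    return math.prod(values) ** (1 / len(values))


# HYPOTHETICAL levels of a time series (e.g. output in consecutive years).
levels = [100.0, 110.0, 121.0, 133.1]

# Chain growth coefficients: each level divided by the previous one.
chain_coefficients = [curr / prev for prev, curr in zip(levels, levels[1:])]
average_growth = geometric_mean(chain_coefficients)

# The product of chain coefficients equals last level / first level,
# so the same average follows from the absolute levels alone.
from_end_levels = (levels[-1] / levels[0]) ** (1 / (len(levels) - 1))

print(average_growth, from_end_levels)
```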

The quadratic mean (root mean square) is used when working with values of quadratic functions; it serves to measure the degree of variation of individual values of a characteristic around the arithmetic mean in a distribution series and is calculated by the formula:

$$\bar{x}_{quad} = \sqrt{\frac{\sum x_i^2}{n}}$$

The weighted quadratic mean is calculated using the formula:

$$\bar{x}_{quad} = \sqrt{\frac{\sum x_i^2 f_i}{\sum f_i}}$$

The cubic mean is used when working with values of cubic functions and is calculated using the formula:

$$\bar{x}_{cub} = \sqrt[3]{\frac{\sum x_i^3}{n}}$$

and the weighted cubic mean:

$$\bar{x}_{cub} = \sqrt[3]{\frac{\sum x_i^3 f_i}{\sum f_i}}$$

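All the averages discussed above can be expressed by one general formula:

$$\bar{x} = \sqrt[k]{\frac{\sum x_i^k}{n}},$$

where $\bar{x}$ is the average value;

$x_i$ – individual value;

n – number of units of the studied population;

k – exponent that determines the type of average (k = -1 gives the harmonic mean, k → 0 the geometric mean, k = 1 the arithmetic mean, k = 2 the quadratic mean, k = 3 the cubic mean).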

When the same initial data are used, the larger k is in the general power mean formula, the larger the average value. It follows that there is a regular relationship between the values of the power means:

$$\bar{x}_{harm} \le \bar{x}_{geom} \le \bar{x}_{arith} \le \bar{x}_{quad} \le \bar{x}_{cub}$$
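The ordering of power means can be observed numerically. The sketch below implements the general power mean for an exponent k, treating k = 0 as the geometric mean (the limiting case), and applies it to the worker outputs from the arithmetic mean example; the function name is illustrative.

```python
import math


def power_mean(values, k):
    """General power mean of positive values for exponent k.

    k = -1: harmonic, k = 0: geometric (limit), k = 1: arithmetic,
    k = 2: quadratic, k = 3: cubic.
    """
    if not values or any(x <= 0 for x in values):
        raise ValueError("values must be non-empty and strictly positive")
    n = len(values)
    if k == 0:
        # Limit of the power mean as k -> 0 is the geometric mean.
        return math.exp(sum(math.log(x) for x in values) / n)
    return (sum(x ** k for x in values) / n) ** (1 / k)


data = [5, 7, 4, 10, 12]  # parts produced by the five workers
exponents = [("harmonic", -1), ("geometric", 0), ("arithmetic", 1),
             ("quadratic", 2), ("cubic", 3)]
means = {name: power_mean(data, k) for name, k in exponents}

for name, value in means.items():
    print(f"{name:>10}: {value:.4f}")
```

Because the values in `data` are not all equal, each mean is strictly larger than the previous one; the means coincide only when every value is the same.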

The average values described above give a generalized idea of the population being studied, and from this point of view their theoretical, applied and educational significance is indisputable. However, the average value may not coincide with any of the values that actually occur. Therefore, in addition to the averages considered, statistical analysis also uses specific values that occupy a definite position in the ordered (ranked) series of values of the characteristic. Among these quantities, the most commonly used are the structural (or descriptive) averages: the mode (Mo) and the median (Me).

Mode – the value of a characteristic that occurs most often in a given population. In a variation series, the mode is the most frequently occurring value of the ranked series, that is, the value with the highest frequency. The mode can be used to determine which stores are visited most often or the most common price for a product. It shows the size of the characteristic that is typical of a significant part of the population and, for an interval series, is determined by the formula:

$$Mo = x_0 + h \cdot \frac{f_m - f_{m-1}}{(f_m - f_{m-1}) + (f_m - f_{m+1})},$$

where $x_0$ – lower limit of the modal interval;

h – interval width;

$f_m$ – frequency of the modal interval;

$f_{m-1}$ – frequency of the previous interval;

$f_{m+1}$ – frequency of the next interval.
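A minimal sketch of the mode of an interval series, using the standard formula Mo = x0 + h · (f_m − f_prev) / ((f_m − f_prev) + (f_m − f_next)). It assumes intervals of equal width and uses a hypothetical frequency table; missing neighbours of the first or last interval are treated as having frequency 0.

```python
def interval_mode(lower_bounds, width, frequencies):
    """Mode of an interval series with equal-width intervals."""
    if not frequencies or len(lower_bounds) != len(frequencies):
        raise ValueError("lower_bounds and frequencies must match and be non-empty")
    # Index of the modal interval (the first one with the highest frequency).
    m = max(range(len(frequencies)), key=frequencies.__getitem__)
    f_m = frequencies[m]
    f_prev = frequencies[m - 1] if m > 0 else 0
    f_next = frequencies[m + 1] if m + 1 < len(frequencies) else 0
    denominator = (f_m - f_prev) + (f_m - f_next)
    if denominator == 0:
        raise ValueError("the mode is not defined: neighbouring frequencies are equal")
    return lower_bounds[m] + width * (f_m - f_prev) / denominator


# HYPOTHETICAL interval series: intervals 10-20, 20-30, 30-40, 40-50.
lower_bounds = [10, 20, 30, 40]
frequencies = [4, 10, 6, 2]

mode_value = interval_mode(lower_bounds, width=10, frequencies=frequencies)
print(mode_value)  # 20 + 10 * 6 / (6 + 4)
```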

The median is the value located at the center of the ranked series. The median divides the series into two equal parts, so that the same number of population units lies on either side of it: one half of the units has a value of the varying characteristic less than the median, and the other half has a value greater than it. In other words, the median is the element whose value is greater than or equal to one half of the elements of the distribution series and, at the same time, less than or equal to the other half. The median gives a general idea of where the values of the characteristic are concentrated, that is, where their center is.

The descriptive nature of the median shows in the fact that it marks the quantitative boundary of the values of a varying characteristic that half of the units in the population possess. Finding the median of a discrete variation series is easy. If all units of the series are given ordinal numbers, then for an odd number of terms n the ordinal number of the median value is (n + 1)/2. If the number of terms is even, the median is the average of the two values with ordinal numbers n/2 and n/2 + 1.
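The rule for a discrete series can be sketched as follows. The function name is illustrative; the first data set is the worker output example, and the second simply drops one value to show the even case.

```python
def discrete_median(values):
    """Median of a discrete series using the ordinal-number rule."""
    ranked = sorted(values)
    n = len(ranked)
    if n == 0:
        raise ValueError("at least one value is required")
    middle = n // 2
    if n % 2 == 1:
        # Odd n: the value with (1-based) ordinal number (n + 1) / 2.
        return ranked[middle]
    # Even n: average of the values with ordinal numbers n/2 and n/2 + 1.
    return (ranked[middle - 1] + ranked[middle]) / 2


odd_series = [5, 7, 4, 10, 12]   # ranked: 4, 5, 7, 10, 12
even_series = [5, 7, 4, 10]      # ranked: 4, 5, 7, 10

print(discrete_median(odd_series))   # third value of the ranked series
print(discrete_median(even_series))  # average of the second and third values
```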

To determine the median of an interval variation series, first find the interval that contains it (the median interval). This is the first interval whose accumulated sum of frequencies equals or exceeds half the sum of all frequencies of the series. The median of an interval variation series is then calculated using the formula:

$$Me = x_0 + h \cdot \frac{\frac{1}{2}\sum f - S_{m-1}}{f_m},$$

where $x_0$ – lower limit of the median interval;

h – interval width;

$f_m$ – frequency of the median interval;

$\sum f$ – number of members of the series (the sum of all frequencies);

$S_{m-1}$ – accumulated sum of the frequencies of the intervals preceding the median interval.
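A minimal sketch of the median of an interval series, using the standard formula Me = x0 + h · (Σf / 2 − S_prev) / f_m, where S_prev is the cumulative frequency before the median interval. Equal interval widths and the frequency table below are assumptions made for illustration.

```python
def interval_median(lower_bounds, width, frequencies):
    """Median of an interval series with equal-width intervals."""
    if len(lower_bounds) != len(frequencies):
        raise ValueError("lower_bounds and frequencies must have the same length")
    total = sum(frequencies)
    if total <= 0:
        raise ValueError("the sum of frequencies must be positive")
    half = total / 2
    cumulative = 0  # sum of frequencies of the intervals already passed
    for x0, f_m in zip(lower_bounds, frequencies):
        # Median interval: first interval whose cumulative frequency reaches half.
        if cumulative + f_m >= half:
            return x0 + width * (half - cumulative) / f_m
        cumulative += f_m
    raise RuntimeError("median interval not found")  # not reachable for valid input


# HYPOTHETICAL interval series: intervals 10-20, 20-30, 30-40, 40-50.
lower_bounds = [10, 20, 30, 40]
frequencies = [4, 10, 6, 2]

median_value = interval_median(lower_bounds, width=10, frequencies=frequencies)
print(median_value)  # 20 + 10 * (11 - 4) / 10
```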

Along with the median, to more fully characterize the structure of the population under study, other values ​​of options that occupy a very specific position in the ranked series are also used. These include quartiles and deciles. Quartiles divide the series by the sum of frequencies into four equal parts, and deciles into ten equal parts. There are three quartiles and nine deciles.
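For ungrouped data, quartiles and deciles can be obtained with Python's standard library. The sketch below uses `statistics.quantiles`, which returns the n − 1 cut points that divide the data into n parts; the data are hypothetical, and the interpolation method is the library default rather than anything prescribed by the text.

```python
import statistics

# HYPOTHETICAL ungrouped observations.
observations = [12, 15, 11, 19, 22, 14, 17, 25, 13, 18, 21]

quartiles = statistics.quantiles(observations, n=4)   # three cut points
deciles = statistics.quantiles(observations, n=10)    # nine cut points

print("quartiles:", quartiles)
print("deciles:", deciles)

# The second quartile coincides with the median.
print(quartiles[1], statistics.median(observations))
```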

The median and mode, unlike the arithmetic mean, do not eliminate individual differences in the values ​​of a variable characteristic and therefore are additional and very important characteristics of the statistical population. In practice, they are often used instead of the average or along with it. It is especially advisable to calculate the median and mode in cases where the population under study contains a certain number of units with a very large or very small value of the varying characteristic. These values ​​of the options, which are not very characteristic of the population, while affecting the value of the arithmetic mean, do not affect the values ​​of the median and mode, which makes the latter very valuable indicators for economic and statistical analysis.
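The robustness of the median to extreme values is easy to demonstrate. In this sketch the wage figures are hypothetical; one very large value is added to a small data set, and the shift of the arithmetic mean is compared with the shift of the median.

```python
import statistics

# HYPOTHETICAL wages (in arbitrary units) of five employees.
wages = [30, 32, 35, 36, 38]
# The same group plus one unit with a very large value of the characteristic.
wages_with_outlier = wages + [500]

mean_before = statistics.mean(wages)
mean_after = statistics.mean(wages_with_outlier)
median_before = statistics.median(wages)
median_after = statistics.median(wages_with_outlier)

print("mean:  ", mean_before, "->", mean_after)
print("median:", median_before, "->", median_after)
```

The single extreme value pulls the arithmetic mean far away from the bulk of the data, while the median moves only slightly.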
