
Electronic means of collecting, processing and displaying information. Binary logarithmic measure


On the justification of the logarithmic measure of information

Information theory has now gone far beyond the narrow framework of communication systems, where it was originally applied, and has come to be widely used in such non-traditional fields as physics, systems theory, control theory, biology, and mathematics. It has found especially wide application in such relatively new fields as computer science, automata theory, and data protection.

Therefore, further analysis of the foundations of information theory is needed in order to penetrate its essence, which today still remains largely mysterious, and to identify new possibilities for applying it to practical problems.

The most important issue on which any particular information theory is built is the choice of the information measure.

This choice is largely determined by the objects at whose analysis the theory being developed is aimed.

At present, the Hartley and Shannon measures are most widely used in information theory, and in a number of cases the Hartley measure is presented as a special case of the Shannon measure.

However, in its purpose the Hartley measure differs significantly from the Shannon measure: the first is aimed at studying deterministic (non-probabilistic) processes of finite length, while the second is aimed at analyzing probabilistic processes of arbitrary duration, for whose analysis statistical methods are used.

Accordingly, information theory using one or another of these measures is called structural or statistical information theory.

The finiteness of the lengths of the analyzed data sets makes it possible to count their number by simple enumeration or by any mathematical methods, and also to use known non-probabilistic methods of information analysis, for example the theory of finite predicates or group theory. As a result, structural information theory has yielded coding methods that cannot be developed on the basis of statistical information theory.

At the same time, statistical theory allows one to obtain limit theorems and conduct information analysis of messages based on a statistical data set, rather than analyzing each message separately, as is the case in structural information theory.

The logarithmic measure I = log_a N underlying the structural theory of information, where a and N are any finite positive numbers not equal to 0 or 1, was proposed by Hartley in 1928. He did not substantiate it logically but introduced it on the basis of intuitive considerations. Moreover, and importantly, in this form it can take both positive values, for N > 1, and negative values, for N < 1.

At present, it is justified by its additivity property, which manifests itself in the fact that the total information generated jointly by two sources of information is equal to the sum of the separate amounts of information from each of them, as shown, for example, in the literature.

Indeed, if the two sources generate N1 and N2 messages respectively, then their total number is

N = N1 · N2. (1)

Taking the logarithm of expression (1), we obtain the equality

log N = log N1 + log N2,

which proves the additivity property of the Hartley information measure.
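
This additivity is easy to check numerically. The following Python sketch uses two arbitrary illustrative source sizes (N1 = 8 and N2 = 32 are not taken from the article) and verifies that the logarithm of the product equals the sum of the logarithms:

import math

N1, N2 = 8, 32                         # illustrative source sizes, not from the article
N = N1 * N2                            # total number of joint messages, expression (1)
I_joint = math.log2(N)                 # information of the joint source
I_sum = math.log2(N1) + math.log2(N2)  # sum of the individual informations
print(I_joint, I_sum)                  # both print 8.0
assert math.isclose(I_joint, I_sum)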

Let us consider another justification of the Hartley measure as applied to search problems (continuous and discrete).

A feature of discrete search problems is the finiteness of the initial set of N objects, which with equal probability represent possible solutions of the problem and among which, presumably, there is one desired object. Its search is carried out in the process of solving the discrete problem, as happens, for example, in the well-known traveling salesman problem.

In this problem, the desired object is a route of minimum length, chosen from some initial finite number of possible routes.

The solution of such problems, one way or another, consists in sequentially partitioning the initial set of possible objects (solutions) into equivalence classes and testing each of them for the presence of the desired object. In the process of testing, the uncertainty about the location of the desired object is removed, which is accompanied by the generation of a corresponding amount of information.

A special case of partitioning arises when the initial N possible objects are split into m equivalence classes so that each class contains a strictly equal whole number of objects.

Obviously, such partitions are possible only if

N = m^k,

where k is the maximum number of partitions before the appearance of a class with one object.

If we take k as the measure of information in this case, then it exactly coincides with the logarithmic Hartley measure taken to the base m:

k = log_m N.

Thus, the number of partitions k in an equiprobable discrete search for an object among N possible ones is the logarithmic Hartley measure of information; conversely, the Hartley measure represents, for the case under consideration, the number of uniform partitions of the set of objects into equivalence classes until a class with the single desired object appears.
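
This equality can be illustrated by a short simulation, a sketch that assumes N is an exact power of m (the values N = 81 and m = 3 are illustrative, not taken from the text): repeatedly splitting the candidate set into m equal classes and keeping the class that contains the target takes exactly log_m N tests.

import math

def partitions_to_find(n_objects, m):
    # Count uniform partitions into m equal classes until one object remains.
    # Assumes n_objects is an exact power of m, as in the equiprobable case above.
    count = 0
    while n_objects > 1:
        n_objects //= m   # keep only the class that passed the test
        count += 1
    return count

N, m = 81, 3                      # illustrative values: N = 3**4 objects, m = 3 classes
print(partitions_to_find(N, m))   # 4
print(math.log(N, m))             # 4.0 (up to rounding), the Hartley measure to base m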

In the general case, when the original set of N objects is partitioned into m equivalence classes, each of them can contain N_i objects and, accordingly, the probability of finding the desired object in a particular class is equal to

p_i = N_i / N.

Here N_1 + N_2 + ... + N_m = N and p_1 + p_2 + ... + p_m = 1.

Shannon's formula for entropy, which determines the measure of uncertainty about the location of the desired object among the equivalence classes before testing, is, for the first partition,

H_1 = -(p_1·log2 p_1 + p_2·log2 p_2 + ... + p_m·log2 p_m), (6)

and it asserts that the value of the entropy reaches a maximum for the first partition,

H_1max = log2 m,

when the desired object is found in the equivalence classes with equal probabilities

p_1 = p_2 = ... = p_m = 1/m.

In this case, for partitions into m = 2 classes, H_1max = 1 bit.

Accordingly, the maximum amount of information generated by the test in the process of removing this entropy will also be equal to this value:

I_1max = H_1max.

Similarly, at the remaining partitions, if the probabilities of finding the desired object in the new equivalence classes are equal, the maximum amount of information will be obtained, equal to 1 bit per partition into two classes.

It follows from this that, in order to obtain the maximum amount of information from each test, the set of objects must at every partition be divided into equivalence classes with an equal number of objects in each of them.

Since the Hartley measure, in relation to the problem under consideration, uses precisely such partitions, it determines the maximum possible amount of information obtained in the process of searching for a discrete object; and if this is so, then the number of partitions and, accordingly, the search time will be minimal compared to any other possible partitions. This is precisely the fundamental feature of the Hartley information measure.

Fig. 1 shows a tree of 3 uniform partitions of the original N = 8 objects into 2 equivalence classes. Its vertices contain the numbers of objects in the resulting equivalence classes. In this case, the maximum amount of information, 1 bit, is generated at each vertex, and the sum over all partitions is 3 bits.

Figure 1 - Tree of uniform partitions for N = 8, m = 2, k = 3

Obviously, the number of uniform partitions for this case is minimal.

Another partition tree, shown in Fig. 2 for non-uniform partitions of the same objects into 2 equivalence classes, has an average number of partitions, taken over all possible partition paths, that is greater than the average number of partitions, equal to 3, obtained in the previous example.

This is due to the fact that the amount of information generated at each partition in accordance with Shannon's formula (6) is less than 1 bit, that is, the search time for the desired object is not minimal.

In this case, the basic rule of information retrieval must be fulfilled, which we formulate as follows.

The amount of information required to find one whole desired object does not depend on the method of partitioning the original set of N objects into equivalence classes; it remains constant and equal to log2 N.

This means that, whatever the partition tree of the initial set of objects, the amount of information required to find one of them will always be the same, log2 N.

Figure 2 - Tree of non-uniform partitions of the same N = 8 objects into m = 2 equivalence classes
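
The contrast between Figs. 1 and 2 can be reproduced numerically. The sketch below assumes N = 8 equiprobable objects and a fixed uneven split ratio of 0.75 as an illustrative stand-in for the non-uniform tree (the exact tree of Fig. 2 is not reproduced here): the expected number of yes/no tests reaches its minimum, log2 8 = 3, only when every partition halves the current class, while the required amount of information stays at 3 bits in both cases.

import math

def expected_tests(n_left, split):
    # Expected number of yes/no tests needed to isolate one of n_left equiprobable
    # objects when every class is split into fractions `split` and `1 - split`.
    if n_left <= 1:
        return 0.0
    # The desired object falls into each part with probability proportional to its size.
    return (1
            + split * expected_tests(n_left * split, split)
            + (1 - split) * expected_tests(n_left * (1 - split), split))

N = 8
print(expected_tests(N, 0.5))    # 3.0 tests, as in the uniform tree of Fig. 1
print(expected_tests(N, 0.75))   # about 4.6 tests on average, the situation of Fig. 2
print(math.log2(N))              # 3.0 bits, the invariant amount of required information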

Partitions into equivalence classes are widespread in practice. Thus, the positional coding of words and numbers is based on them: it takes place in the process of sequentially partitioning their original sets into equivalence classes by means of letters and digits, which serve as the labels of these classes. Together these letters and digits form alphabets, and the number of classes into which the original sets of words and numbers are divided is the cardinality of these alphabets. The number of partitions determines the length of the words and numbers.

Consequently, each letter or digit of a word or number indicates the equivalence class to which the word or number belongs at the corresponding partition.
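
The connection between positional coding and repeated partitioning can be made concrete with a small sketch (the alphabet size and the coded values are illustrative assumptions): each digit of a base-m representation names the equivalence class chosen at the corresponding partition step, so the word length equals the number of partitions.

def positional_code(index, m, length):
    # Encode `index` (0 <= index < m**length) as `length` base-m digits.
    # Each digit names the equivalence class selected at one partition step.
    digits = []
    for _ in range(length):
        digits.append(index % m)
        index //= m
    return digits[::-1]   # most significant class label first

# Illustrative example: 1000 objects, alphabet of m = 10 digits -> words of length 3.
print(positional_code(427, 10, 3))   # [4, 2, 7]
print(positional_code(5, 2, 3))      # [1, 0, 1], i.e. three binary partitions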

The main expression of the information theory proposed by Shannon is

I = H_0 - H_res.

In relation to the search problem it asserts that the amount of information produced in the course of the search is equal to the difference between the initial entropy of the desired object,

H_0 = log2 N,

and the residual entropy

H_res = log2 N_res,

where N_res is the residual number of objects among which the desired one is located.

Obviously, in the process of partitioning and testing the number N_res decreases and, ultimately, when N_res = 1,

H_res = log2 1 = 0 and I = H_0 = log2 N.

The last expression represents an important condition, formulated in the literature as the principle of unitarity.

Its essence boils down to the fact that complete information about an object will be obtained if and only if one whole object is found in the search process.

If H_res > 0, this indicates that the information about the object has been transmitted to the receiver only partially.

A special case is N_res < 1, for which the residual entropy H_res = log2 N_res takes a negative value and, accordingly, I > log2 N.

This means that in the case under consideration, when N_res < 1, the testing generates additional information about details of the object belonging to new, previously unexplored equivalence classes. This procedure of detailing the object can continue indefinitely. For example, in the partition tree of Fig. 1, after a vertex (equivalence class) containing one object is reached at the 3rd partition, there can be vertices containing 0.5 object (4th partition), then 0.25, and so on. Each time the amount of information about the object increases by 1 bit and can reach any value.

This procedure confirms the well-known fact that any object can be cognized without limit; however, the principle of unitarity is then violated, since N_res < 1 and, accordingly, H_res ≠ 0, i.e. the analyzed object cannot be identified as an integral system.

All the above reasoning also applies to search problems in which the number of objects N is not an integer power of the number of classes m, provided that non-integer numbers of objects are admitted in the equivalence classes obtained in the process of the partitions.

For binary partitions of such a set, it follows from the inequality 2^(k-1) < N < 2^k that

k - 1 < log2 N < k

and, correspondingly,

H_(k-1) < H_0 < H_k,

where H_(k-1) = k - 1 is the entropy at N = 2^(k-1), and H_k = k is the entropy at N = 2^k.

Theorem 1. If the i-th partition (i ≤ k - 1) of the number of objects N, where 2^(k-1) < N < 2^k, contains 2^i equivalence classes with an equal number of objects in each, then the inequality N/2^i > 1 holds.

Proof.

From the condition 2^(k-1) < N and i ≤ k - 1 it follows, accordingly, that N/2^i > 2^(k-1-i) ≥ 1.

The theorem is proved.

Corollary 1. The entropy of the i-th partition is bounded by the inequality H_i = log2(N/2^i) > 0.

Theorem 2. If the k-th partition of the number of objects N, where 2^(k-1) < N < 2^k, contains 2^k equivalence classes with an equal number of objects in each, then the inequality N/2^k < 1 holds.

Proof. Since N < 2^k, then N/2^k < 1, where N/2^k is the number of objects placed in each equivalence class of the k-th partition.

Thus, the inequality follows immediately from the condition N < 2^k.

The theorem is proved.

Corollary 1. The residual entropy is bounded by the inequality H_res = log2(N/2^k) < 0.

Fig. 3 gives, as an example illustrating Theorems 1 and 2, a tree of three partitions with the initial number of objects N = 6. It can be seen from it that the classes of the second partition contain 1.5 objects each, and the classes of the third partition 0.75 objects each. Along the vertical axis of the figure are plotted the numbers of the original objects, and along the horizontal axis the value of the total information obtained after partitions 1, 2, 3 and the value of the residual information. The amount of information generated at each step remains constant and maximal, 1 bit.

Theorem 3. The inequality k - 1 < log2 N holds.

Proof. Since 2^(k-1) < N, taking the logarithm of this inequality, we get that k - 1 < log2 N.

The theorem is proved.

Figure 3 - Partition tree for N = 6

Theorem 4. The inequality log2 N < k holds.

Proof. Since N < 2^k, taking the logarithm of this inequality, we get that log2 N < k.

The theorem is proved.

Corollary 1.

Since during the partitions the classes obtained at the (k-1)-th partition contain more than one object each, while the classes of the k-th partition contain fewer than one object each, the amount of information about the object after the (k-1)-th partition,

I_(k-1) = k - 1 < log2 N,

is less than the amount required to identify the desired object, so the object cannot be fully determined; whereas after the k-th partition the amount of information,

I_k = k > log2 N,

comes in excess, and as a result not only the object itself is determined but also some of its details, which is superfluous for solving the search problem.

Moreover, only in the first case is the principle of unitarity violated; in the second, the principle is preserved and is even satisfied with a margin of reliability. Therefore, in practice, if the analyzed set contains N ≠ 2^k objects, it is replaced by the nearest set containing 2^k objects, and the search for the desired object is performed among the objects of this set.

Therefore, we can speak of a discrete (integer) measure of information, a variety of the logarithmic Hartley measure, equal to the average number of partitions into equivalence classes that contain, with equal probability, the same number of objects, carried out until the desired object is obtained. This measure can be used effectively in problems of discrete mathematics and combinatorics, where the solutions are integer objects.

However, partitions can also be made into a non-integer number m of equivalence classes. In this case, it is possible to achieve the fulfillment of the principle of unitarity for any real value of N by solving the equation

m^k = N

with respect to m.

For example, for N = 6 and k = 3, the value of m should be chosen approximately equal to 1.82. Then m^3 ≈ 6.

This means that N_res = 1 and, accordingly, the amount of information obtained in the 3 partitions will be equal to log2 6 ≈ 2.585 bits.
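
A quick check of this example, a sketch that assumes the reading m = N^(1/k) used above: for N = 6 objects and k = 3 partitions, the required number of classes and the total information come out as stated.

import math

N, k = 6, 3                      # six objects, three partitions (the example in the text)
m = N ** (1 / k)                 # number of equivalence classes per partition
info_per_partition = math.log2(m)
total_info = k * info_per_partition
print(round(m, 3))               # 1.817
print(round(total_info, 3))      # 2.585 = log2(6), so the unitarity principle holds
assert math.isclose(total_info, math.log2(N))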

In theoretical works, the base of the logarithm is often chosen equal to the number e, while in practice the base 2 is most often used; on its basis such a modern measure of information as the bit is obtained. That is, for this measure the initial set consists of N = 2 objects, and the desired object is found in one partition into 2 equivalence classes, each of which contains 1 object. The residual entropy in this case is equal to 0 and, accordingly, the principle of unitarity is observed for the bit.

The value of m obtained above for an integer number of partitions of the initial set of N objects can also be obtained on the basis of the following considerations.

The base of the logarithm m for which

log_m N = k,

where k is an integer number of partitions, can be found from the expression

log_m N = log2 N / log2 m = k.

Respectively,

log2 m = log2 N / k. (25)

It follows from (25) that

m = 2^(log2 N / k) = N^(1/k). (26)

For example, for N = 6 and k = 3, m = 6^(1/3) ≈ 1.82.

This means that if the partitions of the original set of N objects, carried out until one whole object is obtained, are made into m such equivalence classes, then the desired object will be found in k integer partitions, which is the minimum possible number of them. In this case, each partition produces the maximum amount of information possible for such a splitting, log2 m bits, and the k partitions together produce log2 N bits.

Let us define the ratio (25) as the initial information density before the first partition:

ρ_0 = log2 N / k.

Obviously, the information density varies in the range from 0 to 1, reaching 1 when N is an integer power of 2.

Thus, for N = 6 and k = 3, the initial information density is ρ_0 = 2.585/3 ≈ 0.86.

After each partition, the information density is determined in accordance with the expression

ρ_i = (log2 N - i) / (k - i), (28)

that is, as the ratio of the information still required to the number of partitions remaining.

Thus, for the example considered above, after the first partition into two equivalence classes

ρ_1 = (2.585 - 1) / 2 ≈ 0.79,

and after the second

ρ_2 = (2.585 - 2) / 1 ≈ 0.59.

From expression (28) it follows that when N ≠ 2^k the information density decreases after each partition, and only when N = 2^k does it remain constant for all partitions and equal to the maximum, 1.

It follows from (26) that log2 m = ρ_0 and, accordingly, for the example considered m = 2^0.86 ≈ 1.82.

Therefore, knowing ρ_0, it is possible to determine the required number of equivalence classes m into which the initial number of objects must be sequentially divided in order to obtain an integer number of partitions. Since each such partition will generate the maximum possible amount of information, this number of partitions will be the minimum possible under the given conditions.

Corollary 1 of Theorem 4 shows that the amount of information generated at the k-th, last partition need not exceed

I_k = log2 N - (k - 1) < 1 bit.

Moreover, in accordance with (16), it is not equal to 0.

To obtain complete information about the object, it is enough that the last partition produce exactly this amount. Then expression (31) takes the form (32):

I_k = log2 N - (k - 1).

Since it follows from (17) that the information produced by a partition into two classes is equal to the entropy of that partition, equality (32) can be achieved on the basis of the expression

-(p_1·log2 p_1 + p_2·log2 p_2) = log2 N - (k - 1),

which, for a given N, is satisfied by an appropriate probability distribution.

So, for example, for N = 6 and k = 3,

log2 6 - 2 ≈ 0.585 bits,

and, correspondingly,

-(p_1·log2 p_1 + p_2·log2 p_2) ≈ 0.585.

To achieve the last equality, the probabilities p_1 and p_2 must be taken approximately equal to 0.15 and 0.85, respectively (or 0.85 and 0.15).

This means that the class of 1.5 objects obtained at the 2nd partition is divided at the 3rd partition into two parts proportional to these probabilities (0.225 and 1.275 objects), which are then analyzed by a test for whether one of them contains the desired object. The probability of finding it in a part is equal to 0.15 or 0.85, depending on the part's size.

As a result, complete information about one of the objects will be obtained; however, in addition to the uniform partitions, a non-uniform one has also been used.

In the case of a purely logarithmic measure of information with this initial number of objects, to obtain the value log2 N the corresponding information must be obtained when the objects are incompletely divided into two equal parts, so that each of them contains elements of two different objects. In this case, the entropy will be equal to 0, because the information obtained in the course of the last, k-th integer partition partially goes towards eliminating the interference created during testing by the elements of the other object.

From what was considered above, it follows that information is measured by the number of partitions of a set of possible objects until one whole object is obtained. The source of information in this case is the test, which indicates the equivalence class in which the sought object is located. At the same time, information as an independent entity does not directly manifest itself in any way during the partitions, remaining outside the framework of the measurement procedure (the counting of the number of partitions). In the test it manifests itself through the results of comparison, that is, in the coincidence or non-coincidence of the features of the equivalence classes with the corresponding features of the test. This means that the test must possess, in advance, information about the features of the analyzed equivalence classes. Its final function is to decode the features of these classes and to develop control actions indicating which of the analyzed classes should be subdivided into subclasses at the next step of the partitions, or that the object has been found and the search procedure should be terminated.

What is essential for the search is that the object can be unambiguously determined only after all the information about it has been received, which happens only when no residual entropy remains. This is possible only if, in the process of the partitions, an equivalence class containing one whole object is obtained. In this case H_res = 0 and the unitarity principle is thereby satisfied.

Such a case occurs when the original number of objects is N = 2^k. If N ≠ 2^k, then with uniform partitioning the last, k-th equivalence class will contain less than one object, and as a result additional information will be obtained that details the object and is not used in its search.

In practice, in coding problems, the replacement of the initial number of objects N by the number 2^k > N is widely used, which, on the one hand, leads to the satisfaction of the principle of unitarity and, on the other hand, to an increase in the amount of redundant information generated by the test.


Combinatorial measure

For a better understanding, consider a few simple examples.

Example 1. Let us carry out an experiment. Take a die: it has six faces, numbered from one to six.

Let us toss it. When the die is thrown, one of the numbers on its faces comes up. The resulting number is the outcome of our experiment.

By tossing the die any number of times, we can obtain only six possible numbers. Let us denote this as N = 6.

This example allows you to move on to the concept of a combinatorial measure of information and give the following definition:

The combinatorial measure of information N is a way of measuring the amount of information by estimating the number of possible combinations of information items.

Since in the example with the die only six outcomes of the experiment, in other words six combinations, are possible, the amount of information in accordance with the combinatorial measure is N = 6 combinations.

Consider the following example.

Example 2. Let one of the decimal digits be given, for example the digit 8, and one of the hexadecimal digits, for example the digit 6 (we could take any other hexadecimal digit: 8, B, F, etc.). Now, in accordance with the definition of the combinatorial measure, let us determine the amount of information contained in each of these digits. Since the digit 8 is decimal, i.e. it represents one character out of ten, N8 = 10 combinations. Likewise, the digit 6 represents one of sixteen characters, and therefore N6 = 16 combinations. Consequently, a hexadecimal digit contains more information than a decimal one.

From this example we can conclude that the smaller the base of the number system, the less information one of its elements carries.

The American engineer R. Hartley proposed to measure the amount of information with a binary logarithmic measure:

I = log2 N,

where N is the number of different combinations of information elements. The unit of information in this measure is the bit.

Since the formula derived by R. Hartley takes into account the number of possible combinations N, it is interesting to know what estimate of the amount of information is given by the binary logarithmic measure for the examples considered above.

The calculation gives the following results:

in the example with the die, I = log2 6 = 2.585 bits;

in the example with the decimal number system, I = log2 10 = 3.322 bits;

in the example with the hexadecimal number system, I = log2 16 = 4 bits;

in the example with the binary number system, I = log2 2 = 1 bit.

The last result indicates that each digit of the binary number system carries one bit of information. In technical systems, the binary number system is generally used to encode two possible states, for example, 1 denotes the presence of electric current in a circuit and 0 its absence.
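
The four values listed above can be reproduced directly; this is a minimal sketch of the calculation, using nothing beyond the binary logarithm:

import math

examples = {
    "die (N = 6)": 6,
    "decimal digit (N = 10)": 10,
    "hexadecimal digit (N = 16)": 16,
    "binary digit (N = 2)": 2,
}
for name, n in examples.items():
    # Hartley's binary logarithmic measure: I = log2(N) bits.
    print(f"{name}: I = {math.log2(n):.3f} bits")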



In all the examples considered above, the outcomes of the experiments were equally probable and mutually independent. This means that when the die is tossed, each of the six faces has the same probability of coming up, and that the result of the next toss does not depend in any way on the result of the previous one.

Equally probable and mutually independent events are quite rare in real life. If we look at spoken languages, for example Russian, we can draw interesting conclusions. To simplify theoretical research in computer science, it is generally accepted that the Russian alphabet consists of 32 characters (е and ё, as well as ь and ъ, are not distinguished, but a space between words is added). If we assume that each letter of the Russian language appears in a message equally often and that any other symbol can follow any letter, then we can determine the amount of information in each symbol of the Russian language as

I = log2 32 = 5 bits.

However, in reality this is not so. In all spoken languages some letters occur much more often than others; counts of letter occurrences per 1000 letters of text show repetition rates that differ severalfold from letter to letter.

In addition, the likelihood of individual letters appearing depends on which letters precede them. Thus, in Russian a soft sign cannot follow a vowel, four vowels cannot stand in a row, and so on. Every spoken language has its own characteristics and patterns. Therefore, the amount of information in messages built from the symbols of a spoken language cannot be adequately estimated by either the combinatorial or the binary logarithmic measure.
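
To see why unequal symbol frequencies lower the information per symbol below the Hartley value, one can compute the Shannon entropy of a frequency table. The probabilities below are purely hypothetical placeholders for a small made-up alphabet, not the real Russian letter statistics (the original frequency table does not survive here):

import math

def entropy_bits(probabilities):
    # Shannon entropy H = -sum(p * log2 p), in bits per symbol.
    return -sum(p * math.log2(p) for p in probabilities if p > 0)

# Hypothetical 4-symbol alphabet with skewed frequencies (illustrative only).
skewed = [0.5, 0.25, 0.15, 0.10]
uniform = [0.25] * 4
print(entropy_bits(uniform))          # 2.0 bits, the Hartley value log2(4)
print(round(entropy_bits(skewed), 3)) # about 1.743 bits, less than the equiprobable case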


The paper presents a model for determining the logarithmic measure of information. An object is singled out from the structure of the technical system, and its probabilistic states of failure and operation are considered. When the states are equally probable, it is proposed to use the Hartley measure, and for non-equiprobable states, the Shannon measure for one or many objects, if they are mutually independent. The model takes into account the possibility of determining the measure of information for only one object. All object states are divided into two classes. Each of the selected classes is formed on the basis of data on the flow of non-uniform events. For each class of object states, the total and generalized probabilities of operability and failure are determined. These probabilities have found application in the obtained mathematical expressions to determine the measure of information uncertainty. It is shown that the formulas obtained are identical and are applicable both when using the total probability and the generalized probability.

LOGARITHMIC MEASURE OF INFORMATION OF THE CONDITION OF TECHNICAL OBJECT

Dulesov A.S.¹, Kabaeva E.V.¹

¹ Khakass State University n.a. N.F. Katanov

Abstract:

The article presents a model for determining the logarithmic measure of information. An object is picked out from the technical system, and its probabilistic states of failure and operation are analyzed. When the states are equiprobable it is recommended to use Hartley's measure, and when they are not equiprobable Shannon's measure is preferable for one or more mutually independent objects. The model considers the possibility of determining the measure of information for only one object. All states of the object are divided into two classes. Each class is based on data on the flow of non-equiprobable events. The total and generalized probabilities of operability and failure are determined for the object's states of each class. The studied probabilities are used in the mathematical formulas for determining the measure of the uncertainty of information. It is shown that the formulas are identical and may be applied both for the total and the generalized probability.


Bibliographic reference

Dulesov A.S., Kabaeva E.V. LOGARITHMIC MEASURE OF INFORMATION OF THE STATE OF A TECHNICAL OBJECT // Scientific Review. Technical science. - 2014. - No. 1. - P. 146-146;
URL: http://science-engineering.ru/ru/article/view?id=204 (date accessed: 04/06/2019).

This measure determines the usefulness (value) of the information for the user in achieving the set goal.

The whole of information theory is based on the discovery, made by R. Hartley in 1928, that information can be quantified.

Hartley's approach is based on fundamental set-theoretic, essentially combinatorial foundations, as well as several intuitively clear and quite obvious assumptions.

If there is a set of elements and one of them is selected, then a certain amount of information is thereby communicated or generated. This information consists in the fact that, whereas before the selection it was not known which element would be selected, after the selection it becomes known. It is necessary to find the form of the function that connects the amount of information obtained when a certain element is chosen from a set with the number of elements in that set, that is, with its cardinality.

If the set of elements from which the choice is made consists of one single element, then it is clear that its choice is predetermined, that is, there is no uncertainty of choice - there is zero amount of information.

If the set consists of two elements, then the choice uncertainty is minimal. In this case, the amount of information is also minimal.

The more elements in the set, the greater the uncertainty of choice, the more information.

Consider, for example, the set of all binary numbers of i digits. The number of these numbers (elements) in the set is N = 2^i.

From these obvious considerations, the first requirement follows: information is a monotonic function of the cardinality of the original set.

Choosing one number gives us the following amount of information: i = log2(N).

Thus, the amount of information contained in a binary number is equal to the number of binary digits in this number.

This expression is Hartley's formula for the amount of information.

When the length of a number doubles, the amount of information in it should also double, despite the fact that the number of numbers in the set grows exponentially (it is squared, if the numbers are binary); that is, if N2 = (N1)^2, then I2 = 2·I1, so that

F(N1 · N1) = F(N1) + F(N1).

This is impossible if the amount of information is expressed as a linear function of the number of elements in the set. But there is a known function with exactly this property, the logarithm:

log2(N2) = log2((N1)^2) = 2·log2(N1).

This second requirement is called the additivity requirement.

Thus, the logarithmic measure of information proposed by Hartley simultaneously satisfies the conditions of monotonicity and additivity. Hartley himself came to his measure on the basis of heuristic considerations similar to those just outlined, but it has now been rigorously proven that the logarithmic measure for the amount of information unambiguously follows from these two conditions he postulated.

Example. There are 192 coins, and it is known that one of them is counterfeit, say lighter in weight. Let us determine how many weighings on a balance scale are needed to identify it. If we put the same number of coins on the two pans, we get three independent possibilities: a) the left pan is lower; b) the right pan is lower; c) the pans are balanced. Thus, each weighing gives the amount of information I = log2 3; therefore, to find the counterfeit coin, at least k weighings must be made, where the smallest k satisfies the condition log2 3^k ≥ log2 192. Hence k ≥ log3 192 ≈ 4.79, that is, k = 5. So at least five weighings are necessary, and five are also sufficient.
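
The weighing count can be checked numerically. This sketch only does the arithmetic of the bound (it does not construct an actual weighing strategy): each weighing distinguishes at most 3 outcomes, so k must satisfy 3^k ≥ 192.

import math

coins = 192
outcomes_per_weighing = 3          # left pan lower, right pan lower, balanced
k = math.ceil(math.log(coins, outcomes_per_weighing))   # smallest k with 3**k >= 192
print(k)                           # 5
print(3 ** 4, 3 ** 5)              # 81 < 192 <= 243, so 4 weighings are too few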

Directions for assessing the amount of information

There are three main directions in information theory: structural, statistical, and semantic.

The structural direction considers the discrete structure of information arrays and their measurement by a simple count of information elements. (The simplest encoding of arrays is the combinatorial method.)

The statistical direction operates with the concept of entropy as a measure of uncertainty; that is, it takes into account the probability of the appearance of particular messages.

The semantic direction takes into account the appropriateness, value, or significance of the information.

These three directions have their own specific areas of application. The structural approach is used to assess the capabilities of the technical means of various information processing systems, regardless of the specific conditions of their use. Statistical estimates are used when considering data transmission issues and determining the bandwidth of communication channels. Semantic measures are used in solving problems of constructing information transmission systems, developing coding devices, and assessing the effectiveness of various devices.

Structural measures of information

Structural measures take into account only the discrete structure of information. The elements of an information complex are quanta, indivisible parts of information. A distinction is made between geometric, combinatorial, and additive measures.

Determining information by the geometric method amounts to measuring the length, area, or volume of the geometric model of the information complex in the number of quanta. The maximum possible number of quanta within the given structural dimensions determines the information capacity of the system. The information capacity is a number indicating the number of quanta in the complete information array. According to Fig. 1.2, g, the amount of information M in the complex X(T, N), determined by the geometric method, is equal to the product of the numbers of quanta along each coordinate:

M = (X / ΔX) · (T / ΔT) · (N / ΔN),

where ΔX, ΔT, ΔN are the intervals at which discrete samples are taken.

In the combinatorial measure, the amount of information is calculated as the number of combinations of elements. Possible or actually realized combinations are taken into account here.

In many cases, a discrete message can be viewed as a word consisting of n elements, drawn from an alphabet of m element-letters. Let us determine the number of different messages that can be formed from a given alphabet. If a message consists of two elements (n = 2), then there can be m^2 different messages in total. For example, ten digits (0, 1, 2, ..., 9) can form one hundred different numbers, from 0 to 99. If the number of elements is three, the number of different messages is m^3, and so on.

Thus, the number of possible messages is determined by

L = m^n,

where L is the number of messages, n is the number of elements in a word, and m is the size of the alphabet.

The larger L is, the more each message can differ from the rest. The magnitude L could be taken as a measure of the amount of information. However, the choice of L as a measure of the amount of information is associated with inconveniences: firstly, at L = 1 the information should be equal to zero, since the nature of the message is known in advance (there is a message, but the information it carries is zero), whereas the measure L gives the value 1; secondly, the condition of linear addition of amounts of information, i.e. the additivity condition, is not satisfied. If, for example, the first source is characterized by L1 different messages and the second by L2, then the total number of different messages for the two sources is determined by the product

L = L1 · L2.

For k sources, the total number of possible different messages is

L = L1 · L2 · ... · Lk.

Therefore, Hartley introduced a logarithmic (additive) measure of the amount of information, which makes it possible to estimate the amount of information contained in a message by the logarithm of the number of possible messages:

I = log L = n · log m.

Then, at L = 1, I = 0, i.e. information is absent.

For k sources of information

I = log(L1 · L2 · ... · Lk),

i.e. I = I1 + I2 + ... + Ik.
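
A short numerical illustration of these formulas (a sketch; the word length, alphabet size, and the two source sizes are illustrative choices, not values from the text): the logarithm of the number of possible words grows linearly with the word length, and the informations of independent sources add.

import math

m, n = 10, 3                      # alphabet of 10 symbols, words of 3 elements
L = m ** n                        # number of possible messages
I = math.log2(L)                  # Hartley's additive measure, I = n * log2(m)
print(L, I, n * math.log2(m))     # 1000, ~9.966, ~9.966

# Two independent sources: the joint information is the sum of the parts.
L1, L2 = 100, 1000
print(math.log2(L1 * L2), math.log2(L1) + math.log2(L2))   # both ~16.61 bits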

Statistical measures of information

In the statistical (probabilistic) approach, obtaining a specific amount of information is considered as the result of a certain choice among possible messages. The recipient of the information may know or guess part of it in advance. When a message arrives about frequently occurring events, whose probability p tends to one, such a message is uninformative. Messages about events whose probabilities tend to zero, i.e. about practically impossible events, are on average just as uninformative, since such events are reported extremely rarely.

Events can be viewed as possible outcomes of some experiment. All the outcomes make up a complete group of events, or an ensemble.

The ensemble is characterized by the fact that the sum of the probabilities of all the messages in it is equal to one, that is,

p1 + p2 + ... + pm = 1.

Consider complex messages composed of n elements, each of which is independent and is chosen from an alphabet containing m letters, the probabilities of selecting the elements being p1, p2, ..., pm respectively. Suppose that a certain message includes n1 elements a1 of the alphabet, n2 elements a2, and so on. Such a message is characterized by Table 1.1.

Table 1.1

Element type:            a1   a2   ...   am
Number of elements:      n1   n2   ...   nm
Selection probability:   p1   p2   ...   pm

The probability that the message will include ni elements ai is equal to pi^ni, and the probability of the formation of a message from n1, n2, ..., nm elements will be

P = p1^n1 · p2^n2 · ... · pm^nm. (1.1)

For large message lengths n, the source will form typical messages, in which the relative frequency of occurrence of individual elements tends to the probability of occurrence of these elements, that is,

ni / n → pi, (1.2)

and the probabilities of occurrence of the typical messages P will be identical and can be found from (1.1), (1.2):

P = p1^(n·p1) · p2^(n·p2) · ... · pm^(n·pm). (1.3)

Let us determine the number of typical messages:

L = 1 / P, (1.4)

since the total probability of all typical messages tends to unity as the message length increases.

Although the total number of possible messages is m^n, the source will in practice generate only L typical messages, while the probability of the appearance of the other messages tends to zero.

Let us find the amount of information I contained in one message:

I = log L = -log P = -n·(p1·log p1 + p2·log p2 + ... + pm·log pm). (1.5)
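
Formula (1.5) can be sanity-checked numerically. The sketch below uses an illustrative two-letter alphabet and message length (both are assumptions, not values from the text): it computes the probability of a typical message by (1.3), the number of typical messages by (1.4), and compares log L with the Shannon expression.

import math

p = [0.8, 0.2]        # illustrative selection probabilities of the two letters
n = 100               # illustrative message length

# Probability of one typical message, expression (1.3).
P_typical = math.prod(pi ** (n * pi) for pi in p)
# Number of typical messages, expression (1.4).
L = 1 / P_typical
# Information per message, expression (1.5).
I = math.log2(L)
I_shannon = -n * sum(pi * math.log2(pi) for pi in p)
print(round(I, 3), round(I_shannon, 3))   # both about 72.19 bits
print(2 ** n > L)                         # True: far fewer typical messages than 2**100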

This expression (Shannon's formula) gives a more complete picture of the information source than the additive measure (Hartley's measure). Let us explain this with the following example. If we flip a coin, we get a message about one of two possible states (heads or tails), that is, an alphabet of two letters. If instead we toss a cube one face of which is blue and the other five faces pink, we also have an alphabet of two letters (blue or pink). To write down the received text (message), one binary digit per letter is sufficient in both cases (n = 1, m = 2).

According to Hartley, in both cases

I = n · log2 m = 1 bit.

But we know that in the first case the probability of each outcome of the experiment is 0.5 (p1 = p2 = 0.5), while in the second case p1 = 1/6 and p2 = 5/6. Hartley's measure ignores this.

With equiprobable symbols (a special case), Shannon's formula degenerates into Hartley's formula:

I = -n · m · (1/m) · log2(1/m) = n · log2 m.

For the coin case:

I = -1 · (0.5·log2 0.5 + 0.5·log2 0.5) = 1 bit.

For the cube case:

I = -1 · ((1/6)·log2(1/6) + (5/6)·log2(5/6)) ≈ 0.65 bits.
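
The coin and the painted cube can be compared directly; this is a minimal sketch of the calculation in the example above: Hartley's measure gives 1 bit for both two-letter alphabets, while Shannon's formula distinguishes them.

import math

def shannon_bits(probabilities):
    # I = -sum(p * log2 p), the information per symbol by Shannon's formula.
    return -sum(p * math.log2(p) for p in probabilities)

hartley = math.log2(2)                  # two letters in both alphabets -> 1 bit
coin = shannon_bits([0.5, 0.5])         # heads / tails
cube = shannon_bits([1 / 6, 5 / 6])     # one blue face / five pink faces
print(hartley)                          # 1.0
print(coin)                             # 1.0, equal to the Hartley value
print(round(cube, 3))                   # 0.65, less information per symbol than the coin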

The amount of information per message element is called the specific information content, or entropy:

H = I / n = -(p1·log p1 + p2·log p2 + ... + pm·log pm). (1.6)

The amount of information and the entropy are logarithmic measures and are measured in the same units. The base of the logarithm determines the unit of measurement of the amount of information and entropy. The binary unit, the bit, corresponds to a logarithm base of two; one bit is the amount of information in a message about one of two equiprobable outcomes of some experiment. Natural (nit) and decimal (dit) units are also used. Similar units are used when the amount of information is estimated with the Hartley measure.

It follows from Shannon's formula that the amount of information contained in a message depends on the number of message elements n, the alphabet size m, and the probabilities of element selection. The dependence of I on n is linear.

Let us note some properties of entropy.

1. Entropy is a real, bounded, and non-negative quantity, that is, H ≥ 0. This property follows from expression (1.6).

2. Entropy is minimal and equal to zero if the message is known in advance, that is, if one of the probabilities pi = 1 and all the others are 0.

3. Entropy is maximum if all states of message elements are equally probable.

H = Hmax if p1 = p2 = ... = pm = 1/m. (1.7)

We find the value of the maximum entropy using (1.6) and (1.7):

Hmax = -m · (1/m) · log2(1/m) = log2 m.

The expediency and usefulness of information for solving a problem can be assessed by the effect that the received information has on the solution of the problem. If the likelihood of achieving the goal increases, the information should be considered useful.
