Units. Units of information measurement in computer science. Minimum unit of information

Amount of information

The amount of information as a measure of reducing knowledge uncertainty.
(Substantive approach to determining the amount of information)

The process of cognition of the surrounding world leads to the accumulation of information in the form of knowledge (facts, scientific theories, etc.). Receiving new information leads to an increase in knowledge or, as is sometimes said, to a decrease in the uncertainty of knowledge. If a message leads to a decrease in the uncertainty of our knowledge, then we can say that the message contains information.

For example, after taking a pass/fail test or a graded exam, you are tormented by uncertainty: you do not know what result you received. Finally, the teacher announces the results. After the pass/fail test you receive one of two information messages, "pass" or "fail", and after the graded exam you receive one of four information messages: "2", "3", "4" or "5".

An information message about the result of the pass/fail test halves the uncertainty of your knowledge, since one of two possible information messages is received. An information message about the grade for the exam reduces the uncertainty of your knowledge fourfold, since one of four possible information messages is received.

It is clear that the more uncertain the initial situation is (the larger the number of possible information messages), the more new information we receive with an information message (the more times the uncertainty of knowledge decreases).

The amount of information can be considered as a measure of the reduction of knowledge uncertainty when information messages are received.

The approach to information discussed above as a measure of reducing the uncertainty of knowledge allows us to quantitatively measure information. There is a formula that relates the number of possible information messages N and the amount of information I carried by the received message:

$N = 2^{I}$    (1.1)

Bit. To quantify any quantity, you must first define a unit of measurement. For example, the meter is chosen as the unit of length and the kilogram as the unit of mass. Similarly, to determine the amount of information, a unit of measurement must be introduced.

The unit of the amount of information is taken to be the amount of information contained in an information message that halves the uncertainty of knowledge. This unit is called the bit.

If we return to the information message about the result of the pass/fail test discussed above, the uncertainty there is halved and, therefore, the amount of information that the message carries is equal to 1 bit.

Derived units for measuring the amount of information. The smallest unit of measurement of the amount of information is a bit, and the next largest unit is a byte, and:

1 byte = 8 bits = $2^{3}$ bits.

In computer science, the system for forming multiple units of measurement differs somewhat from that accepted in most sciences. Traditional metric systems of units, such as the International System of Units (SI), use the factor $10^{n}$ as the multiplier for multiple units, where n = 3, 6, 9, etc., which corresponds to the decimal prefixes “kilo” ($10^{3}$), “mega” ($10^{6}$), “giga” ($10^{9}$), etc.

In a computer, information is encoded using a binary sign system, and therefore multiple units of measurement of the amount of information use a factor of $2^{n}$.

Thus, units of measurement of the amount of information that are multiples of a byte are introduced as follows:

1 kilobyte (KB) = $2^{10}$ bytes = 1024 bytes;

1 megabyte (MB) = $2^{10}$ KB = 1024 KB;

1 gigabyte (GB) = $2^{10}$ MB = 1024 MB.
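As a rough illustration of the binary multiples listed above, the following Python sketch converts a quantity given in KB, MB or GB into bytes and bits. The names `UNIT_FACTORS`, `to_bytes` and `to_bits` are invented for this example and do not come from the text.

```python
# Binary multiples of the byte: each step up is a factor of 2**10 = 1024.
UNIT_FACTORS = {
    "B": 1,
    "KB": 2**10,  # 1024 bytes
    "MB": 2**20,  # 1024 KB
    "GB": 2**30,  # 1024 MB
}

BITS_PER_BYTE = 8


def to_bytes(value, unit):
    """Convert a value expressed in the given unit into bytes (hypothetical helper)."""
    return value * UNIT_FACTORS[unit]


def to_bits(value, unit):
    """Convert a value expressed in the given unit into bits."""
    return to_bytes(value, unit) * BITS_PER_BYTE


if __name__ == "__main__":
    for unit in ("KB", "MB", "GB"):
        print(f"1 {unit} = {to_bytes(1, unit)} bytes = {to_bits(1, unit)} bits")
```

Keeping the factors in one table means that a further unit could be added by extending the dictionary with one more power of two.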

Control questions

Determining the amount of information

Determining the number of information messages. Using formula (1.1), you can easily determine the number of possible information messages if the amount of information is known. For example, in an exam you draw an exam ticket, and the teacher tells you that the visual information message about its number carries 5 bits of information. To determine the number of exam tickets, it is enough to determine the number of possible information messages about their numbers using formula (1.1):

$N = 2^{5} = 32$

Thus, the number of exam tickets is 32.

Determining the amount of information. On the contrary, if the possible number of information messages N is known, then to determine the amount of information carried by the message, it is necessary to solve the equation for I.

Imagine that you control the movement of a robot and can set the direction of its movement using information messages: "north", "northeast", "east", "southeast", "south", "southwest", "west" and "northwest" (Fig. 1.11). How much information will the robot receive after each message?

There are 8 possible information messages, so formula (1.1) takes the form of an equation for I:

$8 = 2^{I}$

Let's factor the number 8 on the left side of the equation and present it in power form:

$8 = 2 \times 2 \times 2 = 2^{3}$

Our equation becomes:

$2^{3} = 2^{I}$

The left and right sides of the equation are equal if the exponents of the number 2 are equal. Thus, I = 3 bits, i.e., the amount of information that each information message carries to the robot is equal to 3 bits.
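The two calculations with formula (1.1), from bits to the number of messages and back, can be sketched in Python as follows. The function names are made up for this illustration, and the sketch assumes, as the text does, that all messages are equally likely.

```python
import math


def messages_for_bits(bits):
    """Number of equally likely messages N for I bits of information: N = 2**I."""
    return 2 ** bits


def bits_for_messages(n):
    """Amount of information I in bits for N equally likely messages: I = log2(N)."""
    return math.log2(n)


if __name__ == "__main__":
    # Exam tickets: a message about the ticket number carries 5 bits.
    print(messages_for_bits(5))
    # Robot: 8 possible directions.
    print(bits_for_messages(8))
```

Solving the equation by comparing exponents, as done above, is the same as taking the binary logarithm of N, which is what `math.log2` computes.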

Alphabetical approach to determining the amount of information

With the alphabetical approach to determining the amount of information, one abstracts from the content of the information and considers the information message as a sequence of signs of a certain sign system.

Information capacity of the sign. Let's imagine that it is necessary to transmit an information message through an information transmission channel from the sender to the recipient. Let the message be encoded using a sign system whose alphabet consists of N characters (1, ..., N). In the simplest case, when the length of the message code is one character, the sender can send one of N possible messages “1”, “2”, ..., “N”, each of which carries the amount of information I (Fig. 1.5).

Fig. 1.5. Transfer of information

Formula (1.1) relates the number of possible information messages N and the amount of information I carried by the received message. Then, in the situation under consideration, N is the number of signs in the alphabet of the sign system, and I is the amount of information that each sign carries:

$N = 2^{I}$

Using this formula, you can, for example, determine the amount of information that a sign carries in the binary sign system:

$N = 2 \Rightarrow 2 = 2^{I} \Rightarrow 2^{1} = 2^{I} \Rightarrow I = 1$ bit.

Thus, in the binary sign system, a sign carries 1 bit of information. Interestingly, the very name of the unit of measurement of the amount of information, “bit”, comes from the English phrase “BInary digiT”.

The information capacity of the sign of the binary sign system is 1 bit.

The greater the number of signs the alphabet of a sign system contains, the greater the amount of information carried by one sign. As an example, we will determine the amount of information carried by a letter of the Russian alphabet. The Russian alphabet includes 33 letters, but in practice, only 32 letters are often used to convey messages (the letter “ё” is excluded).

Using formula (1.1), we determine the amount of information carried by a letter of the Russian alphabet:

$N = 32 \Rightarrow 32 = 2^{I} \Rightarrow 2^{5} = 2^{I} \Rightarrow I = 5$ bits.

Thus, a letter of the Russian alphabet carries 5 bits of information (with an alphabetic approach to measuring the amount of information).

The amount of information a sign carries depends on the likelihood of its receipt. If the recipient knows in advance exactly what sign will come, then the amount of information received will be equal to 0. On the contrary, the less likely it is to receive a sign, the greater its information capacity.

In Russian writing, letters occur in text with different frequencies: on average, per 1000 characters of meaningful text there are 200 letters “a” and a hundred times fewer letters “f” (only 2). Thus, from the point of view of information theory, the information capacity of the characters of the Russian alphabet differs (the letter “a” has the smallest, and the letter “f” has the largest).
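The dependence of information capacity on probability can be made concrete with the standard information-theory relation I = -log2(p), which the text does not state explicitly and which is assumed here. The Python sketch below applies it to the two letter frequencies quoted above; the function name `self_information` is chosen only for this example.

```python
import math


def self_information(probability):
    """Information in bits carried by a sign that occurs with the given probability.

    Assumes the standard relation I = -log2(p), which is not spelled out in the text.
    """
    return -math.log2(probability)


if __name__ == "__main__":
    p_a = 200 / 1000  # frequency of the letter "a" quoted above
    p_f = 2 / 1000    # frequency of the letter "f" quoted above
    print(f"a: {self_information(p_a):.2f} bits")
    print(f"f: {self_information(p_f):.2f} bits")
```

Under this assumption the frequent letter carries a little over 2 bits and the rare one close to 9 bits, which matches the qualitative statement that the rarer sign has the larger information capacity.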

The amount of information in the message. A message consists of a sequence of characters, each of which carries a certain amount of information.

If the signs carry the same amount of information, then the amount of information $I_c$ in the message can be calculated by multiplying the amount of information $I_z$ carried by one sign by the code length (the number of characters in the message) K:

$I_c = I_z \times K$

So, each digit of binary computer code carries 1 bit of information. Consequently, two digits carry 2 bits, three digits carry 3 bits, etc. The amount of information in bits is equal to the number of digits of the binary computer code (Table 1.1).
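A minimal Python sketch of the alphabetical approach: the information per sign is found from the alphabet size, then multiplied by the message length. The function names are hypothetical, and the sketch assumes all signs are equally likely, as in the alphabetical approach described above.

```python
import math


def sign_information(alphabet_size):
    """Bits carried by one sign of an alphabet with N equally likely signs (N = 2**I)."""
    return math.log2(alphabet_size)


def message_information(alphabet_size, length):
    """Bits in a message: information per sign multiplied by the number of signs K."""
    return sign_information(alphabet_size) * length


if __name__ == "__main__":
    # Eight digits of binary code: 1 bit each.
    print(message_information(2, 8))
    # Ten letters of the 32-letter Russian alphabet: 5 bits each.
    print(message_information(32, 10))
```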

Table 1.1. The amount of information carried by a binary computer code

In modern computers we can enter text, numeric values, as well as graphic and audio information. The amount of information stored in a computer is measured by its “length” (or “volume”), which is expressed in bits. The bit is the minimum unit of information (from English BInary digiT). Each bit can take the value 0 or 1. A single binary position of a computer memory cell is also called a bit. The following units are used to measure the amount of information stored:

1 byte = 8 bits;

1 KB = 1024 bytes (KB is read as kilobyte);

1 MB = 1024 KB (MB is read as megabyte);

1 GB = 1024 MB (GB is read as gigabyte).

Bit (from English binary digit; also a play on words: English bit, “a little”)

According to Shannon, the amount of information in bits is the binary logarithm of the probability of an event, taken with a minus sign, when the events are equally probable; in the general case it is the sum, taken with a minus sign, of the products of each event's probability and the binary logarithm of that probability.

One digit of binary code (a binary digit). It can take only two mutually exclusive values: yes/no, 1/0, on/off, etc.

A basic unit of measurement of the amount of information, equal to the amount of information contained in an experiment that has two equally probable outcomes. This is identical to the amount of information in the answer to a question that allows only the answers “yes” or “no” (that is, the amount of information that allows you to answer the question unambiguously). One binary digit contains one bit of information.
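Shannon's definition mentioned above can be illustrated with a short Python sketch that computes the average information of an experiment from the probabilities of its outcomes. The function name `entropy_bits` and the example distributions are assumptions made for this illustration.

```python
import math


def entropy_bits(probabilities):
    """Average information in bits: minus the sum of p * log2(p) over all outcomes.

    Outcomes with probability 0 contribute nothing and are skipped.
    """
    return -sum(p * math.log2(p) for p in probabilities if p > 0)


if __name__ == "__main__":
    print(entropy_bits([0.5, 0.5]))   # two equally probable outcomes
    print(entropy_bits([0.25] * 4))   # four equally probable outcomes
    print(entropy_bits([0.9, 0.1]))   # two outcomes of unequal probability
```

For two equally probable outcomes the result is exactly 1 bit, in agreement with the definition of the bit as the information in a yes/no answer; when one outcome is much more likely than the other, the average information drops below 1 bit.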

In computer technology and data networks, the values 0 and 1 are usually transmitted as different levels of voltage or current. For example, in TTL-based chips, 0 is represented by a voltage in the range from +0 to +3 V, and 1 by a voltage in the range from 4.5 to 5.0 V.

The data transfer speed of a network is usually measured in bits per second. It is noteworthy that with the increase in data transmission speed, the bit has also acquired another metric expression: length. In a modern gigabit network (1 Gbit/s) one bit lasts 1 ns, which at a signal speed close to $3 \times 10^{8}$ m/s corresponds to approximately 30 centimeters of wire per bit. Because of this, the complexity of network adapters has increased significantly. Previously, for example in one-megabit networks, a bit length of about 300 m was almost always obviously greater than the length of the cable between two devices.

In computing, especially in documentation and standards, the word “bit” is often used to mean binary digit. For example: the first bit is the first binary digit of the byte or word in question.

Currently, the bit is the smallest possible unit of information in computing, but intensive research in the field of quantum computers involves qubits (q-bits).

Byte (English: byte) - a unit of measurement of the amount of information, usually equal to eight bits; it can take 256 ($2^{8}$) different values.

In general, a byte is a sequence of bits of fixed length, the minimum addressable amount of memory in a computer. In modern general-purpose computers a byte is equal to 8 bits. To emphasize that an eight-bit byte is meant, descriptions of network protocols use the term "octet" (English: octet).

Sometimes a byte is a sequence of bits that makes up a subfield of a word. Some computers can address bytes of different lengths. This is provided by the field extraction instructions LDB and DPB in the PDP-10 assembly language and in Common Lisp.

In the IBM 1401, a byte was equal to 6 bits, as in the Minsk-32; in the BESM it was 7 bits, and in some computer models manufactured by Burroughs Computer Corporation (now Unisys) it was 9 bits. Many modern digital signal processors use bytes that are 16 bits or longer.

The name was first used in 1956 by W. Buchholz, when designing the first supercomputer, the IBM 7030, for a group of bits transmitted simultaneously in input-output devices (six of them); later, as part of the same project, the byte was expanded to eight ($2^{3}$) bits.

Multiple prefixes for forming derived units of the byte are not applied in the usual way. Firstly, diminutive prefixes are not used at all, and units of information smaller than a byte have special names (nibble and bit). Secondly, magnifying prefixes mean a factor of 1024 = $2^{10}$ in place of every thousand (a kilobyte is equal to 1024 bytes, a megabyte is equal to 1024 kilobytes or 1,048,576 bytes, and so on for gigabytes, terabytes and petabytes; larger prefixes are not yet in use). The difference grows with the size of the prefix. It is more correct to use binary prefixes, but in practice they are not yet used, perhaps because they sound awkward: kibibyte, mebibyte, etc.

Sometimes decimal prefixes are also used in the literal sense, for example when indicating the capacity of hard drives: for them, a gigabyte can mean a million kibibytes, i.e. 1,024,000,000 bytes, or even just a billion bytes, and not 1,073,741,824 bytes as, for example, in memory modules.
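The remark that the gap between the binary and decimal meanings grows with the prefix can be checked with a few lines of Python. This is only an illustrative sketch; the names `PREFIXES` and `prefix_gap_percent` are invented for it.

```python
PREFIXES = ["kilo", "mega", "giga", "tera"]


def prefix_gap_percent(step):
    """Relative excess of the binary multiple over the decimal one, in percent.

    step = 1 for kilo (2**10 vs 10**3), 2 for mega (2**20 vs 10**6), and so on.
    """
    binary = 2 ** (10 * step)
    decimal = 10 ** (3 * step)
    return (binary - decimal) / decimal * 100


if __name__ == "__main__":
    for step, name in enumerate(PREFIXES, start=1):
        print(f"{name}: {prefix_gap_percent(step):.2f}%")
```

The gap is 2.4% for kilo and approaches 10% for tera, which is why a drive labeled in decimal gigabytes appears noticeably smaller when its capacity is reported in binary units.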

Kilobyte (kbyte, kB) - a unit of measurement of the amount of information equal to $2^{10}$ standard (8-bit) bytes, or 1024 bytes. It is used to indicate the amount of memory in various electronic devices.

The name “kilobyte” is generally accepted but formally incorrect, since the prefix kilo- means multiplication by 1,000, not 1,024. The correct binary prefix for $2^{10}$ is kibi-.

Table 1.2 - Multiple prefixes to form derivatives

Megabyte (MB, M) - a unit of measurement of the amount of information equal to 1,048,576 ($2^{20}$) standard (8-bit) bytes, or 1024 kilobytes. It is used to indicate the amount of memory in various electronic devices.

The name “megabyte” is generally accepted but formally incorrect, since the prefix mega- means multiplication by 1,000,000, not 1,048,576. The correct binary prefix for $2^{20}$ is mebi-. Large corporations that produce hard disks take advantage of this: when labeling their products, they take a megabyte to be 1,000,000 bytes and a gigabyte to be 1,000,000,000 bytes.

The most original interpretation of the term megabyte is used by manufacturers of computer floppy disks, who take it to mean 1,024,000 bytes. Thus, a floppy disk labeled 1.44 MB actually holds only 1440 KB, that is, 1.41 MB in the usual sense.

As a result, a megabyte can be short, medium or long:

short - 1,000,000 bytes

medium - 1,024,000 bytes

long - 1,048,576 bytes
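To see how the three megabytes differ in practice, the following Python sketch expresses the capacity of the floppy disk mentioned above (1440 KB of 1024 bytes each) in each of them. The dictionary name `MEGABYTE_SIZES` is chosen only for this example.

```python
# The three meanings of "megabyte" listed above, in bytes.
MEGABYTE_SIZES = {
    "short": 1_000_000,
    "medium": 1_024_000,
    "long": 1_048_576,
}

# Capacity of the floppy disk from the text: 1440 KB of 1024 bytes each.
floppy_bytes = 1440 * 1024

if __name__ == "__main__":
    for name, size in MEGABYTE_SIZES.items():
        print(f"{name}: {floppy_bytes / size:.2f} MB")
```

The same 1,474,560 bytes come out as about 1.47, 1.44 or 1.41 MB depending on which megabyte is meant, so the familiar "1.44 MB" label corresponds to the medium one.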

Gigabyte is a multiple unit of measurement of the amount of information, equal to 1,073,741,824 ($2^{30}$) standard (8-bit) bytes, or 1,024 megabytes.

The SI prefix giga- is used erroneously here, because it means multiplication by $10^{9}$. For $2^{30}$ the binary prefix gibi- should be used. Large corporations that produce hard drives take advantage of this situation: when labeling their products, they take a megabyte to mean 1,000,000 bytes and a gigabyte to mean 1,000,000,000 bytes.

A machine word is a machine-dependent and platform-dependent quantity, measured in bits or bytes, equal to the width of the processor registers and/or the width of the data bus (usually some power of two). The word size also matches the minimum size of addressable information (the width of the data located at one address). The machine word determines the following characteristics of a machine:

the width of the data processed by the processor;

the width of addressable data (the data bus width);

the maximum value of an unsigned integer type directly supported by the processor: if the result of an arithmetic operation exceeds this value, an overflow occurs;

the maximum amount of random access memory directly addressable by the processor.

The maximum value of a word n bits long can be easily calculated using the formula $2^{n} - 1$.
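The formula for the maximum word value and the overflow behaviour described above can be tried out in Python. Python integers are unbounded, so the sketch imitates a fixed-width word with a bit mask; wrap-around on overflow is an assumption about typical unsigned hardware arithmetic, and the function names are hypothetical.

```python
def max_unsigned(bits):
    """Largest unsigned value that fits in a word of the given width: 2**n - 1."""
    return 2 ** bits - 1


def add_with_overflow(a, b, bits):
    """Add two unsigned values as an n-bit machine would.

    Returns the wrapped result and a flag telling whether overflow occurred.
    Wrap-around is assumed here; it is typical but not stated in the text.
    """
    total = a + b
    limit = max_unsigned(bits)
    return total & limit, total > limit


if __name__ == "__main__":
    for n in (8, 16, 32, 64):
        print(n, max_unsigned(n))
    print(add_with_overflow(255, 1, 8))
```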

Table 1.3 - Machine word size on various platforms

Everything that is on your computer is information. But how do you measure it?
Agree, it is difficult to work with information without knowing its quantity. Let's try to sort this out.

It is generally accepted that the unit of measurement of computer information is the BYTE. But this is not entirely true if we take into account that a computer is a calculating machine. The computer calculates in “machine language”, which uses an even smaller unit called the BIT.

A bit can be expressed only as a one or a zero, and such a number system is called binary. One byte contains 8 bits. In fairness, it is worth noting that octal and hexadecimal number systems are also used with computers. But we will not dwell on the computer's machine language any longer.

Let's continue in the user's language. To simplify, one byte can represent only one character. This character may be a letter, a digit, or some other symbol. If you imagine how many bytes one page of text in an ordinary book contains (about 2000 characters) and multiply that number by the number of pages, the need for derived units of measurement becomes clear. Let's look at them:

KB - kilobyte - 1024 bytes
MB - megabyte - 1024 kilobytes
GB - gigabyte - 1024 megabytes
TB - terabyte - 1024 gigabytes

A reasonable question arises: why not a round thousand? It would seem more convenient to count that way, but nothing can be done about it: this follows from the computer's binary arithmetic. Each subsequent unit of measurement is two to the tenth power (1024) times the previous one; mathematics is an exact science.

If you follow the list above, then, as already mentioned, we can conditionally assume that 1 byte is one character, 1 KB is 1024 characters, and so on. But how do we evaluate these numbers, and how do we imagine the amount of information behind them?

It's easier to understand this when dealing with text. I have already mentioned that one page of typewritten text contains on average about 2000 characters. It's easy to calculate that 1 MB will fit about 500 pages.
Let's dilute our book with several dozen optimized pictures taking up another 1 MB. We get a book that weighs 2 MB. Now let's take a flash drive or a 1 GB microSD memory card. You have already calculated, and correctly: about 500 of these books will fit there. Yet a flash drive, and even more so a memory card, easily fits in the small coin pocket of your trousers. Try to put at least one 500-page book in your pocket!
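The back-of-the-envelope estimate above can be reproduced in a few lines of Python. The sketch assumes one byte per character and binary units (1 MB = 1024 × 1024 bytes); the constant names are invented for the example.

```python
CHARS_PER_PAGE = 2000   # average page of typewritten text, as stated above
BYTES_PER_CHAR = 1      # assumption: one byte per character
MB = 1024 * 1024        # bytes in a megabyte (binary)
GB = 1024 * MB          # bytes in a gigabyte (binary)

# Whole pages of plain text that fit into 1 MB.
pages_per_mb = MB // (CHARS_PER_PAGE * BYTES_PER_CHAR)

# A book of text (1 MB) plus pictures (1 MB) weighs 2 MB.
book_bytes = 2 * MB
books_per_gb = GB // book_bytes

if __name__ == "__main__":
    print(pages_per_mb, "pages per MB")
    print(books_per_gb, "books per GB")
```

The exact figures, 524 pages and 512 books, are close to the round value of 500 used in the text.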

Of course, all these arguments are very approximate. Such an estimate hardly applies to images, films, or games, since that is information of a completely different kind. Still, maybe someone remembers, or has seen in a cinema, the reels of old movies (a single feature film took up several reels weighing several kilograms); add to that tape reels and the rather large old records, and you will feel the difference between the volume of digital information and of information on older generations of media.

More about pictures. One good photo or other image can take up 2 MB or more. But everything beautiful always requires a lot!

To measure length there are units such as millimeter, centimeter, meter, kilometer. It is known that mass is measured in grams, kilograms, centners and tons. The passage of time is expressed in seconds, minutes, hours, days, months, years, centuries. The computer works with information and there are also corresponding units of measurement to measure its volume.

We already know that the computer perceives all information through zeros and ones. A bit is the minimum unit of measurement of information, corresponding to one binary digit (“0” or “1”).

A byte consists of eight bits. Using one byte, you can encode one character out of 256 possible (256 = $2^{8}$). Thus, one byte is equal to one character, that is, 8 bits:

1 character = 8 bits = 1 byte.

Studying computer literacy involves consideration of other, larger units of measurement of information.

Byte table:

1 byte = 8 bits

1 KB (1 Kilobyte) = $2^{10}$ bytes = 2*2*2*2*2*2*2*2*2*2 bytes = 1024 bytes (approximately 1 thousand bytes, $10^{3}$ bytes)

1 MB (1 Megabyte) = $2^{20}$ bytes = 1024 kilobytes (approximately 1 million bytes, $10^{6}$ bytes)

1 GB (1 Gigabyte) = $2^{30}$ bytes = 1024 megabytes (approximately 1 billion bytes, $10^{9}$ bytes)

1 TB (1 Terabyte) = $2^{40}$ bytes = 1024 gigabytes (approximately $10^{12}$ bytes). A terabyte is sometimes called a ton.

1 PB (1 Petabyte) = $2^{50}$ bytes = 1024 terabytes (approximately $10^{15}$ bytes).

1 Exabyte = $2^{60}$ bytes = 1024 petabytes (approximately $10^{18}$ bytes).

1 Zettabyte = $2^{70}$ bytes = 1024 exabytes (approximately $10^{21}$ bytes).

1 Yottabyte = $2^{80}$ bytes = 1024 zettabytes (approximately $10^{24}$ bytes).

In the table above, powers of two ($2^{10}$, $2^{20}$, $2^{30}$, etc.) are the exact values of kilobytes, megabytes and gigabytes.

The question arises: is there a continuation of the byte table? In mathematics there is the concept of infinity, which is denoted by a figure eight lying on its side: ∞.

It is clear that the byte table can be continued by adding zeros, or rather by increasing the power of the number 10: $10^{27}$, $10^{30}$, $10^{33}$ and so on ad infinitum. But why is this necessary? In principle, terabytes and petabytes are enough for now. In the future, perhaps even a yottabyte will not be enough.


Computer literacy exercises:

1) How many bytes does the phrase “Today is July 7, 2011” contain (not counting the quotes)?

2) How many bytes (kilobytes) does one page of text take up if there are 60 characters in a line and 40 lines on a page? What is the volume of a book consisting of 100 such pages?

3) A “terabyte” is an external hard disk that connects to a computer via a USB connector and has a capacity of 1 terabyte. Its instructions say that the disk can hold 250 thousand music files or 285 thousand photographs. What is the size of one music file and the size of one photo according to the manufacturer of this device?

4) How many such music files can fit on one 700 megabyte CD?

5) How many such photos can fit on a 4 gigabyte flash drive?

Solutions:

1) Counting one byte per character: “Today ” with a space is 6 bytes, “is ” with a space is 3 bytes, “July ” with a space is 5 bytes, “7, ” with a comma and a space is 3 bytes, and “2011” is 4 bytes. Total: 6 + 3 + 5 + 3 + 4 = 21 bytes is what the phrase “Today is July 7, 2011” “weighs”.

2) One line contains 60 characters, which means the volume of one line is 60 bytes. There are 40 such lines on a page, so the volume of one page of text is 60 x 40 = 2400 bytes, or approximately 2.4 KB (taking 1 KB as roughly 1000 bytes).

The volume of one book is 2400 x 100 = 240,000 bytes, or approximately 240 KB.

3) The size of one music file that, according to the manufacturer, can be recorded on a “terabyte”: 1,000,000,000,000 : 250,000 = (cancel three zeros in the dividend and in the divisor) 1,000,000,000 : 250 = 4,000,000 bytes = 4 Megabytes = 4 MB

The size of one photograph that, according to the manufacturer, can be recorded on a “terabyte”: 1,000,000,000,000 : 285,000 = (cancel three zeros in the dividend and divisor) 1,000,000,000 : 285 = 3,508,771.93 bytes = (rounding) 3.5 Megabytes = 3.5 MB

4) A 700 megabyte CD can hold 700 MB : 4 MB = 175 music files, each no larger than 4 MB. Here megabytes can be divided by megabytes directly, but when working with different units it is better to first convert everything into bytes and then perform the arithmetic.

5) A 4 GB flash drive can hold 4,000,000,000 : 3,508,771.93 = (rounding) 1,140 photos, each of which is no more than 3.5 MB in size.

You can also count approximately. Then a 4 GB flash drive can hold 4,000,000,000 : 3,500,000 = (cancel five zeros in the dividend and divisor) 40,000 : 35 = 1,142.86 photos, that is, roughly 1,140 photos of about 3.5 MB each.
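The arithmetic in solutions 2 to 5 can be double-checked with a short Python sketch. It follows the solutions in using decimal units (1 TB taken as 10^12 bytes, 1 MB as 10^6 bytes) and one byte per character; the variable names are chosen for this illustration only.

```python
# Decimal units, as used in the solutions above.
MB = 10**6
GB = 10**9
TB = 10**12

# Exercise 2: one byte per character is assumed.
page_bytes = 60 * 40
book_bytes = page_bytes * 100

# Exercise 3: file sizes implied by the manufacturer's figures.
music_file_bytes = TB // 250_000
photo_file_bytes = TB / 285_000

# Exercise 4: music files of that size on a 700 MB CD.
music_files_on_cd = (700 * MB) // music_file_bytes

# Exercise 5: photos on a 4 GB flash drive, computed with integers
# (4 GB divided by TB / 285000) to avoid rounding the photo size first.
photos_on_flash = (4 * GB * 285_000) // TB

if __name__ == "__main__":
    print(page_bytes, book_bytes)
    print(music_file_bytes, round(photo_file_bytes, 2))
    print(music_files_on_cd, photos_on_flash)
```

Doing the last division in integers shows that the 4 GB drive holds 1,140 photos of the manufacturer's nominal size.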


In the byte table, powers of two ($2^{10}$, $2^{20}$, $2^{30}$, etc.) are the exact values of kilobytes, megabytes and gigabytes. The powers of the number 10 ($10^{3}$, $10^{6}$, $10^{9}$, etc.) are approximate values, rounded down. So $2^{10}$ = 1024 bytes is the exact value of a kilobyte, and $10^{3}$ = 1000 bytes is its approximate value.

Such approximation (or rounding) is quite acceptable and generally accepted.

Below is a table of bytes with English abbreviations (in the left column):

1 KB ~ $10^{3}$ B = 10*10*10 B = 1000 B – kilobyte

1 MB ~ $10^{6}$ B = 10*10*10*10*10*10 B = 1,000,000 B – megabyte

1 GB ~ $10^{9}$ B – gigabyte

1 TB ~ $10^{12}$ B – terabyte

1 PB ~ $10^{15}$ B – petabyte

1 EB ~ $10^{18}$ B – exabyte

1 ZB ~ $10^{21}$ B – zettabyte

1 YB ~ $10^{24}$ B – yottabyte

Above, in the right column, are the names with the so-called “decimal prefixes”, which are used not only with bytes but also in other areas of human activity. For example, the prefix “kilo” in the word “kilobyte” means a thousand bytes, just as a kilometer is a thousand meters and a kilogram is a thousand grams.


Finally, a couple of examples of devices that can store terabytes and gigabytes of information.

There is a convenient “terabyte” - an external hard drive that connects via a USB port to the computer. You can store a terabyte of information on it. It is especially convenient for laptops (where changing the hard drive can be problematic) and for backing up information. It is better to back up information in advance, rather than after everything is lost.

Flash drives come in 1 GB, 2 GB, 4 GB, 8 GB, 16 GB, 32 GB, 64 GB and even 1 terabyte.

CDs can hold 650 MB, 700 MB, 800 MB and 900 MB.

DVDs are designed for a larger amount of information: 4.7 GB, 8.5 GB, 9.4 GB and 17 GB.
