Differences between the concepts of information and data

Data is also knowledge, but knowledge of a very special kind. To a first approximation, data is the result of recording in language a single observation, experiment, fact or situation. Examples of data could be:

a) “on such and such a date of such and such a year, at moment t, it was raining in a certain area” (meteorological data);

b) “the price of commercial timber on such and such a day of such and such a year, according to information from such and such an exchange, was so many dollars per ton” (trade data);

c) “the state budget deficit in such and such a country amounted to so many billions of dollars in such and such a year” (financial data);

d) “at such and such a moment in time, the automatic laboratory heading towards Jupiter deviated from the calculated trajectory by so many degrees, so many thousands of kilometers in such and such a direction” (data from the field of space technology).

From a technological point of view, experts often define “data” as information that is stored in databases and processed by application programs, or as information presented as a sequence of characters and intended for processing by a computer. In other words, data includes only that part of knowledge that is formalized to such an extent that formalized processing procedures can be carried out on it using technical means.

Data is information presented in a formalized form suitable for automatic processing with possible human participation. Data is information written (encoded) in the language of the machine. Data are individual facts that characterize objects, processes and phenomena in the subject area, as well as their properties.

There is a difference between information and data. Data can be regarded as signs or recorded observations that, for whatever reason, are not being used but only stored. Consequently, at a given moment they do not influence behavior or decision-making. Data turns into information once such an influence exists.

For example, the main body of data held in a computer consists of signs that do not affect behavior. Until this data is organized appropriately and presented as output on which a manager acts, it is not information. It remains data until an employee accesses it in connection with carrying out certain actions or making some decision.

Data turns into information when its meaning is realized. It can also be said that when it is possible to use data to reduce uncertainty about something, data turns into information.

Data Life Cycles

Like matter and energy, data can be collected, processed, stored, and changed in form. However, data has some distinctive features. First of all, data can be created and can disappear. For example, data on an extinct animal may disappear when a piece of coal bearing its imprints is burned. Data may also be erased, lose accuracy, and so on. Data can be characterized by a life cycle (Fig. 1.9), in which three aspects are of primary importance: generation, processing, and storage and retrieval.

Reproduction and use of data can occur at different points in their life cycle and are therefore not shown in the diagram.

Fig. 1.9. Life cycle of data

When processed on a computer, data is transformed, conditionally going through the following stages:

1) data as a result of measurements and observations;

2) data on tangible media (tables, protocols, directories);

3) data models (structures) in the form of diagrams, graphs, functions;

4) data in the computer in a data description language;

5) databases on computer media.

Data Models

The data model is the core of any database. The appearance of this term in the early 1970s is associated with the works of the American cyberneticist E. F. Codd, which reflected the mathematical aspect of the data model, used in the sense of a data structure. In the second half of the 1970s, in response to the needs of developing data processing technology, the theory of automated information banks (ABI) introduced the instrumental aspect of the data model: the term came to include the restrictions imposed on data structures and the operations on them.

In the modern interpretation, a data model is defined as a set of rules for generating data structures in databases and operations on them, together with integrity constraints that determine the permissible connections and values of data and the sequence of their changes.

Thus, a data model represents a set of data structures, integrity constraints, and data manipulation operations. Based on this, the following working definition can be formulated: A data model is a set of data structures and processing operations.

Currently, there are three main types of data models: hierarchical, network and relational. The hierarchical data model organizes data as a tree structure and implements logical connections of the generic (genus to species) or “whole-part” kind. For example, the structure of a higher education institution is a multi-level hierarchy (see Fig. 1.10).

Fig. 1.10. Example of a hierarchical structure

A hierarchical (tree) database consists of an ordered set of trees; more precisely, of an ordered set of multiple instances of the same type of tree. In this model, initial elements give rise to other elements, and these elements in turn give rise to further elements. Each child element has only one parent element. Organizational structures, lists of materials, tables of contents in books, project plans, meeting schedules, and many other sets of data can be presented in a hierarchical form.

The main disadvantages of this model are: a) the difficulty of representing “many to many” relationships between objects; b) the need to keep to the hierarchy that was laid down as the basis of the database during design. The need for constant data reorganization (and often the impossibility of this reorganization) led to the creation of a more general model, the network model.
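The single-parent rule described above can be illustrated with a minimal Python sketch. It assumes a hypothetical university hierarchy stored as a child-to-parent mapping; the unit names and the helper functions `path_to_root` and `children` are invented for illustration and are not taken from Fig. 1.10.

```python
# Hypothetical university hierarchy: each child maps to exactly one parent.
PARENT = {
    "Faculty of Economics": "University",
    "Faculty of Law": "University",
    "Department of Informatics": "Faculty of Economics",
    "Department of Finance": "Faculty of Economics",
    "Group E-101": "Department of Informatics",
}


def path_to_root(node, parent=PARENT):
    """Return the chain of elements from node up to the root of the tree."""
    path = [node]
    while path[-1] in parent:  # the root has no parent entry
        path.append(parent[path[-1]])
    return path


def children(node, parent=PARENT):
    """Return the direct children of node, in insertion order."""
    return [child for child, p in parent.items() if p == node]


print(path_to_root("Group E-101"))
print(children("Faculty of Economics"))
```

Because every element has at most one parent, the path from any element to the root is unique, which is what makes navigation in a hierarchy simple.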

The network approach to data organization is an extension of the hierarchical approach. This model differs from the hierarchical one in that each child element can have more than one parent element. An example of a network data model is shown in Figure 1.11.

Since a network database can directly represent all kinds of relationships inherent in the data of the corresponding organization, this data can be navigated, explored and queried in various ways, i.e. the network model is not bound to a single hierarchy. However, in order to query a network database, you need to understand its structure in depth (have the schema of this database at hand) and develop your own mechanism for navigating the database, which is a significant drawback of this database model.

Fig. 1.11. Example network structure
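As a rough sketch of the difference from the hierarchical case, the Python fragment below lets a child element have several parents and walks the parent links upward. The element names and the `ancestors` helper are hypothetical; the point is only that the caller has to know how the links are laid out in order to navigate them.

```python
# Hypothetical network-model data: a child may have several parents.
PARENTS = {
    "Course: Databases": ["Department of Informatics", "Teacher Ivanov"],
    "Course: Statistics": ["Department of Finance", "Teacher Ivanov"],
    "Teacher Ivanov": ["Department of Informatics"],
}


def ancestors(node, parents=PARENTS):
    """Collect every element reachable by following parent links upward."""
    found = set()
    stack = [node]
    while stack:
        for p in parents.get(stack.pop(), []):
            if p not in found:
                found.add(p)
                stack.append(p)
    return found


print(sorted(ancestors("Course: Statistics")))
```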

One of the disadvantages of the data models discussed above is that in some cases, with a hierarchical and network representation, the growth of the database can lead to a violation of the logical representation of the data. Such situations arise when new users, new applications and types of requests appear, taking into account other logical connections between data elements. The relational data model avoids these disadvantages.

A relational database is one in which all data is presented to the user in the form of rectangular tables of data values, and all operations on the database are reduced to manipulations with tables.

A table consists of columns (fields) and rows (records) and has a name that is unique within the database. The table reflects a type of real-world object (entity), and each of its rows represents a specific object. Thus, the Sports section table contains information about all the children attending a given sports section, and each of its rows is a set of attribute values for one specific child. Each table column is the collection of values of one specific attribute of the object. The Weight column, for example, holds the weight categories of all the children attending the section. The Gender column can contain only two different values: “male” and “female”. These values are selected from the set of all possible values of an object's attribute, called the domain. Thus, the values in the Weight column are selected from the set of all possible child weights.

Each column has a name, which is usually written at the top of the table. These columns are called the fields of the table. When designing tables within a specific DBMS, it is possible to select a type for each field, i.e. to define a set of rules for displaying it and to determine the operations that can be performed on the data stored in this field. The sets of types may vary between different DBMSs.

A field name must be unique within its table, but different tables can have fields with the same name. Any table must have at least one field; the fields are arranged in the table in the order in which their names appeared when it was created. Unlike fields, rows do not have names; their order in the table is not defined, and their number is logically unlimited. The rows are called the records of the table.

Since the rows in the table are not ordered, it is impossible to select a row by its position: there is no “first”, “second”, or “last” among them. Any table has one or more columns whose values uniquely identify each of its rows. This column (or combination of columns) is called a primary key. In the Sports section table, the primary key is the Full Name column (Fig. 1.12).

This choice of primary key has a significant drawback: it is impossible to record two children with the same value in the Full Name field in one section, a situation that is not so rare in practice. That is why an artificial field is often introduced to number the records in the table. Such a field could be, for example, a journal number for each child, which ensures the uniqueness of each record in the table. If a table satisfies this requirement, it is called a relation.

Fig. 1.12. Relational data model
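The drawback of a name-based key can be tried out with Python's standard `sqlite3` module. This is a minimal sketch under stated assumptions: the table names `section_by_name` and `section_by_number`, the column names and the sample rows are all hypothetical and are not taken from Fig. 1.12.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Variant 1: the natural key. Full Name is the primary key.
conn.execute(
    "CREATE TABLE section_by_name (full_name TEXT PRIMARY KEY, weight TEXT)"
)
conn.execute("INSERT INTO section_by_name VALUES ('Ivan Petrov', 'light')")
try:
    conn.execute("INSERT INTO section_by_name VALUES ('Ivan Petrov', 'heavy')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True  # a namesake cannot be recorded

# Variant 2: an artificial key (journal number) identifies each row.
conn.execute(
    "CREATE TABLE section_by_number ("
    "journal_no INTEGER PRIMARY KEY, full_name TEXT, weight TEXT)"
)
conn.executemany(
    "INSERT INTO section_by_number VALUES (?, ?, ?)",
    [(1, "Ivan Petrov", "light"), (2, "Ivan Petrov", "heavy")],
)
namesakes = conn.execute(
    "SELECT COUNT(*) FROM section_by_number WHERE full_name = 'Ivan Petrov'"
).fetchone()[0]

print(duplicate_rejected, namesakes)
```

With the journal number as the key, two children with the same full name occupy two distinct rows, while the name-based table refuses the second one.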

Relational data models can typically support four types of relationships between tables:

1) One to one (example: one table stores information about schoolchildren, another table stores information about the schoolchildren’s vaccinations).

2) One to many (example: one table stores information about teachers, another table stores information about the students for whom these teachers are class teachers).

3) Many to one (as an example, take the previous case viewed from the other side, namely from the side of the table that stores information about schoolchildren).

4) Many to many (example: orders for the supply of goods are stored in one table and the companies executing these orders in another, and several companies can combine to fulfill one order).
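The many-to-many case from the list above can be sketched with `sqlite3`. The text does not say how such a relationship is stored; a common technique, assumed here, is a separate linking table with one row per (order, company) pair. All table names, column names and sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders    (order_id   INTEGER PRIMARY KEY, goods TEXT);
CREATE TABLE companies (company_id INTEGER PRIMARY KEY, name  TEXT);
-- Linking table: one row per (order, company) pair.
CREATE TABLE order_company (
    order_id   INTEGER REFERENCES orders(order_id),
    company_id INTEGER REFERENCES companies(company_id),
    PRIMARY KEY (order_id, company_id)
);
INSERT INTO orders    VALUES (1, 'timber'), (2, 'paper');
INSERT INTO companies VALUES (10, 'Alpha'), (20, 'Beta');
INSERT INTO order_company VALUES (1, 10), (1, 20), (2, 10);
""")


def companies_for(order_id):
    """Names of all companies that take part in the given order."""
    rows = conn.execute(
        "SELECT c.name FROM companies c "
        "JOIN order_company oc ON oc.company_id = c.company_id "
        "WHERE oc.order_id = ? ORDER BY c.name",
        (order_id,),
    )
    return [name for (name,) in rows]


def orders_for(company_id):
    """Goods of all orders in which the given company takes part."""
    rows = conn.execute(
        "SELECT o.goods FROM orders o "
        "JOIN order_company oc ON oc.order_id = o.order_id "
        "WHERE oc.company_id = ? ORDER BY o.goods",
        (company_id,),
    )
    return [goods for (goods,) in rows]


print(companies_for(1), orders_for(10))
```

Read from the orders side, one order has many companies; read from the companies side, one company has many orders. A one-to-many relationship needs no linking table: a single reference column on the "many" side is enough.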

Relational representation of data has a number of advantages. It is understandable to a user who is not a programming specialist, allows you to easily add new descriptions of objects and their characteristics, and has great flexibility when processing queries.

Questions and tasks

1. Define the concept of “data”.

2. What is the data life cycle?

3. What data models do you know?

4. List the advantages and disadvantages of each data model.


INFORMATION PROCESSES


Module 1 (1.5 credits): Introduction to Economic Informatics

Topic 1.1: Theoretical foundations of economic informatics

Topic 1.2: Technical means of information processing

Topic 1.3: System Software

Topic 1.4: Service software and algorithmic basics

Economic informatics and information

1.1. Theoretical foundations of economic informatics

1.1.2. Data, information and knowledge

Basic concepts of data, information, knowledge.

The basic concepts used in economic informatics include data, information and knowledge. These terms are often used interchangeably, but there are fundamental differences between them.

The term “data” comes from the Latin word data (“fact”), and “information” (Latin informatio) means explanation or presentation, i.e. a report or message.

Data is a collection of information recorded on a specific medium in a form suitable for permanent storage, transmission and processing. Transformation and processing of data allows you to obtain information.

Information is the result of data transformation and analysis. The difference between information and data is that data is fixed information about events and phenomena that is stored on certain media, and information appears as a result of data processing when solving specific problems. For example, various data are stored in databases, and upon a certain request, the database management system provides the required information.

There are other definitions of information. For example, information is facts about objects and phenomena of the environment, their parameters, properties and states, which reduce the degree of uncertainty and incompleteness of knowledge about them.

Knowledge is recorded, practice-tested, processed information that has been used and can be reused for decision-making.

Knowledge is a type of information that is stored in a knowledge base and reflects the knowledge of a specialist in a specific subject area. Knowledge is intellectual capital.

Formal knowledge can be in the form of documents (standards, regulations) regulating decision-making or textbooks, instructions describing how to solve problems.

Informal knowledge is the knowledge and experience of specialists in a certain subject area.

It should be noted that there are no universal definitions of these concepts (data, information, knowledge); they are interpreted in different ways.

Decisions are made based on the information received and existing knowledge.

Decision-making is the choice of the option that is best, in some defined sense, from the set of acceptable options, based on the available information.

The relationship between data, information and knowledge in the decision-making process is presented in the figure.


Fig. 1.

To solve the problem, fixed data is processed on the basis of existing knowledge, then the information received is analyzed using existing knowledge. Based on the analysis, all feasible solutions are proposed, and as a result of the choice, one decision that is best in some sense is made. The results of the solution add to knowledge.
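The sequence described above (fixed data, selection of feasible solutions, choice of the best one) can be illustrated with a small Python sketch. Everything in it is an assumption made for illustration: the supplier offers, the delivery limit standing in for "existing knowledge", and the choice of lowest price as the sense in which a decision is "best".

```python
# Fixed data: hypothetical supplier offers (name, price, delivery days).
OFFERS = [
    ("Alpha", 120.0, 14),
    ("Beta", 135.0, 7),
    ("Gamma", 128.0, 9),
]
MAX_DELIVERY_DAYS = 10  # existing knowledge: an assumed rule of the firm


def feasible(offers, max_days=MAX_DELIVERY_DAYS):
    """Keep only the options that satisfy the constraint."""
    return [o for o in offers if o[2] <= max_days]


def best(offers):
    """Choose the best option in a defined sense: here, the lowest price."""
    return min(offers, key=lambda o: o[1])


decision = best(feasible(OFFERS))
print(decision)
```

The cheapest offer overall is not chosen because it is not feasible; changing the criterion in `best` changes the decision without changing the data.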

Depending on the scope of use, information can be different: scientific, technical, management, economic, etc. For economic informatics, economic information is of interest.

Information is facts about objects and phenomena of the environment, their parameters, properties and states, which reduce the degree of uncertainty and incompleteness of knowledge about them.

Data is a collection of information recorded on a specific medium in a form suitable for permanent storage, transmission and processing. Transformation and processing of data allows you to obtain information. Data becomes information when it is used.

2. Properties of information: objectivity, reliability, completeness, relevance, adequacy, accessibility.

Information properties:

  1. Objectivity of information. Objective means existing outside and independently of human consciousness. Information is a reflection of the external objective world. Information is objective if it does not depend on the method of recording it or on anyone’s opinion or judgment. Example: the message “It’s warm outside” carries subjective information, while the message “It’s 22°C outside” carries objective information. Objective information can be obtained using working sensors and measuring instruments. When reflected in a person’s consciousness, information can be distorted depending on the opinion, judgment, experience and knowledge of the particular subject, and thus cease to be objective.
  2. Reliability of information. Information is reliable if it reflects the true state of affairs. Objective information is always reliable, but reliable information can be both objective and subjective. Reliable information helps us make the right decision. Information may be unreliable for the following reasons:
     • intentional or unintentional distortion of a subjective nature;
     • distortion caused by interference or by insufficiently accurate means of recording.
  3. Completeness of information. Information can be called complete if it is sufficient for understanding and making decisions. Incomplete information may lead to an erroneous conclusion or decision.
  4. Relevance of information is the degree to which information corresponds to the current moment in time. Only information received in time can be useful.
  5. Adequacy of information is the degree to which it corresponds to the real, objective state of affairs. Inadequate information can arise when new information is created from incomplete or unreliable data. However, even complete and reliable data can produce inadequate information if inadequate methods are applied to it.
  6. Availability of information is a measure of the possibility of obtaining a given piece of information. The degree of availability is influenced both by the availability of data and by the availability of adequate methods for interpreting it. Lack of access to data and lack of adequate data processing methods lead to the same result: the information is inaccessible.

    In recent years, Xerox has positioned itself not as a manufacturer of copying machines but as a document processing company. 3M calls itself an innovative problem-solving company. IBM identifies itself as a company that creates long-term economic benefits for customers by combining its business knowledge with broad technological capabilities. The office equipment company Steelcase says it sells proprietary knowledge and services that help create better experiences for people in their workplaces. What adds value to all these companies? Mainly solutions based on knowledge: technical and technological know-how, product design, marketing research, identifying the true needs of customers. It is knowledge that gives these companies a sustainable competitive advantage.

    Let us consider how knowledge differs from data and information. Managers begin to realize especially clearly that these are different things after the organization has spent significant funds on creating a database or information system, or simply on computerization, without any corresponding effect.

    Data is a collection of various objective facts. In corporations these are, for example, structured records of transactions (in particular, data on all sales: how much was bought, when and by whom, how much was paid and when, etc.). This data does not tell us why the buyer came here or whether he will come again.

    Information is a hierarchical collection of data about certain aspects of the real world. Information is a flow of messages, and knowledge is created from this flow; it depends on the opinions and beliefs of the knowledge bearer.

    Information is a kind of message, usually in the form of a document or in video or audio form. It has a recipient and a sender. It informs, i.e. "gives shape" to the recipient by changing his evaluations or behavior. The extent to which the message is information is determined by the recipient. It is he who evaluates how much the received message informs him, and how much it is simply information noise.

    Data turns into information in several ways:

    o contextualization: we know what this data is for;

    o calculation: we process data mathematically;

    o correction: we correct errors and eliminate omissions;

    o compression: we compress, concentrate, aggregate data.
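The four ways listed above can be walked through on a toy data set. The Python sketch below is only an illustration under stated assumptions: the temperature readings, the treatment of a missing value and the purpose named in the report are all hypothetical.

```python
from statistics import mean

# Raw data: hypothetical temperature readings; None marks a missing value.
readings = [21.5, 22.0, None, 23.5, 21.0]

# Correction: eliminate omissions (here, simply drop missing readings).
corrected = [r for r in readings if r is not None]

# Calculation: process the data mathematically.
average = mean(corrected)

# Compression: aggregate many values into a short summary.
summary = {"min": min(corrected), "max": max(corrected), "mean": average}

# Contextualization: state what the data is for.
report = f"Outdoor temperature, degrees C (to decide on heating): {summary}"
print(report)
```

The list of raw numbers says little by itself; the corrected, aggregated and labeled report is something a recipient can act on.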

    Knowledge is a concept deeper and broader than data or information. Each enterprise, in the course of its activities, collects data, structures it and generates new knowledge. Most often, this knowledge concerns production technology, in the case of material production, or the technology of working with clients and of internal interaction, in the case of an enterprise that provides customer service. It can also be knowledge about the environment of the enterprise: demographic, macroeconomic, social, technological and market trends.


    The difference between knowledge and information and data: an example

    Chrysler has a collection of computer files called the Engineering Knowledge Book, which provides comprehensive data and information about the company's automobiles for use by any new car designer. When the manager received data on the crash tests performed, he refused to put it in the files without appropriate processing. He suggested answering the following questions:

    o why these tests were carried out;

    o what are the results compared to other similar tests of this company from other years and competitors;

    o what conclusions follow from the tests for the design of the car and its main components?

    Questions of this kind transform information into knowledge; moreover, the answers to them add value to the information. In practice, there are also opposite examples, in which adding unnecessary, empty information makes the original information lose its value: the necessary information is blurred in the flow of information noise.

    Knowledge is a combination of experience, values, contextual information, expert assessments, which provides a general framework for assessing and incorporating new experience and information. Knowledge exists in the minds of those who know. In organizations, it is recorded not only in documents, but also in processes, procedures, norms, and in practice in general.

    Just as information arises from data, so knowledge arises from information by:

    o comparisons, determining the scope (how and when we can apply information about this phenomenon to another, similar one);

    o establishing connections (how this information relates to other information);

    o assessments (how this information can be assessed and how others evaluate it);

    o determining the scope (how this information applies to certain decisions or actions).

    The process of transforming data into information, and information into knowledge is shown in Fig. 14.1.

    Fig. 14.1. Data, information and knowledge

    There is a distinction between individual and group knowledge. Traditional views assume that knowledge is the prerogative of individuals, with a group being just the simple sum of the members of that group, and group knowledge being the sum of their knowledge.

    There is another, modern point of view, according to which a group of people forms a new entity with its own unique specificity. Within the framework of this concept, we can speak of group behavior and, correspondingly, group knowledge. This concept is widely used in the science of knowledge management. Thus, knowledge can be acquired not only by an individual but also by a group of people. One then says that an organization as a whole, a group, a team, etc. knows something.

    Bill Gates, in his book Business at the Speed of Thought, writes about the need to increase corporate IQ. By this, he means not only the number of smart employees, but also the accumulation of knowledge in the company as a whole and the free flow of information, which allows employees to benefit from each other's ideas.

    Knowledge can be explicit or tacit. Explicit knowledge can be expressed in words and numbers and can be transmitted in formalized form on media. This refers to those types of knowledge that are transmitted in the form of prescriptions, instructions, books, on various media, in the form of memos, etc.

    Tacit knowledge, in principle, cannot be formalized and can exist only together with its owner, a person or a group of persons.

    There are two types of tacit knowledge. The first is the technical skills that are demonstrated by masters of their craft and are, as a rule, the result of many years of practice. The second is the beliefs, ideals, values and mental models that we use without thinking about them.

    Tacit knowledge is formed and developed in the process of creating and strengthening a positive corporate culture and through means of group interaction (retreats, creative groups, etc.).

    The attitude of business firms towards explicit and tacit knowledge is very contradictory. On the one hand, many firms strive to transform tacit knowledge into explicit knowledge, both in order not to depend on individuals and in order to replicate significant achievements. On the other hand, these firms are not interested in seeing their core competitive advantages transferred into a form ready for duplication. That is why many companies try to maintain some of their competitive advantages in forms that cannot be duplicated (specific training, corporate culture, special service systems, etc.).

    The bearer of both explicit and implicit knowledge can be not only a specific person, but also an organization. Consequently, we can talk about tacit group knowledge, which underlies stable patterns of collective reactions and internal interactions.

    In Western literature, the term “routines” is sometimes used to denote tacit group knowledge, that is, the repetitive actions and regular behavioral patterns of an organization or firm. Routines are what happens automatically, without instructions and in the absence of a choice procedure; however, routines cannot be codified.

    In Russian, routine is understood as a routine, established practice, a certain regime, a pattern, established rules regarding people’s activities. At the same time, the concept of “routine” has one more meaning: it is an inert order, i.e. an order that gravitates towards the old, familiar, and, due to its backwardness, is impervious to the new, progressive. In cases where the term “routine” is used to denote group tacit knowledge, the connotations related to rigidity are absent.

    Thus, personal tacit knowledge is, first of all, skills, while group tacit knowledge is, first of all, routines. Routines do not exist in isolation but are interdependent. Some routines may be implicit for some members of a group (organization) and explicit for others. Thus, the boundaries between explicit and implicit knowledge are relative, and we can also speak of the degree of tacitness of this knowledge. The relationship between explicit and implicit, individual and group knowledge is presented in Table 14.1.

    Table 14.1

    Knowledge ratio

    The presence of tacit knowledge in an organization forces us to approach knowledge management in an unconventional way. Traditionally, knowledge management refers to the creation, development and use of various databases and knowledge bases. The presence of tacit knowledge shifts attention to the means of direct communication between people. What matters is not only, and not so much, creating a corporate encyclopedia that records everything any employee has known and encountered. In the case of tacit knowledge, it is more important to have at hand the contact details of people who know the recipe and have the relevant experience, and to create a culture of communication using brainstorming sessions, meetings, debriefings and appropriate means of communication, such as e-mail, personal websites, teleconferences, etc.

    Data and information are often equated, but there is a significant difference between the two terms:

    Information is knowledge relating to concepts and objects (facts, events, things, processes, ideas) in the human brain;

    Data is a representation of processed information suitable for transmission, interpretation, or processing (computer files, paper documents, records in an information system).

    The difference between information and data is that:

    1) data is fixed information about events and phenomena that is stored on certain media, and information appears as a result of data processing when solving specific problems.

    For example, various data are stored in databases, and upon a certain request, the database management system provides the required information.

    2) data are information carriers, not the information itself.

    3) Data turns into information only when a person becomes interested in it. A person extracts information from data, evaluates, analyzes it and, based on the results of the analysis, makes one decision or another.

    Data turns into information in several ways:

    Contextualization: we know what the data is for;

    Calculation: we process data mathematically;

    Correction: we correct errors and eliminate omissions;

    Compression: we compress, concentrate, aggregate data.

    Thus, if it is possible to use data to reduce the uncertainty of knowledge about a subject, then data turns into information. Therefore, it can be argued that information is the data used.

    4) Information can be measured. The measure of information content is associated with the change in the recipient's degree of ignorance and is based on the methods of information theory.
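One common measure from information theory is Shannon entropy, which the text does not name explicitly; it is used here as an assumption. The sketch below treats the recipient's ignorance as the entropy of the set of possible outcomes and the information received as the drop in entropy after a message. The scenario (8 equally likely outcomes narrowed to 2) is hypothetical.

```python
from math import log2


def entropy(probabilities):
    """Shannon entropy in bits of a discrete probability distribution."""
    return -sum(p * log2(p) for p in probabilities if p > 0)


# Before the message: 8 equally likely outcomes.
before = entropy([1 / 8] * 8)
# After the message: 2 equally likely outcomes remain.
after = entropy([1 / 2] * 2)

# Information received = reduction in the recipient's uncertainty.
information_gained = before - after
print(before, after, information_gained)
```

Under these assumptions the uncertainty falls from 3 bits to 1 bit, so the message carries 2 bits of information for this recipient.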

    2. Subject area is the part of the real world whose data we want to reflect in the database. The subject area is infinite and contains both essential concepts and data and insignificant data. Thus, the importance of data depends on the choice of subject area.

    Domain model. A domain model is our knowledge about a subject area. This knowledge can exist informally in the expert’s mind or be expressed formally by some means. Experience shows that a textual representation of a domain model is extremely ineffective. Descriptions of the subject area made using specialized graphic notations are much more informative and useful when developing databases. There are many methods for describing a subject area. The best known include the SADT structural analysis technique and IDEF0, which is based on it, Gane-Sarson data flow diagrams, the UML object-oriented analysis technique, etc. The domain model mainly describes the processes occurring in the subject area and the data used by these processes. The success of further application development depends on how correctly the subject area is modeled.

    3. Database is a set of independent materials presented in an objective form (articles, calculations, regulations, court decisions and other similar materials), systematized in such a way that these materials can be found and processed using a computer.

    Many experts point out the common mistake of incorrectly using the term “database” instead of the term “database management system”, and point out the need to distinguish between these concepts.