Automation of the office space rental manager's workplace. Entity-relationship information model of databases

Infological model of databases "Entity-relationship" Basic concepts

The purpose of information modeling is to provide the most natural ways for humans to collect and present the information that is supposed to be stored in the database being created data. Therefore, they are trying to build an infological data model by analogy with natural language (the latter cannot be used in pure form due to complexity computer processing texts and ambiguity of any natural language). The main constructive elements of information models are entities, connections between them and their properties (attributes).

An entity is any distinguishable object (an object that we can distinguish from another), information about which must be stored in a database. Entities can be people, places, planes, flights, taste, color, etc. It is necessary to distinguish between concepts such as entity type and entity instance. The concept of entity type refers to a set of homogeneous individuals, objects, events or ideas acting as a whole. An entity instance refers to a specific thing in a set. For example, the entity type can be CITY, and the instance can be Moscow, Kyiv, etc.

An attribute is a named characteristic of an entity. Its name must be unique for specific type essence, but may be the same for various types entities (for example, COLOR can be defined for many entities: DOG, CAR, SMOKE, etc.). Attributes are used to define what information should be collected about an entity. Examples of attributes for the CAR entity are TYPE, MAKE, LICENSE PLATE, COLOR, etc. Here too there is a distinction between type and instance. The COLOR attribute type has many instances or values:

Red, Blue, Banana, White Night, etc.,

However, each entity instance is assigned only one attribute value.

There is no absolute difference between entity types and attributes. An attribute is such only in relation to the entity type. In another context, an attribute may act as an independent entity. For example, for a car factory, color is only an attribute of the production product, but for a paint and varnish factory, color is a type of entity.

Key - minimum set attributes whose values ​​can be used to uniquely find the required instance of an entity. Minimality means that excluding any attribute from the set does not allow identifying the entity by the remaining ones. For the Schedule entity (clause 1.2), the key is the Flight_number attribute or the set of: Departure_point, Departure_time and Destination_point (provided that one plane flies from point to point at each time).

A relationship is an association of two or more entities. If the purpose of the database was only to store individual, unrelated data, then its structure could be very simple. However, one of the main requirements for organizing a database is to ensure the possibility of finding some entities by the values ​​of others, for which it is necessary to establish between them certain connections. And since real databases often contain hundreds or even thousands of entities, theoretically more than a million connections can be established between them. The presence of such a multitude of connections determines the complexity of information models.

Characteristics of connections and modeling language

When building information models, you can use the language of ER diagrams (from the English Entity-Relationship, i.e. entity-relationship). In them, entities are depicted as marked rectangles, associations as marked diamonds or hexagons, attributes as marked ovals, and connections between them as non-directional edges, above which the degree of connection (1 or a letter replacing the word “many”) and the necessary explanation can be indicated.

Between two entities, for example, A and B, four types of connections are possible.

The first type is a ONE-TO-ONE relationship (1:1): at each moment of time, each representative (instance) of entity A corresponds to 1 or 0 representatives of entity B:

A student may not “earn” a scholarship, receive a regular scholarship, or receive one of the enhanced scholarships.

The second type is a ONE-TO-MANY relationship (1:M): one representative of entity A corresponds to 0, 1 or several representatives of entity B.

The apartment may be empty; one or more residents may live in it.

Since connections in both directions are possible between two entities, there are two more types of relationships: MANY-TO-ONE (M:1) and MANY-TO-MANY (M:N).

Example 2.1. If the connection between the entities of MAN and WOMAN is called MARRIAGE, then there are four possible representations of such a connection:

The nature of connections between entities is not limited to those listed. There are also more complex connections:

Many relationships between the same entities

(a patient, having one attending physician, may also have several consulting physicians; a physician may be the attending physician of several patients and may simultaneously consult several other patients);

Training connections

(a doctor can order more than one patient for more than one test, a test can be ordered by more than one doctor for more than one patient, and a patient can be ordered for more than one test by more than one doctor);

Connections of higher orders, the semantics (meaning) of which is sometimes very complex.

In the examples given, to improve the illustrative nature of the relationships under consideration, the attributes of entities and associations in all ER diagrams are not shown. Thus, entering only a few basic attributes into the description of marriage ties will significantly complicate the ER diagram (Fig. 2.1a). In this regard, the language of ER diagrams is used to build small models and illustrate individual fragments of large ones. More often, a less visual but more meaningful information modeling language (IML) is used, in which entities and associations are represented by sentences of the form:

ENTITY (attribute 1, attribute 2, ..., attribute n)

ASSOCIATION [ENTITY S1, ENTITY S2, ...]

(attribute 1, attribute 2, ..., attribute n)

where S is the degree of connection, and the attributes included in the key must be marked with an underscore.

Thus, the above example of a set of connections between entities can be described in NAM as follows:

Doctor (Doctor_number, Last name, First name, Patronymic, Specialty)

Patient (Registration_number, Bed number, Last name,

Name, Patronymic, Address, Date of Birth, Gender)

Attending_doctor [Doctor 1, Patient M]

(Doctor_number, Registration_number)

Consultant [Doctor M, Patient N]

(Doctor_number, Registration_number).

Rice. 2.1. Examples of ER diagrams

To identify relationships between entities, it is necessary, at a minimum, to define the entities themselves. But it is not simple task, since in different subject areas the same object can be an entity, attribute or association. Let us illustrate this statement with examples related to the description of marital ties (see example 2.1).

Example 2.2. The Civil Registration Office (ZAGS) does not deal with all people, but only with those who have applied to register a marriage, birth or death. Therefore, in countries where only traditional marriages are allowed, civil registry offices can post information about registered marriages in a single entity:

Marriage (Certificate_number, Husband's last name, Husband's first name,

Middle_name of husband, Date of birth_of husband, Last name of wife,

Registration_date, Registration_place, ...),

The ER diagram of which is shown in Fig. 2.1, b.

Example 2.3. Now consider a situation where the civil registry office is located in a country that allows polygamy. If you use the “Marriage” entity of example 2.2 to register marriages, then information about husbands who have several wives will be duplicated (see Table 2.1).

Table 2.1

Duplication can be eliminated by creating an additional entity "Husbands"

Husbands (Code_M, Last name, First name, Patronymic, Date of birth, Place of birth)

and replacing the “Marriage” entity with a characteristic (see clause 2.3) with reference to the corresponding description in the “Husbands” entity.

Marriage (Certificate number, Code_M, Wife's last name, ...,

Date of registration, ... (Husbands).

The ER diagram of the connection between these entities is shown in Fig. 2.1,c, and an example of their copies is in table. 2.2 and 2.3.

Table 2.2

Table 2.3

Example 2.4. Finally, consider the case when an organization required data on the presence of married couples, and there is already an entity to store information about employees

Employees (Personnel_number, Last name, First name, ...).

Using the “Marriage” entity discussed in example 2.2 is inappropriate: “Employees” already contains the surnames, first names, and patronymics of the spouses. Therefore, let's create an association

Marriage [Employee 1, Employee 1]

(Husband_personnel_number, Wife_personnel_number, ...),

connecting certain instances of the “Employees” entity (Fig. 2.1,d).

In conclusion, we note that the ER diagram Fig. 2.1a describes the structure of placing data on marriages in the registry offices of countries that allow group marriages, and the ER diagrams of example 2.1 describe any types of marriages in organizations where there are entities “men” and “women”, including single and unmarried people.

What is “connection”? In ER diagrams, this is the line connecting geometric figures, depicting entities, attributes, associations and other information objects. In the text, this term is used to indicate the interdependence of entities. If this interdependence has attributes, then it is called an association.

Entity classification

The time has come to understand the terminology. K. Date defines three main classes of entities: core, associative and characteristic, as well as a subclass of associative entities - designations.

A core entity (pivot) is an independent entity (it will be defined in more detail below).

In the examples discussed earlier, the rods are “Student”, “Apartment”, “Men”, “Doctor”, “Marriage” (from example 2.2) and others, the names of which are placed in rectangles.

An associative entity (association) is a “many-to-many” (“-to-many”, etc.) relationship between two or more entities or instances of an entity (as in example 2.4). Associations are treated as full-fledged entities:

they can participate in other associations and designations just like core entities;

may have properties, i.e. have not only a set of key attributes necessary to indicate relationships, but also any number of other attributes that characterize the relationship. For example, the “Marriage” associations from examples 2.1 and 2.4 contain key attributes “Code_M”, “Code_Zh” and “Husband’s Personnel Number”, “Wife’s Personnel Number”, as well as clarifying attributes “Certificate Number”, “Registration Date”, “Registration Place” ", "Entry number in the registry office book", etc.

A characteristic entity (characteristic) is a many-to-one or one-to-one relationship between two entities ( special case associations). The only purpose of the characteristic within the framework of the considered subject area consists in describing or clarifying some other entity. The need for them arises due to the fact that entities of the real world sometimes have multi-valued properties. A husband can have several wives (example 2.3), a book can have several characteristics of a reprint (corrected, expanded, revised, ...), etc.

The existence of a characteristic depends entirely on the entity being characterized: women lose their status as wives if their husband dies.

To describe the characteristics, a new JIM proposal is used, which has general case view:

CHARACTERISTIC (attribute 1, attribute 2, ...)

(LIST OF CHARACTERIZED ENTITIES).

Let's also expand the language of ER diagrams by introducing a trapezoid to represent the characteristics (Fig. 2.2).

Rice. 2.2. Elements of the extended ER diagram language

A designating entity or designation is a many-to-one or one-to-one relationship between two entities and differs from a characteristic in that it does not depend on the designated entity.

Let's consider an example related to the enrollment of employees in various departments of the organization.

In the absence of strict rules (an employee can be simultaneously enrolled in several departments or not be enrolled in any department), it is necessary to create a description with the association Enrollment:

Employees (Personnel number, Last name, ...)

Enrollment [Departments M, Employees N]

(Department number, Personnel number, Date of enrollment).

However, provided that each employee must be enrolled in one of the departments, you can create a description with the designation Employees:

Departments (Department number, Department name, ...)

Employees (Personnel number, Last name, ..., Department number,

Date of enrollment)[Departments]

IN in this example employees have an independent existence (if a department is deleted, it does not follow that the employees of that department must also be deleted). Therefore, they cannot be characteristics of departments and are called designations.

Notations are used to store repeating values ​​of large text attributes: “codifiers” of disciplines studied by students, names of organizations and their departments, lists of goods, etc.

The description of a designation differs externally from the description of a characteristic only in that the designated entities are enclosed not in curly brackets, but in square brackets:

NOTE (attribute 1, attribute 2, ...)[LIST

DESIGNATED ENTITIES].

Typically, designations are not treated as full entities, although this would not lead to any error.

Designations and characteristics are not completely independent entities, since they presuppose the existence of some other entity that will be “designated” or “characterized.” However, they still represent special cases of essence and can, of course, have properties, can participate in associations, designations and have their own (more low level) characteristics. We also emphasize that all instances of a characteristic must be associated with some instance of the characterized entity. However, it is allowed that some instances of the characterized entity do not have relationships. True, if this concerns marriages, then the essence of “Husbands” should be replaced by the essence of “Men” (there is no husband without a wife).

Let us now redefine the core entity as an entity that is neither an association, nor a designation, nor a characteristic. Such entities have independent existence, although they can designate other entities, such as employees designate departments.

In conclusion, let us consider an example of constructing an information model of the “Nutrition” database, where information about dishes (Fig. 2.3), their daily consumption, the products from which these dishes are prepared, and the suppliers of these products should be stored. The information will be used by the chef and manager of a small catering establishment, as well as its visitors.

Rice. 2.3. Example of a recipe

With the help of these users, the following objects and characteristics of the designed base were identified:

Dishes that require data included in their culinary recipes to describe them: dish number (for example, from a cookbook), name of the dish, type of dish (appetizer, soup, main course, etc.), recipe (technology for preparing the dish), yield (portion weight), name, calorie content and weight of each product included in the dish. For each product supplier: name, address, name of product supplied, delivery date and price at the time of delivery. Daily food consumption (consumption): dish, number of servings, date.

Analysis of objects allows us to highlight:

Rods Dishes, Products and Cities; associations Composition (links Dishes with Products) and

Supplies (links Suppliers with Products);

Designation Suppliers; characteristics Recipes and Consumption.

The ER diagram of the model is shown in Fig. 2.4. and the model in the YAM language has next view:

Dishes (BL, Dish, View)

Products (PR, Product, Calorie content)

Suppliers (POS, City, Supplier) [City]

Composition [Dishes M, Products N] (BL, PR, Weight (g))

Supplies [Suppliers M, Products N] (POS, PR, Date_P, Price, Weight (kg))

Cities (City, Country)

Recipes (BL, Recipe) (Dishes)

Consumption (BL, Date_R, Portions) (Dishes)

In these models, Dish, Product and Supplier are names, and BL, PR and POS are digital codes dishes, products and organizations that supply these products.

Rice. 2.4. Infological model of the "Nutrition" database

About primary and foreign keys

Recall that a key or candidate key is a minimum set of attributes whose values ​​can be used to uniquely find the required instance of an entity. Minimality means that excluding any attribute from the set does not allow identifying the entity by the remaining ones. Every entity has at least one possible key. One of them is taken as primary key. When choosing a primary key, you should give preference to non-composite keys or keys made up of a minimum number of attributes. It is also inadvisable to use keys with long text values(preferably use integer attributes). Thus, to identify a student, you can use either a unique record book number, or a set of last name, first name, patronymic, group number and possibly additional attributes, since it is possible that two students (and more often female students) with the same last names, first names and patronymics. It is also bad to use as a key not the number of the dish, but its name, for example, “Appetizer of processed cheese “Friendship” with ham and pickled cucumber” or “Hare in sour cream with potato croquettes and red cabbage salad.”

The primary key of a core entity (any attribute participating in the primary key) is not allowed to have an undefined value. Otherwise, a contradictory situation will arise: a non-individual, and therefore non-existent, instance of the core essence will appear. For the same reasons, it is necessary to ensure the uniqueness of the primary key.

Now about foreign keys:

If entity C links entities A and B, then it must include foreign keys corresponding to the primary keys of entities A and B. If entity B refers to entity A, then it must include a foreign key corresponding to the primary key of entity A.

In paragraph 2.3, an example was considered where “Employees” denoted “Departments” and included a foreign key “Department Number” corresponding to the primary key of the “Departments” entity.

The relationship between primary and foreign keys of entities is illustrated in Fig. 2.5.

Rice. 2.5. Structures: a - associations; b - designations (characteristics)

Here, to designate any of the associated entities (cores, characteristics, designations or even associations), a new generalizing term “Goal” or “Target Entity” is used.

Thus, when considering the problem of choosing how to represent associations and notations in a database, the main question that needs to be answered is: “What are foreign keys?” And then, for each foreign key, three questions need to be resolved:

1. Can this foreign key accept undefined values ​​(NULL values)? In other words, can there be some instance of an entity of this type, for which the target entity indicated by the foreign key is unknown? In the case of supplies, this is probably not possible - a supply from an unknown supplier or a supply of an unknown product does not make sense. But in the case of employees, such a situation could however make sense - it is quite possible that any employee in this moment not enrolled in any department at all. Note that the answer to this question does not depend on the whim of the database designer, but is determined by the actual course of action adopted in that part of the real world that is to be represented in the database in question. Similar remarks are relevant to the issues discussed below.

2. What should happen when you try to DELETE a target entity that is referenced by a foreign key? For example, when deleting a supplier who has made at least one delivery. There are three possibilities:

3. What should happen when you try to UPDATE the primary key of a target entity that is referenced by some foreign key? For example, an attempt may be made to update the number of a supplier for which there is at least one corresponding delivery. To be clear, we will again consider this case in more detail. You have the same three options as when deleting:

Thus, for each foreign key in a design, the database designer must specify not only the field or combination of fields that makes up that foreign key and the target table that is identified by that key, but also the answers to the questions above (the three constraints that apply to this foreign key).

Finally, about characteristics - denoting entities, the existence of which depends on the type of denoted entities. The designation is represented by a foreign key in the table corresponding to that characteristic. But the three foreign key constraints discussed above for this case should be specified as follows:

NULL values ​​are not allowed

REMOVAL FROM (target) CASCADES

UPDATE (target primary key) CASCADES

The specified specifications represent dependence on the existence of characteristic entities.

Integrity Constraints

Integrity (from the English integrity - intactness, inviolability, safety, integrity) is understood as the correctness of data at any time. But this goal can only be achieved within certain limits: the DBMS cannot control the correctness of every single value entered into the database (although each value can be checked for plausibility). For example, it cannot be discovered that the input value 5 (representing the day of the week) should actually be 3. On the other hand, the value 9 would clearly be an error and should be rejected by the DBMS. However, to do this, she should be told that the numbers of the days of the week must belong to the set (1,2,3,4,5,6,7).

Maintaining database integrity can be thought of as protecting data from unauthorized changes or destruction (not to be confused with unauthorized changes and destruction, which are a security issue). Modern DBMSs have a number of means to ensure that integrity is maintained (as well as means to ensure that security is maintained).

There are three groups of integrity rules:

Entity integrity. Referential integrity. User-defined integrity.

In section 2.4, the motivation for two integrity rules, common to any relational databases data.

Any attribute participating in a primary key is not allowed to have an undefined value. The value of the foreign key must either: be equal to the value of the target's primary key; be completely uncertain, i.e. Each attribute value participating in a foreign key must be null. For any specific database, there are a number of additional specific rules that apply to it alone and are determined by the developer. Most often controlled:

the uniqueness of certain attributes,
range of values ​​(exam score from 2 to 5),
belonging to a set of values ​​(gender "M" or "F").

About building an information model

A reader who has become acquainted only with the material in this and the previous chapters will not be able to correctly perceive and evaluate those tips and recommendations for building a good information model, which have been developed over decades by the largest specialists in the field of data processing. To do this, you need to at least study the following materials. Ideally, it is necessary for the reader to first implement at least one information system project and propose it real users and have been a database and application administrator long enough to recognize at least a small fraction of the problems that arise from poorly thought out designs. The experience of the author and all the information systems specialists he knows shows that any theoretical recommendations are taken seriously only after several unsuccessful attempts to revive poorly designed systems. (Although there are also designers who continue to believe that they can revive a dying project by changing programs, rather than by changing the database infological model.)

Indeed, to determine the list and structure of stored data, it is necessary to collect information about real and potential applications, as well as about database users, and when building an information model, you should only care about the reliability of storing this data, completely forgetting about the applications and users for which the database is being created data.

This is due to the completely different requirements for the database of application programmers and the database administrator. The former would like to have in one place (for example, in one table) all the data they need to implement a query from application program or from the terminal. The latter take care of eliminating possible distortions of the stored data when entering it into the database. new information and updating or deleting an existing one. To do this, they remove duplicates and unwanted items from the database. functional connections between attributes, dividing the database into many small tables (see section 4.6). Since many years of global experience in using information systems, built on the basis of databases, shows that the shortcomings of the project cannot be eliminated by any tricks in application programs, then experienced designers do not allow themselves to meet halfway application programmers(even when they themselves are such).

Clearly distinguish between such concepts as requesting data and maintaining data (entering, changing and deleting); remember that, as a rule, the database is information basis not one, but several applications, some of which will appear in the future; a bad database design cannot be corrected by any (even the most sophisticated) applications. LITERATURE Atre S. Structural approach to organizing databases. – M.: Finance and Statistics, 1983. – 320 p. Boyko V.V., Savinkov V.M. Design of information systems databases. – M.: Finance and Statistics, 1989. – 351 p. Date K. Guide to the DB2 relational DBMS. – M.: Finance and Statistics, 1988. – 320 p. Jackson G. Designing relational databases for use with microcomputers. -M.: Mir, 1991. – 252 p. Kirillov V.V. Structured Query Language (SQL). – St. Petersburg: ITMO, 1994. – 80 p. Martin J. Development Planning automated systems. – M.: Finance and Statistics, 1984. – 196 p. Meyer M. Theory of relational databases. – M.: Mir, 1987. – 608 p. Tiori T., Fry J. Design of database structures. In 2 books, - M.: Mir, 1985. Book. 1. – 287 pp.: Book. 2. – 320 s. Ullman J. Databases in Pascal. – M.: Mashinostroenie, 1990. – 386 p. Hubbard J. Automated database design. – M.: Mir, 1984. – 294 p. Tsikritisis D., Lochowski F. Data models. – M.: Finance and Statistics, 1985. – 344 p.

It should be easily expandable when reorganizing and expanding the subject area. · The database must be easily modified as the software and hardware environment changes. 2. Database technological process The information model displays real world into some human-readable concepts that are completely independent of the parameters of the data storage environment. There are many approaches to...




First of all, it is connected with an attempt to represent the semantics of the subject area in the database model. Currently, databases are used in almost all areas of human activity. This information database model can be used in various educational institutions. To ensure the reliability of a data management system, the following basic requirements must be met: integrity and...


Entities are realized using relationships. The communication strength is one-to-many (1:M). 1 M M 1 Interaction of entities 2.2. Connections between the entities of the information model Development information support We will carry out the workstation on the basis of the Access XP database management system (DBMS) from the selected integrated Microsoft package Office XP. Access DBMS intended for...

The model was proposed by Peter Ping-Shen Chen in 1976. Most modern approaches to database design (mainly relational). Domain modeling is based on the use graphic charts, including a small number of heterogeneous components. Due to the clarity of presentation of conceptual database diagrams, ER models have become widespread in CASE systems that support automated design of relational databases. Basic concepts ER models are entity, relationship and attribute.

Essence - it is a real or imaginary object about which information is of interest. In ER model diagrams, an entity is represented as a rectangle containing the name of the entity. In this case, the name of the entity is the name of the type, and not a specific object - an instance of this type. Each instance of an entity must be distinguishable from every other instance of the same entity.

Connection is a graphically depicted association established between two entities. This association is always binary and can exist between two different entities or between an entity and itself (recursive relationship). In any connection, two ends are identified (in accordance with the pair of connected entities), each of which indicates the name of the end of the connection, the degree of the end of the connection (how many instances of this entity are connected), the mandatory nature of the connection (i.e., whether any instance of this entity must participate in this connection).

A connection is represented as a line connecting two entities or leading from an entity to itself. In this case, at the point where the connection “joins” with the entity, a three-point entry into the entity rectangle is used, if many instances of the entity can be used for this entity in the connection, and a single-point entry, if only one instance of the entity can participate in the connection. The required end of the connection is depicted with a solid line, and the optional end with a broken line.

Like an entity, a relationship is a generic concept; all instances of both pairs of related entities are subject to the rules of association.

Figure 12 shows an example of an image of entities and the relationship between them.

Rice. 12.

This diagram can be interpreted as follows: Each STUDENT studies in only one GROUP; Any GROUP consists of one or more STUDENTS. The following figure (Fig. 13) depicts the essence of MAN with a recursive connection connecting it with itself.

Fig. 13.

A laconic oral interpretation of the diagram depicted is as follows:

Every PERSON is the son of one and only one PERSON;

Each PERSON may be the father of one or more PEOPLE (“PERSON”).

Attribute An entity is any detail that serves to clarify, identify, classify, numerically characterize, or express the state of an entity. Attribute names are entered in a rectangle representing the entity, under the entity name and are depicted in small letters. For example (see Fig. 14):

Rice. 14.

An entity's unique identifier is an attribute, combination of attributes, combination of relationships, or combination of relationships and attributes that uniquely distinguishes any instance of the entity from other instances of the same type of entity.

As in relational database schemas, ER schemas introduce the concept of normal forms, and their meaning closely matches the meaning of relational normal forms. Note that the formulations of normal forms of ER-schemas make the meaning of normalization of relational schemes clearer. We will consider only very brief and informal definitions of the first three normal forms.

IN first normal form ER schemas eliminate duplicate attributes or groups of attributes, i.e., implicit entities “disguised” as attributes are identified.

In second normal form attributes that depend only on part of the unique identifier are eliminated. This part of the unique identifier identifies an individual entity.

IN third normal form attributes that depend on attributes that are not included in the unique identificator. These attributes are the basis of a separate entity. We focused only on the most important concepts of the ER data model. More complex elements of the model include the following:

Subtypes and supertypes of entities. The ER model allows you to specify the IS-A relationship between types. Moreover, if T 1 IS-A T 2 (where T 1 and T 2 - entity types), then T 1 is called subtype T 2 a T 2- supertype T 1. Thus, it is possible to inherit an entity type based on one or more supertypes.

Many-to-many relationships. Sometimes it is necessary to link entities in such a way that there may be multiple instances of the entity at both ends of the link (for example, all members of a cooperative jointly own the property of the cooperative). To do this, a type of “many-to-many” relationship is introduced.

Specified degrees of connection. Sometimes it is useful to define the possible number of entity instances participating in a given relationship (for example, an employee is allowed to participate in no more than three projects at a time). To express this semantic constraint, it is allowed to indicate at the end of the connection its maximum or mandatory degree.

Cascade deletions of entity instances. Some relationships are so strong (in the case of a one-to-many relationship, of course) that when you delete the reference entity instance (corresponding to the one end of the relationship), you must also delete all entity instances corresponding to the many end of the relationship. The corresponding "cascading delete" requirement can be formulated when defining an entity.

Domains. As with the relational data model, it is useful to be able to define a potentially valid set of values ​​for an entity (domain) attribute.

These and other, more complex elements of the Entity-Relationship data model make it more powerful, but at the same time make it somewhat more difficult to use. Of course, when actually using ER diagrams for database design, you need to become familiar with all the possibilities.

Infological data model "Entity-relationship"

The purpose of information modeling is to provide the most natural ways for humans to collect and present the information that is supposed to be stored in the database being created. Therefore, they are trying to build an infological data model by analogy with natural language (the latter cannot be used in its pure form due to the complexity of computer text processing and the ambiguity of any natural language). The main constructive elements of information models are entities, connections between them and their properties (attributes).

An entity is any distinguishable object (an object that we can distinguish from another), information about which must be stored in a database. Entities can be people, places, planes, flights, taste, color, etc. It is necessary to distinguish between concepts such as entity type and entity instance. The concept of entity type refers to a set of homogeneous individuals, objects, events or ideas acting as a whole. An entity instance refers to a specific thing in a set. For example, the entity type could be CITY, and the instance could be Moscow.

An attribute is a named characteristic of an entity. Its name must be unique for a particular entity type, but can be the same for different entity types (for example, COLOR can be defined for many entities: DOG, CAR, SMOKE, etc.). Attributes are used to define what information should be collected about an entity.

There is no absolute difference between entity types and attributes. An attribute is such only in relation to the entity type. In another context, an attribute may act as an independent entity. For example, for a car factory, color is only an attribute of the production product, but for a paint and varnish factory, color is an entity type.

A key is a minimum set of attributes whose values ​​can be used to uniquely find the required instance of an entity. Minimality means that excluding any attribute from the set does not allow identifying the entity by the remaining ones.

Relationship - association of two or more entities. If the purpose of the database was only to store individual, unrelated data, then its structure could be very simple. However, one of the main requirements for organizing a database is to ensure the ability to find some entities by the values ​​of others, for which it is necessary to establish certain connections between them. And since real databases often contain hundreds or even thousands of entities, theoretically more than a million connections can be established between them. The presence of such a multitude of connections determines the complexity of information models.

Relational data structure

At the end of the 60s, works appeared in which the possibilities of using various tabular datalogical data models were discussed, i.e. the ability to use familiar and natural ways of presenting data. The most significant of them was an article by a company employee IBM Drs. E. Codd (Codd E.F., A Relational Model of Data for Large Shared Data Banks. CACM 13: 6, June 1970), where the term “relational data model” was probably first used.

Being a mathematician by training, E. Codd proposed using the apparatus of set theory (union, intersection, difference, Cartesian product) for data processing. He showed that any representation of data is reduced to a set of two-dimensional tables of a special type, known in mathematics as a relation

The smallest unit of data in a relational model is a single atomic (indecomposable) data value for a given model. So, in one subject area, the surname, first name and patronymic can be considered as a single meaning, and in another - as three different meanings.

A domain is a set of atomic values ​​of the same type. The meaning of domains is as follows. If the values ​​of two attributes are taken from the same domain, then comparisons using these two attributes probably make sense (for example, to organize a transit flight, you could give the query "Get flights in which the departure time from Moscow to Sochi is greater than the arrival time from Arkhangelsk to Moscow"). If the values ​​of two attributes are taken from different domains, then their comparison is probably meaningless: is it worth comparing the flight number with the ticket price?

The header consists of such a fixed set of attributes A1, A2, ..., An that there is a one-to-one correspondence between these attributes Ai and the domains Di (i=1,2,...,n) that define them.

The body consists of a time-varying set of tuples, where each tuple in turn consists of a set of attribute-value pairs (Ai:Vi), (i=1,2,...,n), one such pair for each attribute Ai in the title. For any given attribute-value pair (Ai:Vi), Vi is the value from the single domain Di that is associated with the attribute Ai.

The degree of a relationship is the number of its attributes. A relation of degree one is called unary, degree two - binary, degree three - ternary, ..., and degree n - n-ary.

The cardinal number or cardinality of a relation is the number of its tuples. The cardinal number of a ratio changes over time, unlike its degree.

Since a relation is a set, and sets by definition do not contain matching elements, no two tuples of a relation can be duplicates of each other at any given point in time. Let R be a relation with attributes A1, A2, ..., An. The set of attributes K=(Ai, Aj, ..., Ak) of a relation R is said to be a possible key of R if and only if two time-independent conditions are satisfied:

Uniqueness: at any given time, no two different tuples R have the same value for Ai, Aj, ..., Ak.

Minimality: none of the attributes Ai, Aj, ..., Ak can be excluded from K without violating uniqueness.

Every relation has at least one possible key because at least the combination of all its attributes satisfies the uniqueness condition. One of the possible keys (selected randomly) is taken as its primary key. The remaining possible keys, if any, are called alternative keys.

The above and some others mathematical concepts were the theoretical basis for the creation relational DBMS, development of appropriate linguistic means And software systems providing them high performance, and creating the foundations of database design theory. However for mass user relational DBMSs can successfully use informal equivalents of these concepts:

Relation-Table (sometimes File), Tuple - Row (sometimes Record), Attribute - Column, Field. It is assumed that "record" means "an instance of a record" and "field" means "the name and type of the field."

A relational database is a collection of relationships containing all the information that must be stored in the database. However, users may perceive such a database as a collection of tables.

Project Data Design

In the diploma project, the workstation of a manager for leasing office space when designing data, for its implementation and processing in software environment Eleven main tables were created.

According to relational database theory, each table has a key attribute that is unique with respect to the totality of data in those tables.

The model was proposed by Peter Ping-Shen Chen in 1976. Most modern approaches to database design (mainly relational) are based on the use of variations of the ER model. Domain modeling is based on the use of graphical diagrams that include a small number of heterogeneous components. Due to the clarity of presentation of conceptual database diagrams, ER models have become widespread in CASE systems that support automated design of relational databases. The basic concepts of the ER model are entity, relationship and attribute.

An entity is a real or imaginary object about which information is of interest. In ER model diagrams, an entity is represented as a rectangle containing the name of the entity. In this case, the name of the entity is the name of the type, and not a specific object - an instance of this type. Each instance of an entity must be distinguishable from every other instance of the same entity.

A relationship is a graphically represented association established between two entities. This association is always binary and can exist between two different entities or between an entity and itself (recursive relationship). In any connection, two ends are identified (in accordance with the pair of connected entities), each of which indicates the name of the end of the connection, the degree of the end of the connection (how many instances of this entity are connected), the mandatory nature of the connection (i.e., whether any instance of this entity must participate in this connection).

A connection is represented as a line connecting two entities or leading from an entity to itself. In this case, at the point where the connection “joins” with the entity, a three-point entry into the entity rectangle is used, if many instances of the entity can be used for this entity in the connection, and a single-point entry, if only one instance of the entity can participate in the connection. The required end of the connection is depicted with a solid line, and the optional end with a broken line.

Like an entity, a relationship is a generic concept; all instances of both pairs of related entities are subject to the rules of association.

Figure 12 shows an example of an image of entities and the relationship between them.

Rice. 12. An example of a relationship between entities

This diagram can be interpreted as follows: Each STUDENT studies in only one GROUP; Any GROUP consists of one or more STUDENTS. In the following figure (Fig. 13)

the essence of MAN is depicted with a recursive connection connecting it with itself.

Fig. 13. Recursive Link Example

A laconic oral interpretation of the diagram depicted is as follows:

Every PERSON is the son of one and only one PERSON; Each PERSON may be the father of one or more PEOPLE (“PERSON”).

An entity attribute is any detail that serves to clarify, identify, classify, quantify, or express the state of the entity. Attribute names are entered in a rectangle representing the entity, under the entity name and are depicted in small letters. For example (see Fig. 14):

Fig. 14. Image of an entity with its attributes

An entity's unique identifier is an attribute, combination of attributes, combination of relationships, or combination of relationships and attributes that uniquely distinguishes any instance of the entity from other instances of the same type of entity.

As in relational database schemas, ER schemas introduce the concept of normal forms, and their meaning closely matches the meaning of relational normal forms. Note that the formulations of normal forms of ER-schemas make the meaning of normalization of relational schemes clearer. We will consider only very brief and informal definitions of the first three normal forms.

IN first normal form ER schemas eliminate duplicate attributes or groups of attributes, i.e., implicit entities “disguised” as attributes are identified.

In second normal form attributes that depend only on part of the unique identifier are eliminated. This part of the unique identifier identifies an individual entity.

IN third normal form attributes that depend on attributes that are not part of the unique identifier are eliminated. These attributes are the basis of a separate entity. We focused only on the most important concepts of the ER data model. More complex elements of the model include the following:

Subtypes and supertypes of entities. The ER model allows you to specify the IS-A relationship between types. Moreover, if T 1 IS-A T 2 (where T 1 and T 2 are types of entities), then T 1 is called a subtype of T 2 and T 2 is a supertype of T 1. Thus, it is possible to inherit an entity type based on one or more supertypes.

Connections "many-with-many". Sometimes it is necessary to link entities in such a way that there may be multiple instances of the entity at both ends of the link (for example, all members of a cooperative jointly own the property of the cooperative). To do this, a type of “many-to-many” relationship is introduced.

Specified degrees of connection. Sometimes it is useful to define the possible number of entity instances participating in a given relationship (for example, an employee is allowed to participate in no more than three projects at a time). To express this semantic constraint, it is allowed to indicate at the end of the connection its maximum or mandatory degree.

Cascade deletions of entity instances. Some connections are so strong (of course, in the case of a connection "one-to-many"), that when deleting a reference entity instance (corresponding to the “one” end of the relationship), all entity instances corresponding to the “many” end of the relationship must also be deleted. The corresponding "cascading delete" requirement can be formulated when defining an entity.

Domains. As with the relational data model, it is useful to be able to define a potentially valid set of values ​​for an entity (domain) attribute.

These and other, more complex elements of the EntityLink data model make it more powerful, but at the same time make it somewhat more difficult to use. Of course, when actually using ER diagrams for database design, you need to become familiar with all the possibilities.

Lecture 15. Conceptual data models

In contrast to the infological model of the subject area, which describes, according to certain rules, information about objects of the material world and the connections between them that should be in the database, conceptual model describes the data and connections stored in the computer. Because of this, each data model is inextricably linked with the data description language of a specific DBMS.

Essentially, a data model is a combination of three components: types of data structures, operations on data, and integrity constraints.

In other words, a data model is some kind of intellectual tool for the designer, which makes it possible to implement the interpretation of information about the subject area in the form of formalized data in accordance with certain requirements, i.e., an abstraction tool that makes it possible to see the “forest” (the information content of the data), and not individual “trees” (specific data values).

Types of Data Structures

Among the wide variety of definitions denoting types of data structures, the most common is the terminology of CODASYL (Conference of DAta SYstems Language), an international association for data processing system languages ​​created in 1959.

According to this terminology, five typical structures(in order of difficulty):

1. data element;

2. data aggregate;

3. record;

4. set;

5. database.

Let's give brief definitions these structures.

A data element is the smallest named unit of data that the DBMS can directly address and with the help of which all other data structures are built.

A data aggregate is a named collection of data elements that can be considered as a single whole. The unit can be simple or composite (if it includes other units).

A record is a named collection of data elements and (or) aggregates. Thus, a record is an aggregate that is not included in other aggregates. A record can have a complex hierarchical structure because it allows aggregation to be applied multiple times.

A set is a named collection of records that form a two-level hierarchical structure. Each set type represents a relationship between two record types. A set is defined by declaring one record type to be the "owning record" and the other types

records - "member records". In this case, each instance of the set must contain one instance of the “owner record” and any number of “member records”. If a record represents an entity in the data model, then a set represents a relationship between entities. For example, if we consider the connection “studies” between the entities “study group” and “student”, then the first of the entities is declared the “owner record” (it is the only one in the instance of the set), and the second

- “member record” (there can be several of them in an instance of a set).

A database is a named collection of record instances of various types, containing links between records represented by set instances.

Note that database structures are built on the basis of the following basic compositional rules:

1. A database can contain any number of record types and set types;

2. any number of sets can be defined between two record types;

3. A record type can be both an owner and a member of multiple set types.

Following these rules allows you to simulate data on how many

any complex subject area with the required level of completeness and detail.

The considered types of data structures can be presented in various forms - graph; tabular; in the form of the source text of the data description language of a specific DBMS.

Operations on data

Operations implemented by the DBMS include selection (search) of data and actions on them. Data selection is performed using a criterion based on the use or logical position of the data (element, aggregate, record) or the value of the data, or relationships between data. Selection based on the logical position of a given item is based on the ordering of data in the system memory. In this case, the search criteria can be formulated as follows:

1. find the next given (record);

2. find the previous given;

3. find the given value;

4. find the first (last) given.

This type of selection is called selection through current selection, which uses a current state indicator that is automatically maintained by the DBMS and, as a rule, points to some instance of a DB record.

The selection criterion based on data values ​​is formed from simple or Boolean selection conditions. Examples simple conditions searches are:

1. MILITARY SPECIALTY = 200100;

2. AGE > 20;

3. DATE< 19.04.2002 и т.п.

A Boolean selection condition is formed by combining simple conditions using logical operations, For example:

1. (DATE OF BIRTH< 28.12.1963) И (СТАЖ > 10);

2. (ACADEMIC_TITLE = ASSOCIATE PROFESSOR) OR (ACADEMICAL TITLE = PROFESSOR), etc.

If the data model supported by some DBMS allows you to perform data selection by relationships, then you can find data associated with the current value of any data. For example, if the data model implements a bidirectional “study” relationship between the entities “student” and “study group”, it is possible to identify study groups in which young men study (if the student description includes the attribute “gender”).

As a rule, most modern DBMSs allow various combinations the types of data selection described above.

Integrity constraints. These logical constraints on data are used to ensure that data is consistent with certain predefined conditions when performing operations on it. Essentially, integrity constraints are a set of rules used to create specific model data based on the selected DBMS.

There are internal and explicit restrictions.

Limitations caused by the capabilities of a particular DBMS are called internal integrity constraints. These restrictions

concern the types of data stored (for example, “a text data element can consist of no more than 256 characters” or “a record can contain no more than 100 fields”) and the types of relationships allowed (for example, a DBMS can only support so-called functional relationships, i.e. . communications type 1:1, 1: M or M: 1). Most existing DBMSs primarily support internal integrity constraints, violations of which lead to incorrect data and are quite easily controlled.

Limitations caused by the characteristics of stored data about specific software are called explicit integrity restrictions. These restrictions are also supported by the means of the selected DBMS, but they are necessarily formed with the participation of the database developer by defining (programming) special procedures, ensuring data consistency. For example, if the data item "grade book" in the record "student" is defined as a key, it must be unique, i.e. There should not be two records in the database with the same key values. Another example: suppose in the same record the element “military specialty” is provided and six decimal digits are allocated for it. Then other representations of this data element in the database are not possible. With the help of explicit integrity restrictions, it is possible to organize “simple” control of input data (primarily for the belonging of data elements to a fixed and predetermined

given set of values: for example, the element “academic title” should not take the value “honorary associate professor” if we're talking about about Russian scientists) and more complex procedures (for example, entering the value “professor” of the data element “academic title” in a record about a teacher who is 25 years old should require at least additional confirmation).

An elementary data unit can be implemented in many ways, which, in particular, has led to diversity famous models data. The data model defines the rules according to which data is structured. Typically, operations on data are related to its structure.

Diversity existing models data corresponds to a variety of applications and user preferences.

In the specialized literature there is a description of quite large quantity various models data. Although hierarchical, network and, undoubtedly, relational model, some others should be mentioned along with them.

Using as classification sign features of the logical organization of data, the following list of known models can be given:

1. hierarchical data model;

2. network data model;

3. relational data model;

4. binary data model;

5. semantic network.

Infological data model Entity-relationship. The purpose of information modeling is to provide the most natural ways for humans to collect and present the information that is supposed to be stored in the database being created.

Therefore, they are trying to build an infological data model by analogy with natural language; the latter cannot be used in its pure form due to the complexity of computer text processing and the ambiguity of any natural language. The main constructive elements of information models are entities, connections between them and their properties and attributes. An entity is any distinguishable object, an object that we can distinguish from another, information about which must be stored in a database.

Entities can be people, places, planes, flights, taste, color, etc. It is necessary to distinguish between concepts such as entity type and entity instance. The concept of entity type refers to a set of homogeneous individuals, objects, events or ideas acting as a whole.

An entity instance refers to a specific thing in a set. For example, the entity type could be CITY, and the instance could be Moscow. An attribute is a named characteristic of an entity. Its name must be unique for a particular entity type, but can be the same for different entity types for example, COLOR can be defined for many entities DOG, CAR, SMOKE, etc. Attributes are used to define what information should be collected about the entity.

There is no absolute difference between entity types and attributes. An attribute is such only in relation to the entity type. In another context, an attribute may act as an independent entity. For example, for a car factory, color is only an attribute of the production product, but for a paint and varnish factory, color is an entity type. A key is a minimum set of attributes whose values ​​can be used to uniquely find the required instance of an entity.

Minimality means that excluding any attribute from the set does not allow identifying the entity by the remaining ones. Relationship - association of two or more entities. If the purpose of the database was only to store individual, unrelated data, then its structure could be very simple. However, one of the main requirements for organizing a database is to ensure the ability to find some entities by the values ​​of others, for which it is necessary to establish certain connections between them.

And since real databases often contain hundreds or even thousands of entities, theoretically more than a million connections can be established between them. The presence of such a multitude of connections determines the complexity of information models. 1.2.4

End of work -

This topic belongs to the section:

Structure of the working Internet network

Traditionally, information is stored on paper. However, it is difficult to implement quick selection necessary data when applying for a job. The issue of reliability of storage and confidentiality of personal data about employees at the enterprise is also important.

If you need additional material on this topic, or you did not find what you were looking for, we recommend using the search in our database of works:

What will we do with the received material:

If this material was useful to you, you can save it to your page on social networks:

All topics in this section:

Analysis of the principles of PC-based workstations
Analysis of the principles of PC-based workstations. Automated workplace AWP, or, in foreign terminology, work station work-station, represents the place of the user-specialist of one or another

Classification of workstations
Classification of workstations. Workplaces can be individual, group, or collective. In relation to group and collective workstations for the purpose of effective functioning of the computer system - collective specialists

Database concept
Database concept. Active work to find acceptable ways to socialize the continuously growing volume of information led to the creation of special software programs in the early 60s.

DBMS architecture
DBMS architecture. The DBMS must provide access to data to any users, including those who have little or no idea about the physical location of the data in memory.

Relational data structure
Relational data structure. At the end of the 60s, works appeared in which the possibilities of using various tabular datalogical data models were discussed, i.e. Possibility of use for

Basic Concepts
Basic concepts. The main idea behind this technology is to minimize the amount of data transmitted over the network, since the main time losses and failures occur precisely due to insufficiently high

Advantages of client-server architecture
Advantages of client-server architecture. At first, the question may arise: why such difficulties? Here are some thoughts in favor of this approach. The undoubted advantage is approx.

Components of Client-Server Architecture
Components of Client-Server architecture. There are three main software components of the client-server software architecture end user. middleware. Server software. To the final software

Purpose and principle of operation of SQL
Purpose and principle SQL work. SQL is often pronounced as a sequel, an abbreviation for Structured Query Language to symbolize Structured Language Requests SQL is a tool

Advantages of the SQL language
Advantages of the SQL language. The query language provides undoubted advantages. Firstly, it continues the ideology of the Client-Server architecture. The client part of the application prepares a request for processing information