Relational algebra in databases: operations, examples. Unary renaming operation. During the laboratory work, operations with databases in general were studied. acquired skills in using the "ibexpert" application to create, delete, register

The relationship between an owner record and a member record is also 1:N.

The main difference between these models is that in the network model, a record can be a member of more than one group relationship. According to this model, each group relation is named and a distinction is made between its type and instance. A group relationship type is specified by its name and defines properties common to all instances of this type. A group relationship instance is represented by an owner record and a set of (possibly empty) subordinate records. There is the following limitation: record instance cannot be a member of two instances of group relationships of the same type (i.e., the employee from the example in paragraph 1, for example, cannot work in two departments).

  • trees (a) and (b) shown in Fig. 4.2 are replaced by one network structure in which the EMPLOYEE record is included in two group relationships;
  • to display the M:N type, enter the EMPLOYEE_CONTRACT record, which has no fields and serves only to link the CONTRACT and EMPLOYEE records (see Fig. 4.3). Note that this record can also store useful information, for example, the share of a given employee in the total remuneration under a given contract.


Rice. 4.3.

Each instance of a group relationship is characterized by the following characteristics:

How to arrange subordinate records:

  • arbitrary,
  • chronological /queue/,
  • reverse chronological /stack/,
  • assorted.

If a record is declared subordinate to several group relationships, then each of them can have its own ordering method assigned.

Mode for enabling subordinate records:

  • automatic - it is impossible to enter a record into the database without it being immediately assigned to a certain owner;
  • manual - allows you to remember a subordinate record in the database and not immediately include it in an instance of a group relationship. This operation is later initiated by the user.

Exception mode.

It is customary to distinguish three classes of membership of subordinate records in group relations:

  • Fixed. A subordinate record is tightly linked to an owner record and can only be removed from the group relationship by deleting it. When you delete an owner record, all subordinate records are automatically deleted. In the example above, fixed membership assumes a group relationship of CONCLUSION between the CONTRACT and CUSTOMER records, since a contract cannot exist without a customer.
  • Mandatory. It is possible to switch a subordinate record to another owner, but it cannot exist without an owner. To delete an owner record, it must have no subordinate records with required membership. The records “EMPLOYEE” and “DEPARTMENT” are connected by this relationship. If a department is disbanded, all of its employees must either be transferred to other departments or fired.
  • Optional. You can exclude a record from a group relationship, but keep it in the database without assigning it to a different owner. When an owner record is deleted, its subordinate records - optional members - are retained in the database, no longer participating in a group relationship of this type. An example of such a group relationship is “PERFORM” between “EMPLOYEES” and “CONTRACT”, since there may be employees in the organization whose activities are not related to the fulfillment of any contractual obligations to customers.

Operations on data in a network database model

Add - make a record in the database and, depending on the inclusion mode, either include it in a group relationship, where it is declared subordinate, or not include it in any group relationship.
Include in group relation - link an existing subordinate record to the owner record.
Switch - link an existing subordinate record to another owner record in the same group relationship.
Update - change the value of the elements of a previously extracted record.
Extract - extract records sequentially by key value, as well as using group relationships - you can go from the owner to member records, and from a subordinate record to the owner of the set.
Delete - remove a record from the database. If this record is the owner of a group relationship, then the membership class of the subordinate records is analyzed. Mandatory members must first be excluded from the group relationship, fixed members must be deleted along with the owner, optional members will remain in the database.
Exclude from group relationship - break the connection between the owner record and the member record.

Integrity Constraints

As in hierarchical model Only the maintenance of referential integrity is ensured (the owner of the relationship is a member of the relationship).

Advantages and disadvantages of early DBMSs

Advantages of early DBMS:

  • advanced means of managing data in external memory at a low level;
  • the ability to manually build effective application systems;
  • the ability to save memory by separating subobjects (in network systems)

Disadvantages of early DBMS:

  • difficulty of use;
  • high level of knowledge requirements about the physical organization of the database;
  • dependence of application systems on the physical organization of the database;
  • overloading the logic of application systems with details of organizing access to the database.

Both hierarchical and network data model requires the presence of highly qualified programmers. And even in such cases, the implementation of user requests is often delayed for a long time.

Object-oriented DBMS

The emergence of object-oriented DBMSs was caused by the needs of programmers in OO languages, who needed tools for storing objects that did not fit in the computer's RAM. Also important was the task of saving the state of objects between repeated launches of the application program. Therefore, most OODBMSs are a library whose data management procedures are included in the application program. Examples of OODBMS implementations like dedicated server databases are extremely rare.

It should immediately be noted that the generally accepted definition of " object-oriented data model" does not exist. Now we can only talk about a certain “object” approach to the logical representation of data and about various object-oriented ways of implementing it.

We know that any data model must include three aspects: structural, holistic and manipulation. Let's see how they are implemented on an object-oriented basis programming paradigms.

Structure

The structure of the object model is described using three key concepts:

encapsulation - each object has some internal state (stores a record of data inside itself), as well as a set of methods - procedures, with the help of which (and only in this way) you can access the data that determines the internal state of the object, or change them. Thus, objects can be considered as independent entities, separated from the external world;
inheritance - implies the ability to create new classes of objects from classes of objects that inherit the structure and methods of their ancestors, adding to them features that reflect their own individuality. Inheritance can be simple (one ancestor) or multiple (several ancestors);
polymorphism - different objects can react differently to the same external events depending on how their methods are implemented.

Data integrity

To maintain integrity, the object-oriented approach suggests using the following tools:

  • automatic maintenance of inheritance relationships; the ability to declare some data fields and methods of an object as “hidden”, not visible to other objects; such fields and methods are used only by methods of the object itself; creating integrity control procedures within the object

Data manipulation tools

Unfortunately, object-oriented programming lacks common data manipulation tools such as relational algebra or relational calculus. Data is processed using one of the general-purpose object-oriented programming languages, usually SmallTalk, C++ or Java.

Let us now summarize some results

Object-oriented databases, unlike relational ones, store objects rather than records. The object-oriented approach provides a more advanced means of representing the real world than the relational model, a natural representation of data. In the relational model, all relationships belong to the same level, which is what complicates the transformation of the hierarchical relationships of the entity-relationship model into the relational model. The OO model can be viewed layer by layer, at different levels of abstraction. It is possible to define new data types and operations with them.

At the same time, OO - The model also has a number of disadvantages:

  • There are no powerful non-procedural means of retrieving objects from the database. All queries have to be written in procedural languages, the problem of optimizing them is assigned to the programmer;
  • instead of purely declarative integrity constraints(such as explicitly declaring primary and foreign keys of relational tables using keywords PRIMARY KEY And REFERENCES) or semi-declarative triggers, you have to write procedural code to ensure internal integrity.

Obviously, both of these shortcomings are associated with the lack of developed means of data manipulation. This problem is solved in two ways - expanding OO languages ​​towards data management (ODMG standard), or adding object properties to relational DBMSs (SQL-3, as well as so-called object-relational DBMSs).

Each operation includes data selection (selection) and the actions that will be performed on the selected data. The main operations in a relational database are database update operations and relationship processing operations.

TO database update operations include those operations that insert new tuples, remove unnecessary ones, and adjust the attribute values ​​of existing tuples, namely: these are operations Turn on, Delete, Update.

Operation Turn on requires specifying the name of the relationship and preliminary generation of the attribute values ​​of the new tuple. The tuple key must be specified.

Operation Delete requires the name of the relation, as well as the identification of the tuple or group of tuples to be deleted.

Operation Update is executed for the named relation and can correct both one and several tuples. For example, if the company’s management decided to increase all employee salaries by the same amount, then several tuples will be adjusted at once with one Update operation.

Concerning processing operations, then they are borrowed from relational algebra. According to E. Codd’s approach, relational algebra includes eight operations, five of which are basic: Sample, Projection, Multiplication, Union, Subtraction.

Sample- select from the relation only those tuples that satisfy the given condition.

At Projections relation to a given set of its attributes, a new relation is obtained, created by extracting tuples containing the specified attributes from the original relation.

At Multiplication(Cartesian product) of two relations, a new relation is obtained, the tuples of which are a concatenation of the tuples of the first and second relations.

As a result Associations two relations, a third one is obtained, including tuples included in at least one relation, that is, containing all the elements of the original relations.

At Subtraction Only those tuples of the first relation are returned that remain after subtracting the second relation, that is, all tuples of the second relation are discarded from the first relation.

The remaining three operations are derivatives; they can be obtained from the main operations; they are called additional: Connection, Intersection, Division.

Operation Compound applies to two relationships that have a common attribute. The result of this operation for two relations under some condition is a relation of tuples that are a combination of the first and second relations that satisfy the specified condition.

Intersection two relations is a relation that includes all tuples included in both relations.

Operation Divisions assumes that there are two relations: one is binary (containing two attributes), the other is unary (containing one attribute). The result is a relation consisting of tuples that include the values ​​of the first attribute of the tuples of the first relation, but only those for which the set of values ​​of the second attribute of the first relation coincides with the set of values ​​of the attributes of the second relation.

A distinctive feature of relation processing operations is that the unit of processing in them is not tuples, but relations: one or two relations are used at the input of each operation, and the result of the operations is a new relation.

Let's look at some of the most commonly used operations of relational algebra in more detail.

Operation An association- the input is given two compatible relations of the same dimension: A and B. The result is a relation of the same structure, containing all tuples A and all tuples B

Intersection assumes the presence at the input of two relations of the same dimension: A and B. At the output, a relation of the same structure is created, containing only those tuples A that are in B.

Division. At the input of the operation, two relations are used: A and B. Let relation A, called divisible, contain attributes (A 1, A 2, A 3,..., A n). Relation B is a divisor and contains a subset of attributes A: (A 1, A 2, ..., A k), where k

In general, the operations of the relational data model provide the ability to manipulate relationships, allowing you to update the database, as well as select subsets of stored data and present them in the desired form.

When designing and working with databases, these eight operations are usually not enough. Therefore, operations such as renaming attributes, creating new calculated attributes, assignment operations, comparisons, etc. are added.

Modification Anomalies

Modification Anomalies manifest themselves in the fact that a change in the value of one data may entail scanning the entire table and a corresponding change in some other table records.

Deletion anomalies are that when you delete any data from the table, other information that is not directly related to the deleted data may also disappear.

Addition anomalies arise in cases where information cannot be placed in a table until it is incomplete, or inserting a new record requires additional scanning of the table

Let there be a relation that stores information about students, the courses they take, and the cost of these courses. From this relation, a tuple is removed that contains (in addition to information about the student) information about the name and cost of the course attended by this student. If information about the name and cost of the course was stored in a single copy only in this tuple, it will irrevocably disappear from the relation. This situation is called deletion anomaly. Performing a delete operation results in the loss of information about two entities.

This same relationship can be illustrated by an example insertion anomaly. Let's say we need to add information about the name and cost of a certain course, but we will not be able to add this information until no students are enrolled in the course. You can get rid of both anomalies by splitting the existing relationship into two, each of which will contain data from only one entity. Then deleting student information will not affect course data.

When splitting a relationship into two, problems also arise. For example, is it possible to enroll a student in a course that does not yet exist? These problems must be resolved by discussing business rules. If the business rules require that course and cost information be available when a student enrolls in that course, then when the student enrolls in the course, a check will be made to ensure that the required course exists. These types of checks are called referential integrity constraints or foreign key integrity constraints.

Integrity of entities - no primary key value must contain null.

Design stages:

Conceptual design - the database development process begins with requirements analysis. The designer at this stage of development must find answers to the following questions: what data elements should be stored, who will access them and how. A Spanish model is created. Information that does not depend on physical aspects, target database and programming languages

Boolean - the logical structure of the database is created. To do this, determine how the data will be grouped logically. The structure of the database at this stage is expressed in terms of application objects and relationships between them. Depends on the target DBMS, redundancy checking, normalization.

Physical - The logical structure of the database is converted into a physical structure taking into account performance aspects. Data elements at this stage receive attributes and are defined as columns in the tables of the DBMS chosen for the implementation of the database. Basic file and index organization relationships, integrity constraints and security measures.

Just in case - transactions- an indivisible aftereffect of operations that transfer the database from one stable state to another. Properties - atomicity (indivisibility), consistency (from one state to another), isolation (user transactions do not interfere with each other), durability (the result must be recorded in the database after the release, even if it crashed at the next moment).

Entity-relationship method.

Entity-relationship modeling method gives an abstract model of a problem domain using the following basic concepts: essence(entities), relationships(relationships) between entities and attributes(attributes) to represent properties of entities and relationships.

Any fragment of the subject area can be represented as set of entities, between which there is some many connections. Let's give definitions:

Essence is an object that can be identified in some way that distinguishes it from other objects. Examples: a specific person, enterprise, event, etc.

Entity set- a set of entities of the same type (having the same properties). Examples: all people, businesses, holidays, etc. Entity sets do not have to be disjoint. For example, an entity that belongs to the set MAN also belongs to the set PEOPLE.

An entity is actually a set attributes, which describe the properties of all members of a given entity set. Domain was already higher.

Entity Key- is one or more attributes that uniquely define a given entity.

Connection is an association established between several entities. Examples:

  • since each employee works in some department, there is a “works in” or DEPARTMENT-EMPLOYEE relationship between the EMPLOYEE and DEPARTMENT entities;

Unfortunately, there are no general rules for determining what is considered an entity and what is a relationship. In the example discussed above, we assumed that “leading” is a connection. However, we can consider the entity “manager”, which has connections “manages” with the entity “department” and “is” with the entity “employee”.

A relationship can also have attributes. For example, for the DEPARTMENT-EMPLOYEE relationship, you can set the attribute WORK_TERRENCE_IN_DEPARTMENT.

The role of the entity in the relationship- the function that an entity performs in a given connection. For example, in a PARENT-CHILD relationship, PERSON entities may have the roles "parent" and "child". Specifying roles in the entity-relationship model is optional and serves to clarify the semantics of the relationship.

Set of connections- this is the relationship between n(and n no less than 2) entities, each of which belongs to a certain set of entities.

Although, strictly speaking, the concepts of “connection” and “set of connections” are different (the first is an element of the second), they are nevertheless very often confused.

When n=2, i.e. when a relationship combines two entities, it is called binary. It has been proven that n-ary set of connections ( n>2) can always be replaced by many binary ones, but the former better reflect the semantics of the subject area.

The number of entities that can be associated through a set of connections with another entity is called degree of connection. Considering degrees is especially useful for binary relationships. The following degrees of binary bonds can exist:

  • one to one (denoted 1: 1 ). This means that in such a relationship, entities with one role always correspond to at most one entity with another role.

Another important characteristic of a connection besides its degree is membership class entities included in it or cardinality communications.

"EMPLOYEE" has mandatory membership class(this fact is also indicated by indicating the interval of the number of possible occurrences of the entity in a relationship, in this case it is 1.1), and the entity "DEPARTMENT" has optional membership class(0,1). Now we can describe this relationship as 0,1:1,1 .

  • one to many ( 1:n). In this case, an entity with one role can correspond to any number of entities with another role.

This figure further illustrates the fact that multiple sets of relationships can be defined between two entities.

  • many to one ( n: 1). This relationship is similar to mapping 1:n.

In this case, for completely obvious reasons (each contract is concluded with a specific customer, and each customer has at least one contract, otherwise it would not be such), each entity has a mandatory membership class.

  • many to many ( n: n). In this case, each of the associated entities can be represented by any number of instances.

If the existence of an entity x depends on the existence of an entity y, then x is called dependent entity(sometimes entity x is called "weak" and "entity" y is called strong). As an example, consider the relationship between the previously described entities WORKING_GROUP and CONTRACT. The working group is created only after the contract is signed with the customer, and ceases to exist upon completion of the contract. Then WORKING_GROUP is dependent on the CONTRACT entity. We will denote a dependent entity by a double rectangle, and its connection with a strong entity by a line with an arrow ( we had an oval for the dependent)

The cardinality of the connection for a strong entity will always be (1,1). The membership class and degree of relationship for a dependent entity can be anything.

12. Hierarchical and network data models.

Hierarchical model is a set of elements arranged in the order of their subordination from general to specific and forming a tree (graph) inverted in structure.

The basic concepts of a hierarchical structure include level, node, and relationship. Knot is a set of data attributes that describe an object. In a hierarchical tree diagram, nodes are represented as vertices in the graph. Each node at a lower level is connected to only one node at a higher level. A hierarchical tree has only one vertex, not subordinate to any other vertex and located at the topmost - the first level. Dependent (slave) nodes are at the second, third, etc. levels. The number of trees in the database is determined by the number of root records. Each database record has only one hierarchical path from the root record.

The organization of data in a hierarchical type DBMS is defined in terms of: element, aggregate, record (group), group relation, database.

  • Attribute (data element)- the smallest unit of a data structure. Typically, each element in a database description is given a unique name. It is referred to by this name during processing. A data element is also often called a field.
  • Record- a named collection of attributes. Using records allows you to obtain some logically connected set of data in one access to the database. It is the records that are changed, added and deleted. The type of a record is determined by the composition of its attributes. Record instance - a specific record with a specific element value
  • Group attitude- hierarchical relationship between records of two types. The parent record (the owner of the group relationship) is called the source record, and the child records (members of the group relationship) are called subordinate records. A hierarchical database can only store such tree structures.

The root record of each tree must contain a key with a unique value. The keys of non-root records must have a unique value only within the group relationship. Each record is identified by a complete concatenated key, which is the set of keys of all records from the root along the hierarchical path.

When depicted graphically, group relationships are represented by arcs of a directed graph, and record types are represented by vertices.

For group relationships in the hierarchical model, automatic inclusion mode and fixed membership are provided. This means that for any non-root record to be remembered in the database, its parent record must exist. When a parent record is deleted, all subordinate records are automatically deleted.

Example: An enterprise consists of departments in which employees work. Each department can have multiple employees, but an employee cannot work in more than one department.

Therefore, for a personnel management information system, it is necessary to create a group relationship consisting of a parent record DEPARTMENT (NAME OF DEPARTMENT, NUMBER OF EMPLOYEES) and a child record EMPLOYEE (LAST NAME, POSITION, SALARY). (For simplicity, we assume that there are only two child records.) - rice a (further)

To automate the accounting of contracts with customers, it is necessary to create another hierarchical structure: customer - contracts with him - employees involved in working on the contract. This tree will include the records CUSTOMER(CUSTOMER_NAME, ADDRESS), CONTRACT(NUMBER, DATE, AMOUNT), CONTRACTOR (SURNAME, POSITION, DEPARTMENT_NAME) - fig. b.

Flaws hierarchical databases:

  • Information between records is partially duplicated (such records are called paired), and the hierarchical data model does not provide support for correspondence between paired records.
  • The hierarchical model implements the relationship between the source and child records according to the 1:N scheme, that is, one parent record can correspond to any number of children. Let us now assume that the performer can take part in more than one contract (i.e., an M:N type relationship arises). In this case, it is necessary to enter another group relationship into the database, in which CONTRACTOR will be the original record, and CONTRACT will be the child. Thus, we are again forced to duplicate the information. (Figure C)
  • quite complex logical connections and corresponding cumbersomeness in data processing

Advantages:

The simplest is a fairly efficient use of memory and good time performance for performing operations on data. However, this model is convenient mainly for working with hierarchically organized information.

Operations on data defined in the hierarchical model:

  • ADD a new entry into the database. For the root record, it is necessary to generate a key value.
  • CHANGE the data value of the previously retrieved record. Key data should not be changed.
  • DELETE some record and all records subordinate to it.
  • EXTRACT:
    • extract the root record by key value; sequential viewing of root records is also allowed
    • retrieve next record (next record is retrieved in left traversal order)

In the EXTRACT operation, it is possible to specify selection conditions.

All modification operations are applied only to one “current” record (which was previously retrieved from the database). This approach to data manipulation is called “navigation”.

Integrity constraints.

Only the integrity of the relationships between the owners and members of the group relationship is maintained (no descendant can exist without an ancestor). The correspondence of paired records belonging to different hierarchies is not automatically maintained.

The first database management systems, which appeared in the mid-60s, made it possible to work with a hierarchical database. The most famous was the hierarchical IMS system from IBM. Other systems are also known: PC/Focus, Team-Up, Data Edge and ours: Oka, INES, MIRIS.

Network data model.

Network model- a structure in which any element can be associated with any other element. A network database consists of sets of records that are related to each other so that records can contain explicit links to other sets of records. Thus, sets of records form a network. Relationships between records can be arbitrary, and these relationships are explicitly present and stored in the database.

A network data model is defined in the same terms as a hierarchical one. It consists of many records that can be owners or members of a group relationship. The relationship between an owner record and a member record is also of the form 1:N.

The main difference between these models is that in the network model, a record can be a member more than one group relationship. According to this model, each group relation is named and a distinction is made between its type and its instance. A group relationship type is specified by its name and defines properties common to all instances of this type. A group relationship instance is represented by an owner record and a set of (possibly empty) subordinate records. However, there is the following limitation: a record instance cannot be a member of two instances of group relationships of the same type (an employee cannot work in two departments)

Hierarchical structure from the picture above. is converted to network as follows

Trees (a) and (b) are replaced by a single network structure in which the EMPLOYEE entry is included in two group relationships; to display the M:N type, enter the EMPLOYEE_CONTRACT record, which has no fields and serves only to link the CONTRACT and EMPLOYEE records

Each instance of a group relationship is characterized by the following characteristics:

  • way to organize subordinate records:

arbitrary,

chronological /queue/,

reverse chronological /stack/,

assorted.

If a record is declared subordinate to several group relationships, then each of them can have its own ordering method assigned.

  • mode for enabling subordinate records:

automatic - it is impossible to enter a record into the database without it being immediately assigned to a certain owner;

manual - allows you to remember a subordinate record in the database and not immediately include it in an instance of a group relationship. This operation is later initiated by the user).

  • exception mode It is customary to distinguish three classes of membership of subordinate records in group relations:

Fixed. A subordinate record is tightly linked to an owner record and can only be removed from the group relationship by deleting it. When you delete an owner record, all subordinate records are automatically deleted. In the example, fixed membership assumes the group relationship "CONCLUDES" between the records "CONTRACT" and "CUSTOMER" because a contract cannot exist without a customer.

Mandatory. It is possible to switch a subordinate record to another owner, but it cannot exist without an owner. To delete an owner record, it must have no subordinate records with required membership. The records “EMPLOYEE” and “DEPARTMENT” are connected by this relationship. If a department is disbanded, all of its employees must either be transferred to other departments or fired.

Optional. You can exclude a record from a group relationship, but keep it in the database without assigning it to a different owner. When an owner record is deleted, its subordinate records - optional members - are retained in the database, no longer participating in a group relationship of this type. An example of such a group relationship is “PERFORM” between “EMPLOYEES” and “CONTRACT”, since the organization may have employees whose activities are not related to the fulfillment of any contractual obligations to customers.

Operations on data.

ADD- make a record in the database and, depending on the inclusion mode, either include it in a group relationship, where it is declared subordinate, or not include it in any group relationship.

INCLUDE INTO GROUP RELATIONSHIP- link an existing subordinate record to the owner record.

SWITCH- link an existing subordinate record to another owner record in the same group relationship.

UPDATE- change the value of the elements of a previously extracted record.

EXTRACT- extract records sequentially by key value, as well as using group relationships - you can go from the owner to member records, and from a subordinate record to the owner of the set.

DELETE- remove a record from the database. If this record is the owner of a group relationship, then the membership class of the subordinate records is analyzed. Mandatory members must first be excluded from the group relationship, fixed members must be deleted along with the owner, optional members will remain in the database.
EXCLUDE FROM GROUP RELATIONSHIP- break the connection between the owner record and the member record.

Integrity constraints.

As in the hierarchical model, only the maintenance of referential integrity is ensured (the owner of the relationship is a member of the relationship).

Basics dignity network model is high memory efficiency and efficiency. Flaw– the complexity and rigidity of the base scheme, as well as the difficulty of understanding. In addition, integrity control is weakened in this model, since it allows arbitrary connections to be established between records. The complexity of the DBMS implementation, the complexity of the data access mechanism, and the need to clearly define data connections at the physical level

To well-known network database management systems include: DBMS, IDMS, TOTAL, VISTA, NETWORK, SETOR, COMPASS, etc.

Comparing hierarchical and network databases, we can say the following. In general, hierarchical and network models provide fairly fast access to data. But since in network databases the main structure of information presentation has the form of a network in which each vertex (node) can have a connection with any other, then the data in the network database is more equal than in the hierarchical one, since access to information can be carried out starting from any node.

Graph (hierarchical and network) models are implemented as data models in database management systems running on mainframe computers. For personal computers, relational databases are more common, although there are also database management systems that support the network model.

^

Lecture No. 4. Relational algebra. Unary operations

Relational algebra, as you might guess, this is a special type of algebra in which all operations are performed on relational data models, i.e., on relationships.

In tabular terms, a relationship includes rows, columns, and a row—the column header. Therefore, natural unary operations are the operations of selecting certain rows or columns, as well as changing column headings - renaming attributes.

^

1. Unary fetch operation

The first unary operator we'll look at is fetch operation– the operation of selecting rows from a table representing a relationship according to which

or principle, i.e. selection of rows

tuples that satisfy a certain condition or conditions.

^ Sampling operator denoted by σ <P >, sampling conditionP <S>, i.e., operator σ is always taken by a certain condition on tuples P, and the condition itself P is written depending on the relation scheme S. Considering all this myself fetch operation over the relation scheme S in relation to the relation r

σ <P >r (S) σ <P >r = {t (S) |tr & P <S >t } = {t (S) |tr & IfNull (P <S >t , False };

The result of this operation will be a new relation with the same relation schema S, consisting of those tuples t (S) original relation

operands that satisfy the sampling condition P t. It is clear that in order to apply what

then the condition for the tuple is to substitute the values ​​of the tuple’s attributes instead of the attribute names.

To better understand the principle of this operation, let's give an example. Let the following relationship diagram be given:

^S: Session (grade book number, Last name, Subject, Grade).

Let's take the following sampling condition:

P <S> = (Subject = ‘Computer Science’ and Grade > 3).

We need from the original relation

operand, select those tuples that contain information about students who passed the “Informatics” subject with at least three points.

Let also be given the following tuple from this relation:

t 0 (S) ∈ r (S

Applying our selection condition to the tuple t 0 , we get:

P t 0 = (‘Databases’ = ‘Computer Science’ and 5 > 3);

On this particular tuple, the selection condition is not satisfied.

In general, the result of this particular sample

σ <Предмет = "Информатика" and Оценка >3 > Session

there will be a “Session” table in which rows that satisfy the selection condition are left.

^

2. Unary projection operation

Another standard unary operation we will study is the projection operation. Projection operation is the operation of selecting columns from a table representing a relationship according to which

or sign. Namely, the machine selects those attributes (i.e., literally those columns) of the original relation

operands that were specified in the projection.

^ Projection operator denoted by [ S"] or π . Here S"– subcircuit of the original relation schema S, i.e. some of its columns. What does this mean? This means that S' has fewer attributes than S, because in S" Only those for which the projection condition was satisfied remained. And in the table representing the relation r (S"), there are as many rows as there are in the table r (S), and there are fewer columns, since only those corresponding to the remaining attributes remain. So the projection operator π< S"> in relation to r (S) results in a new relation with a different relation schema r (S"), consisting of projections t (S) [S"] tuples of the original relation. How are these tuple projections determined? Projection any tuple t (S) original relation r (S) per subcircuit S" is determined by the following formula:

t (S) [S' ] = {t (a)|adef (t) ∩ S ’}, S " ⊆S .

It is important to note that duplicate tuples are excluded from the result, i.e., there will be no duplicate rows in the table representing the new resulting relation.

Taking into account all of the above, the projection operation in terms of database management systems will look like this:

π <S" >r (S) ≡ π <S' >rr (S) [S ’] ≡ r [S" ] = {t (S) [S' ] | tr };

Let's look at an example illustrating the principle of the sampling operation.

Let the relation “Session” and the schema of this relation be given:

S: Session (grade book number, Last name, Subject, Grade);

We will be interested in only two attributes from this scheme, namely “grade book number” and “Last name” of the student, so the subschema S" will look like this:

^S": (grade book number, last name).

Need an initial attitude r (S) project onto the subcircuit S" .

t 0 (S) ∈ r (S): ((grade book number: 100), (Last name: ‘Ivanov’), (Subject: ‘Databases’), (Score: 5));

This means that the projection of this tuple onto this subcircuit ^S" will look like this:

t 0 (S) S": ((grade book number: 100), (Last name: ‘Ivanov’));

If we talk about the projection operation in terms of tables, then the projection Session [Gradebook No., Last Name] of the original relation is the Session table, from which all columns have been deleted except two: Gradebook No. and Last Name. In addition, all duplicate lines have also been removed.

^

3. Unary renaming operation

The last unary operator we'll look at is attribute renaming operation. If we talk about a relation as a table, then the renaming operation is needed in order to change the names of all or some columns.

^ Rename operator as follows: ρ<φ >, here φ – rename function .

This function sets mutually

one-to-one correspondence between schema attribute names S And Ŝ, where respectively S – scheme of the original relation, and Ŝ relationship diagram with renamed attributes. So the operator ρ <φ> in relation to r (S) gives a new relationship with the schema Ŝ , consisting of tuples of the original relation with only the renamed attributes.

Let's write the operation of renaming attributes in terms of database management systems:

ρ <φ > r (S) ≡ ρ <φ >r = {ρ <φ > t (S)| tr };

Here is an example of using this operation:

Let's consider the already familiar Session relation, with the diagram:

S: Session (grade book number, Last name, Subject, Grade);

Let's introduce a new relation schema Ŝ, with other attribute names that we would like to see instead of the existing ones:

Ŝ : (ZK No., Last name, Subject, Point);

For example, the database customer wanted to see other names in your ready-made relation. To implement this order, you need to design the following rename function:

φ : (grade book number, last name, subject, grade) → (grade book number, last name, subject, score);

In fact, only two attributes need to be renamed, so it would be legal to write the following renaming function instead of the existing one:

φ : (grade book number, grade) (No. ZK, Point);

t 0 (S) r (S): ((grade book number: 100), (Last name: ‘Ivanov’), (Subject: ‘Databases’), (Score: 5));

Let's apply the rename operator to this tuple:

ρ<φ>t 0 (S): ((ZK No.: 100), (Last name: ‘Ivanov’), (Subject: ‘Databases’), (Score: 5));

So, this is one of the tuples of our relation that has had its attributes renamed.

In tabular terms, the relation

ρ < № зачетной книжки, Оценка “ZK No., Point > Session –

this is a new table obtained from the Session relationship table by renaming the specified attributes.

^

4. Properties of unary operations

Unary operations, like any other, have certain properties. Let's look at the most important of them.

The first property of the unary operations of selection, projection and renaming is a property that characterizes the ratio of cardinalities of relations. (Recall that cardinality is the number of tuples in a particular relation.) It is clear that here we are considering, respectively, the original relation and the relation obtained as a result of applying a particular operation.

Note that all properties of unary operations follow directly from their definitions, so they can be easily explained and even, if desired, deduced independently.

1) power ratio:

a) for the sampling operation: | σ <P >r |≤ |r |;

b) for the projection operation: | r [S" ] | ≤ |r |;

c) for the renaming operation: | ρ <φ >r | = |r |;

In total, we see that for two operators, namely the sampling operator and the projection operator, the power of the original relations - operands is greater than the power of the relations obtained from the original ones using the corresponding operations. This is because the selection process associated with these two selection and projection operations eliminates some rows or columns that do not satisfy the selection conditions. In the case when all rows or columns satisfy the conditions, the cardinality (i.e., the number of tuples) does not decrease, so the inequality in the formulas is not strict.

In the case of a renaming operation, the power of the relation does not change, due to the fact that when changing names, no tuples are excluded from the relation;

2) idempotency property:

a) for the sampling operation: σ <P > σ <P >r = σ <P >;

b) for the projection operation: r [S' ] [S' ] = r [S" ];

c) for the renaming operation, in the general case, the idempotency property is not applicable.

This property means that twice sequential application of the same operator to any

or relation is equivalent to its single use.

For the operation of renaming relation attributes, generally speaking, this property can be applied, but always with special reservations and conditions.

The idempotency property is very often used to simplify the form of an expression and bring it to a more economical, relevant form.

The last property we will consider is the property of monotonicity. It is interesting to note that under any conditions all three operators are monotonic;

3) property of monotonicity:

a) for the sampling operation: r 1 r 2 ⇒ σ <P > r 1 σ <P >r 2 ;

b) for the projection operation: r 1 r 2 r 1 [S" ] r 2 [S" ];

c) for the renaming operation: r 1 r 2 ρ <φ >r 1 ⊆ ρ <φ >r 2 ;

The concept of monotonicity in relational algebra is similar to the same concept from ordinary, general algebra. Let us explain: if initially the relationship r 1 and r 2 were connected in such a way that rr 2 , then even after applying any of the three sampling, projection or renaming operators, this relationship will remain the same.

Relational algebra is a language of operations performed on relationships - tables of a relational database. The operations of relational algebra allow you to create another relation based on one or more relations without changing the original relations themselves. The resulting other relation is usually not written to the database, but exists as a result of the SQL query - an array created by functions for working with databases in programming languages. For each relational algebra operation, its implementation will be given in the form of queries in the SQL language.

Let's consider the operations of relational algebra. So that you are not distracted by the contents of tables that are not your databases, such as “Products”, “Drivers”, “plums”, “pears”, “tea”, “coffee”, Vladimir, Sergei, etc. We will perform operations on relationships (tables) with abstract data, such as R1, R2 (names of tables - relationships), etc. and A1, A2, A3 (names of attributes - columns) and h15, w11, etc. (contents of database table records).

Priorities for performing relational algebra operations (in descending order of list items, and in one item - operations with equal priorities):

  • selection, projection
  • Cartesian product, connection, intersection, division
  • union, difference.

Sampling operation

The select operation operates on one relation and determines the resulting relation R, which contains only those tuples (or rows, or records), relations that satisfy a given condition (predicate P ).

Thus, the fetch operation is a unary operation and is written as follows:

Where P- predicate (logical condition).

SQL Query

Now let's see what happens when we run this relational algebra operation and its corresponding SQL query. The table below shows one relation that this operation works on.

R3
A1A2A3A4
3 hhylms
4 ppa1sr
1 rrylms

We look at column A3 and establish that the predicate A3>"d0" is satisfied by the entries in the first and third rows of the original relation (since the number of the letter y in the alphabet is greater than the number of the letter d). As a result, we get the following new relation, which contains two lines:

R
A1A2A3A4
3 hhylms
1 rrylms

This material will help you combine all kinds of logical conditions for selections. "Boolean algebra (algebra of logic)" .

SQL Query

SELECT A1, A2, A3 from R1 UNION SELECT A1, A2, A3 from R2

Now let's see what happens when we run this relational algebra operation and its corresponding SQL query. Now two relations are given, since the union operation is a binary operation:

R1 R2
A1A2A3A1A2A3
Z7aaw11X8ppk21
B7hhh15Q2eeh15
X8ppw11X8ppw11

We combine the lines of the first and second relation and see that the third line, which is the third in both the first and second relation, is identical, so we include it in the new relation only once. We get the following relation:

R
A1A2A3
Z7aaw11
B7hhh15
X8ppw11
X8ppk21
Q2eeh15

The following is important: a join operation can be performed only when two relations have the same number and names of attributes (columns), or, formally speaking, are join compatible.

Intersection operation

The result of the intersection of two sets (relations) A and B () will be a set (relation) C that includes those and only those elements that are in both set A and set B. The operation of intersection of relational algebra is identical to the operation.

SQL Query

SELECT A1, A2, A3 from R1 INTERSECT SELECT A1, A2, A3 from R2

Some SQL dialects do not have the INTERSECT keyword. Its replacement, for example, in MySQL and others, is INNER JOIN. How the SQL JOIN operator works in general and its varieties INNER JOIN, LEFT OUTER JOIN, RIGHT OUTER JOIN and FULL OUTER JOIN - in the lesson SQL JOIN - joining database tables .

MySQL Query

Now let's see what happens when we run this relational algebra operation and its corresponding SQL query. Again, two relations R1 and R2 are given:

R1 R2
A1A2A3A1A2A3
Z7aaw11X8ppk21
B7hhh15Q2eeh15
X8ppw11X8ppw11

We look through all the records in two relations, and find that in both the first and second relations there is one row - the one that is third in both the first and second relations. We get a new relation:

R
A1A2A3
X8ppw11

Difference operation

The difference of two relations R1 and R2 () consists of tuples (or records, or rows) that are present in relation R1, but not in relation R2. The relations R1 and R2 must be join compatible. The relational algebra difference operation is identical to the operation.

SQL Query

SELECT A1, A2, A3 from R2 EXCEPT
SELECT A1, A2, A3 from R1

Let's establish what happens as a result of executing this relational algebra operation and the corresponding SQL query. Again, two relations R1 and R2 are given:

R1 R2
A1A2A3A1A2A3
Z7aaw11X8ppk21
B7hhh15Q2eeh15
X8ppw11X8ppw11

From the relation R2 we exclude the row that also exists in the relation R2 - the third - and we obtain a new relation:

R
A1A2A3
X8ppw11
Q2eeh15

Cartesian product operation

The Cartesian product operation () defines a new relation R, which is the result of concatenating each tuple of relation R1 with each tuple of relation R2.

SQL Query

SELECT * from R3, R4

Let's establish what happens as a result of executing this relational algebra operation and the corresponding SQL query. Given two relations R3 and R4:

R3 R4
A1A2A3A4A5A6
3 hhylms3 hh
4 ppa1sr4 pp
1 rrylms

The new relation must contain all the attributes (columns) of the two relations. First, the first row of relation R3 is concatenated with each of the two rows of relation R4, then the second row of relation R3, then the third. The result should be 3 X 2 = 6 tuples (rows). We get this new relation:

R
A1A2A3A4A5A6
3 hhylms3 hh
3 hhylms4 pp
4 ppa1sr3 hh
4 ppa1sr4 pp
1 rrylms3 hh
1 rrylms4 pp

Relational algebra, as you might guess, this is a special type of algebra in which all operations are performed on relational data models, i.e., on relationships.

In tabular terms, a relationship includes rows, columns, and a row—the column header. Therefore, natural unary operations are the operations of selecting certain rows or columns, as well as changing column headings - renaming attributes.

1. Unary fetch operation

The first unary operator we'll look at is fetch operation– the operation of selecting rows from a table representing a relationship according to some principle, i.e. selecting tuple rows that satisfy a certain condition or conditions.

Sampling operator denoted by ? <P>, sampling conditionP<S>, i.e., operator ? is always taken with a certain condition on tuples P, and the condition itself P is written depending on the relation scheme S. Considering all this myself fetch operation over the relation scheme S in relation to the relation r

? <P>r(S) ? ? <P>r = {t(S) |t ? r & P<S>t} = {t(S) |t ? r & IfNull(P<S>t, False};

The result of this operation will be a new relation with the same relation schema S, consisting of those tuples t(S) of the original operand relation that satisfy the sampling condition P t. It is clear that in order to apply some condition to a tuple, it is necessary to substitute the values ​​of the tuple’s attributes instead of the attribute names.

To better understand the principle of this operation, let's give an example. Let the following relationship diagram be given:

S: Session (grade book number, Last name, Subject, Grade).

Let's take the following sampling condition:

P<S> = (Subject = ‘Computer Science’ and Grade > 3).

We need to select from the original operand relation those tuples that contain information about students who passed the Computer Science subject with at least three points.

Let also be given the following tuple from this relation:

t 0 (S) ? r(S

Applying our selection condition to the tuple t 0 , we get:

P t 0 = (‘Databases’ = ‘Computer Science’ and 5 > 3);

On this particular tuple, the selection condition is not satisfied.

In general, the result of this particular sample

? <Предмет = "Информатика" and Оценка >3 > Session

there will be a “Session” table in which rows that satisfy the selection condition are left.

2. Unary projection operation

Another standard unary operation we will study is the projection operation. Projection operation is the operation of selecting columns from a table representing a relationship based on some characteristic. Namely, the machine selects those attributes (that is, literally those columns) of the original operand relation that were specified in the projection.

Projection operator denoted by [ S"] or ? . Here S"– subcircuit of the original relation schema S, i.e. some of its columns. What does this mean? This means that S' has fewer attributes than S, because in S" Only those for which the projection condition was satisfied remained. And in the table representing the relation r(S"), there are as many rows as there are in the table r(S), and there are fewer columns, since only those corresponding to the remaining attributes remain. So the projection operator ?< S"> in relation to r(S) results in a new relation with a different relation schema r(S"), consisting of projections t(S) [S"] tuples of the original relation. How are these tuple projections determined? Projection any tuple t(S) original relation r(S) per subcircuit S" is determined by the following formula:

t(S) [S'] = {t(a)|a ? def(t) ? S’}, S" ?S.

It is important to note that duplicate tuples are excluded from the result, i.e., there will be no duplicate rows in the table representing the new resulting relation.

Taking into account all of the above, the projection operation in terms of database management systems will look like this:

? <S">r(S) ? ? <S'>r ? r(S) [S’] ? r [S" ] = {t(S) [S'] | t ? r };

Let's look at an example illustrating the principle of the sampling operation.

Let the relation “Session” and the schema of this relation be given:

S: Session (grade book number, Last name, Subject, Grade);

We will be interested in only two attributes from this scheme, namely “grade book number” and “Last name” of the student, so the subschema S" will look like this:

S": (grade book number, last name).

Need an initial attitude r(S) project onto the subcircuit S".

t 0 (S) ? r(S): ((grade book number: 100), (Last name: ‘Ivanov’), (Subject: ‘Databases’), (Score: 5));

This means that the projection of this tuple onto this subcircuit S" will look like this:

t 0 (S) S": ((grade book number: 100), (Last name: ‘Ivanov’));

If we talk about the projection operation in terms of tables, then the projection Session [Gradebook No., Last Name] of the original relation is the Session table, from which all columns have been deleted except two: Gradebook No. and Last Name. In addition, all duplicate lines have also been removed.

3. Unary renaming operation

The last unary operator we'll look at is attribute renaming operation. If we talk about a relation as a table, then the renaming operation is needed in order to change the names of all or some columns.

Rename operator as follows: ?>, here ? - rename function.

This function establishes a one-to-one mapping between schema attribute names S And S, where respectively S- schema of the original relation, and S - relationship diagram with renamed attributes. So the operator ? <?> in relation to r(S) gives a new relationship with the schema S, consisting of tuples of the original relation with only the renamed attributes.

Let's write the operation of renaming attributes in terms of database management systems:

? <? > r(S) ? ? <? >r = {? <? > t(S)| t ? r};

Here is an example of using this operation:

Let's consider the already familiar Session relation, with the diagram:

S: Session (grade book number, Last name, Subject, Grade);

Let's introduce a new relation schema S, with different attribute names that we would like to see instead of the existing ones:

S:

For example, the database customer wanted to see other names in your ready-made relation. To implement this order, you need to design the following rename function:

? : (grade book number, last name, subject, grade) > (grade book number, last name, subject, score);

In fact, only two attributes need to be renamed, so it would be legal to write the following renaming function instead of the existing one:

? : (grade book number, grade) > (No. ZK, Point);

t 0 (S) ? r(S): ((grade book number: 100), (Last name: ‘Ivanov’), (Subject: ‘Databases’), (Score: 5));

Let's apply the rename operator to this tuple:

?t 0 (S): ((ZK No.: 100), (Last name: ‘Ivanov’), (Subject: ‘Databases’), (Score: 5));

So, this is one of the tuples of our relation that has had its attributes renamed.

In tabular terms, the relation

? < № зачетной книжки, Оценка > “ZK No., Point > Session -

this is a new table obtained from the Session relationship table by renaming the specified attributes.

4. Properties of unary operations

Unary operations, like any other, have certain properties. Let's look at the most important of them.

The first property of the unary operations of selection, projection and renaming is a property that characterizes the ratio of cardinalities of relations. (Recall that cardinality is the number of tuples in a particular relation.) It is clear that here we are considering, respectively, the original relation and the relation obtained as a result of applying a particular operation.

Note that all properties of unary operations follow directly from their definitions, so they can be easily explained and even, if desired, deduced independently.

1) power ratio:

a) for the sampling operation: | ? <P>r |? |r|;

b) for the projection operation: | r[S"] | ? |r|;

c) for the renaming operation: | ? <? >r | = |r|;

In total, we see that for two operators, namely the sampling operator and the projection operator, the power of the original relations - operands is greater than the power of the relations obtained from the original ones using the corresponding operations. This is because the selection process associated with these two selection and projection operations eliminates some rows or columns that do not satisfy the selection conditions. In the case when all rows or columns satisfy the conditions, the cardinality (i.e., the number of tuples) does not decrease, so the inequality in the formulas is not strict.

In the case of a renaming operation, the power of the relation does not change, due to the fact that when changing names, no tuples are excluded from the relation;

2) idempotency property:

a) for the sampling operation: ? <P> ? <P>r = ? <P>;

b) for the projection operation: r [S'] [S'] = r [S"];

c) for the renaming operation, in the general case, the idempotency property is not applicable.

This property means that double sequential application of the same operator to any relation is equivalent to its single application.

For the operation of renaming relation attributes, generally speaking, this property can be applied, but always with special reservations and conditions.

The idempotency property is very often used to simplify the form of an expression and bring it to a more economical, relevant form.

The last property we will consider is the property of monotonicity. It is interesting to note that under any conditions all three operators are monotonic;

3) property of monotonicity:

a) for the sampling operation: r 1 ? r 2 ? ? <P> r 1 ? ? <P>r 2 ;

b) for the projection operation: r 1 ? r 2 ? r 1 [S"] ? r 2 [S"];

c) for the renaming operation: r 1 ? r 2 ? ? <? >r 1 ? ? <? >r 2 ;

The concept of monotonicity in relational algebra is similar to the same concept from ordinary, general algebra. Let us explain: if initially the relationship r 1 and r 2 were connected in such a way that r ? r 2 , then even after applying any of the three sampling, projection or renaming operators, this relationship will remain the same.