General classification and characteristics of distributed information systems technologies. Distributed information systems and networks

    Architecture of distributed systems and basic concepts of distributed data processing

    Open systems concept

    Advantages of the open systems ideology

    Open systems and object-oriented approach

    Computer (information) networks

    Global networks

    Local networks

    Multiprocessor computers

    Interacting processes

  1. Architecture of distributed systems and basic concepts of distributed data processing

An information system is called distributed if it is not located within a single controlled territory or a single facility.

A distributed information system (RIS) is any information system that makes it possible to organize the interaction of independent but interconnected computers. Such systems are intended to automate objects characterized by the territorial distribution of the points where information originates and is consumed.

In the general case, a distributed information system (RIS) is a set of concentrated information systems (IS) joined into a unified system by a communication subsystem.

A concentrated IS can be:

    individual computers, including personal computers,

    computing systems and complexes,

    local computer networks (LAN).

Non-intelligent subscriber stations that do not include a computer are practically no longer used. It is therefore reasonable to assume that the smallest structural unit of a RIS is a computer (Fig. 1).

Distributed IS are built using network technologies and constitute computer networks.

The term "distributed system" refers to an interconnected collection of autonomous computers, processes or processors. These computers, processes or processors are referred to as the nodes of the distributed system. To qualify as "autonomous", nodes must at least be equipped with their own control unit. Thus a single instruction, multiple data (SIMD) parallel computer does not fall within the definition of a distributed system. To qualify as "interconnected", nodes must be able to exchange information.

Because processes can act as nodes in a system, the definition includes software systems built as a set of interacting processes, even if they run on the same hardware platform. In most cases, however, a distributed system will at least contain several processors connected by switching hardware.

The communication subsystem includes:

    communication modules (CM);

    communication channels;

    concentrators;

    internetwork gateways (bridges).

The main function of a communication module is to forward a received packet to another CM or to a subscriber station in accordance with the transmission route. A communication module is also called a packet switching center.
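As a rough illustration of this forwarding function, the sketch below models each communication module as a lookup table from destination to next hop. The module names, the stations A, B and C, and the `ROUTES` table are hypothetical and are not taken from Fig. 1.

```python
# Hypothetical routing tables: for each communication module (CM),
# map a destination subscriber station to the next hop on the route.
ROUTES = {
    "CM1": {"A": "A", "B": "CM2", "C": "CM2"},
    "CM2": {"A": "CM1", "B": "B", "C": "CM3"},
    "CM3": {"A": "CM2", "B": "CM2", "C": "C"},
}

def deliver(packet, start_cm):
    """Forward a packet hop by hop until it reaches its destination station."""
    path = [start_cm]
    node = start_cm
    while node != packet["dst"]:
        node = ROUTES[node][packet["dst"]]  # next CM or the station itself
        path.append(node)
    return path

print(deliver({"dst": "C", "payload": "hello"}, "CM1"))
```

Each module only needs to know the next hop, not the whole route, which is what allows the route to be changed by updating tables rather than packets.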

Fig. 1. Fragment of a distributed information system

Communication channels combine the network elements into a single network; the channels may have different data transmission speeds.

Concentrators are used to multiplex information before it is transmitted over high-speed channels.

Internetwork gateways and bridges are used to connect the network to a LAN or to connect segments of global networks. Bridges connect network segments that use the same network protocols.

In any RIS, in accordance with the functional purpose, three subsystems can be distinguished:

    user subsystem;

    control subsystem;

    communication subsystem.

The user (subscriber) subsystem includes the information systems of users (subscribers) and is intended to meet users' needs for storing, processing and receiving information.

The presence of a control subsystem makes it possible to combine all elements of the RIS into a single system in which the elements interact according to uniform rules. The subsystem ensures the interaction of system elements by collecting and analyzing service information and by acting on the elements in order to create optimal conditions for the functioning of the entire network.

The communication subsystem transfers information across the network on behalf of users and of RIS management.

The functioning of RIS can be considered as the interaction of remote processes through a communication subsystem.

Computer network processes are generated by users (subscribers) and other processes.

The interaction of remote processes takes the following forms:

    file sharing,

    forwarding messages by email,

    submitting requests to run programs and obtain results,

    accessing databases, etc.

Conceptually, distributed data processing implies some form of communication network organization and the decentralization of three categories of resources:

    computing hardware and computing power itself;

    databases;

    system management.

In distributed information systems, the following basic functions are implemented to varying degrees:

    access to resources (computing power, programs, data, etc.) from terminals and from user programs in “file server” mode;

    performing tasks and interactive communication between users and programs launched at their request in the “client-server” mode;

    collecting statistics on the functioning of the system;

    ensuring the reliability and survivability of the system as a whole.

Currently, various approaches are used to classify distributed information systems according to different criteria.

By degree of homogeneity, the following are distinguished:

    completely heterogeneous RIS;

    partially heterogeneous RIS;

    homogeneous RIS.

Completely heterogeneous RIS combine computers that are built on different architectures and run under different operating systems (OS).

Typically, RIS of this type use global networks based on X.25, Frame Relay, ATM or Internet technology as their communication service.

Partially heterogeneous RIS are built either on computers of the same type running different operating systems, or on computers of different types running the same OS.

For example, IBM PC computers may run different operating systems: MS-DOS, OS/2, Windows 95, Windows NT.

Homogeneous distributed systems are built on computing facilities of the same type, equipped with the same operating systems.

By architectural features, the following are distinguished:

    RIS based on teleprocessing systems;

    RIS based on network technology.

Network technology is understood as a form of computer interaction in which any process on one machine can, on its own initiative, establish a logical connection with any process on any other computer.

In contrast, RIS based on teleprocessing systems do not ensure complete, symmetrical and independent interaction of processes.

By degree of distribution, from the user's perspective, RIS are divided into two groups: regional and local.

Regional RIS include distributed configurations characterized by the following main parameters:

    unlimited geographical distribution;

    the presence of routing mechanisms;

    each pair of nodes is connected by its own channel, so the problem of sharing a channel does not arise;

    a wide range of transmission speeds: 10^3 ... 10^8 bit/s;

    arbitrary topology.

There are several ways to organize interaction between computers:

    circuit switching;

    message switching;

    packet switching;

    frame switching (Frame Relay);

    cell switching (ATM technology).

Local RIS are based on local networks with the following characteristics:

    small geographical distribution;

    the use of a single shared communication medium and, consequently, the full physical connectivity of all network nodes, so that routing is replaced by addressing;

    high and very high exchange rates: 10^7 ... 10^9 bit/s;

    the use of special methods and algorithms for accessing the shared medium, which provide high-speed transmission while the medium is used simultaneously by all nodes of the communication service;

    a limited set of possible topologies.

The architecture of a RIS is understood as the interrelation of its logical, physical and software structures.

The logical structure of a RIS reflects the composition of the network services and the connections between them (Fig. 2).

In this structure, the information and computing service is designed to solve the problems of network users.

The terminal service ensures the interaction of terminals with the network.

This service includes:

    conversion of formats and codes,

    management of different types of terminals,

    processing procedures for exchanging information between terminals and the network, etc.

The transport service is designed to solve all problems related to the transmission of messages over the network.

It handles:

    routing,

    flow and data control,

    decomposition of messages into packets, and a number of other functions.
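The decomposition of messages into packets can be sketched as follows. This is a minimal illustration, not a real transport protocol: the packet fields `seq`, `total` and `data` and both function names are assumptions made for the example.

```python
def split_into_packets(message: bytes, size: int):
    """Decompose a message into numbered packets of at most `size` bytes."""
    count = (len(message) + size - 1) // size   # ceiling division
    return [
        {"seq": i, "total": count, "data": message[i * size:(i + 1) * size]}
        for i in range(count)
    ]

def reassemble(packets):
    """Rebuild the message; packets may arrive in any order."""
    ordered = sorted(packets, key=lambda p: p["seq"])
    return b"".join(p["data"] for p in ordered)

packets = split_into_packets(b"distributed information system", 8)
print(len(packets), reassemble(reversed(packets)))
```

The sequence number is what lets the receiving side restore the original order when packets travel along different routes.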

The interface service solves the problems of ensuring interaction between computers of different types, which run different operating systems and have different architectures, word lengths, data representation formats, etc.

In addition, the interface service handles the interaction of computers that belong to different networks.

The administrative service:

    manages the network,

    implements reconfiguration and recovery procedures,

    collects statistics on the functioning of the network,

    carries out network testing.

The complete set of logical structure elements given above is not mandatory for all real systems.

For example, homogeneous networks have no need for an interface service, and the simplest networks may lack an administrative service, etc.

The information and computing service (IVS) and the terminal service together form the subscriber service.

The interface and transport services form the communication service.

It follows that the administrative service does not directly perform any functions related to serving network users and can be regarded as the mechanism by which the network services itself.

The distribution of the logical structure elements across different computers defines the physical structure of the RIS (Fig. 3).

The elements of such a structure are computers connected to each other and to terminals.

Depending on which network service is implemented on a given computer, the following can be distinguished in the physical structure:

1 - main computers;

2 - communication computers;

3 - interface computers;

4 - terminal computers;

5 - administrative computers.

Several services can be implemented on one computer.

The software structure of a RIS reflects the composition of the network software components and the connections between them.

Obviously, the composition of the network software is determined by the logical structure, that is, by the functions performed by its services.

At the same time, the connections between software components depend largely on the physical structure.

The complexity of the tasks performed by the network software of a distributed information system requires that this software be developed in a highly structured manner. Network software is now always organized as a collection of modules, each of which performs well-defined functions and builds on the services offered by other modules. In network architectures there is always a strict hierarchy among these modules, because each module uses only the services offered by the module below it. In the context of network implementation, these modules are called levels.
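The strict hierarchy of levels can be illustrated with a small sketch in which every level wraps the data in its own header and then calls only the level directly beneath it. The `Layer` class and the three level names are hypothetical and do not correspond to any particular standard.

```python
class Layer:
    """A module that uses only the services of the level directly below it."""
    def __init__(self, name, lower=None):
        self.name = name
        self.lower = lower

    def send(self, data: str) -> str:
        framed = f"[{self.name}]{data}"      # add this level's header
        return self.lower.send(framed) if self.lower else framed

# Hypothetical three-level stack, built bottom-up.
link = Layer("link")
transport = Layer("transport", lower=link)
application = Layer("app", lower=transport)

print(application.send("hello"))   # [link][transport][app]hello
```

Because the application level never calls the link level directly, the lowest level can be replaced (for example, when the equipment changes) without touching the code above it.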

Network software has a multi-level hierarchical organization, which is driven by two factors:

    the need to minimize the cost of modifying network software when the composition of the equipment changes;

    the requirement that changes made to the network should not affect user programs that use network capabilities.

A hierarchical organization requires a clear description of interfaces and protocols, i.e. the rules of interaction between:

    programs that run on the same computer and are located at different levels,

    programs that are located at the same level but on different computers.

The desire to create a network architecture that is unified, universal and open to changes in its logical and physical structures led to the standardization of the software hierarchy levels of computer networks.


Course structure. Lectures
– Distributed systems: tasks, terminology, operating principles.
– Client-server architecture. Typical tasks. Areas of use. Example of an information system (a typical application in a client-server architecture).
– Multi-tier architecture. Areas of use. Brief overview of modern technologies: XML, CGI/JSP, Servlets, DCOM, CORBA, RMI (.NET).
– Selecting layers in a multi-tier architecture (typical architecture). "Thin" and "thick" clients. Application server. Database server.
– Migration of objects (distribution of computational load). System deployment.
– CORBA basics. CORBA and OOP. Interface Definition Language (IDL). IDL mapping to C++. IDL mapping to Java. ORB. Dynamic interaction between clients and servers. CORBA naming services.
– An example of an information system implemented in a multi-tier architecture.


Course structure. Practice
Laboratory work 1. Discount card servicing system
– Required tools: server - Oracle (MS SQL Server 2000 SP3), client - Java (JDK, VisualCafe, MS J++, ...)
Laboratory work 2. WMS (Warehouse Management System)
– Thin client (Web, handheld, cellular telephone, …). Application server. Interaction between client and application server. Business logic server. Distribution of computing load. Ensuring fault tolerance.
– Required tools: server - Oracle (MS SQL Server 2000 SP3), application/business logic - Java (JDK, VisualCafe, MS J++, ...)










Distributed Systems: Definitions
– A distributed system is a collection of independent nodes (computers) that appears to the user as a single computer.
– A distributed system is a collection of independent computers connected by a network, with software that ensures their joint functioning.


Consequences...
No global time
– Asynchronous message transmission
– Limited accuracy of clock synchronization
No global system state
– No single process in a distributed system knows the current global state of the system
– This is a consequence of parallelism and of the data transfer mechanism


Consequences...
Failures
– Processes execute autonomously, in isolation
– Failures of individual processes may remain undetected
– Individual processes may be unaware of a system-wide failure
– Failures occur more often than in a centralized system
– New causes of failure appear (ones that did not exist in monolithic systems)
– Network failures isolate processes and fragment the system


Principles of separation
Functional separation: nodes perform different tasks
– Client / server
– Host / terminal
– Data collection / data processing
– Solution: creation of shared services
Natural separation (defined by the task)
– Service system for a supermarket chain
– Network to support teamwork


Principles of separation
Load sharing/balancing: assigning tasks to processors so as to optimize the total load on the system.
Power amplification: different nodes work on the same task
– Distributed systems containing a set of microprocessors can approach the power of a supercomputer
– 10000 CPUs at 50 MIPS each give 500000 MIPS together; a single processor of that power would have to execute a command in 0.002 nsec, during which light travels 0.6 mm, and any existing chip is larger than that!
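The figures on this slide can be checked with a few lines of arithmetic. The sketch below assumes a rounded speed of light of 3.0e8 m/s; the variable names are chosen only for this example.

```python
CPUS = 10_000
MIPS_PER_CPU = 50
SPEED_OF_LIGHT_M_S = 3.0e8          # rounded value, an assumption of this sketch

total_mips = CPUS * MIPS_PER_CPU                    # aggregate rate in MIPS
seconds_per_instruction = 1 / (total_mips * 1e6)    # for one CPU of that speed
light_travel_mm = SPEED_OF_LIGHT_M_S * seconds_per_instruction * 1000

print(total_mips)                       # 500000
print(seconds_per_instruction * 1e9)    # about 0.002 (nanoseconds)
print(light_travel_mm)                  # about 0.6 (millimetres)
```

Since a signal cannot cross a chip larger than 0.6 mm within one instruction time, this level of performance is reachable only by combining many processors, which is the point of the slide.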


Principles of separation
Physical separation: the system is built on the assumption that the nodes are physically separated (reliability and fault tolerance requirements).
Economic: a set of cheap chips can provide a better price/performance ratio than a mainframe
– Mainframe: 10 times faster, 1000 times more expensive














Resource sharing
Resource sharing is often one of the reasons for developing a distributed system
– Reduces cost (file and print servers)
– Shares data among users (collaboration on a project)
Services
– Manage a set of resources
– Provide services to users


Resource sharing
A server is used to provide services
– Receives service requests from clients (operation calls)
– Receives a message and replies to it; in its complete form this is a remote call
– The roles of client and server change from call to call: the same process can be both a client and a server
– Client/server terminology applies to processes, not nodes!




Distributing the application
– Fragmentation: dividing the application into modules for distribution
– Configuration: connecting the modules with each other (dependencies)
– Placement: loading the modules onto the target system; distribution of computing modules between nodes (static or dynamic)






Heterogeneity
Middleware: an intermediate software layer
– Allows heterogeneous nodes to communicate
– Defines a homogeneous computing model
– Supports one or more programming languages
– Provides support for distributed applications: remote object invocation, remote SQL calls, distributed transaction processing
– Examples: CORBA, Java RMI, Microsoft DCOM


Heterogeneity
Mobile code: code designed to migrate between nodes
– Hardware differences (different instruction sets) have to be overcome
Virtual machines
– The compiler produces bytecode for the VM
– The VM is implemented for all hardware platforms (Java)
Brute-force method
– The code is ported to each platform...






Security
Scenario 1: access to test results via NFS
– How do we know that the user is a teacher who has access to the data?
– Authorization
Scenario 2: sending a credit card number to an online store
– No one except the recipient should be able to read the data
– Cryptography






Scalability
Cost of physical resources
– Increases as the number of users increases
– Should not increase faster than O(n), where n = number of users
Performance overhead
– Increases with data size (and number of users)
– Search time should not increase faster than O(log n), where n = data size
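One familiar way to meet the O(log n) bound on search time is binary search over sorted data. The sketch below counts the probes so that the growth can be observed; the function name and the use of `range` as a stand-in for a sorted table are assumptions of the example.

```python
def binary_search_steps(sorted_data, target):
    """Return (index or None, number of probes) for a binary search."""
    lo, hi, steps = 0, len(sorted_data) - 1, 0
    while lo <= hi:
        steps += 1
        mid = (lo + hi) // 2
        if sorted_data[mid] == target:
            return mid, steps
        if sorted_data[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return None, steps

for n in (1_000, 1_000_000):
    data = range(n)                     # already sorted and indexable
    _, steps = binary_search_steps(data, n - 1)
    print(n, steps)
```

Growing the data a thousandfold adds only about ten probes, whereas a linear scan would need a thousand times more work.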










Parallelism
Concurrency control
– Access of multiple threads to a resource: proper scheduling of access from parallel threads (avoiding deadlocks, transactions)
– Synchronization (semaphores): safe, but reduces performance
– Shared objects (resources) must work correctly in a multi-threaded environment
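A minimal sketch of synchronized access to a shared resource is shown below, using a lock from Python's standard `threading` module (a binary semaphore would serve the same purpose). The shared counter and the function names are assumptions of the example.

```python
import threading

counter = 0
lock = threading.Lock()   # a binary semaphore would serve the same purpose

def deposit(times):
    global counter
    for _ in range(times):
        with lock:            # only one thread at a time updates the resource
            counter += 1

def run(workers=4, times=10_000):
    """Start the worker threads, wait for them, and return the final count."""
    global counter
    counter = 0
    threads = [threading.Thread(target=deposit, args=(times,)) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter

print(run())   # 40000
```

The lock makes every increment safe at the price of serializing the threads, which is the trade-off between safety and performance noted on the slide.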




Transparency
– Access transparency: access to local and remote resources through identical calls
– Location transparency: access to resources regardless of their physical location
– Concurrency transparency: the ability of multiple processes to work on resources in parallel without affecting each other
– Replication transparency: the ability to use multiple instances of the same resource without knowledge of the physical details of replication
– Failure transparency: protection of software components from failures that occur in other software components; recovery after failures
– Mobility transparency: the ability to move an application between platforms without redesigning it
– Performance transparency: the ability to configure the system to increase performance when the composition of the runtime platform changes
– Scalability transparency: the ability to increase performance without changing the design of the software system or the algorithms used






Results
Distributed system:
– Autonomous nodes (connected by a data transmission medium)
– Interaction through message passing
There are many examples showing that distributed systems are needed, and one must be able to build them
Distributed systems exist, and one must be able to develop and support them














Table of contents

INTRODUCTION
1. THE CONCEPT OF DISTRIBUTED IS
1.1. Prerequisites for creating distributed IS
1.2. The concept of distributed information systems
1.3. Tools for working with distributed data
2. DISTRIBUTED DATABASES
2.1. Basic principles
2.2. Types of distributed databases
2.3. Purpose and principle of operation of a distributed database
3. EXAMPLES OF DISTRIBUTED SYSTEMS
CONCLUSION
LITERATURE

INTRODUCTION

This topic is relevant because processes of globalization and information integration are under way in the world economy. They have also affected our country, which, owing to its geographical location and size, is forced to use distributed information systems (IS). Distributed information systems work with data that is located on different servers and on various hardware and software platforms and is stored in various formats. They are easily expandable, are based on open standards and protocols, allow their resources to be integrated with other information systems, and provide users with simple interfaces.
There is a huge amount of ready-to-use information and computing resources in the world. They were created at different times, and different approaches were used to develop them. Almost always, when developing a new information system, one can find ready-made components with suitable functions. The problem is that compatibility requirements were not taken into account when they were created. These components do not understand each other and cannot work together. It is desirable to have a mechanism, or a set of mechanisms, that makes such independently developed information and computing resources interoperable.
This paper presents basic information about distributed information systems: it describes the prerequisites for their development and the means of working with data, and introduces the concept of a distributed database together with its types and basic principles. The third chapter presents examples of distributed information systems, such as: Informix On-Line from Informix Software; Ingres Intelligent Database from Ingres Corp; Oracle (version 7) from Oracle Corp; Sybase System 10 from Sybase Inc.
The purpose of this work is to study the theoretical foundations of distributed information systems and to build an understanding of the principles by which they operate.
Distributing data makes it possible, for example, to store at a network node the data that is used most often at that node. This approach makes working with that data easier and faster while leaving the rest of the database accessible.

1. THE CONCEPT OF DISTRIBUTED IS
1.1. Prerequisites for creating distributed IS

From the very beginning of the development of computer technology, two main directions of its use emerged. The first direction is the use of computer technology to perform numerical calculations that take too long or are impossible to perform manually. The emergence of this direction stimulated the development of methods for the numerical solution of complex mathematical problems, the development of a class of programming languages oriented toward the convenient notation of numerical algorithms, and the establishment of feedback with the developers of new computer architectures.
The second direction is the use of computer technology in automatic or automated information systems. Typically, the volumes of information that such systems have to deal with are quite large, and the information itself has a rather complex structure. Natural requirements for such systems are an adequate average speed of operations and the preservation of information.
But because information systems require complex data structures, these additional, system-specific data management facilities were an essential part of information systems and were practically repeated from one system to the next. The desire to identify and generalize the common part of information systems, the part responsible for managing complexly structured data, was apparently the first motive for creating various data management systems.
Very soon it became clear that a common library of programs implementing more complex data storage methods on top of the standard basic file system, for example storing information in several files, was not enough. All of this contributed to the creation of distributed information systems.
In fact, if an information system supports the consistent storage of information in multiple files, it can be said to support a database. If some auxiliary data management system makes it possible to work with multiple files while ensuring their consistency, it can be called a database management system. The requirement of maintaining data consistency across multiple files alone cannot be met by a library of functions: such a system must have some data of its own (metadata) and even knowledge that determines the integrity of the data.

1.2. The concept of distributed information systems

Typically, a system in which more than one database server operates is considered distributed. This is done to reduce the load on the server and to support geographically remote departments. Differences in the complexity of creation, modification, maintenance and integration with other systems make it possible to divide information systems into classes of small, medium and large distributed systems. Small IS are characterized by a short life cycle (LC), orientation toward mass use, low price, the impossibility of modification without the participation of the developers, the use of mainly desktop database management systems (DBMS), and homogeneous hardware and software without security features. Large corporate IS, federal-level systems and others are characterized by a long life cycle, migration of legacy systems, diversity of hardware and software, the scale and complexity of the tasks being solved, the intersection of many subject areas, analytical data processing, and the territorial distribution of components.
The functions of such information systems include, first of all, working with distributed data located on different physical servers and different hardware and software platforms and stored in different internal formats. The system must provide full information about itself and all its resources, be easy to expand, be based on open standards and protocols, and make it possible to integrate its resources with those of other information systems. For users, the system should support different levels of privileges and provide simple interfaces for accessing information.
Data from disparate systems is usually combined into logical groups, to which requests are addressed. An abstract query system assumes that the system operates not with a specific query syntax, but with its logical essence based on abstract attributes.
When building distributed information systems, as a rule, two basic architectures are used: Client/server and Internet/Intranet.
An enterprise IS built on a Client/server architecture provides clients with a wide range of applications and development tools that are focused on maximizing the use of the computing capabilities of client workstations. Server resources are used mainly for storing and exchanging documents, as well as for access to the external environment. This architecture makes it possible to better protect the server side of applications, while allowing applications either to address other server applications directly or to route requests to them. However, frequent client calls to the server reduce network performance. Secure operation on the network also has to be addressed, since applications and data are distributed among various clients. The distributed nature of the system makes it difficult to configure and maintain.

An IS based on Internet/Intranet follows the principle of "open architecture". The IS software is implemented as applets or servlets (programs in Java) or as CGI modules (programs in Perl or C). IS of this architecture include Web solutions implemented using CORBA, Enterprise JavaBeans and ActiveX/DCOM technologies, multi-tier applications based on Java and XML, and the .NET concept with XML, in which the exchange between various servers (data warehouses, business applications, servers for mobile clients and others) is carried out using architecture-neutral XML.
A distributed information base means an unlimited number of databases that are remote from each other and share a number of common characteristics:
- operating according to uniform rules defined centrally for all databases included in the distributed information base;
- data exchange is carried out according to rules also defined centrally.
The organization of a distributed database is necessary for companies engaged in various types of activities if in their daily work there is a need to solve the following problems:
- the need to quickly obtain information from databases of remotely located units (or branches);
- the need to consolidate in a single database information from the databases of legal entities included in the company's structure for subsequent data analysis and obtaining reports from one database, both for the company as a whole and for each legal entity separately;
- the need to introduce a centralized change in the structure and rules of database operation for the operation of all remotely located units (branches) and legal entities (with the impossibility of changing certain rules directly in a remote unit);
- the need to limit and control changes in data in remotely located divisions of the company (branches).

1.3. Tools for working with distributed data

When choosing a distributed IS, you should first pay attention to what operating systems and network protocols it supports. However, no less important is what data distribution methods are implemented in it.
1) Fragmentation and duplication
One of the ways to distribute tables is fragmentation. A table can be split into parts that are placed in different nodes. Another way to distribute data is duplication (replication). You can create duplicates of the entire database or of parts of it and place these duplicates in nodes. Both methods make it possible to store data exactly on the node where it is used most frequently. This minimizes the cost of data transmission over the network and reduces the use of processors and other resources of other nodes. With this database architecture, data transfer over the network is performed quite rarely.
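The two distribution methods can be sketched on an in-memory table. This is only an illustration: the employee rows, the `branch` column and the node names are hypothetical, and a real DBMS would perform these operations itself.

```python
# Hypothetical employee table; the "branch" column names the node
# where each row is used most often.
EMPLOYEES = [
    {"id": 1, "name": "Ivanov", "branch": "north"},
    {"id": 2, "name": "Petrov", "branch": "south"},
    {"id": 3, "name": "Sidorov", "branch": "north"},
]

def fragment_by(rows, column):
    """Fragmentation: split the table so each node stores only its own rows."""
    fragments = {}
    for row in rows:
        fragments.setdefault(row[column], []).append(row)
    return fragments

def replicate(rows, nodes):
    """Replication: every listed node receives its own copy of the rows."""
    return {node: [dict(r) for r in rows] for node in nodes}

fragments = fragment_by(EMPLOYEES, "branch")
replicas = replicate(EMPLOYEES, ["north", "south"])
print({node: [r["id"] for r in rows] for node, rows in fragments.items()})
```

Fragmentation keeps a single copy of each row, so updates stay local; replication speeds up reads everywhere but requires the copies to be kept consistent.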
2) Data dictionaries and directories
Once the data is distributed across different network nodes, it must still be possible to find and use it. Global data dictionaries and directories are used to find data and convert it into the desired format. The dictionary stores information about data, its usage, data access rights, and applications. Data directories define where data is stored and how to retrieve it. Dictionaries and directories can be global or local.
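A rough sketch of how a dictionary and a directory might cooperate is given below. The structures `DICTIONARY` and `DIRECTORY`, the node names and the user roles are all hypothetical; real systems keep this metadata in system catalogs.

```python
# Hypothetical global directory: table fragment -> node that stores it.
DIRECTORY = {
    ("employees", "north"): "node-a",
    ("employees", "south"): "node-b",
}
# Hypothetical data dictionary: table -> column types and access rights.
DICTIONARY = {
    "employees": {"columns": {"id": "int", "name": "text"}, "readers": {"hr", "audit"}},
}

def locate(table, fragment, user):
    """Check access rights in the dictionary, then resolve the storage node."""
    if user not in DICTIONARY[table]["readers"]:
        raise PermissionError(f"{user} may not read {table}")
    return DIRECTORY[(table, fragment)]

print(locate("employees", "south", "hr"))   # node-b
```

The dictionary answers what the data is and who may use it, while the directory answers where it is stored.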
3) Two-phase commit of changes
Data distribution methods are of course very important, but the heart of modern distributed DBMSs is the two-phase commit protocol. This protocol manages the execution of transactions that change the data of multiple nodes. The main idea of two-phase commit is the following: it is unacceptable for a transaction that changes data in several nodes to be executed in some nodes and not in others. A transaction must either be committed on all nodes or be rolled back on all of them.
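The all-or-nothing idea can be shown with a highly simplified sketch. It leaves out everything that makes the real protocol hard (logging, timeouts, coordinator failure), and the `Participant` class and its method names are assumptions of this example, not the API of any DBMS.

```python
class Participant:
    """A node taking part in a distributed transaction (simplified)."""
    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.state = "idle"

    def prepare(self):
        # Phase 1: vote on whether the local changes can be made durable.
        self.state = "prepared" if self.can_commit else "aborted"
        return self.can_commit

    def commit(self):
        self.state = "committed"

    def rollback(self):
        self.state = "aborted"

def two_phase_commit(participants):
    """Return True if every node committed, False if every node rolled back."""
    votes = [p.prepare() for p in participants]     # phase 1: collect all votes
    if all(votes):
        for p in participants:                      # phase 2: global commit
            p.commit()
        return True
    for p in participants:                          # phase 2: global rollback
        p.rollback()
    return False

nodes = [Participant("node-a"), Participant("node-b", can_commit=False)]
print(two_phase_commit(nodes), [p.state for p in nodes])
```

A single negative vote in the first phase is enough to roll the transaction back on every node, so no node is left with changes that the others do not have.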
4) Ensuring integrity
An important characteristic of a distributed IS is the way it maintains referential integrity between the data in the master table and the data in its associated tables. Let's look at an example of referential integrity. Suppose there are three tables in a distributed database:
- a table containing information about the children of employees;
- a table containing information about employee salaries for the year;
- a table containing information about the topics completed by the employee.
All these tables contain a column "Employee's full name". The rules for ensuring referential integrity require that when the values of the "Employee's full name" column are changed in one table, the values of this column in the other tables are adjusted automatically. To ensure referential integrity, two different methods are used: triggers and the declarative integrity constraints of the ANSI standard.
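The declarative approach can be tried out on a single local database using Python's built-in `sqlite3` module. This is only a sketch: it is not distributed, and the master table `employees`, its column names and the sample rows are assumptions added for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")   # SQLite enforces foreign keys only when enabled
conn.execute("CREATE TABLE employees (full_name TEXT PRIMARY KEY)")
conn.execute(
    """CREATE TABLE children (
           child_name TEXT,
           employee_name TEXT REFERENCES employees(full_name) ON UPDATE CASCADE
       )"""
)
conn.execute("INSERT INTO employees VALUES ('A. Ivanov')")
conn.execute("INSERT INTO children VALUES ('Masha', 'A. Ivanov')")

# Changing the name in the master table propagates to the dependent table.
conn.execute("UPDATE employees SET full_name = 'A. Ivanova' WHERE full_name = 'A. Ivanov'")
row = conn.execute("SELECT employee_name FROM children").fetchone()
print(row[0])   # A. Ivanova
```

With a declarative constraint the DBMS performs the adjustment itself; with triggers the same effect would be programmed explicitly for each table.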

2. DISTRIBUTED DATABASES
2.1. Basic principles


Typically, a system in which more than one database server operates is considered distributed. Multiple servers are used to reduce the load on any one server and to support geographically remote departments. Differences in the complexity of creation, modification, maintenance, and integration with other systems make it possible to divide information systems into small, medium, and large distributed systems. Small information systems are characterized by a short life cycle, an orientation towards mass use, a low price, the impossibility of modification without the participation of developers, reliance mainly on desktop database management systems (DBMS), and homogeneous hardware and software without security features. Large systems, such as corporate and federal-level information systems, are characterized by a long life cycle, the migration of legacy systems, diverse hardware and software, the scale and complexity of the tasks being solved, the intersection of many subject areas, analytical data processing, and territorial distribution of components.

A distributed database (RDB) is a set of logically interconnected databases distributed over a computer network.

The RDB consists of a set of nodes connected by a communication network in which:
a) each node is a full-fledged DBMS in itself;
b) nodes interact with each other in such a way that a user of any of them can access any data on the network as if it were on the user's own node.

Each node is itself a database system. Any user can perform operations on data on the local node in the same way as if this node were not part of the distributed system at all. A distributed database system can be thought of as a partnership between separate local DBMSs on separate local nodes.

Fundamental principle for creating distributed databases ("Rule 0"): to the user, a distributed system should look the same as a non-distributed system.

The fundamental principle entails a number of additional rules, or goals. There are twelve such goals:

1. Local independence. Nodes in a distributed system must be independent, or autonomous. Local independence means that all operations on a node are controlled by that node.
2. No reliance on a central node. Local independence implies that all nodes in a distributed system should be treated as equals. Therefore, there should be no calls to a "central" or "master" node in order to obtain some centralized service.
3. Continuous operation. Distributed systems should provide a higher degree of reliability and availability.
4. Location independence. Users should not need to know where exactly the data is physically stored and should be able to act as if all the data were stored on their own local node.
5. Fragmentation independence. A system supports fragmentation if a given relation variable can be divided into parts, or fragments, when organizing its physical storage. In this case, data can be stored in the place where it is most often used, which localizes most operations and reduces network traffic.
6. Replication independence. A system supports data replication if a given stored relation variable (or, in general, a given fragment of a given stored relation variable) can be represented by several separate copies, or replicas, that are stored on several separate nodes.
7. Distributed query processing. A request may need to contact multiple nodes. In such a system, there may be many possible ways to move data between nodes in order to satisfy the request in question.
8. Distributed transaction management. There are two main aspects of transaction management: recovery management and concurrency management. With regard to recovery management, to ensure the atomicity of a transaction in a distributed environment, the system must ensure that the entire set of agents related to a given transaction (an agent is a process that runs for a given transaction on a separate node) has either committed its results or performed a rollback. As for concurrency control, in most distributed systems it is based on a locking mechanism, just as in non-distributed systems.
9. Hardware independence. It is desirable to be able to run the same DBMS on different hardware platforms and, moreover, to ensure that different machines participate in the operation of a distributed system as equal partners.
10. Operating system independence. The DBMS should be able to operate under various operating systems.
11. Network independence. The system should be able to support many fundamentally different nodes, differing in hardware and operating systems, as well as a number of different types of communication networks.
12. DBMS independence. The DBMS instances on different nodes must all support the same interface, but they need not be copies of the same version of the DBMS.
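The distributed query processing goal in the list above mentions that there may be many ways to move data between nodes. A minimal sketch of how a system might compare two such ways for a join of relations stored on two nodes is shown below. The cost model (bytes shipped over the network only) and the function name are assumptions for illustration; real optimizers use far richer statistics.

```python
def cheapest_join_strategy(size_at_a, size_at_b, result_size, requester):
    """Pick the plan that moves the fewest bytes over the network.

    size_at_a / size_at_b: bytes of the relations stored on nodes "a" and "b".
    result_size: estimated bytes of the join result.
    requester: node that issued the request ("a" or "b").
    """
    plans = {
        # Ship B's relation to A and join there; ship the result if A is not the requester.
        "join_at_a": size_at_b + (0 if requester == "a" else result_size),
        # Ship A's relation to B and join there; ship the result if B is not the requester.
        "join_at_b": size_at_a + (0 if requester == "b" else result_size),
    }
    best = min(plans, key=plans.get)
    return best, plans[best]
```

For instance, with a large relation on node "a", a small one on node "b", and a small result, it is cheaper to ship the small relation to "a" and send the result back, even when "b" issued the request.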



12. GIS AS A DISTRIBUTED INFORMATION SYSTEM

12.1. General information

Today, in most geographic information systems (GIS), data for layers and tables comes from different organizations. Each organization develops a more or less significant part, but not all, of the information content of its GIS. Typically, at least some layers of data come from external sources. The need for data is an incentive for users to obtain new data in the most efficient and fastest ways, including purchasing parts of databases for their GIS from other GIS users. Thus, GIS data is managed by multiple users.

12.2. Possibilities for interaction

The distributed nature of GIS means there is ample opportunity for interaction between many GIS organizations and systems. Collaboration between users is very important to GIS.

GIS users have long relied on mutually beneficial data exchange and sharing in their work. A tangible reflection of this fundamental need is the ongoing effort to create GIS standards. Commitment to industry standards and to general principles of building a GIS is critical for the successful development and widespread adoption of this technology. A GIS must support the most important standards and be able to adapt as new standards become available.

12.3. GIS networks

Many geographic datasets can be compiled and managed as a common information resource and shared by the user community. In addition, GIS users have their own vision of how popular data sets can be shared via the Web.

Key Web sites, called GIS catalog portals, allow users both to post their own information and to search for geographic information available for use. As a result, GIS systems are increasingly connected to the World Wide Web and gain new opportunities for exchanging and using information.

This vision has taken hold over the past decade and is reflected in concepts such as the National Spatial Data Infrastructure (NSDI) and the Global Spatial Data Infrastructure (GSDI). These concepts are constantly evolving and are gradually being implemented, not only at the national and global levels but also at the district and municipal levels. In generalized form, they are covered by the concept of Spatial Data Infrastructure (SDI).

The GIS network is essentially one of the methods for introducing and promoting the principles of SDI. It brings together many user sites and facilitates the publication, retrieval, and sharing of geographic information through the World Wide Web.

Geographic knowledge is inherently distributed and loosely integrated. All the necessary information is rarely contained in a single database instance with its own data schema. GIS users interact with each other to obtain the pieces of GIS data they are missing. Through GIS networks, it is easier for users to establish contacts and exchange accumulated geographic knowledge.

A GIS network consists of three main building blocks:

  • Metadata catalog portals where users can search and find GIS information based on their needs
  • GIS nodes where users compile and publish their sets of GIS information
  • GIS users who search, discover, access, and use published data and services

12.4. GIS catalog portals

An important component of a GIS network is a GIS catalog portal with a systematic registry of the various storage locations for data and information sets. Some GIS users act as data stewards, compiling and publishing their datasets for sharing across organizations. They register their information sets in the portal catalog. By searching this catalog, other users can find and access the information sets they need.

A GIS catalog portal is a Web site where GIS users can search for and find the GIS information they need. The capabilities provided depend on the range of GIS network data services, mapping services, and metadata services offered. From time to time, the GIS catalog portal site may survey the catalogs of its associated participating sites in order to publish and update one central GIS catalog. Thus, the GIS catalog may contain links to data sources available both on this site and on other sites. It is assumed that a series of such catalog nodes will be created, and that on their basis a common network will be formed: the Spatial Data Infrastructure.

GIS data and services are documented as catalog records in the GIS portal catalog, which can be used to search for candidates for use in different GIS applications.
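The register-then-search workflow described above can be modeled in a few lines. The `CatalogRecord` fields, the `CatalogPortal` class, and the example URLs are hypothetical and do not correspond to any real portal's interface; they only illustrate how metadata records point to data held on separate GIS nodes.

```python
from dataclasses import dataclass, field


@dataclass
class CatalogRecord:
    """Metadata describing one published dataset or service (hypothetical fields)."""
    title: str
    node_url: str
    keywords: set = field(default_factory=set)


class CatalogPortal:
    def __init__(self):
        self.records = []

    def register(self, record):
        """A GIS node publishes metadata about a dataset it hosts."""
        self.records.append(record)

    def search(self, keyword):
        """Return records whose keywords include the search term (case-insensitive)."""
        term = keyword.lower()
        return [r for r in self.records if term in {k.lower() for k in r.keywords}]


portal = CatalogPortal()
portal.register(CatalogRecord("Road network", "https://example.org/node1", {"roads", "transport"}))
portal.register(CatalogRecord("River basins", "https://example.org/node2", {"hydrology"}))
found = portal.search("Roads")
```

The portal stores only metadata; the record's `node_url` tells the user which GIS node actually holds the data.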

One example of a GIS catalog portal is the US government portal (Geospatial One-Stop, see www.geodata.gov). This portal will enable all levels of government and the general public to access geographic information more easily, quickly and cost-effectively.
