Russian database postgres pro. Parallel computing infrastructure in PostgreSQL. Comparison of PostgreSQL versions

  • Scripting languages- PL/Lua, PL/LOLCODE, PL/Perl, plPHP, PL/Python, PL/Ruby, PL/sh, PL/Tcl and PL/Scheme;
  • Classic languages ​​- C, C++, Java (via the PL/Java module);
  • Statistical language R (via the PL/R module).
  • PostgreSQL allows the use of functions that return a set of records, which can then be used in the same way as the result of a regular query.

    Functions can be executed both with the rights of their creator and with the rights of the current user. Functions are sometimes identified with stored procedures, but there is a difference between the two concepts.

    Triggers

    Triggers are defined as functions initiated by DML operations. For example, an INSERT operation can fire a trigger that checks the added record to see if it meets certain conditions. When writing functions for triggers, various programming languages ​​can be used.

    Triggers are associated with tables. Multiple triggers are executed in alphabetical order.

    Rules and views

    The rules engine is a mechanism for creating custom handlers not only for DML operations, but also for sampling operations. The main difference from the trigger mechanism is that the rules are triggered at the request parsing stage, before choosing the optimal execution plan and the execution process itself. Rules allow you to override the behavior of the system when performing an SQL operation on a table. A good example is the implementation of the views mechanism: when a view is created, a rule is created that specifies that instead of performing a select operation on the view, the system must perform a select operation on the base table/tables, taking into account the selection conditions underlying the view definition. To create views that support update operations, rules for row insertion, modification, and deletion operations must be defined by the user.

    Indexes

    PostgreSQL has support for indexes following types: B-tree, hash, R-tree, GiST, GIN. New types of indexes can be created if necessary, although this is far from a trivial process. Indexes in PostgreSQL have the following properties:

    • it is possible to view the index not only in direct, but also in reverse order - creating a separate index for the operation of the ORDER BY ... DESC construction is not necessary;
    • it is possible to create an index on several table columns, including columns of various data types;
    • indexes can be functional, that is, built not on the basis of a set of values ​​of a certain column/columns, but on the basis of a set of values ​​of a function of a set of values;
    • indexes can be partial, that is, built only on part of the table (on some of its projection); in some cases this helps create much more compact indexes or achieve performance improvements by using different types indexes for different (for example, in terms of update frequency) parts of the table;
    • The query planner can use multiple indexes simultaneously to run complex queries.

    Multi-version (MVCC)

    PostgreSQL supports simultaneous database modification by multiple users using the Multiversion Concurrency Control (MVCC) mechanism. This ensures ACID compliance and virtually eliminates the need for read locks.

    Full text search

    PostgreSQL has a built-in system full text search, which allows you to search documents within the database and sort them in a given order. The main advantages of using built-in full-text search are: close integration with the DBMS (transactionality, simultaneous access, failure recovery), scalability, ample opportunities settings (dictionaries, parsers, etc.).

    Geographic information systems

    PostGIS is an extension of the PostgreSQL DBMS designed for storing geographic data in a database. PostGIS includes support for R-Tree/GiST spatial indexes and geodata processing functions.

    2019: Compatible with TerraLink xDE

    2018

    Inclusion of Postgres Professional co-founder Alexander Korotkov in the list of PostgreSQL DBMS committers

    In June 2018, the list of committers (developers who contribute to the development of the code) of the PostgreSQL DBMS was replenished with a third Russian. The list of the main committers of the PostgreSQL kernel, co-founder and development manager of the Russian company Postgres Professional.

    2017

    Documentation for version 10 is localized for Russia

    Main innovations:

    • Logical replication: Parts of this mechanism have been added to PostgreSQL for quite some time, and in this version logical replication has become fully available to users. With its help, you can selectively replicate individual tables to another server, which can then execute both read and write queries. Servers participating in replication can run under different versions PostgreSQL, which allows you to update the cluster from minimum time just me.
    • Declarative partitioning relieves the administrator of the need to manually define the table hierarchy, create triggers and integrity constraints.
    • Parallel Query Execution It became possible to scan bitmaps and indexes, merge joins and subqueries in addition to the capabilities that appeared in the previous version.
    • Synchronous quorum-aware replication allows you to record changes if they have been confirmed by the required number of random replicas.
    • SCRAM authentication is a more crypto-resistant version of the previously used MD5 authentication.

    In total, according to the developers, version 10 includes more than 100 changes and improvements, some of which were made by Postgres Professional.

    Ethereum integration

    September 14, 2017 Russian company Postgres Professional announced the creation of a prototype of the Postthereum extension for integrating a full-featured PostgreSQL DBMS with a blockchain platform designed for registering transactions with any type of assets based on a “smart contracts” system. According to the company's plan, large Russian banks, corporations and government agencies working with the PostgreSQL DBMS, with the help of this development will be able to combine databases with blockchain applications based on Ethereum. Read more.

    2016

    PostgreSQL 9.6

    On September 29, 2016, the developer community presented the stable branch of the PostgreSQL 9.6 DBMS. Updates for it 9.6 will be released over five years, until September 2021.

    Major Additions

    Comparison of Tibero and PostgreSQL

    Corrective release of all branches

    On February 11, 2016, the PostgreSQL developer community announced the release of corrective updates for all supported PostgreSQL branches: 9.5.1, 9.4.6, 9.3.11, 9.2.15 and 9.1.20, which eliminated two vulnerabilities, introduced a portion of bug fixes, added support for Python 3.5 in PL/Python and the ability to share Python2 and Python3 in one database.

    The 9.0.x branch has been discontinued. Release of updates for the branch:

    • 9.1 extended until September 2016.
    • 9.2 extended until September 2017,
    • 9.3 extended until September 2018,
    • 9.4 extended until December 2019,
    • 9.5 extended until January 2021.

    The first of the vulnerabilities (CVE-2016-0773) appears in the processing engine regular expressions and can cause the backend to crash when parsing regular expressions with characters outside the Unicode range (the problem affects systems that use user input to generate the regular expression).

    The second vulnerability (CVE-2016-0766) is present in the PL/Java engine and allows you to increase your privileges when working with the database.

    PostgreSQL 9.5

    On January 7, 2016, it became known about the release of a stable branch of the PostgreSQL 9.5 DBMS. Release of updates for branch 9.5 will be supported until January 2021.

    Changes

    • "UPSERT" (add-or-modify) functionality, implemented through a new "INSERT ... ON CONFLICT DO NOTHING/UPDATE" expression, allows you to handle the situation when data cannot be added via "INSERT", for example, due to a violation of uniqueness or invalidity conditions values ​​of one of the fields. Instead of throwing an error, you can now ignore the execution of the statement or change the data associated with the key field (i.e., if the record already exists, perform UPDATE instead of INSERT);
    • Access restriction at the row level (Row-Level Security, RLS). User access to data in a table can now be restricted at the level of individual rows; for example, you can prohibit a certain category of users from viewing rows that store data added by another user. To activate RLS, you should use the "ALTER TABLE tablename ENABLE ROW LEVEL SECURITY" directive, after which you should set the access rules using the "CREATE POLICY" expression;
    • BRIN indexes ("block range indexes") allow ultra-compact indexing of very large tables, without the use of traditional B-trees. The essence of BRIN indexes comes down to dividing the overall index into blocks, each of which contains index data only for a certain range of values. In the test, a similar method turned out to be approximately two times slower than b-trees when performing data retrieval operations, but 3-4 times faster when creating and updating an index, and also took up significantly less disk space (64 KB versus 28 MB);
    • New functions and operators for the JSONB data type. To change values ​​in a JSONB document, you can now avoid retrieving and redefining the entire document with the introduction of the jsonb_set() function. Also added are the functions json_strip_nulls (removing attributes containing NULL values) and jsonb_pretty (output in formatted JSON). Added "||" operator to concatenate two JSONB values;
    • The pg_rewind tool allows you to significantly simplify the process of restoring fault-tolerant configurations after switching to a backup server. After the main server returns to service, the task arises of synchronizing its state with the backup server, which has continued to operate, and which has managed to accumulate its share of changes. The pg_rewind utility attempts to restore the state of the primary server from the WAL transaction log, going through them starting from the moment shortly before the failure, identifying the changed data and transferring only the changed blocks, which allows you to do without restoring a full copy from a working backup server.
    • Significantly optimized in-memory sorting and hashing speeds. Thanks to the use of a new method for sorting string values ​​and numbers, it was possible to increase the speed of creating indexes up to 20 times, and the execution time of queries requiring sorting of large volumes of data was reduced by 2-12 times;
    • Added support for the TABLESAMPLE expression, which allows you to create a sample over an incomplete amount of data from large tables, without performing resource-intensive sorting operations on the entire table. For example, the query "SELECT * FROM test TABLESAMPLE SYSTEM(10)" will produce output covering only 10% of the test table. Several algorithms are available for eliminating values ​​during the undersampling process;
    • Improved scaling on systems with big amount processor cores and RAM. For example, on a system with 24 CPU cores and 496 GB of RAM in the EnterpriseDB test under a load of 64 simultaneous connections, PostgreSQL 9.5 showed a 96% performance increase compared to PostgreSQL 9.4;
    • Automated management of transaction log size. Ability to exclude tables from being reflected in the transaction log (ALTER TABLE ... SET LOGGED / UNLOGGED);
    • Analytical capabilities "GROUPING SETS", "CUBE" and "ROLLUP", allowing you to generate output grouped by a set of fields and calculate the number of combinations of different categories;
    • Improved replication and fault tolerance enhancements. A mechanism has been added for tracking the state of replication execution, including methods for determining the cause of individual changes during logical replication;
    • Multiple improvements have been made to the Foreign Data Wrappers engine, including the "IMPORT FOREIGN SCHEMA" statement, which allows you to automate the import of all related foreign tables for existing tables with the selected server label. In addition, it is possible to inherit external tables from local tables and vice versa, for example, "CREATE local_customers () inherits (remote.customers);"
    • The "-j" option has been added to the vacuumdb utility, allowing you to run VACUUM in multiple concurrent threads.

    2015

    Parallel Computing Infrastructure in PostgreSQL

    On May 4, 2015, it became known that changes had been made to the source tree of the PostgreSQL DBMS with the implementation of an infrastructure for parallel computing.

    It provides:

    • Convenient procedures for coordinating the startup and shutdown of concurrent work processes;
    • Synchronization of various internal states (GUCs, combined CID mapping, transaction snapshots) between the group leader parallel works and directly parallelized workflows;
    • Call restriction various operations, which can lead to incorrect changes being made under conditions of active parallelization;
    • Delivery of notifications to the client via ErrorResponse, NoticeResponse and NotifyResponse messages from handlers running in parallel.

    Postgres-XL on EcoServer - an alternative for data centers

    On August 13, 2015, it became known that testing of the Postgres-XL database management system on servers of the EcoServer line was completed.

    Testing was carried out to monitor new technologies and implement the technological development plan for 2015.

    Andrey Chernogorov, CEO“Indigo IT” noted: “Today the most popular DBMS in the IT market are MS SQL and Oracle DataBase. At the same time, according to a number key capabilities they are in no way inferior to them, and in some places even superior, to the open source DBMS PostgreSQL, which opens up broad prospects for its use as part of the import substitution program.”

    For testing, the company's specialists prepared test data sets identical for all DBMSs. The test object was a 1 TB database consisting of 1 million business objects. The duration of testing for each DBMS is 10 hours.

    It was attended by latest versions DBMS most in demand by Indigo IT customers:

    • open DBMS PostgreSQL 9.4.

    A total of 5 sets of tests were carried out:

    • creation of complex structured documents,
    • updating complexly structured documents,
    • document search,
    • writing a file to the database,
    • getting a file from the database.

    Test results, 2015

    The time spent in each of the test sets indicated in the table means the average value over all sets (ms). Testing was carried out on servers with Intel Xeon E5 v3 processors with 128 GB of RAM.

    As a result load testing on two of the five test sets (creating complexly structured documents, updating complexly structured documents), PostgreSQL 9.4 showed results almost three times better than its competitors. In the remaining tests (searching for documents, recording and retrieving files from the database), test participants showed almost identical results.

    This version of the open source PostgreSQL DBMS supports the widely used JSON data exchange format and is aimed at the growing market for non-relational data stores NoSQL and especially the popular MongoDB DBMS.

    The first beta version of PostgreSQL 9.4 introduces a number of new features aimed at the rapidly expanding market of web applications, many of which require fast storage and retrieval of large volumes of user data.

    Version PostgreSQL 9.4 supports the JSON (JavaScript Simple Object Notation) format, which quickly gained popularity when organizing data exchange between different systems, including using the REST (Representational State Transfer) protocol. The success of the MongoDB document DBMS is largely due to the growing popularity of JSON.

    PostgreSQL's structured format for storing data according to JSON specifications (JSONB) eliminates the need to restructure the document before entering it into the database. As a result, PostgreSQL ingests documents as quickly as MongoDB, while continuing to meet the ACID (atomicity, consistency, isolation, durability) requirements for storing information in databases. Additionally, PostgreSQL supports full set index services, functions and operators to efficiently manipulate JSON data.

    Previous versions of PostgreSQL also supported JSON, but JSON documents were stored in text format, as a result of which the operations of recording and retrieving them took much longer.

    PostgreSQL has received a number of new features:

    • A new API for decoding data from a replication stream paves the way for independent software developers to build faster replication systems.
    • A new feature in Materialized Views called “simultaneous refresh” allows summary reports to be updated on the fly.
    • The Alter System Set feature will help administrators modify the PostgreSQL configuration file directly from command line SQL.

    A number of functions and capabilities have been added, including Dynamic Background Workers, array manipulation and table functions, overall productivity has been increased.

    PostgreSQL 9.3

    PostgreSQL 9.3 implements a number of mechanisms that allow you to exchange information with other databases and data warehouses. Foreign Data Wrapper modules, which appeared in version 9.1 and previously only allowed data to be read from other systems, now provide the ability to write. It supports working with both relational tables and semi-structured information from NoSQL systems. A driver has also been created for the DBMS, which allows you to connect two different copies of PostgreSQL itself with each other and ensures accelerated execution of transactions between them.

    Other features include enhanced JSON support and the ability to create arbitrary background server modules with unlimited access to PostgreSQL data. An example is the Mongres module, which automatically translates MongoDB queries into PostgreSQL format.

    Implemented automatic update representations and added a utility that allows you to execute in parallel mode backup large bases. Measures have been taken to improve the reliability of the DBMS. The Fast Failover function allows you to switch work from the master database to a copy in less than a second. Now you can check checksums pages to help diagnose hard drive failures.

    PostgreSQL 9.2

    PostgreSQL 9.0

    The developers of the open database management system PostgreSQL released in September 2010 the first release candidate of the Postrgesql 9.0 system, which implements all the functions prepared for release in the ninth version of this popular DBMS. Freely available on this moment A binary version of the preliminary build of Postgresql 9.0 is available and everyone can test the new capabilities of this development before transferring production servers that work with real information to it.

    Also in the ninth version, it became possible to replicate information from binary logs, corresponding to the Hot Stanby Databases mechanism in Oracle Database. The developers also paid attention to the increasingly popular cloud or SaaS systems. Now the DBMS is optimized for working in a virtual machine environment, supports a mechanism for fast data cloning, as well as the ability to replicate information from a single master server to a large number (more than a hundred) slave servers. Also new release fully supports memory addressing capabilities in 64-bit variants of Windows.

    • Translation

    Today let's talk about the advantages of Postgres over other open source systems. We will definitely cover this topic in more detail at PG Day"16 Russia, which is only two months away.

    You may be asking yourself, “Why PostgreSQL?” After all, there are other open source relational database options (for the purposes of this article we looked at MySQL, MariaDB and Firebird), so what can Postgres offer that they don’t? PostgreSQL's tagline states that it is "The World's Most Advanced Open Source Database." We will give several reasons why Postgres makes such statements.

    In the first part of this series, we will talk about data storage - model, structure, types and size limitations. And in the second part we will focus more on sampling and data manipulation.

    Data model

    PostgreSQL is not just a relational, but an object-relational DBMS. This gives it some advantages over other open source SQL databases such as MySQL, MariaDB and Firebird.

    A fundamental characteristic of an object-relational database is its support for user objects and their behavior, including data types, functions, operations, domains, and indexes. This makes Postgres incredibly flexible and reliable. Among other things, it can create, store and retrieve complex data structures. In some of the examples below, you will see nested and compound constructs that are not supported by standard RDBMSs.

    Structures and data types

    There is an extensive list of data types that Postgres supports. In addition to numeric, float, text, boolean and other expected data types (and many variations thereof), PostgreSQL boasts support for uuid, monetary, enumerated, geometric, binary, network addresses, bit strings, text search, xml, json, arrays, composite types and ranges, as well as some internal types for identifying objects and logging locations. To be fair, MySQL, MariaDB and Firebird also have some of these data types, but only Postgres supports them all.

    Let's take a closer look at some of them:

    Network addresses
    PostgreSQL provides storage of different types of network addresses. The CIDR (Classless Internet Domain Routing) data type follows the convention for IPv4 and IPv6 network addresses. Here are some examples:
    • 192.168.100.128/25
    • 10.1.2.3/32
    • 2001:4f8:3:ba:2e0:81ff:fe22:d1f1/128
    • ::ffff:1.2.3.0/128
    Also available for storing network addresses is the INET data type, used for IPv4 and IPv6 hosts, where subnets are optional. The MACADDR data type can be used to store MAC addresses for hardware identification, such as 08-00-2b-01-02-03.

    MySQL and MariaDB also have INET functions for converting network addresses, but they do not provide data types for storing network addresses internally. Firebird also does not have types for storing network addresses.

    Multidimensional arrays
    Since Postgres is an object-relational database, arrays of values ​​can be stored for most existing types data. This can be done by adding square brackets to the data type specification for the column or by using the ARRAY expression. The size of the array can be specified, but is not required. Let's look at a holiday picnic menu to demonstrate the use of arrays:

    Create a table whose values ​​are arrays CREATE TABLE holiday_picnic (holiday varchar(50) -- string value sandwich text, -- side text array, -- multidimensional array dessert text ARRAY, -- array of beverage text ARRAY -- array of 4 elements); -- insert array values ​​into the table INSERT INTO holiday_picnic VALUES ("Labor Day", "("roast beef","veggie","turkey")", "( ("potato salad","green salad","macaroni salad "), ("chips","crackers") )", "("fruit cocktail","berry pie","ice cream")", "("soda","juice","beer","water ")");
    MySQL, MariaDB, and Firebird can't do this. To store such arrays of values ​​in traditional relational databases data, you will have to use a workaround and create separate table with rows for each of the array values.

    Geometric data
    Location data is quickly becoming a core requirement for many applications. PostgreSQL has long supported many geometric data types such as points, lines, circles, and polygons. One of these types is PATH, which consists of many sequential points and can be open (start and end points are not connected) or closed (start and end points are connected). Let's take a hiking trail as an example. IN in this case The hiking trail is a loop, so the start and end points are connected, which means my path is closed. Round brackets around a set of coordinates indicate a closed path, and square ones indicate an open path.

    Create a table to store trails CREATE TABLE trails (trail_name varchar(250), trail_path path); -- insert the trail into the table, -- for which the route is determined by coordinates in latitude-longitude format INSERT INTO trails VALUES ("Dool Trail - Creeping Forest Trail Loop", ((37.172,-122.22261666667), (37.171616666667,-122.22385), ( 37.1735,-122.2236), (37.175416666667,-122.223), (37.1758,-122.22378333333), (37.179466666667,-122.22866666667), (37.18395,-122.2267 5), (37.180783333333,-122.22466666667), (37.176116666667,-122.2222), (37.1753 ,-122.22293333333), (37.173116666667,-122.22281666667)));
    The PostGIS extension for PostgreSQL extends existing geometric data properties with auxiliary spatial types, functions, operators, and indexes. It provides location support and supports both raster and vector data. It also provides compatibility with a variety of third-party geospatial tools (copyrighted and open source) for displaying, rendering, and working with data.

    Note that MySQL 5.7.8 and MariaDB since version 5.3.3 have added data type extensions to support the OpenGIS geographic information standard. This version of MySQL and subsequent versions of MariaDB offer data type storage similar to native Postgres geodata. However, in MySQL and MariaDB, data values ​​must first be converted into geometric format using simple commands before being inserted into the table. Firebird does not currently support geometric data types.

    JSON support
    JSON support in PostgreSQL allows you to move to storing schema-less data in an SQL database. This can be useful when the data structure requires some flexibility: for example, if the structure is still changing during development or it is not known what fields the data object will contain.

    The JSON data type provides JSON validation, which allows you to use specialized JSON operators and functions built into Postgres to perform queries and manipulate data. Also available is the JSONB type - a binary variation of the JSON format in which spaces are removed, the sorting of objects is not preserved, instead they are stored in the most optimal way, and only last value for duplicate keys. JSONB is usually the preferred format because it requires less object space, can be indexed, and is faster to process because it doesn't require repeated parsing.

    MySQL 5.7.8 and MariaDB 10.0.1 added support for native JSON objects. But while there are many functions and operators for JSON that are now available in these databases, they are not indexed in the same way as JSONB in ​​PostgreSQL. Firebird has not yet joined the trend and only supports JSON objects as text.

    Creating a new type
    If you happen to find Postgres' extensive list of data types not enough, you can use the CREATE TYPE command to create new data types such as composite, enumerated, range, and basic. Let's look at an example of creating and sending requests of a new composite type:

    Create a new composite type "wine" CREATE TYPE wine AS (wine_vineyard varchar(50), wine_type varchar(50), wine_year int); -- create a table that uses the composite type "wine" CREATE TABLE pairings (menu_entree varchar(50), wine_pairing wine); -- insert data into the table using the expression ROW INSERT INTO pairings VALUES ("Lobster Tail",ROW("Stag""s Leap","Chardonnay", 2012)), ("Elk Medallions",ROW("Rombauer", "Cabernet Sauvignon", 2012)); /* select from a table using the column name (use parentheses separated by a dot from the field name in a composite type) */ SELECT (wine_pairing).wine_vineyard, (wine_pairing).wine_type FROM pairings WHERE menu_entree = "Elk Medallions";
    Because they are not object-relational, MySQL, MariaDB and Firebird do not provide such powerful functionality.

    Data Dimensions

    PostgreSQL can handle a lot of data. Current published restrictions are listed below:

    In Compose [approx. lane: the organization in which the author of the original article works] we automatically scale your installation so that you do not have to worry about the growth of the amount of data. But as any database administrator knows, you should be wary of too many or too many options. We recommend using common sense when creating tables and adding indexes.

    In comparison, MySQL and MariaDB are notorious for their row size limit of 65,535 bytes. Firebird also offers only 64Kb as the maximum line size. Typically, the amount of data is limited by the maximum file size of the operating system. Because PostgreSQL can store tabular data in multiple files smaller size, he can bypass this limitation. But it's worth noting that too many files can negatively impact performance. MySQL and MariaDB support more columns per table (up to 4,096 depending on the data type) and larger individual table sizes than PostgreSQL, but the need to exceed the existing Postgres limits arises only in extremely rare cases.

    Data integrity

    Postgres strives to be ANSI-SQL:2008 compliant, ACID (Atomicity, Consistency, Isolation and Durability) compliant, and is known for its referential and transactional integrity. Primary keys, constraint and cascading foreign keys, unique constraints, NOT NULL constraints, check constraints, and other data integrity features ensure that only valid data is retained.

    MySQL and MariaDB work harder to match SQL standard with InnoDB/XtraDB table engines. They now offer a STRICT option using SQL modes that sets up validity checks on the data being used. However, depending on which mode you use, invalid and even truncated data without your knowledge may be inserted or created during the update. None of these databases currently support CHECK restrictions. In addition, they have many features regarding referential integrity constraints. foreign keys. In addition to the above, data integrity can be significantly compromised depending on the storage engine chosen. MySQL (and the MariaDB fork) have made no secret of the fact that they trade integrity and standards compliance for speed and efficiency.

    Summing up

    Postgres has many features. Built using the object-relational model, it supports complex structures and a wide range of built-in and user-defined data types. It provides expanded data capacity and is trusted for its careful handling of data integrity. You may not need all of the advanced storage features we've explored in this article, but since your needs can quickly add up, there's a definite benefit to having it all at your fingertips.

    If you feel that PostgreSQL does not suit your needs, or you prefer to “shoot from the hip”, then you may want to take a look at the NoSQL databases we offer at Compose, or consider others SQL databases the data we mentioned. Each of them has its own advantages. Compose firmly believes that it is very important to choose the right database for specific task…sometimes that means selecting multiple databases!

    Want more Postgres?

  • SQL
  • Website development
    • Translation

    Today let's talk about the advantages of Postgres over other open source systems. We will definitely cover this topic in more detail at PG Day"16 Russia, which is only two months away.

    You may be asking yourself, “Why PostgreSQL?” After all, there are other open source relational database options (for the purposes of this article we looked at MySQL, MariaDB and Firebird), so what can Postgres offer that they don’t? PostgreSQL's tagline states that it is "The World's Most Advanced Open Source Database." We will give several reasons why Postgres makes such statements.

    In the first part of this series, we will talk about data storage - model, structure, types and size limitations. And let’s focus more on sampling and data manipulation.

    Data model

    PostgreSQL is not just a relational, but an object-relational DBMS. This gives it some advantages over other open source SQL databases such as MySQL, MariaDB and Firebird.

    A fundamental characteristic of an object-relational database is its support for user objects and their behavior, including data types, functions, operations, domains, and indexes. This makes Postgres incredibly flexible and reliable. Among other things, it can create, store and retrieve complex data structures. In some of the examples below, you will see nested and compound constructs that are not supported by standard RDBMSs.

    Structures and data types

    There is an extensive list of data types that Postgres supports. In addition to numeric, float, text, boolean and other expected data types (and many variations thereof), PostgreSQL boasts support for uuid, monetary, enumeration, geometric, binary, network addresses, bit strings, text search, xml, json , arrays, composite types and ranges, as well as some internal types for identifying objects and logging locations. To be fair, MySQL, MariaDB and Firebird also have some of these data types, but only Postgres supports them all.

    Let's take a closer look at some of them:

    Network addresses
    PostgreSQL provides storage of different types of network addresses. The CIDR (Classless Internet Domain Routing) data type follows the convention for IPv4 and IPv6 network addresses. Here are some examples:
    • 192.168.100.128/25
    • 10.1.2.3/32
    • 2001:4f8:3:ba:2e0:81ff:fe22:d1f1/128
    • ::ffff:1.2.3.0/128
    Also available for storing network addresses is the INET data type, used for IPv4 and IPv6 hosts, where subnets are optional. The MACADDR data type can be used to store MAC addresses for hardware identification, such as 08-00-2b-01-02-03.

    MySQL and MariaDB also have INET functions for converting network addresses, but they do not provide data types for storing network addresses internally. Firebird also does not have types for storing network addresses.

    Multidimensional arrays
    Since Postgres is an object-relational database, arrays of values ​​can be stored for most existing data types. This can be done by adding square brackets to the data type specification for the column or by using the ARRAY expression. The size of the array can be specified, but is not required. Let's look at a holiday picnic menu to demonstrate the use of arrays:

    We create a table whose values ​​are arrays CREATE TABLE holiday_picnic (holiday varchar(50) -- string value sandwich text, -- array side text , -- multidimensional array dessert text ARRAY, -- array beverage text ARRAY -- array of 4- x elements); -- insert array values ​​into the table INSERT INTO holiday_picnic VALUES ("Labor Day", "("roast beef","veggie","turkey")", "( ("potato salad","green salad","macaroni salad "), ("chips","crackers") )", "("fruit cocktail","berry pie","ice cream")", "("soda","juice","beer","water ")");
    MySQL, MariaDB, and Firebird can't do this. To store such arrays of values ​​in traditional relational databases, you would have to use a workaround and create a separate table with rows for each of the array values.

    Geometric data
    Location data is quickly becoming a core requirement for many applications. PostgreSQL has long supported many geometric data types such as points, lines, circles, and polygons. One of these types is PATH, which consists of many sequential points and can be open (start and end points are not connected) or closed (start and end points are connected). Let's take a hiking trail as an example. In this case, the hiking trail is a loop, so the starting and ending points are connected, and therefore my path is closed. Round brackets around a set of coordinates indicate a closed path, while square brackets indicate an open path.

    Create a table to store trails CREATE TABLE trails (trail_name varchar(250), trail_path path); -- insert the trail into the table, -- for which the route is determined by coordinates in latitude-longitude format INSERT INTO trails VALUES ("Dool Trail - Creeping Forest Trail Loop", ((37.172,-122.22261666667), (37.171616666667,-122.22385), ( 37.1735,-122.2236), (37.175416666667,-122.223), (37.1758,-122.22378333333), (37.179466666667,-122.22866666667), (37.18395,-122.2267 5), (37.180783333333,-122.22466666667), (37.176116666667,-122.2222), (37.1753 ,-122.22293333333), (37.173116666667,-122.22281666667)));
    The PostGIS extension for PostgreSQL extends existing geometric data properties with auxiliary spatial types, functions, operators, and indexes. It provides location support and supports both raster and vector data. It also provides compatibility with a variety of third-party geospatial tools (copyrighted and open source) for displaying, rendering, and working with data.

    Note that MySQL 5.7.8 and MariaDB since version 5.3.3 have added data type extensions to support the OpenGIS geographic information standard. This version of MySQL and subsequent versions of MariaDB offer data type storage similar to native Postgres geodata. However, in MySQL and MariaDB, data values ​​must first be converted into geometric format using simple commands before being inserted into the table. Firebird does not currently support geometric data types.

    JSON support
    JSON support in PostgreSQL allows you to move to storing schema-less data in an SQL database. This can be useful when the data structure requires some flexibility: for example, if the structure is still changing during development or it is not known what fields the data object will contain.

    The JSON data type provides JSON validation, which allows you to use specialized JSON operators and functions built into Postgres to perform queries and manipulate data. Also available is the JSONB type, a binary variant of the JSON format that removes spaces, does not preserve the sorting of objects, instead stores them in the most optimal way, and only stores the last value for duplicate keys. JSONB is usually the preferred format because it requires less object space, can be indexed, and is faster to process because it does not require re-parsing.

    MySQL 5.7.8 and MariaDB 10.0.1 added support for native JSON objects. But while there are many functions and operators for JSON that are now available in these databases, they are not indexed in the same way as JSONB in ​​PostgreSQL. Firebird has not yet joined the trend and only supports JSON objects as text.

    Creating a new type
    If you happen to find Postgres' extensive list of data types not enough, you can use the CREATE TYPE command to create new data types such as composite, enumerated, range, and basic. Let's look at an example of creating and sending requests of a new composite type:

    Create a new composite type "wine" CREATE TYPE wine AS (wine_vineyard varchar(50), wine_type varchar(50), wine_year int); -- create a table that uses the composite type "wine" CREATE TABLE pairings (menu_entree varchar(50), wine_pairing wine); -- insert data into the table using the expression ROW INSERT INTO pairings VALUES ("Lobster Tail",ROW("Stag""s Leap","Chardonnay", 2012)), ("Elk Medallions",ROW("Rombauer", "Cabernet Sauvignon", 2012)); /* select from a table using the column name (use parentheses separated by a dot from the field name in a composite type) */ SELECT (wine_pairing).wine_vineyard, (wine_pairing).wine_type FROM pairings WHERE menu_entree = "Elk Medallions";
    Because they are not object-relational, MySQL, MariaDB and Firebird do not provide such powerful functionality.

    Data Dimensions

    PostgreSQL can handle a lot of data. Current published restrictions are listed below:

    In Compose [approx. lane: the organization in which the author of the original article works] we automatically scale your installation so that you do not have to worry about the growth of the amount of data. But as any database administrator knows, you should be wary of too many or too many options. We recommend using common sense when creating tables and adding indexes.

    In comparison, MySQL and MariaDB are notorious for their row size limit of 65,535 bytes. Firebird also offers only 64Kb as the maximum line size. Typically, the amount of data is limited by the maximum file size of the operating system. Because PostgreSQL can store tabular data in many smaller files, it can work around this limitation. But it's worth noting that too many files can negatively impact performance. MySQL and MariaDB support more columns per table (up to 4,096 depending on the data type) and larger individual table sizes than PostgreSQL, but the need to exceed the existing Postgres limits arises only in extremely rare cases.

    Data integrity

    Postgres strives to be ANSI-SQL:2008 compliant, ACID (Atomicity, Consistency, Isolation and Durability) compliant, and is known for its referential and transactional integrity. Primary keys, constraint and cascading foreign keys, unique constraints, NOT NULL constraints, check constraints, and other data integrity features ensure that only valid data is retained.

    MySQL and MariaDB are working more towards conforming to the SQL standard with InnoDB/XtraDB table engines. They now offer a STRICT option using SQL modes that sets up validity checks on the data being used. However, depending on which mode you use, invalid and even truncated data without your knowledge may be inserted or created during the update. None of these databases currently support CHECK restrictions. In addition, they have many features regarding referential integrity constraints on foreign keys. In addition to the above, data integrity can be significantly compromised depending on the storage engine chosen. MySQL (and the MariaDB fork) have made no secret of the fact that they trade integrity and standards compliance for speed and efficiency.

    Summing up

    Postgres has many features. Built using the object-relational model, it supports complex structures and a wide range of built-in and user-defined data types. It provides expanded data capacity and is trusted for its careful handling of data integrity. You may not need all of the advanced storage features we've explored in this article, but since your needs can quickly add up, there's a definite benefit to having it all at your fingertips.

    If you feel like PostgreSQL doesn't suit your needs, or you prefer to “shoot from the hip,” then you might want to take a look at the NoSQL databases we offer at Compose, or consider the other SQL databases we've mentioned. Each of them has its own advantages. Compose firmly believes that it is very important to choose the right database for a given task... sometimes that means choosing multiple databases!

    Want more Postgres?

    PostgreSQL is an open-source, cross-platform object-relational DBMS. This article will show you how to install PostgreSQL in Ubuntu Linux, connect to it and run a couple of simple SQL queries, as well as how to set up a backup.

    To install PostgreSQL 9.2 on Ubuntu 12.10, run the following commands:

    sudo apt-add-repository ppa:pitti/postgresql
    sudo apt-get update
    sudo apt-get install postgresql-9.2

    Let's try to work with the DBMS through the shell:

    sudo -u postgres psql

    Let's create a test database and a test user:

    CREATE DATABASE test_database;
    CREATE USER test_user WITH password "qwerty" ;
    GRANT ALL ON DATABASE test_database TO test_user;

    To exit the shell, enter the command \q .

    Now let's try to work with the created database on behalf of test_user:

    psql -h localhost test_database test_user

    Let's create a new table:

    CREATE SEQUENCE user_ids;
    CREATE TABLE users (
    id INTEGER PRIMARY KEY DEFAULT NEXTVAL ("user_ids" ) ,
    login CHAR(64) ,
    password CHAR(64));

    Please note that, unlike some other DBMSs, PostgreSQL does not have columns with the auto_increment property. Instead, Postgres uses sequences. For now, it is enough to know that using the nextval function we can get unique numbers for a given sequence:

    SELECT NEXTVAL ("user_ids" ) ;

    By setting the default value for the id field of the users table to NEXTVAL("user_ids"), we have achieved the same effect that auto_increment gives. When adding new records to the table, we don't have to specify the id, because a unique id will be generated automatically. Multiple tables can use the same sequence. This way we can guarantee that the values ​​of some fields in these tables do not overlap. In this sense, sequences are more flexible than auto_increment.

    The exact same table can be created using just one command:

    CREATE TABLE users2 (
    id SERIAL PRIMARY KEY ,
    login CHAR(64) ,
    password CHAR(64));

    In this case, the sequence for the id field is created automatically.

    Now using the \d command you can see a list of all available tables, and using \d users you can see a description of the users table. If you don't get the information you're looking for, try \d+ instead of \d . You can get a list of databases with the \l command, and switch to a specific database with the \c dbname command. To display command help, say \? .

    It's important to note that PostgreSQL converts table and column names to lowercase by default. If you don't want this behavior, you can use double quotes:

    CREATE TABLE "anotherTable" ("someValue" VARCHAR (64 ) ) ;

    Another feature of PostgreSQL that may cause difficulties when starting to work with this DBMS is the so-called “schemas”. A schema is something like a namespace for tables, like a directory with tables inside a database.

    Creating a schema:

    CREATE SCHEMA bookings;

    Switch to scheme:

    SET search_path TO bookings;

    View list existing schemes you can use the \dn command. The default schema is named public. In principle, you can successfully use PostgreSQL without knowing about the existence of schemas. But when working with legacy code, and in some edge cases, knowing about schemas can be very useful.

    Otherwise, working with PostgreSQL is not much different from working with any other relational DBMS:

    INSERT INTO users (login, password)
    VALUES ("afiskon" , "123456" ) ;
    SELECT * FROM users;

    If you now try to connect to Postgres from another machine, you will fail:

    psql -h 192.168.0.1 test_database test_user

    Psql: could not connect to server: Connection refused
    Is the server running on host "192.168.0.1" and accepting
    TCP/IP connections on port 5432?

    To fix this, add the line:

    listen_addresses = "localhost,192.168.0.1"

    ...to the /etc/postgresql/9.2/main/postgresql.conf file as well.

    Postgres Pro is a Russian DBMS developed by Postgres Professional based on the freely distributed PostgreSQL DBMS. Postgres Pro is included in the Russian software register (see https://reestr.minsvyaz.ru/reestr/65273/)

    This way, customers can access functionality and performance benefits that benefit them without having to wait for a new PostgreSQL release (which can take up to a year). As authors, we provide support for all of our developments. As representatives of the international community of PostgreSQL developers, we also provide commercial support for the freely distributed PostgreSQL DBMS.

    Comparison of PostgreSQL versions

    DBMSPostgreSQL
    Enterprise
    PostgreSQL
    Standard
    PostgreSQL
    A commercial DBMS developed by Postgres Professional for mission-critical applications and high loads.Russian open source DBMS developed by Postgres Professional based on the freely distributed PostgreSQL DBMSA freely distributed DBMS developed by the international community.
    Unified Register of Russian Software
    64-bit transaction counter
    Incremental backup at block level
    Certificate FSTEC SVT 5, NDV 4
    Autonomous Transactions
    Partitioning tables
    Data compression
    Multimaster
    1C support
    Portable tables
    Scheduler hints


    PostgreSQL Versions

    The PostgreSQL version number is derived from the PostgreSQL version number with the addition of one digit indicating the current release number. When a new minor version of PostgreSQL is released (this usually happens when patches related to security and correction of serious errors appear), the PostgreSQL numbering is reset to one. For example, when PostgreSQL 9.5.1 is released, PostgresPro 9.5.1.1 is released, then before PostgreSQL 9.5.2 is released, Postgres Pro 9.5.1.2, 9.5.1.3, etc. may be released. When PostgreSQL 9.5.2 is released, Postgres Pro will be updated to version 9.5.2.1, etc.

    Simultaneously with the release source code Postgres Pro we publish our assemblies as packages under various platforms. These are the following OS and their versions:

    1. Linux
      • CentOS 6/7
      • Debian 7/8
      • Ubuntu 12.04/14.04/16.04/16.10,
      • Oracle Linux,
      • Rosa Enterprise Linux server,
      • ROSA SX Cobalt Server,
      • ROSA DX Cobalt Server,
      • ROSA Marathon LTS 2012,
      • Alt Linux Centaur 8,
      • Alt Linux SPT 6,
      • Alt Linux SPT 7,
      • SUSE Linux Enterprise Server,
    2. Microsoft ® Windows ® 2012 or 2016.

    Postgres Pro 9.5.*.* databases are compatible with PostgreSQL 9.5.* When upgrading from 9.5, dump/restore is not required. When moving from more earlier versions PostgreSQL requires the use of dump/restore or pg_upgrade.

    The current version of Postgres Pro Standard is 11.2.1. Release date - March 28, 2019. .

    Differences between Postgres Pro Standard and PostgreSQL

    In PostgreSQL Standard vs. current version PostgreSQL currently includes the following changes:

    1. Performance improvements on multi-core systems:
      • optimizing the allocation of hash tables in shared memory, eliminating lock contention for a large number of processes.
      • Resource Owner optimization. Improves the performance of complex queries and queries against tables with a large number of partitions.
      • Buffer manager optimizations
      • LWLock optimization for Power8 architecture
      • Two-phase commit optimizations
    2. Full text search improvements:
      • phrase search support
      • support for hunspell dictionaries for working with word forms
      • some dictionaries, including Russian and English, are included in the distribution and their connection requires one SQL command
      • shared_ispell module, which optimizes full-text search performance by loading dictionaries into memory at server startup, and not at the start of the session.
    3. Covering indexes. Support for the INCLUDING construct in CREATE INDEX.
    4. Portability: libicu is supported on all platforms, providing unambiguous handling of sort order and other operations with Unicode characters. On a number of platforms, this library improves sorting performance, and, importantly, allows Postgres Pro to use abbreviated keys, which were disabled in the main version of PostgreSQL.
    5. The pg_trgm module not only supports fuzzy string comparison, but also fuzzy substring search.
    6. The pageinspect module supports access not only to meta information, but also to the internal representation of table data.
    7. A new module has been added, sr_plan, which allows you to save query execution plans and use saved plans instead of generating a query plan anew each time it is executed.
    8. The dump_stat module has been added, which allows you to save information about statistics and restore it when dumping a database. This allows you to speed up the recovery process by eliminating the need to calculate statistics with the VACUUM ANALYZE command after recovery.
    9. Added JSQuery module allowing special language formulate queries to JSONB type fields with support for GIN indexes.
    10. The module provides an additional data type for compatibility with Microsoft SQL Server.
    11. The module provides an additional equality operator for compatibility with Microsoft SQL Server.
    12. The module provides a transactionally unsafe function for truncating temporary tables, preventing the pg_class directory from growing.
    13. The module provides a set of functions that immediately update statistics on target tables after INSERT, UPDATE, DELETE, and SELECT INTO operations on them.
    14. The module adds support for instructions to the scheduler, allowing you to disable or enable certain indexes when executing a query.

    You can learn more about the differences between PostgreSQL and PostgreSQL in the Product Comparison Table.

    PostgreSQL Standard License

    Postgres Pro Standard is distributed under the PostgreSQL license with Postgres Professional add-ons:

    Portions Copyright (c) 2015-2019, Postgres Professional
    Portions Copyright (c) 1996-2019, PostgreSQL Global Development Group
    Portions Copyright (c) 1994 Regents of the University of California

    Rights are granted to use, copy, modify and distribute this software and its documentation for the purposes of testing, software development, familiarization with the functionality of the DBMS, use in educational process free of charge and without the signing of any agreement, provided that the above copyright notice, the current paragraph and the following four paragraphs are provided with each copy. Use for other purposes, integration into other products, replication and other actions require the purchase of a separate license.

    The University of California assumes no liability for any damages, including loss of income, arising directly or indirectly, special or incidental, from the use of this software or its documentation, even if the University of California has been advised of the possibility of such damages.

    The University of California specifically disclaims any warranties of any kind, including, but not limited to, the implied warranties of merchantability or fitness for a particular purpose. This software is provided on an "as is" basis and the University of California is under no obligation to provide maintenance, support, updates, enhancements, or modifications.

    Postgres Professional Limited Liability Company assumes no liability for any damages, including loss of income, caused by direct or indirect, special or incidental use of this software or its documentation, even if Postgres Professional Limited Liability Company has been advised of the possibility of such damage.

    Postgres Professional Limited Liability Company specifically disclaims any warranties, including, but not limited to, the implied warranties of merchantability or fitness for a particular purpose. This software is provided on an "as is" basis and Postgres Professional Limited Liability Company is not obligated to provide maintenance, support, updates, extensions or changes.