Using mysql. Optimal use of MySQL. Selecting a Database

Optimal use of MySQL

Introduction

In order to start working with MySQL, you need to download this database, install, configure and study the documentation.

In the process of providing hosting services, we pay attention to the most common mistakes that users make when developing their virtual servers. One of the "hard" places for a typical webmaster is working with a MySQL server. Typically, studying the principles of SQL functioning and methods of working with databases is carried out from the literature, from which only things that are relevant at the time of reading are selected - how to connect to the database, how to make a request, how to update information or add new entry to the database and so on.

This approach, of course, gives the desired result - the user's website interfaces end up integrated with the database. However, users do not always think about how optimally their database works, how the processes that occur when working with MySQL can be optimized, and what the functioning will be like. virtual server with increased load, “influx” of users as a result, for example, of “promotion” of the site.

This article will help you optimize your work with the MySQL DBMS. The presented material does not pretend to be a detailed description of MySQL optimization in general, but only draws attention to the most common mistakes made by users and talks about how to avoid them. You can learn more about the intricacies of setting up MySQL on specialized pages, links to which are given at the end of this article.

What data needs to be stored in MySQL

Do not try to put all the information you have into the database. For example, there is no need to store pictures there, although MySQL allows this. By placing binary images of graphic files in the database, you will only slow down your server. Reading a file with an image from disk is much simpler and, from the point of view of consumed resources, more economical than connecting from a script to SQL, making a request, receiving an image, processing it and, issuing the necessary http headers, showing it to a visitor to the web server. In the second case, the operation of issuing a picture will require several times more processor, memory and disk resources. It is also worth remembering that there are mechanisms for caching web documents that allow the user to save on traffic, and with dynamic content generation you actually deprive your visitors of this convenient opportunity.

Instead of pictures, it is better to store information in MySQL, on the basis of which you can generate links to static pictures in documents dynamically created by scripts.

Query optimization

In situations where you really only need to get a certain portion of data from MySQL, you can use the LIMIT key for the SELECT function. This is useful when, for example, you need to show the results of searching for something in a database. Let's say the database contains a list of products that your online store offers. Providing the entire list of products in the desired category is somewhat inhumane towards the user - communication channels with the Internet are not fast for everyone, and issuing an extra hundred kilobytes of information often forces users to spend more than one minute waiting for page loading results. In such situations, information is given in portions of, say, 10 positions. It is wrong to select all the information from the database and filter the output with a script. It would be much better to make a request like

select good, price from books limit 20.10

As a result, MySQL will “give” you 10 records from the database starting from the 20th position. Having given the result to the user, make links “Next 10 products”, passing as a parameter to the script the next position from which the list of products will be displayed, and use this number when generating a query to MySQL.

You should also remember that when writing database queries (SQL queries), you should only request the information that you really need. For example, if there are 10 fields in the database, and at the moment you really need to get only two of them, instead of requesting

select * from table_name

use a construction like

select field1, field2 from table_name

This way you won't burden MySQL with unnecessary work, take up extra memory and perform additional disk operations.

You should also use the WHERE key where you need to get information that matches a specific pattern. For example, if you need to get from the database fields with the titles of books authored by Ivanov, you should use a construction like

select title from books where author="Ivanov"

There is also a LIKE key that allows you to search for fields whose values ​​are “similar” to a given pattern:

select title from books where author like "Ivanov%"

IN in this case MySQL will return the names of books whose author field values ​​begin with "Ivanov"

.

Resource-intensive operations

However, it should be remembered that there are operations, the execution of which in itself requires more resources than for ordinary queries. For example, using the DISTINCT operation on a SELECT function consumes much more more CPU time than normal SELECT. DISTINCT attempts to look for unique values, often performing many comparisons, substitutions, and calculations. Moreover, the larger the volume of data to which DISTINCT is applied (after all, your database grows over time), the slower such a request will be executed and the growth of resources needed to perform such a function will not occur in direct proportion to the volume of stored and processed data, but much faster.

Indexes

Indexes are used for more quick search by the value of one of the fields. If the index is not created, then MySQL sequentially scans all fields from the very first record to the most recent, comparing the selected value with the original one. How more table and the more fields it has, the longer the sampling takes. If a given table has an index on the column in question, then MySQL can make a quick positioning to the physical location of the data without having to perform a full table scan. For example, if the table consists of 1000 rows, then the search speed will be at least 100 times faster. This speed will be even higher if there is a need to access all 1000 columns at once, because in this case, there is no time spent on positioning the hard drive.

In what situations is it advisable to create an index:

  1. Finding strings quickly when using the WHERE clause
  2. Finding rows from other tables when performing a join
  3. Finding the MIN() or MAX() value for an indexed field
  4. Sorting or grouping a table if an indexed field is used
  5. In some cases, the need to access the data file is completely lost. If all of the used fields for some table are numeric and form a left-handed index on some key, then the values ​​can be returned entirely from the index tree with much greater speed.
  6. If queries like
    SELECT * FROM tbl_name WHERE col1=val1 AND col2=val2;
    and there is a mixed index on fields col1 and col2, then the data will be returned directly. If you create separate indexes for col1 and col2, the optimizer will try to find the most constrained index by determining which index can find fewer rows and use that index to retrieve the data.
    If the table has a mixed index, then any left match to the existing index will be used. For example, if there is a mixed index of 3 fields (col1, col2, col3), then the index search can be performed on the fields (col1), (col1, col2) and (col1, col2, col3).

Learn more about indexing

Connection support

As you probably know, to work with the MySQL server, you must first establish a connection with it by providing your login and password. The process of establishing a connection can take much longer than directly processing a request to the database after establishing a connection. Following the logic, you need to avoid unnecessary connections to the database, not disconnecting from it where this can be done, if you plan to continue working with the SQL server in the future. For example, if your script has established a connection to the database and made a selection of data for analysis, you do not need to close the connection to the database if, while the same script is running, you plan to place the analysis results in the database.

It is also possible to maintain a so-called persistent connection to the database, but this is fully possible when using more complex programming environments than php or perl in the usual CGI mode, when the interpreter of the corresponding language is launched one-time by the web server to fulfill the incoming request.

Andrey Klochkov

What is MySQL

Before drawing conclusions, is it worth using MySQL package as a database server, you first need to find out what it is. MySQL is a relational DBMS.

MySQL supports SQL (Structured Query Language) and can be used as a SQL server. This means that you can communicate with the server in SQL: the client sends a request to the server, which processes it and gives the client only the data that was received as a result of this request. Thus, the client does not need to download data and perform calculations, as, for example, in Microsoft Access.

In addition, MySQL is open source software, i.e. it can be freely studied and modified. The package is distributed under the terms of the GPL (General Public License) and can be downloaded free of charge from the Internet (http://www.mysql.com) for non-commercial use.

With the advent of Internet technologies that make it possible to create dynamic Web pages, the demand for DBMSs that would be most suitable for this in terms of speed, reliability and stability has increased enormously. And here the MySQL package performed well, which turned out to be fast, simple and reliable, but, however, at the expense of functionality deterioration (let’s immediately make a reservation that the MySQL developers promise to add the missing functions in the next versions of the program).

By by and large, the absence of some functions that were sacrificed for speed and reliability does not create much trouble for users (although sometimes some discomfort does occur). MySQL is not up to the task of running a full-fledged corporate database, but MySQL copes quite well with everyday tasks.

Flaws

Here is a short list of the main features that MySQL is missing.

Transactions- allow you to combine several SQL queries into one unit of work and, if any of the queries included in this unit fail, rollback to return the data to the initial state. Let's explain with an example.

You need to withdraw money from one account and put it into another. To do this, you need to execute two SQL queries: the first is to withdraw money from one account, the second is to credit it to another account. If you do not apply transactions, then if the second request fails, the money will be withdrawn from the account, but will not be credited to another account. Using transactions allows you to roll back as if the money had not been withdrawn from the account at all.

Note that using the LOCK TABLES command in MySQL you can emulate a transaction. This command locks the table while queries are executed, thereby ensuring data integrity, but still cannot be rolled back.

Triggers- serve to automate control over the state and operation of the database. The trigger is stored in the database and fires when a certain event occurs. Let's take the same example with the transfer of money: if the second request fails, a trigger will fire, which will roll back or send a message to the database administrator.

Stored procedures- these are several SQL commands that are stored in the database under a certain name and together perform a certain function. Using stored procedures you can extend SQL syntax so that it will be similar to a regular programming language (for example, Oracle PL/SQL). In our money transfer example, we could save two SQL queries under the same name, and then call this procedure, passing it two account numbers and the amount of money as parameters.

Then both queries would be executed in one transaction. Nested Queries

- allow you to substitute values ​​into the selection conditions dynamically, based on the results of executing another query. According to the author, if you can somehow manage without all of the above, then the lack of nested queries sometimes really spoils life. For example, to find out which vehicle transported more cargo than the fleet average, you need to make the following SQL query:

SELECT auto FROM autopark WHERE massa > !More than what? I have no idea what the average is!

To do this, the average value in the massa field needs to be calculated:

SELECT AVG(massa) FROM autopark

If nested queries are supported, then these two queries can be nested inside each other:

SELECT auto FROM autopark WHERE massa >(SELECT AVG(massa) FROM autopark)

But in the case of MySQL, the average value has to be found separately and substituted into another query directly in the CGI script, which undoubtedly affects performance. UNION instruction

- simply put, it combines the output of several queries into one, with the ability to eliminate duplicate rows. Cascade data update

- Allows you to delete and update related data. For example, when you delete a customer record from the database, all order records for that customer are automatically deleted from the related tables.

Advantages

Now let's list the advantages of MySQL. Performance.

Thanks to the internal multithreading mechanism, MySQL is very fast. A fairly high level of security is ensured thanks to the mysql database, which is created during installation of the package and contains five tables.

Using these tables, you can describe which user from which domain can work with which table and what commands he can use. Passwords stored in the database can be encrypted using MySQL's built-in password() function. License.

MySQL licensing used to be a bit confusing; This program is now distributed free of charge for non-commercial purposes. Openness of the code.

Thanks to this, you can add the necessary functions to the package yourself, expanding its functionality as you require. By the way, the MySQL authors themselves can do this for you for a fee. To order the MySQL extension from the creators of the package, simply go to http://www.mysql.com and fill out the appropriate form. Reliability. The creators of MySQL did a great job: as far as I know, this package is quite stable and difficult to break. I don't specifically track results summaries. hacker attacks

on MySQL, but I have never seen (unlike the same Web servers) a message that MySQL was damaged as a result of someone’s malicious intent. Resources.

This may depend on various factors, but in any case you will not need a supercomputer. Community. As a result of the openness of the code, the free nature of the program, and its stable and reliable operation, a community of people has formed who are not only loyal to MySQL, but also participate in every possible way both in the development of the package itself and in training less experienced people to work with it. Exists great amount mailing lists and conferences where you can get free help

any time of the day. Portability. Currently, there are versions of the program for most common computer platforms. This suggests that a certain operating system . You can choose what to work with, for example, Linux or Windows, but even if you change the OS, you will not lose your data and you will not even need it additional tools

to transfer them.

There are several client programs for MySQL that have a GUI, but they are far from perfect and for the most part only slow down the work. If you prefer GUI, I recommend downloading and trying these programs to encourage their creators to further improve their products. Here is a far from complete list of programs with GUI:

  • Winmysqladmin - included in the MySQL Windows distribution, has a standard graphical interface and allows you to administer MySQL;
  • MySqlManager - included in the Windows distribution of MySQL, it contains client functions (but the author was not able to do anything serious with its help);
  • MySQL Administrator for Windows- a more “advanced” third-party utility. Allows you to register and connect simultaneously to several MySQL servers, create, delete and change the structure of databases and tables, create keys in tables, write SQL queries and save them in a file:
  • XMySQL is a MySQL client for X Window-like systems. Provides full access to tables, allows group inserts and removal, has a query designer and package administration functions. The program can be found at http://web.wt.net/~dblhack/.

A more complete list of all kinds of utilities for MySQL (and it is very large) is at http://www.mysql.com/downloads/.

There you will find a lot of interesting and useful things: exporting data from MySQL to Microsoft Access and vice versa, ODBC drivers, etc. Despite the lack of a graphical interface, the MySQL package includes quite powerful administration tools with a command line interface. Below is a list of them with brief descriptions

  • utilities
  • MySQLAdmin is the main MySQL administration tool. With it, you can create, destroy, modify databases and have complete control over your server.
  • MySQLDump is a data backup utility.
  • MySQLAccess - allows you to change access rights tables and display their contents in an easy-to-read form.
  • MySQLBug - in the event of a bug in MySQL, this utility creates a bug report for the program developers, also sending it to the MySQL mailing list so that experts can help solve your problem.
  • MySQLImport - Imports data from a delimited file into the database.

MySQLShow - shows the structure of databases and the tables they consist of. I would like to draw the reader’s attention to this: now there are programs that work through a CGI interface that provide practically database administration services. These programs are located on Web servers and are ordinary CGI scripts. Very often these scripts are placed in publicly accessible directories. The danger is that using search engines anyone can find such programs by file name, and then do whatever their heart desires with your database. This problem can be easily circumvented by placing these scripts in password-protected directories on the server. But The best decision- generally refuse to use such programs on the server.

MySQL also has its own extension SQL language. These functions can be used in a query in two ways. First, as a value to be retrieved: the function is included in the list of fields to be retrieved. The function's return value will be calculated for each table record and displayed as if it were a table field. For example, let's display the title of the article and its length:

SELECT title, LENGTH(title) FROM table

As a result, we get two columns, where one is taken from the table, and the second was calculated.

Secondly, the function can be used as part of a WHERE clause - in this case, it will replace a constant at the time the query is executed. Let me explain with an example: you need to find events that happened more than a day ago.

SELECT event FROM table WHERE time>(Unix_TIMESTAMP()-(60*60*24))

Here the Unix_TIMESTAMP() function calculates current time, from which we subtract one day.

Application

According to the author, the most suitable area of ​​application for MySQL is the Internet, thanks to the good security system of this package, stable operation and high performance. If your Internet project lacks transactions, you can use Postgres. Postgres is in many ways similar to MySQL, almost as good in performance, but has more functionality. However, as experience shows, MySQL’s capabilities are quite sufficient for simple Internet projects.

As for using MySQL as a corporate database, the situation here is not very favorable. Let's formulate the requirements for the SQL server of a corporate database, based on the specifics of the employees' work, and evaluate MySQL from the point of view of these requirements.

Ability to work with multiple users. This obvious requirement should be supplemented by the fact that the intensity of database use in this case will be much higher than on a Web server. In fact, for a website, 20 visitors at the same time is considered a great success, and in the case of a corporate base, even a small company can boast of this indicator. Particular attention should be paid to the fact that the corporate database uses more complex user interfaces than pages on the site; in other words, it sends requests to the server more intensively. IN technically this means that locking is required at the level of the record being modified. Here MySQL shows itself not in the best possible way: Locking in it is carried out at the table level. This means, in particular, that if someone enters an order, then all queries (analyzing statistics, selecting records for a report, etc.) will have to wait until the order has been entered. In the case of a corporate database, this negates even such an advantage of MySQL as speed.

Data integrity control at the SQL server level. The corporate database is different complex circuit data, and maintaining data integrity using a client program is very difficult: one relation can connect five to seven tables, and the number of tables can reach 30-40. And in this case, the ability to cascade update and delete records in related tables, which MySQL lacks, becomes significant.

In addition, we have already mentioned that the corporate database uses a more complex interface, and this circumstance gives rise to two more requirements: support for all standard SQL statements (and useful extensions), as well as the use of stored procedures and triggers. Alas, MySQL is not happy here either.

Summarizing all that has been said, we can conclude that for most Internet projects, the capabilities of the MySQL DBMS are quite sufficient. They will be enough for storage address book in the internal network of the enterprise.

MySQL is a type of relational database. MySQL is a server to which various users can connect.

When you connect to the Internet, do you enter your username and password, as well as the name of the server you are connecting to? When working with MySQL, the same system is used.

One more thing: what is a relational database? Relational means based on tables. Famous electronic editor Excel tables from Microsoft is actually a relational database editor.

Connecting to MySQL server

To connect to a MySQL server in PHP, use the mysqli_connect() function. This function takes three arguments: server name, username, and password.

The mysqli_connect() function returns the connection identifier, it is stored in a variable and later used to work with databases.

MySQL server connection code:

$link = mysqli_connect("localhost", "root", "");

In this case, I'm working on a local machine on Denwere, so the hostname is localhost, the username is root, and there is no password.

The connection also needs to be closed after finishing working with MySQL. The mysqli_close() function is used to close the connection. Let's expand the example:

$link = mysqli_connect("localhost", "root", ""); if (!$link) die("Error"); mysqli_close($link);

Here we checked the connection identifier for truth; if there is something wrong with our connection, then the program will not be executed, the die() function will stop its execution and display an error message in the browser.

Connection errors

The following functions are used to check the connection:

  • mysqli_connect_errno() - returns the error code of the last connection attempt. If there are no errors, returns zero.
  • mysqli_connect_error() - returns a description of the last connection error to the MySQL server.
define("HOST", "localhost"); define("DB_USER", "root"); define("DB_PASSWORD", ""); define("DB", "tester"); $link = mysqli_connect(HOST, DB_USER, DB_PASSWORD, DB); /* check connection */ if (mysqli_connect_errno()) ( printf("Unable to connect: %s\n", mysqli_connect_error()); exit(); ) else ( printf("Successful to connect: %s\n", mysqli_get_host_info($link));

The mysqli_get_host_info() function returns a string containing the type of connection being used.

Also note that using the define command, I saved all connection parameters as constants. When you write large projects and there will be a lot of files connecting to the MySQL server, it is convenient to store connection parameters in separate file and paste it using include functions or require .

Selecting a Database

A MySQL server can have multiple databases. First of all, we need to select the base we need to work with. In PHP, there is another parameter for this in the mysqli_connect() function - the database name.

I created it on my computer via phpMyAdmin with the name tester. Let's connect to it:

$link = mysqli_connect("localhost", "root", "", "tester"); if (!$link) die("Error"); mysql_close($link);

So, we have chosen a database to work with. But as we know, a relational database consists of tables, and our database does not yet have tables. The database is created empty, without tables. Tables must be added to it separately. Now let's add a table to it using PHP.

Create a table

In the name of the MySQL databases, the SQL part stands for Structured Query Language, which translates as a structured query language. In SQL language we will write queries from PHP programs send them to the MySQL server.

To create a table we just need to issue the CREATE TABLE command. Let's create a table called users in the columns of which the logins (login column) and passwords (password column) of users will be stored.

$query = "CREATE TABLE users(login VARCHAR(20), password VARCHAR(20))";

In this code, we have assigned the $query variable a string of text that represents SQL query. We create a table called users that contains two columns login and password, both of VARCHAR(20) data type. We'll talk about data types later, for now I'll just note that VARCHAR(20) is a string maximum length 20 characters.

To send our request to MySQL server we use the PHP function mysqli_query() . This function returns a positive number if the operation was successful and false if an error occurred (the request syntax is incorrect or the program does not have permission to execute the request).

$link = mysqli_connect("localhost", "root", "", "tester"); if (!$link) die("Error"); $query = "CREATE TABLE users(login VARCHAR(20), password VARCHAR(20))"; mysqli_query($query); mysqli_close($link);

The SQL query does not need to be written into a variable; it can be written directly as an argument to the mysql_query() function. It just makes the code more readable.

This script has one drawback - it does not output anything to the browser. Let's add a message:

$link = mysqli_connect("localhost", "root", "", "tester"); if (!$link) die("Error"); $query = "CREATE TABLE users(login VARCHAR(20), password VARCHAR(20))"; if (mysqli_query($query)) echo "The table has been created."; else echo "Table not created."; mysqli_close($link);

If we run this script again, we will see a message in the browser: “The table has not been created.” The fact is that the table was created during the first launch, and it is impossible to create a table with the same name again. We are faced with an error situation, so it’s time to talk about error handling when working with MySQL.

Error processing

When debugging a program, we may need accurate information about the error. When an error occurs in MySQL, the database server sets the error number and a line with its description. PHP has special functions to access this data.

  • mysqli_errno() - returns the error number.
  • mysqli_error() - returns a string describing the error.

Now let's add the mysql_error() function to our script:

$link = mysql_connect("localhost", "root", "", "tester"); if (!$link) die("Error"); $query = "CREATE TABLE users(login VARCHAR(20), password VARCHAR(20))"; if (mysqli_query($query)) echo "The table has been created."; else echo "Table not created: ".mysqli_error(); mysqli_close($link);

Now our script will return the line to the browser: “Table not created: Table “users” already exists.”

Delete a table

So, now we have a table that we don’t need. It's time to learn how to drop tables from a database.

To drop a table, use the DROP TABLE command followed by the table name.

$link = mysqli_connect("localhost", "root", "", "tester"); if (!$link) die("Error"); $query = "DROP TABLE users"; if (!mysqli_query($query)) echo "Error while deleting table: ".mysqli_error(); else echo "Table deleted."; mysqli_close($link);

Results

So we've mastered MySQL basics. What we learned to do:

  • Connect to a MySQL database using the mysqli_connect() function.
  • Close the connection to the MySQL server using the mysqli_close() function.
  • Send SQL queries to the MySQL server using the mysqli_query() function.
  • We learned the SQL query for creating a table: create table.
  • We learned the SQL query for deleting a table: drop table.
  • We learned how to handle errors using the mysqli_errno() and mysqli_error() functions.

Then we'll take a closer look at MySQL data types.

Read the next lesson:

MySQL is one of the easiest databases to administer on all platforms. In addition, this database does not require resources, so it can be used even on personal computers with small volume random access memory and hard drive. In this regard, developers using PHP tools, have been using MySQL for a long time to create a complete local development environment for the web on all types of client computers, even laptops.

Any developer working in the PHP environment is able to independently administer a MySQL database, unlike some other databases. To further simplify the task of administration, you can take advantage of numerous tools that are not only provided in the MySQL database itself, but can also be obtained from third-party developers.

Basic concepts to be familiar with at this stage, are:

  • database- container for the entire MySQL data collection;
  • table- a container nested in the database in which the data itself is stored;
  • line- a separate record that may contain several fields;
  • column- field name inside the line.

It should be noted that I am not trying to reproduce the exact terminology used in educational literature on relational databases, but rather to provide simple, common definitions to help you quickly understand the basic concepts and get started working with the database.

Accessing MySQL from the command line

There are three main ways to work with MySQL: using the command line, using a web interface like phpMyAdmin, and using a programming language like PHP. The third of these methods will be discussed in subsequent articles, but now let's look at the first two methods.

If you have installed the OpenServer WAMP server in accordance with the instructions in the article, then access to executable program MySQL can be obtained from the following directory:

MySQL course from scratch.

Instead of MySQL-5.7-x64 you need to substitute the version specified in the OpenServer settings in the "Modules" tab.

You need to open the Command Prompt program and go to this directory. This is done using the command cd PATH_TO_NEEDED_FOLDER:

The code is only available after purchasing the MySQL from scratch course.

After this, you need to run the mysql.exe program in this directory, passing its special parameter. For this purpose in command line now you need to run the command:

The code is only available after purchasing the MySQL from scratch course.

As a result, the MySQL client will start. It is connected to the MySQL server, which was launched when OpenServer started.

If this does not lead to the desired result and an error message is displayed about connecting to the MySQL server “Can"t connect to MySQL server on "localhost"", make sure that OpenServer is running and MySQL is specified in the modules.

The -u parameter stands for user. That is, this is a checkbox for specifying the user under which you need to connect to the server. root is the most main user in MySQL. It is created when the server is installed and by default it does not have a password.

But let's return to our terminal. Through this console client we can send various commands to the server. Let's run a command that lists all the databases created on this server.

The code is only available after purchasing the MySQL from scratch course.

In response, we will receive a beautifully designed list of databases. You will have fewer of them than I do, because I have already added them.

Working with MySQL via phpMyAdmin

To manage databases and tables it will be much easier and faster to use the phpMyAdmin program. Before entering into the browser address bar next line, you need to make sure that the OpenServer program is already running and, therefore, the MySQL database is ready for use:

Here we will be greeted by such a beautiful window for logging into the system.

Just as in the case with console application specify the root user and leave blank password. Click "login".

After this, you will be taken to the database control panel with a fairly friendly interface. On the left you will see the same list of databases that you received in the console version. You can poke them and see what’s inside.

Now let's click on the SQL tab and go to the window where you can directly write queries to the MySQL DBMS, as it would be in the console:

In the window that opens, enter the same query:

The code is only available after purchasing the MySQL from scratch course.

We press the “forward” button and see the same result as in the case of the console application.

Basic SQL Concepts

According to Andrew Taylor, who developed the SQL language, the name of the language is not short for Structured Query Language (or anything similar), although many believe that it is. The SQL language underlies a more rigorous and more general method data storage compared to the previous DBM-style database organization standard, which was based on the use of flat files.

SQL language is defined by standards ANSI(American National Standards Institute) and ECMA(European Computer Manufacturer's Association); both of these standardization organizations are internationally recognized. But it should be borne in mind that in general recommendations There are notable differences in SQL standards between commercial software products on the one hand and open source database organizations on the other. For example, the past few years have seen rapid development of so-called object-relational databases, as well as SQL software products specifically designed for the Web market. List of databases that can be used in combination with PHP system, is extremely large, so certain principles must be followed when choosing the most suitable database.

The basic principle is that you must first consider your own development needs, or at least know exactly what goal you want to achieve. If you try to read other recommendations, you will come across a lot of very convincing messages with such arguments that you literally “cannot do” without some advanced database tools (such as triggers or stored procedures) and any SQL-enabled software product that does not have these tools hardly deserves the right to be called a SQL product. Take these judgments with a grain of salt. It is much better to make a list of the required characteristics yourself, in order of importance, and then start searching for the software product that best suits your requirements.

But despite the above, much of the SQL language support does follow the standard. Therefore, in programs, as a rule, only a few are used repeatedly. SQL statements, regardless of which specific software product was chosen for implementation.

Essentially, a SQL database has a very simple logical structure. Any given SQL software installation can typically consist of multiple databases. For example, one database can be used to store data about customers, and another can contain data about products. (A certain difficulty arises due to the fact that SQL server and the collections of tables supported by this server are usually denoted general term database.) Each database contains several tables, each table consists of carefully defined columns, and each position in the table can be considered as a record, or row, entered into the table.

Any SQL server supports four so-called data manipulation operators, and in general these operators underlie the vast majority of operations performed with relational base data. These four basic database statements are SELECT, INSERT, UPDATE and DELETE. SQL statements, also called commands, are very convenient and allow you to perform almost all necessary actions in the database.

An important feature of these four SQL statements is that they only manipulate values ​​stored in the database, but do not affect the structure of the database itself. In other words, commands based on these operators can be used, for example, to enter data rather than to create a database; Using such commands, you can remove any piece of data from the database, but the “shell” itself will remain intact, so, in particular, you cannot assign the name of an existing database to another database running on the same server. To add or remove columns, destroy an entire database without leaving a trace, or create a new database, you must use other commands such as DROP, ALTER, and CREATE.

We will look at all these operators in detail in the next article on MySQL commands.

Database operations very often become a bottleneck when implementing a web project. Optimization issues in such cases concern not only the database administrator. Programmers need to properly structure tables, write optimal queries and more productive code. This article provides a small list of techniques for optimizing work with MySQL for programmers.

1. Optimize your queries for the query cache.

Most MySQL servers use query caching. This is one of effective methods performance improvements that are performed by the database engine in background. If a query is executed many times, the cache begins to be used to obtain the result and the operation is completed much faster.

The problem is that it is so simple and at the same time hidden from the developer, and most programmers ignore such a great opportunity to improve the performance of the project. Some actions may actually prevent the query cache from being used during execution.

// Query cache NOT WORKING $r = mysql_query("SELECT username FROM user WHERE signup_date >= CURDATE()"); // The query cache is WORKING! $today = date("Y-m-d"); $r = mysql_query("SELECT username FROM user WHERE signup_date >= "$today"");

The reason the query cache doesn't work in the first case is because of the use of the function CURDATE(). This approach is used for all non-deterministic functions, for example, NOW(), RAND(), etc. Since the function's return result may change, MySQL decides not to post this request in the cache. All that is needed to fix the situation is to add an extra line of PHP code before the request.

2. Use EXPLAIN for your SELECT queries

Using the EXPLAIN keyword can help paint a picture of what MySQL is doing to complete your query. This picture makes it easy to identify bottlenecks and other problems in queries or table structure.

The result of an EXPLAIN query shows which indexes are used, how the table is scanned and sorted, and so on.

Let's take a SELECT query (preferably a complex one, with a JOIN), and add the EXPLAIN keyword before it. You can use PhpMyAdmin for this. Such a query will output the result in a nice table. Let's say we forgot to add an index on a column that is used for a JOIN:

After adding an index for the group_id field:

Now instead of scanning 7883 rows, only 9 and 16 rows from two tables will be scanned. A good way to evaluate performance is to multiply all the numbers in the “rows” column. The result is approximately proportional to the amount of data being processed.

3. Use LIMIT 1 if you need to get a unique string

Sometimes, while using a query, you already know that you are looking for only one row. You can get a unique record or simply check for the existence of any number of records that satisfy the WHERE clause.

In such a case, adding LIMIT 1 to your query may improve performance. Under this condition, the database engine stops scanning records as soon as it finds one and does not scan the entire table or index.

// Is there any user from Alabama? // Don't do this: $r = mysql_query("SELECT * FROM user WHERE state = "Alabama""); if (mysql_num_rows($r) > 0) ( // ... ) // This will be much better: $r = mysql_query("SELECT 1 FROM user WHERE state = "Alabama" LIMIT 1"); if (mysql_num_rows($r) > 0) ( // ... )

4. Index your search fields

Index more than just primary and unique keys. If any columns in your table are used for search queries, then they need to be indexed.

As you can see, this rule also applies to searching by part of a string, for example, “last_name LIKE ‘a%’”. When the start of a string is used for a search, MySQL can use the index of the column being searched.

You should also understand what types of searches you can't use regular indexing for. For example, when searching for the word (“WHERE post_content LIKE ‘%apple%’”), the benefits of indexing will not be available. In such cases, it is better to use mysql full-text search or build own decisions based on indexing.

5. Indexing and using the same types for linked columns

If your application contains many JOIN queries, you need to index the columns that are linked in both tables. This has an effect on internal optimization bind operations in MySQL.

Also, the columns being linked must be of the same type. For example, if you associate a DECIMAL column with an INT column from another table, MySQL will not be able to use an index on at least one of the two tables. Even the character encoding should be the same for identical columns line type.

// Search for a company from a specific state $r = mysql_query("SELECT company_name FROM users LEFT JOIN companies ON (users.state = companies.state) WHERE users.id = $user_id"); // both columns for the state name must be indexed // and both must be the same type and character encoding // or MySQL will fail full scan tables

6. Don't use ORDER BY RAND()

This is one of those tricks that looks cool, and many new programmers fall into its trap. They cannot even imagine what a terrible problem they are creating for themselves by starting to use this expression in their queries.

If you really need to randomize the rows in your query result, there are many better ways to do this. Of course, this will be implemented with additional code, but you will be saved from a problem that grows exponentially with the volume of data. The thing is, MySQL performs a RAND() operation (which takes up CPU time) on every single row in the table before sorting it and giving you just one row.

// DO NOT do this: $r = mysql_query("SELECT username FROM user ORDER BY RAND() LIMIT 1"); // This will work better: $r = mysql_query("SELECT count(*) FROM user"); $d = mysql_fetch_row($r); $rand = mt_rand(0,$d - 1); $r = mysql_query("SELECT username FROM user LIMIT $rand, 1");

This is how you get random number, which is less than the number of rows in the query result, and use it as an offset in the LIMIT clause.

7. Avoid using SELECT *

The more data is read from the table, the slower the query will be. Such operations also take time to complete disk operations. And if the database server is separate from the web server, then delays will also be caused by data transfer over the network between the servers.

It's a good habit to specify a column when doing a SELECT.

// Bad: $r = mysql_query("SELECT * FROM user WHERE user_id = 1"); $d = mysql_fetch_assoc($r); echo "Welcome ($d["username"])"; // This is better: $r = mysql_query("SELECT username FROM user WHERE user_id = 1"); $d = mysql_fetch_assoc($r); echo "Welcome ($d["username"])"; // The difference becomes significant with large amounts of data

8. Try to use the id field everywhere

A good practice is to use an id field in each table that has the PRIMARY KEY, AUTO_INCREMENT properties set, and is of type from the INT family. Preferably - UNSIGNED, since in this case the value cannot be negative.

Even if your table has a field with a unique username, don't make it the primary key. VARCHAR fields are slow to work as primary keys. Also, the structure of your database will be better if it internally uses links to records based on id.

Additionally, the MySQL engine uses primary keys for its internal purposes, and using the id field creates optimal conditions to solve them.

One possible exception to this rule is “associative tables,” which are used for many-to-many relationships between two other tables. For example, the “posts_tags” table contains 2 columns: post_id, tag_id. They are used to describe the relationship between two tables “post” and “tags”. The table described may have a primary key that contains both id fields.

9. Use ENUM instead of VARCHAR

// Create a prepared expression if ($stmt = $mysqli->prepare("SELECT username FROM user WHERE state=?")) ( // Bind the parameters $stmt->bind_param("s", $state); // Execute $stmt->execute(); // Bind the result variables $stmt->bind_result($username); // Get the values ​​$stmt->fetch(); printf("%s is from %s\n", $username , $state); $stmt->close();

13. Unbuffered requests

Typically, when you execute a request from a script, the script is interrupted until the request is completed. This behavior can be changed using unbuffered queries.

Excellent explanation of the mysql_unbuffered_query() function from the PHP documentation:

“mysql_unbuffered_query() sends a SQL query to the MySQL server without automatically retrieving and buffering the result rows as the mysql_query() function does. Thus, a certain amount of memory is saved by SQL queries that issue big set result, and you can start working with the result set immediately after receiving the first row, without waiting for the SQL query to be fully executed.”

However, there are several limitations. You must either read all rows or call mysql_free_result() before executing the next query. You also cannot use mysql_num_rows() or mysql_data_seek() to set the result.

14. Store the IP address as UNSIGNED INT

Many programmers create a VARCHAR(15) field to store an IP address, without even thinking about the fact that they will store an integer value in this field. If you use INT, the field size will be reduced to 4 bytes, and it will have a fixed length.

You must use the UNSIGNED INT type because the IP address uses all 32 bits of an unsigned integer.

$r = "UPDATE users SET ip = INET_ATON("($_SERVER["REMOTE_ADDR"])") WHERE user_id = $user_id";

15. Tables with a fixed record length (Static) are faster

When each individual column in a table has a fixed length, then the table as a whole is considered “static” or “fixed record length”. Examples of column types that do not have a fixed length: VARCHAR, TEXT, BLOB. If you include at least one column of this type, the table will no longer be considered "static" and will be processed differently by the MySQL engine.

"static" tables are processed faster by the MySQL engine when searching for records. When you need to read a specific record in a table, its position is quickly calculated. If the row size is not fixed, then determining the position of the record requires time to search and compare with the index of the main key.

Such tables are also easier to cache and easier to recover from failures. But they can take up more space. For example, if you convert a VARCHAR(20) field to a CHAR(20) field, then 20 bytes will always be occupied, regardless of whether they are in use or not.

Using the Vertical Splitting technique makes it possible to separate variable-length columns into a separate table.

16. Vertical separation

Vertical partitioning is the act of dividing a table structure vertically for the purpose of optimization.

Example 1: You have a table that contains home addresses that are rarely used in the application. You can partition your table and store addresses in separate table. This will reduce the size of the main user table. And as you know, a smaller table is processed faster.

Example 2: You have a “last_login” field in your table. It is updated every time a user registers on the site. But every table update causes the query to be cached, which can create system overload. You can allocate this field to another table to make updates to the user table less frequent.

But you need to be sure that you don't need to permanently link the two tables you just split, as this could lead to poor performance.

17. Separate large DELETE or INSERT queries

If you need to perform big request DELETE or INSERT on a live site, you need to be careful not to disrupt traffic. When a large query is running, it can lock your tables and cause your application to stop.

Apache runs many parallel processes/threads. for this reason it works more efficiently when the script finishes executing as quickly as possible, so the server doesn't use too much open connections and processes that consume resources, especially memory.

If you lock tables for an extended period of time (for example, 30 seconds or more) on a heavily loaded web server, you can cause a backlog of processes and requests that will require significant cleanup time or even cause your web server to stop.

If you have a script that removes a large number of records, just use the LIMIT clause to split it into small batches to avoid the situation described.

While (1) ( mysql_query("DELETE FROM logs WHERE log_date<= "2009-10-01" LIMIT 10000"); if (mysql_affected_rows() == 0) { // выполняем удаление break; } // вы можете сделать небольшую паузу usleep(50000); }

18. Small columns are processed faster

For a database engine, disk is the most important bottleneck. The desire to make everything smaller and smaller usually has a positive impact on productivity by reducing the amount of data being moved.

The MySQL documentation contains a list of data storage standards for all types.

If the table will only contain a few rows, then there is no reason to make the primary key type INT rather than MEDIUMINT, SMALLINT, or even TINYINT. if you only need the date, use DATE instead of DATETIME.

You just need to remember about the opportunities for growth.

19. Choose the right storage mechanism

There are two main data storage engines for MySQL: MyISAM and InnoDB. Each has its own advantages and disadvantages.

MyISAM is great for read-heavy applications, but it doesn't scale well when there are a lot of records. Even if you update one field in one row, the entire table will be locked and no process will be able to read anything until the query completes. MyISAM performs calculations quickly for queries like SELECT COUNT(*).

InnoDB is a more complex storage engine, and it can be slower than MyISAM for most small applications. But it supports row locking, which is better for scaling tables. It also supports some additional features such as transactions.

20. Use object-relational mapping

Using an object-relational mapping (ORM - Object Relational Mapper) provides a number of advantages. Everything that can be done in an ORM can be done manually, but with more effort and higher developer level requirements.

ORM is great for lazy loading. This means that values ​​can be retrieved when they are needed. But you need to be careful because you can create a lot of small queries that will reduce performance.

An ORM can also combine your queries into transactions, which are significantly faster than individual database queries.

For PHP you can use ORM Doctrine.

21. Be careful with persistent connections

Persistent connections are designed to reduce the cost of restoring connections to MySQL. When a persistent connection is created, it remains open even after the script completes. Since Apache reuses child processes, the process is executed for the new script, and it also uses the MySQL connection.

This sounds great in theory. But in reality, this feature is not worth a penny due to the problems. It can cause serious problems with connection limits, memory overflow, and so on.

Apache works on concurrency principles and creates many child processes. This is the reason why persistent connections do not work as expected on this system. Before using the mysql_pconnect() function, consult your system administrator.