GET and POST parameters in the request. Learning to work with GET and POST requests. More about HTTP
Today I want to go over some basics and describe something that can be found on the World Wide Web in large quantities and without much effort. We will talk about practically the holy of holies of the HTTP protocol: POST and GET requests.
Many will ask: why? I will answer briefly and clearly: not everyone knows what these requests are and why they are needed, and those who want to learn about them (while understanding little of IT) often cannot follow what is written in the many articles devoted to this topic. I’ll try to explain in plain terms what POST and GET requests are and what they are used for.
So, let's begin our journey into a fairy tale...
If you are reading this, then you at least know what the Internet looks like and what a website is. Leaving aside all the subtleties of how the World Wide Web works, we will operate with two concepts: the user and the site. Whatever one may say, these two entities must somehow interact with each other. People communicate through gestures, emotions and speech, animals make sounds, but what happens when a person and an Internet resource “communicate”? Here we have an exchange of information that can be compared to a human question-and-answer conversation. Moreover, both the user and the site can ask questions and give answers. On the site's side, questions and answers are, as a rule, expressed as a web page with some text. On the user's side, everything happens through GET and POST requests (not only through them, of course, but they are our subject here).
So, the objects of our discussion are needed to “communicate” with sites. Moreover, both GET and POST requests can be used to “ask questions” of the site and to “answer” them. How do they differ? Everything is quite simple. To explain the differences, however, we will have to consider an example, for which we will take an online store site.
You have probably noticed, when looking for something in an online store and using search filters, that the site address turns from the beautiful “http://magaazin.ru” into the terrible “http://magaazin.ru/?category=shoes&size=38”. Everything that comes after the ‘?’ symbol holds the parameters of your GET request to the site; to be precise, in this case you are asking the site what it has in the “shoes” category in size “38” (this example is made up, in reality everything may not look so obvious). As a result, we can ask questions ourselves by writing them in the browser's address bar. This method obviously has several disadvantages. Firstly, anyone standing next to the user at the computer can easily read all the data, so using this type of request to transmit passwords is strongly discouraged. Secondly, there is a limit on the length of the string that can be passed in the address field, which means that it will not be possible to transfer much data. A definite plus of GET requests, however, is their ease of use and the ability to query the site quickly, which is especially useful during development, but that’s another story...
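To make the idea of "questions in the address bar" concrete, here is a minimal Python sketch that takes the store address from the example above and pulls out the part after the ‘?’ symbol. It only illustrates how such an address can be taken apart; it says nothing about how the store itself handles the parameters.

```python
from urllib.parse import urlsplit, parse_qs

# Address taken from the example above.
url = "http://magaazin.ru/?category=shoes&size=38"

parts = urlsplit(url)

# Everything after '?' is the query string.
query = parts.query

# parse_qs maps each parameter name to a list of its values.
params = parse_qs(query)

print(query)
print(params)
```

Each parameter maps to a list because the same name may legitimately appear several times in one address.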
Now let's talk about POST requests. Attentive readers may have realized that the main difference between this request and its counterpart is that the transmitted data is hidden from view. In an online store, a striking example of a POST request is registration on the site. The site asks for your data, you fill it in, and when you click the “Registration” button you send your answer. This data is not displayed in the address bar, although it is not encrypted either unless the site uses HTTPS. It is also worth noting that a site can ask for quite a large amount of information, and the POST request itself sets no limit on its size. As for the minus, such a request cannot be put together as quickly: you can’t do it without some special skills. In reality everything is not so difficult, but again, that’s another story.
Let's summarize. POST and GET requests are needed for “communication” between the user and the site. In essence they are almost opposites of each other. Which type of request to use depends on the specific situation, and getting by with only one type is extremely inconvenient.
1. HTTP protocol. Introduction
I would like to clarify one small thing right away. The frightening word “protocol” means nothing more than an agreement among many people: at one fine moment people simply decided, “Let's do it this way, and then everything will be fine.” There is nothing to be afraid of; everything is outrageously simple, and we will now see just how simple. So, what is the HTTP protocol and what is it used for?
1.1 Client and server
There are no miracles in the world, and especially in the world of programming and the Internet! Accept this as an unshakable truth. And, if the program does not work or does not work as desired, then, most likely, it is either written incorrectly or contains errors. So, how does the browser ask the server to send it anything? Yes, very simple! You just need to relax a little and start enjoying the process :-)
1.2. Writing our first HTTP request
If you think that everything is too complicated, then you are mistaken. Man is designed in such a way that he is simply not capable of creating something complex, otherwise he himself will get confused in it :-) So, there is a browser and there is a Web server. The browser is always the initiator of data exchange. A Web server will never send anything to anyone on its own: for it to send something to the browser, the browser must ask for it. The simplest HTTP request might look, for example, like this:
GET http://www.php.net/ HTTP/1.0\r\n\r\n
* GET (translated from English as “get”) - the type of request; it can be different, for example POST, HEAD, PUT, DELETE (we will look at some of them below).
* http://www.php.net/ - the URI (address) from which we want to receive at least some information (naturally, we hope to get an HTML page).
* HTTP/1.0 - the type and version of the protocol that we will use when communicating with the server.
* \r\n - the end of the line, which must be repeated twice; why will become clear a little later.
You can perform this request very simply. Run the telnet.exe program, enter www.php.net as the host, specify port 80, and simply type this request, pressing Enter twice to produce \r\n\r\n. In response you will receive the HTML code of the home page of the www.php.net site.
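The same experiment can be sketched in Python instead of telnet. The helper names build_simple_get and fetch are made up for this illustration; the socket call is left commented out because it needs network access, and what the server sends back today may differ from what is described here.

```python
import socket


def build_simple_get(uri: str) -> bytes:
    """Return the simplest HTTP/1.0 GET request for the given URI."""
    # Request line followed by an empty line: "\r\n" appears twice in a row.
    return f"GET {uri} HTTP/1.0\r\n\r\n".encode("ascii")


def fetch(host: str, request: bytes, port: int = 80) -> bytes:
    """Send raw request bytes and read the reply until the server closes."""
    with socket.create_connection((host, port), timeout=10) as conn:
        conn.sendall(request)
        chunks = []
        while True:
            chunk = conn.recv(4096)
            if not chunk:
                break
            chunks.append(chunk)
    return b"".join(chunks)


request = build_simple_get("http://www.php.net/")
print(request)

# To actually send it (requires network access):
# print(fetch("www.php.net", request)[:200])
```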
1.3 Request structure
Let's look at what an HTTP request consists of. Everything is quite simple. Let's start with the fact that an HTTP request is perfectly meaningful text. What does it consist of in the general case? We will consider the HTTP 1.0 protocol. So:
Request-Line [ General-Header | Request-Header | Entity-Header ] \r\n [ Entity-Body ]
* Request-Line - request line
Format: "Method Request-URI HTTP-Version\r\n"
* Method - the method by which the Request-URI resource will be processed; it can be GET, POST, PUT, DELETE or HEAD.
* Request-URI - a relative or absolute reference to a page with a set of parameters, for example, /index.html or http://www.myhost.ru/index.html or /index.html?a=1&b=qq. In the last case, the server will be sent a request with the variables a and b and their corresponding values; the “&” sign (ampersand) serves as a separator between the parameters.
* HTTP-Version - the version of the HTTP protocol, in our case "HTTP/1.0".
We are most interested in the GET and POST processing methods. With the GET method you can simply pass parameters to a script, and with the POST method you can emulate a form submission.
For the GET method, the Request-URI might look like this: "/index.html?param1=1&param2=2".
* General-Header - the main part of the header.
It can have only two parameters: Date or Pragma. Date is the Greenwich date in the format "Day of week, Day Month Year HH:MM:SS GMT", for example, "Tue, 15 Nov 1994 08:12:31 GMT" - the date the request was created. Pragma can have the single value no-cache, which disables caching of the page.
* Request-Header - part of the header that describes the request.
Request-Header can have the following parameters : Allow, Authorization, From, If-Modified-Since, Referer, User-Agent.
In this chapter, we will not cover the Authorization parameter, since it is used to access closed resources, which is not required very often. You can learn how to create an authorized access header yourself at www.w3c.org.
* Allow - sets the acceptable processing methods.
Format: "Allow: GET | HEAD\r\n".
The parameter is ignored when the POST processing method is specified in the Request-Line. Proxy servers do not modify the Allow parameter, and it reaches the server unchanged.
* From - the e-mail address of whoever sent the request.
Format: "From: address\r\n".
For example, "From: [email protected]".
* If-Modified-Since - asks the server to return the resource only if it has been modified since the given time.
Format: "If-Modified-Since: date\r\n"
Used only for the GET processing method. The date is specified in GMT in the same format as for the Date parameter in the General-Header.
* Referer - an absolute link to the page from which the request was initiated, i.e. a link to the page from which the user came to ours.
Format: "Referer: url\r\n".
Example: "Referer: www.host.ru/index.html\r\n".
* User-Agent - browser type.
For example: "User-Agent: Mozilla/4.0\r\n"
* Entity-Header - part of the header that describes the Entity-Body data.
This part of the request specifies parameters that describe the body of the page. Entity-Header can contain the following parameters: Allow, Content-Encoding, Content-Length, Content-Type, Expires, Last-Modified, extension-header.
* Allow - a parameter similar to Allow from Request-Header.
* Content-Encoding - the encoding type of the Entity-Body data.
Format: "Content-Encoding: x-gzip | x-compress | other type".
Example: "Content-Encoding: x-gzip\r\n". The "|" character means “or”, that is, this or that or that, etc.
How form data is encoded for the POST method is indicated by a different parameter, Content-Type: "Content-Type: application/x-www-form-urlencoded\r\n".
* Content-Length - the number of bytes in the Entity-Body. The Content-Length value has a completely different meaning for data sent in MIME format, where it acts as a parameter describing a part of the data - "external/entity-body". Valid values are integers from zero upwards.
Example: "Content-Length: 26457\r\n".
* Content-Type - the type of the transmitted data.
For example: "Content-Type: text/html\r\n".
* Expires - the time when the page should be removed from the browser cache.
Format: "Expires: date\r\n". The date format is the same as for the Date parameter from General-Header.
* Last-Modified - the time the transmitted data was last changed.
Format: "Last-Modified: date\r\n". The date format is the same as for the Date parameter from General-Header.
* extension-header - a part of the header that can be intended, for example, for processing by a browser or another program that receives the document. In this part you can describe your own parameters in the "ParameterName: parametervalue\r\n" format. These parameters will be ignored if the client program does not know how to process them.
For example: "Cookie: r=1\r\n" - sets the well-known cookies for the page.
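The structure described above (request line, header fields, an empty line, then the body) can be illustrated with a small Python sketch that splits a raw request into its parts. The function parse_request is a hypothetical helper written for this article, not part of any library, and it skips the error handling a real server would need.

```python
def parse_request(raw: bytes):
    """Split a raw HTTP/1.0 request into its parts (illustrative only)."""
    # The header block and the Entity-Body are separated by an empty line.
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("latin-1").split("\r\n")

    # Request-Line: "Method Request-URI HTTP-Version"
    method, request_uri, version = lines[0].split(" ", 2)

    # Every remaining line is one "Name: value" header field.
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip()] = value.strip()
    return method, request_uri, version, headers, body


raw = (
    b"GET /index.html?a=1&b=qq HTTP/1.0\r\n"
    b"User-Agent: Mozilla/4.0\r\n"
    b"Cookie: r=1\r\n"
    b"\r\n"
)
method, uri, version, headers, body = parse_request(raw)
print(method, uri, version)
print(headers)
```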
And now, after such frightening words, let's calm down a little and work out what we actually need. Naturally, we will do so with examples.
Let's imagine that we need to get a page from a site while passing Cookies, otherwise we will simply be turned away as uninvited guests; moreover, it is known that this page may be accessed only after visiting the main page of the site.
2. GET method
Let's write our request.
GET http://www.site.ru/news.html HTTP/1.0\r\n
Host: www.site.ru\r\n
Referer: http://www.site.ru/index.html\r\n
Cookie: income=1\r\n
\r\n
This request says that we want to get the contents of the page at http://www.site.ru/news.html using the GET method. The Host field indicates that this page is located on the server www.site.ru, the Referer field indicates that we came for the news from the main page of the site, and the Cookie field indicates that we were assigned such and such a cookie. Why are the Host, Referer and Cookie fields so important? Because sensible programmers, when creating dynamic sites, check these fields, which are available in scripts (including PHP) as variables. What is this for? For example, to keep the site from being scraped by an automatic downloading program, or to make sure that a visitor always arrives only from the main page, etc.
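What such a check might look like on the server side can be sketched in Python. Everything here is an assumption made for illustration: the function name looks_legitimate, the main page address http://www.site.ru/index.html and the cookie income=1 are borrowed from the article's examples, and a real site would decide for itself what to verify.

```python
from http.cookies import SimpleCookie

# Assumed address of the main page, as in the article's examples.
MAIN_PAGE = "http://www.site.ru/index.html"


def looks_legitimate(headers: dict) -> bool:
    """Hypothetical check: the visitor came from the main page
    and carries the cookie income=1."""
    if headers.get("Referer") != MAIN_PAGE:
        return False
    cookie = SimpleCookie()
    cookie.load(headers.get("Cookie", ""))
    return "income" in cookie and cookie["income"].value == "1"


print(looks_legitimate({"Referer": MAIN_PAGE, "Cookie": "income=1"}))
print(looks_legitimate({"Cookie": "income=1"}))
```

Since the client writes all of these fields itself, a check like this only stops careless visitors, which is exactly why hand-made requests can get past it.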
Now let's imagine that we need to fill out the form fields on the page and send a request from the form. Let there be two fields in this form, login and password, and, of course, we know the login and password.
GET http://www.site.ru/news.html?login=Petya%20Vasechkin&password=qq HTTP/1.0\r\n
Host: www.site.ru\r\n
Referer: http://www.site.ru/index.html\r\n
Cookie: income=1\r\n
\r\n
Our login is "Petya Vasechkin". Why should we write Petya%20Vasechkin? Because special characters can be interpreted by the server as signs of a new parameter, the end of the request, etc. Therefore, there is an algorithm for encoding parameter names and their values in order to avoid errors in the request. A full description of this algorithm can be found here, and PHP has the rawurlencode and rawurldecode functions for encoding and decoding respectively. I would like to note that PHP does the decoding itself if encoded parameters were passed in the request. This concludes the first chapter of my acquaintance with the HTTP protocol. In the next chapter we will look at building POST requests (translated from English as “send”), which will be much more interesting, because this is exactly the type of request used when sending data from HTML forms.
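The encoding of the login can be tried out in a few lines. This sketch uses Python's urllib.parse.quote and unquote, which play roughly the same role as rawurlencode and rawurldecode in PHP; passing safe="" is a choice made here so that every reserved character gets escaped.

```python
from urllib.parse import quote, unquote

login = "Petya Vasechkin"

# A space is not allowed in a URL, so it becomes %20.
encoded = quote(login, safe="")
print(encoded)

# Characters with a special meaning in a query string are escaped too.
print(quote("a&b=c", safe=""))

# Decoding restores the original text.
print(unquote(encoded))
```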
3. POST method.
In the case of an HTTP POST request, there are two options for transferring fields from HTML forms, namely the application/x-www-form-urlencoded and multipart/form-data algorithms. The differences between them are quite significant. The fact is that the first algorithm was created a long time ago, when the HTML language did not yet provide the ability to transfer files via HTML forms. So, let's look at these algorithms with examples.
3.1 Content-Type: application/x-www-form-urlencoded.
We write a request similar to our GET request for transferring the login and password, which was discussed in the previous chapter:
POST http://www.site.ru/news.html HTTP/1.0\r\n
Host: www.site.ru\r\n
Referer: http://www.site.ru/index.html\r\n
Cookie: income=1\r\n
Content-Type: application/x-www-form-urlencoded\r\n
Content-Length: 35\r\n
\r\n
login=Petya%20Vasechkin&password=qq
Here we see an example of using the Content-Type and Content-Length header fields. Content-Length tells how many bytes the data area will occupy; this area is separated from the header by one more line break \r\n. The parameters that were previously placed in the Request-URI for a GET request are now in the Entity-Body. They are formed in the same way, you just need to write them after the header. I want to point out one more important thing: nothing prevents you from placing parameters with other names in the Request-URI at the same time as the set of parameters in the Entity-Body, for example:
POST http://www.site.ru/news.html?type=user HTTP/1.0\r\n
.....
\r\n
login=Petya%20Vasechkin&password=qq
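A request of this kind can be assembled programmatically. The following Python sketch builds the body with urllib.parse.urlencode and derives Content-Length from it; the function name build_form_post is invented for this example, and the Referer and Cookie fields are left out for brevity.

```python
from urllib.parse import quote, urlencode


def build_form_post(uri: str, host: str, fields: dict) -> bytes:
    """Assemble a raw application/x-www-form-urlencoded POST request."""
    # quote_via=quote encodes a space as %20, as in the article
    # (the default, quote_plus, would write "+" instead).
    body = urlencode(fields, quote_via=quote).encode("ascii")
    head = (
        f"POST {uri} HTTP/1.0\r\n"
        f"Host: {host}\r\n"
        "Content-Type: application/x-www-form-urlencoded\r\n"
        f"Content-Length: {len(body)}\r\n"
        "\r\n"
    ).encode("ascii")
    return head + body


request = build_form_post(
    "http://www.site.ru/news.html",
    "www.site.ru",
    {"login": "Petya Vasechkin", "password": "qq"},
)
print(request.decode("ascii"))
```

Computing Content-Length from the encoded body, rather than typing it by hand, keeps the header and the data in agreement.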
3.2 Content-Type: multipart/form-data
As soon as the Internet world realized that it would be nice to send files through forms, the W3C consortium set about refining the POST request format. By that time the MIME format (Multipurpose Internet Mail Extensions - multi-purpose protocol extensions for creating Mail messages) was already widely used, so, in order not to reinvent the wheel, it was decided to use part of this message format to create POST requests in the HTTP protocol.
What are the main differences between this format and the application/x-www-form-urlencoded type?
The main difference is that the Entity-Body can now be divided into sections, which are separated by boundaries (boundary). What's most interesting is that each section can have its own header describing the data stored in it, i.e. data of various types can be transferred in one request (just as in a Mail message you can send files together with text).
So let's get started. Let's consider again the same example with the transfer of login and password, but now in a new format.
POST http://www.site.ru/news.html HTTP/1.0\r\n
Host: www.site.ru\r\n
Referer: http://www.site.ru/index.html\r\n
Cookie: income=1\r\n
Content-Type: multipart/form-data; boundary=1BEF0A57BE110FD467A\r\n
Content-Length: 209\r\n
\r\n
--1BEF0A57BE110FD467A\r\n
Content-Disposition: form-data; name="login"\r\n
\r\n
Petya Vasechkin\r\n
--1BEF0A57BE110FD467A\r\n
Content-Disposition: form-data; name="password"\r\n
\r\n
qq\r\n
--1BEF0A57BE110FD467A--\r\n
Now let's understand what is written. :-) I deliberately highlighted some \r\n characters so that they do not merge with the data. If you look closely, you will notice the boundary field after Content-Type. This field specifies the section separator - the boundary. The boundary can be a string consisting of Latin letters and digits, as well as some other symbols (unfortunately, I don’t remember which ones). In the body of the request, “--” is added to the beginning of the boundary, and the request ends with the boundary with “--” added to its end as well. Our request has two sections: the first describes the login field, and the second describes the password field. Content-Disposition (the data type in the section) says that this is data from a form, and the name field specifies the name of the form field. This is where the section header ends; what follows is the section data area, in which the field value is placed (no need to encode the value!).
I would like to draw your attention to the fact that you do not need to use Content-Length in section headers, but in the request header you must, and its value is the size of the entire Entity-Body, which starts after the second \r\n following Content-Length: 209\r\n. That is, the Entity-Body is separated from the header by an additional line break (which can also be seen in the sections).
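The section layout just described can be reproduced with a short Python sketch. The helper build_multipart_body is hypothetical and handles only plain text fields; the length it computes depends on the exact spacing of the lines, so it need not coincide with the figure used in the article's example.

```python
def build_multipart_body(fields: dict, boundary: str) -> bytes:
    """Build a multipart/form-data Entity-Body from simple text fields."""
    parts = []
    for name, value in fields.items():
        # Each section starts with "--" followed by the boundary.
        parts.append(f"--{boundary}\r\n")
        parts.append(f'Content-Disposition: form-data; name="{name}"\r\n')
        # An empty line separates the section header from its data.
        parts.append("\r\n")
        # The value itself is written as is, without URL encoding.
        parts.append(f"{value}\r\n")
    # The closing boundary has "--" at both ends.
    parts.append(f"--{boundary}--\r\n")
    return "".join(parts).encode("utf-8")


boundary = "1BEF0A57BE110FD467A"
body = build_multipart_body(
    {"login": "Petya Vasechkin", "password": "qq"}, boundary
)
headers = (
    f"Content-Type: multipart/form-data; boundary={boundary}\r\n"
    f"Content-Length: {len(body)}\r\n"
)
print(headers)
print(body.decode("utf-8"))
```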
Now let's write a request to transfer a file.
POST http://www.site.ru/postnews.html HTTP/1.0\r\n
Host: www.site.ru\r\n
Referer: http://www.site.ru/news.html\r\n
Cookie: income=1\r\n
Content-Type: multipart/form-data; boundary=1BEF0A57BE110FD467A\r\n
Content-Length: 491\r\n
\r\n
--1BEF0A57BE110FD467A\r\n
Content-Disposition: form-data; name="news_header"\r\n
\r\n
Example news\r\n
--1BEF0A57BE110FD467A\r\n
Content-Disposition: form-data; name="news_file"; filename="news.txt"\r\n
Content-Type: application/octet-stream\r\n
Content-Transfer-Encoding: binary\r\n
\r\n
Here's the news,
which is in the file news.txt\r\n
--1BEF0A57BE110FD467A--\r\n
In this example the first section sends the news title, and the second section sends the news.txt file. If you are attentive, you will see the filename and Content-Type fields in the second section. The filename field specifies the name of the file being sent, and the Content-Type field specifies the type of this file. application/octet-stream indicates that this is a generic data stream, and Content-Transfer-Encoding: binary indicates that this is binary data, not encoded in any way.
A very important point. Most CGI scripts are written by smart people, so they like to check the type of the incoming file, which is given in Content-Type. Why? Most often, file uploads on websites are used to receive images from the visitor. The browser itself tries to determine what kind of file the visitor wants to send and inserts the appropriate Content-Type into the request. The script checks it upon receipt and, for example, if it is not a gif or jpeg, ignores the file. Therefore, when creating a request “manually”, take care that the Content-Type value matches the format of the transferred file as closely as possible.
image/gif for gif
image/jpeg for jpeg
image/png for png
image/tiff for tiff (which is used extremely rarely, the format is too bulky)
In our example, a request is generated that transfers a text file. A request for transferring a binary file is generated in the same way.
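Choosing the Content-Type and writing the file section can be sketched as follows. The table repeats the types listed above; the ".jpg" extension and the fallback to application/octet-stream are assumptions added for this example, and the helper names content_type_for and build_file_part are made up.

```python
import os

# Types from the list above; ".jpg" is an added assumption.
CONTENT_TYPES = {
    ".gif": "image/gif",
    ".jpg": "image/jpeg",
    ".jpeg": "image/jpeg",
    ".png": "image/png",
    ".tiff": "image/tiff",
}


def content_type_for(filename: str) -> str:
    """Pick a Content-Type by file extension, with a generic fallback."""
    ext = os.path.splitext(filename)[1].lower()
    return CONTENT_TYPES.get(ext, "application/octet-stream")


def build_file_part(field: str, filename: str, data: bytes, boundary: str) -> bytes:
    """Build one multipart section that carries a file."""
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        f"Content-Type: {content_type_for(filename)}\r\n"
        "Content-Transfer-Encoding: binary\r\n"
        "\r\n"
    ).encode("utf-8")
    # File bytes go in unchanged, followed by the line break
    # that precedes the next boundary.
    return head + data + b"\r\n"


part = build_file_part("news_file", "news.txt", b"Example file contents", "1BEF0A57BE110FD467A")
print(part.decode("utf-8"))
print(content_type_for("photo.JPG"))
```

Because the file bytes are appended without any transformation, the same function serves text and binary files alike.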
4. Postscript.
I think that it is not worth talking in detail about sending requests to the server. This is purely a matter of PHP technique :-). Just carefully read the section on functions for working with sockets, or on the functions of the CURL module, in the official PHP documentation.
From the above, I hope it is now clear why the question “How can I generate a POST request using the header function?” is meaningless. The header(string) function only adds an entry to the header of the response that the script sends; it cannot create a request body.