
Client-server application on a TCP stream socket

The following example uses TCP, which provides an ordered, reliable two-way byte stream. Let's build a complete application that includes a client and a server. First we demonstrate how to construct a server on TCP stream sockets, and then a client application to test our server.

The following program creates a server that receives connection requests from clients. The server is built synchronously, so the thread's execution is blocked until the server accepts a connection from the client. The application demonstrates a simple server responding to a client. The client ends the connection by sending a message to the server.

TCP Server

The creation of the server structure is shown in the following functional diagram:

Here is the complete code for the SocketServer.cs program:

// SocketServer.cs
using System;
using System.Text;
using System.Net;
using System.Net.Sockets;

namespace SocketServer
{
    class Program
    {
        static void Main(string[] args)
        {
            // Set the local endpoint for the socket
            IPHostEntry ipHost = Dns.GetHostEntry("localhost");
            IPAddress ipAddr = ipHost.AddressList[0];
            IPEndPoint ipEndPoint = new IPEndPoint(ipAddr, 11000);

            // Create a TCP/IP socket
            Socket sListener = new Socket(ipAddr.AddressFamily,
                SocketType.Stream, ProtocolType.Tcp);

            // Bind the socket to the local endpoint and listen for incoming connections
            try
            {
                sListener.Bind(ipEndPoint);
                sListener.Listen(10);

                // Start listening for connections
                while (true)
                {
                    Console.WriteLine("Waiting for a connection on port {0}", ipEndPoint);

                    // The program pauses, waiting for an incoming connection
                    Socket handler = sListener.Accept();
                    string data = null;

                    // A client has connected; receive its message
                    byte[] bytes = new byte[1024];
                    int bytesRec = handler.Receive(bytes);
                    data += Encoding.UTF8.GetString(bytes, 0, bytesRec);

                    // Show the received text on the console
                    Console.Write("Received text: " + data + "\n\n");

                    // Send an acknowledgment back to the client (text is illustrative)
                    byte[] msg = Encoding.UTF8.GetBytes(
                        "Thanks for the request of " + data.Length + " characters");
                    handler.Send(msg);

                    // The termination marker is assumed; the original listing's literal was lost
                    if (data.IndexOf("<TheEnd>") > -1)
                    {
                        Console.WriteLine("The server has terminated the connection with the client.");
                        break;
                    }

                    handler.Shutdown(SocketShutdown.Both);
                    handler.Close();
                }
            }
            catch (Exception ex)
            {
                Console.WriteLine(ex.ToString());
            }
            finally
            {
                Console.ReadLine();
            }
        }
    }
}

Let's look at the structure of this program.

The first step is to bind the socket to a local endpoint. Before opening a socket to listen for connections, you need to prepare a local endpoint address for it. A unique TCP/IP service address is determined by combining the host's IP address with the service's port number; together they form the service endpoint.

The Dns class provides methods that return information about the network addresses supported by a device on the local network. If a LAN device has more than one network address, the Dns class returns information about all network addresses, and the application must select the appropriate address to serve from the array.

Let's create an IPEndPoint for the server by combining the first IP address of the host computer, obtained from the Dns.GetHostEntry() method, with a port number:

IPHostEntry ipHost = Dns.GetHostEntry("localhost");
IPAddress ipAddr = ipHost.AddressList[0];
IPEndPoint ipEndPoint = new IPEndPoint(ipAddr, 11000);

Here the IPEndPoint class represents localhost on port 11000. Next, we create a stream socket with a new instance of the Socket class. Having set up a local endpoint to listen for connections, we can create a socket:

Socket sListener = new Socket(ipAddr.AddressFamily, SocketType.Stream, ProtocolType.Tcp);

The AddressFamily parameter specifies the addressing scheme that an instance of the Socket class uses to resolve an address.

The SocketType parameter distinguishes TCP sockets from UDP sockets. Among others, it can take the following values:

Dgram

Supports datagrams. The Dgram value requires Udp to be specified for the protocol type and InterNetwork in the address family parameter.

Raw

Supports access to the underlying transport protocol.

Stream

Supports stream sockets. The Stream value requires Tcp to be specified for the protocol type.

The third and final parameter specifies the protocol type required for the socket. The most important values of ProtocolType are Tcp, Udp, Ip, and Raw.

The next step is to assign a name to the socket using the Bind() method. When a socket is opened by the constructor, it has no name; only a handle is reserved. The Bind() method is called to assign a name to the server socket. For a client socket to be able to identify a TCP stream socket, the server program must give its socket a name:

sListener.Bind(ipEndPoint);

The Bind() method binds a socket to a local endpoint. The Bind() method must be called before any attempts to call the Listen() and Accept() methods.

Now that a socket has been created and a name associated with it, you can listen for incoming connections using the Listen() method. In the listening state, the socket waits for incoming connection attempts:

sListener.Listen(10);

The parameter defines the backlog, the maximum number of connections waiting in the queue. In the code above, the parameter value allows up to ten connections to accumulate in the queue.

In the listening state, you must be ready to accept a connection from the client, which is done with the Accept() method. This method obtains the client connection and completes the client-server name association. Accept() blocks the calling thread until a connection arrives.

The Accept() method removes the first connection request from the queue of pending requests and creates a new socket to process it. Although a new socket is created, the original socket continues to listen and, with multithreading, can be used to accept further connection requests from clients. The server application should not close the listening socket; it must keep working alongside the sockets created by Accept() to process incoming client requests.

while (true)
{
    Console.WriteLine("Waiting for a connection on port {0}", ipEndPoint);

    // The program pauses while waiting for an incoming connection
    Socket handler = sListener.Accept();

Once the client and server have established a connection, messages can be sent and received using the Send() and Receive() methods of the Socket class.

The Send() method writes outgoing data to the connected socket. The Receive() method reads incoming data from the stream socket. In a TCP-based system, a connection must be established between the sockets before Send() and Receive() are called. The exact protocol between the two communicating entities must be defined in advance, so that the client and server applications do not block each other, each waiting for the other to send data first.

When the data exchange between the server and the client is complete, close the connection using the Shutdown() and Close() methods:

handler.Shutdown(SocketShutdown.Both);
handler.Close();

SocketShutdown is an enumeration with three values: Both stops the socket from both sending and receiving data, Receive stops the socket from receiving data, and Send stops the socket from sending data.

The socket is closed by calling the Close() method, which also sets the Connected property of the socket to false.

TCP client

The functions that are used to create a client application are more or less similar to a server application. As with the server, the same methods are used to determine the endpoint, instantiate the socket, send and receive data, and close the socket.
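To complete the picture, here is a minimal sketch of such a client (SocketClient.cs). It is an illustration built from the server description above, not the original tutorial's listing: the message text and the "<TheEnd>" terminator are assumptions.

```csharp
// SocketClient.cs -- a minimal client sketch; the message text and the
// "<TheEnd>" terminator are illustrative assumptions.
using System;
using System.Text;
using System.Net;
using System.Net.Sockets;

namespace SocketClient
{
    class Program
    {
        static void Main(string[] args)
        {
            try
            {
                // Resolve the same endpoint the server listens on
                IPHostEntry ipHost = Dns.GetHostEntry("localhost");
                IPAddress ipAddr = ipHost.AddressList[0];
                IPEndPoint ipEndPoint = new IPEndPoint(ipAddr, 11000);

                Socket sender = new Socket(ipAddr.AddressFamily,
                    SocketType.Stream, ProtocolType.Tcp);

                // Connect() is the client-side counterpart of Bind()/Listen()/Accept()
                sender.Connect(ipEndPoint);
                Console.WriteLine("Socket connected to {0}", sender.RemoteEndPoint);

                // Send a message; the assumed "<TheEnd>" marker asks the server to stop
                byte[] msg = Encoding.UTF8.GetBytes("Hello, server!<TheEnd>");
                sender.Send(msg);

                // Receive the server's reply
                byte[] bytes = new byte[1024];
                int bytesRec = sender.Receive(bytes);
                Console.WriteLine("Server reply: {0}",
                    Encoding.UTF8.GetString(bytes, 0, bytesRec));

                sender.Shutdown(SocketShutdown.Both);
                sender.Close();
            }
            catch (Exception ex)
            {
                Console.WriteLine(ex.ToString());
            }
        }
    }
}
```

Run the server first, then the client: the server prints the received text and the client prints the reply.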

Traveling through network protocols.

TCP and UDP are both transport layer protocols. UDP is a connectionless protocol with non-guaranteed packet delivery. TCP (Transmission Control Protocol) is a connection-oriented protocol with guaranteed packet delivery. First there is a handshake (Hello. | Hello. | Let's chat? | Let's go.), after which the connection is considered established. Then packets are sent back and forth over this connection (a conversation is in progress), and it is checked whether the packet has reached the recipient. If the packet is lost, or arrived but with a broken checksum, then it is sent again (“repeat, I didn’t hear”). Thus, TCP is more reliable, but it is more complex from an implementation point of view and, accordingly, requires more clock cycles / memory, which is not the least important for microcontrollers. Examples of application protocols that use TCP include FTP, HTTP, SMTP, and many others.

TL;DR

HTTP (Hypertext Transfer Protocol) is an application protocol with which the server sends pages to our browser. HTTP is now widely used on the World Wide Web to retrieve information from websites. The picture shows a lamp on a microcontroller with an OS on board, in which colors are set via a browser.

The HTTP protocol is text-based and quite simple. Actually, this is what the GET method looks like, sent by the netcat utility to the local IPv6 address of the server with lights:

~$ nc fe80::200:e2ff:fe58:b66b%mazko 80
GET /b HTTP/1.0

HTTP Method is usually a short English word written in capital letters and is case sensitive. Every server must support at least the GET and HEAD methods. In addition to the GET and HEAD methods, the POST, PUT and DELETE methods are often used. The GET method is used to request the contents of a specified resource, in our case here GET /b HTTP/1.0 where the /b path is responsible for the color (blue). Server response:

HTTP/1.0 200 OK
Server: Contiki/2.4 http://www.sics.se/contiki/
Connection: close
Cache-Control: no-cache, no-store, must-revalidate
Pragma: no-cache
Expires: 0
Content-type: text/html

Contiki RGB

Red is OFF

Green is OFF

Blue is ON

The status code (we have 200) is part of the first line of the server response. It is a three-digit integer. The first digit indicates the class of the condition. The response code is usually followed by an explanatory phrase in English separated by a space, which explains to the person the reason for this particular response. In our case, the server worked without errors, everything was fine (OK).

Both the request and response contain headers (each line is a separate header field, the name-value pair is separated by a colon). The headers end with an empty line, after which data can follow.
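Schematically (values abbreviated, annotations added for illustration), the exchange above has this shape:

```text
GET /b HTTP/1.0          <- request line: method, path, protocol version
                         <- empty line: end of request headers

HTTP/1.0 200 OK          <- status line: version, code, reason phrase
Server: Contiki/2.4 ...  <- header field, "Name: value"
Connection: close
                         <- empty line: headers end, body follows
Contiki RGB ...          <- response body (HTML)
```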

My browser refuses to open the local IPv6 address, so an additional address is written in the microcontroller firmware and the same prefix also needs to be assigned to the virtual network interface of the simulator:

~$ sudo ip addr add abcd::1/64 dev mazko  # linux
~$ netsh interface ipv6 set address mazko abcd::1  # windows
~$ curl http://

TCP fits naturally into the client/server environment (see Fig. 10.1). A server application listens (listen) for incoming connection requests. For example, WWW, file transfer, or terminal access services listen for requests coming from clients. Communication in TCP is initiated by the client, which calls the appropriate routines to open a connection to the server (see Chapter 21 on the socket programming interface).

Fig. 10.1. The client calls the server.

In reality, the client may be another server. For example, mail servers can connect to other mail servers to send email messages between computers.

10.2 TCP concepts

In what form should applications send data in TCP? How does TCP transmit data to IP? How do the sending and receiving TCP protocols identify the connection between applications and the data elements required to implement it? All of these questions are answered in the following sections, which describe the basic concepts of TCP.

10.2.1 Input and output data streams

Conceptually, the connection model involves an application forwarding a data stream to a peer application while simultaneously receiving a data stream from its connection partner. TCP provides a full-duplex mode of operation, in which two data streams are serviced at the same time (see Fig. 10.2).


Fig. 10.2. Applications exchange data streams.

10.2.2 Segments

TCP can convert the data stream leaving an application into a form suitable for being stored in datagrams. How?

The application sends data to TCP, which places it in the output (send) buffer. TCP then cuts chunks of data from the buffer and sends them, adding a header; this creates segments. Fig. 10.3 shows how data from the TCP output buffer is packed into segments. TCP passes each segment to IP for delivery as a separate datagram. Packing data into chunks of the right length ensures efficient forwarding, so TCP waits until an appropriate amount of data accumulates in the output buffer before creating a segment.


Fig. 10.3. Creating a TCP segment

10.2.3 Pushing

However, waiting for large blocks of data to accumulate is often unacceptable in real applications. For example, when an end-user client program initiates an interactive session with a remote server, the user then enters only commands (each followed by pressing Return).

The user's client program needs TCP to know that the data must be sent to the remote host, and sent immediately. The push operation is used for this.

If you look at the operations in an interactive session, you will find many segments with little data, and moreover, a push in almost every data segment. However, pushing should not be used during file transfers (except for the very last segment), so that TCP can pack the data into segments most efficiently.

10.2.4 Urgent data

The application's data forwarding model assumes an ordered stream of bytes traveling to the destination. Referring again to the interactive session example, suppose the user presses the attention or break key. The remote application must be able to skip over the intervening bytes and respond to the keystroke as quickly as possible.

The urgent data mechanism marks special information in a segment as urgent. With it, TCP tells its peer that the segment contains urgent data and can indicate where that data is located. The partner must forward this information to the destination application as soon as possible.

10.2.5 Application ports

The client must identify the service it wants to access. This is done through the specification of the host service's IP address and its TCP port number. Like UDP, TCP port numbers range from 0 to 65535. Ports in the range 0 to 1023 are called well-known and are used to access standard services.

Several examples of well-known ports and their corresponding applications are shown in Table 10.1. The Discard (port 9) and Chargen (port 19) services are TCP versions of services already familiar to us from UDP. Keep in mind that traffic to TCP port 9 is completely separate from traffic to UDP port 9.


Table 10.1 Well-Known TCP Ports and Their Corresponding Applications

Port Application Description
9 Discard Cancel all incoming data
19 Chargen Character generator. Character stream exchange
20 FTP-Data FTP data forwarding port
21 FTP Port for FTP dialogue
23 TELNET Port for remote registration via Telnet
25 SMTP SMTP protocol port
110 POP3 Mail retrieval service for personal computers
119 NNTP Access to online news

What about the ports used by clients? A client rarely works through a well-known port. Instead, when it wants to open a connection, it asks the operating system to assign it an unused, unreserved port. At the end of the connection, the client returns this port, after which it can be reused by another client. Since there are more than 63,000 TCP ports in the unreserved number pool, client port limits can be ignored.

10.2.6 Socket addresses

As we already know, the combination of an IP address and a port used for communication is called a socket address. A TCP connection is completely identified by the socket addresses at its two ends. Fig. 10.4 shows the connection between a client with socket address (128.36.1.24, port = 3358) and a server with socket address (130.42.88.22, port = 21).

Fig. 10.4. Socket addresses

The header of each datagram contains the source and destination IP addresses. You will see later that the source and destination port numbers are specified in the TCP segment header.

Typically, a server manages multiple clients simultaneously. The server's single socket address is used for all of its clients at once (see Fig. 10.5).


Fig. 10.5. Multiple clients connected to the server's socket address

Because each datagram carries a TCP segment whose connection is identified by the IP addresses and ports, it is easy for a server to keep track of multiple client connections.

10.3 TCP Reliability Mechanism

In this section, we will look at the TCP mechanism used to reliably deliver data while preserving the order of transmission and avoiding loss or duplication.

10.3.1 Numbering and confirmation

TCP uses numbering and acknowledgment (ACK) to ensure reliable data transfer. The TCP numbering scheme is somewhat unusual: every octet forwarded over the connection is considered to have a sequence number. The TCP segment header contains the sequence number of the first octet of data in that segment.

The receiver is required to acknowledge that the data has been received. If the ACK does not arrive within the timeout interval, the data is retransmitted. This method is called positive acknowledgment with retransmission.

The recipient of TCP data strictly checks the incoming sequence numbers to verify that the data is received in sequence and that there are no missing parts. Since the ACK may be randomly lost or delayed, duplicate segments may arrive at the recipient. Sequence numbers allow you to identify duplicate data, which is then discarded.

Fig. 10.6 shows a simplified view of timeout and retransmission in TCP.


Fig. 10.6. Timeout and retransmission in TCP

10.3.2 Port, sequence and ACK fields in the TCP header

As shown in Fig. 10.7, the first few fields of the TCP header provide space for the source and destination port values, the sequence number of the first byte of the enclosed data, and an ACK equal to the sequence number of the next byte expected from the other end. In other words, if TCP has received all bytes up through 30 from its peer, this field will have the value 31, indicating the byte expected next.


Fig. 10.7. Initial values in TCP header fields

One small detail should not be overlooked. Suppose that TCP has sent bytes 1 through 50 and has no more data to send. If data is then received from the peer, TCP must acknowledge it by sending a header with no data attached. Naturally, this header carries an ACK value. Its sequence field contains 51, i.e., the number of the next byte that TCP intends to send. When TCP later sends new data, that new TCP header will also have a sequence field value of 51.
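This bookkeeping can be sketched with two tiny helper functions (the names NextSeq and AckFor are illustrative, not part of any TCP API):

```csharp
using System;

class SeqAckDemo
{
    // Sequence field of a pure-ACK header: the next byte we intend to send
    public static long NextSeq(long lastByteSent) => lastByteSent + 1;

    // ACK field: the next byte we expect from the peer
    public static long AckFor(long lastByteReceived) => lastByteReceived + 1;

    static void Main()
    {
        // We have sent bytes 1..50 and received bytes 1..30 from the peer
        Console.WriteLine(NextSeq(50));   // 51: sequence field of the ACK-only header
        Console.WriteLine(AckFor(30));    // 31: ACK field
    }
}
```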

10.4 Establishing a connection

How do the two applications connect to each other? Before communication, each of them calls a subroutine to form a memory block that will be used to store the TCP and IP parameters of a given connection, for example, socket addresses, current sequence number, initial lifetime value, etc.

The server application waits for a client which, wanting to gain access to the server, issues a connect request identifying the server's IP address and port.

There is one technical detail. Each side begins numbering its bytes not from one, but from a random initial sequence number (we will find out later why this is done). The original specification advises generating the initial sequence number from a 32-bit external timer that increments approximately every 4 µs.

10.4.1 Connection scenario

The connection procedure is often called a three-way handshake, because three messages are exchanged to establish the connection: SYN, SYN/ACK, and ACK.

During connection establishment, partners exchange three important pieces of information:

1. Buffer space for receiving data

2. Maximum amount of data carried in an incoming segment

3. Starting sequence number used for outgoing data

Note that each party uses operations 1 and 2 to indicate the limits within which the other party will act. A personal computer may have a small receive buffer, but a supercomputer may have a huge buffer. The memory structure of a personal computer may limit incoming data chunks to 1 KB, but a supercomputer manages larger segments.

The ability to control how the other side sends data is an important feature that makes TCP/IP scalable.

Fig. 10.8 shows an example connection scenario. Deliberately simple initial sequence numbers are used so as not to clutter the drawing. Note that in this figure the client can receive larger segments than the server.


Fig. 10.8. Establishing a connection

The following operations are performed:

1. The server initializes and becomes ready to connect with clients (this state is called passive open).

2. The client requests TCP to open a connection to the server at the specified IP address and port (this state is called active open).

3. The client TCP chooses an initial sequence number (1000 in this example) and sends a synchronize segment (SYN). The segment carries the sequence number, the size of the receive window (4 KB), and the size of the largest segment the client can receive (1460 bytes).

4. When the SYN arrives, the server TCP chooses its own initial sequence number (3000). It sends a SYN segment containing that initial sequence number (3000), ACK 1001 (meaning the first data byte sent by the client should be numbered 1001), the receive window size (4 KB), and the size of the largest segment the server can receive (1024 bytes).

5. Client TCP, having received a SYN/ACK message from the server, sends back ACK 3001 (the first byte of data sent by the server should be numbered 3001).

6. The client TCP instructs its application to open the connection.

7. The server TCP, having received an ACK message from the client TCP, informs its application about opening the connection.

The client and server announce their rules for the received data, synchronize their sequence numbers and become ready to exchange data. The TCP specification also allows for another scenario (not very successful), when peer applications simultaneously actively open each other.

10.4.2 Setting IP parameter values

An application request to establish a connection may also specify parameters for the IP datagrams that will carry the connection data. If a specific parameter value is not specified, the default value is used.

For example, an application can select a desired value for IP priority or service type. Since each of the connected parties independently sets its own priority and type of service, in theory these values can differ for the two directions of data flow. As a rule, in practice the same values are used for each direction of exchange.

When an application involves government or military security options, each endpoint of the connection must use the same security levels or the connection will not be established.

10.5 Data transfer

Data transfer begins after the three-way handshake that creates the connection is complete (see Fig. 10.9). The TCP standard allows normal data to be included in acknowledgment segments, but that data will not be delivered to the application until the connection setup is complete. To simplify the numbering, 1000-byte messages are used. Each TCP segment header has an ACK field identifying the byte sequence number expected next from the connection partner.


Fig. 10.9. Simple data exchange and ACK flow

The first segment sent by the client contains bytes 1001 to 2000. Its ACK field should contain the value 3001, which indicates the byte sequence number that is expected to be received from the server.

The server responds to the client with a segment containing 1000 bytes of data (starting with number 3001). Its TCP header ACK field will indicate that bytes 1001 through 2000 have already been successfully received, so the next segment sequence number expected from the client should be 2001.

The client then sends segments starting with bytes 2001, 3001, and 4001 in the specified sequence. Note that the client does not wait for an ACK after each of the segments sent. The data is sent to the peer until its buffer space is full (we will see below that the recipient can very precisely specify the amount of data sent to him).

The server saves connection bandwidth by using a single ACK to indicate success in forwarding all segments.

Fig. 10.10 shows data transfer when the first segment is lost. When the timeout expires, the segment is retransmitted. Note that upon receiving the lost segment, the receiver sends a single ACK confirming the delivery of both segments.


Fig. 10.10. Data loss and retransmission

10.6 Closing a connection

Normal termination of a connection is performed with the same three-way handshake procedure as opening one. Either party can start closing the connection, following this scenario:

A: "I'm done. There's no more data to send."

B: "Fine."

B: "I finished the job too."

A: "Fine."

Let's assume the following scenario (although it is used extremely rarely):

A: "I'm done. There's no more data to send."

B: "Okay. However, there is some data..."

B: "I finished the job too."

A: "Fine."

In the example below, the connection is closed by the server, as is often the case in client/server interactions. Here, after the user enters the logout command in a telnet session, the server initiates the request to close the connection. In the situation shown in Fig. 10.11, the following actions are performed:

1. The application on the server tells TCP to close the connection.

2. The TCP server sends a final segment (Final Segment - FIN), informing its peer that there is no more data to send.

3. The client's TCP sends an ACK for the FIN segment.

4. The client's TCP tells its application that the server wants to close the connection.

5. The client application tells its TCP to close the connection.

6. The TCP client sends a FIN message.

7. The TCP server receives the FIN from the client and responds with an ACK message.

8. The server's TCP tells its application to close the connection.


Fig. 10.11. Closing a connection

Both parties can begin closing at the same time. In this case, the normal connection closure is completed after each peer sends an ACK message.

10.6.1 Abrupt termination

Each party can request an abrupt termination (abrupt close) of the connection. This is acceptable when an application wishes to terminate a connection or when TCP detects a serious communication problem that it cannot resolve by its own means. Abrupt termination is requested by sending one or more reset messages to the peer, indicated by a specific flag in the TCP header.

10.7 Flow control

The receiving TCP absorbs the incoming data stream and determines how much information it can accept. This limit constrains the sending TCP. The following explanation of the mechanism is conceptual; developers may implement it differently in their products.

During connection setup, each peer allocates space for the connection's input buffer and notifies the other party about it. Typically, the buffer size is expressed as an integer number of the maximum segment sizes.

The data stream enters the input buffer and is stored there before being forwarded to the application (identified by the TCP port). Fig. 10.12 shows an input buffer that can accept 4 KB.


Fig. 10.12. Input buffer receive window

The buffer space is filled as data arrives. When the receiving application retrieves data from the buffer, the freed space becomes available for new incoming data.

10.7.1 Receiving window

The receive window is whatever space in the input buffer is not already occupied by data. Data remains in the input buffer until the target application retrieves it. Why doesn't the application collect the data immediately?

A simple scenario will help answer this question. Suppose a client has sent a file to an FTP server running on a very busy multi-user computer. The FTP program must read the data from the buffer and write it to disk. While the server performs disk I/O, the program waits for those operations to complete. Meanwhile, another program may start (for example, on a schedule), and by the time the FTP program runs again, the next data will already have arrived in the buffer.

The receive window extends from the last acknowledged byte to the end of the buffer. In Fig. 10.12, first the entire buffer is available and, therefore, a 4 KB receive window is available. When the first KB arrives, the receive window will be reduced to 3 KB (for simplicity, we will assume that each segment is 1 KB in size, although in practice this value varies depending on the needs of the application). The arrival of the next two 1 KB segments will reduce the receive window to 1 KB.

Each ACK sent by the receiver contains information about the current state of the receiving window, depending on which the data flow from the source is regulated.

For the most part, the size of the input buffer is set when the connection is started, although the TCP standard does not specify how to manage this buffer. The input buffer can be increased or decreased by providing feedback to the sender.

What happens if an arriving segment fits in the receive window but arrives out of order? Practically all implementations store data arriving out of order in the receive window and send an acknowledgment (ACK) only for a whole contiguous block of segments. This is the correct approach, because otherwise discarding data that arrives out of order would significantly reduce performance.
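A simplified sketch of this reassembly bookkeeping (not a real TCP implementation; OnSegment and the variable names are invented for illustration): buffer out-of-order segments by their first sequence number and acknowledge only the end of the contiguous prefix.

```csharp
using System;
using System.Collections.Generic;

class ReassemblyDemo
{
    // Buffered out-of-order segments: first byte number -> segment length
    static readonly SortedDictionary<long, int> buffered =
        new SortedDictionary<long, int>();
    static long nextExpected = 1001;   // next contiguous byte we are waiting for

    // Returns the ACK value to send after this segment arrives
    public static long OnSegment(long firstByte, int length)
    {
        buffered[firstByte] = length;

        // Advance over whatever has now become contiguous
        while (buffered.TryGetValue(nextExpected, out int len))
        {
            buffered.Remove(nextExpected);
            nextExpected += len;
        }
        return nextExpected;   // ACK: next byte expected
    }

    static void Main()
    {
        Console.WriteLine(OnSegment(2001, 1000)); // 1001: out of order, ACK unchanged
        Console.WriteLine(OnSegment(1001, 1000)); // 3001: gap filled, ACK covers both
    }
}
```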

10.7.2 Send window

The system transmitting data must keep track of two things: how much data has already been sent and acknowledged, and the current size of the receiver's receive window. The active send space extends from the first unacknowledged octet to the right edge of the current receive window. The part of the window still unused for sending indicates how much additional data can be sent to the partner.

The initial sequence number and the initial receive window size are specified during connection setup. Fig. 10.13 illustrates some features of the data transfer mechanism.

1. The sender starts with a 4 KB sending window.

2. The sender sends 1 KB. A copy of this data is retained until an acknowledgment (ACK) is received, as it may need to be retransmitted.

3. The ACK for the first KB arrives, and the next 2 KB of data are sent. The result is shown in the third part from the top of Fig. 10.13. A copy of the 2 KB is again retained.

4. Finally, an ACK arrives for all the transmitted data (i.e., it has all been received by the receiver). The ACK restores the send window to 4 KB.

Fig. 10.13. Send window
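The window arithmetic in these steps can be sketched as follows (Usable is an illustrative helper, not a real API; sizes in bytes):

```csharp
using System;

class SendWindowDemo
{
    // Usable send space: advertised receive window minus bytes in flight
    public static long Usable(long window, long unacked) => window - unacked;

    static void Main()
    {
        long window = 4096;   // receive window advertised by the peer
        long unacked = 0;     // bytes sent but not yet acknowledged

        Console.WriteLine(Usable(window, unacked));   // 4096: full 4 KB window

        unacked += 1024;      // send 1 KB, keep a copy until it is ACKed
        Console.WriteLine(Usable(window, unacked));   // 3072 left

        unacked -= 1024;      // ACK for the first KB frees the copy
        unacked += 2048;      // send the next 2 KB
        Console.WriteLine(Usable(window, unacked));   // 2048 left

        unacked -= 2048;      // ACK for everything transmitted
        Console.WriteLine(Usable(window, unacked));   // 4096: window restored
    }
}
```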

There are several interesting features worth pointing out:

■ The sender does not wait for an ACK for each data segment it sends. The only restriction on transmission is the receive window announced by the receiver (for example, the sender here may have at most 4 KB of unacknowledged data outstanding).

■ Suppose the sender sends data in several very short segments (for example, 80 bytes). In this case, the data can be reformatted for more efficient transmission (for example, into a single segment).

10.8 TCP header

Fig. 10.14 shows the segment format (TCP header and data). The header begins with the source and destination port IDs. The next field, the sequence number, indicates the position in the outgoing data stream that this segment occupies. The ACK (acknowledgment) field contains the sequence number of the next octet expected in the incoming data stream.


Fig. 10.14. TCP segment

There are six flags: URG, ACK, PSH, RST, SYN, and FIN.

The Data Offset field contains the size of the TCP header in 32-bit words. The TCP header must end on a 32-bit boundary.

10.8.1 Maximum segment size option

The maximum segment size option (maximum segment size - MSS) is used to declare the largest piece of data that the system can accept and process. The name is somewhat inaccurate, however. In TCP, a segment is normally understood as the header plus the data, yet the maximum segment size is defined as:

The size of the largest datagram that can be received − 40

In other words, the MSS reflects the largest payload the receiver can handle when the TCP and IP headers are 20 bytes each. If additional options are present, their length must be subtracted. The amount of data that can actually be sent in a segment is therefore:

Advertised MSS + 40 − (sum of the actual TCP and IP header lengths)

Peers typically exchange MSS values in the initial SYN messages when opening a connection. If a system does not advertise a maximum segment size, a default of 536 bytes is assumed.

The maximum segment size option is encoded as a 2-byte preamble followed by a 2-byte value, so the largest expressible value is 2^16 − 1 (65,535 bytes).

The MSS places a hard upper bound on the data sent in a TCP segment: the receiver cannot process anything larger. In practice, however, the sender usually uses smaller segments, because the path MTU discovered for the connection also constrains segment size.
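The arithmetic above is easy to check. A small helper, hypothetical but following the formula as stated, computes the usable payload from an advertised MSS and the actual header lengths:

```python
def max_payload(advertised_mss, ip_header_len=20, tcp_header_len=20):
    """Data that fits in one segment, per the formula above:
    MSS + 40 - (sum of the actual TCP and IP header lengths)."""
    return advertised_mss + 40 - (ip_header_len + tcp_header_len)

# Minimal 20-byte headers: the whole advertised MSS is available for data
print(max_payload(1460))                      # -> 1460
# A 12-byte TCP option shrinks the payload accordingly
print(max_payload(1460, tcp_header_len=32))   # -> 1448
print(max_payload(536))                       # -> 536 (the default MSS)
```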

10.8.2 Using header fields in a connection request

The first segment sent to open a connection has the SYN flag set to 1 and the ACK flag set to 0. The initial SYN is the only segment whose acknowledgment field is 0. Note that security tools exploit this property to identify incoming requests for a TCP session.

The sequence number field contains the initial sequence number, and the window field contains the initial receive window size. The only TCP option currently defined is the maximum segment size (536 bytes by default when not stated) that TCP expects to receive. The MSS option occupies 32 bits and usually appears in the options field (Option) of the connection request, so a TCP header carrying the MSS option is 24 bytes long.

10.8.3 Using header fields in a connection request response

In a response granting a connection request, both flags (SYN and ACK) are set to 1. The responding system indicates its own initial sequence number in the sequence number field and its receive window size in the window field (Window). The maximum segment size that the responder wishes to use usually appears in the options of the response; it may differ from the requester's value, i.e. the two directions may use different sizes.

A connection request can be rejected by specifying a reset flag (RST) with a value of 1 in the response.

10.8.4 Selecting the starting sequence number

The TCP specification requires that during connection setup each party choose an initial sequence number (based on the current value of a 32-bit internal timer). Why is this done?

Imagine what happens when a system crashes. Suppose a user had opened a connection just before the crash and sent a small amount of data. After recovery, the system remembers nothing of what it did before the crash, including the connections already under way and the port numbers assigned to them. The user re-establishes the connection. The new port numbers do not match the original assignments, and some may already be in use by other connections established a few seconds before the crash.

As a result, the party at the other end of the connection may never learn that its partner crashed and then recovered. This leads to serious trouble, particularly when old data lingers in the network long enough to mix with data from the newly created connection. Choosing the initial sequence number from a timer gives each new connection a fresh start and avoids such problems: the old data carries numbers outside the new connection's sequence number range. Hackers spoofing the source IP address of a trusted host attempt to gain access by predicting the initial sequence number in their messages. A cryptographic hash function keyed with internal secrets provides a better method of choosing secure initial numbers.
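A keyed-hash scheme of the kind just mentioned can be sketched as follows (this mirrors the approach later standardized in RFC 6528; the secret value and the timer tick rate are assumptions for illustration):

```python
import hashlib
import time

# The per-host secret is an assumption for this sketch.
SECRET = b"per-host-secret"

def initial_sequence_number(src_ip, src_port, dst_ip, dst_port):
    """ISN = 32-bit timer + keyed hash of the connection 4-tuple.

    The timer term gives every new connection a fresh numbering range;
    the hash term makes the number unpredictable to an off-path attacker.
    """
    timer = int(time.monotonic() * 250_000) & 0xFFFFFFFF   # ~4-microsecond tick
    material = f"{src_ip}:{src_port}:{dst_ip}:{dst_port}".encode() + SECRET
    offset = int.from_bytes(hashlib.sha256(material).digest()[:4], "big")
    return (timer + offset) & 0xFFFFFFFF

isn = initial_sequence_number("192.0.2.1", 1025, "192.0.2.99", 23)
```

Because the hash offset differs per 4-tuple, knowing the ISN of one connection reveals nothing about another's.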

10.8.5 Common uses of fields

When preparing a TCP header for transmission, the sequence number of the first octet of the transmitted data is placed in the sequence number field (Sequence Number).

The number of the next octet expected from the connection partner is entered in the acknowledgment field (Acknowledgment Number) when the ACK bit is set to 1. The window field (Window) carries the current receive window size: the number of bytes, counted from the acknowledgment number, that the receiver can accept. This value permits precise flow control; through it, the peer reports the actual state of its receive window throughout the session.

If an application requests a push operation, TCP sets the PUSH flag to 1. The receiving TCP must respond by delivering the data to its application promptly, as the sender intended.

The URGENT flag, when set to 1, signals an urgent data transfer; the accompanying urgent pointer must refer to the last octet of the urgent data. A typical use of urgent data is sending break or abort signals from a terminal.

Urgent data is often called out-of-band information. The term is imprecise, however: urgent data is sent within the normal TCP stream, although individual implementations may provide special mechanisms to tell an application that urgent data has arrived, so that the application can examine it before all the preceding bytes of the stream have been consumed.

The RESET flag is set to 1 to abort a connection abnormally. It is also set in the response to a segment that is not associated with any current TCP connection.

The FIN flag is set to 1 for connection close messages.


10.8.6 Checksum

The IP checksum covers only the IP header, whereas the TCP checksum is computed over the entire segment together with a pseudo-header built from fields of the IP header. During the computation, the TCP checksum field itself is set to 0. Fig. 10.15 shows the pseudo-header, which closely resembles the one used in the UDP checksum.


Fig. 10.15. The pseudo-header included in the TCP checksum computation

The TCP length is the TCP header length plus the data length. Unlike in UDP, the TCP checksum is mandatory. The receiver first computes the checksum of each arriving segment and compares it with the value in the TCP header's checksum field; if they do not match, the segment is discarded.
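The computation just described, zeroing the checksum field, prefixing the pseudo-header, and taking the one's-complement sum, can be sketched in Python. A receiver that recomputes the sum over a segment containing a valid checksum gets 0.

```python
import socket
import struct

def internet_checksum(data: bytes) -> int:
    """One's-complement sum of 16-bit words, complemented."""
    if len(data) % 2:
        data += b"\x00"                        # pad to a 16-bit boundary
    total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)   # fold carries back in
    return ~total & 0xFFFF

def tcp_checksum(src_ip: str, dst_ip: str, segment: bytes) -> int:
    """Checksum over the pseudo-header plus the whole TCP segment.
    When computing a fresh checksum, the segment's own checksum field
    (bytes 16-17 of the header) must already be zero."""
    pseudo = struct.pack("!4s4sBBH",
                         socket.inet_aton(src_ip), socket.inet_aton(dst_ip),
                         0, 6, len(segment))   # zero byte, protocol 6 = TCP, TCP length
    return internet_checksum(pseudo + segment)

header = bytes(20)                             # a zeroed 20-byte TCP header as the segment
csum = tcp_checksum("192.0.2.1", "192.0.2.2", header)
stamped = header[:16] + csum.to_bytes(2, "big") + header[18:]
print(tcp_checksum("192.0.2.1", "192.0.2.2", stamped))   # -> 0
```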

10.9 TCP Segment Example

Fig. 10.16 shows a sequence of TCP segments captured by the Sniffer protocol analyzer from Network General. The first three segments establish the connection between a Telnet client and server. The last segment carries 12 bytes of data.


Fig. 10.16. TCP header display in the Sniffer analyzer

The Sniffer analyzer converts most values to decimal form, but flag values are shown in hexadecimal. A flags value of 12 (hexadecimal) corresponds to the bit pattern 010010, i.e. the ACK and SYN flags are set. The checksum is also shown in hexadecimal.

10.10 Session support

10.10.1 Window probing

A fast sender and a slow receiver can drive the receive window down to 0 bytes; this is called closing the window (close window). When free space appears, an ACK is used to update the window size. But if that ACK is lost, both parties could wait indefinitely.

To avoid this situation, the sender starts a persist timer when the window closes. The timer's initial value is the retransmission timeout. When the timer expires, a window probe segment (window probe; some implementations include data in it as well) is sent to the partner. The probe prompts the peer to send back an ACK reporting the current window state.

If the window is still zero, the persist timer value is doubled. This process repeats until the timer reaches a maximum of 60 seconds; TCP then continues sending probes every 60 seconds until the window opens, the user terminates the process, or the application times out.
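The doubling schedule of the persist timer can be sketched as a generator; the 1.5-second starting value is only an example of a retransmission timeout.

```python
import itertools

def persist_timer_intervals(initial_timeout, max_timeout=60.0):
    """Delays between successive window probes: the persist timer starts
    at the retransmission timeout and doubles up to a 60-second ceiling."""
    t = initial_timeout
    while True:
        yield min(t, max_timeout)
        t = min(t * 2, max_timeout)

print(list(itertools.islice(persist_timer_intervals(1.5), 8)))
# -> [1.5, 3.0, 6.0, 12.0, 24.0, 48.0, 60.0, 60.0]
```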

10.11 Ending a session

10.11.1 Time-out

A connection partner may crash, or the path between the partners may be severed by a failed gateway or link. Several mechanisms keep TCP from retransmitting data forever.

When TCP reaches its first retransmission threshold, it tells IP to check for a failed router and informs the application that there is a problem. TCP keeps retransmitting the data until a second threshold is reached, and only then releases the connection.

Of course, an ICMP message may arrive before then, reporting that the destination is unreachable for some reason. In some implementations TCP will nevertheless keep trying to reach the destination until the timeout interval expires (the problem may be fixed in the meantime); only then is the application informed that the destination is unreachable.

An application can set its own timeout for data delivery and perform its own operations when this interval ends. Usually the connection is terminated.

10.11.2 Maintaining a connection

When a connection has no data to forward for a long time, it becomes idle. During such a period of inactivity the network may fail or physical links may be cut; as soon as the network is operational again, the partners resume exchanging data without the session being interrupted. This strategy satisfied the original requirements of the U.S. Department of Defense.

However, every connection, active or idle, occupies a good deal of computer memory, and some administrators need to reclaim unused resources. Many TCP implementations can therefore send a keep-alive message to test idle connections. Such messages are periodically sent to the partner to verify that it still exists on the network; ACK messages should come back in response. Keep-alive messages are optional, and when a system supports them an application can override the mechanism with its own. The default keep-alive timeout period is a full two hours!

Recall that an application can set its own timer and decide at its own level when to terminate the connection.

10.12 Performance

How efficient is TCP? Many factors affect performance, the chief ones being memory and bandwidth (see Fig. 10.17).


Fig. 10.17. TCP performance factors

The bandwidth and latency of the underlying physical network significantly limit throughput. Poor transmission quality results in many discarded datagrams, which forces retransmissions and thereby reduces effective bandwidth.

The receiving end must provide enough buffer space to let the sender transmit without pausing. This is especially important on high-latency networks, where a long interval passes between sending data and receiving the ACK (and during window size negotiation). To maintain a steady flow of data from the source, the receive window must be at least the product of bandwidth and delay.

For example, if the source can send data at a rate of 10,000 bytes/s, and the ACK takes 2 seconds to return, then the other side must provide a receiving window of at least 20,000 bytes in size, otherwise the data flow will not be continuous. A 10,000 byte receive buffer will cut the throughput in half.
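The bandwidth-delay calculation in this example is a one-liner:

```python
def min_receive_window(bandwidth_bytes_per_s, round_trip_s):
    """Bandwidth x delay product: the smallest receive window that keeps
    the sender transmitting without pauses."""
    return bandwidth_bytes_per_s * round_trip_s

# The example from the text: 10,000 bytes/s and a 2-second ACK round trip
print(min_receive_window(10_000, 2))   # -> 20000
```

A 10,000-byte buffer covers only half of this product, which is why it halves the throughput.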

Another important performance factor is the host's ability to react to high-priority events and to switch context quickly, i.e., to suspend one activity and pick up another. A host may simultaneously serve interactive local users, background batch processes, and dozens of communication connections. Context switching lets it handle all of these while hiding its own load from users; implementations that integrate TCP/IP with the operating system kernel can significantly reduce context-switching overhead.

Computer CPU resources are required for TCP header processing. If the processor cannot compute checksums quickly, overall network transfer speed suffers.

In addition, developers should make TCP parameters easy to configure, so that a network administrator can tune them to local conditions. For example, being able to match buffer sizes to the network's bandwidth and delay dramatically improves performance. Unfortunately, many implementations neglect this and hard-code the communication parameters.

Suppose the network environment is perfect: resources are plentiful and context switches happen faster than a gunslinger draws his revolver. Will performance then be excellent?

Not always. The quality of the TCP software matters too. Over the years, many performance problems have been diagnosed and fixed in various TCP implementations. The best software is arguably that which conforms to RFC 1122, which defines the communication-layer requirements for Internet hosts.

Equally important is avoiding known pitfalls and applying the algorithms of Jacobson, Karn, and Partridge (these interesting algorithms are discussed below).

Software developers can gain significant benefits by creating programs that eliminate unnecessary transfers of small amounts of data and have built-in timers to free network resources that are not currently in use.

10.13 Algorithms for improving performance

Turning now to a rather complex part of TCP, we examine the mechanisms that increase performance and deal with reduced throughput. This section discusses the following topics:

■ Slow start (slow start) keeps a new session from immediately claiming a large share of network bandwidth, which the network might otherwise have to throw away.

■ The cure for silly window syndrome (silly window syndrome) prevents poorly designed applications from flooding the network with tiny messages.

■ Delayed ACK (delayed ACK) reduces congestion by cutting the number of stand-alone acknowledgment messages.

■ The computed retransmission timeout (computing retransmission timeout) is based on measurement of the actual round-trip time within the session, reducing needless retransmissions without delaying the exchanges that genuinely require them.

■ Throttling TCP transmission when the network is congested lets routers recover and share capacity among all sessions.

■ Sending duplicate ACKs (duplicate ACK) on receipt of an out-of-sequence segment lets peers retransmit before a timeout occurs.

10.13.1 Slow start

If you switch on every electrical appliance in the house at once, you overload the wiring. In computer networks, slow start keeps the fuses from blowing.

A new connection that instantly pours a large volume of data into an already busy network can cause trouble. The idea of slow start is to let a new connection get under way while slowly increasing its transmission rate in line with the actual network load. The sender is limited by a congestion window, not by the (larger) receive window.

The congestion window (congestion window) starts at a size of 1 segment. For each segment successfully ACKed, it grows by 1 segment, as long as it remains smaller than the receive window. If the network is not congested, the congestion window gradually reaches the size of the receive window; under normal transmission conditions the two windows then coincide.

Note that slow start is not all that slow. After the first ACK the congestion window is 2 segments; once ACKs arrive for those two, it can grow to 4, and then to 8. In other words, the window grows exponentially.

Suppose a timeout occurs instead of an ACK arriving. The behavior of the congestion window in that case is discussed below.
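The exponential growth can be simulated per round trip; a simplification here is that every segment in the window is ACKed before the next round begins.

```python
def slow_start(receive_window, round_trips):
    """Congestion-window growth in segments: +1 per ACKed segment,
    which doubles the window each round trip, capped by the receive window."""
    cwnd, history = 1, []
    for _ in range(round_trips):
        history.append(cwnd)
        cwnd = min(cwnd + cwnd, receive_window)   # every in-flight segment is ACKed
    return history

print(slow_start(16, 5))   # -> [1, 2, 4, 8, 16]
```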

10.13.2 Silly window syndrome

Early TCP/IP implementers encountered a frequently arising phenomenon called silly window syndrome (Silly Window Syndrome - SWS). To understand it, consider the following scenario, which leads to undesirable consequences yet is entirely possible:

1. The sending application sends data quickly.

2. The receiving application reads data from the input buffer 1 byte at a time (i.e., slowly).

3. The input buffer quickly fills up.

4. The receiving application reads 1 byte and TCP sends an ACK meaning “I have free space for 1 byte of data.”

5. The sending application sends a 1-byte TCP packet over the network.

6. The receiving TCP sends an ACK meaning "Thank you. I received the packet and have no more free space."

7. The receiving application again reads 1 byte and sends an ACK, and the whole process repeats.

The slow receiving application thus keeps the sender waiting, while each byte read merely nudges the left edge of the window, a completely useless activity that generates extra traffic on the network.

Real situations are, of course, not this extreme. A fast sender and a slow receiver will exchange small (relative to the maximum segment size) chunks of data over an almost-full receive window. Fig. 10.18 shows the conditions under which silly window syndrome appears.


Fig. 10.18. A receive buffer with very little free space

Solving this problem is not difficult. As soon as the receive window shrinks below a certain target size, TCP begins to deceive the sender: it must not advertise additional window space as the receiving application nibbles data from the buffer. Instead, the freed space is kept secret from the sender until enough of it has accumulated. The recommended amount is one segment, unless the entire input buffer holds no more than a single segment (in which case half the buffer size is used). The target size that TCP should wait for can be expressed as:

minimum(1/2 of the input buffer, maximum segment size)

TCP starts lying when the window becomes smaller than this value and tells the truth again once the window is at least this large. Note that the sender suffers no harm: the receiving application could not have processed the data any faster anyway.

The proposed fix is easy to verify for the case above, with an ACK for every byte received. The same method also works when the input buffer can hold several segments (as is common in practice): the fast sender fills the input buffer, but the receiver advertises no free space until a full segment's worth has been freed.
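The receiver-side rule reduces to a single decision function; the names here are illustrative.

```python
def advertised_window(free_space, buffer_size, mss):
    """Receiver-side SWS avoidance: advertise no new space until at
    least min(buffer/2, MSS) is free, per the formula above."""
    threshold = min(buffer_size // 2, mss)
    return free_space if free_space >= threshold else 0

print(advertised_window(1, 8192, 1460))      # -> 0  (hide the single free byte)
print(advertised_window(1460, 8192, 1460))   # -> 1460 (a full segment opened up)
```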

10.13.3 Nagle's algorithm

The sender, independently of the receiver, should also avoid transmitting very short segments, by accumulating data before sending. Nagle's algorithm implements a very simple idea that reduces the number of short datagrams sent over the network.

The algorithm defers sending (and pushing) data while an ACK for previously transmitted data is outstanding. The accumulated data is transmitted when an ACK arrives for the earlier data, when a full segment's worth has accumulated, or when a timeout expires. The algorithm should not be used for real-time applications that must send data with minimum delay.
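A minimal sketch of Nagle's rule follows; the class is illustrative, not a real socket API (real TCP stacks expose the algorithm only indirectly, e.g. via the TCP_NODELAY switch that disables it).

```python
class NagleSender:
    """Short data is held back while previously sent data is unACKed."""
    def __init__(self, mss):
        self.mss, self.pending, self.unacked = mss, b"", False

    def write(self, data):
        self.pending += data
        return self._try_send()

    def on_ack(self):
        self.unacked = False
        return self._try_send()

    def _try_send(self):
        # Send if a full segment has accumulated, or nothing is in flight
        if self.pending and (len(self.pending) >= self.mss or not self.unacked):
            out, self.pending = self.pending[:self.mss], self.pending[self.mss:]
            self.unacked = True
            return out
        return None

s = NagleSender(mss=536)
print(s.write(b"a"))    # -> b'a' (nothing in flight: sent immediately)
print(s.write(b"b"))    # -> None (held back until the ACK arrives)
print(s.on_ack())       # -> b'b' (accumulated data released by the ACK)
```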

10.13.4 Delayed ACK

Another performance mechanism is the delayed-ACK method. Reducing the number of ACKs frees bandwidth for other traffic. If the TCP peer delays sending an ACK slightly, then:

■ Multiple segments can be acknowledged with a single ACK.

■ The receiving application may produce data of its own within the delay interval, so the ACK can ride along in the header of an outgoing segment instead of requiring a separate message.

To avoid delays when sending a stream of full-length segments (for example, when exchanging files), an ACK should be sent for at least every other full segment.

Many implementations use a 200-ms delay. A delayed ACK does not reduce the exchange rate: when a short segment arrives there is usually still enough free buffer space to receive more data, so the sender can keep transmitting (and retransmission is in any case far slower). When full segments arrive, an ACK should go out immediately for at least every second one.

10.13.5 Retransmission timeout

After sending a segment, TCP sets a timer and waits for the ACK. If the ACK does not arrive within the timeout period, TCP retransmits the segment. But what should the timeout period be?

If it is too short, the sender floods the network with needless segments duplicating data already delivered. If it is too long, segments actually lost in transit are repaired slowly, reducing throughput.

How should the timeout be chosen? A value suitable for a high-speed local network will not suit a long-distance connection with many hops, so the principle of "one value for all conditions" is clearly unworkable. Moreover, even for one specific connection, network conditions change and delays grow or shrink.

The algorithms of Jacobson, Karn, and Partridge (described in the papers "Congestion Avoidance and Control" by Van Jacobson and "Improving Round-Trip Time Estimates in Reliable Transport Protocols" by Karn and Partridge) allow TCP to adapt to changing network conditions. They are recommended for all new implementations and are reviewed briefly below.

Common sense suggests that the best basis for estimating the proper timeout for a particular connection is to monitor its round-trip time (round-trip time): the interval between sending data and receiving its acknowledgment.

Basic statistics on round-trip times (see Fig. 10.19) can guide the timeout calculation. Do not rely on the mean alone, however: more than half of all samples will exceed it. Adding a couple of deviations to the average gives a better estimate, one that respects the distribution of round-trip times without imposing excessively long retransmission waits.


Fig. 10.19. Distribution of round-trip times

Formal mathematical estimates of deviation require no heavy computation. A rough estimate based on the absolute difference between the latest value and the running average is sufficient:

Latest deviation = | Latest round-trip time − Average |

Another factor in computing a proper timeout is the variation of round-trip time with current network conditions: what happened on the network in the last minute matters more than what happened an hour ago.

Suppose we compute the running average over a very long session. Let the network be lightly loaded at first, giving 1000 small samples, after which traffic rises and delays grow significantly.

For example, if 1000 samples averaged 170 units, followed by 50 samples averaging 282, the running average would be:

170×1000/1050 + 282×50/1050 = 175

A more sensible quantity is the smoothed round-trip time (Smoothed Round-Trip Time - SRTT), which gives higher weight to recent values:

New SRTT = (1 − α) × (old SRTT) + α × latest round-trip time

The value of α lies between 0 and 1; increasing α gives the latest round-trip time a greater influence on the smoothed average. Since computers can divide by powers of 2 quickly, by shifting binary numbers to the right, α is chosen as (1/2)^n (typically 1/8), giving:

New SRTT = 7/8 × old SRTT + 1/8 × latest round-trip time

Table 10.2 shows how this formula adjusts a current SRTT of 230 units as changing network conditions progressively lengthen the round-trip time (assuming no timeouts occur). Each row's result in column 3 becomes the "old SRTT" in column 1 of the next row.


Table 10.2 Calculation of smoothed round-trip time

Old SRTT Latest RTT (7/8)×(old SRTT) + (1/8)×(RTT)
230.00 294 238.00
238.00 264 241.25
241.25 340 253.59
253.59 246 252.64
252.64 201 246.19
246.19 340 257.92
257.92 272 259.68
259.68 311 266.10
266.10 282 268.09
268.09 246 265.33
265.33 304 270.16
270.16 308 274.89
274.89 230 269.28
269.28 328 276.62
276.62 266 275.29
275.29 257 273.00
273.00 305 277.00
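The table is easy to reproduce with the smoothing formula; the first few rows come out exactly as shown.

```python
def smoothed_rtt(srtt, sample, alpha=1/8):
    """New SRTT = (1 - alpha) * old SRTT + alpha * latest round-trip time."""
    return (1 - alpha) * srtt + alpha * sample

values, srtt = [], 230.00
for sample in (294, 264, 340, 246):      # the first samples from Table 10.2
    srtt = smoothed_rtt(srtt, sample)
    values.append(round(srtt, 2))
print(values)   # -> [238.0, 241.25, 253.59, 252.64]
```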

Now comes the question of choosing the retransmission timeout itself. The round-trip samples deviate considerably from the running average, so it makes sense to add a margin for deviation. A good retransmission timeout (called Retransmission TimeOut, or RTO, in the RFCs) uses a smoothed deviation bound (SDEV):

T = Retransmission Timeout = SRTT + 2×SDEV

Most current implementations use a larger multiplier:

T = SRTT + 4×SDEV

To calculate SDEV, first determine the absolute value of the current deviation:

DEV = | Latest round-trip time − old SRTT |

A smoothing formula then incorporates this latest value:

New SDEV = 3/4×old SDEV + 1/4×DEV

One question remains - what initial values ​​to take? Recommended:

Initial timeout = 3s

Initial SRTT = 0

Initial SDEV = 1.5 s

Van Jacobson has defined a fast algorithm that calculates the data retransmission timeout very efficiently.
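One full update step, combining both smoothing formulas with the T = SRTT + 4×SDEV bound, might look like this (the 200-ms round-trip sample is illustrative):

```python
def update_rto(srtt, sdev, sample):
    """One timeout update: SRTT <- 7/8*SRTT + 1/8*sample,
    SDEV <- 3/4*SDEV + 1/4*|sample - old SRTT|, RTO = SRTT + 4*SDEV."""
    dev = abs(sample - srtt)              # deviation against the old SRTT
    srtt = 0.875 * srtt + 0.125 * sample
    sdev = 0.75 * sdev + 0.25 * dev
    return srtt, sdev, srtt + 4 * sdev

# Start from the recommended initial values: SRTT = 0, SDEV = 1.5 s
srtt, sdev, rto = update_rto(0.0, 1.5, 0.2)   # a 200-ms round-trip sample
print(round(rto, 3))   # -> 4.725
```

Successive good samples quickly pull the RTO down from its conservative starting point.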

10.13.6 Statistics example

How well does the timeout computed above work? Implementations that adopted it saw significant performance gains. An example is the statistics of the netstat command collected on tigger, an Internet server accessed by hosts from all over the world.


1510769 packets (314955304 bytes) received in-sequence

On tigger, less than 2.5% of TCP data segments were retransmitted. Of one and a half million incoming data segments (the rest being pure ACK messages), only 0.6% were duplicates. The loss rate on incoming data roughly matches that for outgoing segments, so wasteful retransmission traffic amounts to about 0.6% of the total.

10.13.7 Calculations after retransmission

The formulas above measure the round-trip time as the interval between sending a segment and receiving its acknowledgment. But suppose no acknowledgment arrives within the timeout period and the data must be resent.

Karn's algorithm holds that the round-trip estimate should not be updated in this case. The current smoothed round-trip time and smoothed deviation keep their values until an acknowledgment arrives for a segment that was sent exactly once, without retransmission. From that point on, calculations resume from the saved values and fresh measurements.

10.13.8 Actions after retransmission

But what happens until such an acknowledgment arrives? After a retransmission, TCP's behavior changes radically, because data loss usually signals network congestion. The response to having resent data is therefore:

■ Reduced retransmission rate

■ Reduced overall traffic, to combat network congestion

10.13.9 Exponential backoff

After a retransmission, the timeout interval is doubled. And if the timer expires again? The data is retransmitted once more, and the timeout doubles yet again. This process is called exponential backoff (exponential backoff).

If the network problem persists, the timeout keeps doubling until it reaches a preset maximum (usually 1 minute); after a timeout, only one segment may be sent. The connection also times out when the preset limit on transmissions without receiving an ACK is exceeded.
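The backoff of the retransmission timeout, with the one-minute ceiling mentioned above, can be sketched as:

```python
def backoff_schedule(rto, max_rto=60.0, max_tries=6):
    """Timeout before each successive retransmission of the same data:
    it doubles after every loss, up to the preset one-minute maximum."""
    schedule = []
    for _ in range(max_tries):
        schedule.append(rto)
        rto = min(rto * 2, max_rto)
    return schedule

print(backoff_schedule(3.0))   # -> [3.0, 6.0, 12.0, 24.0, 48.0, 60.0]
```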

10.13.10 Reducing congestion by reducing data sent over the network

Reducing the volume of data sent is somewhat subtler than the mechanisms above. Like slow start, it begins with a congestion window of one segment that grows as ACKs arrive. But a boundary is placed below the traffic level that originally caused trouble, and beyond that boundary the window grows much more slowly, so the sending rate genuinely comes down. First the danger threshold is computed:

Boundary = 1/2 × minimum(current congestion window, partner's receive window)

If the result is more than two segments, it is used as the boundary; otherwise the boundary is set to two segments. The full recovery algorithm is:

■ Set the congestion window to one segment.

■ For each ACK received, increase the congestion window by one segment until the boundary is reached (exactly as in slow start).

■ Thereafter, for each ACK received, add a smaller increment, chosen so that the window grows by about one segment per round-trip time (the increment is MSS/N, where N is the congestion window size in segments).

An idealized scenario gives a simplified picture of the recovery mechanism. Suppose the partner's receive window (and the current congestion window) was 8 segments when the timeout was detected, so the boundary is 4 segments. If the receiving application reads data from its buffer instantly, the receive window stays at 8 segments.

■ 1 segment is sent (congestion window = 1 segment).

■ ACK received - 2 segments are sent.

■ ACK received for 2 segments - 4 segments are sent (boundary reached).

■ ACK received for 4 segments. 5 segments are sent.

■ ACK received for 5 segments. 6 segments are sent.

■ ACK received for 6 segments. 7 segments are sent.

■ ACK received for 7 segments. 8 segments are sent (the congestion window again equals the receive window).
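The walkthrough above can be reproduced with a few lines of simulation (simplified: each step assumes the whole window is sent and ACKed at once):

```python
def recover(threshold, recv_window):
    """Congestion-window growth after a timeout: slow start up to the
    boundary, then one extra segment per fully ACKed window."""
    cwnd, sent = 1, []
    while True:
        sent.append(cwnd)                     # send a whole window, await its ACKs
        if cwnd >= recv_window:
            break
        if cwnd < threshold:
            cwnd = min(cwnd * 2, threshold)   # slow-start doubling
        else:
            cwnd += 1                         # linear congestion avoidance
    return sent

print(recover(threshold=4, recv_window=8))   # -> [1, 2, 4, 5, 6, 7, 8]
```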

Since a timeout forces retransmission of, and fresh acknowledgment for, all the outstanding data, the process continues until the congestion window reaches the receive window size. The events are shown in Fig. 10.20: the window grows exponentially, doubling during the slow-start phase, then linearly once the boundary is reached.


Fig. 10.20. Limiting the transmission rate during congestion

10.13.11 Duplicate ACKs

Some implementations support an optional feature called fast retransmit (fast retransmit), which speeds up retransmission under certain conditions. Its basic idea is that the receiver sends additional ACKs signaling a gap in the received data.

On receiving an out-of-order segment, the receiver sends back an ACK pointing at the first byte of the missing data (see Fig. 10.21).


Fig. 10.21. Duplicate ACKs

The sender does not retransmit immediately, because IP may simply be delivering data out of order. But once several duplicate ACKs arrive (for example, three), the missing segment is retransmitted without waiting for the timeout.

Note that each duplicate ACK attests to a received segment. Several duplicate ACKs therefore show that the network is delivering enough data and is not badly congested. Accordingly, the overall algorithm makes only a modest reduction in the congestion window when traffic genuinely rises, rather than the drastic cut used in full timeout recovery.
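The sender-side trigger for fast retransmit reduces to counting consecutive duplicate ACKs; here is a minimal sketch with an illustrative class:

```python
class FastRetransmit:
    """Count duplicate ACKs; after the third duplicate, retransmit the
    missing segment without waiting for the timeout."""
    DUP_THRESHOLD = 3

    def __init__(self):
        self.last_ack, self.dup_count = None, 0

    def on_ack(self, ack_no):
        if ack_no == self.last_ack:
            self.dup_count += 1
            if self.dup_count == self.DUP_THRESHOLD:
                return f"retransmit segment at {ack_no}"
        else:
            self.last_ack, self.dup_count = ack_no, 0
        return None

fr = FastRetransmit()
print([fr.on_ack(a) for a in (1000, 2000, 2000, 2000, 2000)])
# -> [None, None, None, None, 'retransmit segment at 2000']
```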

According to the Host Requirements standard, TCP must perform the same slow start described above when it receives a source quench message. However, this message is neither targeted nor effective, because the connection receiving it may not be the one generating heavy traffic. The current Router Requirements specification states that routers should not send source quench messages.

10.13.13 TCP Statistics

Finally, let's look at the statistics reported by the netstat command to see many of the mechanisms described above at work.

Segments are reported as packets.

879137 data packets (226966295 bytes)
21815 data packets (8100927 bytes) retransmitted
        Retransmissions.
132957 ack-only packets (104216 delayed)
        Note the large number of delayed ACKs.
        Probes of the opening of a zero-size window.
        These are SYN and FIN messages.
762469 acks (for 226904227 bytes)
        Reports of packets arriving out of sequence.
1510769 packets (314955304 bytes)
9006 completely duplicate packets (867042 bytes)
        The result of timeouts when the data had actually been delivered.
74 packets with some dup. data (12193 bytes duped)
        For greater efficiency, some retransmitted data was repackaged to include additional bytes.
13452 out-of-order packets (2515087 bytes)
530 packets (8551 bytes) of data after window
        Perhaps this data was included in window probe messages.
402 packets received after close
        These are late retransmissions.
108 discarded for bad checksums
        Invalid TCP checksums.
0 discarded for bad header offset fields
7 discarded because packet too short
14677 connections established (including accepts)
18929 connections closed (including 643 drops)
4100 embryonic connections dropped
572187 segments updated rtt (of 587397 attempts)
        The failed attempts to update the round-trip time occurred because the ACK did not arrive before the timeout expired.
26 connections dropped by rexmit timeout
        Repeated retransmission attempts failed, indicating a lost connection.
        Timeouts while probing a zero window.
        Timeouts verifying a broken connection.
472 connections dropped by keepalive
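As a rough health check, two of the figures quoted above can be combined to estimate the retransmission rate. The script below is merely illustrative arithmetic on numbers from the listing.

```python
# Illustrative arithmetic only: combining two figures quoted above to
# estimate what share of data packets had to be retransmitted.
data_packets = 879_137     # data packets (226966295 bytes)
retransmitted = 21_815     # data packets (8100927 bytes) retransmitted

ratio = retransmitted / data_packets
print(f"{ratio:.1%} of data packets were retransmitted")
```

A rate of about 2.5% suggests occasional loss or delay rather than severe congestion.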

10.14 Compliance with developer requirements

The current TCP standard requires implementations to adhere strictly to the slow-start procedure when initializing a connection, and to use the Karn and Jacobson algorithms for estimating the retransmission timeout and controlling congestion. Tests have shown that these mechanisms lead to significant performance improvements.

What happens when a system that does not strictly adhere to these standards is installed? It will deliver poor performance to its own users and will be a bad neighbor to the other systems on the network, hindering recovery from temporary congestion and generating excess traffic that causes datagrams to be dropped.

10.15 Barriers to performance

TCP has proven its flexibility by operating over networks with transfer rates ranging from hundreds to millions of bits per second. The protocol has achieved good results on modern local area networks using Ethernet, Token-Ring, and Fiber Distributed Data Interface (FDDI) technologies, as well as over low-speed links and long-haul connections (such as satellite links).

TCP is designed to respond to extreme conditions such as network congestion. However, the current version of the protocol has features that limit performance over emerging technologies offering bandwidths of hundreds or thousands of megabits per second. To understand the problems involved, consider a simple (albeit unrealistic) example.

Suppose that when moving a file between two systems, you want to perform a continuous stream of exchange as efficiently as possible. Let's assume that:

■ The receiver's maximum segment size is 1 KB.

■ Receiving window - 4 KB.

■ The bandwidth allows sending two segments in 1 s.

■ The receiving application consumes data as it arrives.

■ ACK messages arrive after 2 s.

The sender can transmit continuously: just as the volume allotted for the window fills, an ACK arrives allowing another segment to be sent:

After 2 s:

RECEIVE ACK OF SEGMENT 1, CAN SEND SEGMENT 5.
RECEIVE ACK OF SEGMENT 2, CAN SEND SEGMENT 6.
RECEIVE ACK OF SEGMENT 3, CAN SEND SEGMENT 7.
RECEIVE ACK OF SEGMENT 4, CAN SEND SEGMENT 8.

After another 2 s:

RECEIVE ACK OF SEGMENT 5, CAN SEND SEGMENT 9.

If the receive window were only 2 KB, the sender would be forced to wait one second out of every two before sending more data. In fact, to sustain a continuous stream, the receive window must be at least:

Window = Bandwidth × Round-trip time

Although the example is somewhat exaggerated (to provide simpler numbers), a small window can cause problems on high latency satellite connections.

Now let's look at what happens with high-speed connections. For example, if the bandwidth is 10 million bits per second and the round-trip time is 100 ms (1/10 of a second), then for a continuous stream the receive window must hold at least 1,000,000 bits, i.e., 125,000 bytes. But the largest value that can be written in the TCP receive window header field is 65,535.

Another problem arises at high transfer rates: the sequence numbers are consumed very quickly. If a connection can move data at 4 GB/s, the sequence numbers wrap around within a second, leaving no way to distinguish old duplicate datagrams, delayed by more than a second on their way across the internet, from fresh new data.
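The arithmetic in the last two paragraphs can be checked directly. The sketch below uses only the figures given in the text, plus the well-known sizes of the 16-bit window field and 32-bit sequence-number space.

```python
# Checking the arithmetic above: the window needed for a continuous
# stream is the bandwidth-delay product, and 32-bit sequence numbers
# wrap quickly at gigabyte rates.
bandwidth_bps = 10_000_000       # 10 million bits per second
round_trip_s = 0.1               # 100 ms round-trip time

window_bytes = bandwidth_bps * round_trip_s / 8
print(window_bytes)              # 125000.0 -- far beyond the 65,535 limit

seq_space_bytes = 2 ** 32        # the 32-bit sequence-number space (4 GB)
wrap_seconds = seq_space_bytes / (4 * 10 ** 9)   # at 4 GB/s
print(round(wrap_seconds, 2))    # 1.07 -- the numbers wrap in about a second
```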

New research is being actively conducted to improve TCP/IP and eliminate the obstacles mentioned above.

10.16 TCP functions

This chapter covers the many functions of TCP. The main ones are listed below:

■ Associating ports with connections

■ Initializing connections with a three-way handshake

■ Performing a slow start to avoid network congestion

■ Data segmentation during forwarding

■ Data numbering

■ Processing of incoming duplicate segments

■ Calculation of checksums

■ Control of data flow through the receiving window and sending window

■ Closing the connection gracefully

■ Aborting the connection

■ Forwarding urgent data

■ Positive acknowledgment with retransmission

■ Calculate the retransmission timeout

■ Reducing traffic during network congestion

■ Signaling that segments arrive out of order

■ Probing a closed receive window

10.17 TCP states

A TCP connection passes through several stages: the connection is established by an exchange of messages, data is sent, and then the connection is closed by an exchange of special messages. Each step in the life of the connection corresponds to a specific state of that connection. The TCP software at each end of the connection constantly tracks the state of the opposite side.

Below we briefly examine the typical state transitions of a server and a client at opposite ends of a connection. We do not aim to give an exhaustive description of all possible states while data is being sent; that is provided in RFC 793 and in the Host Requirements document.

During connection establishment, the server and client go through similar sequences of states. Server states are shown in Table 10.3, and client states are shown in Table 10.4.


Table 10.3 Server state sequence

Server state    Event / Description
CLOSED          A dummy state before connection establishment begins.
                Passive open by the server application.
LISTEN          The server waits for a connection from a client.
                The TCP server receives a SYN and sends a SYN/ACK.
SYN-RECEIVED    The server has received a SYN and sent a SYN/ACK; it waits for the ACK.
                The TCP server receives the ACK.
ESTABLISHED     The ACK has been received; the connection is open.

Table 10.4 Client State Sequence

If the partners were to simultaneously try to establish a connection with each other (which is extremely rare), each would go through the CLOSED, SYN-SENT, SYN-RECEIVED, and ESTABLISHED states.

The two ends of the connection remain in the ESTABLISHED state until one side initiates the close by sending a FIN segment. During a normal close, the initiating side passes through the states shown in Table 10.5; its partner passes through the states shown in Table 10.6.


Table 10.5 Sequence of states of the side closing the connection

Closing-side state   Event / Description
ESTABLISHED          The local application requests that the connection be closed; TCP sends a FIN/ACK.
FIN-WAIT-1           The closing side waits for the partner's response. Note that new data may still arrive from the partner.
                     TCP receives an ACK.
FIN-WAIT-2           The closing side has received an ACK from the peer, but not yet a FIN. It waits for the FIN while continuing to accept incoming data.
                     TCP receives a FIN/ACK and sends an ACK.
TIME-WAIT            The connection is held in a wait state so that duplicate data or a duplicate FIN still in the network can arrive or be discarded. The waiting period is twice the maximum segment lifetime estimate.
CLOSED

Table 10.6 Sequence of partner states upon closing a connection

Partner state   Event / Description
ESTABLISHED     TCP receives a FIN/ACK.
CLOSE-WAIT      The FIN has arrived; TCP sends an ACK and waits for its application to close the connection. At this point the application may still send a substantial amount of data.
                The local application initiates the close; TCP sends a FIN/ACK.
LAST-ACK        TCP waits for the final ACK.
                TCP receives the ACK.
CLOSED          All connection information has been deleted.
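The normal close sequences of Tables 10.5 and 10.6 can be rendered as a small transition table. The event labels below are our own paraphrases of the table entries, not protocol messages.

```python
# A sketch (our own event labels) of the normal close from Tables 10.5
# and 10.6, modeled as (state, event) -> next-state transitions.
CLOSING_SIDE = {
    ("ESTABLISHED", "app close, send FIN/ACK"): "FIN-WAIT-1",
    ("FIN-WAIT-1", "receive ACK"): "FIN-WAIT-2",
    ("FIN-WAIT-2", "receive FIN/ACK, send ACK"): "TIME-WAIT",
    ("TIME-WAIT", "wait 2 * max segment lifetime"): "CLOSED",
}
PARTNER_SIDE = {
    ("ESTABLISHED", "receive FIN/ACK, send ACK"): "CLOSE-WAIT",
    ("CLOSE-WAIT", "app close, send FIN/ACK"): "LAST-ACK",
    ("LAST-ACK", "receive ACK"): "CLOSED",
}

def run(table, start, events):
    """Apply each event in turn and return the final state."""
    state = start
    for event in events:
        state = table[(state, event)]
    return state

final = run(CLOSING_SIDE, "ESTABLISHED",
            ["app close, send FIN/ACK", "receive ACK",
             "receive FIN/ACK, send ACK", "wait 2 * max segment lifetime"])
print(final)  # CLOSED
```

An event not listed for the current state raises a `KeyError`, mirroring the fact that the tables describe only the normal close sequence.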

10.17.1 Analyzing TCP connection states

The netstat -an command lets you check the current connection states. The listing below shows connections in the listening, startup, established, closing, and time-wait states.

Note that the connection's port number is appended to each local and foreign address. You can see TCP traffic in both the input and output queues.

Proto Recv-Q Send-Q Local Address Foreign Address (state)
Tcp 0 0 128.121.50.145.25 128.252.223.5.1526 SYN_RCVD
Tcp 0 0 128.121.50.145.25 148.79.160.65.3368 ESTABLISHED
Tcp 0 0 127.0.0.1.1339 127.0.0.1.111 TIME_WAIT
Tcp 0 438 128.121.50.145.23 130.132.57.246.2219 ESTABLISHED
Tcp 0 0 128.121.50.145.25 192.5.5.1.4022 TIME_WAIT
Tcp 0 0 128.121.50.145.25 141.218.1.100.3968 TIME_WAIT
Tcp 0 848 128.121.50.145.23 192.67.236.10.1050 ESTABLISHED
Tcp 0 0 128.121.50.145.1082 128.121.50.141.6000 ESTABLISHED
Tcp 0 0 128.121.50.145.1022 128.121.50.141.1017 ESTABLISHED
Tcp 0 0 128.121.50.145.514 128.121.50.141.1020 CLOSE_WAIT
Tcp 0 1152 128.121.50.145.119 192.67.239.23.3572 ESTABLISHED
Tcp 0 0 128.121.50.145.1070 192.41.171.5.119 TIME_WAIT
Tcp 579 4096 128.121.50.145.119 204.143.19.30.1884 ESTABLISHED
Tcp 0 0 128.121.50.145.119 192.67.243.13.3704 ESTABLISHED
Tcp 0 53 128.121.50.145.119 192.67.236.218.2018 FIN_WAIT_1
Tcp 0 0 128.121.50.145.119 192.67.239.14.1545 ESTABLISHED
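A short script can tally the (state) column of such a listing. Here it is applied to a five-line excerpt of the output above; only the last whitespace-separated field on each line is used.

```python
# Tallying the (state) column of a netstat -an listing; applied to a
# five-line excerpt of the output shown above.
from collections import Counter

excerpt = """\
tcp 0 0  128.121.50.145.25   128.252.223.5.1526   SYN_RCVD
tcp 0 0  128.121.50.145.25   148.79.160.65.3368   ESTABLISHED
tcp 0 0  127.0.0.1.1339      127.0.0.1.111        TIME_WAIT
tcp 0 0  128.121.50.145.514  128.121.50.141.1020  CLOSE_WAIT
tcp 0 53 128.121.50.145.119  192.67.236.218.2018  FIN_WAIT_1
"""

# The state is the last field of each line.
states = Counter(line.split()[-1] for line in excerpt.splitlines())
print(states.most_common())
```

Run against the full listing, the same one-liner shows ESTABLISHED dominating, with a handful of TIME_WAIT entries from recently closed connections.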

10.18 Notes on implementations

From the very beginning, the TCP protocol was designed for interoperability among network equipment from different manufacturers. The TCP specification does not dictate how the internal implementation structures should work. These questions are left to developers, who are tasked with finding the best mechanisms for each specific implementation.

Even RFC 1122 (the Host Requirements document) leaves considerable freedom for variation. Each implemented function is marked with a compatibility level:

■ MUST (Required)

■ SHOULD (Recommended)

■ MAY (Allowed)

■ SHOULD NOT (Not recommended)

■ MUST NOT (Prohibited)

Unfortunately, sometimes there are products that do not implement MUST requirements. As a result, users experience the inconvenience of reduced performance.

Some good implementation practices are not covered in the standards. For example, security can be improved by restricting the use of well-known ports by privileged processes on the system, if the local operating system supports this method. To improve performance, implementations should copy and move sent or retrieved data as little as possible.

The standard does not define an application programming interface (nor a security policy), leaving room to experiment with different sets of software tools. However, this can mean a different programming interface on each platform, preventing application software from being moved between platforms.

In practice, developers base their toolkits on the Socket programming interface borrowed from Berkeley. The importance of this interface grew with the advent of WinSock (Windows Sockets), which led to a proliferation of new desktop applications that could run over any TCP/IP stack with a WinSock-compatible interface.
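The same socket calls are available in most languages; here is a minimal sketch in Python, whose socket module wraps the Berkeley API, showing the listen/accept/connect/send pattern over the loopback interface. Port 0 is used so the operating system picks a free port.

```python
# A minimal echo exchange using the Berkeley socket calls via Python's
# socket module, entirely over the loopback interface.
import socket
import threading

def serve(listener):
    conn, _ = listener.accept()          # wait for one client
    with conn:
        conn.sendall(conn.recv(1024))    # echo back whatever arrived

listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))          # port 0: the OS picks a free port
listener.listen(1)
port = listener.getsockname()[1]
threading.Thread(target=serve, args=(listener,), daemon=True).start()

with socket.create_connection(("127.0.0.1", port)) as client:
    client.sendall(b"hello")
    reply = client.recv(1024)
print(reply)  # b'hello'
```

Note how the server's socket(), bind(), listen(), and accept() calls mirror the steps of the SocketServer example earlier in the chapter.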

10.19 Further reading

The original TCP standard is defined in RFC 793. Updates, revisions, and compatibility requirements are addressed in RFC 1122. Karn and Partridge published the article "Improving Round-Trip Time Estimates in Reliable Transport Protocols" in the Proceedings of ACM SIGCOMM '87. Jacobson's article "Congestion Avoidance and Control" appeared in the Proceedings of ACM SIGCOMM '88. Jacobson has also issued several RFCs revising the performance algorithms.

Servers that implement these protocols on a corporate network provide the client with an IP address, gateway, netmask, name servers, and even a printer. Users do not have to manually configure their hosts in order to use the network.

The QNX Neutrino operating system implements another auto-configuration protocol called AutoIP, which is a project of the IETF Auto-Configuration Committee. This protocol is used in small networks to assign link-local IP addresses to hosts.

The AutoIP protocol independently determines the IP address local to the link, using a negotiation scheme with other hosts and without contacting a central server.

Using the PPPoE protocol

The abbreviation PPPoE stands for Point-to-Point Protocol over Ethernet. This protocol encapsulates data for transmission over an Ethernet network with a bridged topology.

The PPPoE protocol combines Ethernet technology with the PPP protocol, effectively creating a separate connection to a remote server for each user. Access control, connection accounting, and service provider selection are determined for users, not hosts. The advantage of this approach is that neither the telephone company nor the Internet service provider has to provide any special support for this.

Unlike dial-up connections, DSL and cable modem connections are always active. Because the physical connection to the remote service provider is shared among multiple users, an accounting method is needed that records the senders and destinations of traffic and charges the users. The PPPoE protocol allows the user and the remote host participating in a communication session to learn each other's network addresses during an initial exchange called discovery. Once a session has been established between an individual user and a remote host (e.g., an Internet service provider), the session can be monitored for billing purposes. Many homes, hotels, and corporations provide public Internet access through digital subscriber lines using Ethernet technology and the PPPoE protocol.

A PPPoE connection consists of a client and a server, which operate over any interface that is close to the Ethernet specification. The interface is used to issue IP addresses to clients, associating those IP addresses with users and, optionally, with workstations, rather than authenticating by workstation alone. The PPPoE server creates a point-to-point connection for each client.

Setting up a PPPoE session

To create a PPPoE session, use the pppoed service. The io-pkt-* module provides the PPPoE protocol services. First you need to run io-pkt-* with a suitable driver: