The characteristics of the TCP protocol
TCP (Transmission Control Protocol) is one of the main protocols of the transport layer of the TCP/IP model. It allows applications to manage data coming from (or going to) the lower layer of the model, i.e. the IP protocol. When data is handed to the IP protocol, IP encapsulates it in IP datagrams and sets the protocol field to 6, so that the receiving machine knows that the payload is TCP. TCP is a connection-oriented protocol, i.e. it enables two communicating machines to monitor the status of the transmission.
The main characteristics of the TCP protocol are as follows:
- TCP puts datagrams received from the IP protocol back in order
- TCP monitors the data flow so as to avoid network saturation
- TCP formats data into variable-length segments in order to hand them to the IP protocol
- TCP multiplexes data, i.e. information coming from distinct sources (applications, for example) can circulate simultaneously on the same line
- Finally, TCP allows a communication to be started and ended gracefully
The aim of TCP
Using the TCP protocol, applications can communicate reliably (thanks to the TCP protocol's acknowledgement system), independently of the lower layers. This means that routers (which work in the internet layer) only have to route data in the form of datagrams, without being concerned with monitoring the data, because this is performed by the transport layer (or more specifically by the TCP protocol).
During a communication using the TCP protocol, the two machines must establish a connection. The originator machine (the one which requests the connection) is called the client, while the recipient machine is called the server. So it is said that we are in a Client-Server environment.
The machines in such an environment communicate in connected mode, and the communication is full duplex, i.e. it takes place in both directions.
To make the communication and all the controls that accompany it work, the data is encapsulated, i.e. a header is added to the data. This header enables the transmissions to be synchronized and their reception to be confirmed.
Another feature of TCP is the ability to control the data rate through its capability to issue variably sized messages. These messages are called segments.
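As a rough illustration of how a byte stream can be cut into variably sized segments, here is a minimal Python sketch. The function name `split_into_segments` and the `max_segment_size` parameter are assumptions made for this example; a real TCP stack chooses segment sizes from the negotiated maximum segment size and the current window.

```python
from typing import List

def split_into_segments(data: bytes, max_segment_size: int) -> List[bytes]:
    """Cut a byte stream into segments of at most max_segment_size bytes."""
    if max_segment_size <= 0:
        raise ValueError("max_segment_size must be positive")
    return [data[start:start + max_segment_size]
            for start in range(0, len(data), max_segment_size)]

# The last segment is shorter: segments do not all have the same length.
for segment in split_into_segments(b"hello, transport layer", 8):
    print(len(segment), segment)
```

Joining the segments back together in order gives the original byte stream, which is what the receiving side relies on.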
The multiplexing function
TCP makes it possible to carry out an important task: multiplexing/demultiplexing, i.e. conveying data from various applications on the same line or, in other words, putting information that arrives in parallel into a single ordered flow.
These operations are conducted using the concept of ports, i.e. a number linked to an application type. Combined with an IP address, a port forms a socket, which makes it possible to uniquely identify an application running on a given machine.
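The demultiplexing idea can be sketched in Python as a table keyed by IP address and port. This is a simplified model: the function name `demultiplex` and the sample addresses are made up for the example, and a real TCP implementation identifies a connection by both endpoints (source and destination address and port).

```python
from collections import defaultdict

def demultiplex(segments):
    """Group payloads by destination (IP address, port) pair."""
    inboxes = defaultdict(list)
    for dest_ip, dest_port, payload in segments:
        inboxes[(dest_ip, dest_port)].append(payload)
    return dict(inboxes)

# Segments for two applications arrive interleaved on the same line.
incoming = [
    ("192.0.2.10", 80, b"GET /"),
    ("192.0.2.10", 25, b"HELO"),
    ("192.0.2.10", 80, b" HTTP/1.0"),
]
inboxes = demultiplex(incoming)
for endpoint, payloads in inboxes.items():
    print(endpoint, payloads)
```

Each application only sees the payloads addressed to its own port, even though all of them travelled over the same link.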
The format of data under TCP
A TCP segment is made up of a header followed by the data. The fields of the header have the following meanings:
- Source port (16 bits): Port related to the application in progress on the source machine
- Destination port (16 bits): Port related to the application in progress on the destination machine
- Sequence number (32 bits): When the SYN flag is set to 0, the sequence number is that of the first data byte of the current segment. When SYN is set to 1, the sequence number is the initial sequence number (ISN) used to synchronize the sequence numbers
- Acknowledgement number (32 bits): The acknowledgement number is the sequence number of the next byte that the sender of the segment expects to receive, not the number of the last byte received.
- Data offset (4 bits): The length of the TCP header in 32-bit words, which makes it possible to locate the start of the data in the segment. The offset is essential because the options field has a variable size
- Reserved (6 bits): A currently unused field but provided for future use
- Flags (6 x 1 bit): The flags carry additional information:
  - URG: if this flag is set to 1, the segment contains urgent data and the urgent pointer field is meaningful.
  - ACK: if this flag is set to 1, the acknowledgement number field is valid.
  - PSH (PUSH): if this flag is set to 1, the receiver should pass the data to the application without waiting for more data (the PUSH method).
  - RST: if this flag is set to 1, the connection is reset.
  - SYN: if this flag is set to 1, the segment is a request to establish a connection and synchronizes the sequence numbers.
  - FIN: if this flag is set to 1, the sender has no more data to send and requests that the connection be closed.
- Window (16 bits): The number of bytes that the recipient is willing to receive without acknowledgement
- Checksum (16 bits): A 16-bit one's complement sum computed over the TCP header, the data and a pseudo-header taken from the IP header, so that the integrity of the segment can be checked. Despite the name sometimes given to it, it is not a CRC
- Urgent pointer (16 bits): An offset from the sequence number that indicates where the urgent data ends; it is only meaningful when the URG flag is set
- Options (variable size): Various options
- Padding: The space remaining after the options is padded with zeros so that the header length is a multiple of 32 bits
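To make the field layout more concrete, the following Python sketch decodes the fixed 20-byte part of a header with the standard `struct` module and shows the one's complement arithmetic used by the checksum. The function names `parse_tcp_header` and `internet_checksum` and the sample values are assumptions for this example. The checksum helper only performs the arithmetic on the bytes it is given; building the pseudo-header that a real TCP checksum also covers is left out.

```python
import struct

# Flag names from the least significant bit (FIN) to the most significant (URG).
FLAG_NAMES = ["FIN", "SYN", "RST", "PSH", "ACK", "URG"]

def parse_tcp_header(raw: bytes) -> dict:
    """Decode the fixed 20-byte part of a TCP header (options are ignored)."""
    (src_port, dst_port, seq, ack,
     offset_and_flags, window, checksum, urgent) = struct.unpack("!HHIIHHHH", raw[:20])
    data_offset = offset_and_flags >> 12   # header length in 32-bit words
    flag_bits = offset_and_flags & 0x3F    # the six flag bits
    return {
        "source_port": src_port,
        "destination_port": dst_port,
        "sequence_number": seq,
        "acknowledgement_number": ack,
        "header_bytes": data_offset * 4,
        "flags": [name for bit, name in enumerate(FLAG_NAMES) if flag_bits & (1 << bit)],
        "window": window,
        "checksum": checksum,
        "urgent_pointer": urgent,
    }

def internet_checksum(data: bytes) -> int:
    """16-bit one's complement of the one's complement sum of 16-bit words."""
    if len(data) % 2:
        data += b"\x00"                    # pad odd-length input with a zero byte
    total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
    while total >> 16:                     # fold the carries back into 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

# A hand-built SYN+ACK header: data offset 5 (20 bytes), flags SYN (0x02) + ACK (0x10).
raw = struct.pack("!HHIIHHHH", 80, 49152, 1000, 501, (5 << 12) | 0x12, 65535, 0, 0)
header = parse_tcp_header(raw)
print(header["flags"], header["header_bytes"])
print(hex(internet_checksum(bytes.fromhex("0001f203f4f5f6f7"))))
```

The `!` in the format string selects network byte order (big-endian), which is how the fields are laid out on the wire.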
Reliability of transfers
The TCP protocol makes it possible to ensure reliable data transfer, although it uses the IP protocol, which does not include any monitoring of datagram delivery.
To do so, the TCP protocol uses an acknowledgement system that enables the client and server to make sure that data has been received by the other side.
When a segment is sent, a sequence number is attached to it. Upon receipt of a data segment, the recipient machine returns a segment where the ACK flag is set to 1 (to signal that it is an acknowledgement), accompanied by an acknowledgement number equal to the sequence number of the next byte it expects, i.e. the received sequence number plus the length of the data received.
In addition, the originator machine starts a timer when it sends a segment. If no acknowledgement arrives before the time allowed has passed, the originator machine considers the segment lost and sends it again.
However, if the original segment was not lost and still arrives at the destination, the recipient machine will know, thanks to the sequence number, that it is a duplicate and will keep only one copy of the data.
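The receiving side of this mechanism can be modelled in a few lines of Python. This is a simplified sketch, assuming byte-based sequence numbers where the acknowledgement number names the next byte expected: the `Receiver` class is hypothetical, it drops out-of-order segments instead of buffering them, and it ignores sequence number wrap-around.

```python
class Receiver:
    """Toy receiver: accepts in-order data and acknowledges the next byte expected."""

    def __init__(self, initial_seq: int):
        self.next_expected = initial_seq
        self.delivered = bytearray()

    def receive(self, seq: int, payload: bytes) -> int:
        """Process one segment and return the acknowledgement number to send back."""
        if seq == self.next_expected:
            self.delivered += payload
            self.next_expected += len(payload)
        # A duplicate (or out-of-order) segment changes nothing:
        # the same acknowledgement number is simply sent again.
        return self.next_expected

receiver = Receiver(initial_seq=100)
acks = [
    receiver.receive(100, b"abcd"),   # new data
    receiver.receive(100, b"abcd"),   # retransmitted duplicate
    receiver.receive(104, b"ef"),     # next in-order data
]
print(acks, bytes(receiver.delivered))
```

The duplicate does not appear twice in the delivered data, and the repeated acknowledgement tells the sender that the retransmission was unnecessary.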
Establishing a connection
Considering that this communication process, which takes place using data transmission and acknowledgement, is based on a sequence number, the originator and recipient machines (client and server) must know the initial sequence number of the other machine.
Establishing the connection between two applications is usually done as follows:
- The TCP ports must be open
- The application on the server is passive, i.e. it is listening and waiting for a connection (this is called a "passive open")
- The application on the client sends a connection request to the server, whose application is in passive open mode. The application on the client is said to perform an "active open"
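The passive and active open can be tried out on a single machine with Python's standard `socket` module. The sketch below is only an illustration: the helper name `echo_over_loopback` is made up, the server handles a single client, and it assumes the loopback interface 127.0.0.1 is available. The operating system performs the actual TCP handshake inside `accept` and `create_connection`.

```python
import socket
import threading

def echo_over_loopback(message: bytes) -> bytes:
    """Open a TCP connection on 127.0.0.1 and echo one message back."""
    # Passive open: the server binds to a port and listens for requests.
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.settimeout(5)
    server.bind(("127.0.0.1", 0))      # port 0: let the OS choose a free port
    server.listen(1)
    port = server.getsockname()[1]

    def serve_one_client():
        conn, _peer = server.accept()  # returns once a client has connected
        with conn:
            conn.sendall(conn.recv(1024))

    worker = threading.Thread(target=serve_one_client, daemon=True)
    worker.start()

    # Active open: the client requests the connection.
    with socket.create_connection(("127.0.0.1", port), timeout=5) as client:
        client.sendall(message)
        reply = client.recv(1024)

    worker.join(timeout=5)
    server.close()
    return reply

print(echo_over_loopback(b"ping"))
```

The server socket must be listening before the client connects, which mirrors the order of the steps listed above.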
The two machines must then synchronize their sequence numbers using a mechanism commonly called a three-way handshake. A similar exchange of segments takes place when the session is closed.
This dialogue makes it possible to start the communication. As its name indicates, it takes place in three stages:
- In the first stage the originator machine (the client) transmits a segment where the SYN flag is set to 1 (to indicate that it is a synchronization segment), with a sequence number N which is called the initial sequence number of the client.
- In the second stage, the recipient machine (the server) receives the initial segment coming from the client, then sends it an acknowledgement, which is a segment where the ACK flag is set to 1 and the SYN flag is set to 1 (because it is again a synchronization). The sequence number of this segment is the initial sequence number of the server. The most important field in this segment is the acknowledgement field, which contains the initial sequence number of the client, incremented by 1.
- Finally, the client transmits an acknowledgement, which is a segment where the ACK flag is set to 1 and the SYN flag is set to 0 (it is no longer a synchronization segment). Its sequence number is the client's initial sequence number incremented by 1, and the acknowledgement number is the initial sequence number of the server incremented by 1.
Following this sequence involving three exchanges the two machines are synchronized and communication can begin!
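The numbers exchanged during the three stages can be written out in a short Python sketch. The function name `three_way_handshake` and the dictionary layout are choices made for this example, and the initial sequence numbers are arbitrary; real implementations pick them in a way that is hard to predict.

```python
SEQ_MODULO = 2 ** 32   # sequence numbers are 32-bit values and wrap around

def three_way_handshake(client_isn: int, server_isn: int) -> list:
    """Return the three segments of a connection setup as dictionaries."""
    syn = {"from": "client", "flags": {"SYN"},
           "seq": client_isn, "ack": None}
    syn_ack = {"from": "server", "flags": {"SYN", "ACK"},
               "seq": server_isn, "ack": (client_isn + 1) % SEQ_MODULO}
    ack = {"from": "client", "flags": {"ACK"},
           "seq": (client_isn + 1) % SEQ_MODULO,
           "ack": (server_isn + 1) % SEQ_MODULO}
    return [syn, syn_ack, ack]

for segment in three_way_handshake(client_isn=1000, server_isn=5000):
    print(segment["from"], sorted(segment["flags"]), segment["seq"], segment["ack"])
```

Each side acknowledges the other's initial sequence number plus 1 because the SYN flag itself consumes one sequence number.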
There is a hacking technique, called IP spoofing, which allows this trust relationship to be abused for malicious purposes!
Sliding window method
In many cases, it is possible to limit the number of acknowledgements, in order to relieve traffic on the network, by fixing a number of bytes (a range of sequence numbers) that may be sent before an acknowledgement is required. This number is stored in the window field of the TCP header.
This method is called the "sliding window method" because it defines a range of sequence numbers that can be sent without waiting for acknowledgements, and this range moves forward as acknowledgements are received.
In addition, the size of this window is not fixed. In fact, the server can include the size of the window which seems most suitable in its acknowledgements by storing it in the window field. So, when the acknowledgement indicates a request to increase the window, the client will move the right border of the window.
Conversely, in the case of a reduction, the client will not move the right border of the window towards the left but wait for the left border to advance (with the arrival of the acknowledgements).
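The behaviour of the window borders can be simulated with a small Python class. This is a sketch under simplifying assumptions: the class `SlidingWindowSender` is hypothetical, offsets into the data stand in for sequence numbers, and there is no retransmission or timer.

```python
class SlidingWindowSender:
    """Toy sender that never has more than `window` unacknowledged bytes in flight."""

    def __init__(self, data: bytes, window: int):
        self.data = data
        self.left = 0            # left border: first unacknowledged byte
        self.next_to_send = 0    # first byte that has not been sent yet
        self.window = window     # window size advertised by the receiver

    def sendable(self) -> bytes:
        """Return the bytes that may be sent now, and mark them as sent."""
        right = min(self.left + self.window, len(self.data))   # right border
        chunk = self.data[self.next_to_send:right]
        # Bytes already sent are never "unsent", even if the window shrank.
        self.next_to_send = max(self.next_to_send, right)
        return chunk

    def on_ack(self, ack_number: int, advertised_window: int) -> None:
        """Slide the left border forward and adopt the new window size."""
        self.left = max(self.left, ack_number)
        self.window = advertised_window

sender = SlidingWindowSender(b"abcdefghij", window=4)
print(sender.sendable())   # the first window's worth of data
sender.on_ack(2, 4)        # two bytes acknowledged: the window slides by two
print(sender.sendable())
```

When the advertised window shrinks, `sendable` simply returns nothing until the left border has advanced far enough, which corresponds to waiting rather than moving the right border to the left.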
Ending a connection
The client can request to end a connection in the same way as the server.
Ending a connection is done in the following way:
- One of the machines sends a segment with the FIN flag set to 1, and its application puts itself in a waiting state: it stops sending data but can still receive the segments that the other machine has yet to send.
- After receipt of this segment, the other machine sends an acknowledgement (a segment with the ACK flag set to 1) and continues to send the segments in progress. It then informs its application that a FIN segment has been received and sends its own FIN segment to the first machine, which acknowledges it, and the connection is closed.
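The sequence and acknowledgement numbers used during an orderly close can be listed with a short Python sketch. It assumes that no data is in flight while the connection closes, that each FIN is acknowledged by a separate segment, and that the side closing first is called "A"; the function name `close_sequence` is made up for the example. Like SYN, a FIN consumes one sequence number.

```python
def close_sequence(closer_seq: int, peer_seq: int) -> list:
    """Return (sender, flag, sequence number, acknowledgement number) tuples."""
    return [
        ("A", "FIN", closer_seq, peer_seq),          # A has no more data to send
        ("B", "ACK", peer_seq, closer_seq + 1),      # B acknowledges A's FIN
        ("B", "FIN", peer_seq, closer_seq + 1),      # B has finished sending too
        ("A", "ACK", closer_seq + 1, peer_seq + 1),  # A acknowledges B's FIN
    ]

for sender, flag, seq, ack in close_sequence(closer_seq=300, peer_seq=900):
    print(sender, flag, seq, ack)
```

Between the second and third segments, side B may still send data of its own, which is why the close needs more exchanges than the opening handshake.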
For more information on the TCP protocol, please refer to RFC 793, which explains the protocol in detail.