October 1-15

Calculating Application Delay

 

We want to measure the “application delay” in order to detect periods of severe impairment in a trace, that is, periods in which an application experiences packet loss, high delay, jitter, and so on. For each packet of each flow in the trace, we are interested in the time the packet is sent and the time it is received and delivered to the application layer at the receiver. We therefore define application delay as the interval between the transmission of the packet by the sender and the transmission of the corresponding ACK by the receiver:

                                App_Delay = Timestamp_ACK_sent – Timestamp_Packet_sent     (1)

The figure below shows the position of the monitoring point that captured our packet traces. Because of the position of the monitor, we do not have end-to-end data, so the application delay cannot be calculated from (1). Instead, we approximate it by the interval between the capture of a packet at the monitor and the capture of its corresponding ACK:

                                App_Delay = Timestamp_ACK_captured – Timestamp_Packet_captured     (2)

 

                               

Therefore, in order to compute the application delay, we need an algorithm that couples every packet in our trace to its corresponding ACK. This task is not trivial, however, for the reasons described in the cases below.

Ideal Network Conditions

TCP uses two ways to acknowledge a packet: immediate ACKs and delayed ACKs.

1)      Successful packet transmission with immediate ACKs                                                                                                

Every time a packet is sent, TCP will send an acknowledgment for this packet. The sender sends D[1-5] and the receiver sends its ACK. In order to calculate application delay we can use (2):

App_Delay = Timestamp_ACK[6]_captured – Timestamp_Packet_D[1-5]_captured     (2)

2)        Successful packet transmission with delayed ACKs                                                                                                                      

In the previous case, every packet was acknowledged individually. TCP may instead wait for a second packet and send one ACK for both. These are called delayed ACKs: if two segments are separated by no more than 200 msec, the receiver sends one ACK for the two segments. In the example, segment D[1-5] is successfully transmitted. The receiver waits for up to 200 msec in case it receives another packet. Packet D[6-10] is sent within this interval. The receiver then sends one ACK for the two segments D[1-5] and D[6-10]. In this case we will not see an ACK[6] in our trace. In order to calculate the application delay we can use (2):

App_Delay = Timestamp_ACK[11]_captured – Timestamp_Packet_D[1-5]_captured     (2)

App_Delay = Timestamp_ACK[11]_captured – Timestamp_Packet_D[6-10]_captured     (2)
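
As a small worked example of (2) with delayed ACKs, the sketch below computes the delay of two segments that are acknowledged by one cumulative ACK. The capture times are made-up values and `app_delay` is a hypothetical helper, not part of the original scripts.

```python
def app_delay(ts_ack_captured, ts_packet_captured):
    """Equation (2): interval between the capture of a packet and the capture of its ACK (seconds)."""
    return ts_ack_captured - ts_packet_captured

# Hypothetical capture times at the monitor, in seconds.
ts_d1_5 = 0.000    # D[1-5]
ts_d6_10 = 0.050   # D[6-10], sent within the delayed-ACK interval
ts_ack11 = 0.120   # ACK[11], which acknowledges both segments

delay_d1_5 = app_delay(ts_ack11, ts_d1_5)
delay_d6_10 = app_delay(ts_ack11, ts_d6_10)
print(f"D[1-5]: {delay_d1_5 * 1000:.0f} ms, D[6-10]: {delay_d6_10 * 1000:.0f} ms")
```

Both segments share the ACK timestamp, so the segment captured earlier gets the larger delay.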

Real Network Conditions

3)        Packet is lost before the monitor point

           

The packet is transmitted the first time but it is lost before the monitor point. The packet is then re-transmitted successfully the second time. The actual delay is:

App_Delay = Timestamp_ACK[6]_captured – Timestamp_Packet_D[1-5]_not_captured     (2)

Because the first transmission of D[1-5] is lost before the monitoring point, the monitor captures only the retransmission, and the computed delay underestimates the actual delay:

App_Delay = Timestamp_ACK[6]_captured – Timestamp_Packet_D[1-5]_captured     (2)

 

4)        Packet is lost after the monitor point one or more times

               

The packet is transmitted the first time but is lost after the monitor point. It is retransmitted a second time and lost again, and is finally retransmitted successfully the third time. In such a case the delay for packet D[1-5] should include all the time elapsed between the first unsuccessful transmission and the reception of the corresponding ACK. The retransmissions of the packet must therefore be discarded, and the actual delay must be computed from the first appearance of the segment:

App_Delay = Timestamp_ACK[6]_captured – Timestamp_Packet_D[1-5]_first_captured     (2)

5)      Loss of a packet of the sender’s window

           

The sender sends three segments and the middle one is lost. The receiver generates two ACKs with acknowledgment number 6. The first acknowledges D[1-5], while the second informs the sender that D[6-10] is missing. When the sender retransmits the missing segment D[6-10], the receiver generates ACK[16], which covers all the intermediate segments. As in case (1), we couple D[1-5] with its corresponding ACK:

App_Delay = Timestamp_ACK[6]_captured – Timestamp_Packet_D[1-5]_captured     (2)

 

As in case (4), we discard the retransmitted packets and couple the first copy of D[6-10] seen in the trace with its corresponding ACK, which is ACK[16] (as in case (2)). Therefore,

App_Delay = Timestamp_ACK[16]_captured – Timestamp_Packet_D[6-10]_first_captured 

Finally, we couple D[11-15] with ACK[16]:

App_Delay = Timestamp_ACK[16]_captured – Timestamp_Packet_D[11-15]_captured  

It is wrong to couple D[11-15] with ACK[6]. Even though the second ACK[6] was sent because the out-of-order segment D[11-15] was received, D[11-15] will not be delivered to the application layer until the missing segment D[6-10] arrives and the data is in order. Therefore, we must couple segment D[11-15] with ACK[16].

 

6)        Duplication of the packet before and after the monitor point

           

The sender sends 1 segment and before the monitor the network duplicates the packet.

We will handle this case as packet retransmission. As in step (4) we will discard retransmitted packets and we will couple the first packet seen in the trace D[1-5] with its corresponding ACK[6]. Therefore,

App_Delay = Timestamp_ACK[6]_captured – Timestamp_Packet_D[1-5]_first_captured 

If the packet duplication takes place after the monitor point, the monitor does not see the second copy and records a normal packet transmission. Therefore,

App_Delay = Timestamp_ACK[6]_captured – Timestamp_Packet_D[1-5]_captured     (2)

7)      Packet re-ordering before and after the monitor point

                           

In this case, segment D[1-5] is transmitted first and D[6-10] second. However, D[6-10] is observed first both at the monitor point and at the receiver. If immediate ACKs are used, the receiver notices an out-of-order segment (D[6-10]) and generates ACK[1]. When D[1-5] arrives and fills the gap, the receiver issues ACK[11] for both segments. If we couple every segment with its ACK, the delays we compute are:

App_Delay = Timestamp_ACK[11]_captured – Timestamp_Packet_D[1-5]_captured 

App_Delay = Timestamp_ACK[11]_captured – Timestamp_Packet_D[6-10]_captured 

However, in this case we compute a shorter delay for D[1-5] than for D[6-10], although D[1-5] was sent first.

The case is similar when delayed ACKs are used: instead of issuing ACK[1] when the out-of-order segment arrives, the receiver issues ACK[11] after the two consecutive packets. The delay for packets D[1-5] and D[6-10] is computed in exactly the same way when packet reordering takes place after the monitor point.

Algorithm description for calculating application delays

               

The algorithm proposed in order to infer application layer delays is the following:

 

1. For each flow 'f' in trace {
2.     sort( pkts('f'), timestamp );
3.     For each data_segment 'd' [ data segments are TCP packets with size > 40 bytes, i.e. packets that carry more than headers ] {
4.         if( 'd' == marked_coupled ) {
5.             next data_segment;
6.         }
7.         For each ACK packet of 'f' [ ACKs are TCP segments with tcp.header.ACKflag == 1; ack(ACK) is the acknowledgment number ] {
8.             if( ( ack(ACK) == seq('d') + len('d') ) && ( ts(ACK) >= ts('d') ) ) {
9.                 mark( 'd', marked_coupled );
10.                delay = ts(ACK) - ts('d');
11.                mark( all packets with sequence number == seq('d'), marked_coupled );
12.                next data_segment;
13.            }
14.        }
15.        if( 'd' NOT marked_coupled ) {
16.            For each ACK packet of 'f' [ ACKs are TCP segments with tcp.header.ACKflag == 1 ] {
17.                if( ( ack(ACK) > seq('d') + len('d') ) && ( ts(ACK) >= ts('d') ) ) {
18.                    mark( 'd', marked_coupled );
19.                    delay = ts(ACK) - ts('d');
20.                    mark( all packets with sequence number == seq('d'), marked_coupled );
21.                    next data_segment;
22.                }
23.            }
24.        }
25.    }
}

For every segment, this algorithm first looks for the ACK that acknowledges exactly that segment, meaning the ACK whose acknowledgment number equals the segment's sequence number plus its length. If no such ACK is found, then either delayed ACKs are in use, so the segment's ACK has a larger number than the one we search for, or a segment of the sender's window was lost, or packets were reordered. In these cases the segment is coupled with the first later ACK that has a larger number.
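
The two passes described above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the script used for the trace: `Data`, `Ack` and `couple` are hypothetical names, the ACK list is assumed to hold only ACKs travelling in the reverse direction of the data, and the timestamps in the example are invented.

```python
from collections import namedtuple

# Hypothetical record types; real trace lines carry more fields.
Data = namedtuple("Data", "ts seq length")   # captured data segment
Ack = namedtuple("Ack", "ts ack")            # captured ACK (ack = next expected byte)

def couple(data_segments, acks):
    """Return {seq: delay}: exact-match pass first, then cumulative-ACK pass."""
    delays = {}
    ordered_acks = sorted(acks, key=lambda a: a.ts)
    for d in sorted(data_segments, key=lambda p: p.ts):
        if d.seq in delays:              # retransmission or duplicate, already coupled
            continue
        expected = d.seq + d.length
        later = [a for a in ordered_acks if a.ts >= d.ts]
        # Pass 1: the ACK that acknowledges exactly this segment.
        match = next((a for a in later if a.ack == expected), None)
        # Pass 2: the first ACK with a larger number that covers it.
        if match is None:
            match = next((a for a in later if a.ack > expected), None)
        if match is not None:
            delays[d.seq] = match.ts - d.ts
    return delays

# Case 5: D[6-10] is lost after the monitor and retransmitted at t = 0.30.
data = [Data(0.00, 1, 5), Data(0.01, 6, 5), Data(0.02, 11, 5), Data(0.30, 6, 5)]
acks = [Ack(0.10, 6), Ack(0.12, 6), Ack(0.40, 16)]
delays = couple(data, acks)
print(delays)
```

In this example D[1-5] is coupled with the first ACK[6], both D[6-10] and D[11-15] are coupled with ACK[16], and the retransmission of D[6-10] is skipped.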

 

As far as the complexity of this algorithm is concerned:

If F is the number of flows in the trace and N the average number of packets per flow, then the average complexity is F*( NlogN + 2*N^2 ), where NlogN is the cost of sorting all packets of a flow by timestamp and 2*N^2 accounts for the two scans over the ACKs of the flow that may be made for each data segment.
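
To get a feeling for the formula, the sketch below evaluates it for an invented trace size. The function name, the choice of logarithm base 2 and the numbers of flows and packets are assumptions made only for illustration.

```python
import math

def coupling_cost(num_flows, pkts_per_flow):
    """Evaluate F * (N log N + 2 N^2); the base-2 logarithm is an assumption."""
    n = pkts_per_flow
    return num_flows * (n * math.log2(n) + 2 * n ** 2)

# Hypothetical trace: 1000 flows with 100 packets each on average.
cost = coupling_cost(num_flows=1000, pkts_per_flow=100)
sort_share = 1000 * 100 * math.log2(100) / cost
print(f"operations: {cost:.3g}, share spent sorting: {sort_share:.1%}")
```

For these values the quadratic term dominates and sorting accounts for only a few percent of the total.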

 Validation of the algorithm in each case

1)      Successful packet transmission with immediate ACKs

Executes lines 1-14.

For packet D[1-5], which has SN=1 and length=5, the loop in line 7 scans the ACKs with timestamp >= timestamp(D[1-5]) and looks for the ACK with number SN+length=6. It therefore couples D[1-5] with ACK[6] and computes the delay.

2)      Successful packet transmission with delayed ACKs                

Executes the whole algorithm.

Lines 7-14 are executed but no match is found, since there is no ACK[6]. Therefore, lines 15-21 are executed and search for an ACK with a greater number.

3)      Packet is lost before the monitor point

Because the first transmission is not captured by the monitor, the algorithm sees only the retransmission and computes the delay as in case (1).

4)      Packet is lost after the monitor point one or more times

Executes lines 1-14.

First, the algorithm executes lines 2, 3 and 7-14. After it couples the first copy of segment D[1-5] seen in the trace, it marks the packet and, in line 11, marks all packets with the same SN. Therefore, when it reaches the retransmissions of this segment, it executes lines 4 and 5 and discards them. It correctly calculates the delay between ACK[6] and the first appearance of the segment in the trace.

5)      Loss of a packet of the sender’s window

Executes the whole algorithm.

For the first segment D[1-5], it executes lines 1, 2 and 7-14 and matches D[1-5] with the first ACK[6], as in case (1). For the second segment D[6-10], it also executes lines 15-21, since it finds no ACK[11] to match the packet (as in case (2)). For segment D[11-15], it executes lines 7-12, since it finds a match with ACK[16]. For the retransmission of D[6-10], it executes lines 4-5: the algorithm observes that D[6-10] has already been matched and thus, correctly, calculates no delay for this packet.

6)      Duplication of the packet before and after the monitor point

This case is the same as case (4): the duplicate packet is treated as a retransmission of the first one and is discarded.

7)      Packet re-ordering before and after the monitor point

First, for segment D[6-10], which appears first in the trace, a match is found after executing lines 7-14. For segment D[1-5] no match is found in lines 7-14, and therefore an ACK with a number greater than 6 is searched for in lines 15-20. The computed delay is an underestimate of the actual delay; this case is discussed below.

Cases of underestimation - Future Work

The above algorithm couples each segment with its corresponding ACK. There are some cases where the algorithm underestimates the delay of a segment. Both cases described below are related to the reordering of segments in the trace.

Case 1: As we have seen in case (7), if we couple each packet with its corresponding ACK and compute the delay from (2), then we underestimate the delay for D[1-5]. If we use (1), then the delay is t1 for D[1-5] and t2 for D[6-10], which are the actual delays.

[Figure: timeline marking the actual delays t1 for D[1-5] and t2 for D[6-10]]

 
                       

Case 2: This is the same as case (5), except that the segment is lost before the monitor. The delay for segment D[1-6] is therefore underestimated, since it is computed from the second transmission of the segment by the sender.

 

Possible correction of the underestimated delay: if the segment with sequence number SN is out of order, then its timestamp should lie between the timestamps of the packets with sequence numbers SN-1 and SN+1.

                References: www.csd.uoc.gr/~ploumid

 

 

October 30

Calculating application delay – algorithm changed to parse the data correctly

 

1. For each flow 'f' in trace {
2.     sort( pkts('f'), timestamp );
3.     For each data_segment 'd' [ data segments are TCP packets with size > headersize, where headersize = 40 bytes is the size of a packet that contains only headers. SYN packets have size > 40 bytes but must be excluded, because they have no impact on the application, which has not started yet. ] {
4.         if( 'd' == marked_coupled ) {
5.             next data_segment;
6.         }
7.         For each ACK packet of 'f' [ ACKs are TCP segments with tcp.header.ACKflag == 1, i.e. the trace line contains 'ack'; ack(ACK) is the acknowledgment number ] {
               Parse the line according to its type [ symbol is the flag field of the line: S (SYN), F (FIN), P (data) etc., or '.' if the packet has no flag ]:
               Case 1: the ACK is alone, not piggybacked
                   if size[line] == headersize and symbol == '.'
               Case 2: the ACK is piggybacked
                   if size[line] > headersize or symbol == F; the ACK is then carried by a data packet or by a FIN packet respectively.
                   Checking size[line] == headersize alone is not enough, because the lines of FIN packets also contain ACKs and have size[line] == headersize, but have a different structure from ACK-only lines.
8.             if( ( ack(ACK) == seq('d') + len('d') ) && ( ts(ACK) >= ts('d') ) ) {
9.                 mark( 'd', marked_coupled );
10.                delay = ts(ACK) - ts('d');
11.                mark( all packets with sequence number == seq('d'), marked_coupled );
12.                next data_segment;
13.            }
14.        }
15.        if( 'd' NOT marked_coupled ) {
16.            For each ACK packet of 'f' [ ACKs are TCP segments with tcp.header.ACKflag == 1 ] {
                   Parse the line according to its type, exactly as in step 7 (Case 1 and Case 2).
17.                if( ( ack(ACK) > seq('d') + len('d') ) && ( ts(ACK) >= ts('d') ) ) {
18.                    mark( 'd', marked_coupled );
19.                    delay = ts(ACK) - ts('d');
20.                    mark( all packets with sequence number == seq('d'), marked_coupled );
21.                    next data_segment;
22.                }
23.            }
24.        }
25.    }
}

 

7949850 1113409121.996529 96.129.204.176.1309 > 48.84.0.44.80: S 3022693050:3022693050(0) win 16384 <mss 1460,nop,nop,sackOK> (DF) (ttl 125, id 45548, len 40)

7949851 1113409122.080714 96.129.204.176.1309 < 48.84.0.44.80: S 3343177364:3343177364(0) ack 3022693051 win 5840 <mss 1460> (DF) (ttl 242, id 0, len 44)

7949850 1113409122.082585 96.129.204.176.1309 > 48.84.0.44.80: . ack 3343177365 win 17520 (DF) (ttl 125, id 45549, len 40)

7949850 1113409122.157444 96.129.204.176.1309 > 48.84.0.44.80: P 3022693051:3022693691(640) ack 3343177365 win 17520 (DF) (ttl 125, id 45550, len 680)

7949851 1113409122.241966 96.129.204.176.1309 < 48.84.0.44.80: . ack 3022693691 win 7040 (DF) (ttl 242, id 6279, len 40)

7949851 1113409123.033030 96.129.204.176.1309 < 48.84.0.44.80: . 3343177365:3343178825(1460) ack 3022693691 win 7040 (DF) (ttl 242, id 6281, len 1500)

7949851 1113409123.033050 96.129.204.176.1309 < 48.84.0.44.80: FP 3343180285:3343180307(22) ack 3022693691 win 7040 (DF) (ttl 242, id 6285, len 62)

7949851 1113409123.033131 96.129.204.176.1309 < 48.84.0.44.80: . 3343178825:3343180285(1460) ack 3022693691 win 7040 (DF) (ttl 242, id 6283, len 1500)

7949850 1113409123.166760 96.129.204.176.1309 > 48.84.0.44.80: . ack 3343180285 win 17520 (DF) (ttl 125, id 45553, len 40)

7949850 1113409123.167094 96.129.204.176.1309 > 48.84.0.44.80: . ack 3343180308 win 17498 (DF) (ttl 125, id 45554, len 40)

7949850 1113409123.167591 96.129.204.176.1309 > 48.84.0.44.80: F 3022693691:3022693691(0) ack 3343180308 win 17498 (DF) (ttl 125, id 45557, len 40)

7949851 1113409123.251892 96.129.204.176.1309 < 48.84.0.44.80: . ack 3022693692 win 7040 (DF) (ttl 242, id 6287, len 40)

 

 

 

 

 

November 1

Calculating application delay – algorithm changed again to parse the data correctly

 

While running the above algorithm, I noticed that there are ACK packets that are not piggybacked on data but whose size is not equal to the header size (40 bytes). These ACK packets are 52 bytes long and carry additional information: bytes that correspond to TCP options such as the timestamp option (its purpose is to track the round-trip delivery time of data in order to identify changes in latency that may require adjustments to the acknowledgment timer). These are not piggybacked ACKs, since they carry no data (only the ACK flag and the TCP options are set) and they are not acknowledged by other ACKs. Therefore, the above algorithm, which recognizes ACK-only packets by their size, cannot be applied. This is the correct version of the algorithm that calculates the application delay (per-packet delay):

 

1. For each flow 'f' in trace {
2.     sort( pkts('f'), timestamp );
3.     For each data_segment 'd' [ a line that does not contain ': . ack ' is a data line, possibly with a piggybacked ACK ] {
4.         if( 'd' == marked_coupled ) {
5.             next data_segment;
6.         }
7.         For each ACK packet 'a' of 'f' [ ACKs are TCP segments with tcp.header.ACKflag == 1, therefore the line contains ' ack ' ] {
8.             if 'a' is an ACK-only packet ( line contains ': . ack' ), parse it as an ACK-only line;
9.             if 'a' is a piggybacked ACK ( line does not contain ': . ack' ), parse it as a data line;
10.            if( direction('a') == direction('d') || ts('a') < ts('d') ) next ACK packet;
11.            if( ( ack('a') == seq('d') + len('d') + synfin('d') ) && ( ts('a') >= ts('d') ) ) { [ synfin('d') is 1 if 'd' has the SYN or FIN flag set, because each of these flags consumes one sequence number, and 0 otherwise ]
12.                mark( 'd', marked_coupled );
13.                delay = ts('a') - ts('d');
14.                mark( all packets with sequence number == seq('d'), marked_coupled );
15.                next data_segment;
16.            }
17.        }
18.        if( 'd' NOT marked_coupled ) {
19.            For each ACK packet 'a' of 'f' [ ACKs are TCP segments with tcp.header.ACKflag == 1, therefore the line contains ' ack ' ] {
20.                if 'a' is an ACK-only packet ( line contains ': . ack' ), parse it as an ACK-only line;
21.                if 'a' is a piggybacked ACK ( line does not contain ': . ack' ), parse it as a data line;
22.                if( direction('a') == direction('d') || ts('a') < ts('d') ) next ACK packet;
23.                if( ( ack('a') > seq('d') + len('d') + synfin('d') ) && ( ts('a') >= ts('d') ) ) {
24.                    mark( 'd', marked_coupled );
25.                    delay = ts('a') - ts('d');
26.                    mark( all packets with sequence number == seq('d'), marked_coupled );
27.                    next data_segment;
28.                }
29.            }
30.        }
31.    }
32. }
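
The line classification used in steps 3, 8 and 9 can be sketched in Python as follows. This is only an illustration: `classify` and the two pattern names are hypothetical, and the three input lines are copied from the trace excerpt of October 30. Because the test looks at the text of the line and not at its size, ACK-only packets of 52 bytes with TCP options are classified in the same way as those of 40 bytes.

```python
import re

PURE_ACK = re.compile(r":\s\.\sack\s")   # ": . ack " marks an ACK that carries no data
HAS_ACK = re.compile(r"\sack\s+(\d+)")   # any line whose ACK flag is set

def classify(line):
    """Return (kind, ack_number) where kind is 'ack', 'data+ack' or 'data'."""
    m = HAS_ACK.search(line)
    ack_no = int(m.group(1)) if m else None
    if PURE_ACK.search(line):
        return "ack", ack_no
    return ("data+ack" if m else "data"), ack_no

lines = [
    "7949850 1113409121.996529 96.129.204.176.1309 > 48.84.0.44.80: S 3022693050:3022693050(0) win 16384 <mss 1460,nop,nop,sackOK> (DF) (ttl 125, id 45548, len 40)",
    "7949850 1113409122.157444 96.129.204.176.1309 > 48.84.0.44.80: P 3022693051:3022693691(640) ack 3343177365 win 17520 (DF) (ttl 125, id 45550, len 680)",
    "7949851 1113409122.241966 96.129.204.176.1309 < 48.84.0.44.80: . ack 3022693691 win 7040 (DF) (ttl 242, id 6279, len 40)",
]
for line in lines:
    print(classify(line))
```

The first line is a SYN without an ACK, the second is a data segment with a piggybacked ACK, and the third is an ACK-only line.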

 

Information we extract:

flowid packettype ip1 destination ip2 time_ack time_data delay(ms) packet_size count_coupled not_coupled

 

Reference: http://www.securityfocus.com/infocus/1223

 

 

#!/usr/bin/perl

 

#input of this algorithm is a file which corresponds to a flow e.g /home/maria/Desktop/delay/flow96.219.6.34_34.56.32.4

 

my $string = $ARGV[0];  # as input takes the file which contains a flow

my $flowid = $ARGV[1];  # this is a counter which is increased every time a flow file is loaded. Therefore different flows have different numbers.

 

$infile = $string;

open (IN, "<$infile");

 

# the file delays.txt contains the below information

# flowid packet_type ip1 destination ip2 time_of_ack time_of_datasegment delay packet_size packets_coupled packets_not_coupled

# flowid: packets with the same flowid belong to the same flow

# packet_type: the type of data segment: SYN (S), FIN (F), PUSH(P), etc

# ip1: the ip number and port number of sender or receiver

# destination: > if ip1 is sender and ip2 receiver, < if ip1 is receiver and ip2 sender

# time_of_ack: time ack is captured for data_segment

# time_of_datasegment: time datasegment is captured

# delay: time_of_ack - time_of_datasegment in msec

# packet_size: the size of the packet in bytes

# packets_coupled: how many data_segments were examined

# packets_not_coupled: how many of the examined data_segments were not coupled

 

my $outfile = "delays.txt";  

my @data = <IN>;

close IN;

$count_data=0;    #initialize counter that counts how may data_segments we have in the flow

foreach $line (@data){

      if ($line =~ /^#/){     #(it is a comment line)

         next;

       }

           

      if (!($line=~m/:\s\.\sack\s/i)){ # if the line is a data_segment (if it is not an only ack line)

            #track lines which have ack

            $line=~ /^(\d+)\s+(\S+)\s+(\S+)\s+(\S+)\s+(\S+)\s+(\S+)\s+(\d+)\S{1}(\d+)\S{1}(\d+)\S{1}\s+(\w+)\s(\d+).+\S+\s(\d+)\S+/;

            print "data......."."$line";       

            $seq_num = $7;    #this is the sequence number of the first segment in the data line

            $timestamp=$2;    # the timestamp the data is captured

            $ip1 = $3;        # the ip1 of the sender/receiver

            $ip2 = $5;        # the ip2 of the receiver/sender

            $destination = $4;  # the symbol which clarifies which one is the sender and which one os the receiver

            #print $destination;

            $length = $9;     # the length of the segment e.g 4 then 4 segments are sent

            $endpacket = $8;  # the sequence number of  the last packet of the segment

            $size = $12;      # the size of the packet in bytes

            $symbol = $6;     # this gives us the packet type: S,F,R, .

     

            if ($coupled{$seq_num}){            #if packet is coupled then go to next packet

                        next;

             }

            $count_data = $count_data+1;    # increase counter for every data you examine

     

            foreach $line2 (@data){       # search for this data all trace to find the corresponding ack

                  if ($line2 =~ /^#/){    #(it is a comment line)

                        next;

                   }

                  if ($line2 =~m/\sack\s/){  # examine only for lines that contain ack (the ack flag is set)       

                  if (!($line2=~m/:\s\.\sack\s/i)){       #if ack line is piggybacked then capture it correclty

                        $line2=~ /^(\d+)\s+(\S+)\s+(\S+)\s+(\S+)\s+(\S+)\s+(\S+)\s+(\d+)\S{1}(\d+)\S{1}(\d+)\S{1}\s+(\w+)\s(\d+).+\S+\s(\d+)\S+/;

                       

                        # the ack we are searching must have opposite destination of the data and bigger timestamp

                        $destination2=$4;  

                        $timestamp2=$2;

                        if (($destination2 eq $destination) || ($timestamp2<$timestamp1)){       

                              #print $destination2;

                              next;

                        }                            

                        # if the ack we are examining is a possible ack then capture its sequence number                    

                        $seq_num2=$11;

                       

                  }

                  elsif ($line2=~m/:\s\.\sack\s/i) { #if ack line is only ack without data capture it correctly

                        $line2=~ /^(\d+)\s+(\S+)\s+(\S+)\s+(\S+)\s+(\S+)\s+(\S)\s+(\w+)\s+(\d+).+\S+\s(\d+)\S+/;

                        $destination2=$4;

                        $timestamp2=$2;

                        # check as before

                        if (($destination2 eq $destination) || ($timestamp2<$timestamp1)){        #(it is a comment line)

                              #print $destination2;

                              next;

                        }

                        print "ack "."$line2";

                        $seq_num2 = $8;

                        }

 

                  # if the ack we examine is the corresponding ack of our data 

                  if (($seq_num2 == $seq_num + $length +1)&&($timestamp2 > $timestamp)){

                        print "packet coupled";

                        $coupled{$seq_num}=1;  #mark that packet with seq_num is coupled

                        $ack_time{$seq_num}=$timestamp2;

                        $data_time{$seq_num}=$timestamp;

                              $delay{$seq_num}= ($timestamp2 - $timestamp)*1000;

                        $packet_size{$seq_num}=$size;

                        $ip_1{$seq_num} = $ip1;

                        $ip_2{$seq_num} = $ip2;

                        $destin{$seq_num} = $destination;

                        $symbolo{$seq_num} = $symbol;

                        last;   #if ACK found and packet coupled go to next data.. exit this for                          

                  }

            }  #end ror each line2

    

            if (!$coupled{$seq_num}){  #if packet still not coupled then we must search for the next ack with greater sequence number

                  print "not found yet for "."$seq_num";

                        foreach $line2 (@data){    #search for an ACK with greater seq_num

                              if ($line2 =~ /^#/){  #(it is a comment line)

                              next;

                        }     

                        if (!($line2=~m/:\s\.\sack\s/i)){       #if ack line is piggybacked then capture it correclty

                        $line2=~ /^(\d+)\s+(\S+)\s+(\S+)\s+(\S+)\s+(\S+)\s+(\S+)\s+(\d+)\S{1}(\d+)\S{1}(\d+)\S{1}\s+(\w+)\s(\d+).+\S+\s(\d+)\S+/;

                             

                              # the ack we are searching must have opposite destination of the data and bigger timestamp

 

                              $destination2=$4;

                              $timestamp2=$2;

                              if (($destination2 eq $destination) || ($timestamp2<$timestamp1)){  #(it is a comment line)

                                    #print $destination2;

                                    next;

                              }                            

                              #print "piggy "."$line2";                            

                              #print "$2\n";                           

                              #$timestamp2=$2;

                              $seq_num2=$11;

                              #print $destination2;

                              #print "$seq_num2\n";

                              #print "$symbol "."$1 "."$2 "."$3 "."$4 "."$5 "."$6 "."$7 "."$8 "."$9\n";

                        }

                        elsif ($line2=~m/:\s\.\sack\s/i)    { #if ack line is only ack without data capture it correctly

                              $line2=~ /^(\d+)\s+(\S+)\s+(\S+)\s+(\S+)\s+(\S+)\s+(\S)\s+(\w+)\s+(\d+).+\S+\s(\d+)\S+/;

                              $destination2=$4;

                              $timestamp2=$2;

                              if (($destination2 eq $destination) || ($timestamp2<$timestamp1)){  #(it is a comment line)

                                    #print $destination2;

                                    next;

                              }

                              #print $symbol;

                       

                              print "ack"."$line2";

                              #$timestamp2=$2;

                              #print $destination2;

                              $seq_num2 = $8;

                        }

                        if (($seq_num2 > $seq_num + $length +1)&&($timestamp2>$timestamp)){

                                    $coupled{$seq_num} = 1;

                                    print "found with grater seq num"."$seq_num"." -----"."$seq_num2\n";

                                    #print "$seq_num2\n";

                                    $ack_time{$seq_num}=$timestamp2;

                                    $data_time{$seq_num}=$timestamp;

                                    $delay{$seq_num} = ($timestamp2 - $timestamp)*1000;

                                    $packet_size{$seq_num} = $size;

                                    $ip_1{$seq_num} = $ip1;

                                    $ip_2{$seq_num} = $ip2;

                                    $destin{$seq_num} = $destination;

                                    $symbolo{$seq_num} = $symbol;

                                    last;

                        }

                       

                         } #end foreach $line2

            } #end if not coupled

      }#if line 2 is an ack

      }#if data line

} #end foreach $line

 

 

 # number of data packets that were matched to an ACK
 $count_coupled = scalar(keys %coupled);

 

 

$not_coupled = $count_data - $count_coupled;

open (OUT, ">>", $outfile) or die "Cannot open $outfile for appending: $!";

foreach $seq_num (sort { $a <=> $b } keys %coupled){

 

print OUT "$flowid "."$symbolo{$seq_num} "."$ip_1{$seq_num} "."$destin{$seq_num} "."$ip_2{$seq_num} "."$ack_time{$seq_num} "."$data_time{$seq_num} "."$delay{$seq_num} "."$packet_size{$seq_num} "."$count_coupled "."$not_coupled\n";

}

 

close OUT;

 

 

 

 

October 3

TCP Packet Structure as captured by tcpdump

13:10:30.134481 [timestamp] my.win98.box.2172 [source address and port] > anon.ftp.box.21: [destination address and port] 
[TCP flags] 71870464:71870465 [sequence numbers] (1) [bytes of data] ack 3456789 [acknowledgment number] win 65535 [TCP window size] [checksum] 
[urgent] [TCP options] (DF) [don't fragment flag is set] (ttl 128, id 44644, len 41) [time to live value, IP 
identification number, total length of the IP datagram]
 

Every line corresponds to one captured TCP packet.
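The annotated line above can be split into fields with one regular expression. The following is only a sketch: the sub name parse_tcpdump_line and the hash keys are made up for illustration, the flag "P" in the sample is an assumption (the annotated line does not show the actual flag), and the layout is the classic tcpdump text output, which may differ from the format of our trace.

```perl
use strict;
use warnings;

# Parse one tcpdump text line into named fields.
# Assumed layout (classic tcpdump output):
#   time src > dst: flags first:last(nbytes) ack N win W
sub parse_tcpdump_line {
    my ($line) = @_;
    return unless $line =~ m{
        ^(\S+)\s+                      # timestamp
        (\S+)\s+>\s+                   # source address and port
        (\S+):\s+                      # destination address and port
        (\S+)\s+                       # TCP flags: S, F, P, R or a single dot
        (?:(\d+):(\d+)\((\d+)\)\s+)?   # first:last(nbytes), absent on pure ACKs
        (?:ack\s+(\d+)\s+)?            # acknowledgment number, if present
        win\s+(\d+)                    # advertised window
    }x;
    return {
        time  => $1, src  => $2, dst    => $3, flags => $4,
        first => $5, last => $6, nbytes => $7,
        ack   => $8, win  => $9,
    };
}

# Sample built from the annotated line above; the flag "P" is assumed.
my $sample = '13:10:30.134481 my.win98.box.2172 > anon.ftp.box.21: '
           . 'P 71870464:71870465(1) ack 3456789 win 65535 (DF)';
my $pkt = parse_tcpdump_line($sample);
print "$pkt->{src} -> $pkt->{dst} seq $pkt->{first} ",
      "len $pkt->{nbytes} ack $pkt->{ack}\n";
```

On a pure ACK line the sequence-number group does not match, so first, last and nbytes are left undefined.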

Timestamp: the time at which the packet was captured. Our trace uses a different timestamp format from the one shown above.

By default, the timestamp is printed as hours:minutes:seconds.fraction of a second, for example 13:10:30.134481 in the line above.

-t suppresses the timestamp output

-tt gives an unformatted timestamp: the number of seconds since the Unix epoch (1 January 1970, 00:00:00 UTC), for example 1029507868.335134

-ttt gives the interval, in microseconds, between each packet and the previous one

358020 orac.erg.abdn.ac.uk.1052 > 224.2.156.220.57392: udp 586
328704 orac.erg.abdn.ac.uk.1052 > 224.2.156.220.57392: udp 893
391361 orac.erg.abdn.ac.uk.1052 > 224.2.156.220.57392: udp 491
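Since the delay computation needs timestamps that can be subtracted, a default-format timestamp has to be converted to seconds first. The sketch below shows one way to do this; the sub names clock_to_seconds and delay_ms and the ACK time are hypothetical, and the conversion ignores traces that run past midnight.

```perl
use strict;
use warnings;

# Convert a default-format tcpdump timestamp (hh:mm:ss.fraction)
# to seconds since midnight.
sub clock_to_seconds {
    my ($stamp) = @_;
    my ($h, $m, $s) = split /:/, $stamp;
    return $h * 3600 + $m * 60 + $s;
}

# Delay in milliseconds between two timestamps given in seconds
# (two -tt values, or two converted default timestamps).
sub delay_ms {
    my ($data_time, $ack_time) = @_;
    return ($ack_time - $data_time) * 1000;
}

my $t_data = clock_to_seconds('13:10:30.134481');
my $t_ack  = clock_to_seconds('13:10:30.184481');   # hypothetical ACK time
printf "delay = %.3f ms\n", delay_ms($t_data, $t_ack);
```

With -tt output no conversion is needed, because the values are already plain seconds and can be passed to delay_ms directly.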

 

TCP flags:

SYN: synchronize sequence numbers

FIN: terminate connection

PSH: push; the receiver should deliver the data to the upper layer without waiting for more

URG: urgent flag

RST: Reset the connection

. (a single dot): none of the SYN, FIN, PSH or RST flags is set

 

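71870464:71870465 [sequence numbers]: 71870464 is the sequence number of the first byte of application data carried by this TCP segment.

71870465: the sequence number that follows the last byte of data in this segment (first + number of bytes). The range runs from the first value up to, but not including, this value.

(1) [bytes of data]: the number of data bytes in the segment, which is the difference between the two sequence numbers above.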

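ack 3456789: the ACK flag is set and 3456789 is the acknowledgment number, i.e. the next sequence number that the sender of this segment expects to receive from the other side.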
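The relation between sequence and acknowledgment numbers is what allows a data packet to be coupled with its ACK: an ACK travelling in the reverse direction covers a segment once its acknowledgment number reaches first + number of bytes. A minimal sketch, assuming cumulative ACKs and ignoring 32-bit sequence-number wraparound (the sub name ack_covers_segment is hypothetical and is not part of the script above):

```perl
use strict;
use warnings;

# Return 1 if a cumulative ACK number acknowledges the whole segment
# that starts at $first and carries $nbytes bytes of data, else 0.
sub ack_covers_segment {
    my ($first, $nbytes, $ack_num) = @_;
    my $next_seq = $first + $nbytes;   # the "last" value printed by tcpdump
    return $ack_num >= $next_seq ? 1 : 0;
}

print ack_covers_segment(71870464, 1, 71870465), "\n";   # exact ACK
print ack_covers_segment(71870464, 1, 71870464), "\n";   # segment not yet acknowledged
```

An acknowledgment number larger than first + number of bytes also covers the segment, which is the case for delayed ACKs that acknowledge several segments at once.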

win 65535: the advertised receive window. It is the number of bytes of receive buffer space that the sender of this segment currently has available, and it tells the other side how much data it may send without overflowing that buffer. Each side advertises its own window independently; the two values are not negotiated down to a common minimum.

[urgent]: if the URG flag is set, the urgent pointer is shown here. The urgent pointer is an offset from the sequence number of the segment that marks the end of the urgent data.

DF: when TCP segments are destined for a non-local network, the "do not fragment" bit is set in the IP header. Any router or link along the path can have an MTU that differs from that of the two hosts. If a link has an MTU that is too small for the IP datagram being routed, the router would normally fragment the datagram. Because the "do not fragment" bit is set, it must instead discard the datagram and inform the sending host, with an ICMP "fragmentation needed and DF set" message, that the datagram cannot be forwarded without fragmentation.

(ttl 128, id 44644, len 41) [time to live value, IP identification number, total length of the IP datagram]

ttl: measured in hops. Every time an IP datagram passes through a router, this value is decremented by one. If it reaches zero, the router discards the datagram. The ttl prevents an IP datagram from wandering around the Internet indefinitely: if it does not reach its destination in "time", it is dropped.

id: the identification number of the IP datagram, used to reassemble the fragments of a datagram.

len: the total length of the IP datagram in bytes (IP header, TCP header and data).
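As a quick check of the len field, the values from the sample line can be extracted and compared with the single byte of data reported there. This is a sketch only: the sub name parse_ip_info is hypothetical, and the 20-byte IP and TCP header sizes are an assumption that holds only when neither header carries options.

```perl
use strict;
use warnings;

# Extract ttl, id and len from the verbose IP part of a tcpdump line.
sub parse_ip_info {
    my ($text) = @_;
    return unless $text =~ /\(ttl\s+(\d+),\s+id\s+(\d+),\s+len\s+(\d+)\)/;
    return { ttl => $1, id => $2, len => $3 };
}

my $ip = parse_ip_info('(DF) (ttl 128, id 44644, len 41)');

# Assumption: 20-byte IP header and 20-byte TCP header (no options).
my $payload = $ip->{len} - 20 - 20;
print "ttl=$ip->{ttl} id=$ip->{id} payload=$payload byte(s)\n";
```

The result of 1 byte agrees with the (1) shown after the sequence numbers in the sample line.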

       

 

TCP options:

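Maximum segment size (MSS): the largest amount of TCP data, excluding the TCP and IP headers, that the sender of the option is willing to receive in a single segment. Choosing it to fit the MTU avoids fragmentation.
Each side announces its value in the SYN and SYN/ACK packets. If the option is absent, the default is 536 bytes   [RFC 793, RFC 1122]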

 

NOP (no operation): provides padding [RFC 793]

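SackOK: appears only in SYN and SYN/ACK packets. It indicates that selective acknowledgments are permitted on the connection. [RFC 2018]

Selective acknowledgment data: if SackOK was exchanged in the SYN and SYN/ACK packets, later TCP packets may carry one or more pairs of sequence numbers, each pair delimiting a block of data that has been received. [RFC 1072, RFC 2018]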

Timestamp: has two fields, the Timestamp Value field and the Timestamp Echo Reply field. The sender places its current timestamp in the value field; in the initial SYN, where there is nothing to echo yet, the echo field is zero. The receiver replies with an ACK whose echo field carries the sender's value. This is used to calculate the RTT of transmitted packets. [RFC 1323]

Window scale: defines a scale factor (a shift count) that allows windows larger than 2**16. The window size value found in the TCP 
header is shifted left by this count to obtain the true window size. The option appears only in SYN and SYN/ACK packets; if it 
is absent, the window size is used unscaled and is limited to 65,535 bytes. With the largest permitted shift count of 14, 
the receive window may be increased up to about 1 gigabyte (just under 2**30 = 1,073,741,824 bytes). This is useful for 
long fat networks ("elephants"), which have a large bandwidth-delay product. [RFC 1072, RFC 1323]
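The effect of the window scale option can be worked out with a short calculation. The sketch below assumes the shift count has already been read from the SYN or SYN/ACK packet; the sub name effective_window is hypothetical.

```perl
use strict;
use warnings;

# True window = 16-bit window value shifted left by the scale factor.
sub effective_window {
    my ($win, $shift) = @_;
    $shift = 0 unless defined $shift;   # option absent: no scaling
    die "shift count must be between 0 and 14\n" if $shift < 0 || $shift > 14;
    return $win * 2**$shift;
}

print effective_window(65535), "\n";       # unscaled window
print effective_window(65535, 14), "\n";   # largest window the option allows
```

The second call gives 65535 * 16384 = 1,073,725,440 bytes, the largest window the option allows.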
 
The RFCs do not define the position of each option in the option field, and not all options are supported by every operating 
system. These differences can be used for passive fingerprinting.
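Because the order of the options is left to the implementation, a script can record it as a simple signature. The sketch below assumes the older tcpdump notation in which options are printed between angle brackets and separated by commas; the sub name parse_tcp_options, the sample SYN line and its option values are all made up for illustration.

```perl
use strict;
use warnings;

# Split the option list of a tcpdump line into [name, value] pairs,
# preserving the order in which the options appear.
sub parse_tcp_options {
    my ($line) = @_;
    return () unless $line =~ /<([^>]*)>/;
    my @options;
    for my $item (split /,/, $1) {
        my ($name, @values) = split ' ', $item;
        push @options, [ $name, "@values" ];
    }
    return @options;
}

# Hypothetical SYN line; addresses and option values are invented.
my $syn = '13:10:29.1 a.1025 > b.21: S 100:100(0) win 65535 '
        . '<mss 1460,nop,wscale 2,nop,nop,sackOK> (DF)';
my @opts = parse_tcp_options($syn);
print join(' ', map { $_->[0] } @opts), "\n";   # option order as a signature
```

Newer tcpdump versions print the options in a different notation, so the pattern would need adjusting for such traces.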