Networking Analysis
This work belongs to Dan Qutaishat. It can be referenced using any of the established referencing methods, such as Harvard.
This report presents the analysis of labs 2, 3 and 4 and addresses the importance of HTTP, DNS and TCP, most of which can aid in troubleshooting issues that a user may face. Each section that covers a lab contains its analysis and a description of the learning experience. Each analysis is supported by screenshots of the lab work. To increase the learning gained from each lab, I carried out additional experiments and investigations as a personal challenge to push my comprehension further; these are also discussed within the report.
Lab 2 (HTTP) analysis:
1. Analysis of basic HTTP GET/response interaction:
After loading http://gaia.cs.umass.edu/wireshark-labs/HTTP-wireshark-file1.html, only two packets appear: the HTTP GET packet, which is the request sent to the server, and the HTTP OK packet, which is the server’s response indicating that the request succeeded and carrying the requested page back to the client.
The HTTP GET packet and the HTTP OK (response) packet can each be broken down into five layers: Frame, Ethernet II, Internet Protocol, Transmission Control Protocol and Hypertext Transfer Protocol. For this part of the lab, only the Hypertext Transfer Protocol layer needs to be analysed. The HTTP layer contains many fields. One important field is the HTTP version: in this case, both the host’s browser and the website’s server are running version 1.1. HTTP 1.1 is more efficient than 1.0 because it keeps the connection open by default, allowing the client to send several requests over the same connection (and, with pipelining, without waiting for a response to each request), which saves time. The HTTP layer also includes the ‘User-Agent’ field, which identifies the browser used on the host device, the operating system it is running (in this case Macintosh) and the processor architecture of the device. The website receives this information about the user’s device, which is one reason users should avoid visiting suspicious sites, as it may leave them vulnerable to attacks by hackers.
Conversely, the HTTP OK packet delivers the website’s server information to the user’s host device, so data is exchanged in both directions. This can help the user check which server answered, although a header on its own does not prove that the website is authentic. Additionally, the ‘Accept-Language’ field in the request is set to English (US); this is the language set in the browser, which the website can use to choose the language of its response. Moreover, the IP address of the host device (Src/source) and the IP address of the website’s server (Dst/destination) are displayed, which is helpful as it identifies the two endpoints of the packet transmission. Other notable fields in the response are the time at which the page was last modified, which is important for keeping track of the changes the website goes through and its different versions, and the type of file returned, i.e. text/html.
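As a rough illustration of how the fields discussed above sit inside the HTTP layer, the following is a minimal Python sketch that splits a raw HTTP message into its start line and headers. The two messages are illustrative stand-ins written for this example (the User-Agent and Server values are made up) and are not copied from the capture; `parse_http_message` is a hypothetical helper name.

```python
def parse_http_message(raw: str):
    """Split a raw HTTP/1.x message into its start line and a header dict."""
    head, _, _body = raw.partition("\r\n\r\n")  # headers end at the blank line
    lines = head.split("\r\n")
    start_line = lines[0]
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")  # split at the first colon only
        headers[name.strip().lower()] = value.strip()
    return start_line, headers


# Illustrative messages only (assumed values, not taken from the lab capture).
request = (
    "GET /wireshark-labs/HTTP-wireshark-file1.html HTTP/1.1\r\n"
    "Host: gaia.cs.umass.edu\r\n"
    "User-Agent: ExampleBrowser/1.0 (Macintosh)\r\n"
    "Accept-Language: en-US\r\n"
    "\r\n"
)
response = (
    "HTTP/1.1 200 OK\r\n"
    "Server: ExampleServer\r\n"
    "Content-Type: text/html\r\n"
    "\r\n"
    "<html></html>"
)

request_line, request_headers = parse_http_message(request)
status_line, response_headers = parse_http_message(response)

http_version = request_line.split()[2]    # third token of the request line
status_code = int(status_line.split()[1])  # second token of the status line
```

Header names are lower-cased here because HTTP header names are case-insensitive, which makes lookups such as `request_headers["user-agent"]` predictable.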
Learning Experience Description:
This part of the lab was extremely enlightening, as it showed me the potential risks of visiting websites that appear suspicious: doing so exposes a wealth of information about the user, including their IP address and hardware and software details about their device (e.g. processor architecture), which could be used to track or identify a specific machine. This risk can be reduced by browsing from a virtual machine instead of the host device, as the website would then receive only generic information about the system being used.
2. Analysis of HTTP conditional GET/response interaction:
For this part of the lab to work, the browser’s cache must be cleared first. This ensures that the first request fetches the page from the server rather than from a cached copy, and it makes it clearer which packets belong to the website being loaded. The key to this part of the lab is repetition: the website http://gaia.cs.umass.edu/wireshark-labs/HTTP-wireshark-file2.html is loaded more than once, so more than one set of HTTP GET/OK packets appears in the capture. Though the fields under the HTTP layer of these packets are similar, there are a few differences because the website was fetched/refreshed multiple times. Interestingly, the first request for the website did not show an “If-Modified-Since” field, but the requests sent after the website was refreshed did. This is because the browser now holds a cached copy, and the field carries the date and time at which that copy was last modified, so the server can tell whether the website has changed since the previous GET request.
After that, the third response packet was marked “Not Modified”, which shows that the server did not return the contents of the website because the browser could load them from its cache, so no page data was transferred. Additionally, there was a connection field, “Connection: Keep-Alive”, with two parameters, “timeout” and “max”, which represent how long the connection is guaranteed to stay open and how many requests it will accept; in this case it is for five transactions and 98 seconds. During this time the connection is persistent, so multiple requests can be sent to the server over the same connection.
Another interesting field under the HTTP layer is “Cache-Control”, which is set to “max-age = 0”. This is significant because it marks the content as stale immediately, so it must be re-fetched or revalidated with the server before being reused. For experimental purposes, I decided to refresh/reload the website five more times to test the accuracy of the data shown by Wireshark. Once the website was refreshed for the fifth time, the response packet showed a “403 Forbidden” message; this client error code means that the server received the request but refused to authorize it, so the page was not served. At the same time, the connection field changed from “Keep-Alive” to “Close”, meaning that the client/server asked to close the connection. The connection therefore has to be re-established with the server by reopening the link to the site; until then, the website can only be brought up from cached material and no new information is transferred. The experiment thus suggested that the data shown by Wireshark was reliable, as the connection to the website closed after five transactions of HTTP GET/OK packets. The remaining fields of the HTTP layer were very similar across the different HTTP GET/OK interactions. Much of the information was repeated between them, such as the language, the connection time, the hardware information and the IP and MAC addresses of the host and the website’s server.
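The decision behind the “Not Modified” response can be sketched in a few lines of Python. This is only a simplified model of what a server does with the “If-Modified-Since” date, not the actual logic of the gaia.cs.umass.edu server; the function name and the dates in the usage note are hypothetical.

```python
from email.utils import parsedate_to_datetime
from typing import Optional


def conditional_get_status(last_modified: str,
                           if_modified_since: Optional[str]) -> int:
    """Return 200 if the full page must be sent, 304 if the cached copy is valid."""
    if if_modified_since is None:
        # First visit: the browser has no cached copy, so it sends no date.
        return 200
    resource_time = parsedate_to_datetime(last_modified)
    cached_time = parsedate_to_datetime(if_modified_since)
    # Unchanged since the cached copy was made: answer 304 with no body.
    return 304 if resource_time <= cached_time else 200
```

For example, with a made-up modification date of "Mon, 01 Jan 2024 10:00:00 GMT", a first request with no date gives 200, a repeat request carrying the same date gives 304, and a request carrying an older date gives 200 again.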
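The “Keep-Alive” parameters and the “Cache-Control” directive share the same comma-separated name=value layout, so both can be read with one small parser. The sketch below is illustrative only: `parse_directives` and `is_stale_immediately` are hypothetical helper names, and the header strings in the usage note are rebuilt from the figures quoted in the text rather than copied from the capture.

```python
def parse_directives(header_value: str) -> dict:
    """Parse a 'name=value, name=value' header value into a dict."""
    directives = {}
    for part in header_value.split(","):
        name, _, value = part.strip().partition("=")
        name = name.strip().lower()
        value = value.strip()
        # Numeric values (timeout, max, max-age) are stored as integers.
        directives[name] = int(value) if value.isdigit() else value
    return directives


def is_stale_immediately(cache_control: str) -> bool:
    """True when Cache-Control sets max-age to 0, i.e. revalidate every time."""
    return parse_directives(cache_control).get("max-age") == 0
```

Under these assumptions, `parse_directives("timeout=98, max=5")` yields the two Keep-Alive parameters as integers, and `is_stale_immediately("max-age = 0")` is true.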
Learning Experience Description:
This lab is important because it encourages the user to be more aware of the type of connection in use and how long it is guaranteed to stay open. It also teaches the value of clearing the cache regularly and of refreshing sites or re-establishing the connection to the server in order to maintain a working connection.
3. Analysis of Retrieving Long Documents:
After loading http://gaia.cs.umass.edu/wireshark-labs/HTTP-wireshark-file3.html (the Bill of Rights document), it was evident that the file was too large to fit in one packet, so the single HTTP response required more than one TCP segment. The data was sent in four segments containing 1440, 1440, 1440 and 541 bytes respectively, a total of 4861 bytes. These segments are then reassembled. Because the protocols are arranged in a hierarchy in which HTTP is a higher-level protocol than TCP, the TCP segments must be reassembled in a contiguous sequence (#57, #58, #59, #60) to reconstruct the entire HTTP message. Analysing the data-carrying segments shows that #57 contained the introduction of the Bill of Rights document, #58 contained the content from after the introduction up to amendment V, #59 contained the content from amendment V to amendment VIII, and #60 contained the content from amendment VIII to amendment X. To check that no packets were lost after the reassembly of the TCP segments, I used the packet counter tool in Wireshark, which showed that 100% of the HTTP packets were successfully transmitted and received. When comparing the number of bytes captured for the HTTP GET and the HTTP OK, the number of bytes received is much greater than the number of bytes sent, because the response carries the whole document, which TCP segmentation splits across several packets, whereas the GET request carries only a short set of headers.
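The reassembly step can be illustrated with a short Python sketch. It uses the four segment sizes reported above, but the payloads are placeholder zero bytes and the starting sequence number of 1 is an assumption made for the example; `reassemble` is a hypothetical helper, not Wireshark’s own routine.

```python
def reassemble(segments):
    """Order (sequence_number, payload) pairs and join them if contiguous."""
    expected = None
    data = bytearray()
    for seq, payload in sorted(segments, key=lambda item: item[0]):
        if expected is not None and seq != expected:
            # A gap means a segment is missing, so the message is incomplete.
            raise ValueError(f"gap: expected byte {expected}, got {seq}")
        data.extend(payload)
        expected = seq + len(payload)  # sequence number of the next byte
    return bytes(data)


sizes = [1440, 1440, 1440, 541]  # segment sizes reported in the lab

segments = []
seq = 1  # assumed relative sequence number of the first data byte
for size in sizes:
    segments.append((seq, bytes(size)))  # placeholder payload of zero bytes
    seq += size

segments.reverse()  # simulate out-of-order arrival
message = reassemble(segments)
total_bytes = len(message)
```

Sorting by sequence number before joining is what lets the receiver rebuild the HTTP message even when segments arrive out of order.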
Learning Experience Description:
This was beneficial as it expanded my understanding of TCP segmentation and the layered hierarchy of the different protocols, which can be applied when analysing packet transmission.
4. Analysis of HTML Documents with Embedded Objects:
After loading http://gaia.cs.umass.edu/wireshark-labs/HTTP-wireshark-file4.html, three HTTP GET request messages are sent, one for each of the page, the Pearson logo and the image of the Pearson book cover. The IP address for the page is 128.119.245.12, the IP address for the Pearson logo is 128.119.245.12, whilst the IP address for the image of the Pearson book cover is 178.79.137.164. This is important as it tells us that the page and the Pearson logo come from the same server, whilst the image of the book cover comes from a different one. It is clear that the images were downloaded serially, because the second image was requested only after the response for the first image had been returned.
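The comparison of sources can be made explicit by grouping the three objects by the IP address they were fetched from. This small Python sketch uses the addresses reported above; the object labels and the helper name `group_by_server` are my own.

```python
from collections import defaultdict


def group_by_server(objects: dict) -> dict:
    """Group object names by the IP address that served them."""
    by_server = defaultdict(list)
    for name, ip_address in objects.items():
        by_server[ip_address].append(name)
    return dict(by_server)


# IP addresses as reported in the lab; labels are descriptive only.
fetched = {
    "page": "128.119.245.12",
    "Pearson logo": "128.119.245.12",
    "book cover image": "178.79.137.164",
}
servers = group_by_server(fetched)
```

Two keys in `servers` means two distinct servers were contacted to render one page.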
Learning Experience Description:
For investigative purposes, I ran this section of the lab again using a VPN that placed me in the UK. This was helpful to my learning, as different IP addresses showed up for the images; this might mean that the servers I reached without the VPN were chosen for my region, which may explain why the image and the site came from the same source.
Lab 3 (DNS) analysis:
1. Using “nslookup”:
Because the most commonly visited sites are social media sites, I decided to run the command on some of the most popular ones: Instagram, Facebook, Twitter and WhatsApp. The output shows the IP addresses of the sites and the names of the domains looked up. The results for Instagram, Facebook and WhatsApp are interesting because they have the same IP addresses; this is because Facebook owns Instagram and WhatsApp, so they appear to operate from the same servers.
A non-authoritative answer section is shown because the answer came from the local DNS server, which is not the authoritative server for the domain and instead obtained the answer from its cache or by contacting one or more external DNS servers. Such results can be inaccurate, because a cached answer may be out of date and the name being looked up may not be in the local name server’s cache at all. The user can therefore choose the DNS server to which the query is sent, which helps as authoritative answers can then be displayed. To do this, the type is first set to NS to find the name servers responsible for the domain, and the query is then sent to one of those servers. I looked up the social media sites again using this modification to the nslookup command and authoritative answers were shown, which increased the reliability of the results as the data was provided by its original source.
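A comparable lookup can be done from Python with the standard `socket` module, which asks the system resolver in much the same way as a plain nslookup query. This is a minimal sketch: it does not report whether an answer is authoritative, and `lookup_ipv4` and `shared_addresses` are hypothetical helper names. The addresses in the usage note are documentation placeholders, not the real addresses of any site.

```python
import socket


def lookup_ipv4(hostname: str) -> set:
    """Return the IPv4 addresses the system resolver reports for a hostname."""
    _name, _aliases, addresses = socket.gethostbyname_ex(hostname)
    return set(addresses)


def shared_addresses(first: set, second: set) -> set:
    """Addresses that appear in both lookups (same server behind both names)."""
    return first & second
```

For example, `shared_addresses({"192.0.2.1", "192.0.2.2"}, {"192.0.2.2"})` returns `{"192.0.2.2"}`. With network access, `shared_addresses(lookup_ipv4("www.instagram.com"), lookup_ipv4("www.facebook.com"))` would show whether two names currently resolve to a common address.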
Learning Experience Description:
The use of nslookup is beneficial because users can use it to troubleshoot name resolution issues, for example with their Active Directory, as it can show whether all the server names are being resolved successfully in the DNS.
2. Using “ifconfig”:
Because my device runs macOS, ipconfig is not available, so I used its alternative, ifconfig. Running ifconfig -a (the equivalent of ipconfig /all) displays a range of different interfaces: loopback, Software Network Interface, 6to4 tunnel interface, Ethernet 5, access point, Ethernet 0, Apple Wireless Direct Link, Low-latency WLAN interface and the Thunderbolt Bridge. Some of these are shown as active whilst the rest are inactive. The output is also helpful as it shows the frame size limit of each interface: for example, an interface noted as having “mtu 1500” has a maximum transmission unit of 1500 bytes, which is the largest packet that can be sent in a single frame. Moreover, I displayed the DNS servers accessible from my device, and a set of resolvers was shown: resolver #1 handles my DNS lookups and has its name server set to my IP address, resolver #2 handles the local domain, and the remaining resolvers belong to the “.arpa” domain, which is used for reverse lookups, so some map IPv4 addresses to internet domain names whilst others map IPv6 addresses to internet domain names. Because the DNS servers are visible, this is a good intermediate step towards ensuring that flushing the DNS cache is effective.
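To show how the “mtu” value can be read from the ifconfig output, here is a minimal Python sketch that extracts the interface name, the UP flag and the MTU from each interface header line. The sample text only imitates the general shape of macOS ifconfig output and is not my actual output. The regular expression is an assumption about that shape, and the UP flag is not the same thing as the “status: active” line.

```python
import re

# Matches header lines such as "en0: flags=8863<UP,BROADCAST,...> mtu 1500".
INTERFACE_LINE = re.compile(
    r"^(?P<name>\S+): flags=[0-9a-fA-F]+<(?P<flags>[^>]*)> mtu (?P<mtu>\d+)"
)


def parse_interfaces(output: str) -> dict:
    """Map each interface name to its MTU and whether the UP flag is set."""
    interfaces = {}
    for line in output.splitlines():
        match = INTERFACE_LINE.match(line)
        if match:
            flags = match["flags"].split(",") if match["flags"] else []
            interfaces[match["name"]] = {
                "mtu": int(match["mtu"]),
                "up": "UP" in flags,
            }
    return interfaces


# Illustrative sample only (assumed layout, not copied from a real machine).
sample = (
    "lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> mtu 16384\n"
    "\tinet 127.0.0.1 netmask 0xff000000\n"
    "en0: flags=8863<UP,BROADCAST,SMART,RUNNING,SIMPLEX,MULTICAST> mtu 1500\n"
    "stf0: flags=0<> mtu 1280\n"
)
parsed = parse_interfaces(sample)
```

Indented detail lines (such as the `inet` line) are skipped because they do not match the header pattern.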
3. Tracing DNS with Wireshark:
After clearing the DNS cache and the browser cache, I visited the www.ntu.ac.uk site. After running Wireshark and filtering the results by IP address, the capture shows that the DNS query and response messages are sent over UDP, with source port 56788 and destination port 4433. Being able to trace the DNS queries and responses is important for troubleshooting, e.g. when there are problems connecting to a website or server, as it helps determine whether the issue lies with the local network or with the site’s server.
The nslookup command is then used as in part 1 of this lab, but with www.ntu.ac.uk. When the DNS queries and responses are analysed, it is clear that there are three queries and responses, with one question and no answers in the query packet, but one question and two answers in the response packet. They are standard queries and responses of type A, the DNS host record that contains the host name and its corresponding IPv4 address. When the type is set to “ns”, the number of queries and responses decreases from three to one, because the query now specifies the domain whose name servers are wanted. That is why the record types in these messages are NS and CNAME: they correspond to the DNS zone specified and to the use of another host name, for which the process can be repeated.
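The “one question, no answer” structure of a query packet follows from the DNS message layout: a 12-byte header with four counters, then the question. The Python sketch below builds such a query by hand following the standard DNS wire format. The query ID is an arbitrary value chosen for the example and `build_dns_query` is a hypothetical helper; nothing is sent over the network.

```python
import struct

QTYPE_A = 1    # host address record
QTYPE_NS = 2   # name server record
QCLASS_IN = 1  # Internet class


def build_dns_query(hostname: str, qtype: int = QTYPE_A,
                    query_id: int = 0x1234) -> bytes:
    """Build a standard DNS query: 12-byte header followed by one question."""
    # Flags 0x0100 = standard query with recursion desired.
    # Counters: 1 question, 0 answers, 0 authority, 0 additional records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # The name is encoded as length-prefixed labels ending with a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in hostname.split(".")
    ) + b"\x00"
    return header + qname + struct.pack("!HH", qtype, QCLASS_IN)


def read_counts(message: bytes):
    """Return (questions, answers) from a DNS message header."""
    _id, _flags, questions, answers, _ns, _ar = struct.unpack(
        "!HHHHHH", message[:12]
    )
    return questions, answers


query = build_dns_query("www.ntu.ac.uk")
```

Changing `qtype` to `QTYPE_NS` gives the kind of query that setting the type to “ns” in nslookup produces.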
Lab 4 (TCP) analysis:
Uploading the large text file via http://gaia.cs.umass.edu/wireshark-labs/lab3-1-reply.html causes a three-way handshake to occur. The first step is the SYN message, in which the client asks the server to open a connection; the second step is the SYN-ACK message, which is the server’s response to the client; and the third step is the ACK message, in which the client acknowledges the server’s response, confirming that the connection has been established. TCP tracks the transferred data using sequence numbers, which are set up during the handshake. The initial (relative) sequence number in this case is 0; the first TCP packet carrying data has sequence number 1, because the SYN itself consumes one sequence number, so the first byte of data is numbered 1.
The acknowledgement number is related to the sequence number: in the SYN/ACK packet it is the client’s initial sequence number + 1. Plotting the RTT values of the TCP segments is important as it shows whether there are high levels of congestion on the path to the website, which helps in troubleshooting connection issues; these plots can be examined in more detail by creating a TCP stream graph. Each TCP packet carries flags that correspond to the connection state, and each combination of flags has a different hex value. For example, for the SYN/ACK packet the flags value was 0x012, which is 10010 in binary; the two bits that are set correspond to the SYN and ACK flags. The same applies to the other packets: the flags value is 0x002 for SYN, 0x012 for SYN/ACK and 0x010 for ACK.
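The numbering used in the handshake can be written out as a small Python model. It assumes relative sequence numbers starting at 0 on both sides, as Wireshark displays them by default; real initial sequence numbers are large random values. The function name and dictionary layout are my own.

```python
def three_way_handshake(client_isn: int = 0, server_isn: int = 0):
    """Return the three handshake messages with their seq/ack numbers."""
    syn = {"flags": "SYN", "seq": client_isn, "ack": None}
    # The server acknowledges the client's SYN, which consumes one number.
    syn_ack = {"flags": "SYN, ACK", "seq": server_isn, "ack": client_isn + 1}
    # The client acknowledges the server's SYN in the same way.
    ack = {"flags": "ACK", "seq": client_isn + 1, "ack": server_isn + 1}
    return [syn, syn_ack, ack]


handshake = three_way_handshake()
first_data_seq = handshake[2]["seq"]  # first data byte follows the final ACK
```

Because the final ACK carries no data, the first data segment reuses its sequence number, which is why the first byte of the upload is numbered 1.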
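The mapping from a flags value to flag names is a bit-mask test, which the following short Python sketch illustrates. Only the six classic TCP flag bits are listed, and `decode_flags` is a hypothetical helper rather than a Wireshark function.

```python
# Bit positions of the classic TCP flags (lowest bit first).
TCP_FLAGS = {
    0x001: "FIN",
    0x002: "SYN",
    0x004: "RST",
    0x008: "PSH",
    0x010: "ACK",
    0x020: "URG",
}


def decode_flags(value: int) -> list:
    """Return the names of the flags whose bits are set in value."""
    return [name for bit, name in TCP_FLAGS.items() if value & bit]


syn_ack_bits = format(0x012, "b")  # binary form of the SYN/ACK flags value
```

Here `decode_flags(0x012)` gives `["SYN", "ACK"]`, since 0x012 is 0x002 and 0x010 combined.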
Learning Experience Description:
TCP analysis is significant as it allows congestion control to be observed. Congestion control works by the sender maintaining a congestion window, which limits the amount of unacknowledged data that can be in end-to-end transit at any time, so new packets are sent only as earlier ones are acknowledged. This keeps the flow steady, which improves network performance. Using the proper graphing methods enables the user to analyse the level of congestion and to see when the congestion avoidance mechanism takes effect.
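How a congestion window grows and when congestion avoidance takes over can be sketched with a simplified simulation. This is a Tahoe-style teaching model measured in whole segments per round trip, with an assumed threshold of 8 segments; it is not derived from the lab capture and ignores details such as fast retransmit.

```python
def simulate_cwnd(rounds: int, ssthresh: int = 8, loss_rounds=()):
    """Return the congestion window (in segments) at the start of each round."""
    cwnd = 1
    history = []
    for rtt in range(rounds):
        history.append(cwnd)
        if rtt in loss_rounds:
            # Loss detected: halve the threshold and restart from one segment.
            ssthresh = max(cwnd // 2, 1)
            cwnd = 1
        elif cwnd < ssthresh:
            # Slow start: the window doubles every round trip.
            cwnd = min(cwnd * 2, ssthresh)
        else:
            # Congestion avoidance: the window grows by one segment per round.
            cwnd += 1
    return history
```

With the defaults, `simulate_cwnd(6)` gives `[1, 2, 4, 8, 9, 10]`: exponential growth up to the threshold, then linear growth, which is the change of slope to look for in a TCP stream graph.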