Sunday, November 17, 2013

Network X Web Servers: What if you could set up everything from scratch?

You're a dummy programmer who's always been dealing with pre-configured hosting environments like cPanel et al. But this time, you want to take your control over the hosting to the next level: your own servers, your own configuration, your own security. Well, the following series of articles is for you. I will cut to the chase later, but for now, you have to understand the network matrix, which can be summarized in the following topics: how the Internet works, the OSI model, TCP/IP, IP addressing/subnetting, routers, firewalls and Quality of Service.

1) Digesting the network matrix in a few lines
To easily get what follows, keep in mind that network communication is pretty much similar to human communication, but in a more conventional and adaptive manner. In any friendly talk (network), when you (device 1) talk (send a request) to another person (device 2) using your voice (data), you expect the person to reply (a response, also carried as data). If that person doesn't want to hear what you're saying, he just has to plug his ears (firewall). Now, in bigger and more complex conversations (the Internet) such as debates or conferences, there is a need for a moderator (router) to arbitrate the communication between groups of people.
Now, let's go deeper by explaining the reverse: how machines communicate in a network. But first, I have to introduce you to a model called the Open Systems Interconnection (OSI) model, which was published by the International Organization for Standardization (ISO) in 1984 to make this type of communication easier to understand. You might ask yourself how? Well, let me walk you through a sweet example that will enlighten you.
The OSI model simply formalizes the communication between machines over a network by splitting the communication flow into layers. The following conversation is a client-to-server request for a PHP web page: http://www.example.com/index.php

Client  <<< ----------------------------- >>>  Server (example.com)

Application layer (L7):
  Client: The client requests a web page by typing a web address (a URL) into the application (the browser).
  Server: The web server application (PHP from the LAMP* stack) processes the data and produces an HTTP response (4).

Presentation layer (L6):
  Client: The request is formatted into a generic HTTP request message (3).
  Server: The response is formatted into a generic HTTP response message (3).

Session layer (L5):
  Client: The browser initiates an HTTP session by opening a TCP connection to the HTTP server with which it wishes to communicate.
  Server: The server sends back the response (the web page) and closes the connection.
  Client: Your web browser then parses the HTML of the web page to read the instructions (HTML tags) that tell it where to find the additional files to be displayed within the web page, such as style sheets and images. It automatically opens additional TCP connections to the server and requests those media.

Transport layer (L4):
  Client: The opened TCP connection(s) break up the request message(s) into manageable chunks, label them with numbers so they can be reassembled in the correct order, and transport the pieces across the correct session.
  Server: Reorders the data stream if the incoming packets are out of order, and multiplexes the data when different flows go to different applications on the same host.

Network layer (L3):
  Client: The Internet Protocol (IP) provides unique addresses for the web server and for your computer, then creates a message addressed to the web server, to be sent via your default gateway.
  Server: Reads each received packet to check whether it matches any access restriction or filtering rule set in the computer's firewall.

Data link layer (L2):
  Client: Your computer uses ARP to figure out the physical (MAC) address of the default gateway and then passes the messages to the network card. Once there, each chunk of the message is wrapped into a frame and forwarded to the Internet via the detected gateway.
  Server: The network card reads the received frames and converts them into packets readable by the machine.

Physical layer (L1):
  Client: Transmits the packets to the default gateway.
  Server: Receives the transmitted packets and sends them to the network card.

*LAMP: Linux-Apache-MySQL-PHP (5)
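
To make the transport layer step a bit more concrete, here is a minimal Python sketch of the "chunk, number, reorder" idea. It is only an illustration: the function names segment and reassemble are made up for this example, and real TCP numbers bytes rather than whole chunks.

```python
import random


def segment(message: bytes, size: int) -> list[tuple[int, bytes]]:
    """Split a message into (sequence_number, chunk) pairs of at most `size` bytes."""
    return [
        (seq, message[start:start + size])
        for seq, start in enumerate(range(0, len(message), size))
    ]


def reassemble(segments: list[tuple[int, bytes]]) -> bytes:
    """Put the chunks back in sequence-number order and join them."""
    return b"".join(chunk for _, chunk in sorted(segments))


# A hand-written HTTP request, used here only as sample data.
request = b"GET /index.php HTTP/1.1\r\nHost: www.example.com\r\n\r\n"

segments = segment(request, 16)
random.shuffle(segments)  # simulate chunks arriving out of order
print(reassemble(segments) == request)
```

Whatever order the chunks arrive in, the sequence numbers are enough for the receiving side to rebuild the original message.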

Such a layered separation can be applied to any type of communication between computers in order to understand or study it; the only difficult part is putting each element or process of the communication in the right layer.
However, the example above is an easy one and is only valid for a small number of machines. If you want a huge conversation between many computers, like the Internet, things will get jammed and your network will look like a tangled mess.


Figure 1: Traditional devices communication

That's why you need a moderator that works at layer 3 (the network layer, where IP operates) to route information using the computers' IP addresses. Such a moderator is called a router. What it does is divide the conversation between many computers into sub-conversations. After incorporating routers, your network will look like this:


Figure 2: Modern devices communication

This schema (Figure 2), when projected onto a global scale, is basically what we call the Internet.

Each router has a routing table that it uses to compute the next hop for a packet. The routing table stores the routes (and in some cases, the metrics associated with those routes) to particular network destinations. Your Linux / UNIX computer has one too: it is created automatically, based on the current TCP/IP configuration of the machine. You can manually add, modify or delete routes using the route and ip commands on Linux (6). In addition, some routers also provide security via NAT/DNAT, additional firewalls (not to be confused with the ones on your computer) and Quality of Service, as we'll see in future articles while setting up our own environment.
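
Here is a rough Python sketch of how a next hop can be picked from a routing table. The table entries and the next_hop function are invented for the example, and I am assuming the usual "most specific route wins" rule (longest prefix match); a real router's table holds more information than this.

```python
import ipaddress

# Hypothetical routing table: (destination network, next hop).
ROUTES = [
    (ipaddress.ip_network("0.0.0.0/0"), "192.168.1.254"),  # default route
    (ipaddress.ip_network("192.168.1.0/24"), "direct"),    # our own sub-network
    (ipaddress.ip_network("10.0.0.0/8"), "192.168.1.2"),   # reached via another router
]


def next_hop(destination: str) -> str:
    """Return the next hop of the most specific route that contains `destination`."""
    addr = ipaddress.ip_address(destination)
    matches = [(net, hop) for net, hop in ROUTES if addr in net]
    # The default route matches everything, so `matches` is never empty here.
    _, hop = max(matches, key=lambda entry: entry[0].prefixlen)
    return hop


print(next_hop("192.168.1.7"))    # direct
print(next_hop("10.1.2.3"))       # 192.168.1.2
print(next_hop("74.125.228.17"))  # 192.168.1.254
```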

Now you know that if the destination IP address doesn't match any entry in the routing table, you might have trouble in your conversation (this is nothing new, since it's the same scenario in human conversation :) ).

But:
- If the communication is only from IP to IP, how does my machine communicate via a URL in the browser?
- How does it work inside each sub-network monitored by a router?
- How about the interfacing between your computer and the router itself?

Ok ok. One question at a time.
Communication over the Internet is indeed IP-to-IP. Your machine has no clue what a URL is. At L7 (see the OSI-modeled communication example above), when the user provides the URL, your computer contacts your DNS servers to convert the host name in the supplied URL into an IP address, which your device then interprets as a sequence of four octets by converting each decimal number to binary. The following figure illustrates this translation process:
                   
                       Browser
               http://www.example.com
                         DNS
       74         125         228          17
                       Device
    01001010    01111101    11100100    00010001

Figure 3: Interpretation sequence of a URL by a device within a network
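
If you want to check the last step of this translation yourself, a few lines of Python are enough. The helper name to_binary_octets is just something I made up for this sketch, and the address is the sample one from the figure.

```python
def to_binary_octets(ip: str) -> list[str]:
    """Convert a dotted-decimal IPv4 address into four 8-bit binary strings."""
    return [format(int(octet), "08b") for octet in ip.split(".")]


print(" ".join(to_binary_octets("74.125.228.17")))
# 01001010 01111101 11100100 00010001
```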

For Linux users, you can find the IP addresses of your DNS servers in the file "/etc/resolv.conf". Those DNS servers' IPs are usually configured automatically by the router of the sub-network to which you belong.
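
As a small illustration, this Python sketch pulls the DNS server addresses out of text in the resolv.conf format. The sample contents and the parse_nameservers name are hypothetical; on a real Linux machine you would feed it the contents of /etc/resolv.conf instead.

```python
def parse_nameservers(resolv_conf_text: str) -> list[str]:
    """Return the addresses found on 'nameserver' lines, in file order."""
    servers = []
    for line in resolv_conf_text.splitlines():
        line = line.strip()
        if not line or line.startswith(("#", ";")):
            continue  # skip blank lines and comments
        fields = line.split()
        if len(fields) >= 2 and fields[0] == "nameserver":
            servers.append(fields[1])
    return servers


# Hypothetical file contents, for illustration only.
SAMPLE = """\
# generated by the network manager
nameserver 192.168.1.254
search home.lan
nameserver 8.8.8.8
"""

print(parse_nameservers(SAMPLE))
```

To try it on your own machine, read the file first, for example with open("/etc/resolv.conf").read(), and pass the result to the function.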

Now, talking about sub-networks, two things to keep in mind are the private IP and the public IP. In the very beginning, everybody used public IPs to communicate. After the arrival of routers, for security reasons and to prevent an eventual shortage of public IPs, private IPs were introduced for internal communication within sub-networks. Since then, when a request sent from your machine at L7 reaches L3, your router converts your private IP into a public IP, which is written into each packet header. No matter how many private IPs (hosts) are in your sub-network, they all share the same public IP.
The private IP is automatically assigned to the network card's interface of your device by the router of the sub-network to which you are connected, and it must fall within the ranges reserved by RFC 1918. Here are the private IP ranges, by class of network:

IP address range                 Number of addresses   Class of the network
10.0.0.0 - 10.255.255.255        16,777,216            class A
172.16.0.0 - 172.31.255.255      1,048,576             class B
192.168.0.0 - 192.168.255.255    65,536                class C
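
You can double-check these ranges with Python's standard ipaddress module. The sketch below writes the three ranges in CIDR notation (10.0.0.0/8, 172.16.0.0/12 and 192.168.0.0/16); the is_rfc1918 helper is a name I chose for the example.

```python
import ipaddress

# The three private ranges, written in CIDR notation.
RFC1918 = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]


def is_rfc1918(ip: str) -> bool:
    """True if `ip` falls inside one of the three private ranges."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in RFC1918)


for net in RFC1918:
    # First address, last address and size of each range.
    print(net[0], "-", net[-1], net.num_addresses)

print(is_rfc1918("192.168.1.1"))    # True
print(is_rfc1918("74.125.228.17"))  # False
```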

And yes, two machines from two different sub-networks can have the same private IP; a collision of private IPs is allowed as long as the machines belong to different networks.
Moreover, the structure of the private IP address hides two major pieces of information: an identifier for your sub-network and an identifier for your machine. These can easily be unmasked by reading another sequence called the subnet mask. For example, suppose the private IP of my wireless card's interface (wlan0) is 192.168.1.1 and my subnet mask is 255.255.255.0. We simply use an alignment reading technique to figure it out.
We stack both sequences and split them where the 0 side of the subnet mask begins:
    IP address:    192.168.1   | 1
    Subnet mask:   255.255.255 | 0
The left part (192.168.1) is the network identifier and the right part (1) is the identifier of the device on the network. So now you can check yours by typing ifconfig on Linux or ipconfig on Windows in the command line.
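
The same split can be done in a few lines of Python with the standard ipaddress module. This is only a sketch using the example values from the text; split_address is a made-up name.

```python
import ipaddress


def split_address(ip: str, mask: str) -> tuple[str, int]:
    """Return (network address, host number) for an IPv4 address and subnet mask."""
    iface = ipaddress.ip_interface(f"{ip}/{mask}")
    network_id = str(iface.network.network_address)
    # Keep only the bits that the mask leaves at 0: that is the device part.
    host_id = int(iface.ip) & int(iface.network.hostmask)
    return network_id, host_id


print(split_address("192.168.1.1", "255.255.255.0"))
# ('192.168.1.0', 1)
```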

Another use of the subnet mask is in network design. For instance, let's say that, after buying your router, you connect your laptop, type ifconfig and notice that your private IP belongs to the 192.168.1.0 network. Then you say to yourself: "I want to have a network/sub-network with 254 hosts". Using that number, we can determine the subnet mask for your future network, and thus the split shown above.
Remember (from Table 1) that an IP address is nothing more than a sequence of 4 bytes, that is 4 sequences of  8 bits.
 _ _ _ _ _ _ _ _    _ _ _ _ _ _ _ _    _ _ _ _ _ _ _ _    _ _ _ _ _ _ _ _

There is a formula to use, depending on your starting point: the desired number D of devices OR the desired number N of subnets/networks.
D = 2^n - 2,  N = 2^n
 where n is the number of bits to count off before splitting.
 For D, we count from the right, and for N, we count from the left. The 2 subtracted from D accounts for the two addresses that cannot be given to devices: the network address and the broadcast address.

So for the network we want to build, it will be: 2^n - 2 = 254 => 2^n = 256 => n = 8
Hence our subnet mask will be:

  11111111    11111111    11111111    00000000
     255         255         255          0
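
Here is the same calculation as a small Python sketch. It assumes the host-count formula is read as D = 2^n - 2 (the reading that gives n = 8 for 254 hosts) and picks the smallest n that fits; host_bits_for and mask_for are names invented for the example.

```python
import ipaddress


def host_bits_for(devices: int) -> int:
    """Smallest n such that 2**n - 2 usable addresses cover `devices` hosts."""
    n = 2  # with fewer than 2 host bits, no usable address is left
    while 2 ** n - 2 < devices:
        n += 1
    return n


def mask_for(devices: int) -> str:
    """Dotted-decimal subnet mask that leaves enough host bits for `devices`."""
    n = host_bits_for(devices)
    return str(ipaddress.ip_network(f"0.0.0.0/{32 - n}").netmask)


print(host_bits_for(254), mask_for(254))  # 8 255.255.255.0
print(host_bits_for(255), mask_for(255))  # 9 255.255.254.0
```

Notice that asking for just one more host (255) already costs an extra bit, which doubles the size of the sub-network.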

Knowing that our subnet mask will be 255.255.255.0, we can now use this information, together with the targeted number of hosts, to allocate private IP addresses and their corresponding lease times to the network's devices via DHCP. This process is the manual allocation of IP addresses. Despite its effectiveness, this method is only suitable for company-size networks (for example, when allocating an IP address to a company's FTP server). For smaller cases like home networks, the IP allocation is usually done automatically by the router.

And because each sub-network holds a private conversation, and the router is the one in charge of converting your inside (private) IP into a public one, X, the inside interface of the router must have a private IP, while its outside interface has the public IP address X.
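
To picture this conversion, here is a toy Python model of a router's translation table. Everything in it is an assumption made for illustration: the ToyNat class, the public address 203.0.113.5 (taken from a range reserved for documentation) and the use of port numbers to tell the inside hosts apart, which is how many home routers work but is not described above.

```python
class ToyNat:
    """Toy model of a router translating private addresses to one public IP."""

    def __init__(self, public_ip: str, first_port: int = 40000):
        self.public_ip = public_ip
        self.next_port = first_port
        self.outbound = {}  # (private_ip, private_port) -> public_port
        self.inbound = {}   # public_port -> (private_ip, private_port)

    def translate_out(self, private_ip: str, private_port: int) -> tuple[str, int]:
        """Map an inside (ip, port) pair to the public IP and a public port."""
        key = (private_ip, private_port)
        if key not in self.outbound:
            self.outbound[key] = self.next_port
            self.inbound[self.next_port] = key
            self.next_port += 1
        return self.public_ip, self.outbound[key]

    def translate_in(self, public_port: int):
        """Find which inside host a reply on `public_port` belongs to, if any."""
        return self.inbound.get(public_port)


nat = ToyNat("203.0.113.5")
print(nat.translate_out("192.168.1.10", 51000))  # ('203.0.113.5', 40000)
print(nat.translate_out("192.168.1.11", 51000))  # ('203.0.113.5', 40001)
print(nat.translate_in(40001))                   # ('192.168.1.11', 51000)
```

Both inside hosts leave the network with the same public IP; only the table kept by the router lets the replies find their way back.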

2) Subsequent themes of this series
Now, you're ready to move forward. Please note that the implementations will be entirely done in a Linux environment.
- A1: Setting up an all-level (from the modem to the DMZ via the router) network (virtual and physical)
- A2: Designing and implementing firewalls and Quality-of-Service for your network
- A3: Configuring and securing the servers for your website - Case study: LAMP
(If you are a web developer)

3) Advanced work
A similar sequential approach can be used depending on the service provided by the network you are developing. In our case, it's a hosting network. But you can have a network configuration dedicated to supplying cloud services (e.g. IaaS platforms like Amazon EC2), storage services (e.g. Dropbox), distributed delivery services (e.g. CDNs), etc.

4) References
1- Designing and Implementing firewalls and QoS with Linux using netfilter, iproute2, NAT and l7-filter, 2006, Lucian Gheorghe
2- http://www.inetdaemon.com/tutorials/basic_concepts/network_models/osi_model/OSI_model_real_world_example.shtml
3- http://www.tcpipguide.com/free/t_HTTPRequestMessageFormat.htm
4- http://www.tcpipguide.com/free/t_HTTPResponseMessageFormat.htm#Figure_318
5- http://en.wikipedia.org/wiki/LAMP_%28software_bundle%29
6- http://www.cyberciti.biz/faq/what-is-a-routing-table/
7- http://en.wikipedia.org/wiki/Private_network
8- http://www.tarunz.org/~vassilii/TAU/protocols/dhcp/ipaddr.htm

Thanks for your time and don't forget to comment and share this article, add me on G+ and subscribe to the blog's RSS feeds
