A simple explanation of what happens when you look up a website.

What happens when you type a website address into your browser? The browser needs to talk to many different parts of the internet to get where it is going. It needs to find the correct place, talk using the correct language, pass through security, go through traffic control, and finally talk to the different parts of the website itself.

The first real step in bringing up a webpage is the DNS lookup. DNS, the Domain Name System, translates a domain name into its associated IP address. The browser first checks its own cache to see if it already knows the address; the cache helps most with frequently visited sites like google.com or youtube.com. If the domain name is not in the browser cache, the browser asks the OS, for example by calling gethostbyname. This function checks whether the local machine already knows the given domain, which includes looking in the hosts file. If the name is not found locally, the OS sends the query to a DNS resolver, usually the one run by the ISP, which checks its own cache. If the name is not in that cache either, the resolver starts at the root DNS servers. The root servers look at the ending of the name, such as .com, and return the location of the .com servers. The .com servers in turn point to the name servers responsible for the domain itself, and those servers finally return the IP address of the website.
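The lookup order described above can be sketched as a chain of tables that are tried one after another. This is only a toy model, not how a real resolver is implemented: every table, server name (tld-com, ns-example) and IP address below is made up for illustration, and the addresses come from ranges reserved for documentation.

```python
# Toy model of the DNS lookup order. All data here is hypothetical.
BROWSER_CACHE = {"google.com": "192.0.2.10"}
HOSTS_FILE = {"localhost": "127.0.0.1"}
RESOLVER_CACHE = {"youtube.com": "192.0.2.20"}

# Made-up delegation data: root -> TLD servers -> authoritative servers.
ROOT_SERVERS = {"com": "tld-com"}
TLD_SERVERS = {"tld-com": {"example.com": "ns-example"}}
AUTHORITATIVE = {"ns-example": {"example.com": "203.0.113.5"}}


def resolve_from_root(name):
    """Walk root -> TLD -> authoritative server, returning an IP or None."""
    tld = name.rsplit(".", 1)[-1]          # "example.com" -> "com"
    tld_server = ROOT_SERVERS.get(tld)     # root says where the TLD servers are
    if tld_server is None:
        return None
    auth_server = TLD_SERVERS[tld_server].get(name)  # TLD points at the domain's servers
    if auth_server is None:
        return None
    return AUTHORITATIVE[auth_server].get(name)      # authoritative server knows the IP


def lookup(name):
    """Return (ip, source) from the first layer that knows the name."""
    layers = [
        ("browser cache", BROWSER_CACHE),
        ("hosts file", HOSTS_FILE),
        ("resolver cache", RESOLVER_CACHE),
    ]
    for source, table in layers:
        if name in table:
            return table[name], source
    ip = resolve_from_root(name)
    if ip is not None:
        RESOLVER_CACHE[name] = ip  # remember the answer for next time
        return ip, "root lookup"
    return None, "not found"
```

Calling lookup("example.com") twice shows why caching matters: the first call walks all the way from the root servers, while the second is answered from the resolver cache.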

Once the IP address is resolved, the web browser talks to the server using TCP/IP. The four layers of TCP/IP are the application layer, the transport layer, the network layer (also called the internet layer), and the physical layer (more often called the link or network access layer). The application layer contains protocols such as HTTP (the basis of a website), FTP (File Transfer Protocol), POP3 (Post Office Protocol 3), SMTP (Simple Mail Transfer Protocol), and SNMP (Simple Network Management Protocol). The transport layer maintains communication between the client and the server, with TCP making sure data arrives complete and in order. The network layer addresses packets and routes them between networks using IP. The physical layer moves the data across the local network link, such as Ethernet or Wi-Fi.
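One way to picture the four layers is that each one wraps the data from the layer above in its own header before handing it down. The sketch below uses readable text labels in place of real binary headers, and the port, IP address and hardware address are placeholder values chosen for illustration.

```python
def encapsulate(http_request: str, dst_port: int, dst_ip: str, dst_mac: str) -> str:
    """Wrap an application-layer message in one header per lower layer."""
    segment = f"TCP(port={dst_port})|{http_request}"   # transport layer
    packet = f"IP(dst={dst_ip})|{segment}"             # network layer
    frame = f"FRAME(dst={dst_mac})|{packet}"           # physical/link layer
    return frame


def decapsulate(frame: str) -> str:
    """Strip the headers in reverse order, as the receiving machine would."""
    _, packet = frame.split("|", 1)         # remove the frame header
    _, segment = packet.split("|", 1)       # remove the IP header
    _, http_request = segment.split("|", 1) # remove the TCP header
    return http_request
```

The receiver undoes the wrapping from the outside in, so the web server at the far end sees only the original HTTP request.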

Once the IP address is known and the client attempts to use TCP/IP to communicate with the server, the traffic must pass through a firewall. Hopefully both the client and the server have their own firewalls. A firewall filters out unwanted IP addresses and bad packets of information that may contain a virus or other malicious software. There are many types of firewalls that can keep us safe; the simplest ones apply a list of rules that lets approved traffic through and keeps everything else out.
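A very simple rule-based filter of the kind described above might look like the following sketch. The blocked network and the list of allowed ports are assumptions made up for this example, and real firewalls inspect far more than a source address and a port.

```python
import ipaddress

# Hypothetical rules: one blocked address range and two open ports.
BLOCKED_NETWORKS = [ipaddress.ip_network("198.51.100.0/24")]
ALLOWED_PORTS = {80, 443}  # HTTP and HTTPS


def allow(src_ip: str, dst_port: int) -> bool:
    """Return True if a packet from src_ip to dst_port may pass."""
    address = ipaddress.ip_address(src_ip)
    # Rule 1: drop anything coming from a blocked network.
    if any(address in network for network in BLOCKED_NETWORKS):
        return False
    # Rule 2: only let traffic through to ports we deliberately opened.
    return dst_port in ALLOWED_PORTS
```

With these rules, allow("203.0.113.7", 443) lets an ordinary HTTPS request through, while the same request from 198.51.100.9 is rejected.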

Once we are through the firewall we need to make sure we got to where we wanted to go. This is handled by HTTPS. HyperText Transfer Protocol Secure is HTTP carried over an encrypted layer, originally the Secure Sockets Layer (SSL) and today its successor, Transport Layer Security (TLS), although the certificates are still commonly called SSL certificates. This not only ensures that the server we are talking to is who it says it is, it also encrypts our data to keep it safe from thieves and other nefarious people. Getting an SSL certificate is not too hard, but it is next to impossible to get one that says you are google.com. The point is that certificates are only given to websites that can prove they control the domain they claim to be.
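As a rough illustration of the identity check, Python's standard ssl module can open a verified connection and read the certificate the server presents. The function names below are made up for this sketch, and fetch_certificate_subject needs a working network connection, so treat it as an outline rather than a tested client.

```python
import socket
import ssl


def make_context() -> ssl.SSLContext:
    """Build a client context that verifies certificates and hostnames."""
    # create_default_context() loads the system's trusted certificate
    # authorities and enables both certificate and hostname checks.
    return ssl.create_default_context()


def fetch_certificate_subject(hostname: str, port: int = 443) -> dict:
    """Connect over TLS and return the subject fields of the server's certificate."""
    context = make_context()
    with socket.create_connection((hostname, port), timeout=5) as raw_sock:
        # The handshake fails with ssl.SSLCertVerificationError if the
        # certificate is not trusted or does not match the hostname.
        with context.wrap_socket(raw_sock, server_hostname=hostname) as tls_sock:
            cert = tls_sock.getpeercert()
    # "subject" is a tuple of one-element tuples of (field, value) pairs.
    return dict(item[0] for item in cert["subject"])
```

If a server presented a certificate for a different name than the one requested, the handshake would raise an error instead of returning any data, which is the protection described above.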

Once through the firewall and after obtaining a secure connection via HTTPS/SSL, we are probably going to hit a load balancer. The load balancer distributes the web traffic across several servers according to rules configured by the system administrator. There are lots of ways to direct traffic, such as taking turns between servers or sending each request to the least busy one, and the method used depends on the servers and their capabilities.
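Two common ways of directing traffic are round robin (each server takes a turn) and least connections (the least busy server gets the next request). The classes below are a minimal sketch of both; the class names and the server names web-1, web-2 and web-3 are placeholders, and real load balancers also handle health checks, timeouts and much more.

```python
import itertools


class RoundRobinBalancer:
    """Hand out servers in a fixed rotation."""

    def __init__(self, servers):
        self._cycle = itertools.cycle(servers)

    def pick(self):
        return next(self._cycle)


class LeastConnectionsBalancer:
    """Send each request to the server with the fewest active connections."""

    def __init__(self, servers):
        self.active = {server: 0 for server in servers}

    def pick(self):
        # min() returns the first server with the lowest count on ties.
        server = min(self.active, key=self.active.get)
        self.active[server] += 1
        return server

    def release(self, server):
        """Call when a request finishes so the server looks less busy."""
        self.active[server] -= 1
```

Round robin suits servers of equal capacity handling similar requests, while least connections adapts better when some requests take much longer than others.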

At this point in the journey we are going to hit a web server. Web servers serve up static content, such as a plain webpage, and can be run with nginx, Apache2, or any number of other web server software packages. This is where the HTTP and HTTPS responses come from; in short, the website. The main thing to note is that a web server on its own does not typically generate dynamic content. For that you need an application server.
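Serving static content boils down to mapping a requested path to a stored file and sending it back unchanged. The sketch below keeps its "files" in a dictionary for simplicity, which is an assumption of this example; nginx and Apache2 read them from disk and do considerably more.

```python
# Hypothetical in-memory document root; a real web server reads from disk.
STATIC_FILES = {
    "/index.html": b"<h1>Hello</h1>",
}


def build_response(path: str) -> bytes:
    """Return a complete HTTP/1.1 response for a static file request."""
    if path == "/":
        path = "/index.html"  # serve the index page for the site root
    body = STATIC_FILES.get(path)
    if body is None:
        status, body = "404 Not Found", b"Not Found"
    else:
        status = "200 OK"
    head = (
        f"HTTP/1.1 {status}\r\n"
        f"Content-Length: {len(body)}\r\n"
        "\r\n"  # blank line separates headers from the body
    )
    return head.encode("ascii") + body
```

Every visitor asking for the same path gets exactly the same bytes, which is what makes the content static.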

If the website requires some form of significant calculation or interaction with the user, we are probably going to encounter an application server. Application servers are where all of the business-related calculations take place. However, they are not where the data is stored; for that you need a database.
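In Python, application code is often written as a WSGI callable that an application server runs for each request. The sketch below computes an order total from a query parameter; the unit price, the quantity parameter and the pricing rule are invented for illustration.

```python
from urllib.parse import parse_qs

UNIT_PRICE_CENTS = 250  # hypothetical price of one item


def application(environ, start_response):
    """WSGI entry point: compute a response from the user's input."""
    query = parse_qs(environ.get("QUERY_STRING", ""))
    quantity = int(query.get("quantity", ["1"])[0])

    # The "business logic": the output depends on what the user asked for.
    total_cents = quantity * UNIT_PRICE_CENTS
    body = f"Total: {total_cents / 100:.2f}".encode("utf-8")

    start_response("200 OK", [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ])
    return [body]
```

Unlike a static page, the response here changes with the request: a query string of quantity=3 produces a different total than quantity=1.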

The database is where everything is stored long term. It can be a JSON document database, a SQL database, or some other form of database. The database is where all the non-static information is stored for the user.
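As a small example of the SQL flavor, the sketch below uses Python's built-in sqlite3 module. The users table and the helper functions are made up for illustration, and the database lives in memory only to keep the example self-contained; for real long-term storage it would be a file on disk or a separate database server.

```python
import sqlite3


def open_database() -> sqlite3.Connection:
    """Create an in-memory database with a hypothetical users table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
    )
    return conn


def add_user(conn: sqlite3.Connection, name: str) -> int:
    """Store a user and return the id the database assigned."""
    # The ? placeholder keeps user input from being run as SQL.
    cursor = conn.execute("INSERT INTO users (name) VALUES (?)", (name,))
    conn.commit()
    return cursor.lastrowid


def get_user(conn: sqlite3.Connection, user_id: int):
    """Return the stored name, or None if no such user exists."""
    row = conn.execute(
        "SELECT name FROM users WHERE id = ?", (user_id,)
    ).fetchone()
    return row[0] if row else None
```

An application server would call functions like these whenever it needs to save or load something on behalf of a user.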

So when we type in a website we are first going to encounter DNS, which finds the IP address. We then use TCP/IP to communicate with the website. We pass through the firewall and, hopefully, use HTTPS/SSL to communicate with the load balancer and web server. If there are any significant calculations or business logic to run, the web server talks to the application server, which talks to the database. All this happens, and then we get a response from the web page with the information we requested.
