Monday, June 9, 2008

What's a Web Server? How does a Web Server work?


What's a Web Server?

A Web Server is a server whose clients are typically Web Browsers, and the communication protocol used between them is HTTP (HyperText Transfer Protocol). This is why a Web Server is also called an HTTP Server.

Before discussing how a typical Web Server works, it helps to understand what exactly HTTP is all about. If you don't know it already, you may like to go through this article.

How does a Web Server work?

As with any client-server communication, the client (i.e., the Web Browser) and the server (i.e., the HTTP/Web Server) should be able to communicate with each other in a defined way. The pre-defined set of rules that forms the basis of the communication is normally termed a protocol, and in this case the protocol is HTTP.

Irrespective of how the client or the server has been implemented, the client must be able to form valid HTTP Requests, and the server must be able to understand the HTTP Requests sent to it and form a valid HTTP Response to each of them. Both the client and the server machines should also be capable of establishing a connection to each other (in this case a reliable TCP connection) to transfer the HTTP Requests (client -> server) and the HTTP Responses (server -> client).
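
To make "valid HTTP Request" and "valid HTTP Response" a little more concrete, here is a minimal sketch of what the two messages look like on the wire, written as Java strings. The class name, the host www.example.com, and the resource /index.html are made up for illustration, and a real browser or server would send many more headers.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical holder class; the names and values here are illustrative only.
class HttpMessageExamples {

    // A minimal HTTP/1.1 request: request line, headers, then an empty line.
    static String sampleRequest() {
        return "GET /index.html HTTP/1.1\r\n"
             + "Host: www.example.com\r\n"
             + "Connection: close\r\n"
             + "\r\n";
    }

    // A minimal HTTP/1.1 response: status line, headers, empty line, body.
    static String sampleResponse(String body) {
        // Content-Length counts bytes, not characters.
        int length = body.getBytes(StandardCharsets.UTF_8).length;
        return "HTTP/1.1 200 OK\r\n"
             + "Content-Type: text/html; charset=utf-8\r\n"
             + "Content-Length: " + length + "\r\n"
             + "Connection: close\r\n"
             + "\r\n"
             + body;
    }
}
```

Every line of the header section ends with a carriage return and line feed, and an empty line separates the headers from the (optional) body.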

How these things are done depends on a variety of factors, including a very important one - the choice of programming language. Let's take Java as the programming language here. Now the question arises - how can we implement a typical Web Server using Java? Okay... let's try it. For a Web Server to work, the client needs to form and send valid HTTP Requests to it. So the Web Browser (the client in this case) should also be discussed to see how it can have those capabilities. Let's assume that the browser is also implemented in Java. Web Browsers are more commonly developed in C++, and Sockets are used there as well to implement the above-mentioned capabilities. Sockets in C++ are quite similar to those in Java (at least conceptually); Java just makes the overall implementation relatively simpler.

HTTP Request building and transfer by the client (i.e., by a Web Browser)

In this case, the Browser will use the java.net.Socket class to establish itself as one endpoint of a reliable TCP connection (the other endpoint will be another Socket instance returned by the accept() method of the ServerSocket class at the server). To create a socket and communicate with a machine, we need to know two things - the IP Address and the Port Number. In our case, the host name part of the URL helps us get the IP Address of the server, and for the HTTP protocol the default port is 80. A DNS server resolves the host name in the URL entered into the address bar of a typical Web Browser to an IP Address. You can enter the IP Address directly as well, but IP Addresses are much harder to remember and are not self-explanatory the way host names are.
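
As a rough sketch of that first step, the host name and port can be pulled out of a URL with java.net.URI. The class and method names below are hypothetical, and falling back to 80 assumes a plain http:// URL.

```java
import java.net.URI;

// Hypothetical helper; not part of any browser's real code.
class UrlParts {

    // Returns the host name part of a URL such as "http://www.google.com/search".
    static String hostOf(String url) {
        return URI.create(url).getHost();
    }

    // Returns the explicit port, or 80 (the HTTP default) when none is given.
    static int portOf(String url) {
        int port = URI.create(url).getPort(); // -1 when the URL names no port
        return port == -1 ? 80 : port;
    }
}
```

Passing the host name to InetAddress.getByName(...) would then perform the actual DNS lookup; that call is left out of the sketch because it needs network access.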

So, if we type www.google.com in the address bar of the web browser, then a Socket will probably be created using the constructor 'public Socket(String host, int port)' and the statement will be something similar to 'Socket socket = new Socket("www.google.com", 80);'. After the successful creation of the Socket instance, the client will be able to send and receive streams of bytes simply by using the OutputStream and InputStream associated with the Socket instance. The HTTP Request is created by examining the client machine (for the HTTP Request Headers) and the URL (for the HTTP Method and the URI of the resource being requested). Once the client has created a valid HTTP Request, it uses the TCP connection established between the Socket (at the client machine) and the ServerSocket (at the server machine... we'll discuss it next) to transfer the HTTP Request to the Web Server.
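
The client side described above might look roughly like the following sketch. SimpleHttpClient, buildGetRequest, and fetch are names invented for this example, and the request carries only the two headers needed to keep it short; a real browser sends far more.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

// Hypothetical minimal client; a sketch, not how any real browser is written.
class SimpleHttpClient {

    // Builds a minimal GET request for the given host and resource path.
    static String buildGetRequest(String host, String path) {
        return "GET " + path + " HTTP/1.1\r\n"
             + "Host: " + host + "\r\n"
             + "Connection: close\r\n"   // ask the server to close when it is done
             + "\r\n";
    }

    // Opens a TCP connection, sends the request, and returns the raw response text.
    static String fetch(String host, int port, String path) throws IOException {
        try (Socket socket = new Socket(host, port)) {
            OutputStream out = socket.getOutputStream();
            out.write(buildGetRequest(host, path).getBytes(StandardCharsets.US_ASCII));
            out.flush();

            // Read until the server closes the connection.
            InputStream in = socket.getInputStream();
            ByteArrayOutputStream response = new ByteArrayOutputStream();
            byte[] buffer = new byte[4096];
            int n;
            while ((n = in.read(buffer)) != -1) {
                response.write(buffer, 0, n);
            }
            return response.toString("UTF-8");
        }
    }
}
```

A call such as SimpleHttpClient.fetch("www.google.com", 80, "/") would return the status line, the response headers, and the body as one string.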

HTTP Response building and transfer by the server (i.e., by a Web Server)

The Web Server needs to listen for the connection requests made by various clients, and it should also be capable of accepting HTTP Requests, understanding them, forming the corresponding HTTP Responses, and finally transferring them to the appropriate client machines. The Web Server may use a java.net.ServerSocket object to listen for and accept connections. It may use one of the constructors of this class, 'public ServerSocket(int port, int backlog, InetAddress bindAddr)'. Here port specifies the port number on which the ServerSocket object will listen; backlog specifies the requested maximum length of the queue of pending connections, beyond which further incoming connection requests (not individual HTTP Requests) are refused; and bindAddr specifies the local IP Address on which this ServerSocket object will accept connections. On a machine with several network addresses this lets the server listen on only one of them; if bindAddr is null, connections are accepted on any of the local addresses. The loopback address '127.0.0.1' is convenient for testing, but a server bound to it is reachable only from clients running on the same machine.
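
A small sketch of creating the listening socket with this constructor is shown below. ListenerFactory and its methods are hypothetical names, and the backlog of 50 is just an example value.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.ServerSocket;

// Hypothetical helper that wraps the three-argument ServerSocket constructor.
class ListenerFactory {

    // port:     port to listen on (0 lets the system pick a free port)
    // backlog:  requested maximum length of the queue of pending connections
    // bindAddr: local address to listen on; null means any local address
    static ServerSocket open(int port, int backlog, InetAddress bindAddr) throws IOException {
        return new ServerSocket(port, backlog, bindAddr);
    }

    // Convenience for local testing: only clients on this machine can connect.
    static ServerSocket openLoopbackOnly(int port) throws IOException {
        return open(port, 50, InetAddress.getLoopbackAddress());
    }
}
```

Calling ListenerFactory.open(80, 50, null) would listen on port 80 on all local addresses, which on many systems requires administrator privileges.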

Once we have a working ServerSocket instance, we can call its instance method 'public Socket accept() throws IOException' to listen for the connection requests made by the client machines to this server. This method blocks until a connection is made and then returns a new Socket object, which forms the second endpoint of the communication between the client and the server. If there is a Security Manager at the server machine, its checkAccept() method is called with socket.getInetAddress().getHostAddress() and socket.getPort() as arguments, where 'socket' is the Socket instance returned by the accept() method. The checkAccept() method does not return a value: if the operation is allowed it simply returns, and otherwise it throws a SecurityException, in which case the connection is not handed over to the server code.

If the Security Manager allows the operation, the Socket instance returned by accept() is used to obtain the InputStream and the OutputStream associated with it, which the Web Server uses to read the HTTP Request and to write the HTTP Response. It's important to understand that the InputStream of this Socket instance delivers exactly the bytes written to the OutputStream of the Socket instance at the client end; whatever is written there is transferred over the reliable TCP connection. Similarly, whatever the server writes to the OutputStream of its Socket instance arrives at the InputStream of the Socket instance at the client end. We don't need to do anything more than create the Socket instance at the client end with appropriate parameters, create the ServerSocket instance at the server end, call the accept() method on this ServerSocket instance, and use the Socket instance returned by accept() to read the HTTP Request and write the HTTP Response. Everything else is taken care of by the library classes of the java.net package. So simple, isn't it? That's the beauty of Java!
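
The server-side steps just described (accept a connection, read the request from the InputStream, write the response to the OutputStream) can be sketched as follows. OneShotServer and serveOnce are invented names, and the sketch ignores the request headers and any request body, which a real server would have to parse.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

// Hypothetical single-connection server; a sketch under the assumptions above.
class OneShotServer {

    // Accepts one connection, answers it, and returns the request line it read.
    static String serveOnce(ServerSocket serverSocket) throws IOException {
        try (Socket socket = serverSocket.accept()) {      // blocks until a client connects
            BufferedReader in = new BufferedReader(
                new InputStreamReader(socket.getInputStream(), StandardCharsets.US_ASCII));

            String requestLine = in.readLine();             // e.g. "GET / HTTP/1.1"
            // Skip the request headers; an empty line marks their end.
            String header;
            while ((header = in.readLine()) != null && !header.isEmpty()) {
                // A real server would parse these headers.
            }

            byte[] body = ("You asked for: " + requestLine).getBytes(StandardCharsets.UTF_8);
            String head = "HTTP/1.1 200 OK\r\n"
                        + "Content-Type: text/plain; charset=utf-8\r\n"
                        + "Content-Length: " + body.length + "\r\n"
                        + "Connection: close\r\n"
                        + "\r\n";
            OutputStream out = socket.getOutputStream();
            out.write(head.getBytes(StandardCharsets.US_ASCII));
            out.write(body);
            out.flush();
            return requestLine;
        }
    }
}
```

The try-with-resources block closes the Socket after the response is written, which is what tells a client sending 'Connection: close' that the response is complete.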

This Web Server is normally a multithreaded one: the main thread calls accept() on the ServerSocket object, and each Socket instance it returns is handed over to a newly spawned Thread (or to a thread taken from a pool). The responsibilities of reading the request, processing it, forming the response, and sending the response back to the corresponding client are delegated to that Thread, whereas the main thread keeps calling accept() in a loop (till the server shuts down :-)) to accept further incoming connections.
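
A thread-per-connection accept loop along these lines might be sketched as below. ThreadPerConnectionServer, acceptLoop, and handle are hypothetical names; the worker ignores the request contents and always sends the same small response, and a production server would more likely use a bounded thread pool.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

// Hypothetical multithreaded server; a sketch, not a production design.
class ThreadPerConnectionServer {

    // Runs the accept loop on the calling thread until the ServerSocket is closed.
    static void acceptLoop(ServerSocket serverSocket) {
        while (!serverSocket.isClosed()) {
            try {
                Socket socket = serverSocket.accept();      // the main thread blocks here
                new Thread(() -> handle(socket)).start();   // hand the socket to a worker
            } catch (IOException e) {
                // accept() fails once the ServerSocket is closed; the loop
                // condition then ends the loop.
            }
        }
    }

    // Worker: reads the request headers, sends a fixed response, closes the socket.
    static void handle(Socket socket) {
        try (Socket s = socket) {
            BufferedReader in = new BufferedReader(
                new InputStreamReader(s.getInputStream(), StandardCharsets.US_ASCII));
            String line;
            while ((line = in.readLine()) != null && !line.isEmpty()) {
                // The request line and headers are ignored in this sketch.
            }
            String body = "Handled by " + Thread.currentThread().getName();
            String response = "HTTP/1.1 200 OK\r\n"
                            + "Content-Type: text/plain\r\n"
                            + "Content-Length: " + body.length() + "\r\n"
                            + "Connection: close\r\n"
                            + "\r\n"
                            + body;
            OutputStream out = s.getOutputStream();
            out.write(response.getBytes(StandardCharsets.US_ASCII));
            out.flush();
        } catch (IOException e) {
            // The client went away; nothing more to do for this connection.
        }
    }
}
```

Because each connection has its own Thread, a slow client delays only its own worker while the main thread goes straight back to accept().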





7 comments:

Anonymous said...

Wonderful article! Web Server -> HTTP Req/Res formats and other details -> Need for Session Tracking - these three inter-linked articles explain the entire concept and its working so wonderfully. Possibly the best read on this topic so far.

Orijit Banerjee said...

One word... Awesome. Very nicely explained. Thanx

sukhvinder said...

really a nice article.

Anonymous said...

Execellent article!!

Anonymous said...

This blog was... how do I say it? Relevant!! Finally I've found something that helped me. Kudos!

ChantellCeleste said...

Thank you, great article!

Unknown said...

Well written article even for a non-techie like me to comprehend....wish you had included a request and response example.