HTTP & Session Tracking





This article explains how to implement session tracking using two of the simplest and oldest methods available to programmers.

I feel that in order to appreciate the beauty of the new technologies that exist today, it is often necessary to understand what used to be done before those technologies came into being.

The techniques presented in this article do not rely on the newer technologies available for session tracking, but on some old, tried-and-tested ways which are extremely popular even today.

After reading this article you should be able to implement session tracking using any language, since you will understand the concepts of session tracking rather than some language-dependent implementation of it.

Various languages provide higher-level APIs for implementing session tracking. There is a detailed session tracking API available in Java which enables many programmers to get session tracking implemented quickly and easily.

But that is not what this article talks about. It focuses on understanding the basic techniques so that you can use them with any language.

To understand this article you need three things:
1. Familiarity with any server-side technology such as JSP, ASP, Java servlets, etc.
2. You need to know HTML very well.
3. You need to know how to access the contents of an HTML form from within a programming language such as JSP, ASP, etc.


What is session tracking? 




Session tracking (for those who haven’t heard of it) is a concept which allows you to maintain a relation between two successive requests made to a server on the Internet.

Whenever a user browses any website, he uses HTTP, the underlying protocol, for all the data transfers taking place. This of course is not important to the user. But it is for you as a programmer.

“HTTP is a stateless protocol.”








HTTP characteristics:  




Stateless – Each transaction between the client and server is independent; no state is carried over from a previous transaction or condition.

Request & Response – Uses requests from the client to the server and responses from the server to the client for sending and receiving data.





HTTP 1.1 is defined by RFC 2616.

HTTP messages consist of requests from client to server and responses from server to client:

HTTP-message = Request | Response ; HTTP/1.1 messages

How the web works: HTTP  




The browser contacts the server and requests that the server deliver the document to it. The server then gives a response which contains the document, and the browser happily displays this to the user.

The server also tells the browser what kind of document this is (HTML file, PDF file, ZIP file, etc.) and the browser then shows the document with the program it was configured to use for this kind of document.

The browser will display HTML documents directly, and if there are references to images, Java applets, sound clips, etc. in it and the browser has been set up to display these, it will request these also from the servers on which they reside. (Usually the same server as the document, but not always.) It’s worth noting that these will be separate requests, and add additional load to the server and network. When the user follows another link the whole sequence starts anew.

     These requests and responses are issued in a special language called HTTP, which is short for HyperText Transfer Protocol. 




Other common protocols that work in similar ways are FTP and Gopher, but there are also protocols that work in completely different ways.

It’s worth noting that HTTP only defines what the browser and web server say to each other, not how they communicate.

The actual work of moving bits and bytes back and forth across the network is done by TCP and IP, which are also used by FTP and Gopher (as well as most other internet protocols).

When you continue, note that any software program that does the same as a web browser (i.e., retrieves documents from servers) is called a client in network terminology and a user agent in web terminology. Also note that the server is properly the server program, not the computer on which the server runs as an application program. (That computer is sometimes called the server machine.)


What happens when I follow a link?  ( MOST IMPORTANT : WHOLE PROCEDURE )




Step 1: Parsing the URL 




The first thing the browser has to do is to look at the URL of the new document to find out how to get hold of it.

Most URLs have this basic form: “protocol://server/request-URI”.

The protocol part describes how to tell the server which document you want and how to retrieve it. The server part tells the browser which server to contact, and the request-URI is the name used by the web server to identify the document. (I use the term request-URI since it’s the one used by the HTTP standard, and I can’t think of anything else that is general enough not to be misleading.)
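This split can be seen directly with Python’s standard library; the host name below is only an illustration, not a server from this article:

```python
from urllib.parse import urlsplit

# Split a URL into the three parts named above.
parts = urlsplit("http://www.example.com/Talks/General.html")
print(parts.scheme)   # protocol part: "http"
print(parts.netloc)   # server part: "www.example.com"
print(parts.path)     # request-URI path: "/Talks/General.html"
```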

Step 2: Sending the request 




Usually, the protocol is “http”. To retrieve a document via HTTP the browser transmits the following request to the server: “GET /request-URI HTTP/version”, where version tells the server which HTTP version is used.

(Usually, the browser includes some more information as well. The details are covered later.)

One important point here is that this request string is all the server ever sees. So the server doesn’t care if the request came from a browser, a link checker, a validator, a search engine robot, or if you typed it in manually. It just performs the request and returns the result.


Step 3: The server response 




When the server receives the HTTP request it locates the appropriate document and returns it. However, an HTTP response is required to have a particular form. It must look like this: ( IMPORTANT – THIS IS THE RESPONSE FROM THE SERVER )

HTTP/version status-code reason-phrase
Field1: Value1
Field2: Value2

…Document content here…

The first line shows the HTTP version used, followed by a three-digit number (the HTTP status code) and a reason phrase meant for humans.

Usually the code is 200 (which basically means that all is well) and the phrase “OK”. 

The first line is followed by some lines called the header, which contains information about the document. The header ends with a blank line, followed by the document content. This is a typical header:

HTTP/1.0 200 OK
Server: Netscape-Communications/1.1
Date: Tuesday, 25-Nov-97 01:22:04 GMT
Last-modified: Thursday, 20-Nov-97 10:44:53 GMT
Content-length: 6372
Content-type: text/html

…followed by document content…

We see from the first line that the request was successful. The second line is optional and tells us that the server runs the Netscape Communications web server, version 1.1.





We then get what the server thinks is the current date and when the document was modified last, followed by the size of the document in bytes and the most important field: “Content-type”.

The content-type field is used by the browser to tell which format the document it receives is in. HTML is identified with “text/html”, ordinary text with “text/plain”, a GIF is “image/gif”, and so on. The advantage of this is that the URL can have any ending and the browser will still get it right.

An important concept here is that to the browser, the server works as a black box. I.e., the browser requests a specific document and the document is either returned or an error message is returned.

How the server produces the document remains unknown to the browser. This means that the server can read it from a file, run a program that generates it, compile it by parsing some kind of command file or (very unlikely, but in principle possible) have it dictated by the server administrator via speech recognition software.

This gives the server administrator great freedom to experiment with different kinds of services, as the users don’t care (or even know) how pages are produced.

What the server does




When the server is set up it is usually configured to use a directory somewhere on disk as its root directory, and to have a default file name (say “index.html”) for each directory.

This means that if you ask the server for the file “/” (as in “http://www.domain.tld/”) you’ll get the file index.html in the server root directory.

Usually, asking for “/foo/bar.html” will give you the bar.html file from the foo directory directly beneath the server root.

Usually, that is. The server can be set up to map “/foo/” into some other directory elsewhere on disk, or even to use server-side programs to answer all requests that ask for that directory. The server does not even have to map requests onto a directory structure at all, but can use some other scheme.
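The basic directory mapping described above can be sketched in a few lines. The root directory (“/var/www”) and default file name (“index.html”) are illustrative assumptions, not any particular server’s configuration:

```python
import posixpath

SERVER_ROOT = "/var/www"    # assumed root directory
DEFAULT_FILE = "index.html" # assumed default file name

def resolve(request_uri):
    # Requests for a directory ("/" at the end) get the default file
    if request_uri.endswith("/"):
        request_uri += DEFAULT_FILE
    # normpath collapses any "../" so clients cannot escape the root
    path = posixpath.normpath(request_uri.lstrip("/"))
    if path.startswith(".."):
        raise ValueError("request escapes server root")
    return posixpath.join(SERVER_ROOT, path)

print(resolve("/"))             # /var/www/index.html
print(resolve("/foo/bar.html")) # /var/www/foo/bar.html
```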

HTTP versions

So far there are three versions of HTTP.

1) The first one was HTTP/0.9, which was truly primitive and never really specified in any standard.

2) This was corrected by HTTP/1.0, which was issued as a standard in RFC 1945. HTTP/1.0 is the version of HTTP that is in common use today (usually with some 1.1 extensions), while HTTP/0.9 is rarely, if ever, used by browsers. (Some simpler HTTP clients still use it since they don’t need the later extensions.)

3) RFC 2068 (later revised by RFC 2616) describes HTTP/1.1, which extends and improves HTTP/1.0 in a number of areas. Very few browsers support it (MSIE 4.0 is the only one known to the author), but servers are beginning to do so.

The major differences are some extensions in HTTP/1.1 for authoring documents online via HTTP, and a feature that lets clients request that the connection be kept open after a request so that it does not have to be reestablished for the next request. This can save some waiting and server load if several requests have to be issued quickly.

This document describes HTTP/1.0, except for some sections that cover the HTTP/1.1 extensions. Those will be explicitly labeled.


The request sent by the client

The shape of a request :-

Basically, all requests look like this:

METH REQUEST-URI VER
[fieldname1]: [field-value1]
[fieldname2]: [field-value2]

[request body, if any]

The METH (for request method) gives the request method used, of which there are several, and which all do different things. The above example used GET, but below some more are explained. The REQUEST-URI is the identifier of the document on the server, such as “/index.html” or whatever. VER is the HTTP version, like in the response. The header fields are also the same as in the server response. The request body is only used for requests that transfer data to the server, such as POST and PUT. (Described below.)
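A minimal parser for this shape makes the structure concrete: first line “METH REQUEST-URI VER”, then “name: value” header lines, a blank line, and an optional body. This is only a sketch, not a complete HTTP parser:

```python
def parse_request(raw):
    # Headers and body are separated by a blank line
    head, _, body = raw.partition("\r\n\r\n")
    lines = head.split("\r\n")
    # First line: method, request-URI and HTTP version
    method, request_uri, version = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip()] = value.strip()
    return method, request_uri, version, headers, body

req = "GET /index.html HTTP/1.0\r\nUser-Agent: Example/1.0\r\n\r\n"
print(parse_request(req))
```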

GETting a document 




There are several request types, with the most common one being GET. A GET request basically means “send me this document” and looks like this:

“GET document_path HTTP/version”.

For a server’s front page the document_path would be “/”, and for a page deeper in the site it might be “/Talks/General.html”.

However, this first line is not the only thing a user agent (UA) usually sends, although it’s the only thing that’s really necessary. The UA can include a number of header fields in the request to give the server more information. These fields have the form “fieldname: value” and are all put on separate lines after the first request line.

Some of the header fields that can be used with GET are:  




User-Agent
This is a string identifying the user agent. An English version of Netscape 4.03 running under Windows NT would send “Mozilla/4.03 [en] (WinNT; I ;Nav)”.

Referer
The referer field (yes, it’s misspelled in the standard) tells the server where the user came from, which is very useful for logging and keeping track of who links to one’s pages.

If-Modified-Since
If a browser already has a version of the document in its cache it can include this field and set it to the time it retrieved that version. The server can then check whether the document has been modified since the browser last downloaded it, and send it again if necessary. The whole point is of course that if the document hasn’t changed, then the server can just say so and save some waiting and network traffic.
(That is, the browser asks the server whether any change has been made to the document; if yes, it fetches the document again and displays it, and if not, it displays the copy from the browser cache.)
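Such a conditional GET is just a normal request with one extra header carrying the cached copy’s retrieval time in HTTP date format. The path and timestamp below are illustrative:

```python
from email.utils import formatdate

# Unix time at which the cached copy was retrieved (illustrative value)
cached_at = 880021493

# Build the request; formatdate(..., usegmt=True) produces the
# "Thu, 20 Nov 1997 ... GMT" style date HTTP expects
request = (
    "GET /index.html HTTP/1.0\r\n"
    "If-Modified-Since: " + formatdate(cached_at, usegmt=True) + "\r\n"
    "\r\n"
)
print(request)
```

If the document is unchanged the server can answer with a short 304 response instead of the full document (see the status codes below).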


From
This header field is a spammer’s dream come true: it is supposed to contain the email address of whoever controls the user agent. Very few, if any, browsers use it, partly because of the threat from spammers. However, web robots should use it, so that webmasters can contact the people responsible for the robot should it misbehave.

Authorization
This field can hold a username and password if the document in question requires authorization to be accessed.

To put all these pieces together: this is a typical GET request, as issued by my browser (Opera):

GET / HTTP/1.0
User-Agent: Mozilla/3.0 (compatible; Opera/3.0; Windows 95/NT4)
Accept: */*

HEAD: checking documents 




One may sometimes want to see the headers returned by the server for a particular document, without actually downloading the document itself. This is exactly what the HEAD request method provides.

HEAD looks and works exactly like GET, only with the difference that the server only returns the headers and not the document content.

This is very useful for programs like link checkers, for people who want to see the response headers (to see what server is used or to verify that they are correct), and for many other kinds of uses.
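What a link checker would send differs from GET only in the method name. This sketch just builds the raw request bytes for a given URL (the Host header shown is optional in HTTP/1.0 but widely sent):

```python
from urllib.parse import urlsplit

def head_request(url):
    # Identical to a GET request except for the method,
    # so only the headers come back, not the document.
    parts = urlsplit(url)
    path = parts.path or "/"
    return (
        "HEAD " + path + " HTTP/1.0\r\n"
        "Host: " + parts.netloc + "\r\n"
        "\r\n"
    )

print(head_request("http://www.example.com/Talks/General.html"))
```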

Playing web browser

You can actually play web browser yourself and write HTTP requests directly to web servers. This can be done by telnetting to port 80, writing the request and hitting enter twice, like this:

larsga – tyrfing>telnet 80
Connected to
Escape character is ‘^]’.
HEAD / HTTP/1.0

HTTP/1.1 200 OK
Date: Tue, 17 Feb 1998 22:24:53 GMT
Server: Apache/1.2.5
Last-Modified: Wed, 11 Feb 1998 18:22:22 GMT
ETag: “2c3136-23c1-34e1ec5e”
Content-Length: 9153
Accept-Ranges: bytes
Connection: close
Content-Type: text/html; charset=ISO-8859-1

Connection closed by foreign host.
larsga – tyrfing>

However, this works best under Unix, as the Windows telnet clients I’ve used are not very suitable for this (and are hard to set up so that they work well for it).

The response returned by the server


What the server returns consists of a line with the status code, a list of header fields, a blank line and then the requested document, if it is returned at all. Sort of like this:

HTTP/1.0 code text
Field1: Value1
Field2: Value2

…Document content here…

The status codes

The status codes are all three-digit numbers that are grouped by the first digit into 5 groups. The reason phrases given with the status codes below are just suggestions; servers can return any reason phrase they wish.

1xx: Informational
No 1xx status codes are defined, and they are reserved for experimental purposes only.

2xx: Successful
Means that the request was processed successfully.

200 OK
Means that the server did whatever the client wanted it to, and all is well.
The rest of the 2xx status codes are mainly meant for script processing and are not often used.


3xx: Redirection
Means that the resource is somewhere else and that the client should try again at a new address.

301 Moved permanently
The resource the client requested is somewhere else, and the client should go there to get it. Any links or other references to this resource

should be updated.
302 Moved temporarily
This means the same as the 301 response, but links should now not be updated, since the resource may be moved again in the future.
304 Not modified
This response can be returned if the client used the if-modified-since header field and the resource has not been modified since the given

time. Simply means that the cached version should be displayed for the user.


4xx: Client error
Means that the client screwed up somehow, usually by asking for something it should not have asked for.

400: Bad request
The request sent by the client didn’t have the correct syntax.
401: Unauthorized
Means that the client is not allowed to access the resource. This may change if the client retries with an authorization header.
403: Forbidden
The client is not allowed to access the resource and authorization will not help.
404: Not found
Seen this one before? 🙂 It means that the server has not heard of the resource and has no further clues as to what the client should do about it. In other words: a dead link.


5xx: Server error
This means that the server screwed up or that it couldn’t do as the client requested.

500: Internal server error
Something went wrong inside the server.
501: Not implemented
The request method is not supported by the server.
503: Service unavailable
This sometimes happens if the server is too heavily loaded and cannot service the request. Usually, the solution is for the client to wait a while and try again.
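The grouping by first digit can be written down directly. This is just a sketch of the convention above, not any server’s actual logic:

```python
# First digit of the status code determines the group
GROUPS = {
    "1": "Informational",
    "2": "Successful",
    "3": "Redirection",
    "4": "Client error",
    "5": "Server error",
}

def classify(code):
    return GROUPS[str(code)[0]]

print(classify(200))  # Successful
print(classify(404))  # Client error
```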

The response header fields

These are the header fields a server can return in response to a request.

Location
This tells the user agent where the resource it requested can be found. The value is just the URL of the new resource.

Server
This tells the user agent which web server is used. Nearly all web servers return this header, although some leave it out.

Content-length
This gives the size of the resource, in bytes.

Content-type
This describes the file format of the resource.

Content-encoding
This means that the resource has been coded in some way and must be decoded before use.

Expires
This field can be set for data that are updated at a known time (for instance if they are generated by a script). It is used to prevent browsers from caching the resource beyond the given date.

Last-modified
This tells the browser when the resource was last modified. Can be useful for mirroring, update notification, etc.





Caching: agents between the server and client

1)  The browser cache

You may have noticed that when you go back to a page you’ve looked at not too long before the page loads much quicker. That’s because the browser stored a local copy of it when it was first downloaded.

These local copies are kept in what’s called a cache.
Usually one sets a maximum size for the cache and a maximum caching time for documents.

This means that when a new page is visited it is stored in the cache, and if the cache is full (near the maximum size limit) some document that the browser considers unlikely to be visited again soon is deleted to make room.

Also, if you go to a page that is stored in the cache, the browser may find that you’ve set 7 days as the maximum storage time and 8 days have now passed since the last visit, so the page needs to be reloaded.

Exactly how caches work differs between browsers, but this is the basic idea, and it’s a good one because it saves both time for the user and network traffic.

There are also some HTTP details involved, but they will be covered later.
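The expiry rule just described fits in a few lines. The 7-day limit is the example from the text, and Unix-style timestamps are an assumption of the sketch:

```python
MAX_AGE = 7 * 24 * 3600  # maximum storage time, in seconds

def must_reload(stored_at, now):
    # A cached page older than MAX_AGE has to be fetched again
    return now - stored_at > MAX_AGE

day = 24 * 3600
print(must_reload(0, 8 * day))  # True: 8 days old, reload
print(must_reload(0, 3 * day))  # False: still fresh
```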

2)   Proxy caches

Browser caches are a nice feature, but when many users browse from the same site one usually ends up storing the same document in many different caches and refreshing it over and over for different users. Clearly, this isn’t optimal.

The solution is to let the users share a cache, and this is exactly what proxy caches are all about. Browsers still have their local caches, but HTTP requests for documents not in the browser cache are no longer sent to the server; instead they are sent to the proxy cache. If the proxy has the document in its cache it will just return the document (like the browser cache would), and if it doesn’t it will submit the request on behalf of the browser, store the result and relay it to the browser.

(That is, the proxy cache sits between the browser and the server.)

So the proxy is really a common cache for a number of users and can reduce network traffic rather dramatically. It can also skew log-based statistics badly.

A more advanced solution than a single proxy cache is a hierarchy of proxy caches. A large ISP may have one proxy cache for each part of the country and set up each of the regional proxies to use a national proxy cache instead of going directly to the source web servers. This solution can reduce network traffic even further.



Submitting forms

The most common way for server-side programs to communicate with the web server is through ordinary HTML forms.

The user fills in the form and hits the submit button, upon which the data are submitted to the server. If the form author specified that the data should be submitted via a GET request, the form data are encoded into the URL using the “?” query-string syntax.


The encoding used is in fact very simple. If the form consists of the fields name and email and the user fills them out, the values are appended to the URL after a “?” as name=value pairs, separated by “&”.

If the data contain characters that are not allowed in URLs, these characters are URL-encoded. This basically means that the character (say ~) is replaced with a % followed by its two-digit hexadecimal character number (%7E for ~).
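This encoding is available directly in Python’s standard library; the field values here are illustrative:

```python
from urllib.parse import urlencode

# Joins name=value pairs with "&" and percent-escapes characters
# that are not allowed in URLs (here "@" becomes "%40")
query = urlencode({"name": "Joe", "email": "joe@example.com"})
print(query)  # name=Joe&email=joe%40example.com
```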

POST: Pushing data to the server

GET is not the only way to submit data from a form, however. One can also use POST, in which case the request contains both headers and a body. (This is just like the response from the server.) The body is then the form data, encoded just like they would be on the URL if one had used GET.


MOST IMPORTANT: ( Difference between the POST method and the GET method )

Primarily, POST should be used when the request causes a permanent change of state on the server (such as adding to a data list), and GET when this is not the case (like when doing a search).

If the data can be long (more than 256 characters) it is a bit risky to use GET, as the URL can end up being snipped in transit. Some OSes don’t allow environment variables to be longer than 256 characters, so the environment variable that holds the ?-part of the request may be silently truncated. This problem is avoided with POST, as the data are then not pushed through an environment variable.

Some scripts that handle POST requests cause problems by using a 302 status code and a Location header to redirect browsers to a confirmation page. However, according to the standard, this does not mean that the browser should retrieve the referenced page, but rather that it should resubmit the data to the new destination.
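Putting the pieces together, a POST request carries the same urlencoded pairs in its body, with Content-Length telling the server how much to read. The path and field values are illustrative:

```python
from urllib.parse import urlencode

# Same encoding as for GET, but placed in the request body
body = urlencode({"name": "Joe", "email": "joe@example.com"})

request = (
    "POST /cgi-bin/register HTTP/1.0\r\n"
    "Content-Type: application/x-www-form-urlencoded\r\n"
    "Content-Length: " + str(len(body)) + "\r\n"
    "\r\n"
    + body
)
print(request)
```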



Cookies

An inconvenient side of the HTTP protocol is that each request essentially stands on its own and is completely unrelated to any other request. This is inconvenient for scripting, because one may want to know what a user has done before issuing this last request.

As long as plain HTTP is used there is really no way to know this. (There are some tricks, but they are ugly and expensive.)

To illustrate the problem: imagine a server that offers a lottery. People have to view ads to participate in the lottery, and those who offer the lottery don’t want people to be able to just reload and reload until they win something. This can be done by not allowing subsequent visits from a single IP address (ie: computer) within a certain time interval. However, this causes problems as one doesn’t really know if it’s the same user.


People who dial up via modems to connect to the internet are usually given a new IP address from a pool of available addresses each time, which means that the same IP address may be given to two different users within an hour if the first one disconnects and another user is given the same IP later.

The solution proposed by Netscape is to use magic strings called cookies.

The server returns a “Set-cookie” header that gives a cookie name, expiry time and some more info.

When the user returns to the same URL (or some other URL; this can be specified by the server), the browser returns the cookie if it hasn’t expired.

This way, our imaginary lottery could set a cookie when the user first tries the lottery and set it to expire when the user can return. The lottery script could then check if the cookie is delivered with the request and, if so, just tell the user to try again later. This would work just fine if browsers didn’t allow users to turn off cookies…

Cookies can also be used to track the path of a user through a web site or to give pages a personalized look by remembering what the user has done before.
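The lottery check above amounts to looking for the script’s cookie in the Cookie request header. The cookie name “played” is made up for this sketch:

```python
def has_played(cookie_header):
    # The Cookie header looks like: "name1=value1; name2=value2"
    for part in cookie_header.split(";"):
        name, _, _value = part.strip().partition("=")
        if name == "played":
            return True
    return False

print(has_played("played=1; lang=en"))  # True: tell the user to come back later
print(has_played("lang=en"))            # False: let them play, then set the cookie
```

If the cookie is absent, the script would include a Set-cookie header in its response with an expiry time marking when the user may return.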

Server logs

Most servers (if not all) create logs of their usage. This means that every time the server gets a request it will add a line in its log which gives information about the request. Below is an excerpt from an actual log:

– – [04/Jan/1998:21:24:46 +0100] “HEAD /ftp/pub/software/ HTTP/1.0” 200 6312 – “Mozilla/4.04 [en] (WinNT; I)”
– – [04/Jan/1998:21:30:32 +0100] “GET /robots.txt HTTP/1.0” 304 158 – “Mozilla/4.0 (compatible; MSIE 4.0; MSIECrawler; Windows 95)”
microsnot.HIP.Berkeley.EDU – – [04/Jan/1998:22:28:21 +0100] “GET /cgi-bin/ HTTP/1.0” 200 1445 – “Mozilla/4.03 [en] (Win95; U)”
– – [05/Jan/1998:00:13:53 +0100] “GET /download/RFCsearch.html HTTP/1.0” 200 2399 – “Mozilla/4.04 [en] (Win95; I)”
– – [05/Jan/1998:00:13:53 +0100] “GET /standard.css HTTP/1.0” 200 1064 – “Mozilla/4.04 [en] (Win95; I)”

This log is in the extended common log format, which is supported by most web servers. The first hit is from Netscape 4.04, the second from some robot version of MSIE 4.0, while hits three to five are again from Netscape 4.04. (Note that the MSIECrawler got a 304 response, which means that it used the If-modified-since header.)

A server log can be useful when debugging applications and scripts or the server setup. It can also be run through a log analyzer, which can create various kinds of usage reports. One should however be aware that these reports are not 100% accurate due to the use of caches.
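The core of such a log analyzer is pulling the request, status code and size out of each line. A sketch with a regular expression; the host name and quoting below are illustrative (the format itself is as shown above):

```python
import re

LINE = ('example.host - - [04/Jan/1998:21:24:46 +0100] '
        '"HEAD /ftp/pub/software/ HTTP/1.0" 200 6312')

# "METHOD PATH VERSION" STATUS SIZE
m = re.search(r'"(\S+) (\S+) (\S+)" (\d{3}) (\d+)', LINE)
method, path, version, status, size = m.groups()
print(method, path, status, size)
```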

A sample HTTP client (using Python)

Just to illustrate, in case some are interested, here is a simple HTTP client written as a Python function that takes a host name and path as parameters and issues a GET request, printing the returned results. (This could be made even simpler by using the Python URL library, but that would make the example useless.)

# Simple Python function that issues an HTTP request

import socket

def http_req(server, path):
    # Creating a socket to connect and read from
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)

    # Connecting to the server (assuming port 80)
    s.connect((server, 80))

    # Sending the request; the blank line ends the header section
    s.send(("GET " + path + " HTTP/1.0\r\n\r\n").encode("ascii"))

    # Printing the response until the server closes the connection
    resp = s.recv(4096)
    while resp != b"":
        print(resp.decode("latin-1"), end="")
        resp = s.recv(4096)
    s.close()


Here is the server log entry that resulted from the call http_req("", "/"):

– – [26/Jan/1998:12:01:51 +0100] “GET / HTTP/1.0” 200 2272 – –

The “– –” at the end are the referrer and user-agent fields, which are empty because the request did not contain this information.


If you are at a website buying books online, you may add books to your Cart and continue searching for more books. Every time you click on a new page, your previously selected books in the Cart should not disappear.

If you rely on the default way the WWW works, then since two successive requests (by the same user) have no connection, there would be no books in your Cart every time you click on a new link. Every click would be considered a separate request having no relation to any previous request.

Thus as you browse, all the information that relates to you should be maintained and carried along as you browse more and more. Your previous Shopping Cart contents should be present when you want to add a new book to the Cart.

This is what session tracking enables you to do. It lets you maintain an active session as long as you are browsing, and it gives HTTP a sort of new quality, with every successive request having some relation to previous requests within the same session.

Session tracking is so common that you may not even realise that it is present. You might be used to it. It is used on almost every possible site you visit on the net. For example at Hotmail, once you enter your username and password and reach your inbox, had there been no session tracking then every time you clicked on a particular link in your inbox you would be asked for your password again.

This would be the case since there would be no way to tell that the one who had originally entered his username and password is the same person who is currently asking for more pages. Session tracking allows you to store the information that you have successfully logged in, and this information is checked every time you do anything within your inbox. Thus you are not asked to enter your password with every click. I could give you thousands of examples where session tracking is used, but I guess you have got the point.

Now let’s begin with the actual ways to implement session tracking. I shall explain two ways to implement it:

1. Hidden Fields In Forms
2. URL Rewriting

I also conclude the article with a few lines on cookies, which are also used for session tracking.

Hidden Fields In Forms

This is the simplest and easiest way to implement session tracking. I find this method extremely useful for getting work done quickly. I can explain it with the help of the example I was speaking about – a Cart to hold your books.

Suppose you visit a site and are presented a list of books with checkboxes next to each of them. You could select books and click on an Add to Cart submit button. Sample code for such a page is shown below.

Remember this is just what the code may look like and not the exact page. You should try to understand the logic rather than focus on the syntax. Also remember that these are all dynamic pages being generated using some language such as JSP.

<b>Search results for books</b>
<form method="post" action="serverprogram.jsp">
<input type="checkbox" name="bookID" value="100">Java Servlet Programming<br>
<input type="checkbox" name="bookID" value="101">Professional JSP<br>
<input type="submit" name="Submit" value="Add to Cart"><br>
</form>

Suppose a page similar to the above one was generated when the user searched for some books. The above page has only 2 search results. There is a form with 2 checkboxes, each next to the name of a book, and a Submit button to add any selected books to the Cart.

Now suppose the user clicks on the checkbox next to the book named ‘Java Servlet Programming’, and then clicks on the Submit button. Note that the value of a checkbox is used in this case to store the bookID. Generally when you have many checkboxes, each representing a one-of-many kind of entity, the value of each checkbox differentiates between all of them. In our case, since all the checkboxes represent books, each value represents a different bookID and thus a different book (one book of many). This is actually a programming concept you would be familiar with in case you have done web programming.

Now coming back to the point: when the user checked the checkbox next to the book named ‘Java Servlet Programming’ and then clicked the Submit button, the contents of the form were all bundled together and sent to the server-side program. In our case the program is named serverprogram.jsp.

Now suppose at some further instant the same user is searching for more books; on a search result he might be presented with a page such as the one shown below. Remember that he has already selected a book previously, so that book should be present in his Cart while he adds more books.

<b>Search results for books</b>
<form method="post" action="serverprogram.jsp">
<input type="hidden" name="bookID" value="100"> <!-- HIDDEN FIELD IS USED TO STORE SELECTED BOOK -->
<input type="checkbox" name="bookID" value="150">Teach yourself WML Programming<br>
<input type="checkbox" name="bookID" value="160">Teach yourself C++<br>
<input type="submit" name="Submit" value="Add to Cart"><br>
</form>

Those of you who are experts in programming must have already figured out how hidden fields help in session tracking. For the rest of you who, like me, take more time to figure out what is happening, let me explain.

The new search result produced 2 new books: one named ‘Teach yourself WML Programming’ with a bookID of 150 and another named ‘Teach yourself C++’ with a bookID of 160. So a form was generated with the names of these 2 books and 2 checkboxes, so that the user may select either of these books and add them to the Cart.

But there is one more important thing in the form that was generated.

There is a hidden input field named bookID and having a value of 100. You might have noticed that 100 was the bookID of the book named ‘Java

Servlet Programming’ which the user had initially selected.

This line describing a hidden input makes no difference to the HTML page displayed in the browser. It is totally invisible to

the user. But within the form it makes a world of difference. As the user keeps adding more and more books, there will be many

hidden input fields, each with a different value, each representing a previously selected book.

When this form is submitted to the server side program, that program fetches not only the newly selected checkboxes (newly selected books)

but also these hidden fields, each representing a book previously selected by that user. Note that all the input fields share the same name,

bookID, but their values differ. Within the server side program you would simply read a parameter called bookID, which would be an array

of different values.

You could extract all the values and then use them as required. It is the job of the server side program to add these hidden

fields whenever it generates a new page.

Once again, the main concept to understand is that a hidden field displays nothing ON the HTML page. The user browsing the page

sees nothing unusual, but the value associated with each hidden field can be used to hold any kind of data that you want.


The only care to be taken is that every time your server side program generates a new form, it should read all the parameters passed to it

from the previous form and add all those values as new hidden fields in the new form it generates.
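This regeneration step can be sketched in plain Java. Note this is only a sketch with a made-up class name; in a real servlet or JSP the array would come from request.getParameterValues("bookID") rather than being passed in directly:

```java
// Minimal sketch of the hidden-field regeneration step. In a real
// servlet/JSP the array would come from request.getParameterValues("bookID");
// here a plain array stands in for it.
public class HiddenFieldDemo {

    // Re-emit every bookID received (old hidden fields plus newly checked
    // boxes) as hidden <input> fields for the next form that is generated.
    static String renderHiddenFields(String[] bookIDs) {
        StringBuilder html = new StringBuilder();
        for (String id : bookIDs) {
            html.append("<input type=\"hidden\" name=\"bookID\" value=\"")
                .append(id)
                .append("\">\n");
        }
        return html.toString();
    }

    public static void main(String[] args) {
        // The browser sent one old hidden field (100) and one new selection (150).
        System.out.print(renderHiddenFields(new String[]{"100", "150"}));
    }
}
```

Every ID the program receives, old or new, comes back out as a hidden field, which is exactly what keeps the session alive from one form to the next.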

Thus you could carry information from one HTML page to another and thus maintain a connection between 2 pages.

The disadvantage of hidden fields is that if you do not want the user to know what information is being passed around to maintain a

session (in case that information is somewhat vital, maybe a password or something), then this method is not the best one, since the user can

simply View the Source of the HTML page and see all the hidden fields present in the form.

URL Rewriting

This is another popular session tracking method used by many. It has a few drawbacks, but in spite of that I like to use

this method.

It doesn’t require a lot of understanding to get the work done. URL Rewriting basically means that when the user is presented with a link to a

particular resource, instead of presenting the URL as you normally would, the URL for that resource is modified so that extra

information is passed along when requesting that resource.

I will try explaining URL Rewriting with the same Shopping Cart example used in the hidden field method. Actually I could have shown simpler

examples, but for you to compare the 2 methods I shall take up the same example once again.

So once again assume that a user has searched for some books and has been presented with a search result listing 2 books. It is

basically a form with 2 checkboxes, one for each book, and a Submit button to add any of these books to his Cart.

<b>Search results for books</b>
<form method="post" action="serverprogram.jsp">
<input type="checkbox" name="bookID" value="100">Java Servlet Programming<br>
<input type="checkbox" name="bookID" value="101">Professional JSP<br>
<input type="submit" name="Submit" value="Add to Cart"><br>
</form>

Now once again suppose the user selects the book named ‘Java Servlet Programming’ and then clicks the Submit button. This passes the

contents of the form to the server side program called serverprogram.jsp, which should read the selected checkboxes and do the necessary work

(i.e. make some arrangement to keep track of the selected books, which basically means implement session tracking).

Now suppose the user continues browsing, searches for more books, and is presented with a new search result just like in the previous

example. For better understanding I shall once again give you the same 2 results as shown in the hidden fields method: the 2 books named

‘Teach yourself WML Programming’ and ‘Teach yourself C++’.

<b>Search results for books</b>

<form method="post" action="serverprogram.jsp?bookID=100"> <!-- this is known as URL Rewriting -->

<input type="checkbox" name="bookID" value="150">Teach yourself WML Programming<br>
<input type="checkbox" name="bookID" value="160">Teach yourself C++<br>
<input type="submit" name="Submit" value="Add to Cart"><br>
</form>

You should be able to guess by now what URL Rewriting is all about. In the above HTML source, the target for the form has been changed from

serverprogram.jsp to serverprogram.jsp?bookID=100 .

This is exactly what URL Rewriting means. The original URL, which was only serverprogram.jsp, has now been rewritten as

serverprogram.jsp?bookID=100 . The effect is that any part of the URL after the ? (question mark) is treated as extra parameters

that are passed to the server side program.

These are known as GET parameters; the GET method of submitting forms always passes its data this way.

Now when serverprogram.jsp fetches the parameters named bookID, it is presented with the one that was present after the ? in the

URL as well as the checkboxes newly selected by the user in that form.

Consider a general example where a user has selected 2 values. Then, whenever the program generates a new form, the target for that form

should look something like

<form method="post" action="serversideprogram.jsp?name1=value1&name2=value2">

This sort of URL keeps growing as more and more values have to be carried from one page to another.

The basic concept of URL Rewriting is that the server side program must continuously rewrite all the URLs it emits, lengthening them

as more data has to be maintained between pages. The user sees nothing unusual on the surface, but when he clicks on a link he not only

asks for that resource but, because of the information after the ? in the URL, also sends the previous data to the program.
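The rewriting step itself can be sketched in plain Java. This is only a sketch with a hypothetical class and method name; java.net.URLEncoder is used so the appended values stay URL-safe:

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Sketch of URL Rewriting: append every previously collected bookID to a
// base URL as GET parameters, URL-encoding each value.
public class UrlRewriteDemo {

    static String rewrite(String baseUrl, String[] bookIDs) {
        StringBuilder url = new StringBuilder(baseUrl);
        for (int i = 0; i < bookIDs.length; i++) {
            url.append(i == 0 ? '?' : '&')   // '?' starts the query string, '&' separates params
               .append("bookID=")
               .append(URLEncoder.encode(bookIDs[i], StandardCharsets.UTF_8));
        }
        return url.toString();
    }

    public static void main(String[] args) {
        System.out.println(rewrite("serverprogram.jsp", new String[]{"100", "150"}));
        // → serverprogram.jsp?bookID=100&bookID=150
    }
}
```

Every link or form target the program emits would be passed through a helper like this, which is exactly why rewriting all the URLs by hand gets tedious as a site grows.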

The disadvantage of URL Rewriting (though it’s a minor one) is that the URL displayed in the browser is of course the rewritten URL. The

clean, simple URL that was seen when hidden fields were used is replaced with one with a ? followed by many parameter values. This doesn’t

suit those who want the URL to look clean. Another disadvantage is that some browsers impose a limit on the length of a URL. So once the data

being tracked grows beyond a certain limit, you may no longer be able to use URL Rewriting to implement session tracking.

But that limit is generally large enough, so don’t be afraid to use this method. Do note, though, that actually rewriting all the URLs within

your program is not a simple task and requires some experience.

In case you are confused by what we have been doing with hidden fields and URL Rewriting, let me sum it up once again. We are trying

to learn methods that let us carry information from one HTML page to another, since by default you cannot pass information from one HTML

page to another.

So to carry data from one page to another, we are either using hidden fields invisible to normal users or rewriting all the links on a page so

that the server side program receives the old as well as new data. Thus we can maintain a session (a connection between multiple pages) for

every user.


Cookies

This is one of the most famous methods and the one used by almost all professional sites.

It gives you complete flexibility to do whatever you want as far as session tracking is concerned. But it is not as easy as the other 2 methods.

Besides, some applications may not allow cookies, in which case you have to revert to the other 2 methods. I had designed websites using

WML (Wireless Markup Language) which worked on WAP based cell phones. Unfortunately the cellphones did not have enough memory to support

cookies, so I had to use hidden fields to get session tracking working.

But cookies work on almost every computer, except when a user has blocked all cookies for security reasons, in which case you

would once again have to use one of the other 2 methods.

Using cookies is probably the best and the neatest of all the methods to maintain sessions. Cookies are basically small pieces of text that are

stored on the user’s computer.

A cookie holds information pertaining to that user. Once the cookie is created on the user’s computer, every further request made by that

user in that session carries the cookie along with it. The value of every cookie is unique (among users browsing a particular website),

so the server side program can differentiate between the various users.

The way to program cookies differs from language to language. Most languages provide some class that hides all the details of

cookie creation and maintenance.

For example in Java you have a javax.servlet.http.Cookie class that is used to work with cookies.
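Underneath any such class, the mechanism is just a pair of HTTP headers: the server sends a Set-Cookie: name=value header once, and the browser returns a Cookie: name=value header on every later request. As a rough sketch (plain Java with made-up header values; a real servlet container does this for you), the server-side parsing amounts to:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch of the mechanism underneath a cookie API: parse the "Cookie"
// request header the browser sends back, so the server can recognise the
// user. The header value used in main() is a made-up example.
public class CookieDemo {

    static Map<String, String> parseCookieHeader(String header) {
        Map<String, String> cookies = new LinkedHashMap<>();
        for (String pair : header.split(";")) {
            String[] kv = pair.trim().split("=", 2);  // split name=value once
            if (kv.length == 2) cookies.put(kv[0], kv[1]);
        }
        return cookies;
    }

    public static void main(String[] args) {
        Map<String, String> c = parseCookieHeader("sessionID=abc123; theme=dark");
        System.out.println(c.get("sessionID")); // prints abc123
    }
}
```

Once the server can read a unique value like sessionID back on every request, it can look up that user's data (such as his Cart) on the server side instead of pushing it through hidden fields or URLs.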



Consider the following example, in which two JSP files, say hello1.jsp and hello2.jsp, interact with each other. Basically, we create a new

session within hello1.jsp and place an object within this session. The user can then traverse to hello2.jsp by clicking on the link present

within the page. Within hello2.jsp, we simply extract the object that was earlier placed in the session and display its contents.

Notice that we invoke encodeURL() within hello1.jsp on the link used to reach hello2.jsp; if cookies are disabled, the session ID is

automatically appended to the URL, allowing hello2.jsp to still retrieve the session object.

Try this example first with cookies enabled. Then disable cookie support, restart the browser, and try again. Each time you should see the

session being maintained across the pages.

Do note that to get this example to work with cookies disabled at the browser, your JSP engine has to support URL rewriting.


hello1.jsp:

<%@ page session="true" %>
<%
  Integer num = new Integer(100);
  session.putValue("num", num);            // place the object in the session
  String url = response.encodeURL("hello2.jsp");
%>
<a href='<%= url %>'>hello2.jsp</a>



hello2.jsp:

<%@ page session="true" %>
<%
  Integer i = (Integer) session.getValue("num");
  out.println("Num value in session is " + i.intValue());
%>


