Wednesday, October 29, 2008

Web Vulnerability Detection: SQL Injection, XSS, GHDB, Sniffers

What is a Web Vulnerability and what is its impact?

Why are web sites still vulnerable when we have technologies like firewalls, SSL, etc.? Because most web sites are built for public use and need to be available 24x7 over a publicly accessible route. Web applications have access to the data in the back end, and in most cases many vital pieces of secured information are stored in the middleware, which may give hackers loopholes to get hold of that highly valuable data. It's not that web applications don't have any security measures in place; the fact is that it's extremely difficult (if at all possible) to eliminate the possibility of a hacker tricking a web app. This paragraph covers why web vulnerabilities still exist; the text below covers the forms in which hackers can attack a site and how we can minimize such risks.

There are many security testing tools, normally known as WVS (Web Vulnerability Scanner) tools, available in the market, many of them Open Source. These tools audit the ins and outs of your web application and help you minimize the direct or indirect security threats that attackers can pose to it. This can be very critical for web sites in the finance, military, and government domains in particular. Many such tools can detect even complex scenarios that could lead an attacker to restricted information. Some popular WVS tools are Acunetix WVS, Wikto, and Nikto. A list of such tools is available here.

Most popular Web Vulnerabilities
  • SQL Injection: a technique that modifies the SQL statements your web application sends to its database in order to get hold of secure data. Read more in this article - SQL Injection & its prevention >>
  • Cross Site Scripting (XSS): a technique which allows an attacker to execute a malicious script in the browsers of your site's visitors. The script can then collect credentials and other important information, for example by reading the keystrokes the visitor types in. It's believed to be among the most common application-layer hacking techniques. XSS allows the hacker to embed malicious content written in JavaScript, VBScript, ActiveX, or Flash (technologies which execute on the client machine). The script appears to run as part of the site's other dynamic content, inviting users to enter critical information or simply asking them to execute something on their machine which may then scan the local system for other vital information. How can hackers embed the scripts? One way is to intercept the client's request and append the malicious script code to the URL reaching the web server. If the server does not detect and handle dangerous characters in both their ASCII and hex-encoded forms, the script may be returned and executed on the client machine without much of an issue. The data collected this way can reach the hackers in a number of ways. For example, the script may add the data to a URL which the hackers can then intercept, or it may lead the user to an insecure page and ask them to enter their secure information there.
  • CRLF Injection: such an attack allows the hacker to inject commands which can leave your web application in an inconsistent state, depending on how open the application is to such attacks. If the inputs coming to the application are not properly validated, the application may be prone to damage. As you probably know, CR (Carriage Return, ASCII 13, \r) followed by LF (Line Feed, ASCII 10, \n) is the sequence Windows systems use to indicate the end of a line, and it also corresponds to the "Enter" keystroke. On Linux/UNIX systems the end of a line is indicated by LF only. How severely such attacks can damage the web application depends on how many loopholes the developer has left in the design and code.
  • Directory Traversal: as the name suggests, such an attack gives the hacker access to the directory structure of your hosted web application, which may in turn allow him/her to view, update, delete, or insert critical resources, and this can obviously cause a complete mess.
  • Authentication Vulnerabilities: such an attack may allow hackers to log in to the system as legitimate users, and you can easily imagine how much harm they can do to the system. If they somehow manage to get the credentials of an admin user, the entire system can collapse (or be compromised) in a jiffy. Some ways of dealing with such attacks are to have a complex password policy and a layered authentication scheme for users with critical privileges. For example, fund transfers on the Internet that require you to enter ATM card digits/codes, expiry date, CVV, etc. are nice examples of layered authentication. The chances of breaking all the layers at once are significantly lower than those of breaking a single layer (even a complex one).
  • AJAX and Web 2.0 Vulnerabilities: the more powerful the technology, the more lethal the attack can be if the technology has been used without sufficient precaution. AJAX and Web 2.0 technologies are excellent at serving their purpose, but developers often miss a thing or two while using them (things which usually don't impact the functionality directly), and this ultimately leaves loopholes for attackers to exploit.
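Three of the vulnerabilities listed above (XSS, CRLF Injection, and Directory Traversal) come down to trusting user input too much. The following is only a minimal Java sketch of the kind of check involved, not a complete defence. The class name InputGuards and its method names are made up for illustration and do not come from any library.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

/** Hypothetical helper class; all names here are illustrative only. */
final class InputGuards {

    private InputGuards() {}

    /** XSS: escape the characters HTML treats as markup before echoing user input. */
    static String escapeHtml(String input) {
        StringBuilder out = new StringBuilder(input.length());
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            switch (c) {
                case '&': out.append("&amp;"); break;
                case '<': out.append("&lt;"); break;
                case '>': out.append("&gt;"); break;
                case '"': out.append("&quot;"); break;
                case '\'': out.append("&#39;"); break;
                default: out.append(c);
            }
        }
        return out.toString();
    }

    /** CRLF Injection: detect CR or LF before a value is used in a header or a log line. */
    static boolean containsCrlf(String input) {
        return input.indexOf('\r') >= 0 || input.indexOf('\n') >= 0;
    }

    /** Directory Traversal: resolve the requested name and insist it stays under baseDir. */
    static Path resolveInside(Path baseDir, String requested) {
        Path base = baseDir.toAbsolutePath().normalize();
        Path target = base.resolve(requested).normalize();
        if (!target.startsWith(base)) {
            throw new IllegalArgumentException("Path escapes base directory: " + requested);
        }
        return target;
    }
}
```

For instance, InputGuards.resolveInside(Paths.get("webroot"), "../../etc/passwd") throws an exception instead of returning a path outside webroot. The path check here is purely lexical and does not follow symbolic links, so treat it as a starting point only.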
GHDB - Google Hacking Database

This is a huge database of the queries which have been used by attackers (or which are expected to be used by them) to get access to sensitive information. Almost all WVS tools run these queries against the crawled contents of your site to report how vulnerable the site is. Since the database is so large, fixing all the reported issues will make your site much healthier and far more resistant to attacks.

Liked the article? You may like to Subscribe to this blog for regular updates. You may also like to follow the blog to manage the bookmark easily and to tell the world that you enjoy GeekExplains. You can find the 'Followers' widget in the rightmost sidebar.


XML: what is it and how is it different from HTML?

XML - Extensible Markup Language

XML is a system and hardware independent language used for producing Unicode text files called XML Documents which define data and describe the structure of the contained data. The World Wide Web Consortium (also known as W3C) owns and controls the specifications of this language.

Since XML Documents contain data, they can be used for transporting data from one system to another. Additionally, an XML Doc describes the structure of the contained data, so the receiving system can easily interpret it. This makes XML a standard for data communication between systems (homogeneous or heterogeneous).

XML and HTML both use tags and attributes and hence may look similar, but they are vastly different in terms of what they are used for and what they are capable of. XML is primarily used for communication between two systems and hence concentrates more on the structure of the data, whereas HTML is primarily used for the presentation of the data and hence concentrates more on its appearance.

Another obvious difference is that HTML tags are pre-defined and fixed in number, and each has a specific meaning attached to it, whereas XML tags have no fixed meaning attached to them and their names are not pre-defined either.

XML Document Structure

The structure of an XML document is very simple and consists of two parts:
  • Prolog: it is made up of two components - the XML Declaration (which may specify the Unicode encoding scheme and whether the XML Doc is a standalone doc or not) and the DTD Declaration. A Document Type Definition (DTD) is used to identify the markup elements used in the XML body. The prolog is optional, but it's normally good to have one in an XML Doc.
  • Document Body: this part of an XML doc contains the actual data and its structure definition. It always contains a single Root Element, which may contain any number of sub-elements within it.
Both parts, the prolog and the document body, may be followed by Processing Instructions. As the name suggests, they are instructions used by applications to process the XML Doc in a particular way.

Like any other language, XML allows comments to be embedded in a document. They are mainly meant to explain the data or to provide additional details to the human readers of the XML Doc.
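To make the parts described above concrete, here is a small Java sketch that parses a sample document with the JDK's built-in DOM parser and lists what it finds at the top level. The sample document (its note, to, and body elements and the display-hint processing instruction) and the class name XmlStructureDemo are invented purely for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class XmlStructureDemo {

    // Prolog (XML declaration + DTD), then a comment, a processing instruction, and the body.
    static final String SAMPLE =
        "<?xml version=\"1.0\" encoding=\"UTF-8\" standalone=\"yes\"?>\n"
        + "<!DOCTYPE note [\n"
        + "  <!ELEMENT note (to, body)>\n"
        + "  <!ELEMENT to (#PCDATA)>\n"
        + "  <!ELEMENT body (#PCDATA)>\n"
        + "]>\n"
        + "<!-- A comment for human readers -->\n"
        + "<?display-hint compact?>\n"
        + "<note><to>Reader</to><body>Hello</body></note>";

    /** Parses an XML string into a DOM document using the JDK parser. */
    static Document parse(String xml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        return builder.parse(new InputSource(new StringReader(xml)));
    }

    /** Labels each top-level node of the document by its kind. */
    static List<String> topLevelKinds(Document doc) {
        List<String> kinds = new ArrayList<>();
        NodeList children = doc.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            switch (children.item(i).getNodeType()) {
                case Node.DOCUMENT_TYPE_NODE: kinds.add("doctype"); break;
                case Node.COMMENT_NODE: kinds.add("comment"); break;
                case Node.PROCESSING_INSTRUCTION_NODE: kinds.add("processing-instruction"); break;
                case Node.ELEMENT_NODE: kinds.add("root-element"); break;
                default: kinds.add("other");
            }
        }
        return kinds;
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse(SAMPLE);
        System.out.println(topLevelKinds(doc));
        System.out.println("root = " + doc.getDocumentElement().getTagName());
    }
}
```

The XML declaration itself does not show up as a node. Its values are available through methods such as doc.getXmlVersion() and doc.getXmlStandalone().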

Well-formed XML Doc vs Valid XML Doc

A Well-formed XML Doc means the document has been written as per the XML Specifications. For example: the Prolog should contain the XML version, Encoding Scheme and Standalone info in a proper order, the Elements should be properly nested, there should be only one root element, etc.

A Valid XML Doc is a well-formed XML Doc which complies with the associated DTD as well. That means a Valid XML Doc is formed only of the elements which have already been defined in the referenced DTD document and the DTD document should also be written as per the specifications.
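The difference between the two checks can be tried out with the JDK's own parser. The sketch below is only an illustration under simple assumptions: the class name XmlChecks and the sample documents in main are made up, and the DTD is embedded in the document rather than referenced externally.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

final class XmlChecks {

    /** Turns every error the parser reports into an exception; warnings are ignored. */
    private static final ErrorHandler STRICT = new ErrorHandler() {
        public void warning(SAXParseException e) { }
        public void error(SAXParseException e) throws SAXException { throw e; }
        public void fatalError(SAXParseException e) throws SAXException { throw e; }
    };

    private static boolean parses(String xml, boolean validate) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setValidating(validate); // true = also check the document against its DTD
            DocumentBuilder builder = factory.newDocumentBuilder();
            builder.setErrorHandler(STRICT);
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false; // the parser rejected the document
        } catch (Exception e) {
            throw new IllegalStateException(e); // parser setup or I/O problem, not a document problem
        }
    }

    /** Well-formed: follows the XML syntax rules (nesting, single root, and so on). */
    static boolean isWellFormed(String xml) { return parses(xml, false); }

    /** Valid: well-formed and also conforming to the DTD it declares. */
    static boolean isValid(String xml) { return parses(xml, true); }

    public static void main(String[] args) {
        String dtd = "<!DOCTYPE note [<!ELEMENT note (to)><!ELEMENT to (#PCDATA)>]>";
        System.out.println(isValid(dtd + "<note><to>x</to></note>"));      // declared elements only
        System.out.println(isValid(dtd + "<note><cc>x</cc></note>"));      // cc is not declared in the DTD
        System.out.println(isWellFormed("<note><to>x</note>"));            // to is never closed
    }
}
```

A document without any DTD can still be well-formed, but a validating parser has nothing to validate it against and reports it as not valid.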


Sunday, October 26, 2008

ORM: Object Relational Mapping Technology, JDO, JPA, SDO

ORM (Object Relational Mapping): This is a programming technique which eliminates the need to manually transform objects into scalar values when saving them into (and to rebuild the objects when retrieving them from) a relational database, which can't directly store object values. This technique is also referred to as O/RM and O/R Mapping. Thus we see that ORM is a technique which makes objects persistent, i.e., developers can save and retrieve objects directly, the same way they can with scalar values like int, string, etc. Example: suppose you have an Employee class with various fields like name, dob, addresses, etc. Here addresses may refer to a list of objects of type Address (another class encapsulating the address details). Now if a developer wants to make such an Employee object persistent, he probably can't do it directly with most of the popular relational databases. ORM helps achieve this. Once you start using an ORM technology, you can forget about how the data is transformed back and forth internally, and you get the feeling that objects are stored and retrieved the same way as numbers, for instance.
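To get a feel for the work an ORM tool takes off your hands, here is a hand-written Java sketch of the Employee example. It flattens one Employee object into the scalar rows that would be inserted into two tables. The table names (employee, address), the id field, and the EmployeeRowMapper class are assumptions made for this illustration. A real ORM tool generates the equivalent SQL for you.

```java
import java.util.ArrayList;
import java.util.List;

final class Address {
    final String street;
    final String city;

    Address(String street, String city) {
        this.street = street;
        this.city = city;
    }
}

final class Employee {
    final long id;
    final String name;
    final String dob;
    final List<Address> addresses;

    Employee(long id, String name, String dob, List<Address> addresses) {
        this.id = id;
        this.name = name;
        this.dob = dob;
        this.addresses = addresses;
    }
}

/** Hand-written mapping of one object graph to flat rows: the chore an ORM automates. */
final class EmployeeRowMapper {

    /**
     * Returns one Object[] per row to insert. The first element names the
     * (hypothetical) target table; the rest are the scalar column values.
     */
    static List<Object[]> toRows(Employee e) {
        List<Object[]> rows = new ArrayList<>();
        rows.add(new Object[] {"employee", e.id, e.name, e.dob});
        for (Address a : e.addresses) {
            // Each address becomes its own row, linked back through the employee id.
            rows.add(new Object[] {"address", e.id, a.street, a.city});
        }
        return rows;
    }
}
```

An Employee with two addresses becomes three rows: one for the employee table and two for the address table. Reading the object back requires the reverse mapping, and every new field means touching both directions, which is exactly the repetitive code an ORM removes.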

Challenges in the implementation of an ORM system?

Not that difficult to imagine, is it? There has to be some mechanism for generating a set of SQL statements responsible for storing the various pieces of the object data, and similarly another set of SQL statements to read them back and rebuild the object on retrieval. The challenges in generating a predictable and efficient set of SQL statements that do the trick include Performance, Scalability, CRUD operation management, Maintainability, and Flexibility, to name a few. As you can see, these are infrastructure-related challenges, and an ORM system frees the application developer from focusing on such things, which consequently allows him to pay full attention to the implementation of the business logic. Clear separation of roles, right? In addition, such infrastructure-related challenges are so vast and generic in nature that a specialized set of people (involved in the design and development of ORM tools) will probably end up doing the task in a far better manner than a set of application developers who are more skilled in implementing business logic.

Currently available ORM implementations

We have several popular standards whose implementations serve as the ORM tools used by the majority of application development worldwide. These standards and a few of their implementations are listed below:
  • JDO - Java Data Objects is a standard with many implementations by multiple vendors. Being a standard sets it apart from another ORM implementation named EOF (Enterprise Objects Framework), developed by NeXT. Initially EOF was tightly tied to NeXT's toolkit, OpenStep. EOF now comes in two different implementations: the Objective-C implementation, which comes with the Apple Developer Tools, and the pure Java implementation, which comes with WebObjects, the first object-oriented Web Application Server (developed by NeXT).
  • JPA - Java Persistence API is another popular standard for ORM. Hibernate is a popular implementation conforming to this standard.
  • EJB - Enterprise Java Beans is also a popular ORM standard with implementations by multiple vendors. EJB3 is the latest release of this standard.
  • SDO - Service Data Objects is another standard gaining ground these days. It aims to deliver updatable datagraphs to business-level components written in any programming language, and the mapping is usually done with the help of an enterprise metadata repository.

How can we avoid ORM while meeting the needs?

Is it possible? Yes, it's certainly possible, but in that case you have to use an OODBMS (Object Oriented Database Management System) instead of the far more popular RDBMS (Relational Database Management System). An OODBMS is also known as a Non-SQL DBMS, as we don't use SQL queries to insert/update/manipulate data. SQL is a standard query language designed for relational databases. This is one of the chief reasons why OODBMSs have not become popular so far: people are so used to SQL for their data access needs that living without it seems quite difficult. For all of them, ORM technology is the best possible solution, giving them the best of both worlds.


Wednesday, October 22, 2008

XML Database - NXD, XAPI, Apache Xindice, Ozone DB, Sedna

XML Database - what's it?

An XML Database is just like any other database system, the only difference being that the data is stored in XML format. We have two major categories of XML Databases:
  • XML-enabled DB: here the underlying database is one of the traditional databases, but it is equipped with the ability to map XML input into the format the underlying DB understands, and it also maps the output returned by the underlying DB into the corresponding XML output for the user. Evidently such an XML DB differs from a traditional DB in only one sense - it handles the mappings automatically, so the user does not have to do that in the middleware.
  • Native XML DB (NXD): databases belonging to this category have the XML document as the fundamental unit of storage (the way a row of a table is the fundamental unit of storage in a traditional relational DB). How the data is actually stored in the DB is of little concern (if any) to the user; it is implementation dependent and may not be in the form of text files.
Why do we use XML Databases?

This question is very simple to answer, right? XML has become the norm for data transportation these days, so you may be required to deal directly with XML input and, similarly, to return XML output. In such cases (which are becoming more and more common) an XML DB can be of great help, as it saves you from paying much attention to how the mappings actually take place.

Similar to JDBC and ODBC for relational databases, we have XAPI for XML Databases, which provides implementation-independent access to the XML database.

Examples of XML Databases
  • Apache Xindice - it is pronounced zeen-dee-chay. It's an example of a Native XML DB. It's basically a continuation of the project named 'dbXML Core', which was donated to the Apache Software Foundation in Dec, 2001. Given what NXDs are used for, if you deal with XML input and XML output then Apache Xindice is something you may find quite interesting. Needless to say, like other Apache projects it is Open Source and free. Apache Xindice V1.1 was released on May 09, 2007. Apache Xindice supports XAPI for Java development and an XML-RPC API for other languages. More details can be found here.
  • Ozone Database - it's an Open Source, Object-Oriented Native XML DBMS which is implemented completely in Java and hence allows an application program to run directly in a transactional database environment. Ozone DB has a W3C-compliant DOM implementation supporting XML data storage and retrieval, and you can use almost every XML tool to access the data. It also supports Apache Xerces-J and Xalan-J. One important point to note here is that Ozone DB doesn't depend on any back-end DB or mapping technology, as it contains its own clustered storage and cache system to store and manage persistent Java objects. More details can be found here.
  • Sedna DB - it's another Native XML DB, implemented in C/C++, which supports a wide range of XML applications including Content Management, Event-based SOA, etc. It's a full-fledged DBMS and it supports the W3C-compliant XQuery language for querying. This is also an Open Source DB and can be obtained for free under the Apache License 2.0. It's now available for all popular platforms - Linux, Windows, FreeBSD, and Mac OS. More details can be found here.
  • myXMLDB - another Open Source Native XML DB, implemented on top of MySQL, which uses BLOBs to store XML documents with a maximum size of 256MB each. It has a GUI interface and a Java implementation of XAPI. This XML DB is known for supporting huge XML documents. Since most other well-known XML databases use DOM for storing an XML doc, they can't support really huge XML docs. DOM is inherently very memory consuming, and this is where myXMLDB takes the lead over the others. If your project requires some really huge XML Docs to be stored and accessed, then you would probably be left with this option only. More details can be found here.
Apart from these Native XML databases, we have several XML-enabled Databases as well. Some of them are MS Access 2002, IBM DB2, MS FoxPro, IBM Informix, Oracle, MS SQL Server, Sybase ASE 12.5, PostgreSQL, etc. A comprehensive list and details of all these XML Databases can be found here.


Sunday, October 19, 2008

Evolution of Agile Methodologies. Engineering vs Agile.

Software Development was initially based on coding and fixing. That worked well for smaller software, but as the size and complexity of software grew, the need for a proper process was felt because debugging and testing such software became extremely difficult. This gave birth to the Engineering Methodologies.

Engineering Methodologies or Plan-Driven Methodologies

These methodologies were developed to make software development happen in a more disciplined and structured manner, which ultimately avoided many bugs, inefficiencies, and inflexibilities of the software in the early stages of development itself. Evidently testing and debugging became less terrible, and thanks to all these advantages the engineering methodologies became hugely successful. Example: Waterfall Model

Agile Methodologies

If engineering methodologies were so successful, why did we need this? Well... nothing is perfect, and Engineering Methodologies are no exception. They require a great deal of documentation (to make software development more disciplined and structured) and hence slow the pace of development considerably, especially for larger software. In a way Engineering Methodologies look bureaucratic in nature.

Agile Methodologies evolved to eliminate (or at least lessen) this drawback of engineering methodologies. They require far less documentation, striking a useful compromise between no process (as was the case earlier) and heavy process (as is the case with Engineering Methodologies), so they sit somewhere in between the two. Isn't that going backwards? No... Agile Methodologies use a different approach to cut down on documentation, and hence software development remains almost as structured as it is with Engineering Methodologies. How do Agile Methodologies manage to cut the documentation to that extent? Simply by requiring a smaller amount of documentation for any given task and by making the documentation code-oriented most of the time. The designers of Agile Methodologies believe that source code should be a key part of the documentation. This approach not only avoids the huge documentation otherwise needed, but also makes the documentation more precise and effective. With an Agile Methodology we focus on the smallest workable piece of functionality that delivers business value early; we develop this functionality, test it, and keep adding new functionality throughout the life cycle of the project. This gives us a chance to look at the project and adapt to changes every few weeks, instead of finalizing everything at the beginning and then working to that plan thereafter, as is the case with Engineering Methodologies like the very famous Waterfall Model.

Differences between Engineering Methodologies and Agile Methodologies

  • Predictive vs Adaptive - Engineering Methodologies are Predictive in nature and focus heavily on pre-planning the processes in great detail. This obviously makes them comparatively less flexible in the face of future changes. Agile Methodologies, in contrast, are Adaptive in nature and welcome changes, even to the point of changing themselves.
  • Process-oriented vs People-oriented - Engineering Methodologies are Process-oriented. As mentioned in the point above, they focus on pre-planning processes in great detail and subsequently come up with a defined overall process to be followed by whoever uses it. Agile Methodologies, on the other hand, are People-oriented, as they hold that process definition is not an independent thing and that the development of software relies heavily on the skills of the development team rather than on defined processes. Agile Methodologies use processes only to support the development team in doing their work more effectively and efficiently. Process never takes the lead in agile methodologies.


Thursday, October 16, 2008

What is SQL Injection? How can it be handled?

SQL Injection - what's it?

It's an attack in which a malicious piece of code is added to a SQL statement which is later passed to a DBMS to be parsed and executed. If the overall SQL statement string is syntactically correct (and of course if the user has the rights to run those SQL commands), the SQL gets executed by the SQL engine and may leave the database inconsistent (to say the least; such an attack may well ruin the database completely).

Example: suppose you have a SQL statement string as specified below:

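// VULNERABLE: the request parameter is concatenated straight into the SQL text
String parm = request.getParameter("parameterName");
String queryString = "SELECT * FROM table_name WHERE column_name = '" + parm + "'";

// The DBMS receives whatever the user typed as part of the statement
Statement stmt = con.createStatement();
ResultSet rs = stmt.executeQuery(queryString);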

Now, if the end user enters, say, "value'; DROP TABLE table_name --" as the parameter, the SQL query string will ultimately become "SELECT * FROM table_name WHERE column_name = 'value'; DROP TABLE table_name --'", which is valid SQL: it will first fetch the results of the query and immediately afterwards drop the table, before you even realize it (whether both statements actually run in a single call depends on the driver and the DBMS). Notice the use of '--' at the end of the parameter, which mischievously makes the last "'" of the queryString ineffective. Anything following '--' is considered a comment in SQL, right? Needless to say, ';' is the SQL statement terminator, and it gives the attacker the luxury of terminating the expected SQL statement and running his/her own malicious SQL statement.

Thus we see how easily one can add malicious but valid code to SQL statements and how terrible the results can be for your application.

How can SQL Injection be handled?

The next obvious question which comes to mind is - how can we effectively handle it? Can we really prevent it from happening altogether? I really doubt that we can claim 100% avoidance, but we can follow several measures to catch it beforehand and then deal with the situation accordingly.

As long as a SQL query is valid (and the user has the right set of privileges), its execution can hardly be stopped without programmatic intervention at both the middleware level and the DBMS level. So how can we actually deal with such a situation? By following the same old rule of not trusting what comes in as input, and by ensuring that all user inputs go through a strict set of validations and reach the DBMS only if they pass every stage. How many validations the input should go through depends on the foresight of the DBA/Developer, and the success of a SQL Injection attack will then depend on how far ahead of the attacker the DBA/Developer can think.

Some Typical Validations for SQL Injection avoidance

All Inputs should be strictly validated without making any assumption about their

  • datatype
  • length
  • format
  • range

Strictly checking and rejecting inputs containing risky characters like

  • ; : the SQL query delimiter (we saw the impact above)
  • -- : the SQL comment
  • ' : the string delimiter (it should be added by the code, not by the input)
  • /*..*/ : comment delimiters, which can be used to fool the app to a great extent
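The two kinds of checks listed above could look roughly like this in Java. This is only a sketch under assumed rules: the class name InputValidator, the user-name pattern (3 to 20 letters, digits, or underscores), and the method names are all made up for illustration and would need to match your own application's inputs.

```java
import java.util.regex.Pattern;

/** Hypothetical validator; the rules below are examples, not recommendations. */
final class InputValidator {

    // Assumed format rule: 3 to 20 letters, digits, or underscores.
    private static final Pattern USERNAME = Pattern.compile("[A-Za-z0-9_]{3,20}");

    private InputValidator() {}

    /** Format and length: accept only values matching the expected pattern. */
    static boolean isValidUsername(String s) {
        return s != null && USERNAME.matcher(s).matches();
    }

    /** Datatype and range: parse as an int and check the bounds; null means rejected. */
    static Integer parseInRange(String s, int min, int max) {
        if (s == null) {
            return null;
        }
        try {
            int value = Integer.parseInt(s.trim());
            return (value >= min && value <= max) ? Integer.valueOf(value) : null;
        } catch (NumberFormatException e) {
            return null; // not a number at all
        }
    }

    /** Flags the characters from the list above: ; -- ' and the comment delimiters. */
    static boolean hasSuspiciousSqlChars(String s) {
        return s.contains(";") || s.contains("--") || s.contains("'")
            || s.contains("/*") || s.contains("*/");
    }
}
```

Checking against an expected pattern (as isValidUsername does) is usually more robust than hunting for bad characters, because rejecting every single quote also rejects legitimate values such as the surname O'Brien.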

Checking the SQL Parameters - the Parameters collection inherently performs type and length checks, so it's better to use it for passing parameters to parametrized SQL statements whenever possible.
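In JDBC, the counterpart of a parametrized SQL statement is java.sql.PreparedStatement. Below is a minimal sketch of the earlier query rewritten that way. The class name SafeQuery and the method findByColumn are made up for illustration, table_name and column_name are the same placeholder names used in the example above, and the Connection is assumed to be supplied by the caller.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

final class SafeQuery {

    // The SQL text is fixed; the ? marks where the user's value is bound later.
    static final String SQL = "SELECT * FROM table_name WHERE column_name = ?";

    private SafeQuery() {}

    /** Returns the first column of every matching row as a string. */
    static List<String> findByColumn(Connection con, String parm) throws SQLException {
        List<String> results = new ArrayList<>();
        try (PreparedStatement stmt = con.prepareStatement(SQL)) {
            stmt.setString(1, parm); // sent as a data value, never parsed as SQL text
            try (ResultSet rs = stmt.executeQuery()) {
                while (rs.next()) {
                    results.add(rs.getString(1));
                }
            }
        }
        return results;
    }
}
```

With this version the input "value'; DROP TABLE table_name --" is simply compared against column_name as one odd-looking string, because the statement text no longer depends on what the user typed.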

Checking Injection caused by Truncation - pay attention to the maximum length a variable can hold, as the remaining characters of the assigned value will be truncated silently, and that may cause severe damage in certain cases.

These are just a few basic validations. Depending upon the actual DBMS you're using, you can have various other validations before the SQL statement is permitted to be parsed and executed.


Sunday, October 12, 2008

CGI, Servlets, JSP, Model 1 & 2 Architectures

Evolution of Dynamic Content Generation

When the World Wide Web (WWW) started in 1989 at the CERN laboratory, the idea was to have a mechanism which enabled researchers to share research information using hypertext docs. Hence the Web was designed to serve static content in the beginning. The obvious and natural progression was the ability to generate content dynamically on the Web, and that is where the concept of CGI came in.

CGI - Common Gateway Interface

CGI was designed to generate content on the Web dynamically. It became hugely popular in no time, as it allowed Web applications to access the database to show results based on dynamically selected criteria, and it also made it possible to insert/update information in the database using input from the end user.

As its popularity grew, the scalability demands on web applications also grew, and we started realizing the limitations of CGI.

Limitations of CGI

  • New Process per Request - the main drawback of the CGI approach was that it spawned a new heavyweight operating system process every time a new request came in from an end user's browser. This immensely restricted the scalability and responsiveness of web applications, as creating OS processes (and reclaiming them once the request is served) is time and resource consuming.
  • Communication Gap between Web Server and Requests - since every request is executed in a different OS process from the web server process, it becomes difficult to have smooth communication between the server and the requests to handle things like logging, authorization, etc.

Alternatives of CGI

Several alternatives of CGI came into picture - all having relatively better performance and scalability support, but it was the Java Servlets Technology which actually replaced CGI almost entirely. These alternatives are:

  • FastCGI
  • mod_perl
  • Java Servlets
Java Servlets - what are they?

Sun Microsystems introduced this technology in 1997, and it became an instant hit due to the various advantages it provided over CGI. It was no longer necessary to have a new process every time a request was made by an end user. It was a platform-independent (as it is written completely in Java), component-based approach to developing web applications with dynamic content generation capabilities. Since servlets are written in Java, the rich, tried and tested set of Java APIs is readily available to them, and this advantage took the technology way above its competitors.

Why were JSPs needed when we had Servlets?

Servlets do an excellent job of dynamic content generation, but it becomes difficult and tedious to use Servlets for presenting the data in HTML. Every HTML change requires us to recompile the Servlet, and maintaining Servlets becomes difficult because HTML changes are quite frequent and each one ends up requiring corresponding Java code changes.

Another serious drawback of this approach was that it didn't facilitate a clear separation of roles and responsibilities. HTML design and development is primarily the responsibility of a Web Designer (usually a person with limited Java expertise), whereas the design and development of Servlets is the responsibility of Java Developers. Using Servlets for the presentation of data mixed both these roles, and hence the entire development life cycle used to be more complex and slower. A clear separation of roles and responsibilities improved the overall development cycle and also made the applications more maintainable.

JSP (JavaServer Pages) Technology is used to achieve this clear separation. A JSP can use normal HTML tags/elements the way any other HTML file can, and in addition it can have Tags, Scriptlets, etc. to encapsulate the business logic for dynamic content generation. A Web Designer can simply use those tags or leave the scriptlets to be embedded by the Java Developers. In fact, it has become a better practice to avoid scriptlets in a JSP page as much as we can and to rely only on Tags for dynamic content generation. This not only makes the JSP page easier to maintain, but also enhances the reusability of the code (written for Tags) and hence improves maintainability.

JSP Model 1 Architecture

In this architecture a JSP Page is used not only for displaying the output to the client, but also for the entire request processing, including accepting the request, creating JavaBeans (or connecting to the DB for data), executing the business logic to generate dynamic content, etc. This approach has the obvious disadvantage of producing a complex, inflexible, and less maintainable JSP Page.

JSP Model 2 Architecture

The main difference between the Model 1 and Model 2 architectures lies in the way the request is processed. Model 2 is based on the MVC (Model View Controller) pattern: a Servlet (serving as the Controller) intercepts all the client requests, connects to the DB and builds the JavaBeans (which serve as the Model), and finally makes the data available to the corresponding JSP (serving as the View), which renders the response. Evidently this approach is far more organized, scalable, efficient, and maintainable than the previous one.


How to implement Pagination in JSP Pages?

This question was asked by one of our visitors, Ranvijay. Thank you Ranvijay for posting it. I thought it was worth posting the response as a separate article to improve the chances of it reaching a wider range of visitors.

How to implement Pagination in JSP?

I'm afraid there isn't a standard way of implementing pagination. Different situations may demand different strategies. If a query doesn't return huge data (result set) and you're not very concerned about scalability and memory (which is very unusual in a typical web application) then you may like to read the entire data into a bean in the middleware and build the different pages from that bean. You just need to iterate through the master bean to get the data for a particular page depending upon its page number, and you can either populate a temp page bean with that data or directly populate the view fields. This approach has the obvious advantage of minimizing the DB trips. Now, it's quite easy to see that even if you have sufficient memory, with huge data the first query will take too much time to bring all the data back, which may be frustrating for the end-users. So, as already said, this approach should be considered only in the cases where you're sure that a query won't return too much data.
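The "master bean" approach above can be sketched roughly as follows. This is only a minimal illustration: the class name InMemoryPager, the 1-based page numbering, and the use of a plain List as the master bean are all assumptions, not something prescribed by JSP.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper: slices an in-memory "master" list into pages.
final class InMemoryPager {
    private InMemoryPager() {}

    /** Returns a copy of the rows that belong to the given 1-based page number. */
    static <T> List<T> page(List<T> master, int pageNo, int pageSize) {
        if (pageNo < 1 || pageSize < 1) {
            throw new IllegalArgumentException("pageNo and pageSize must be >= 1");
        }
        long from = (long) (pageNo - 1) * pageSize;   // index of the first row on the page
        if (from >= master.size()) {
            return Collections.emptyList();           // the page lies beyond the data
        }
        int to = (int) Math.min(from + pageSize, master.size());
        return new ArrayList<>(master.subList((int) from, to));
    }
}
```

The returned list plays the role of the temp page bean; the view can iterate over it directly.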

If the query returns (or may return) huge data and you're concerned about the scalability and memory requirements then you'll probably need to make more than one DB call (which adds some overhead for every page). You need to pass on either the page number directly or some other computed data so that you pick only as many records as you need for that page - maybe by using a stable ORDER BY clause in the DB query together with whatever row-limiting feature your database offers. I hope this makes some sense :-)
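As a rough sketch of the "computed data" mentioned above, the page number can be turned into an offset and a row count. The class PageWindow and the LIMIT ... OFFSET syntax are assumptions for illustration; the row-limiting syntax differs from one database to another, and the table and column names are expected to be trusted constants, never user input.

```java
// Hypothetical helper: turns a 1-based page number into the row window for one query.
final class PageWindow {
    final long offset;  // number of rows to skip
    final int limit;    // number of rows to fetch

    PageWindow(int pageNo, int pageSize) {
        if (pageNo < 1 || pageSize < 1) {
            throw new IllegalArgumentException("pageNo and pageSize must be >= 1");
        }
        this.offset = (long) (pageNo - 1) * pageSize;
        this.limit = pageSize;
    }

    /** Builds the query text; LIMIT/OFFSET is assumed and is not supported by every database. */
    String toSql(String table, String orderColumn) {
        return "SELECT * FROM " + table
                + " ORDER BY " + orderColumn   // a stable order keeps pages from overlapping
                + " LIMIT " + limit + " OFFSET " + offset;
    }
}
```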

To implement the view of the pagination you will probably need to associate all the page numbers with some servlet accepting a parameter identifying which page was requested. For example: you may associate page 1 with <a href="/PaginationServlet?pageNo=1">1</a>

How to find how many page numbers to display at the bottom/top? Well... first fire a query which returns the count of the records and then, based on this count, you can easily calculate the page numbers. This approach has another advantage: it lets you mix the above two approaches based on what the count is. That means if the count is low (say, it can be accommodated in 4-5 pages) then you'll probably prefer to get all the data in one DB trip only, otherwise get the data in chunks and prepare the pages from those chunks.
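The calculation itself is a ceiling division. A minimal sketch, assuming a hypothetical PageCount helper and a caller-chosen threshold for deciding between the two strategies:

```java
// Hypothetical helper for the page-number calculation described above.
final class PageCount {
    private PageCount() {}

    /** Number of pages needed to show recordCount rows, pageSize rows per page. */
    static int totalPages(long recordCount, int pageSize) {
        if (recordCount < 0 || pageSize < 1) {
            throw new IllegalArgumentException("recordCount must be >= 0 and pageSize >= 1");
        }
        return (int) ((recordCount + pageSize - 1) / pageSize);  // ceiling division
    }

    /** True when the whole result fits in maxPagesInMemory pages, so one DB trip is enough. */
    static boolean fetchAllAtOnce(long recordCount, int pageSize, int maxPagesInMemory) {
        return totalPages(recordCount, pageSize) <= maxPagesInMemory;
    }
}
```

For instance, 95 records at 10 per page need 10 page links, and 101 records need 11.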

You would probably like to create separate temp beans in the middle tier for at least a few pages (how many is again situation dependent) so that if somebody clicks on an already visited page you have a chance of getting the already populated temp bean (instead of creating it every time that link is clicked). This will ensure the end users get a faster display of already visited pages.
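One possible way to keep those temp beans around is a small per-user cache keyed by page number. The sketch below is an assumption about how this might look: PageCache is a hypothetical name, the loader stands in for whatever code fetches one page from the DB, and the cache is neither bounded nor thread-safe, so a real application would need to limit its size and scope it (for example, to a session).

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical cache of already-built page beans, keyed by page number.
final class PageCache<T> {
    private final Map<Integer, List<T>> pages = new HashMap<>();
    private final Function<Integer, List<T>> loader;  // fetches one page, e.g. from the DB

    PageCache(Function<Integer, List<T>> loader) {
        this.loader = loader;
    }

    /** Returns the cached page if it was visited before, otherwise loads and remembers it. */
    List<T> get(int pageNo) {
        return pages.computeIfAbsent(pageNo, loader);
    }
}
```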

If you're using Hibernate then you may like to have a look at the Criteria and Query interfaces, in particular their setFirstResult and setMaxResults methods.


Recursion - adv/lim, factorial, Ackermann's function

Recursion - adv/disadv, types, factorial, Ackermann's function, etc.

What is Recursion?

It's a programming technique which facilitates writing simpler, shorter, and clearer code. A function calling itself (either directly or indirectly) is called recursion. There are many programming situations which are very difficult (if not practically impossible) to model without using recursion.

Advantages of using Recursion?

Recursive functions are often simpler, clearer, and shorter than their iterative counterparts. They focus directly and precisely on the actual problem, whereas the non-recursive versions need to spend code on other manipulations (such as managing an explicit stack) to avoid recursion.

Disadvantages/Limitations of Recursion?

  • Inherently Inefficient - Recursive functions normally require relatively more space and time to execute. Every single call requires space for its own set of local variables. In addition, the caller needs to retain its own set so that it can resume its execution when the callee returns. A function call also costs some CPU time because control has to be transferred from the caller to the callee and then back to the caller. Such overheads usually can't be eliminated from a recursive function, and hence it'll almost always be less efficient than its iterative counterpart. So why use them? Because there are certain problems which are very difficult to design and develop (and in turn maintain) without using recursion. There may be cases where you don't care too much about the performance and care more about the correctness and maintainability of the solution. Recursion may be of use in those cases.
  • Difficulty in Debugging and fixing bugs - Does it sound a little conflicting with what we discussed in the above point? Well... it's not actually. Recursion does facilitate better maintainability as the code is shorter, clearer, and directly focused on the actual strategy. But, what if you're not very clear about the strategy? Unlike the iterative counterparts, you don't get much access to the intermediate results, and hence debugging a subtle issue may be frustrating at times, particularly in the cases where the function calls itself a large number of times.

Types of Recursion?
  • Preemptive Recursion - the normal recursion where a function keeps calling itself until some boundary condition is satisfied. It's normally achieved by decrementing/incrementing the passed arguments. Example: Recursive Factorial Algorithm where fact(n) calls fact(n-1).
  • Non-Preemptive Recursion - if a recursive call appears as one of the arguments of another recursive call (nested recursion) then such a recursion is called Non-Preemptive Recursion. Example: Ackermann's function. (These two categories correspond to what textbooks usually call primitive and non-primitive recursion.)

Ackermann_fun(m,n) =
    n+1                                          if m=0;
    Ackermann_fun(m-1, 1)                        if n=0;
    Ackermann_fun(m-1, Ackermann_fun(m,n-1))     otherwise.
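The definition above translates almost line by line into code. The following is a minimal Java sketch (the class and method names are just illustrative). Note how the inner call appears as an argument of the outer call; also keep the inputs tiny, because the value and the recursion depth grow extremely fast.

```java
// Direct transcription of the definition; only practical for very small m and n.
final class Ackermann {
    private Ackermann() {}

    static long ackermann(long m, long n) {
        if (m < 0 || n < 0) {
            throw new IllegalArgumentException("m and n must be non-negative");
        }
        if (m == 0) {
            return n + 1;                               // first case: m = 0
        }
        if (n == 0) {
            return ackermann(m - 1, 1);                 // second case: n = 0
        }
        return ackermann(m - 1, ackermann(m, n - 1));   // nested recursive call
    }
}
```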

It's important to note here that the iterative counterpart of a Preemptive Recursion is relatively easier to find than that of a Non-Preemptive Recursion.
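To illustrate how directly a simple recursion such as factorial maps to a loop, here is a minimal iterative sketch in Java (the class name is illustrative, and the long result overflows for n greater than 20):

```java
// Iterative counterpart of the recursive factorial: the recursion becomes a plain loop.
final class IterativeFactorial {
    private IterativeFactorial() {}

    static long factorial(int n) {
        if (n < 0) {
            throw new IllegalArgumentException("n must be non-negative");
        }
        long result = 1;
        for (int i = 2; i <= n; i++) {
            result *= i;   // accumulates 1 * 2 * ... * n; overflows long when n > 20
        }
        return result;
    }
}
```

No comparably simple loop exists for Ackermann's function; an iterative version generally has to manage an explicit stack.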

Recursive Factorial Function - Code

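/* Returns n! for n >= 0. The result overflows long int for fairly small n. */
long int factorial(int n) {
    if (n <= 0)       /* base case: 0! is 1 (also stops negative input from recursing forever) */
        return 1;
    return n * factorial(n - 1);   /* recursive case: n! = n * (n-1)! */
}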

Recursive Factorial Function - Call Sequence for 'factorial(3)'

Geek Explains: factorial call sequence diagram
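In case the diagram doesn't render, the same call sequence can be reproduced with a small tracing version of the function. This is only a sketch in Java; the class name, the trace list, and the indentation scheme are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Records each call and each return of the recursive factorial, indented by depth.
final class TracedFactorial {
    private TracedFactorial() {}

    static long factorial(int n, String indent, List<String> trace) {
        trace.add(indent + "call factorial(" + n + ")");
        long result = (n <= 0) ? 1 : n * factorial(n - 1, indent + "  ", trace);
        trace.add(indent + "factorial(" + n + ") returns " + result);
        return result;
    }

    /** Convenience wrapper: returns the trace for factorial(n). */
    static List<String> traceOf(int n) {
        List<String> trace = new ArrayList<>();
        factorial(n, "", trace);
        return trace;
    }
}
```

For factorial(3) the trace shows four nested calls, factorial(3) down to factorial(0), followed by the returns in reverse order: 1, 1, 2, and finally 6.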


Thursday, October 2, 2008

Structural Patterns contd. - Bridge, Facade, Flyweight, Proxy Design Patterns

Structural Patterns contd. - Bridge, Facade, Flyweight, Proxy

If you have reached this article directly then you may like to first go through the first article on structural design patterns, which covers what they are all about and discusses the Adapter, Decorator, and Composite design patterns in particular.

Bridge Design Pattern

This design pattern is used to achieve one of the basic goals of Object Oriented Programming, which is to separate the interface of a class from its implementation. This approach has the obvious advantage of letting us alter the implementation any time we want without affecting the client code using the class. Sounds similar to the Adapter Design Pattern? Well... it's actually similar, but the intent is quite different. The Adapter Design Pattern is used to make the interfaces of one or more classes look like, and be compatible with, the interface of a particular class, whereas in the case of the Bridge Design Pattern the class is designed in such a way that the interface is fixed and separate from the implementation. Here we don't have one or more classes to make compatible with the interface of some other class; instead we separate the interface of a class from its implementation so that we have the liberty of distributing the interface to clients without being tightly tied to a particular implementation. Any future changes in the implementation won't affect the client code, and this requirement can be really crucial in some cases. For example: if the class is being used to display some data then in future a revised implementation may be used to provide additional details/graphs. This will only require replacing the older implementation with the newer one.

How can we implement this design pattern? Very simple... by just having an interface and subsequently implementing the interface in one or more classes. We of course need to have a fixed interface, otherwise the very purpose of this design pattern, which is to provide the flexibility of a loosely coupled implementation, won't be achieved.
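A minimal Java sketch of this simplified description, built around the data-display example mentioned earlier. All the names (DataDisplay, TextDisplay, BarDisplay, Report) are hypothetical, and the sketch follows the "fixed interface plus swappable implementations" idea described here rather than the fuller textbook structure with a separate abstraction hierarchy.

```java
// Fixed interface handed out to clients (hypothetical name).
interface DataDisplay {
    String render(int[] values);
}

// Original implementation: one value per line.
final class TextDisplay implements DataDisplay {
    public String render(int[] values) {
        StringBuilder sb = new StringBuilder();
        for (int v : values) {
            sb.append(v).append('\n');
        }
        return sb.toString();
    }
}

// Revised implementation: adds a simple bar "graph" next to each value.
final class BarDisplay implements DataDisplay {
    public String render(int[] values) {
        StringBuilder sb = new StringBuilder();
        for (int v : values) {
            sb.append(v).append(' ');
            for (int i = 0; i < v; i++) {
                sb.append('#');
            }
            sb.append('\n');
        }
        return sb.toString();
    }
}

// Client code depends only on the interface, never on a concrete implementation.
final class Report {
    private final DataDisplay display;

    Report(DataDisplay display) {
        this.display = display;
    }

    String print(int[] values) {
        return "Report\n" + display.render(values);
    }
}
```

Switching from new Report(new TextDisplay()) to new Report(new BarDisplay()) changes the output without touching the Report class.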

Facade Design Pattern

This design pattern is primarily used to achieve simplicity. The pattern provides a simpler interface exposing few or no details of the underlying subsystems so that normal users can use them easily in their client code. Advanced users can of course go to the subsystem level to get access to more detailed functionality.

Another obvious advantage of the Facade Design Pattern is that it makes the normal client code independent of the particular implementations of the subsystems, as the normal client uses the higher level interface having no details of the subsystems. This allows any combination of the subsystems to work well for such a client without requiring any client code changes.

How to implement this design pattern? Just think of the higher level interface and have its implementation delegate to the different subsystems. Now you are free to have any implementation for the subsystems without being worried about the clients which use the higher level interface.
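A small Java sketch of the idea, assuming three made-up subsystems (Inventory, Payment, Shipping) hidden behind one hypothetical OrderFacade. The subsystem logic is deliberately trivial; only the delegation matters here.

```java
// Hypothetical subsystems with their own, more detailed interfaces.
final class Inventory {
    boolean reserve(String item) {
        return !item.isEmpty();          // stand-in for a real stock check
    }
}

final class Payment {
    boolean charge(String customer, int cents) {
        return cents > 0;                // stand-in for a real payment call
    }
}

final class Shipping {
    String schedule(String item, String customer) {
        return "ship " + item + " to " + customer;
    }
}

// Facade: one simple entry point that coordinates the subsystems.
final class OrderFacade {
    private final Inventory inventory = new Inventory();
    private final Payment payment = new Payment();
    private final Shipping shipping = new Shipping();

    String placeOrder(String customer, String item, int cents) {
        if (!inventory.reserve(item)) {
            return "rejected: item unavailable";
        }
        if (!payment.charge(customer, cents)) {
            return "rejected: payment failed";
        }
        return shipping.schedule(item, customer);
    }
}
```

A normal client only calls placeOrder, while an advanced client can still use the subsystem classes directly.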

Flyweight Design Pattern

This design pattern is used in the cases where we have multiple instances which differ only in a small, lighter (in terms of storage required) part of their state. In such a case we can store externally that part of the object state (typically the heavier part) which is the same across the instances and make it shared, so that all the instances can point to it. Each instance then holds only the data which is prone to change from one instance to another. Such an approach allows the instances to offload the heavier part of their payload to one shared place in memory, and in turn we end up saving the space and the time needed to create that heavy part redundantly for every instance. This can be really helpful in the cases where the number of instances is quite high.

How to implement this design pattern? You can create a separate singleton class for the non-changing heavier part and in the actual class you can have a reference to that singleton instance. Not sure how to implement a singleton class? You may like to go through this article - Singleton Implementation in Java >>
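A minimal Java sketch of the singleton-based approach described above. The names GlyphFont and Glyph, and the array standing in for the heavy data, are assumptions chosen purely for illustration.

```java
// Shared, non-changing heavy part: created exactly once (simple eager singleton).
final class GlyphFont {
    private static final GlyphFont INSTANCE = new GlyphFont();
    private final int[] outlineData = new int[10_000];  // stands in for heavy shared state

    private GlyphFont() {}

    static GlyphFont getInstance() {
        return INSTANCE;
    }

    int size() {
        return outlineData.length;
    }
}

// Light part: only the state that differs from one instance to another.
final class Glyph {
    private final char symbol;
    private final int x;
    private final int y;
    private final GlyphFont font = GlyphFont.getInstance();  // reference to the shared part

    Glyph(char symbol, int x, int y) {
        this.symbol = symbol;
        this.x = x;
        this.y = y;
    }

    GlyphFont font() {
        return font;
    }

    String describe() {
        return symbol + "@" + x + "," + y;
    }
}
```

However many Glyph objects are created, they all point to the same GlyphFont instance, so the heavy array exists only once.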

Proxy Design Pattern

This design pattern is used to have a simpler object (often called the Proxy object) in place of the more complex object which actually serves the client requests. The simpler object passes the calls to the actual object, maybe after doing some manipulation or transformation. The simpler object makes the life of the client developer easy, as s/he doesn't need to bother about the transformation and other low-level details which the Proxy object handles for them.

The stub and skeleton in RMI do something similar. The client and the server only need to send the raw details to them and they take care of marshalling and unmarshalling.

How to implement this design pattern? Have a simpler higher level class (you can again use the Bridge pattern to design this class) whose methods do the required transformation and, once the actual parameters are ready, call the corresponding methods of the lower level, more detailed class which actually does the required stuff.
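A small Java sketch of this description, with a made-up lower level class (PaymentGateway) that expects amounts in cents and a currency code, and a hypothetical proxy (PaymentProxy) that accepts friendlier input and transforms it before delegating. The fixed "USD" currency is just an assumption of the example.

```java
// Lower level, more detailed class (hypothetical): works in cents and currency codes.
final class PaymentGateway {
    String transfer(long amountInCents, String currencyCode, String accountId) {
        return "TRANSFER " + amountInCents + " " + currencyCode + " -> " + accountId;
    }
}

// Proxy: offers a simpler method, transforms the arguments, then delegates.
final class PaymentProxy {
    private final PaymentGateway gateway = new PaymentGateway();

    String pay(String accountId, double dollars) {
        long cents = Math.round(dollars * 100);     // transformation: dollars to cents
        String cleanedId = accountId.trim();        // transformation: tidy up the input
        return gateway.transfer(cents, "USD", cleanedId);
    }
}
```

The client only ever calls pay and never has to know about cents or currency codes.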
