Rambling notes

Monday, June 12, 2017

Assumptions of linear regression

OLS (ordinary least squares) regression, or linear regression, makes some assumptions about the data, and if these assumptions are not satisfied, the results of the regression are not valid. In this post, we will look into the assumptions made by linear regression and how R helps us check whether our model meets them.

We will use the same model that we created in the last post between "female literacy rate" and "age at first marriage in females".

Assumptions of linear model

The assumptions are about the data but in OLS, we typically talk about assumptions in terms of residuals produced by our model.

1. Normality of residuals: Linear regression assumes that the residuals are normally distributed.

2. Linearity of residuals: The relation between our predictor and the response variable must be linear. In other words, the points (x_i, y_i) fall on an almost straight line. If there exists a non-linear relation between the response and the predictor, it will not be captured by our LINEAR model, and it leaks into the residuals in the form of patterns.

3. Homoscedasticity of residuals, or homogeneous variance of y over the entire range of x: The residuals should have roughly the same spread at every value of x.

4. Independence of data: This assumption requires that all observations in the data frame be independent of each other. As an example, let's say we regress the health of cows on the food eaten by them and we generate a fitted regression line. The regression line represents the relation between the health of cows (response variable) and the food eaten (predictor variable).
In case our target population is quite narrow, say a tiny town, the chances are high that all the cows eat the same kind of food. The observations so collected are not independent of each other and thus violate this assumption.

Let's learn about these assumptions in a little more depth and see how R helps us spot a violation.

To recall, our model has the following specification:

p2<-lm(data=litrecy_and_ageatmarriage_grouped_by_Country,female_litrecy_rate~Age )

Let's plot the residuals for the above model:

par(mfrow=c(2,2)) ## This will arrange the plots in 2*2 matrix

plot(p2)

The above gives the residual plots shown in figure 1 below.

1. Check the "normality of residuals" assumption: This assumption is checked using the top-right graph. Also called a probability plot, Q-Q plot or quantile-quantile plot, it compares the residuals to ideal normal observations. Assuming our observations come from a normal distribution with mean μ and standard deviation σ, the standardized data satisfy
\[ \frac{y_i - \mu}{\sigma} \approx q_i \;\Rightarrow\; y_i \approx q_i \sigma + \mu \]
where q_i is the corresponding quantile of the theoretical standard normal distribution.
This graph should be an approximate straight line.
As the Q-Q plot of our model p2 looks like an almost straight line, it conforms to the normality assumption.
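
As a rough sketch of what the Q-Q panel computes, the snippet below fits a model on simulated data (the Gapminder files are not bundled here, so the variables `age` and `literacy` and their values are made-up stand-ins) and pairs the standardized residuals with theoretical normal quantiles. The correlation at the end is only an informal summary of straightness, not a formal test.

```r
# Simulated stand-in for the blog's data (hypothetical values, not the Gapminder data).
set.seed(42)
age <- runif(53, min = 17, max = 27)
literacy <- -80 + 6 * age + rnorm(53, sd = 24)
fit <- lm(literacy ~ age)

# Standardized residuals, the quantity shown on the y-axis of R's Q-Q panel.
res <- rstandard(fit)

# plot.it = FALSE returns the coordinates without drawing anything:
# x = theoretical standard normal quantiles, y = sample quantiles.
qq <- qqnorm(res, plot.it = FALSE)

# A correlation close to 1 means the points lie near a straight line.
qq_cor <- cor(qq$x, qq$y)
print(qq_cor)
```

Calling `qqnorm(res)` followed by `qqline(res)` draws the same points with a reference line.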

2. Check the "linearity of residuals" and "homoscedasticity" assumptions: These assumptions are checked using the top-left plot. The graph plots fitted values (predicted values) on the x-axis and residuals on the y-axis.
If the data points are scattered along a curve or parabola, it indicates a non-linear relationship between the predictor (independent variable) and the response (dependent variable). In our plot, the data points don't seem to follow any specific pattern; rather, the spread is random. This randomness in the spread of the data supports a linear relationship between the predictor, i.e. age at first marriage of females, and the response variable, i.e. female literacy rate.

Now, to validate the assumption of homoscedasticity, we look for almost the same deviation of the data points from the line at zero across the whole range of fitted values. The data points in the center of our plot seem to have more deviation than the ones towards the ends. This indicates slight heteroscedasticity (unequal variance) in our data.
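
One informal way to put a number on "same deviation everywhere" is to compare the spread of residuals in the lower and upper halves of the fitted values. This is only a sketch on simulated data with constant error variance (all names and values are assumptions), and the split at the median is an arbitrary choice rather than a standard test.

```r
# Simulated data with constant error spread by construction (hypothetical values).
set.seed(7)
n <- 200
x <- runif(n, 17, 27)
y <- -80 + 6 * x + rnorm(n, sd = 24)
fit <- lm(y ~ x)

res <- residuals(fit)
fv <- fitted(fit)

# Split the observations at the median fitted value and compare residual spread.
low  <- res[fv <= median(fv)]
high <- res[fv >  median(fv)]
spread_ratio <- sd(high) / sd(low)

# A ratio far from 1 would hint at heteroscedasticity.
print(spread_ratio)
```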

3. Check the "independence of observations" assumption: This assumption is not checked using any plot; instead, it is fulfilled during the research design (in simple terms, by the way the data has been collected).

Simple regression model

A linear regression model that involves one predictor and one response variable is called a simple regression model. This statistical method allows us to understand the relationship between two variables in greater depth and thereby helps us build models for prediction. The predictor variable on the x-axis is also called the independent variable, and the response variable on the y-axis is also called the dependent variable.

In this post, we are going to use the same data as we used in our previous post, Bivariate analysis of female literacy rate and age at first marriage.


##Data frame litrecy_and_ageatmarriage contains our data.

p2<-lm(data=litrecy_and_ageatmarriage,female_litrecy_rate~Age )
summary(p2)

In the above code, lm is the R function for a linear model, and we assign the result to p2. The predictor variable in our example is Age in years (age at first marriage in females) and the response variable is female literacy rate as a percentage of the general literate population. Let's check the details of p2 using summary(p2), which gives the following results:


Call:
lm(formula = female_litrecy_rate ~ Age, data = litrecy_and_ageatmarriage)

Residuals:
    Min      1Q  Median      3Q     Max
-47.545 -20.735  -3.359  21.140  52.330

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)   -79.49      26.11  -3.044  0.00368 **
Age             6.23       1.18   5.280 2.69e-06 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 23.85 on 51 degrees of freedom
Multiple R-squared:  0.3534, Adjusted R-squared:  0.3407
F-statistic: 27.87 on 1 and 51 DF,  p-value: 2.687e-06

Let's understand the output.

Residuals: Residuals are the difference between the observed values of the response variable and the values predicted by our model. In our example, a residual is the difference between the value of female literacy rate in our data frame (observed value) and the value predicted by our model. A simple regression model minimizes the sum of the squares of these differences, because a model which predicts values closer to our observed values should predict better for unobserved values. We should look for an almost symmetric distribution of residuals around zero. Any asymmetry indicates that some values predicted by our model are too high or too low, or there might be something else going on that is not captured by our model.

Coefficient estimates: The line of simple regression is the one which minimizes the sum of squared residuals and is represented mathematically by y=b₁x+b₀. Here y is the response variable, female literacy rate in our case. The two coefficients of the simple regression line are:
1. b₀, the intercept
2. b₁, the slope
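
To make the definition concrete, here is a small sketch on simulated data (the variables `x` and `y` are hypothetical stand-ins for age and literacy rate). It recomputes the residuals by hand as observed minus fitted and compares them with what R reports.

```r
# Simulated stand-in data (hypothetical values).
set.seed(1)
x <- runif(53, 17, 27)
y <- -80 + 6 * x + rnorm(53, sd = 24)
fit <- lm(y ~ x)

# Residual = observed value - value predicted by the model.
manual_res <- y - fitted(fit)

# Same numbers as R's own residuals().
same <- isTRUE(all.equal(unname(manual_res), unname(residuals(fit))))
print(same)

# With an intercept in the model, OLS residuals sum to zero (up to rounding).
print(sum(manual_res))

# The five-number summary is what summary(fit) prints under "Residuals".
print(summary(manual_res))
```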

If we substitute the values in this equation from coefficient estimates in our output, we get
y=-79.49 + 6.23x

This line of regression has an intercept of -79.49 and a slope of 6.23.
Let's first understand the intercept.

The regression line always passes through the point (x-bar, y-bar), i.e. the mean of x and the mean of y. The intercept is where the line crosses the y-axis: if the predictor (x) in simple linear regression is set to 0, the predicted value of y is equal to the intercept. (As a side note, the model with no predictors is the mean model and is also called the intercept-only model.)
y= -79.49 + 6.23(0)
y=-79.49

In other words, the intercept is the estimated value of y when x is 0. If, however, it is not possible for x to have a value of zero, this estimate has no practical meaning. Our data frame has no observation where the age at first marriage in females is zero, therefore y=-79.49 has no meaning at all. The intercept in our example just fixes where the regression line meets the y-axis.

Let's understand the slope.

The slope of a regression line gives the change in the conditional mean of y if we change x by one unit. In our example, if we change "age at first marriage of females" by one unit, the estimated change in the conditional mean of "female literacy rate" is 6.23. In other words, if the "age at first marriage in females" is 20, the mean "female literacy rate" at x=20 would be 6.23 percentage points higher than if the "age at first marriage in females" were 19.
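
The arithmetic behind the slope can be checked directly from the rounded coefficients in the output. The helper name `predict_literacy` is made up for this sketch; with a fitted model one would normally call predict() instead.

```r
# Rounded coefficients copied from the summary output above.
b0 <- -79.49   # intercept
b1 <- 6.23     # slope

# Hypothetical helper: predicted mean literacy rate for a given age at first marriage.
predict_literacy <- function(age) b0 + b1 * age

at_19 <- predict_literacy(19)
at_20 <- predict_literacy(20)

# The difference between the two predictions is exactly the slope.
print(c(at_19 = at_19, at_20 = at_20, difference = at_20 - at_19))
```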

Coefficient standard error: Our sample output gives the point estimate of the slope of the true regression line of the population. But it is better to consider the standard error of the estimate as well, because if we ran the model again and again with different samples of the same size from the same population, we would get a slightly different slope each time. The standard error measures how precisely the coefficient estimates the true population slope (or, for that matter, any population parameter). In other words, the standard error gives the amount of variation we may see in the coefficient if we drew the sample again.
For a large sample with normally distributed errors, about 95% of the time the true population parameter lies within 1.96 standard errors of the estimate. So, from our sample output, an approximate 95% confidence interval for the slope is 6.23 ± (1.96)(1.18), i.e. roughly 3.92 to 8.54.
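
The interval can be reproduced from the rounded estimate and standard error in the output. The sketch below also computes the version based on the t distribution with 51 degrees of freedom, which is the kind of interval confint() reports for a fitted lm model; the variable names are only illustrative.

```r
# Rounded values copied from the summary output above.
estimate  <- 6.23
std_error <- 1.18
df_resid  <- 51

# Large-sample (normal) interval: estimate +/- 1.96 standard errors.
ci_normal <- estimate + c(-1, 1) * qnorm(0.975) * std_error

# Interval using the t distribution, slightly wider for a sample of this size.
ci_t <- estimate + c(-1, 1) * qt(0.975, df = df_resid) * std_error

print(ci_normal)
print(ci_t)
```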

t-value: The t-value in the output is the value of the t-statistic. It gives the number of standard errors our coefficient lies away from 0. The farther it is from 0, the less plausible the null hypothesis (that the true coefficient is 0) becomes, and the greater the chance of rejecting it. In our sample output, the t-statistic of our slope coefficient is 6.23/1.18=5.280, which is well away from 0 in the positive direction.

p-value: The t-value and p-value are used in significance testing of the coefficients. The p-value gives the probability, assuming the null hypothesis is true, of observing a t-value at least as large in absolute value as the one we got. The probability of observing a t-value of 5.280 or more extreme is quite low, 2.69e-06 in our case. According to the null hypothesis the coefficient is 0, or in other words there is no relation between the predictor and the response, but such a low probability is strong evidence against that assumption: it indicates that the coefficient is not zero and that there is a relation between the predictor (age at first marriage in females) and the response (female literacy rate).
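
Both numbers can be recomputed from the rounded estimate and standard error, as a sanity check on how the columns of the coefficient table relate. Because the inputs are rounded, the results agree with the output only to about three significant digits.

```r
# Rounded values copied from the summary output above.
estimate  <- 6.23
std_error <- 1.18
df_resid  <- 51

# t-value: how many standard errors the estimate lies from 0.
t_value <- estimate / std_error

# Two-sided p-value: probability of a t at least this extreme if the true slope were 0.
p_value <- 2 * pt(-abs(t_value), df = df_resid)

print(t_value)
print(p_value)
```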

Residual standard error: The residuals of a model are the part that is unexplained and random. A residual is the difference between the observed value and the predicted value. The residual standard error is the square root of the sum of the squared residuals divided by the residual degrees of freedom. The residual degrees of freedom are the total degrees of freedom minus the model degrees of freedom. In our example, the number of observations is 53, so the total degrees of freedom are 52, and since there are two coefficients including the intercept, the model degrees of freedom are 2-1=1. So the residual degrees of freedom are 52-1=51.
In other words, the residual standard error tells us how wrong our model's predictions are on average. The smaller the value, the better our model. It is worth noting that it has the same unit as the response variable.
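
The definition can be verified on any fitted model. The sketch below uses simulated data of the same size as ours (53 observations, hypothetical values) and compares the hand-computed value with the one stored in the model summary.

```r
# Simulated stand-in data with 53 observations (hypothetical values).
set.seed(1)
n <- 53
x <- runif(n, 17, 27)
y <- -80 + 6 * x + rnorm(n, sd = 24)
fit <- lm(y ~ x)

# Degrees of freedom as described in the text.
df_total <- n - 1                      # 52
df_model <- length(coef(fit)) - 1      # 2 coefficients - 1 = 1
df_resid <- df_total - df_model        # 51

# Residual standard error: sqrt(sum of squared residuals / residual df).
rse <- sqrt(sum(residuals(fit)^2) / df_resid)

print(df_resid)
print(rse)
print(summary(fit)$sigma)   # the value R reports as "Residual standard error"
```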

Multiple R-squared: R² is also called the coefficient of determination. In a simple regression model like our example, it is equal to the square of the correlation between the predictor and the response variable. In other words, R-squared gives the proportion of variation in the response variable that is due to its regression on the predictor. According to ANOVA (ANalysis Of VAriance), total sum of squares (total variation in the response variable) = regression sum of squares (variation explained by the predictor) + error sum of squares (random variation due to error). R² = 1 - error sum of squares / total sum of squares. In our example, R-squared = .3534, i.e. 35.34% of the variation in our response variable is explained by the predictor variable.
There is no universal rule for how high R² should be. It depends on the system we are analyzing.
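
The decomposition is easy to check numerically. This sketch again uses simulated stand-in data (hypothetical values) and computes R² three ways: from the error sum of squares, from the regression sum of squares, and as the squared correlation.

```r
# Simulated stand-in data (hypothetical values).
set.seed(1)
x <- runif(53, 17, 27)
y <- -80 + 6 * x + rnorm(53, sd = 24)
fit <- lm(y ~ x)

sst <- sum((y - mean(y))^2)             # total sum of squares
sse <- sum(residuals(fit)^2)            # error (residual) sum of squares
ssr <- sum((fitted(fit) - mean(y))^2)   # regression sum of squares

r2_from_error      <- 1 - sse / sst
r2_from_regression <- ssr / sst
r2_from_cor        <- cor(x, y)^2

print(c(r2_from_error, r2_from_regression, r2_from_cor, summary(fit)$r.squared))
```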

Adjusted R-squared: This measure is the same as R-squared except that it adjusts for the number of predictor variables used. R-squared increases with each addition of a predictor variable to the model. Adjusted R-squared, on the other hand, increases ONLY if the added predictor improves the fit of the model by more than we would expect by chance.
When one has more than one predictor, it is better to look at this measure than at R-squared.

F-statistic: The F-statistic of overall significance compares the fit of our model to that of the intercept-only model. If the p-value of the F-statistic is less than the chosen significance level, our model provides a better fit than the mean model (also called the intercept-only model).

It is very important to check whether the assumptions of linear regression are met before we trust these results.
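
For reference, the usual adjustment formula is 1 - (1 - R²)(n - 1)/(n - p - 1), where n is the number of observations and p the number of predictors. Plugging in the rounded R² from our output reproduces the adjusted value that R printed; the variable names are only illustrative.

```r
# Values taken from the summary output above.
r2 <- 0.3534   # Multiple R-squared (rounded)
n  <- 53       # observations
p  <- 1        # predictors (Age)

# Adjusted R-squared penalizes R-squared for the number of predictors.
adj_r2 <- 1 - (1 - r2) * (n - 1) / (n - p - 1)

print(round(adj_r2, 4))
```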
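
With a single predictor, the F-statistic can be recovered from R² and the degrees of freedom, and it equals the square of the slope's t-value. The sketch below uses the rounded numbers from our output, so agreement is only to a couple of decimals.

```r
# Values taken from the summary output above (rounded).
r2       <- 0.3534
df_model <- 1
df_resid <- 51
t_slope  <- 5.280

# F = (explained variation per model df) / (unexplained variation per residual df).
f_stat <- (r2 / df_model) / ((1 - r2) / df_resid)

# Upper-tail probability of the F distribution gives the overall p-value.
p_value <- pf(f_stat, df_model, df_resid, lower.tail = FALSE)

print(f_stat)
print(t_slope^2)   # in simple regression, F is the square of the slope's t
print(p_value)
```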

Sunday, May 21, 2017

Bivariate analyses of age at the time of first marriage of females and female literacy rate

The data for these examples is taken from Gapminder. We are going to learn to do a bivariate analysis of female literacy rate (aged 15+) and the age of females at the time of their first marriage.

We have the data in two separate Excel files, and we are going to read each file into a data frame and join them to make a single data frame fit to be analyzed.


library(readxl)  ## read_excel
library(tidyr)   ## gather
library(dplyr)   ## %>% and inner_join

female_litrecy<-read_excel('indicatorSE_ADT_LITR_FE_ZS.xls.xlsx')

colnames(female_litrecy)[1]<-"Country"

gatheredfemalelitrecy <- female_litrecy %>% gather(Years,female_litrecy_rate,-Country,na.rm = TRUE,convert=TRUE)

In the above code, na.rm=TRUE removes the rows whose value is NA. This makes sense because rows which have no value will not contribute to the analysis. convert=TRUE converts the new key column to the correct data type. The default is NOT to convert. The "Years" column is built from the column names of the Excel sheet, which are text, so without convert=TRUE "Years" would be stored as character; with convert=TRUE it is automatically converted to a numeric type.

Now let's read the file of age at the time of first marriage.
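
The conversion that convert=TRUE triggers is base R's type.convert() applied to the key column. A tiny sketch of its effect on year labels stored as text (the example values are made up):

```r
# Year labels as they would come from column names: plain text.
years_as_text <- c("2001", "2002", "2005")

# type.convert() guesses the most specific type that fits every value.
years <- type.convert(years_as_text, as.is = TRUE)

print(class(years_as_text))
print(class(years))
```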

ageofmarrigeinexcel<-read_excel("indicator age of marriage.xlsx")

## name the first column

colnames(ageofmarrigeinexcel)[1]<-"Country"

gatheredAgeDatafromExcel <- ageofmarrigeinexcel %>% gather(Years,Age,-Country,na.rm = TRUE,convert=TRUE)

Let's inner_join the two data frames on the "Country" and "Years" fields.

litrecy_and_ageatmarriage<-inner_join(gatheredfemalelitrecy,gatheredAgeDatafromExcel,by=c('Country','Years'))
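
To see what an inner join on two keys keeps and drops, here is a toy sketch using base R's merge(), which performs an inner join by default. The two small data frames and their values are invented for illustration; only the column names mirror ours.

```r
# Toy data frames (made-up values) with the same key columns as ours.
literacy <- data.frame(
  Country = c("A", "A", "B"),
  Years = c(2001L, 2005L, 2001L),
  female_litrecy_rate = c(40, 45, 70),
  stringsAsFactors = FALSE
)
marriage <- data.frame(
  Country = c("A", "B", "C"),
  Years = c(2001L, 2001L, 2001L),
  Age = c(19, 23, 21),
  stringsAsFactors = FALSE
)

# Only Country/Years pairs present in BOTH data frames survive the join.
joined <- merge(literacy, marriage, by = c("Country", "Years"))
print(joined)
```

Here the pair (A, 2005) has no marriage record and country C has no literacy record, so neither appears in the result.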

Now that we have our data frame ready, let's plot a graph using ggplot, which is in the package ggplot2. Please ensure that the package is loaded first.

ggplot(data=litrecy_and_ageatmarriage, aes(x=female_litrecy_rate,y=Age)) +

geom_point(aes(color=Country,size=Years)) +

scale_y_continuous(breaks=seq(10,35,2)) +

scale_x_continuous(breaks=seq(0,100,5))+

xlab('%age of literate females aged 15 and above') +

ylab('Age at first marriage of females')   + geom_text(aes(label=Country)) +

geom_smooth(method='lm', formula=y~x)

In the last line of the above code, I have added a regression line to analyze the relation between the two variables. The above code generates the following graph.

The above graph indicates that as the female literacy rate increases, the age at which females marry also increases.
Let's quantify this relation between the two variables using Pearson's coefficient r. The null hypothesis for our analysis is r=0, indicating there is no relation between the two variables. The alternative hypothesis is that r is not equal to 0. We use Pearson's coefficient r of our sample to estimate the true correlation between the variables in the population. We start the analysis with the assumption that the null hypothesis is true.

with(litrecy_and_ageatmarriage,cor.test(female_litrecy_rate,Age))

The above code gives the following result

Pearson's product-moment correlation

data:  female_litrecy_rate and Age
t = 5.2795, df = 51, p-value = 2.687e-06
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval:
 0.3862388 0.7450490
sample estimates:
     cor 
0.594471

Let's understand the above output.

Pearson's correlation coefficient r: The value of r is .594471. This indicates a positive, albeit moderate rather than strong, relation between female literacy rate and age at first marriage of females. The correlation coefficient can take a value from -1, which indicates a perfect negative relationship, to 1, which indicates a perfect positive relationship. The value of 0 indicates no linear relationship.

95% confidence interval: In a normal distribution, about 95% of the data lies within 2 standard deviations of the mean. A 95% confidence interval applies the same idea to an estimate: it gives a range of plausible values for the parameter we are analyzing. The confidence interval in our case is between .3862388 and .7450490, which is the plausible range for the population correlation r. We began with the assumption that the null hypothesis is true, i.e. r=0 and there is NO relation between female literacy rate and age at first marriage of females. But the confidence interval does NOT contain 0, the value of r under our null hypothesis, so it contradicts this assumption. Therefore there is something going on, and there is likely a relation between these two variables.

df: Degrees of freedom is the number of data values in a sample that can be varied to achieve a specific result. For example, if the sample is 10 integers and we want them to add up to 100, we have the freedom to assign any value to 9 numbers, but the last number has to be of a specific value so that they all add up to 100. So the degrees of freedom are 10-1=9. The df for Pearson's r is n-2 because we have a pair of variables. In this output, df=51, so the number of values in our sample is 51+2=53. The bigger the sample size, the more precise our estimates are.
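
For the curious, the interval that cor.test prints for Pearson's r comes from Fisher's z transformation: transform r with atanh, step 1.96 standard errors of 1/sqrt(n-3) in each direction, and transform back with tanh. The sketch below redoes that calculation from the r and sample size in our output; the variable names are only illustrative.

```r
# Values taken from the cor.test output above.
r <- 0.594471
n <- 53

# Fisher's z transformation makes the sampling distribution roughly normal.
z    <- atanh(r)
se_z <- 1 / sqrt(n - 3)

# 95% interval on the z scale, transformed back to the correlation scale.
ci <- tanh(z + c(-1, 1) * qnorm(0.975) * se_z)

print(ci)
```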

t (statistic): The value of t gives the number of standard errors our r lies away from 0. The value of 5.2795 indicates that it is this far from 0 in the positive direction. The farther its value is from 0 in either direction, the greater the chance of rejecting the null hypothesis, which states there is no relation.

p-value: This gives the probability, under the null hypothesis, of a t statistic equal to or greater than the absolute value of the observed t. The p-value of 2.687e-06 is the probability of observing an r whose t statistic is at least 5.2795 in absolute value if there were truly no relation, and this probability is quite low.

Based on the above result from our sample, we can conclude that there is evidence of a relationship between "age at first marriage of females" and "female literacy rate" in the population as well.

Let's now make a simple regression model to help us predict the female literacy rate based on the age at first marriage of females. Simple regression also helps us understand the relation between these two in more detail.
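
It is no coincidence that the t and p-value here match the slope's t and p-value in the regression summary: with one predictor, the test of r=0 and the test of slope=0 are the same test. A sketch on simulated data (hypothetical values) shows the two agreeing, and also recovers t from r via t = r·sqrt(df)/sqrt(1 - r²).

```r
# Simulated stand-in data (hypothetical values).
set.seed(1)
x <- runif(53, 17, 27)
y <- -80 + 6 * x + rnorm(53, sd = 24)

ct <- cor.test(x, y)
slope_row <- summary(lm(y ~ x))$coefficients["x", ]

# t statistic rebuilt from the correlation and its degrees of freedom.
r  <- unname(ct$estimate)
df <- unname(ct$parameter)
t_from_r <- r * sqrt(df) / sqrt(1 - r^2)

print(c(cor_test_t = unname(ct$statistic),
        slope_t    = unname(slope_row["t value"]),
        t_from_r   = t_from_r))
print(c(cor_test_p = ct$p.value, slope_p = unname(slope_row["Pr(>|t|)"])))
```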

Sunday, October 12, 2014

Flood or a message from God?

A colossal loss like the one every Kashmiri experienced in the first fortnight of September 2014 is unfortunate. It shook the soul of every Kashmiri equally, even the fortunate ones who remained untouched by this natural disaster.
There is an audible dolefulness which has engulfed the entire Valley in a jiffy. Though every person empathizes with everybody else, there is a sound of relief in the hearts of the untouched ones that they have been spared by nature and haven't lost anything. Many reasons are held accountable for a flood of this magnitude, such as illegal mining on river banks, deforestation or man-made carbon emissions.
Keeping these reasons aside, let us consider at an individual level whether there is any other reason for such a devastating flood, one that wrecked several beautiful mansions, innumerable houses, shelters and the sources of livelihood of many, and rendered the valley somber. Have we as the people of Kashmir changed over the course of time? Are our behavior and practices no longer in accordance with what God likes? Have we turned extravagant when it comes to building our homes? Have we turned extravagant when it comes to marriages? Are we condescending towards our brethren who are financially less fortunate? What about the sloppy attitude towards the food wasted during marriages and functions? Can the money spent on wood-panelling our 3-storeyed houses be put to better use, like feeding a hungry family or the helpless and underprivileged? What about the heap of meat on the “Trami” of “Maharaz sab”? The corrupt and the liars among us have increased in number, unfortunately.
I visited my grandmother, who is more than 85 years old, during these floods and she said “ye chi sein aimall,besharmi tche bademitch, poonse tche gomout toth,apuz che pheloumuth,makaan che baed magar che sunsaan”. I replied “ti tche pouz”.
Translation: The devastation is the result of our deeds, shamelessness is rampant, people love money more than anything else, lies are widespread, the houses are big but almost empty. I replied, “That's right”.
It is our deeds which have called for the wrath of Allah. Let our souls not remain unmoved by what our eyes witnessed. The immense wealth and food provided to us by God is not being put to the right use; it is more like being wasted. So, all our prodigal efforts were devastated and rendered void, leaving us to ponder. The fact that there is a colossal loss of wealth and relatively less of precious lives makes me believe that THIS (the flood) is a message, or rather a warning, from God to shun our prodigality now, and that we have been given a chance.
Let us come together to rebuild and revive our Kashmir keeping in mind that we shouldn’t get another warning from God. Let us rise as a better society, people with better thoughts and in particular people who are aware that the surplus money provided by Almighty has to be put to right use and not for building mansions, prodigal weddings or practices like these.

Wednesday, November 18, 2009

Evolution of discoveries and inventions.

Note:Draft

A few days ago I finished reading the book “Is God a Mathematician?” by Mario Livio. Mario Livio beautifully explains the perspectives of many philosophers and scientists on mathematics. He views the world as two-pronged: the Platonists, who regard mathematics as a discovery of an external reality, and the other group, which considers it an invention. He also explains how mathematics has evolved from simple addition during the times of Pythagoras, a philosopher and a religious thinker, to be the basis of the most complex systems of today's world.

I have a different discourse here, which has nothing in common with the thoughts projected by Mario Livio except the work, thoughts, inventions and discoveries of the scientists and philosophers he mentions. I write here about how the nature of discoveries and inventions has changed over a long period of time (centuries and millennia). The discoveries, inventions and thoughts of antiquity sprouted exclusively from observing nature and its complex and simple phenomena. Mario Livio explains, “Pythagoras and Pythagoreans were so enraptured by the dependency of geometrical figures, stellar constellations and musical harmony on numbers that numbers became both the building blocks from which the universe was constructed and the principles behind its existence”. The study of the Pythagoreans was oriented towards the study of the universe. It could be that nature was all they had at that time, and the brain aligns itself to what it imbibes, ideating on those wonderful processes of the universe. Likewise, in today's world the computer has gotten into everybody's mind, and all minds are aligned towards creating robots, making them more like humans, and creating programs; or take 50-60 years ago, when minds were aligned towards creating the world of digital computers. The Pythagoreans' contemplation of the universe led to a great step for mankind.

Another hero of antiquity Mario Livio discusses is Plato. He explains, “Plato was the relentless seeker of pure knowledge, absolute conduct and eternal truths”. Look at the powerful words used by Mario Livio to describe the knowledge Plato sought: “pure knowledge”, “absolute conduct” and “eternal truths”. Using any such words to describe the knowledge floating around in today's world would sound humorous, say eternal truths of computers or absolute conduct of robots. We, living in this age, are far from being considered seekers of pure knowledge. People like Plato acquired a considerable amount of such knowledge centuries ago, and a large part is yet left, but there is a plethora of things in today's world that divert man's mind to things which are either superficial or lose meaning in a short span.

Please keep in mind that I don't intend to compartmentalize present and past into good or bad; rather, I want to point out the colossal difference between the two, which in turn has enormously affected the thought process of humans. The way we work and think today would be indescribable to our ancestors. Or let's take a step back and imagine explaining early computers (the ones which used punched cards) to people who lived before their existence. Now let's take a step further and imagine shopping on the moon and being back before dusk; it would be denounced as baloney if you tried to explain this to someone today. So at every stage (say, after centuries), the brain of a human adjusts (or should I say evolves) to the surroundings created by himself, and this cascades to the whole generation existing at that particular time and to generations further down, till there is another major adjustment needed and the process continues.

Another great man Mario Livio describes in his book is Galileo Galilei. Galileo, in one of his letters to his student Castelli, writes (paraphrased): there are truths about nature which are far above man's understanding and cannot be made credible by any learning or any other means. According to Galileo, mathematics is the language of God. Galileo talks about the truths of nature, the universe and God, which by all means are related. It is incredibly amazing how the scientists and philosophers who existed centuries ago emphasized the existence of a divine power. The depth of their studies about nature gathered huge information for posterity.

I think, or rather I am forced to think, that there will be some truths about the universe beyond the reach of man's imagination and with complexity orders of magnitude higher than man's understanding. Antiquity left us a platform, a base, on which we have grown our world. The deafening music, crazy dance, artificial intelligence, anti-aging, artificial growth and all the modern-day buzzwords which are called a boon for humans sometimes feel like a huge regiment piling up for the disaster (maybe otherwise) of humans.

Mario Livio then writes about Newton. He formulated the fundamental laws of mechanics, deciphered the laws describing planetary motion, erected the theoretical basis for the phenomena of light and colours, founded the study of differential calculus and did great work on gravity. All this together would have called for a great deal of contemplation and thought.

As I proceed with reading the book, I notice how man started with observing and studying things which involve nature, God and the universe, drifting slowly towards creating things which virtually benefit him, giving his own explanations, solutions and theories for natural phenomena while eliminating the hand of divine power from these beautiful, complex systems of nature.

Newton was confronted with a question for a long time: why is the solar system as stable as it is? Newton, in addition to the gravitational force, involved God's hand (any divine power) in keeping the solar system intact. But the question was answered by Laplace, a great mathematician who wrote a 5-volume book giving a virtually complete solution to the motion in the solar system. Mario Livio writes that Laplace had different views: instead of relying on God's handiwork, he simply proved mathematically that the solar system is stable over a long period of time, much longer than anticipated by Newton. Now, this is the drift I talked about earlier, a small but significant one considering the time span over which the evolution of discoveries, inventions and thoughts has occurred.

Astronomers learned only in the 20th century that it is gravity which keeps 100 billion stars wheeling in the arms and thronging in the centers of spiral galaxies. As the harmony exhibited by stars, planets and all celestial bodies necessitates the prevalence of some force (in this case the gravitational force), we can continue the argument down to the level of some force which produces gravitation. There might be some force responsible for the movement of protons, neutrons, gravitons or any other particle which in turn produces gravitation (the reason for such a force is as yet unanswered). I am sure there is a limit to what a human mind can understand, beyond which his intelligence does not suffice. It could be for these reasons that the scientists and philosophers of antiquity pulled God's handiwork into the simple and complex phenomena of nature. I believe that there is a multitude of other things out there yet to be explored by man.

As time passed, man utilized his creativity to give birth to myriad things, discoveries and inventions, each step laying the foundation for the next big invention and a slow drift away from nature, God and the universe. Every invention put a stepping stone forward towards the advancement of man in every field.

If computers and the internet had not come into existence, I wouldn't have been able to publish this piece of writing. But wait... without even realizing it, man has started to dig his own grave. He has reached a point where he is engrossed in entertaining himself and is constantly trying to change the nature and universe created by the divine power to suit his needs, and in the process is creating an environment hostile (perhaps otherwise) to his own existence.

Wednesday, August 19, 2009

Moving past the boundaries

I am so excited, it's my first trip out of the country, into the United States of America. We finally got the visa and I am writing this from right inside my hotel room in the beautiful city of Albany, NY. It's great and amazing here, and I am even more glad that I am not in NYC, as that would have been much like my Bangalore: crowded, full of traffic, pollution (of all kinds) and what not. This place is full of lush green trees, making all the roads look like boulevards; it makes the heart throb.
The culture here is diametrically opposite to ours, and it takes time for people who are introverted and talk less (like ME) to adjust to the people around them. People here talk a lot and about anything that comes to their mind, even things that are irrelevant to the subject under discussion (phew! that bores me).
The rules are strictly followed, unlike in our country, and that makes living much easier and more organized. (Live and let live is an apt phrase here.)
I find the outfits of the people here a little funny, but they have their own charm. It's been just 4 days since I got here, and I have a lot more things to see and explore. Will keep writing.

Friday, May 22, 2009

Auto Complete with AJAX

It's the first time that I have got my hands wet with AJAX. (Reference - AJAX/JAVA) Prior to this I had been using .NET AJAX server-side controls, which internally wrap AJAX calls, thus letting a user deal only with the facade.
There are numerous utilities to add auto-complete functionality to your site/application. I found them complex to use, which led me to write one myself.
It has three simple steps:
1. Make an asynchronous call to the server (handled by a servlet) using XMLHttpRequest.
2. Get the required data in the servlet and send the response as XML.
3. Grab and process (display) the data in the front end when control is returned from the server.
The above three steps are common to any functionality achieved using AJAX.
The jsp page looks like this

In the above jsp, as you may have seen, on the "onkeyup" event the function getData is called with 4 parameters. The javascript file (it's in text format) can be downloaded here
The servlet which handles the asynchronous call made by the jsp is shown below. The data to be loaded is initialized by overriding the init method of the servlet (data retrieval will vary according to one's needs).

That is pretty much it. There is definitely huge room for improvement, but it gets the ball rolling.

NOTE: blogger.com parses everything, so I had to include the code as images. If anybody knows a better way to include code in blogspot, that would be great. Thanks