ABSTRACT
The continued increase in demand for objects on the Internet causes high web
traffic and, consequently, slow response times for users, which is one of the
major bottlenecks in networking. Increasing bandwidth is a possible solution
to the problem, but it involves increasing economic cost. An alternative
solution is web prefetching. Web prefetching is the process of predicting and
fetching web pages in advance, by a proxy server, before the user sends a
request. Prefetching is performed during server idle time. Most of the literature
based on the classical prefetch algorithm assumes that the server idle time is
long enough to prefetch all of a user's predicted requests, which is not true in
real-life situations. This research aims to improve web prefetching by
developing a prefetching technique that remains effective in a high-traffic
environment, when the server idle time is very short. Log files were
collected and preprocessed for several client groups within a domain. The
preprocessed log files were used to create a web navigation graph, which shows
the transitions from one web page to another. Support and confidence
thresholds were used to remove web pages whose values fell below the
thresholds. Several clusters were formed within a particular client group. When
the prefetch time is predicted to be too short for prefetching, all the clusters
formed from the various domains are used to create a prioritized cluster based
on the requests of several users. The model was evaluated on hit rate, byte rate,
precision, accuracy of prediction and usefulness of prediction. The results
show that the proposed Web Clustering algorithm performs better than the
classical prefetch technique when the server idle time is short, and behaves
the same as the classical algorithm once the server idle time becomes long
enough to prefetch all of the users' predicted requests.
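The graph-building and pruning steps described in the abstract can be illustrated with a minimal Python sketch. It is not the author's implementation. The abstract does not define support or confidence, so the sketch assumes the usual association-rule reading: the support of a transition A to B is its share of all observed transitions, and its confidence is the share of transitions leaving A that go to B. The function names, the session data and the threshold values are hypothetical.

```python
from collections import Counter


def build_navigation_graph(sessions):
    """Count page-to-page transitions and how often each page is a source."""
    edge_counts = Counter()
    source_counts = Counter()
    for pages in sessions:
        # Each consecutive pair of pages in a session is one transition.
        for src, dst in zip(pages, pages[1:]):
            edge_counts[(src, dst)] += 1
            source_counts[src] += 1
    return edge_counts, source_counts


def prune_graph(edge_counts, source_counts, min_support, min_confidence):
    """Keep only the edges that meet both thresholds (assumed definitions)."""
    total = sum(edge_counts.values())
    kept = {}
    for (src, dst), count in edge_counts.items():
        support = count / total                  # share of all transitions
        confidence = count / source_counts[src]  # share of transitions from src
        if support >= min_support and confidence >= min_confidence:
            kept[(src, dst)] = (support, confidence)
    return kept


# Hypothetical preprocessed sessions for one client group.
sessions = [
    ["/home", "/news", "/sports"],
    ["/home", "/news", "/weather"],
    ["/home", "/about"],
    ["/home", "/news", "/sports"],
]
edges, sources = build_navigation_graph(sessions)
graph = prune_graph(edges, sources, min_support=0.2, min_confidence=0.6)
```

With these sessions, only the transitions from /home to /news and from /news to /sports survive pruning; the two transitions that were observed once fall below the assumed support threshold.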
Background of Study
The web is a collection of text documents and other resources, linked by
hyperlinks and identified by Uniform Resource Locators (URLs), usually accessed
by web browsers from web servers. The web started as a simple
information-sharing system and has now grown into a rich collection of dynamic
and interactive services. This tremendous growth has resulted in high demand
for bandwidth and in delays in serving user requests (Neha, 2013). Users
sometimes experience unpredictable delays while retrieving web pages from the
server. Increasing bandwidth is a possible solution to the problem, but it
involves high economic cost. Web caching reduces the latency perceived by the
user, reduces bandwidth utilization and reduces the load on the origin servers
(Pallis, 2007). Latency refers to the time elapsed between the moment a request
is sent and the moment the sender receives the requested information.
Many latency-tolerance techniques have been developed over the years to solve
this problem without necessarily increasing the bandwidth. The most notable are
caching and prefetching. Web prefetching fetches and caches users' requests
during server idle time, which reduces the load on the origin server. To reduce
the access delay experienced by users, it is advisable to predict and prefetch
web objects based on user access patterns and to cache them. Studies on web
prefetching are mostly based on the history of user access patterns. If the
history shows an access pattern in which URL A is followed by URL B with
high probability, then B will be prefetched once A is accessed (Cheng-Zhong,
2000). Web prefetching is the process of obtaining web pages in advance, by a
proxy server, before the user sends a request. When a client then requests a
web object, the object may be served from the cache rather than by sending the
request to the web server. The main factor in selecting a web prefetching
algorithm is its ability to predict the web objects to be prefetched in order to
reduce latency. Web prefetching exploits the spatial locality of web pages,
i.e. pages linked from the current page are accessed with higher probability
than other pages. In a web environment, prefetching can be applied
between clients and a web server, between proxy servers and a web server, and
between clients and a proxy server (Greeshma, 2012).
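The history-based rule described above (prefetch B once A is accessed, if A has been followed by B with high probability) can be sketched as follows. This is an illustrative first-order predictor under stated assumptions, not the method of any cited study: the class name `AccessPatternPredictor`, the threshold of 0.6 and the sample history are hypothetical, and the proxy cache is reduced to a plain set.

```python
from collections import Counter, defaultdict


class AccessPatternPredictor:
    """Estimates P(next URL | current URL) from observed request pairs."""

    def __init__(self, threshold):
        self.threshold = threshold
        self.next_counts = defaultdict(Counter)

    def record(self, current_url, next_url):
        """Record that next_url was requested right after current_url."""
        self.next_counts[current_url][next_url] += 1

    def candidates(self, current_url):
        """Return URLs whose estimated probability meets the threshold."""
        counts = self.next_counts.get(current_url)
        if not counts:
            return []
        total = sum(counts.values())
        # Most frequent successors first; ties broken alphabetically.
        ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
        return [url for url, n in ranked if n / total >= self.threshold]


# Hypothetical access history for one user.
history = ["A", "B", "A", "B", "A", "C", "A", "B"]
predictor = AccessPatternPredictor(threshold=0.6)
for current_url, next_url in zip(history, history[1:]):
    predictor.record(current_url, next_url)

# When A is requested, a proxy would fetch the predicted pages into its cache.
cache = set()
cache.update(predictor.candidates("A"))
```

In this history, A is followed by B in three of four cases, so B is the only page prefetched when A is accessed. A real proxy would also have to bound the cache size and the time spent prefetching, which the sketch leaves out.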
Web prefetching techniques are categorized into probability-based and
clustering-based techniques, using weight functions. In probability-based
prefetching, probabilities are calculated from the history of data access. This
method assumes that the request sequence follows a pattern and calculates the
probability of following that pattern. Clustering-based prefetching methods make
decisions using information about the web pages that have been fetched
previously, on the assumption that pages close to the previously fetched pages
are more likely to be requested in the near future (Greeshma, 2012). Web
prefetching is a research topic that has gained increasing attention in recent
years. Prefetching fetches web objects before users actually request them, and
so helps to reduce the user-perceived latency. Many studies have shown that
combining caching and prefetching doubles the performance compared with
caching alone (Waleed, 2012).
TOPIC: A CLUSTERING BASED WEB PREFETCHING IN HIGH TRAFFIC ENVIRONMENT
Format: MS Word
Chapters: 1 - 5
Delivery: Email
Number of Pages: 65