ABSTRACT
A grid scheduler assigns user jobs to the best available resources, selected on the basis of resource characteristics, with the aim of optimizing execution time; however, resource failure in a grid is no longer an exception but a regular event. Grid resources are increasingly used by the scientific community to solve computationally intensive problems that typically run for days or even months. It is therefore essential that long-running applications be able to tolerate failures and avoid recomputing a task from scratch when a resource fails, in order to satisfy the user's QoS requirements. An Improved Job Scheduling Algorithm in Grid Computing Environment Using Fault Tolerance Mechanisms is proposed. The technique employed here uses the resource failure rate together with a checkpoint-based rollback recovery strategy. Checkpointing aims at reducing the amount of work lost upon failure of the system by periodically saving the state of the system. In a comparison of the proposed approach with Moallem's ACO, the results show that the proposed algorithm achieved up to a 13% reduction in makespan, a 12% improvement in throughput and a 12% improvement in ATA when the Gridlets are varied and the resources are kept constant. When the resources are varied and the Gridlets are kept constant, the proposed algorithm achieved an 18% reduction in makespan, an 18% improvement in throughput and up to a 14% improvement in ATA.
CHAPTER ONE
INTRODUCTION
1.1 Background of the Study
Computational approaches to problem solving have proven their worth in almost every field of human endeavor. Scientists in fields such as health, meteorology and astrophysics need huge processing power to perform complex calculations in a reasonable amount of time. It might take a decade to run a set of modeling experiments on a standard personal computer, while buying a supercomputer costs millions of dollars and thousands more each year to maintain, not to mention the hefty electricity bill to keep the massive system running. Yet standard personal computers today have considerable processing power, and the typical tasks of an average user's computer vary very little, usually including word processing, Internet browsing, spreadsheets and presentations.
Owing to the fact that high-performance computing resources are expensive and hard to access, one option was to use federated resources comprising computation, storage and network resources from multiple geographically distributed institutions (Foster et al., 2008). As most systems are idle for significant periods of time, it should be possible to harness their idle or unused resources and apply them to projects in need of such resources. The Grid paradigm thus emerged, led by Ian Foster, Carl Kesselman, and Steve Tuecke (Foster et al., 2008; Rhodes, 2006), called the "fathers of the Grid" (Haque et al., 2012). They got together to develop a toolkit to handle computation management, data movement, storage management and other infrastructure that could handle large grids without being restricted to specific hardware and requirements (Barboni, 2011).
The Grid emerged from the need to solve computational problems that otherwise cannot be solved by a single personal computer, such as financial modeling, weather modeling and data visualization. This extremely high computing power is achieved by the optimal utilization of distributed heterogeneous resources that would otherwise lie idle. This has enabled scientists to broaden their simulations and experiments to take more parameters (such as larger value ranges) into account than ever before. Imagine millions of computers owned by individuals and institutions from various countries across the world connected to form a single, huge supercomputer, so as to utilize the resources as depicted.
(Source: http://eu-datagrid.web.cern.ch/eu-datagrid/images/images/grid-small-prov.jpg)
The term Grid is analogous to the electrical power grid, which provides consistent, pervasive, reliable and transparent access to utility power irrespective of its location or source (Rhodes, 2006; Pritpal and Gurinderpal, 2013; Foster and Kesselman, 2002). Grid computing is concerned with coordinated resource sharing and problem solving in dynamic, multi-institutional virtual organizations. The key concept is the ability to negotiate resource-sharing arrangements among a set of participating parties (providers and consumers) and then to use the resulting resource pool for some purpose (Foster, 2002). A grid enables the sharing, selection, and aggregation of a wide variety of geographically distributed resources including supercomputers, storage systems, data sources, and specialized devices owned by different organizations for solving large-scale resource-intensive problems in science, engineering, and commerce (Buyya, 2002).
1.1.1 General Issues in Grid Systems: Principles
Four main aspects characterize a grid (Rhodes, 2006; Buyya, 2002):
1. Multiple Administrative Domains and Autonomy: Grid resources are geographically distributed across multiple administrative domains, often in different time zones, and owned by different organizations. A grid must honor the autonomy of resource owners together with their local resource management and usage policies.
2. Heterogeneity: A grid involves a collection of resources that are heterogeneous in nature and comprise an enormous range of technologies.
3. Scalability: A grid might grow from a small number of integrated resources to millions. This raises the problem of potential performance degradation as the grid increases in size. Consequently, applications requiring numerous geographically distributed resources must be designed to be latency- and bandwidth-tolerant.
4. Dynamicity or Adaptability: With so many resources in a grid, the probability of some resource failing is high. Resource failure should therefore be considered the rule rather than the exception. Applications or resource managers must adapt their behavior dynamically and use the available resources and services efficiently and effectively.
Due to the dynamic nature of the grid, failures are likely to occur in grid environments, affecting the time needed to execute job applications and thereby degrading the performance of the system. Compute-intensive grid applications often require very long execution times to solve a single problem. The huge computing potential of grids usually remains unexploited due to their susceptibility to failures such as process failures, machine crashes and network failures (Garg and Kumar, 2011). The failure of a resource running a user job has a huge effect on grid performance. Hence, in order to ensure high system availability, handling job-site failure is essential, and incorporating fault-tolerant algorithms into grid scheduling is advocated. In this work, the Swarm Intelligence for Distributed Job Scheduling on the Grid algorithm proposed by Moallem (2009) is enhanced by incorporating a fault-tolerance technique.
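The two mechanisms named above, failure-rate-aware resource selection and checkpoint-based rollback recovery, can be illustrated with a minimal sketch. This is a hypothetical simulation, not the thesis implementation: the class names, the checkpoint interval and the failure probability are illustrative assumptions.

```python
import random

class Resource:
    """A grid resource with an observed failure history (illustrative)."""
    def __init__(self, name, speed):
        self.name = name
        self.speed = speed          # work units completed per tick
        self.failures = 0
        self.submissions = 0

    def failure_rate(self):
        # Observed failure rate; 0.0 until any history accumulates.
        return self.failures / self.submissions if self.submissions else 0.0

def select_resource(resources):
    # Prefer the resource with the lowest observed failure rate,
    # breaking ties by speed (fastest first).
    return min(resources, key=lambda r: (r.failure_rate(), -r.speed))

def run_with_checkpointing(job_length, resource, fail_prob,
                           checkpoint_interval=10):
    """Run a job, saving a checkpoint every `checkpoint_interval` work
    units; on failure, roll back to the last checkpoint instead of
    restarting the whole computation from scratch."""
    done = 0
    checkpoint = 0
    ticks = 0
    while done < job_length:
        ticks += 1
        resource.submissions += 1
        if random.random() < fail_prob:   # simulated resource failure
            resource.failures += 1
            done = checkpoint             # roll back to last saved state
            continue
        done += resource.speed
        if done - checkpoint >= checkpoint_interval:
            checkpoint = done             # save the current state

    return ticks

random.seed(42)
pool = [Resource("R1", speed=5), Resource("R2", speed=4)]
r = select_resource(pool)
ticks = run_with_checkpointing(job_length=100, resource=r, fail_prob=0.2)
print(r.name, ticks)
```

The point of the rollback line `done = checkpoint` is the abstract's claim: upon failure, only the work since the last checkpoint is lost, not the entire job, which is what reduces makespan relative to a restart-from-scratch policy.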