Ph.D. Thesis
Pages: 131; Department: Computer Science
ABSTRACT
The web is a vast repository in which information is stored and shared, and the amount of
information it holds keeps growing rapidly. This geometric growth has made it difficult to
find exactly the information needed. Search engines are the main tool for searching the web:
they collect, store and pre-process web information as indexes. Recently, the algorithms
used by search engines have improved greatly.
However, users still need to put in considerable effort to access relevant
information, because the supporting technology does not make it simple enough to add
ontology-based metadata to information, and documents are retrieved based only on
keywords.
Hence, this study presents an effective ontology-based system that indexes
documents according to the concepts that best describe them, together with a retrieval module
that makes full use of ontology tools to improve both recall and precision. The specific
objectives are to: (i) evaluate the existing Information Retrieval (IR) algorithms in order to
discover their strengths and weaknesses; (ii) develop an enhanced ontology-based algorithm
for an IR system; (iii) investigate the effectiveness of query expansion when an upper-level
ontology is combined with domain-specific ontology; and (iv) conduct performance
evaluation of the developed algorithm with the existing algorithms based on Recall and Mean
Average Precision.
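The two evaluation measures named in objective (iv) can be illustrated with a minimal sketch. It assumes the standard textbook definitions of recall and Mean Average Precision; the function names and the example data are hypothetical and are not taken from the thesis.

```python
def recall(ranked, relevant):
    """Fraction of the relevant documents that appear in the ranked list."""
    if not relevant:
        return 0.0
    return len(set(ranked) & relevant) / len(relevant)


def average_precision(ranked, relevant):
    """Sum precision@k at every rank k holding a relevant document,
    then divide by the total number of relevant documents."""
    if not relevant:
        return 0.0
    hits = 0
    total = 0.0
    for k, doc in enumerate(ranked, start=1):
        if doc in relevant:
            hits += 1
            total += hits / k
    return total / len(relevant)


def mean_average_precision(runs):
    """runs: list of (ranked_list, relevant_set) pairs, one pair per query."""
    if not runs:
        return 0.0
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)


# Made-up judgments for two queries (illustration only).
runs = [
    (["d1", "d2", "d3", "d4"], {"d1", "d3"}),
    (["d5", "d6", "d7"], {"d6", "d9"}),
]

for ranked, relevant in runs:
    print(recall(ranked, relevant), average_precision(ranked, relevant))
print(mean_average_precision(runs))
```

Both measures fall between 0 and 1, and Mean Average Precision rewards systems that place relevant documents near the top of the ranking.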
The method used Ontology Knowledge Bases (OKBs) as the document repository for
fast retrieval. The indexing module implemented automatic semantic indexing using OKBs
for both the upper-level and the domain-specific ontology. The classic Vector-Space Model
was adapted for the retrieval module. Query Expansion, which relaxes the constraints of the
user's original query, was integrated together with a ranking algorithm.
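As a rough illustration of how vector-space retrieval and query expansion fit together, the sketch below ranks documents by cosine similarity over TF-IDF weights and expands the query from a small lookup table. It is not the thesis's adapted model: the `RELATED` table stands in for ontology relations, and the 0.5 weight for added terms, the IDF formula and the sample documents are all assumptions made for this example.

```python
import math
from collections import Counter

# Hypothetical expansion table standing in for ontology relations.
RELATED = {"car": ["automobile", "vehicle"]}


def tokenize(text):
    return text.lower().split()


def expand(terms, related=RELATED, weight=0.5):
    """Map each term to a weight: 1.0 for original terms, `weight` for added ones."""
    weights = {t: 1.0 for t in terms}
    for t in terms:
        for r in related.get(t, []):
            weights.setdefault(r, weight)
    return weights


def idf_table(docs):
    """Smoothed inverse document frequency for every term in the collection."""
    n = len(docs)
    df = Counter(t for text in docs.values() for t in set(tokenize(text)))
    return {t: math.log(1 + n / c) for t, c in df.items()}


def cosine(a, b):
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def search(query, docs, use_expansion=True):
    """Return (doc_id, score) pairs sorted by descending cosine similarity."""
    idf = idf_table(docs)
    terms = tokenize(query)
    qw = expand(terms) if use_expansion else {t: 1.0 for t in terms}
    qvec = {t: w * idf.get(t, 0.0) for t, w in qw.items()}
    scores = {}
    for doc_id, text in docs.items():
        tf = Counter(tokenize(text))
        dvec = {t: c * idf[t] for t, c in tf.items()}
        scores[doc_id] = cosine(qvec, dvec)
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Made-up documents (illustration only).
docs = {
    "d1": "the automobile market grew",
    "d2": "a car was parked outside",
    "d3": "bananas are yellow",
}

print(search("car", docs, use_expansion=False))
print(search("car", docs))
```

Without expansion, only the document containing the literal keyword "car" scores above zero. With expansion, the document that mentions "automobile" is also retrieved, which is the recall gain that semantic relations are meant to provide.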
The findings of the study revealed that:
(i) taking full advantage of ontologies in developing an IR system enhanced its
performance, irrespective of the domain;
(ii) ontologies bridged the gap between query terms and documents through semantic
mechanisms, as given by an equation in which E(o,T) and P(o,T) are the numbers of classes
of ontology o that have labels matching any of the search texts exactly or partially,
respectively;
(iii) the added Query Expansion proved flexible, since it relaxes all constraints of the
original query; and
(iv) recall and precision compared favourably with those of two existing IR systems
(Lucene and Sphider); both measures take values between 0 and 1.
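The quantities E(o,T) and P(o,T) defined in finding (ii) can be counted directly, as the sketch below shows. The equation that combines them is not reproduced in this text, so the weighted sum in `class_match_score` and its weights `alpha` and `beta` are assumptions made purely for illustration. Treating a partial match as a case-insensitive substring match, and counting each class at most once, are also assumptions.

```python
def match_counts(class_labels, search_terms):
    """Return (E, P) for one ontology.

    class_labels: dict mapping a class id to its list of labels.
    E counts classes with a label equal to some search term.
    P counts the remaining classes with a label containing some search term.
    """
    terms = [t.lower() for t in search_terms]
    exact = 0
    partial = 0
    for labels in class_labels.values():
        lowered = [label.lower() for label in labels]
        if any(label == t for label in lowered for t in terms):
            exact += 1
        elif any(t in label for label in lowered for t in terms):
            partial += 1
    return exact, partial


def class_match_score(class_labels, search_terms, alpha=0.6, beta=0.4):
    """ASSUMED combination: a weighted sum with placeholder weights."""
    e, p = match_counts(class_labels, search_terms)
    return alpha * e + beta * p


# Made-up ontology fragment (illustration only).
ontology = {
    "C1": ["Student"],
    "C2": ["PhD Student"],
    "C3": ["University"],
}

print(match_counts(ontology, ["student"]))
print(class_match_score(ontology, ["student"]))
```

Weighting exact matches above partial ones reflects the intuition that an ontology whose class labels name the query concept outright is a better fit than one that only mentions it inside a longer label.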
The study concluded that semantic information retrieval adds value over traditional
keyword-based retrieval, achieving better precision through query weighting
and better recall through semantic relations. The proposed system is robust, flexible
and efficient. The results of the study can be useful in digital libraries, information filtering,
media search and search engines.