ABSTRACT
The amount of information available today is extremely large. The increasing
need for easier and faster information discovery demands optimal information
retrieval techniques. The performance of any information retrieval system is
measured by the effectiveness and efficiency of its retrieval. While some
techniques rely on algorithms that improve search, others aim at increasing
users' ability to formulate search queries. Here we present a non-predefined
query model for an information retrieval system based on a relational database.
CHAPTER ONE
INTRODUCTION
Information retrieval (IR) is the process of obtaining information by searching
a repository for items that match a user's information need. According to Losee
(1998), retrieval systems often order documents in a manner consistent with the
assumptions of Boolean logic, by retrieving, for example, documents that have
the terms "dogs" and "cats", and by not retrieving documents without one of
these terms. Systems consistent with the probabilistic model of retrieval locate
documents based on a query list of terms, such as {dogs, cats}, or may accept as
input a natural language query, such as "I want information on dogs and cats".
A probabilistic system then ranks documents for retrieval by assigning a numeric
value to each document, based on the weights for query terms and the frequencies
of term occurrences in documents. We want to know how to “best” formulate a
query, and our ultimate interest, following Manning et al. (2009), is in
measures of human utility: how satisfied is each user with the results the
system gives for each information need that they pose?
Most everyday users of IR systems expect ranked retrieval; unfortunately,
relevance ranking is often not critical in Boolean systems. On the other hand,
most IR systems rank documents by their estimate of the usefulness of a document
for a user query, and there is little or nothing a user can do about it.
However, many power users still use Boolean systems because they feel more in
control of the retrieval process. It is correct that the set of retrieved
documents is not ranked in Boolean searches. However, the cost of a ranked set
is a set that is not fully controlled or understood by the user. In Boolean
searches, the user obtains well-defined search sets, which is a clear advantage
if searching is considered a learning process. The well-defined set provides
better feedback and therefore allows modified search profiles (Hjørland, 2014).
Conventional IR systems are built on the Boolean model, while most current IR
systems rely on sophisticated algorithms for better ranking. Most of the
material to be ranked consists of documents of an unstructured nature (usually
text). Today, research in the field of IR is split among various activities,
notably between (a) optimizing algorithms for ranked systems and (b) extending
the Boolean model to increase the selection power of users. Researchers working
on the storage side of information retrieval systems are engaged in designing
sophisticated methods for identifying and representing the various bibliographic
elements essential for documents, automatic content analysis, text processing
and so on. On the other hand, researchers working on the retrieval side are
attempting to develop sophisticated searching techniques, user interfaces, and
various techniques for producing output for local as well as remote users
(Chowdhury, 2004).
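The contrast drawn above between Boolean and ranked retrieval can be illustrated with a minimal Python sketch. It is not the query model proposed in this thesis. The toy corpus, the function names (boolean_and, ranked) and the uniform default term weights are all assumptions made for illustration, and the scoring (weight times term frequency) is a deliberately simple stand-in for a real probabilistic model.

```python
from collections import Counter

# Hypothetical toy corpus; these documents are illustrative only.
DOCS = {
    "d1": "dogs and cats are common pets",
    "d2": "dogs chase cats and cats chase mice",
    "d3": "dogs are loyal animals",
    "d4": "birds sing in the morning",
}


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def boolean_and(query_terms, docs):
    """Boolean AND: return the unranked set of ids containing every term."""
    terms = set(query_terms)
    return {doc_id for doc_id, text in docs.items()
            if terms <= set(tokenize(text))}


def ranked(query_terms, docs, weights=None):
    """Score = sum of (term weight * term frequency); best score first.

    Assumption: uniform weights of 1.0 unless the caller supplies others.
    Documents that match no query term are omitted.
    """
    weights = weights or {term: 1.0 for term in query_terms}
    scores = {}
    for doc_id, text in docs.items():
        tf = Counter(tokenize(text))
        score = sum(weights.get(term, 0.0) * tf[term] for term in query_terms)
        if score > 0:
            scores[doc_id] = score
    # Sort by descending score, then by id so ties are deterministic.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


if __name__ == "__main__":
    print(boolean_and(["dogs", "cats"], DOCS))
    print(ranked(["dogs", "cats"], DOCS))
```

In this sketch the Boolean search returns a well-defined set (only the documents containing both terms), while the ranked search also returns a document that mentions only "dogs", placed last. This mirrors the trade-off described above: the Boolean set is fully determined by the query, whereas the ranked list depends on weights the user does not directly control.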
Department: Computer Science (M.Sc Thesis)
Format: MS Word
Chapters: 1 - 5, Preliminary Pages, Abstract, References, Appendix.
No. of Pages: 52