Please use this identifier to cite or link to this item: http://hdl.handle.net/1860/919

Title: Relation-based document retrieval for biomedical literature databases
Authors: Zhou, Xiaohua
Hu, Xiaohua
Lin, Xia
Han, Hyoil
Zhang, Xiaodan
Issue Date: Apr-2006
Publisher: Springer Verlag
Citation: Proceedings of the 10th Pacific-Asia Conference on Knowledge Discovery and Data Mining, April 9-12, 2006, Singapore (Lecture Notes in Computer Science 3918: http://www.springerlink.com/link.asp?id=t85674vqg783). Retrieved 6/26/2006 from http://www.ischool.drexel.edu/faculty/thu/My%20Publication/Conference-papers/DASFFA06.pdf.
Abstract: In this paper, we explore the use of term relations in information retrieval for precision-focused biomedical literature search. A relation is defined as a pair of terms that are semantically and syntactically related to each other. Unlike the traditional “bag-of-word” model for documents, our model represents a document by a set of sense-disambiguated terms and their binary relations. Since document-level co-occurrence of two terms, in many cases, does not mean that the document addresses their relationship, the direct use of relations may improve the precision of very specific searches, e.g., searching for documents that mention genes regulated by Smad4. For this purpose, we develop a generic ontology-based approach to extract terms and their relations; a prototype IR system supporting relation-based search is then built for Medline abstract search. We then use this novel IR system to improve the retrieval results of all official runs in the TREC-2004 Genomics Track. The experiment shows promising performance of relation-based IR. The mean of P@100 (the precision of the top 100 documents) for all 50 topics is raised from 26.37% (the P@100 of the best run is 42.10%) to 53.69%, while the recall is kept at an acceptable level of 44.31%. The experiment also shows the expressiveness of relations for the representation of information needs, especially in the area of biomedical literature, which is full of various biological relations.
URI: http://hdl.handle.net/1860/919
Appears in Collections: Faculty Research and Publications (IST)
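
The abstract contrasts document-level co-occurrence with explicit term relations and reports results as P@k. The following is a minimal Python sketch of that contrast, not the authors' system: the `Doc` class, the helper functions, the placeholder term `GENE_X`, and the toy documents are all assumptions made for illustration, and relations are treated here as unordered term pairs. Precision at k is computed as hits in the top k divided by k, which is one common convention.

```python
from dataclasses import dataclass
from typing import FrozenSet, Sequence, Set


@dataclass(frozen=True)
class Doc:
    """Hypothetical document model: a set of terms plus a set of binary relations."""
    doc_id: str
    terms: FrozenSet[str]
    relations: FrozenSet[FrozenSet[str]]  # each relation is an unordered pair of terms


def cooccurs(doc: Doc, a: str, b: str) -> bool:
    """Bag-of-words style match: both terms appear somewhere in the document."""
    return a in doc.terms and b in doc.terms


def has_relation(doc: Doc, a: str, b: str) -> bool:
    """Relation-based match: the two terms were extracted as a related pair."""
    return frozenset((a, b)) in doc.relations


def precision_at_k(ranked_ids: Sequence[str], relevant_ids: Set[str], k: int) -> float:
    """Fraction of the top-k positions filled by relevant documents (divides by k)."""
    if k <= 0:
        raise ValueError("k must be positive")
    hits = sum(1 for doc_id in ranked_ids[:k] if doc_id in relevant_ids)
    return hits / k


# Toy collection (invented data): d2 mentions both terms but asserts no relation.
docs = [
    Doc("d1", frozenset({"Smad4", "GENE_X"}), frozenset({frozenset({"Smad4", "GENE_X"})})),
    Doc("d2", frozenset({"Smad4", "GENE_X"}), frozenset()),
    Doc("d3", frozenset({"Smad4"}), frozenset()),
]
relevant = {"d1"}

by_cooccurrence = [d.doc_id for d in docs if cooccurs(d, "Smad4", "GENE_X")]
by_relation = [d.doc_id for d in docs if has_relation(d, "Smad4", "GENE_X")]

print(by_cooccurrence, precision_at_k(by_cooccurrence, relevant, 2))
print(by_relation, precision_at_k(by_relation, relevant, 1))
```

In this toy data, co-occurrence matching returns d1 and d2 (P@2 = 0.5), while relation matching returns only d1 (P@1 = 1.0), which mirrors the precision-focused motivation stated in the abstract.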

Files in This Item:

File: 2006150022.pdf
Size: 127.83 kB
Format: Adobe PDF

Items in iDEA are protected by copyright, with all rights reserved, unless otherwise indicated.

