Please use this identifier to cite or link to this item: http://hdl.handle.net/1860/3126

Title: Language modeling approaches to question answering
Authors: Banerjee, Protima
Keywords: Information science; Question-answering systems; Semantics--Data processing
Issue Date: 8-Oct-2009
Abstract: In today’s environment of information overload, Question Answering (QA) is a critically important research area. QA is the task of automatically extracting a precise answer to a question posed in natural language from one or more data sources. A two-stage strategy is typically adopted when designing a QA system. The first stage is an Information Retrieval (IR) process that returns a set of candidate documents relevant to the question. The second stage narrows the information contained in those documents down to a single response (a sentence or entity) that answers the question, typically using Information Extraction (IE) or Natural Language Processing (NLP) methods. This research proposes novel techniques for QA that enhance the user’s original query with latent semantic information from the corpus. The enhanced query is then applied to both the first and second stages of the QA architecture. To build the enhanced query, we propose the Aspect-Based Relevance Language Model, an approach that uses statistical language modeling techniques to measure the likelihood that a concept (or aspect, as defined by Probabilistic Latent Semantic Analysis) is relevant to a question. We then use terms from the aspects with the highest likelihood of relevance to design a model for a semantic Question Context, which includes sense-disambiguated terms that amplify the user’s query. Question Context is incorporated into the first stage of QA as query expansion to improve recall. We then derive a novel measure called Answer Credibility from the Question Context. Answer Credibility may be thought of as a statistical measure of the reliability of a candidate answer with respect to a question and the source text from which the candidate answer was derived. We incorporate Answer Credibility into the Answer Validation process; the answer with the highest score after the application of Answer Credibility is returned to the user.
Our techniques show performance improvements over state-of-the-art approaches and have the advantage of using statistical techniques to derive the semantic information that aids the process of QA.
URI: http://hdl.handle.net/1860/3126
Appears in Collections: Drexel Theses and Dissertations

Files in This Item:

File: Banerjee_Protima.pdf
Size: 1.07 MB
Format: Adobe PDF

Items in iDEA are protected by copyright, with all rights reserved, unless otherwise indicated.

