TY - GEN
T1 - Can click patterns across user's query logs predict answers to definition questions?
AU - Figueroa, Alejandro
N1 - Publisher Copyright: © 2012 Association for Computational Linguistics.
PY - 2012/1/1
Y1 - 2012/1/1
AB - In this paper, we examined click patterns produced by users of the Yahoo search engine when submitting definition questions. Regularities across these click patterns were then used to construct a large and heterogeneous training corpus for answer ranking. In a nutshell, answers are extracted from clicked web-snippets originating from any class of web-site, including Knowledge Bases (KBs). Non-answers, on the other hand, are acquired from redundant pieces of text across web-snippets. The effectiveness of this corpus was assessed by training two state-of-the-art models, which were then used to distinguish answers to unseen queries. These testing queries were also submitted by search engine users, and their answer candidates were taken from their respective returned web-snippets. This corpus helped both techniques to finish with an accuracy higher than 70%, and to predict over 85% of the answers clicked by users. In particular, our results underline the importance of non-KB training data.
UR - http://www.scopus.com/inward/record.url?scp=85035309567&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:85035309567
T3 - EACL 2012 - 13th Conference of the European Chapter of the Association for Computational Linguistics, Proceedings
SP - 99
EP - 108
BT - EACL 2012 - 13th Conference of the European Chapter of the Association for Computational Linguistics, Proceedings
PB - Association for Computational Linguistics (ACL)
T2 - 13th Conference of the European Chapter of the Association for Computational Linguistics, EACL 2012
Y2 - 23 April 2012 through 27 April 2012
ER -