Exploring effective features for recognizing the user intent behind web queries

Research output: Contribution to journal, Article

20 Citations (Scopus)

Abstract

Automatically identifying the user intent behind web queries has begun to attract the attention of the research community, since it allows search engines to enhance the user experience by adapting results to that intent. It is broadly agreed that there are three archetypal intentions behind search queries: navigational, resource/transactional and informational. As a natural consequence, this task has been interpreted as a multi-class classification problem. By and large, recent works have focused on comparing several machine learning methods built with words as features. In contrast, this paper examines the influence of assorted properties on three classification approaches, focusing in particular on the contribution of linguistic-based attributes. However, most natural language processing tools are designed for documents, not web queries. Therefore, as a means of bridging this linguistic gap, we used caseless models, which are trained on traditionally labeled data in which all terms are converted to lowercase before the models are generated. Overall, the tested attributes proved to be effective, improving on word-based classifiers by up to 8.347% (accuracy) and outperforming a baseline by up to 6.17%. Most notably, linguistic-oriented features from caseless models are shown to be instrumental in narrowing the linguistic gap between queries and documents.
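To make the multi-class framing concrete, the following is a minimal sketch of a word-feature classifier over the three intent classes, with every term lowercased before training and prediction. It is only a loose illustration of the word-based setup the abstract mentions: the toy queries, the `features` helper, and the small Naive Bayes class are all hypothetical, and this is not the classifiers, the feature set, or the caseless linguistic models evaluated in the paper.

```python
import math
from collections import Counter, defaultdict

# Toy, hypothetical training queries; NOT data from the paper.
TRAIN = [
    ("facebook login", "navigational"),
    ("youtube homepage", "navigational"),
    ("download free mp3 player", "transactional"),
    ("buy cheap flights online", "transactional"),
    ("how do volcanoes erupt", "informational"),
    ("history of the roman empire", "informational"),
]


def features(query):
    """Return lowercased word features (case is discarded on purpose)."""
    return query.lower().split()


class NaiveBayes:
    """Multinomial Naive Bayes with add-one smoothing (illustrative only)."""

    def fit(self, examples):
        self.class_counts = Counter()
        self.word_counts = defaultdict(Counter)
        self.vocab = set()
        for query, label in examples:
            self.class_counts[label] += 1
            for word in features(query):
                self.word_counts[label][word] += 1
                self.vocab.add(word)
        return self

    def predict(self, query):
        total = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n in self.class_counts.items():
            # Log prior plus smoothed log likelihood of each query word.
            score = math.log(n / total)
            denom = sum(self.word_counts[label].values()) + len(self.vocab)
            for word in features(query):
                score += math.log((self.word_counts[label][word] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


model = NaiveBayes().fit(TRAIN)
print(model.predict("Download MP3 Online"))   # transactional
print(model.predict("HOW DO empires fall"))   # informational
```

Because both training and test queries pass through the same lowercasing step, capitalization in the input does not affect the prediction. That is the only property this sketch shares with the caseless idea described above.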

Original language: English
Pages (from-to): 162-169
Number of pages: 8
Journal: Computers in Industry
Volume: 68
DOI
Status: Published - 1 Jan 2015

ASJC Scopus subject areas

  • Computer Science (all)
  • Engineering (all)
