What identifies different age cohorts in Yahoo! Answers?

Alejandro Figueroa, Mohan Timilsina

Research output: Contribution to journal › Article › peer-review

8 Citations (Scopus)


For different kinds of online platforms, understanding demographics has been shown to be instrumental in improving user experience, especially for personalizing and contextualizing content. Unsurprisingly, there have been a number of studies delving into demographics on online social media platforms, including Facebook and Twitter. However, only a handful of works have explored demographic factors behind community question-answering platforms, despite their massive number of members. For this reason, we undertook a study of Yahoo! Answers members, focusing on age demographics. To this end, we automatically built and annotated a large-scale corpus comprising metadata and textual inputs produced by ca. 650,000 community members. We exploit this collection by conducting both an exploratory/statistical analysis and predictive modelling. In the former, we explored the correlation between distinct age groups and some variables that, intuitively, seem likely to be highly correlated with some cohorts. Interestingly enough, this analysis revealed that Millennials are answering questions posted by the age group that follows them (Gen Z). In the latter, we assessed the prediction rate of various traditional statistical methods and neural network classifiers coupled with numerous combinations of assorted textual and metadata features. Overall, the best classifiers achieved a Mean Reciprocal Rank (MRR) of up to 0.862 and were built with FastText and Maximum Entropy (MaxEnt). In terms of informative attributes, user asking/answering activity patterns and sentimentally charged words provide telltale clues about which age group a community member belongs to.
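For readers unfamiliar with the MRR figure reported above, the following is a minimal sketch of how a mean reciprocal rank can be computed when a classifier returns a ranked list of candidate age cohorts for each user. It is not the authors' evaluation code: the function name, the cohort labels, and the toy rankings are all hypothetical and chosen only for illustration.

```python
from typing import Sequence


def mean_reciprocal_rank(
    ranked_predictions: Sequence[Sequence[str]],
    true_labels: Sequence[str],
) -> float:
    """Average of 1/rank of the true label in each ranked prediction list.

    A user whose true label does not appear in the ranking contributes 0.
    """
    if len(ranked_predictions) != len(true_labels):
        raise ValueError("rankings and labels must have the same length")
    if not true_labels:
        raise ValueError("at least one example is required")

    total = 0.0
    for ranking, truth in zip(ranked_predictions, true_labels):
        if truth in ranking:
            # list.index is 0-based, ranks are 1-based
            total += 1.0 / (ranking.index(truth) + 1)
    return total / len(true_labels)


# Hypothetical cohort labels and rankings (illustrative data, not from the paper).
rankings = [
    ["gen_z", "millennial", "gen_x"],   # true label at rank 1
    ["millennial", "gen_z", "gen_x"],   # true label at rank 2
    ["gen_x", "gen_z", "millennial"],   # true label at rank 3
    ["gen_x", "millennial", "gen_z"],   # true label at rank 1
]
truth = ["gen_z", "gen_z", "millennial", "gen_x"]

mrr = mean_reciprocal_rank(rankings, truth)
print(f"MRR = {mrr:.3f}")
```

Under this definition, an MRR close to 1 means the correct cohort is usually ranked first, while lower values indicate that it tends to appear further down the ranking.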

Original language: English
Article number: 107278
Journal: Knowledge-Based Systems
Publication status: Published - 27 Sept 2021


Keywords

  • Community question answering
  • Intelligent information retrieval
  • Natural language processing
  • User demographic analysis

ASJC Scopus subject areas

  • Management Information Systems
  • Software
  • Information Systems and Management
  • Artificial Intelligence

