Back in October, we wrote a research paper entitled “Anatomy of a Large-Scale Social Search Engine” and submitted it to WWW 2010. We found out last week that it has been accepted, so we wanted to share a preview with you today!
Our paper was inspired by the classic Google paper, “Anatomy of a Large-Scale Hypertextual Web Search Engine”, in which Sergey Brin and Larry Page first described the algorithms and architecture of Google. Their paper was published 12 years ago at the same WWW conference.
Our goal is to follow their example by providing a thorough presentation of the approach, architecture, algorithms, interfaces, and issues involved in Aardvark’s new social search paradigm.
The paper describes the fundamental differences between the traditional “Library” paradigm of web search — in which answers are found in existing online content — and the new “Village” paradigm of social search — in which answers arise in conversation with the people in your network. We explain that in social search:
- Users can ask questions in natural language, not keywords
- Content is generated “on-demand”, tapping the huge amount of information in people’s heads
- The system is fueled by the goodwill of its users
We demonstrate that there is a large class of subjective questions — especially longer, contextualized requests for recommendations or advice — which are better served by social search than by web search. And our key finding is that whereas in the Library paradigm, users trust information depending upon the authority of its author, in the Village paradigm, trust comes from our sense of intimacy and connection with the person we are getting an answer from.
We also provide a detailed analysis of user behavior, and include dozens of interesting statistics. For example, of the 90,361 users we had in October 2009…
- 87.7% of questions sent to Aardvark got answered (very high answer rate!)
- 75.0% of users who asked Aardvark a question also answered a question for someone else (very high participation rate!)
- 70.4% of answer feedback had a rating of ‘good’ as opposed to ‘ok’ or ‘bad’ (high quality!)
Writing a paper like this requires being more open, and sharing more information, than most small internet startups might be comfortable with. But we recognize that we have benefitted from the open culture of the scientific community, and would like to do our part. Further, we think that the opportunity presented by social search is truly significant, and we’d like to engage with the rest of the research community on the many challenges it presents. There are very interesting problems to explore around question classification, analysis of social relationships, person-to-person matching, maintaining a question/answer economy, and many other areas.
I wrote the paper with my good friend, Sep Kamvar, who started Kaltix, a search company acquired by Google in 2003. He led personalized search at Google for several years, and is now a professor at Stanford and an advisor for Aardvark. But this paper would not have been possible without the hard work and support of the whole Aardvark team over the past few years. And, of course, Aardvark itself would not be possible without the continued enthusiastic contributions of all of you, our users!
We’re very excited about presenting this at the WWW conference, which has been providing a great forum for web research for 19 years, and we hope to see you there in April. So take a read, and let us know what you think…
(Note: the preview version we’re sharing here has some changes inspired by the great reviewer comments we received; we may make further changes for the camera-ready version that will be presented at the conference.)

