We’ve invested heavily in advanced research since we started Aardvark two years ago. This is an unusual move for a small startup company, which is expected to spend all of its time on a panicked frenzy of short-term releases. So why did we do it?
Well, we noticed that the explosion of online information about people in the Web2 era (on social networks, blogs, tweets, review sites, etc.) was parallel to the explosion of online web pages in the late 1990s. Back then, the timing was right for a major wave of research into IR-style web search engines, developing algorithms to find the right web page for a query. Similarly, the timing was right when we started Aardvark for a major wave of research into social search, developing algorithms to find the right person to answer a question. Naturally, we wanted to be a few years ahead of this curve.
The “Social Search” research problems we’re working on today are related to the “Web Search” research problems that have received so much attention over the past decade, since the components of the underlying search systems are analogous. However, while the abstract aims of each component in Aardvark are analogous to those in a corpus-based search engine, the means of achieving those aims are quite different:
Query Analysis
- With Web Search, the challenge is extrapolating a user’s intent — the question she really has in mind — based on a keyword search query.
- With Social Search, the challenge is determining the areas of expertise needed to answer a natural language question: figuring out what subject matters the question concerns.
Crawling and Indexing
- With Web Search, the challenge is crawling large numbers of web pages and indexing them based on the terms and metadata they contain.
- With Social Search, the challenge is attracting socially connected users (i.e., crawling large social graphs) and indexing them based on topics of expertise: figuring out what topics people know about, as expressed in their online profiles, the content they have authored, etc.
Ranking
- With Web Search, the challenge is ranking web pages in search results based on relevance (i.e., match to the query) and quality (i.e., an authority score).
- With Social Search, the challenge is ranking candidate answerers to match with the question (i.e., with the relevant expertise) AND to match with the asker: finding answerers who want to help the asker, and whom the asker will trust, based upon their affinities and sense of connection.
User Interface
- With Web Search, the challenge is displaying search results with snippets and highlights in an easily-scannable manner.
- With Social Search, the challenge is facilitating a chat-like interaction between the asker and the answerer: designing a user experience that removes the social and logistical barriers to having a quick conversation with someone you are connected to.
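To make the Social Search ranking challenge above a little more concrete, here is a minimal sketch of scoring candidate answerers on both expertise and connection to the asker. Everything in it is an illustrative assumption rather than a description of how Aardvark actually works: the `Candidate` type, the `topic_match` and `rank_candidates` helpers, the 0-to-1 scales, and the choice to simply multiply the two scores are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Candidate:
    """A hypothetical candidate answerer (not Aardvark's real data model)."""
    name: str
    expertise: Dict[str, float]  # topic -> strength in [0, 1]
    affinity: float              # connection to the asker in [0, 1]


def topic_match(question_topics: Dict[str, float],
                expertise: Dict[str, float]) -> float:
    """Weighted overlap between the question's topics and a candidate's expertise."""
    total_weight = sum(question_topics.values())
    if total_weight == 0:
        return 0.0
    overlap = sum(weight * expertise.get(topic, 0.0)
                  for topic, weight in question_topics.items())
    return overlap / total_weight


def rank_candidates(question_topics: Dict[str, float],
                    candidates: List[Candidate]) -> List[Tuple[float, str]]:
    """Rank candidates by expertise match times affinity (an assumed combination)."""
    scored = [(topic_match(question_topics, c.expertise) * c.affinity, c.name)
              for c in candidates]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


if __name__ == "__main__":
    # Assume query analysis has already mapped the question to a single topic.
    question = {"java": 1.0}
    people = [
        Candidate("alice", {"java": 0.9}, affinity=0.2),
        Candidate("bob", {"java": 0.6}, affinity=0.8),
        Candidate("carol", {"3d modeling": 1.0}, affinity=1.0),
    ]
    for score, name in rank_candidates(question, people):
        print(f"{name}: {score:.2f}")
```

In this toy example bob ranks ahead of alice even though alice knows more about the topic, because bob is far more closely connected to the asker. That trade-off between matching the question and matching the asker is the part that has no direct counterpart in corpus-based search.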
To address these diverse challenges, we throw a diverse set of tools at the problem: Aardvark makes heavy use of natural language processing, machine learning, and quantitative analysis more generally. We use a combination of off-the-shelf tools and custom-built frameworks and algorithms. We are strongly data-driven, and are constantly evaluating the latest tweaks (and outright wacky ideas) to see how they affect the overall performance of the system.
As a result, we’re making great progress on each of these fronts… but there are many more interesting problems ahead. Our team today includes a number of experienced research scientists and research engineers… and we’re still recruiting! (If you’re a talented researcher, you should consider joining us.)
If you want to hear more about how we’ve approached some of the text processing and semantic analysis research problems, come to my talk on “Text and Meaning” at the Web2.0Expo next week.
9 Comments
Interesting post.
I always like reading about behind-the-scenes thinking on things.
I’d like to chuck one more thing onto your list though, one that you no doubt think about, but I think it’s important nonetheless.
Within “Crawling and Indexing” there’s also the problem of keeping people active: making sure people don’t get bored with answering questions. Partly this is finding a good question/answerer match, but I think partly it will also become making sure people don’t get asked the same thing too many times.
I don’t see this as an issue at the moment, but more as one to consider for the future.
On web forums on specific topics I have knowledge in (HTML, 3D modeling, Java) you often see the same questions pop up time and time again. And people do get fed up with having to answer these queries (often to the extent that they start acting rudely to the original poster).
I think Vark.com, as it grows, might get the same problem. So I’d list it as something to think about in future.
----
I’d also like to add an additional “s” to the User Interface section:
“, the challenge is facilitating a chat-like interaction between the asker and the answererS: ”
Please don’t forget that sometimes one person giving an answer isn’t enough. Sometimes we need to have help from a few different people…and even discuss it a bit…to reach a conclusion.
Congratulations on Vark.com. I love using it and have referred many friends over. I would have liked to see some references to academic articles in this post.
I have been thinking about one potential future for Aardvark: as a platform, not just as a web service.
There are lots of websites, companies, universities, software developers, and other types of organization that use the internet as a communication medium. These people may be able to take advantage of an intelligent Q&A and communication platform such as Aardvark to augment their products, services, etc.
If Aardvark evolves into a platform, then these people could use the Aardvark system in their own context.
Aardvark is so cool!
Get the answer to practically any question on planet earth, and maybe beyond that.
Aardvark now has a link on our website and we hope that people will check it out and sign up.
We did, and now we’re hooked.
This has turned out to be a successful piece of work.
I congratulate vark.com.
thank you, great!
so wonderful~~~~
nice
Firstly I am not understood I think u may effected any cost are any legal matter and I realized ofter it’ to be goes to my experience goggle is fl-oughted land and cultivated land which may be we want to pics up mangoes , bananas, Custer apple pulses rice and wheat which may got and earn cooked.
3 Trackbacks
[...] can read about the research behind Aardvark and more hear at their blog! Here is a [...]
[...] and answer service. It has the potential to foster discussions between two people, while reducing, per Horowitz, “the social and logistical barriers” to conversation. As co-founder Max Ventilla puts it, [...]
[...] http://blog.vark.com/?p=267 [...]