Search Engine

Serchilo: Meta Search Engine with Wiki Commands

I attended a lightning talk by Georg Jähnig at the 24C3 Chaos Communication Congress. The video is now available in the torrent network. I uploaded it to Google Video (please post alternatives as comments). Last year I published an interview with Georg, in German. Since then he has put quite a lot of work into improving the website and making it more international. I hope his ideas take off even more in the upcoming year. It would be great to have him at the SuMa-eV congress this year as well.


Serchilo Firefox-Plugin:
Georg Jähnig:


Mahalo – New Entry in the Search Engine Market. An Alternative?

Jason McCabe Calacanis is the founder of […]. With Mahalo he is trying to establish a new search engine based on user-submitted search result pages. To motivate people to write search result pages, Mahalo pays part-time guides 10 to 15 USD per accepted page. Part-time guides whose search result page is accepted by the full-time guides are also credited as the page's original writer.

How do you become a guide? First you have to register and fill out an application form. They want to know your personal details such as phone number and address, as well as your blog and your user names on sites like Wikipedia, delicious, Flickr, YouTube and so on. Then they ask why you want to write search results, what kind of search results, and what else you have to say. Finally you have to choose how to be paid for your work. Currently US citizens can choose to receive the 10-15 USD per accepted search result page themselves or to donate it to the Wikimedia Foundation (there are plans to add other organizations later); non-US citizens can only choose to donate it.

Are they good or bad? It seems Mahalo wants to belong to the good guys: it has set aside up to 250,000 USD in donations for the Wikimedia Foundation this year. This is impressive, but it remains to be seen whether part-time guides will actually choose to donate to Wikipedia.
Mahalo Greenhouse: … Oh yeah, if we accept your search result we will pay you $10 to $15 per search result (the range is based on how many search results you’ve completed: more here). Now, if you’re a disciple of Yochi and you absolutely will not work on a web-based project for money, we’ve got an amazing proposition for you: make the web better by writing spam-free search results and we’ll donate your fees to the Wikimedia Foundation. So, you can make the world better 2x: first by making clean, spam-free search results and second by helping keep the Wikipedia running (those server bills ain’t cheap!). We’ve earmarked up to $250,000 in donations to the Wikipedia this year.
Even if some choose to donate their earnings to the Wikimedia Foundation, it is clear that Mahalo is not primarily about building a community. It is about making money (or possibly, for some guides, about earning an income), even if it tries to appeal to different kinds of users, including those with intrinsic motivations: “…you can make the world better 2x”. The investors include Sequoia Capital's Michael Moritz, who invested in Yahoo and Google when they were still start-ups; Dallas Mavericks owner Mark Cuban, who became a billionaire after selling to Yahoo; AOL Vice Chairman Ted Leonsis, who also owns the National Hockey League's Washington Capitals; Elon Musk, co-founder of online payment service PayPal; NewsCorp; CBS Corporation; and Hubert Burda Media. In the end they all want to make a profit. So what is Jason Calacanis's calculation with Mahalo?
As for funding, if the Google AdSense units currently on the site don't cover costs, Calacanis says investors … have given him enough money to run the company for at least five years.
If Mahalo pays up to 15 dollars per submitted search page, an ad that costs on average 7 cents per click has to be clicked about 214 times to recoup the cost of a page written by a “part-time guide”. On top of that come the server costs, the cost of the full-time guides who check the pages, and so on. Still, over time it seems possible for popular search pages to recoup their cost, but what about less popular search terms, and search terms that do not exist yet? Jim Lanzone, CEO of […], said: "On any given day, 60 percent of the search requests we get, we have never seen before." How Calacanis will solve this problem remains to be seen.

What else do I get out of submitting search result pages apart from money? In contrast to Technorati and Digg, I get nothing out of it except limited exposure: my name on a search page. With Technorati I get exposure for my blog (a link!) and receive useful data, for example who is linking to me, how many blogs link to me, what the top tags are and so on. With Digg I can save my bookmarks and access them from anywhere.

Mahalo's strategy of indexing only the best sites is also unclear.
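To come back to the break-even estimate for a moment: it can be written down as a small calculation. This is only a rough sketch using the figures from this post (10 to 15 USD per page, 7 cents per click). It assumes that the full 7 cents reaches Mahalo and it ignores server costs and the full-time guides, so the real numbers would be higher. The function name `break_even_clicks` is made up for this illustration.

```python
def break_even_clicks(page_cost_cents: int, revenue_per_click_cents: int) -> int:
    """Smallest whole number of ad clicks whose revenue covers the page cost.

    Amounts are integer cents so that rounding is exact.
    """
    if page_cost_cents < 0 or revenue_per_click_cents <= 0:
        raise ValueError("cost must be >= 0 and revenue per click must be > 0")
    # Ceiling division: a fraction of a click still requires one more click.
    return -(-page_cost_cents // revenue_per_click_cents)


if __name__ == "__main__":
    # Figures from the post: 10 to 15 USD per page, 7 cents per click.
    print(break_even_clicks(1500, 7))  # 215
    print(break_even_clicks(1000, 7))  # 143
```

15 USD divided by 7 cents is roughly 214.3, hence the "about 214" above; counted in whole clicks, the 215th click is the one that actually covers the fee.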
The FAQ says: We will link to... sites that are considered authorities in their field (i.e. Edmunds for autos, Engadget for consumer electronics, and the New York Times for news).
Who decides who is an authority? Which are the best sites? How is the decision made? What happens when opinions differ? Free communities like the Wikipedia community have developed (and keep developing) ways to solve problems and create transparent decision-making processes. How about transparency at Mahalo?

Next question: what is the user really looking for? Ambiguous searches are a problem for all search engines. If I search for “Paris Hilton”, for instance, am I looking for the person or the place? Google tries to understand what users want by collecting more and more user-specific information and personalizing search results according to this data. (I wrote before about the privacy problem of a commercial search engine company hyper-collecting user data. It is almost automatically an “invitation” to collect more and more user data and exploit it commercially as much as possible.) There is no perfect solution to ambiguous searches, and Mahalo does not address the problem either. So Mahalo's results will not be more relevant than those of other search engines, even if they are written by humans rather than computed by an algorithm.

Is Mahalo more transparent than others? Not as far as I can see. Mahalo increases transparency by showing top searches in real time in the right sidebar. Google Zeitgeist does not do that in real time, but Technorati and others do. Therefore I do not see more transparency than other search engines offer.

What about the search pages? I am not an expert in evaluating search engine results, and it is probably still too early to do that anyway, as Mahalo only started in June. Let's see.

Is Mahalo for me? It is for me if it is free! To tackle the problem of search engine monopolization, I believe we need an approach to search that is free, open source, sustainable and provides good search results. On the website there is no information about what software Mahalo is using.
When I asked Jason Calacanis: surprise! Mahalo is based on free software: MediaWiki, Squid, Nutch, LAMP (Linux, Apache, MySQL, PHP). What about the search result pages themselves, though? They are copyrighted by Mahalo and are therefore not free: “we feel since we're paying for the results we should own them”. On the Wikia Search project mailing list Jason explains further to Jimmy Wales:
Now, this is not written in stone. In the future we might move to a Creative Commons model for the results--perhaps non-commercial so someone doesn't just life the entire Mahalo index and dilute our ability to pay the contributors. That's my main concern: figuring out a way to keep paying folks who want to get paid for their contributions. So, I like CC Noncommerical and I like paying people. (Jason on the search-l-wikia mailing list on 4 July 2007)
In the future Mahalo might use a license that is not as free as many in the free software, free content and free infrastructure community would like, but Jason Calacanis is obviously trying to develop a sustainable business model based on free layers. In addition, he has expressed strong interest in helping to build open source search software together with Jimmy Wales's Wikia Search project: “[…] hopes to a) use Wikia's open source search software and b) wants to help build it. We *share* the mission to open up search.” (Jason on the search-l-wikia mailing list on 3 July 2007)
Mahalo is an interesting approach to search, which revives the idea of the Yahoo Directory, DMOZ and other directory listings. It is based on free software, but not (yet) on free knowledge. I cannot copy the database, but I can duplicate the software that powers the site. Mahalo is set up as a commercial enterprise. Users have the choice to work for it: they can submit human-written search result pages and get paid, or donate what they earn to the Wikimedia Foundation.

Whether Mahalo can become an alternative search engine with a noticeable market share remains to be seen. If it is successful, I believe there is a high chance that it will be bought by Google, Yahoo or another company. That is probably what the investors are hoping for. If Mahalo also used free licenses for its search result pages, it would endanger this prospect. Mahalo is trying to find a compromise between applying freedom in every layer (free software and free content) and its commercial interests.

For anyone who wants free search it is a good start, but to create a really free search engine, the result pages have to be free as well. Under current economic conditions this would not be interesting for a commercial enterprise. However, I believe only a completely free search based on completely free layers will provide a sustainable basis and motivation for people to form a free international community (like the Wikipedia community) that works continuously on a human-powered search. But ... a free community cannot be bought!

Selling Advertising Space - Priority Task of Search Engine Companies?

According to Theo Röhle of the University of Hamburg (Germany), the commercial exploitation of advertising space has become the main task of search engine companies. In “‘Think of it first as an advertising system’: Personalisierte Online-Suche als Datenlieferant des Marketings” (pdf) he analyses how search engines collect user data, with a focus on personalized search.

Personalized online search as offered by Google and Yahoo can indeed help to improve search results and increase their relevance for the individual user. But what makes these services interesting to search engine companies is the logging and interpretation of user behavior and the profiling of their users. User data collected this way can be used for commercial purposes, possibly without any time limit.

As listed companies, search engine firms depend on investors and financial markets, so they have a strong interest in maximizing profits. Google holds a near-monopoly in the search engine market. Investors need companies that grow steadily, yet it is hard for a company like Google to grow in a search engine market that it already largely controls. However, Google can grow "in depth", meaning it can grow by collecting more information and exploiting it commercially.

Theo Röhle likewise concludes that user data and information originally gathered to improve search results will end up being exploited, because of the "commercial pressure to be exploited". This information is indeed the capital of search engine companies.

Hence the question: do I, as a user, want to take part in this capitalization of my personal data? And what alternatives would there be?

In light of Theo Röhle's findings, search engines based on Free Software and Free Algorithms become a vital interest of every Internet user who wants to protect his or her private data and continue using modern Internet services. Only Free search engines based on Free layers can avoid monopolistic structures in which one commercial party (nearly) controls the flow of information, public as well as private.

Serchilo: Meta Search Engine with the Wiki Principle

To query the databases of search engines and information services, you have to go to the providers' websites. Besides banner ads, you often get a lot of information there that is not relevant to your search. With a search query on Serchilo you can access the results of search engines directly. Serchilo users can also write and edit their own commands. For this purpose the commands are stored in a wiki under a free Creative Commons license (by+sa).

