Danny Hillis never thinks small. One of his recent projects is a clock for the Long Now Foundation that is intended to last ten thousand years. Yesterday Hillis launched Freebase, a wiki-like database that he and his crew hope will become a true “data commons,” collecting and somehow making sense of vast stores of information on every topic. Seeded with large chunks of Wikipedia and other resources like MusicBrainz, Freebase invites the public not only to add to the knowledge base, but also to port what they like to their own pages, thanks to open APIs (Application Programming Interfaces) and Creative Commons Attribution licenses.
At first blush, Freebase sounds a bit like Google Base, a repository for user-uploaded information that launched a little over a year ago. What distinguishes Freebase, however, is its combination of community-generated information with a cunning overlay of descriptive metadata. Web 2.0 meets the Semantic Web.
If enough people upload content to Freebase and tag it intelligently, Freebase could signal the next step beyond Google and other search engines that return long pages of possible matches based on algorithmic computations. Theoretically, Freebase could return an actual answer to your query, one constructed from hints hidden in those interlinked metadata tags. It’s a noble goal and one that Hillis might just be up to.
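To make the “actual answer” idea concrete: Freebase queries are written in its Metaweb Query Language (MQL), where the client describes the shape of the answer it wants as JSON and the service fills in the blanks from the metadata graph. Here’s a minimal sketch; the type and property names are illustrative assumptions, not a guaranteed part of the schema, and no network call is made.

```python
import json

# A sketch of the kind of structured query a metadata-backed store like
# Freebase invites: instead of keyword matching, you describe the shape
# of the answer. Empty values mean "fill this in for me."
query = {
    "type": "/music/artist",   # constrain by metadata type (illustrative)
    "name": "The Beatles",     # match a specific entity
    "album": [],               # empty list asks the service for all albums
}

# The query travels as JSON over the open API; serializing it is all
# that's needed before handing it to an HTTP client.
payload = json.dumps({"query": query}, sort_keys=True)
print(payload)
```

The contrast with a keyword search engine is the point: the response is a filled-in copy of the query, not a ranked list of pages that might contain the answer.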
Arguments about the advantages and disadvantages of web-based applications are raging across the net. If the topic interests you, the discussion going on over at Read/Write Web is well worth a read. On that site yesterday, Ebrahim Ezzy posted an article titled Webified Desktop Apps vs Browser-based Apps. In it Ezzy cites downsides to the new web-based apps, including being at the mercy of the network and server load, issues with authentication, security, privacy, and reliability, as well as questions about backward compatibility as these new apps evolve. In a post titled Discussion: Webified Desktop Apps, Richard MacManus highlights the main points being raised by other bloggers. Those favoring web-based solutions counter Ezzy by noting that apps and databases accessed via browsers have the advantage of being available from any connected computer, are platform agnostic, and are well suited to collaborative projects. MacManus, the man behind the Read/Write Web blog, wisely cautions that we don’t need to think in either/or terms. Still, it pays to understand the rationale behind both sides of this important question as we negotiate increasingly complex content waters.
Over the past few months, I’ve been using Writely off and on, a web-based word processor that is elegant, fast, and free. Google recently acquired the app, and it has just relaunched. I’ve been playing with it a bit and here’s what I like best about the newest version:
* full-featured word processor (styles, colors, tables, images, comments),
* files can be accessed from any browser window,
* offsite backup every 10 seconds,
* can save and download docs in a variety of formats (HTML, PDF, RTF, ODT, or Word),
* can compare and revert to previous versions,
* collaborative editing in real-time with whomever you choose,
* folksonomic tagging support!
I’m sure there’s more and yet the app doesn’t feel like it’s succumbing to a deadening “feature creep.” The only piece missing (and perhaps I just haven’t discovered it yet) is outlining. I think and write in outline format and have had to tease some of my work on Writely into a fake outline format, but that is fairly simple. You can even post directly to your blog, which is what I’m going to do with this post. If you use Writely, let me know what you think.
Dabble launched a couple of days ago. Tim Perkis had alerted me to the service some months ago and I’ve been curious to see how CEO Mary Hodder and her crew would tackle the problem of tracking and organizing the vast number of videos being uploaded each day. YouTube is reporting 65,000 new videos coming online daily, with more than 100 million videos viewed a day on that service alone. Finding videos of personal interest amidst all that glut is a challenge, but one made much easier now that Dabble is on the streets.
Just as delicious eases the task of keeping track of bookmarks through the use of user-generated tags, Dabble allows visitors to assign freeform descriptive tags to videos that reside anywhere online. Besides aggregating these tags, the site encourages communities of interest to grow up around comments, playlists, and a new Dabble Blog.
Dabble started out with a bang, tracking 100,000 videos from Brewster Kahle’s Moving Image Archive. To assign tags to videos you come across, you can install a handy bookmarklet tool. In addition to collating tags, Dabble allows users to contribute information about who created the video, who’s in it, and the like in an editable wiki environment. This looks like a service well worth supporting. I know I will be.
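The folksonomy mechanics behind services like delicious and Dabble are simple enough to sketch in a few lines: many users attach freeform tags to items hosted anywhere, and the service aggregates them in both directions, popular tags per item and items per tag. The users, tags, and URLs below are made up for illustration.

```python
from collections import Counter, defaultdict

# Each tagging event is (user, video URL, freeform tag) -- the raw
# material a service like Dabble collects from its bookmarklet.
taggings = [
    ("alice", "http://example.com/v/1", "timelapse"),
    ("bob",   "http://example.com/v/1", "timelapse"),
    ("bob",   "http://example.com/v/1", "city"),
    ("carol", "http://example.com/v/2", "cats"),
]

# Aggregate per video: popular tags rise to the top as more users agree.
tags_by_video = defaultdict(Counter)
for user, url, tag in taggings:
    tags_by_video[url][tag] += 1

# Invert into a tag index: browsing "timelapse" finds every video so tagged.
videos_by_tag = defaultdict(set)
for user, url, tag in taggings:
    videos_by_tag[tag].add(url)

print(tags_by_video["http://example.com/v/1"].most_common(1))
```

The interesting emergent property is that no one decides the vocabulary in advance; consensus tags simply accumulate higher counts.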
Those wishing to track how well wikis perform in very public settings now have an ideal petri dish. Yesterday, eBay Wiki launched. The online auction giant is inviting community members to contribute to “fact-based articles” that relate to trading on the website. Built in conjunction with JotSpot, the wiki is quickly attracting both authors and editors. The eBay environment already provides admirable mechanisms for feedback and gauging reputations. It will be interesting to watch how well a wiki withstands inevitable attempts to game the system. Unlike many of the peer-production experiments currently underway, eBay is a testing ground on which players literally have a great deal to gain. The wiki is, of course, in beta. The first articles, seeded by eBay regulars, deal with issues as varied as restoring feedback percentages, a list of handy auction tools, and a set of tips for selling art. It’s a brave move by eBay, one that will certainly have its messy moments, but it has the potential to be a true proving ground for public wikis.
Nature pits Wikipedia against Encyclopedia Britannica and the free, user-edited encyclopedia holds its own.
The good news for Wikipedia began yesterday with a special report by the venerable science journal Nature. The periodical oversaw the peer review of 42 entries common to the two encyclopedias and found errors in both. In fact, the reviewers discovered 162 errors in Wikipedia and 123 in the Britannica. Among the errors, four in each compendium were dubbed “serious.”
Determined to test the mettle of the online encyclopedia, two of Wikipedia’s 45,000 registered “editors” carried the math a little further and discovered that the Wikipedia articles used in the review were, on average, 2.6 times longer than the Britannica’s. The authors are “cautious about drawing conclusions, but from a purely statistical standpoint, this means that the Britannica yielded 3.6 errors for every 2KB of data while the Wikipedia ended up with a mere 1.3 errors per 2KB.”
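The relative comparison can be checked from the quoted figures alone, without knowing the absolute article sizes: divide the raw error ratio by the length ratio to get Wikipedia’s error density relative to Britannica’s.

```python
# Checking the editors' arithmetic using only the figures quoted above.
wikipedia_errors = 162
britannica_errors = 123
length_ratio = 2.6  # Wikipedia entries averaged 2.6x Britannica's length

# Wikipedia's errors per unit of text, relative to Britannica's:
# raw error ratio divided by the length ratio.
relative_density = (wikipedia_errors / britannica_errors) / length_ratio
print(round(relative_density, 2))
```

The result comes out at roughly half, which matches the editors’ broader point: per kilobyte of text, Wikipedia fared noticeably better, even though it contained more errors in absolute terms.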
The good news followed hard on the heels of bad. John Seigenthaler, founding editorial director of USA Today, recently accused the encyclopedia of erroneously implicating him in the assassination of Robert Kennedy. Mr. Seigenthaler declined to edit the entry himself.
Plus, there is the class action suit against Wikipedia brought by Baou Inc. Baou is run by Greg Lloyd Smith, who launched and defended the questionable QuakeAID project and was once sued by Amazon for engaging in fraud while using their name. Win or lose, defending cases like this one is costly and distracting.
Wikipedia will never be free of problems. There will be misinformed editors, ham-handed writers, vandals, and prolix types with axes to grind. Even with new mechanisms being put into place to screen entries, the sheer volume of data on the service prohibits any kind of full vetting. As of this writing, there are 3.7 million articles in 200 languages in the Wikipedia.
But the Nature article is thought provoking. First, it’s good to remember that even our most trusted reference tomes can make mistakes and, second, the Wikipedia, with its self-correcting nature and protean body of content, isn’t all that terrible a resource, after all.
Not long ago, corporate wisdom had it that content and customers were best kept behind tall garden walls, but the recent announcement that Microsoft and Yahoo will open up their networks and allow their respective instant messaging users to talk with one another is yet more proof that those walls are tumbling down. While altruism may have played some part in the deal, it is likely that the two partnered in hopes of overtaking AOL with its 56% share of the current market. They’ll have a rough row to hoe. AOL’s software is ubiquitous and friendlier than either Yahoo’s or MSN’s.
Plus, could the handshake be too little too late? Upstarts like Cerulean Studios’ Trillian and Defaultware’s Proteus X for the Mac have been providing free software allowing individuals to chat across all three systems for a while now, and they come with a friendly array of customizable features, among them video and SMS support for forwarding messages to your phone.
This is all well and good for the user who just wants to chat with their friends and colleagues across systems, but the real competitive edge may well turn out to be voice over IP (VoIP). The recent $2.6 billion purchase of VoIP company Skype (whose tagline reads “the whole world can talk for free”) by eBay was considered by many to be chancy, but signs are good that individuals will choose the much cheaper (free!) VoIP services over more traditional telephone providers whenever it’s possible and easy. The Microsoft/Yahoo partnership will make it very easy for users of those two systems.
The real story may lie in rumors of talks between Microsoft and AOL’s owner Time Warner to discuss more interoperability between those two chat and voice systems. There’s been little love lost between the two in the past but market pragmatics could force them to at least kiss for the cameras. Meanwhile, newcomer to the messenger game, Google, is also said to be in talks with AOL. It will be worth watching how this all plays out.
Searching blogs is still a bit of a crapshoot. General web search engines like Google, Yahoo, and MSN depend on spider bots that crawl websites at intervals too far apart to suit most bloggers. Search engines like Technorati, Feedster, Ice Rocket, and PubSub that are designed to work specifically with blogs and web feeds solve the time lag problem by registering pings from blogs each time one is updated or by indexing RSS and Atom web feeds, but the sheer number of blogs and blog posts makes scaling a challenge for these newer search companies. Both types of search engines favor blogs with the greatest number of inbound links, a point of contention among bloggers who address smaller, more niche audiences. More robust blog search capabilities have been on most bloggers’ wish lists for some time now.
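The “ping” mechanism those blog-specific engines rely on is a tiny XML-RPC call, the standard `weblogUpdates.ping` method, that a blog fires each time it publishes. Here’s a sketch that builds the request body without actually sending it anywhere; the blog name and URL are made up.

```python
import xmlrpc.client

# A blog announces "I just updated" by calling weblogUpdates.ping with
# its name and URL. This builds the standard XML-RPC request body;
# no network call is made here.
blog_name = "Example Blog"
blog_url = "http://blog.example.com/"

request_body = xmlrpc.client.dumps(
    (blog_name, blog_url),
    methodname="weblogUpdates.ping",
)
print(request_body)
```

In practice the same payload would be POSTed to a ping server’s RPC endpoint (weblogs.com ran the best-known one); most blogging software of the era did this automatically on publish, which is how Technorati and its peers sidestepped the crawl-interval lag.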
The release today of Google Blog Search (GBS) raises the hopes of many that they will be better able to find blog entries of interest and that others may discover their own blogs more easily. Initial response to the new service is mixed. The new beta service appears to gather its data by indexing RSS or Atom feeds published by many bloggers. How well this solves the latency issue with traditional Google web searches is yet to be seen, but a good number of bloggers are already testing and critiquing the service.
For a bit of context, the Wall Street Journal Online’s Vauhini Vara compiles a useful list of the major blog-specific search engines, calling out their pluses and minuses.
Today, SearchEngineWatch’s Gary Price lists Google Blog Search features that he would like to see in the future, including the ability to screen out blogs that merely scrape headlines, to cluster “related blog” posts, and to search by location for entries that contain a geotag. Price also requests a better understanding of how Google Blog Search differs from Google News searches.
The Blog Herald posts initial thoughts about Google’s new search. The Herald particularly liked the ability to place any search term into a web feed and found the split results between “related blogs” and a general index to be useful, although they echo the complaint of many that the latter seems to merely mimic Google News results. The overall size of the index is the major problem (under nine million blogs searched as compared to Technorati’s more than 17 million, for instance), but this may improve over time.
Microsoft’s Robert Scoble reports that the search speed is excellent and the results contain less spam and fewer duplicates than other search results. Technorati still has more up-to-date results, but at this point, Scoble is inclined to favor Google over the other engines.
We’ll have to watch and see how Google measures up and which of the other search giants decide to enter the ring. It was heartening that the new service indexed this blog, which is still relatively small and recent. Stay tuned and we’ll report back on how well Google’s new search suits the overall needs of an ever-growing blog population.