[personal profile] noahgibbs
I had an idea, related to (but different from) some things that Anton and I have been talking about lately.

The basic concept is: what if you could leave comments about web sites? *Any* web site. And other people could see your comments along with everybody else's. So products for sale, documentation, all the various information on the web, could be rated blog-style at their existing locations.

There are some problems, naturally, even if you can write your own browser to display the stuff automatically.

For instance: who hosts the comments? You don't want a web site to host its own comments, or unfavorable product reviews can be quietly removed by the person advertising the product! Censorship should be the right only of the original poster, and perhaps of some central rating authority, though preferably not that either.

It could be done as a web portal even now: you'd go to the portal, and browse in the normal way but through their stuff. The portal would automatically add the comments at the end and a form to post your own. However, that requires a portal and thus a central authority, which again leads to conflicts of interest.

You'd like the comments to be hosted by the commenters in some sense, both for security and to distribute the server load. But then how could you look up all the comments from all the servers for a particular web site? You could have a central server that collated all that information, but again you're back to the censorship and scalability problems, just on a smaller task.

So if you were going to store and query this stuff in a relatively peer-to-peer way (storage is easy that way, you just host your own comments), you'd need some kind of distributed query mechanism. The string you're querying on would be the URL -- any page you can bookmark, you can comment on. The returned items would be the comments, or perhaps a list of URLs/locations for them. Having the comments have URLs would make replying and threading relatively easy, though the efficiency might suck.
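To make the shape of that query concrete, here is a minimal single-machine sketch in Python. It only shows the interface (URL in, list of comment URLs out) and how giving comments their own URLs makes threading fall out for free; it does nothing about distribution. The names `normalize`, `CommentIndex`, `post` and `query` are made up for illustration, and the normalization rule (lowercase scheme and host, drop the fragment) is an assumption, not something settled above.

```python
from collections import defaultdict
from urllib.parse import urlsplit, urlunsplit


def normalize(url: str) -> str:
    """Canonicalize a URL so trivially different spellings share one key.

    Assumption: scheme and host are case-insensitive and the fragment is dropped.
    """
    parts = urlsplit(url)
    return urlunsplit(
        (parts.scheme.lower(), parts.netloc.lower(), parts.path or "/", parts.query, "")
    )


class CommentIndex:
    """Hypothetical local index: target URL -> URLs of comments about it."""

    def __init__(self):
        self._by_target = defaultdict(list)

    def post(self, target_url: str, comment_url: str) -> None:
        """Record that the comment at comment_url is about target_url."""
        self._by_target[normalize(target_url)].append(comment_url)

    def query(self, target_url: str) -> list:
        """Return the comment URLs recorded for target_url (empty list if none)."""
        return list(self._by_target.get(normalize(target_url), []))


index = CommentIndex()
page = "http://www.sleazy.com/cheap/filling/horsemeat.html"
index.post(page, "http://alice.example/comments/1")
# A reply is just a comment whose target is another comment's URL.
index.post("http://alice.example/comments/1", "http://bob.example/comments/7")

print(index.query("HTTP://WWW.Sleazy.com/cheap/filling/horsemeat.html#top"))
print(index.query("http://alice.example/comments/1"))
```

The efficiency worry shows up right away: walking a thread costs one query per comment, and each of those would be a network round trip once the index is spread across peers.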

But how do you do the query? How do you ask the world at large, "say, what do you think of http://www.sleazy.com/cheap/filling/horsemeat.html"?

Date: 2004-03-03 05:37 pm (UTC)
From: [identity profile] psychospunk.livejournal.com
There used to be an application like this a few years back on Windows. I can't remember the name of it anymore, though. Basically, a third party host would have the comments, and when you were running this application, it would allow you to view commentary. Now mind you, this is all from memory, and I never used it myself, but it's been tried unless I dreamed it.

Date: 2004-03-03 05:50 pm (UTC)
From: [identity profile] queen-elvis.livejournal.com
I envision a very low signal-to-noise ratio for these comments. Which I guess is the problem with any internet communication device.

Date: 2004-03-03 05:54 pm (UTC)
From: [identity profile] sui66iy.livejournal.com
There was an app called "Third Voice" that did this back in the heady dot-com days. It was a browser add-on, and used a centralized storage model. ESGear had similar capabilities, I think.

As for distributed query algorithms, you might find MIT's Chord (http://www.pdos.lcs.mit.edu/chord/) work interesting.

The problems that these usually run into are more on the social end: how do you get a large enough group of people to play, so that it's interesting? And once a large enough group is interested, how do you filter out all the crap? You need some sort of community moderation system.

pah

Date: 2004-03-03 07:43 pm (UTC)
From: [identity profile] anukul.livejournal.com
NCSA was working on this in 1993: group annotations in Mosaic 1.2 (http://web.archive.org/web/20010605013821/http://archive.ncsa.uiuc.edu/SDG/Software/Mosaic/Docs/group-annotations.html).

Sadly they gave up on some short term goodness because they got all architecturally wanky about scalability (http://archive.ncsa.uiuc.edu/SDG/Software/Mosaic/Notes/annotations-and-news.html).

People shouldn't throw away compelling code before it has a chance to have scalability problems. Successful prototypes inspire scalability solutions; reflecting on architecture often results in nothing but reflection.

Re: pah

Date: 2004-03-03 07:47 pm (UTC)
From: [identity profile] angelbob.livejournal.com
Fair enough. I'm reading the Chord paper now. So far, Anton's stuff looks better.

However, there's some more interesting stuff toward the end of the paper, so maybe that'll handle the problems in question more gracefully.

I'm actually looking at it for annotations of stuff other than web pages, but since the problem is 90% the same, I figured I'd ask in terms of web pages. People have heard of them and know what they are.

Re: pah

Date: 2004-03-03 08:32 pm (UTC)
From: [identity profile] sui66iy.livejournal.com
How does Anton's stuff work?

Re: pah

Date: 2004-03-04 12:24 am (UTC)
From: [identity profile] angelbob.livejournal.com
For starters, it solves a subtly different problem -- Chord looks up values based on SHA-1 (or whatever, it's easy to change) hashes of keys. It does this with what it calls a finger table, which is basically the same idea as a node of a skip list, but applied to routing.
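For anyone who hasn't read the paper, here is a rough Python sketch of that lookup on a static ring. It is a toy under stated assumptions, not Chord as published: the identifier space is cut down to 16 bits (`M` is my choice; with SHA-1 the real space is 160 bits), the whole ring lives in one process, and there are no joins, failures or stabilization. The names `chord_id`, `in_interval`, `Ring` and `lookup` are made up for illustration.

```python
import hashlib
from bisect import bisect_left

M = 16          # identifier bits; an assumption to keep the toy small
RING = 2 ** M   # size of the identifier circle


def chord_id(key: str) -> int:
    """Hash a key (e.g. a URL or a peer name) onto the ring with SHA-1."""
    digest = hashlib.sha1(key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % RING


def in_interval(x: int, a: int, b: int) -> bool:
    """True if x lies in the ring interval (a, b], allowing wrap-around."""
    if a < b:
        return a < x <= b
    return x > a or x <= b  # wrapped interval; a == b covers the whole ring


class Ring:
    """Static ring: every node's finger table is computed up front."""

    def __init__(self, node_ids):
        self.nodes = sorted(set(node_ids))
        # finger[i] of node n is the first node at or after n + 2**i.
        self.fingers = {
            n: [self.successor((n + 2 ** i) % RING) for i in range(M)]
            for n in self.nodes
        }

    def successor(self, ident: int) -> int:
        """First node whose id is >= ident, wrapping past the top of the ring."""
        i = bisect_left(self.nodes, ident)
        return self.nodes[i % len(self.nodes)]

    def lookup(self, start: int, key_id: int):
        """Route from node `start` to the node owning key_id; return (owner, hops)."""
        node, hops = start, 0
        while True:
            succ = self.fingers[node][0]
            if in_interval(key_id, node, succ):
                return succ, hops
            # Jump to the farthest finger that still strictly precedes the key.
            nxt = succ
            for f in reversed(self.fingers[node]):
                if f != key_id and in_interval(f, node, key_id):
                    nxt = f
                    break
            node, hops = nxt, hops + 1


ring = Ring(chord_id(f"peer-{i}") for i in range(32))
key = chord_id("http://www.sleazy.com/cheap/filling/horsemeat.html")
owner, hops = ring.lookup(ring.nodes[0], key)
print(f"key {key} is owned by node {owner}, reached in {hops} hops")
```

Each hop jumps to the farthest finger that doesn't overshoot the key, so the remaining distance roughly halves every time; that's the skip-list flavor.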

Anton's stuff works in several dimensions (splits the key up into an N-d entity rather than working on 1-D) and has some reliability metrics built into a cache heuristic. It's similar -- they're both multires, and both rely on probabilistic algorithms to avoid losing keys when nodes go down. Anton's version is just multidimensional, and less rigid in operation. That makes it harder to prove reliability in awful cases (which turns out not to matter in his specific problem domain), but avoids an O(log^2 N) hit every time a node connects or disconnects.

Re: pah

Date: 2004-03-04 05:47 am (UTC)
From: [identity profile] sui66iy.livejournal.com
Yes, Chord's virtues are its simplicity (which is especially nice if you hope to get many people to code interoperable versions) and its provable performance characteristics.

It's hard to know if this intuition is correct without understanding the details, but in more "natural" cases of multi-dimensional indexing (I'm thinking of spatial indexes), indexes with large dimensionalities have a disturbing tendency to degrade into linear lookups (basically because there's no way to guarantee that you can build a balanced tree of hypervolumes). But that fear may not be relevant at all since I really don't know what you mean by "splits the key up into an N-d entity".

More links...

Date: 2004-03-03 10:22 pm (UTC)
From: [identity profile] gizbot.livejournal.com
My poor brain is leaky; I remember about four companies in the boom based on this technology. There were a few interesting copyright issues and most implemented as a plug-in that allowed some sort of layer with comments. Variations were based around the level of glitz on the layers and how to manage which community's comments made it to your browser. The Gator case made some back down, though Gator obviously mucked with the commercial value of the underlying web page.

The odd spot is that all the basic search terms are too generic: commentary, websites, community, mark-up, layers, etc. You could see a write-up on the old back-link and annotation stuff at Kuro5hin.

It feels like a "do research before writing code" kind of topic.

Re: More links...

Date: 2004-03-15 05:04 pm (UTC)
From: [identity profile] angelbob.livejournal.com
It feels like a "do research before writing code" kind of topic.

It's certainly that. Unfortunately, it appears that pretty much everybody runs a separate annotations server. The closest to the kind of bulletproof distribution I'm looking for is NCSA's small group version, which basically boils down to "run an annotations server on your local LAN and only check that."

The closest to the right vision that I'm seeing so far is Xanadu, and their text is generally so impenetrably self-righteous that it's always hard to tell what they've thought of, and what they've actually got design notes for.

Nonetheless, I'm sifting it for nuggets of anything useful.
