noahgibbs ([personal profile] noahgibbs) wrote 2004-03-03 05:09 pm

Architecture for Metacommentary

I had an idea, related to (but different from) some things that Anton and I have been talking about lately.

The basic concept is: what if you could leave comments about web sites? *Any* web site. And other people could see your comments along with everybody else's. So products for sale, documentation, all the various information on the web, could be rated blog-style at their existing locations.

There are some problems, naturally, even if you can write your own browser to display the stuff automatically.

For instance: who hosts the comments? You don't want a web site to host its own comments, or unfavorable product reviews can be quietly removed by the person advertising the product! Censorship should be the right only of the original poster, and perhaps of some central rating authority, though preferably not that either.

It could be done as a web portal even now: you'd go to the portal, and browse in the normal way but through their stuff. The portal would automatically add the comments at the end and a form to post your own. However, that requires a portal and thus a central authority, which again leads to conflicts of interest.

You'd like the comments to be hosted by the commenters in some sense, both for security and to distribute the server load. But then how could you look up all the comments from all the servers for a particular web site? You could have a central server that collated all that information, but again you're back to the censorship and scalability problems, just on a smaller task.

So if you were going to store and query this stuff in a relatively peer-to-peer way (storage is easy that way, you just host your own comments), you'd need some kind of distributed query mechanism. The string you're querying on would be the URL -- any page you can bookmark, you can comment on. The returned items would be the comments, or perhaps a list of URLs/locations for them. Having the comments have URLs would make replying and threading relatively easy, though the efficiency might suck.
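To make the shape of that query concrete, here is a minimal sketch in Python. It only shows the interface: a page URL goes in, a list of comment URLs comes out. The names `url_key` and `CommentIndex` are made up for illustration, hashing the URL with SHA-1 is an assumption rather than a requirement, and the in-memory dictionary is a stand-in for whatever distributed mechanism would really answer the lookup.

```python
import hashlib
from collections import defaultdict


def url_key(page_url: str) -> str:
    """Turn a page URL into a fixed-size hex key (SHA-1 is an assumed choice)."""
    return hashlib.sha1(page_url.encode("utf-8")).hexdigest()


class CommentIndex:
    """Hypothetical stand-in for the distributed lookup.

    Maps the key of a commented-on page to the URLs where comments live.
    The comments themselves stay hosted by whoever wrote them.
    """

    def __init__(self):
        self._by_key = defaultdict(list)

    def publish(self, page_url: str, comment_url: str) -> None:
        """Announce that a comment about page_url lives at comment_url."""
        self._by_key[url_key(page_url)].append(comment_url)

    def lookup(self, page_url: str) -> list:
        """Return the known comment URLs for page_url (empty list if none)."""
        return list(self._by_key.get(url_key(page_url), []))


index = CommentIndex()
page = "http://www.sleazy.com/cheap/filling/horsemeat.html"
index.publish(page, "http://example.org/comments/1")
found = index.lookup(page)
```

Because each comment is itself addressed by a URL, a reply is just another `publish` call whose page is the comment's URL, which is where the threading would come from.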

But how do you do the query? How do you ask the world at large, "say, what do you think of http://www.sleazy.com/cheap/filling/horsemeat.html"?

Re: pah

[identity profile] sui66iy.livejournal.com 2004-03-03 08:32 pm (UTC)
How does Anton's stuff work?

Re: pah

[identity profile] angelbob.livejournal.com 2004-03-04 12:24 am (UTC)
For starters, it solves a subtly different problem -- Chord looks up values based on SHA-1 (or whatever, it's easy to change) hashes of keys. It does this with what it calls a finger list, which is basically the same idea as a node of a skip list, but applied to routing.
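A rough sketch of the finger list idea, in Python, assuming the standard Chord construction where entry i of node n's table points at the first node at or after n + 2^i on the identifier ring. The 16-bit ring size and the helper names (`chord_id`, `successor`, `finger_table`) are illustrative assumptions; real Chord uses the full hash width and builds these tables through network messages, not from a global sorted list.

```python
import hashlib
from bisect import bisect_left

M = 16          # identifier bits; an assumed small value to keep the example readable
RING = 2 ** M   # size of the identifier ring


def chord_id(name: str) -> int:
    """Map a key or node name onto the ring using SHA-1, reduced to M bits."""
    digest = hashlib.sha1(name.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % RING


def successor(node_ids, ident):
    """First node at or clockwise after ident. node_ids must be sorted."""
    i = bisect_left(node_ids, ident % RING)
    return node_ids[i % len(node_ids)]  # wrap around past the highest node


def finger_table(node_ids, n):
    """Entry i is the successor of n + 2**i, so the spacing doubles each step."""
    return [successor(node_ids, (n + 2 ** i) % RING) for i in range(M)]


nodes = [10, 200, 4000, 60000]     # hypothetical node identifiers, sorted
fingers = finger_table(nodes, 10)  # the finger list held by node 10
```

The doubling distances are what make it resemble a skip list node: the early entries point at near neighbors, the late ones jump halfway around the ring, and a lookup can roughly halve the remaining distance on each hop.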

Anton's stuff works in several dimensions (splits the key up into an N-d entity rather than working on 1-D) and has some reliability metrics built into a cache heuristic. It's similar -- they're both multires, and both rely on probabilistic algorithms to avoid losing keys when nodes go down. Anton's version is just multidimensional, and less rigid in operation. That makes it harder to prove reliability in awful cases (which turns out not to matter in his specific problem domain), but avoids an O(log^2 N) hit every time a node connects or disconnects.

Re: pah

[identity profile] sui66iy.livejournal.com 2004-03-04 05:47 am (UTC)
Yes, Chord's virtues are its simplicity (which is especially nice if you hope to get many people to code interoperable versions) and its provable performance characteristics.

It's hard to know if this intuition is correct without understanding the details, but in more "natural" cases of multi-dimensional indexing (I'm thinking of spatial indexes), indexes with large dimensionalities have a disturbing tendency to degrade into linear lookups (basically because there's no way to guarantee that you can build a balanced tree of hypervolumes). But that fear may not be relevant at all since I really don't know what you mean by "splits the key up into an N-d entity".