noahgibbs: Me and my teddy bear at Karaoke after a day of RubyKaigi in HIroshima in 2017 (Default)
noahgibbs ([personal profile] noahgibbs) wrote2004-02-18 12:52 am

Algorithms Question

If you're not into programming, you're probably happier if you skip this.

Say you've got a graph. It's not a tree (i.e. it has cycles), and it's fully-connected (no subsets of it that aren't connected to the main graph). You've got the list of nodes and the list of connections.

You'd like to find the longest shortest path. That sounds silly, so lemme 'splain. For any two nodes there's a shortest path between them. If you were to look at every pair of nodes and find the shortest path, you'd like the longest of those.

So if you had a graph like this:

  A--o--o--o--o--o--o--o
           |           |
           o--o--o--o--o
                 |
           B--o--o


The longest shortest path would be between A and B, so length 9. The longest shortest path would *still* go the short way 'round the cycle (because it's a shortest path), so it would be 9 and not 13.

You're not guaranteed that there *are* any leaf nodes (remember, this isn't a tree, there are cycles), so a simple Minimum Spanning Tree won't help so much...

This stuff is for a nifty network connectivity problem that a friend is working on. I've written a little statistical simulator for these things, and I'm trying to find the longest shortest path in a 30,000-node graph in less than O(n^2) time...

[identity profile] luwenth.livejournal.com 2004-02-18 01:05 am (UTC)(link)
What's the maximum number of connections that a node can have? I think this is important for some reason...

[identity profile] angelbob.livejournal.com 2004-02-18 01:11 am (UTC)(link)
Not currently limited. I expect that in practice it's rarely greater than ten, and it'll average three or four.

But in theory, 29,999 nodes could all randomly just happen to connect to a single central node...

Re: An algorithm

[identity profile] luwenth.livejournal.com 2004-02-18 02:10 am (UTC)(link)
No, it doesn't...

Border cases:
  • 1 node
  • 2 nodes
  • 3 nodes in a line.
  • 3 nodes in a triangle.
  • N nodes in a circle. (Shortest longest path should be floor of N/2)
  • 4 nodes in a P shape (connections A-B-C with B-D and C-D).


  • Start at a point on the graph, assign it a depth of 0.
  • If there are no connecting nodes, yell current depth and node. Terminate. (1 node case.)
  • Loop Start

    • Increment your depth.
    • For each connecting node:

      • If the node has no depth, assign the current depth to the node. Iterate.
      • If the node has a depth, and the depth is larger, assign the current depth to the node. Iterate. (You've just shortcut a loop.)
      • If the node has a depth, and the value is smaller, And the difference is > 1, assign the depth to the depth of the current node. Iterate. (You're not at an end, and there is a shorter way of getting here.)
      • If the node has a depth, and the value is the same as the current depth, terminate quietly. (This path has been traversed, so die quietly.)
      • If the node has a depth, and the value is smaller, And the difference is <= 1, yell the current depth and node. Terminate. (You've reached an end, and the depth is equal both ways of getting here.)


Of course, I just realized that I'm expecting that each of the explorers be able to yell a number and a node back at a central listener. As each explorer hits bottom, it will yell out a number and a node. If the central already has a depth for that node, and the new depth is smaller (which it invariably will be) then the new depth is recorded. Then once all the explorers have completed, the two nodes the listener knows about are the farthest apart, and the sum of their depths is the shortest longest path.

But short of that little issue, I think I've walked through all the potentials and it behaves appropriate, and terminates when it should (loudly or quietly).

No, I don't want to code it up :)

Re: An algorithm

[identity profile] luwenth.livejournal.com 2004-02-18 02:14 am (UTC)(link)
Errr.. "the two nodes the listener knows about" should be "the two largest nodes the listener knows about" :)

Re: An algorithm

[identity profile] luwenth.livejournal.com 2004-02-18 02:42 am (UTC)(link)
And nevermind again, my brilliance turned into bullshit *sigh*

Re:

[identity profile] r-transpose-p.livejournal.com 2004-02-18 11:13 am (UTC)(link)
it probably does matter.

Dijkstra's Algorithm can find the shortest path between all pairs of nodes in O(n3) time in a dense graph, but
O(log(E)n2) time in a sparse graph, giving you an
O(n2)
algorithm right out of the book if the number of outgoing edges per node is relatively small..

Re:

[identity profile] angelbob.livejournal.com 2004-02-18 11:28 am (UTC)(link)
Yeah. I *was* hoping for sub-N^2 (it's a very sparse graph, number of edges is approximately 3 times number of vertices), but this'll do.

Re:

[identity profile] r-transpose-p.livejournal.com 2004-02-18 11:34 am (UTC)(link)
Oh, also, I think that the "n squared times log E" (n*logE for single-source shortest path, but n^2*log(E) for your problem) "Dijkstra's with a heap" algorithm that mathworld et al keep blabbering about is really just "breadth-first search"

Why are resources on the web so damned hard to read, when the underlying algorithms they're talking about are so damned simple?

Re:

[identity profile] r-transpose-p.livejournal.com 2004-02-18 11:36 am (UTC)(link)
Also, note, i lied when i said n2

"E" is the total number of edges in the graph, not the outgoing number per node. So make that n2log(n) for your 3 outgoing edges per node problem. (the log is from the use of the heap for a priority queue in BFS, and the E is a nastily approximate upper bound of the number of things that can be in the priority queue at a given time)

[identity profile] bredmold.livejournal.com 2004-02-18 01:32 am (UTC)(link)
I should know this. I'll post a few random thoughts, but all the relevant reference volumes are at my office right now.

Your longest shortest path is also going to cross at least one cut edge. So, you should be able to get a lot of information out of calculating the minimum cut... which is doable in O(|V|+|E|) time if I recall (if I'm wrong, then it's O(|V|*|E|), which is a much bigger number).

Even if you just do the simple O(|V|^2) algorithm, I'll bet you can improve it by memoizing (bet you haven't heard that word since you were at CMU) all the intermediate results.

If all else fails, I'll try and remember to follow up on this tomorrow when I have some reference books in front of me.

[identity profile] angelbob.livejournal.com 2004-02-18 01:49 am (UTC)(link)
Your longest shortest path is also going to cross at least one cut edge.

Hm. I don't think I'm familiar with minimum cuts. Remind me what that is? I tried a lot of variations involving minimum spanning trees, if that's similar.

Even if you just do the simple O(|V|^2) algorithm, I'll bet you can improve it by memoizing (bet you haven't heard that word since you were at CMU) all the intermediate results.

You can, but I'd *really* like to avoid memoizing on an O(N^2)-in-space algorithm for 30,000 nodes if it's at all possible...

Also, it's not clear what you'd memoize. Shortest paths between various nodes, I suppose, so you could basically add virtual edges to the graph (marked with edge lengths which are the shortest path) until it's fully connected. Which is what the memoizing would wind up doing.

Hm. That'd still be a lot faster than several of the things I was contemplating, though.

Minimum cut

[identity profile] bredmold.livejournal.com 2004-02-18 02:05 am (UTC)(link)
A minimum cut is minimal set of edges (either the least cardinality or the least total weight in a weighted graph) that must be removed from a connected graph to render it disconnected (at least two separate components). A minimum cut is not necessarily unique, but neither is a longest shortest path.

Re: Minimum cut

[identity profile] angelbob.livejournal.com 2004-02-18 02:15 am (UTC)(link)
Hm. Okay, I could see that. Seems like you'd need to calculate a lot of them and try them all, though... Since, y'know, the longest shortest path could be entirely outside the bit you just disconnected from the main bulk of the graph.

[identity profile] r-transpose-p.livejournal.com 2004-02-18 11:26 am (UTC)(link)
Also, do you need the *exact* longest shortest path?

[identity profile] angelbob.livejournal.com 2004-02-18 11:29 am (UTC)(link)
There can be multiple. More to the point, I don't need the endpoints or the series of nodes, just the length.

Re:

[identity profile] angelbob.livejournal.com 2004-02-18 11:30 am (UTC)(link)
Or do you mean "would something that's probably very long, relatively, but not necessarily the longest, also work?"

Re:

[identity profile] r-transpose-p.livejournal.com 2004-02-18 11:39 am (UTC)(link)
Note that if you pick a random node, find the shortest paths to all nodes from this random node, and call the lengths of the two longest shortest paths from this node l1 and l2, that the longest shortest path in the graph must be between
l1 and l1+l2

Re:

[identity profile] r-transpose-p.livejournal.com 2004-02-18 11:40 am (UTC)(link)
by
that the longest shortest path in the graph must be between


I meant

that the length of the longest shortest path in the graph must be between

some links

[identity profile] treys.livejournal.com 2004-02-18 11:41 am (UTC)(link)
The straightforward approaches. (http://www.cs.rochester.edu/u/leblanc/csc173/graphs/apsp.html)

A fast heuristic approach. (http://www.dl.ac.uk/TCSC/Staff/Hu_Y_F/PROJECT/pdcp_siam/node8.html)

A very thorough tech report on the subject. (http://citeseer.nj.nec.com/pettie02faster.html)