-
Notifications
You must be signed in to change notification settings - Fork 91
Description
The method #<=> does not behave in the way the doc says.
The doc of #<=> says
Comparison is based on the natural ordering of the node name objects.
Returns: (Integer) — +1 if this node is a ‘successor’, 0 if equal and -1 if this node is a ‘predecessor’. Returns ‘nil’ if the other object is not a ‘Tree::TreeNode’.
So, actually, the descriptions contradict.
In the current version (2.0.0 and HEAD(60629c1)), #<=> behaves in the former way, that is, the comparison is based on #name and is valid only for the descendants of Tree::TreeNode.
Personally, I think the latter should be the way.
I have forked the repository to implement it (see below).
Here is the justification for the latter policy, that i, judging on the basis of "successor" or "predecessor".
- Arguably, the comparison based on
#nameis not intuitive, when there are definitive "successor-predecessor" or parent-child (and sibling) relationships between elements inTreeNode. Personally, when I first used this library, I didn't imagine that would be the case (and scratched my head, not understanding how it worked)... According to the doc, the meaning of#nameis no more than the ID, and the fact that it is the key parameter for Comparable is mentioned only in the above-mentioned sentence in the method#<=>, which is a bit obscured and surprising... - Currently, nodes that have different roots are comparable, which is in my opinion neither intuitive nor useful (in most cases).
- Comparison based on
#namecan be very easily implemented with users anyway, such as,sort{|a, b| a.name <=> b.name}. - How to define "successor-predecessor" is not definitive in the sense not everyone agrees completely. I can think of a few candidates as listed below.
- Perhaps (though arguable), the most convincing comparison with the spaceship operator
<=>(and hence<and>) would be to follow the order of#each. If an instance A appears before instance B in#each,A < Bor(A <=> B) == -1holds. If their root nodes are different, it returns nil, because it does not make sense to compare them. - An alternative is to follow the order of breadth-first search; that is, If an instance A appears before instance B in
#breadth_each, thenA < Bor(A <=> B) == -1holds. - Another possibility is similar but more restrictive, and returns
nilif they are not in the direct ancestor-descendant line or direct siblings, that is, uncle-nephew relationships or cousin relationships would returnnil. - Last and most strict criterion is similar to the previous one but returns
nilif they are siblings, namely only ancestor-descendant relationships are allowed.
- Perhaps (though arguable), the most convincing comparison with the spaceship operator
I am well aware this would be in practice a backward-incompatible change in the specifications, although it does follow the current doc (at least in the second part), and that is the only cons I can think of.
I have forked the repository from the current head (60629c1 on 24 Jun) and implemented the above-mentioned four algorithms plus the current <=> one in the new method #cmp. The method #cmp takes an optional parameter :policy for which one of :each (Default) (above-mentioned case 1), :breadth_each (case 2), :direct_or_sibling (case 3), :direct_only (case 4), and :name (emulation of the current <=>) is accepted. See def test_cmp in /test/test_tree.rb for demonstrations.
I note that methods < and > in the forked repo are unaffected and behave in the same way as the current <=> because #<=> is not modified.
My fork is https://github.com/masasakano/RubyTreeCmpOperator and the current head is 1f6127a. All tests (I added over 100 assertions for the new method) have passed.
A final note is, if you stick to the current "name"-based comparison, I think it is conceptually better to replace return nil in def <=>(other) with return super (which should in practice always return nil, that is, there would be no change in behaviour).
Thank you,
Masa