Python 3 __cmp__ semantic change?


Johannes Bauer

2008-11-20 23:44:09 UTC

If it's not present then it would be worth reporting it as a 3.0 bug -
there's still time to get it in, as the release isn't due until early
December.

Seems it was removed on purpose - I'm sure there was a good reason for
that, but may I ask why? Instead of the sleek __cmp__ function I had
earlier, I now have code like:

def __lt__(self, other):
    return self.__cmp__(other) < 0

def __le__(self, other):
    return self.__cmp__(other) < 0

def __gt__(self, other):
    return self.__cmp__(other) > 0

def __ge__(self, other):
    return self.__cmp__(other) >= 0

Does anyone know the reason why __cmp__ was discarded?

Kind regards,
Johannes

--
"My countersuit against you will then allege deliberate mendacity,
vilification of God, the Bible and me, and deliberate blasphemy."
-- Prophet and visionary Hans Joss aka HJP in de.sci.physik
<48d8bf1d$0$7510$5402220f at news.sunrise.ch>

Steven D'Aprano

2008-11-22 08:53:14 UTC

Granted it's not as efficient as a __cmp__ function.

What makes you say that? What do you mean by "efficient"? Are you
talking about memory footprint, runtime speed, disk-space, programmer
efficiency, algorithmic complexity, or something else?

What I'm talking about is very simple - and explained below, with the
help of your __cmp__ method.

return cmp(self.num*other.den, self.den*other.num)

I'm talking about runtime speed (*not* asymptotic complexity). My code
makes Fraction.__gt__ about twice as slow as Fraction.__lt__ or
Fraction.__eq__ even though with __cmp__ they would all be equally fast.

Sounds like a premature micro-optimization to me. On my machine, running
Python 2.5, the speed difference is nothing like twice as slow.
>>> class UseCmp:
...     def __init__(self, num, den=1):
...         self.num = num
...         self.den = den
...     def __cmp__(self, other):
...         return cmp(self.num*other.den, self.den*other.num)
...
>>> class UseRichCmp(UseCmp):
...     __lt__ = lambda self, other: self.__cmp__(other) < 0
...
>>> from timeit import Timer
>>> t1 = Timer('x < y',
...     'from __main__ import UseCmp; x=UseCmp(3, 5); y=UseCmp(1, 2)')
>>> t2 = Timer('x < y',
...     'from __main__ import UseRichCmp;'
...     'x=UseRichCmp(3, 5); y=UseRichCmp(1, 2)')
>>> t1.repeat()
[3.3418200016021729, 2.4046459197998047, 2.2295808792114258]
>>> t2.repeat()
[3.8954730033874512, 3.0240590572357178, 3.5528950691223145]

There's a slight speed difference, around 35% slower. But the random
variation in speed is almost 50%, so I would conclude from this trial
that there is no *significant* speed difference between the methods.
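
A comparable measurement can be sketched for Python 3 as follows. This is an illustration only: the class names `Direct` and `ViaHelper` and the helper `_cmp` are hypothetical, and absolute numbers will vary by machine. Taking the minimum over several repeats, as the `timeit` documentation suggests, damps the kind of run-to-run noise seen in the figures above.

```python
from timeit import Timer

class Direct:
    """Defines __lt__ directly on the cross-multiplied values."""
    def __init__(self, num, den=1):
        self.num, self.den = num, den

    def __lt__(self, other):
        return self.num * other.den < self.den * other.num

class ViaHelper(Direct):
    """Routes __lt__ through a three-way helper (hypothetical name _cmp)."""
    def _cmp(self, other):
        a, b = self.num * other.den, self.den * other.num
        return (a > b) - (a < b)

    def __lt__(self, other):
        return self._cmp(other) < 0

def best_time(cls, repeat=5, number=20000):
    x, y = cls(3, 5), cls(1, 2)
    timer = Timer("x < y", globals={"x": x, "y": y})
    # The minimum is the least noisy estimate of the statement's cost.
    return min(timer.repeat(repeat=repeat, number=number))

for cls in (Direct, ViaHelper):
    print(cls.__name__, best_time(cls))
```

The `globals` argument to `Timer` requires Python 3.5 or later; on older versions a setup string importing from `__main__`, as in the session above, does the same job.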

--
Steven

Arnaud Delobelle

2008-11-23 19:04:19 UTC

So how did I get it into my head that defining __eq__ would create the
correct behaviour for __ne__ automatically? And more puzzlingly, how
come it is what actually happens? Which should I believe: the
documentation or the implementation?

http://mail.python.org/pipermail/python-ideas/2008-October/002235.html.

http://bugs.python.org/issue4395
ps to Arnaud: upgrade to rc3, which has bug fixes and many doc changes.

This occurred to me after I posted the example so I did update before
building the docs. When I saw the inconsistency between the docs and
python, I also rebuilt python but it behaved the same.

I didn't try checking bugs.python.org because I can't access python.org
at the moment (thanks Aahz for pointing out it's hosted in the
Netherlands - as there are a number of US hosted sites that I can't
access I just assumed Python was one of them). I guess the bug report
is about updating the docs!

--
Arnaud

Arnaud Delobelle

2008-11-22 10:57:00 UTC

I haven't done any tests but as Fraction.__gt__ calls *both*
Fraction.__eq__ and Fraction.__lt__ it is obvious that it is going to be
roughly twice as slow.

There's a very simple way of emulating Fraction.__cmp__ in Python 3:

def totally_ordered(cls):
    def __lt__(self, other): return self.cmp(other) < 0
    def __eq__(self, other): return self.cmp(other) == 0
    def __gt__(self, other): return self.cmp(other) > 0
    cls.__lt__ = __lt__
    cls.__eq__ = __eq__
    cls.__gt__ = __gt__
    # and same with __le__, __ge__
    return cls

@totally_ordered
class Fraction:
    def __init__(self, num, den=1):
        assert den > 0, "denominator must be > 0"
        self.num = num
        self.den = den
    def cmp(self, other):
        return self.num*other.den - self.den*other.num

It doesn't suffer the speed penalty incurred when defining comparison
operators from __eq__ and __lt__.
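
A version that fills in all six operators might look like the following minimal sketch for Python 3. It assumes the decorated class supplies a `cmp(other)` method returning a negative, zero, or positive number. The names `totally_ordered` and `Fraction` come from the post; the loop over the `operator` module is an illustrative choice, not the code as posted.

```python
import operator

def totally_ordered(cls):
    """Derive all six rich comparisons from a user-supplied cls.cmp()."""
    def make(op):
        # Each operator compares the result of self.cmp(other) with zero.
        def method(self, other):
            return op(self.cmp(other), 0)
        return method

    for name in ("lt", "le", "eq", "ne", "gt", "ge"):
        setattr(cls, "__%s__" % name, make(getattr(operator, name)))
    return cls

@totally_ordered
class Fraction:
    def __init__(self, num, den=1):
        assert den > 0, "denominator must be > 0"
        self.num = num
        self.den = den

    def cmp(self, other):
        # Sign of num1/den1 - num2/den2; valid because both denominators are positive.
        return self.num * other.den - self.den * other.num

half, two_thirds = Fraction(1, 2), Fraction(2, 3)
print(half < two_thirds, half >= two_thirds, half == Fraction(2, 4))
```

Every operator costs exactly one call to `cmp()`, so none of them is slower than the others.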

--
Arnaud

Aahz

2008-11-22 15:31:16 UTC

In article <m2ljvc6nb7.fsf at googlemail.com>,

Post by Arnaud Delobelle
def totally_ordered(cls):
    def __lt__(self, other): return self.cmp(other) < 0
    def __eq__(self, other): return self.cmp(other) == 0
    def __gt__(self, other): return self.cmp(other) > 0
    cls.__lt__ = __lt__
    cls.__eq__ = __eq__
    cls.__gt__ = __gt__
    # and same with __le__, __ge__
    return cls

@totally_ordered
class Fraction:
    def __init__(self, num, den=1):
        assert den > 0, "denominator must be > 0"
        self.num = num
        self.den = den
    def cmp(self, other):
        return self.num*other.den - self.den*other.num

It doesn't suffer the speed penalty incurred when defining comparison
operators from __eq__ and __lt__.

That's true iff __sub__() is not substantially slower than __cmp__();
however, your basic technique is sound and one can easily rewrite your
cmp() to use some other algorithm.
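
One such alternative, sketched here for Python 3 where the builtin `cmp()` no longer exists, replaces the subtraction with two comparisons. The module-level `cmp` helper below is a stand-in written for this example, and the `Fraction` class is trimmed to what the example needs.

```python
def cmp(a, b):
    """Return -1, 0 or 1, mimicking the Python 2 builtin cmp()."""
    # bool is a subclass of int, so True - False == 1 and so on.
    return (a > b) - (a < b)

class Fraction:
    def __init__(self, num, den=1):
        assert den > 0, "denominator must be > 0"
        self.num = num
        self.den = den

    def cmp(self, other):
        # Compare the cross-multiplied values instead of subtracting them.
        return cmp(self.num * other.den, self.den * other.num)

print(Fraction(1, 2).cmp(Fraction(2, 3)))
print(Fraction(2, 4).cmp(Fraction(1, 2)))
```

This form only needs `<` and `>` on the underlying values, so it also works for operands that support ordering but not subtraction.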

--
Aahz (aahz at pythoncraft.com) <*> http://www.pythoncraft.com/

"It is easier to optimize correct code than to correct optimized code."
--Bill Harlan

Steven D'Aprano

2008-11-23 01:06:20 UTC

That's not surprising. You're measuring the wrong things. If you read
what I wrote, you'll see that I'm talking about Fraction.__gt__ being
slower (as it is defined in terms of Fraction.__eq__ and
Fraction.__lt__) when using my 'totally_ordered' decorator.
I haven't done any tests but as Fraction.__gt__ calls *both*
Fraction.__eq__ and Fraction.__lt__ it is obvious that it is going to be
roughly twice as slow.

What's obvious to you and what's obvious to the Python VM are not
necessarily the same thing. I believe you are worrying about the wrong
thing. (BTW, I think your earlier decorator had a bug, in that it failed
to define __ne__ but then called "self != other".) My tests suggest that
relying on __cmp__ is nearly three times *slower* than your decorated
class, and around four times slower than defining all the rich
comparisons directly:

$ python comparisons.py
Testing FractionCmp... 37.4376080036
Testing FractionRichCmpDirect... 9.83379387856
Testing FractionRichCmpIndirect... 16.152534008
Testing FractionDecoratored... 13.2626030445

Test code follows. If I've made an error, please let me know.

[comparisons.py]
from __future__ import division

def totally_ordered(cls):
    # Define total ordering in terms of __eq__ and __lt__ only.
    if not hasattr(cls, '__ne__'):
        def ne(self, other):
            return not self == other
        cls.__ne__ = ne
    if not hasattr(cls, '__gt__'):
        def gt(self, other):
            return not (self < other or self == other)
        cls.__gt__ = gt
    if not hasattr(cls, '__ge__'):
        def ge(self, other):
            return not (self < other)
        cls.__ge__ = ge
    if not hasattr(cls, '__le__'):
        def le(self, other):
            return (self < other or self == other)
        cls.__le__ = le
    return cls

class AbstractFraction:
    def __init__(self, num, den=1):
        if self.__class__ is AbstractFraction:
            raise TypeError("abstract base class, do not instantiate")
        assert den > 0, "denominator must be > 0"
        self.num = num
        self.den = den
    def __float__(self):
        return self.num/self.den

class FractionCmp(AbstractFraction):
    def __cmp__(self, other):
        return cmp(self.num*other.den, self.den*other.num)

class FractionRichCmpDirect(AbstractFraction):
    def __eq__(self, other):
        return (self.num*other.den) == (self.den*other.num)
    def __ne__(self, other):
        return (self.num*other.den) != (self.den*other.num)
    def __lt__(self, other):
        return (self.num*other.den) < (self.den*other.num)
    def __le__(self, other):
        return (self.num*other.den) <= (self.den*other.num)
    def __gt__(self, other):
        return (self.num*other.den) > (self.den*other.num)
    def __ge__(self, other):
        return (self.num*other.den) >= (self.den*other.num)

class FractionRichCmpIndirect(AbstractFraction):
    def __cmp__(self, other):
        return cmp(self.num*other.den, self.den*other.num)
    def __eq__(self, other):
        return self.__cmp__(other) == 0
    def __ne__(self, other):
        return self.__cmp__(other) != 0
    def __lt__(self, other):
        return self.__cmp__(other) < 0
    def __le__(self, other):
        return self.__cmp__(other) <= 0
    def __gt__(self, other):
        return self.__cmp__(other) > 0
    def __ge__(self, other):
        return self.__cmp__(other) >= 0

class FractionDecoratored(AbstractFraction):
    def __eq__(self, other):
        return self.num*other.den == self.den*other.num
    def __lt__(self, other):
        return self.num*other.den < self.den*other.num

FractionDecoratored = totally_ordered(FractionDecoratored)

def test_suite(small, big):
    assert small < big
    assert small <= big
    assert not (small > big)
    assert not (small >= big)
    assert small != big
    assert not (small == big)

from timeit import Timer

test = 'test_suite(p, q)'
setup = '''from __main__ import %s as frac
from __main__ import test_suite
p = frac(1, 2)
q = frac(4, 5)
assert float(p) < float(q)'''

for cls in [FractionCmp, FractionRichCmpDirect, FractionRichCmpIndirect,
            FractionDecoratored]:
    t = Timer(test, setup % cls.__name__)
    print "Testing %s..." % cls.__name__,
    best = min(t.repeat())
    print best
[end comparisons.py]

--
Steven

Terry Reedy

2008-11-23 18:23:39 UTC

So how did I get it into my head that defining __eq__ would create the
correct behaviour for __ne__ automatically? And more puzzlingly, how
come it is what actually happens? Which should I believe: the
documentation or the implementation?

http://mail.python.org/pipermail/python-ideas/2008-October/002235.html.

http://bugs.python.org/issue4395

ps to Arnaud: upgrade to rc3, which has bug fixes and many doc changes.

Arnaud Delobelle

2008-11-23 11:14:23 UTC

(BTW, I think your earlier decorator had a bug, in that it failed to
define __ne__ but then called "self != other".)

That would be true for Python 2.x but I'm explicitly writing code for
Python 3 here, which, IIRC, infers != correctly when you define ==. I
can't refer you to the docs because my internet access to some US sites
seems to be partly broken ATM, but here's a simple example:

Python 3.0rc1+ (py3k:66521, Sep 21 2008, 07:58:29)
[GCC 4.0.1 (Apple Inc. build 5465)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> class A:
...     def __init__(self):
...         self
...
>>> class A:
...     def __init__(self, x):
...         self.x = x
...     def __eq__(self, other):
...         return self.x == other.x
...
>>> a, b, c = A(1), A(1), A(2)
>>> a==b, b==c, c==a
(True, False, False)
>>> a!=b, b!=c, c!=a
(False, True, True)

Getting increasingly frustrated by my inability to reach python.org,
I've compiled the html docs for python 3 in order to find the bit that
explains the behaviour above. But the docs say exactly the opposite:

There are no implied relationships among the comparison
operators. The truth of x==y does not imply that x!=y is
false. Accordingly, when defining __eq__(), one should also define
__ne__() so that the operators will behave as expected. See the
paragraph on __hash__() for some important notes on creating
hashable objects which support custom comparison operations and are
usable as dictionary keys.

So how did I get it into my head that defining __eq__ would create the
correct behaviour for __ne__ automatically? And more puzzlingly, how
come it is what actually happens? Which should I believe: the
documentation or the implementation?
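
For reference, released Python 3 versions document the behaviour observed above: by default `__ne__()` delegates to `__eq__()` and inverts the result unless it is `NotImplemented`. A minimal sketch that makes this visible follows; the class `A` mirrors the session above, and the `isinstance` guard is an addition for this example.

```python
class A:
    def __init__(self, x):
        self.x = x

    def __eq__(self, other):
        if not isinstance(other, A):
            return NotImplemented
        return self.x == other.x

a, b, c = A(1), A(1), A(2)

# A defines no __ne__ of its own; the inherited object.__ne__ inverts __eq__.
print("__ne__" in vars(A))
print(a != b, b != c)
```

The other operators get no such default: `__lt__`, `__le__`, `__gt__` and `__ge__` still have to be written or generated explicitly.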

--
Arnaud

Arnaud Delobelle

2008-11-23 16:12:11 UTC

So how did I get it into my head that defining __eq__ would create the
correct behaviour for __ne__ automatically? And more puzzlingly, how
come it is what actually happens? Which should I believe: the
documentation or the implementation?

http://mail.python.org/pipermail/python-ideas/2008-October/002235.html.
George

Ah thanks, I knew I must have had this idea from somewhere. I do
remember about this thread on python-ideas now.

--
Arnaud

George Sakkis

2008-11-23 15:16:19 UTC

So how did I get it into my head that defining __eq__ would create the
correct behaviour for __ne__ automatically? And more puzzlingly, how
come it is what actually happens? Which should I believe: the
documentation or the implementation?

According to Guido, the implementation:
http://mail.python.org/pipermail/python-ideas/2008-October/002235.html.

George

Aahz

2008-11-23 18:11:52 UTC

In article <m2fxlig3ko.fsf at googlemail.com>,

I can't refer you to the docs because my internet access to some US

BTW, python.org is hosted at XS4ALL in the Netherlands, so if you can't
reach it, it has nothing to do with your US access.

--
Aahz (aahz at pythoncraft.com) <*> http://www.pythoncraft.com/

"It is easier to optimize correct code than to correct optimized code."
--Bill Harlan

Arnaud Delobelle

2008-11-23 10:05:27 UTC

Post by Steven D'Aprano

That's not surprising. You're measuring the wrong things. If you read
what I wrote, you'll see that I'm talking about Fraction.__gt__ being
slower (as it is defined in terms of Fraction.__eq__ and
Fraction.__lt__) when using my 'totally_ordered' decorator.
I haven't done any tests but as Fraction.__gt__ calls *both*
Fraction.__eq__ and Fraction.__lt__ it is obvious that it is going to be
roughly twice as slow.

What's obvious to you and what's obvious to the Python VM are not
necessarily the same thing. I believe you are worrying about the wrong
thing.

All I was asserting was that using my decorator, Fraction.__gt__ would
be roughly twice as slow as Fraction.__eq__ or Fraction.__lt__. I was
not worried about it at all! Your tests below, although very
interesting, don't shed any light on this.

Post by Steven D'Aprano
(BTW, I think your earlier decorator had a bug, in that it failed to
define __ne__ but then called "self != other".)

That would be true for Python 2.x but I'm explicitly writing code for
Python 3 here, which, IIRC, infers != correctly when you define ==. I
can't refer you to the docs because my internet access to some US sites
seems to be partly broken ATM, but here's a simple example:

Python 3.0rc1+ (py3k:66521, Sep 21 2008, 07:58:29)
[GCC 4.0.1 (Apple Inc. build 5465)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> class A:
...     def __init__(self):
...         self
...
>>> class A:
...     def __init__(self, x):
...         self.x = x
...     def __eq__(self, other):
...         return self.x == other.x
...
>>> a, b, c = A(1), A(1), A(2)
>>> a==b, b==c, c==a
(True, False, False)
>>> a!=b, b!=c, c!=a
(False, True, True)

Post by Steven D'Aprano
My tests suggest that relying on __cmp__ is nearly three times
*slower* than your decorated class, and around four times slower than
defining all the rich comparisons directly:

$ python comparisons.py
Testing FractionCmp... 37.4376080036
Testing FractionRichCmpDirect... 9.83379387856
Testing FractionRichCmpIndirect... 16.152534008
Testing FractionDecoratored... 13.2626030445

Test code follows. If I've made an error, please let me know.

[snip test code]

If anything these tests make a retroactive case for getting rid of
__cmp__ as it seems really slow. Even FractionRichCmpIndirect (which is
the same as applying my second totally_ordered decorator to FractionCmp)
is almost twice as fast as FractionCmp. I would guess it is because
when doing e.g.

a < b

The VM will first look for

a.__lt__(b)

and fail, wasting some time in the process. Then it would look for

a.__cmp__(b)

Whereas when __lt__ is explicitly defined, the first step always
succeeds.
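
Python 3 no longer has a `__cmp__` step, but a two-stage lookup of a similar shape can still be observed: the left operand's method is tried first, and if it returns `NotImplemented` the reflected method of the right operand is tried. The sketch below records the calls; the class names `Left` and `Right` and the `calls` list are hypothetical, chosen only for this illustration.

```python
calls = []

class Left:
    def __lt__(self, other):
        calls.append("Left.__lt__")
        # Signal that this operand cannot decide the comparison.
        return NotImplemented

class Right:
    def __gt__(self, other):
        calls.append("Right.__gt__")
        return True

result = Left() < Right()
print(calls, result)
```

If both attempts return `NotImplemented`, Python 3 raises `TypeError` rather than falling back to anything else.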

--
Arnaud

Arnaud Delobelle

2008-11-22 09:10:04 UTC

Post by Steven D'Aprano

Granted it's not as efficient as a __cmp__ function.

What makes you say that? What do you mean by "efficient"? Are you
talking about memory footprint, runtime speed, disk-space, programmer
efficiency, algorithmic complexity, or something else?

What I'm talking about is very simple - and explained below, with the
help of your __cmp__ method.

return cmp(self.num*other.den, self.den*other.num)

I'm talking about runtime speed (*not* asymptotic complexity). My code
makes Fraction.__gt__ about twice as slow as Fraction.__lt__ or
Fraction.__eq__ even though with __cmp__ they would all be equally fast.

Sounds like a premature micro-optimization to me. On my machine, running
Python 2.5, the speed difference is nothing like twice as slow.
>>> class UseCmp:
...     def __init__(self, num, den=1):
...         self.num = num
...         self.den = den
...     def __cmp__(self, other):
...         return cmp(self.num*other.den, self.den*other.num)
...
>>> class UseRichCmp(UseCmp):
...     __lt__ = lambda self, other: self.__cmp__(other) < 0
...
>>> from timeit import Timer
>>> t1 = Timer('x < y',
...     'from __main__ import UseCmp; x=UseCmp(3, 5); y=UseCmp(1, 2)')
>>> t2 = Timer('x < y',
...     'from __main__ import UseRichCmp;'
...     'x=UseRichCmp(3, 5); y=UseRichCmp(1, 2)')
>>> t1.repeat()
[3.3418200016021729, 2.4046459197998047, 2.2295808792114258]
>>> t2.repeat()
[3.8954730033874512, 3.0240590572357178, 3.5528950691223145]

There's a slight speed difference, around 35% slower. But the random
variation in speed is almost 50%, so I would conclude from this trial
that there is no *significant* speed difference between the methods.

That's not surprising. You're measuring the wrong things. If you read
what I wrote, you'll see that I'm talking about Fraction.__gt__ being
slower (as it is defined in terms of Fraction.__eq__ and
Fraction.__lt__) when using my 'totally_ordered' decorator.

I haven't done any tests but as Fraction.__gt__ calls *both*
Fraction.__eq__ and Fraction.__lt__ it is obvious that it is going to be
roughly twice as slow.

--
Arnaud

skip

2008-11-20 23:58:52 UTC

Johannes> Seems it was removed on purpose - I'm sure there was a good
Johannes> reason for that, but may I ask why?

Start here:

http://www.mail-archive.com/python-3000@python.org/msg11474.html

Also, a comment to this blog post suggests creating a CmpMixin:

http://oakwinter.com/code/porting-setuptools-to-py3k/
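
One way such a mixin could look is sketched below. This is not the code from the blog comment, which is not reproduced here. Only the name `CmpMixin` is taken from the post; the `IP` example class and its `value` attribute are hypothetical.

```python
class CmpMixin:
    """Map the six rich comparisons onto a subclass-provided __cmp__."""
    def __eq__(self, other): return self.__cmp__(other) == 0
    def __ne__(self, other): return self.__cmp__(other) != 0
    def __lt__(self, other): return self.__cmp__(other) < 0
    def __le__(self, other): return self.__cmp__(other) <= 0
    def __gt__(self, other): return self.__cmp__(other) > 0
    def __ge__(self, other): return self.__cmp__(other) >= 0

class IP(CmpMixin):
    def __init__(self, value):
        self.value = value

    def __cmp__(self, other):
        # Python 3 has no builtin cmp(); this expression yields -1, 0 or 1.
        return (self.value > other.value) - (self.value < other.value)

print(IP(1) < IP(2), IP(2) <= IP(2), IP(3) == IP(4))
```

Because the mixin defines `__eq__` without `__hash__`, instances are unhashable on Python 3 unless the subclass adds a `__hash__` of its own.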

Skip

Johannes Bauer

2008-11-20 21:18:09 UTC

Hello group,

I'm porting some code of mine to Python 3. One class has the __cmp__
operator overloaded, but comparison doesn't seem to work anymore with that:

Traceback (most recent call last):
  File "./parse", line 25, in <module>
    print(x < y)
TypeError: unorderable types: IP() < IP()

Was there some kind of semantic change?
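
For readers who want to reproduce this, a minimal sketch follows. The `IP` class name comes from the traceback; its `value` attribute and the body of `__cmp__` are assumptions made up for illustration. On Python 3 the comparison operators never call `__cmp__`, so `<` raises `TypeError` (the exact wording of the message differs between 3.x releases).

```python
class IP:
    def __init__(self, value):
        self.value = value

    def __cmp__(self, other):
        # Never called by Python 3's comparison operators.
        return (self.value > other.value) - (self.value < other.value)

error = None
try:
    IP(1) < IP(2)
except TypeError as exc:
    error = exc
    print("TypeError:", exc)
```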

Kind regards,
Johannes

--
"My countersuit against you will then allege deliberate mendacity,
vilification of God, the Bible and me, and deliberate blasphemy."
-- Prophet and visionary Hans Joss aka HJP in de.sci.physik
<48d8bf1d$0$7510$5402220f at news.sunrise.ch>

Steve Holden

2008-11-20 22:48:03 UTC

http://docs.python.org/dev/3.0/genindex-_.html

I searched in vain for an official description of this changed
behaviour. Where can we find an official description of how
comparisons are different in Python 3.0?

If it's not present then it would be worth reporting it as a 3.0 bug -
there's still time to get it in, as the release isn't due until early
December.

regards
Steve

--
Steve Holden +1 571 484 6266 +1 800 494 3119
Holden Web LLC http://www.holdenweb.com/

Steve Holden

2008-11-20 21:45:42 UTC

Post by Johannes Bauer
Hello group,
I'm porting some code of mine to Python 3. One class has the __cmp__
File "./parse", line 25, in <module>
print(x < y)
TypeError: unorderable types: IP() < IP()
Was there some kind of semantic change?

Overload __lt__ method.

"Called by comparison operations if rich comparison (see above) is not
defined."
http://www.python.org/doc/2.5.2/ref/customization.html
And my code works just fine with 2.5 - only on 3.0 it doesn't work
anymore. Why is that?

Well the Python 2.5 documentation can't be regarded as a reliable guide
to what to expect in 3.0 ...

You will observe that __cmp__ no longer appears in the index:

http://docs.python.org/dev/3.0/genindex-_.html

regards
Steve

--
Steve Holden +1 571 484 6266 +1 800 494 3119
Holden Web LLC http://www.holdenweb.com/

Johannes Bauer

2008-11-20 21:38:51 UTC

Post by Johannes Bauer
Hello group,
I'm porting some code of mine to Python 3. One class has the __cmp__
File "./parse", line 25, in <module>
print(x < y)
TypeError: unorderable types: IP() < IP()
Was there some kind of semantic change?

Overload __lt__ method.

Well, of course I could do that, but the python doc says:

"Called by comparison operations if rich comparison (see above) is not
defined."
http://www.python.org/doc/2.5.2/ref/customization.html

And my code works just fine with 2.5 - only on 3.0 it doesn't work
anymore. Why is that?

Regards,
Johannes

--
"My countersuit against you will then allege deliberate mendacity,
vilification of God, the Bible and me, and deliberate blasphemy."
-- Prophet and visionary Hans Joss aka HJP in de.sci.physik
<48d8bf1d$0$7510$5402220f at news.sunrise.ch>

Ben Finney

2008-11-20 22:20:30 UTC

http://docs.python.org/dev/3.0/genindex-_.html

I searched in vain for an official description of this changed
behaviour. Where can we find an official description of how
comparisons are different in Python 3.0?

--
\ "[Entrenched media corporations will] maintain the status quo, |
`\ or die trying. Either is better than actually WORKING for a |
_o__) living." -- ringsnake.livejournal.com, 2007-11-12 |
Ben Finney

Terry Reedy

2008-11-20 22:50:28 UTC

http://docs.python.org/dev/3.0/genindex-_.html

I searched in vain for an official description of this changed
behaviour. Where can we find an official description of how
comparisons are different in Python 3.0?

I was going to say "look in "What's New", but the __cmp__ removal is
missing. So I filed
http://bugs.python.org/issue4372

Terry Reedy

2008-11-20 23:53:18 UTC

Post by Terry Reedy
I was going to say "look in "What's New", but the __cmp__ removal is
missing. So I filed
http://bugs.python.org/issue4372

The whatsnew section of Python 3.0 is still empty. Guido didn't have time
to write it. http://bugs.python.org/issue2306

What's New in Python 3.0 is incomplete but definitely not empty, and it
is part of the doc set.

Christian Heimes

2008-11-20 22:58:23 UTC

Post by Terry Reedy
I was going to say "look in "What's New", but the __cmp__ removal is
missing. So I filed
http://bugs.python.org/issue4372

The whatsnew section of Python 3.0 is still empty. Guido didn't have time
to write it. http://bugs.python.org/issue2306

Christian Heimes

2008-11-20 21:43:30 UTC

Post by Johannes Bauer
Hello group,
I'm porting some code of mine to Python 3. One class has the __cmp__

__cmp__ is gone

Christian

Inyeol.Lee

2008-11-20 21:33:21 UTC

Post by Johannes Bauer
Hello group,
I'm porting some code of mine to Python 3. One class has the __cmp__
  File "./parse", line 25, in <module>
    print(x < y)
TypeError: unorderable types: IP() < IP()
Was there some kind of semantic change?

Overload __lt__ method.

Inyeol

Hyuga

2008-11-21 15:44:15 UTC

Post by Johannes Bauer
Seems it was removed on purpose - I'm sure there was a good reason for
that, but may I ask why? Instead of the sleek __cmp__ function I had
    return self.__cmp__(other) < 0
    return self.__cmp__(other) < 0

I hope you actually have <= here.

Post by Johannes Bauer
    return self.__cmp__(other) > 0
    return self.__cmp__(other) >= 0
Does anyone know the reason why __cmp__ was discarded?

I think it was because __cmp__ was the backward compatible fallback for
the newer rich comparison methods and Python 3 cleans up a lot of stuff
left in just for backward compatibility. In this case it is a cleanup
too far as in most cases (i.e. those cases where you don't need the full
complexity of the rich comparisons) __cmp__ is a much simpler solution.
See http://mail.python.org/pipermail/python-dev/2003-March/034073.html
for Guido's original thoughts. Also, once upon a time pep-3000 referred
to the removal of __cmp__ but I can't find it in any of the current
peps. See
http://mail.python.org/pipermail/python-checkins/2004-August/042959.html
and
http://mail.python.org/pipermail/python-checkins/2004-August/042972.html
where the reference to removal of __cmp__ became "Comparisons other than
``==`` and ``!=`` between disparate types will raise an exception unless
explicitly supported by the type" and the reference to Guido's email
about removing __cmp__ was also removed.

Guido's primary argument for removing it seems to be that the code for
supporting both __cmp__ and the rich comparisons is "hairy" and that
it felt really satisfying to remove. I don't think that's a good
enough argument. It was hairy because there are a lot of cases to
check, but I wouldn't say it was crufty. It made sense, and the way
it worked seemed logical enough. I never ran into any problems with
it. And by far the most common case is to implement some total
ordering for a class.

Now, as has been pointed out, all you really need to define total
ordering, at least for sorting, is __eq__ and __lt__, which isn't too
bad. But you still lose the ability to make any other sort of
comparison without implementing all the other comparison operators
too.
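
The point can be checked with a small Python 3 sketch. The `Fraction` class here is a cut-down stand-in written for this example, with a `__repr__` added for readable output. Sorting uses only `<`, and `>` happens to work through the reflected `__lt__` of the other operand, but `<=` and `>=` are not derived from anything.

```python
class Fraction:
    def __init__(self, num, den=1):
        self.num, self.den = num, den

    def __eq__(self, other):
        return self.num * other.den == self.den * other.num

    def __lt__(self, other):
        return self.num * other.den < self.den * other.num

    def __repr__(self):
        return "Fraction(%d, %d)" % (self.num, self.den)

fractions = [Fraction(2, 3), Fraction(1, 2), Fraction(3, 4)]
print(sorted(fractions))                  # sorting only needs __lt__
print(Fraction(2, 3) > Fraction(1, 2))    # handled by the reflected __lt__

try:
    Fraction(2, 3) >= Fraction(1, 2)
except TypeError:
    ge_supported = False
else:
    ge_supported = True
print(ge_supported)
```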

Perhaps the code could be made somewhat simpler like this: If rich
comparisons are defined, use those and *only* those operators that are
defined, and don't try to fall back on __cmp__ otherwise. If no rich
comparisons are defined, just look for __cmp__.

Steven D'Aprano

2008-11-22 01:12:48 UTC

On Fri, 21 Nov 2008 17:26:21 +0000, Arnaud Delobelle wrote:

[...]

As classes can be decorated in Python 3, you can write a decorator to
make a class totally ordered. Here is a very simplified proof of
return self != other and not self < other
cls.__gt__ = gt
# Do the same with __le__, __ge__
return cls
@totally_ordered
assert den > 0, "denomintator must be > 0" self.num = num
self.den = den
return self.num*other.den == self.den*other.num
return self.num*other.den < self.den*other.num

q12=Fraction(1, 2)
q23=Fraction(2, 3)
q12 < q23

True

q12 > q23

False
Granted it's not as efficient as a __cmp__ function.

What makes you say that? What do you mean by "efficient"? Are you talking
about memory footprint, runtime speed, disk-space, programmer efficiency,
algorithmic complexity, or something else?

As I see it, a __cmp__ method would be written something like this:

def __cmp__(self, other):
    return cmp(self.num*other.den, self.den*other.num)

which presumably would save you a trivial amount of source code (and
hence memory footprint, disk-space and programmer efficiency), but the
algorithmic complexity is identical and the runtime speed might even be
trivially slower due to the extra function call.

If your major concern is to reduce the amount of repeated code in the
methods, then there's no reason why you can't write a __cmp__ method as
above and then call it from your rich comparisons:

def __eq__(self, other):
    return self.__cmp__(other) == 0
def __lt__(self, other):
    return self.__cmp__(other) < 0

and if you really want to be concise:

__gt__ = lambda s, o: s.__cmp__(o) > 0
__ge__ = lambda s, o: s.__cmp__(o) >= 0
__le__ = lambda s, o: s.__cmp__(o) <= 0
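
Later Python versions (2.7, and 3.2 onward) ship a standard-library answer to the same repetition problem: `functools.total_ordering` fills in the missing ordering methods from `__eq__` plus one of `__lt__`, `__le__`, `__gt__` or `__ge__`. It postdates this thread, so the following is a sketch for readers on those versions, with a `Fraction` class modelled on the one discussed here.

```python
from functools import total_ordering

@total_ordering
class Fraction:
    def __init__(self, num, den=1):
        assert den > 0, "denominator must be > 0"
        self.num, self.den = num, den

    def __eq__(self, other):
        return self.num * other.den == self.den * other.num

    def __lt__(self, other):
        return self.num * other.den < self.den * other.num

q12, q23 = Fraction(1, 2), Fraction(2, 3)
print(q12 <= q23, q12 > q23, q12 >= q23)
```

The generated methods are built from the two supplied ones, so they carry the same extra-call overhead debated in this thread.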

--
Steven

Arnaud Delobelle

2008-11-22 08:27:59 UTC

Post by Steven D'Aprano
[...]

As classes can be decorated in Python 3, you can write a decorator to
make a class totally ordered. Here is a very simplified proof of
return self != other and not self < other
cls.__gt__ = gt
# Do the same with __le__, __ge__
return cls
@totally_ordered
assert den > 0, "denomintator must be > 0" self.num = num
self.den = den
return self.num*other.den == self.den*other.num
return self.num*other.den < self.den*other.num

q12=Fraction(1, 2)
q23=Fraction(2, 3)
q12 < q23

True

q12 > q23

False
Granted it's not as efficient as a __cmp__ function.

What makes you say that? What do you mean by "efficient"? Are you talking
about memory footprint, runtime speed, disk-space, programmer efficiency,
algorithmic complexity, or something else?

What I'm talking about is very simple - and explained below, with the
help of your __cmp__ method.

Post by Steven D'Aprano
return cmp(self.num*other.den, self.den*other.num)

I'm talking about runtime speed (*not* asymptotic complexity). My code
makes Fraction.__gt__ about twice as slow as Fraction.__lt__ or
Fraction.__eq__ even though with __cmp__ they would all be equally fast.

--
Arnaud

Benjamin Kaplan

2008-11-21 16:00:11 UTC

Post by Johannes Bauer
Seems it was removed on purpose - I'm sure there was a good reason for
that, but may I ask why? Instead of the sleek __cmp__ function I had
return self.__cmp__(other) < 0
return self.__cmp__(other) < 0

I hope you actually have <= here.

Post by Johannes Bauer
return self.__cmp__(other) > 0
return self.__cmp__(other) >= 0
Does anyone know the reason why __cmp__ was discarded?

I think it was because __cmp__ was the backward compatible fallback for
the newer rich comparison methods and Python 3 cleans up a lot of stuff
left in just for backward compatibility. In this case it is a cleanup
too far as in most cases (i.e. those cases where you don't need the full
complexity of the rich comparisons) __cmp__ is a much simpler solution.
See http://mail.python.org/pipermail/python-dev/2003-March/034073.html
for Guido's original thoughts. Also, once upon a time pep-3000 referred
to the removal of __cmp__ but I can't find it in any of the current
peps. See
http://mail.python.org/pipermail/python-checkins/2004-August/042959.html
and
http://mail.python.org/pipermail/python-checkins/2004-August/042972.html

where the reference to removal of __cmp__ became "Comparisons other than
``==`` and ``!=`` between disparate types will raise an exception unless
explicitly supported by the type" and the reference to Guido's email
about removing __cmp__ was also removed.

Guido's primary argument for removing it seems to be that the code for
supporting both __cmp__ and the rich comparisons is "hairy" and that
it felt really satisfying to remove. I don't think that's a good
enough argument. It was hairy because there are a lot of cases to
check, but I wouldn't say it was crufty. It made sense, and the way
it worked seemed logical enough. I never ran into any problems with
it. And by far the most common case is to implement some total
ordering for a class.
Now, as has been pointed out, all you really need to define total
ordering, at least for sorting, is __eq__ and __lt__, which isn't too
bad. But you still lose the ability to make any other sort of
comparison without implementing all the other comparison operators
too.
Perhaps the code could be made somewhat simpler like this: If rich
comparisons are defined, use those and *only* those operators that are
defined, and don't try to fall back on __cmp__ otherwise. If no rich
comparisons are defined, just look for __cmp__.
--

Even easier, make object's rich comparison operators use __cmp__ by default
(and object.__cmp__ would raise exceptions since comparisons aren't defined
for arbitrary objects). That way, if the rich comparison operators are
defined, use those, otherwise use __cmp__. If you only override some of the
rich comparison operators and not __cmp__, then trying unsupported
operations will raise exceptions. Since everything extends object in 3.0,
you wouldn't run into a problem.
The only issue I can think of is that if a superclass defines the rich
comparison operators, you wouldn't be able to use __cmp__.

class object:
    def __cmp__(self, other):
        raise NotImplementedError
    def __lt__(self, other):
        return self.__cmp__(other) < 0
    def __eq__(self, other):
        return self.__cmp__(other) == 0

Arnaud Delobelle

2008-11-21 17:26:21 UTC

Post by Hyuga

Post by Johannes Bauer
Seems it was removed on purpose - I'm sure there was a good reason for
that, but may I ask why? Instead of the sleek __cmp__ function I had
    return self.__cmp__(other) < 0
    return self.__cmp__(other) < 0

I hope you actually have <= here.

Post by Johannes Bauer
    return self.__cmp__(other) > 0
    return self.__cmp__(other) >= 0
Does anyone know the reason why __cmp__ was discarded?

I think it was because __cmp__ was the backward compatible fallback for
the newer rich comparison methods and Python 3 cleans up a lot of stuff
left in just for backward compatibility. In this case it is a cleanup
too far as in most cases (i.e. those cases where you don't need the full
complexity of the rich comparisons) __cmp__ is a much simpler solution.
See http://mail.python.org/pipermail/python-dev/2003-March/034073.html
for Guido's original thoughts. Also, once upon a time pep-3000
referred to the removal of __cmp__ but I can't find it in any of the
current peps. See
http://mail.python.org/pipermail/python-checkins/2004-August/042959.html
and
http://mail.python.org/pipermail/python-checkins/2004-August/042972.html
where the reference to removal of __cmp__ became "Comparisons other
than ``==`` and ``!=`` between disparate types will raise an
exception unless explicitly supported by the type" and the reference
to Guido's email about removing __cmp__ was also removed.

Guido's primary argument for removing it seems to be that the code for
supporting both __cmp__ and the rich comparisons is "hairy" and that
it felt really satisfying to remove. I don't think that's a good
enough argument. It was hairy because there are a lot of cases to
check, but I wouldn't say it was crufty. It made sense, and the way
it worked seemed logical enough. I never ran into any problems with
it. And by far the most common case is to implement some total
ordering for a class.
Now, as has been pointed out, all you really need to define total
ordering, at least for sorting, is __eq__ and __lt__, which isn't too
bad. But you still lose the ability to make any other sort of
comparison without implementing all the other comparison operators
too.
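
A small sketch of that point, assuming Python 3 semantics. The Version class is hypothetical. Sorting only needs __lt__, and > happens to work through the reflected __lt__, but <= and >= raise TypeError until they are defined.

```python
class Version:
    """Hypothetical class that defines only __eq__ and __lt__."""

    def __init__(self, major, minor):
        self.key = (major, minor)

    def __eq__(self, other):
        return self.key == other.key

    def __lt__(self, other):
        return self.key < other.key


versions = [Version(2, 6), Version(3, 0), Version(2, 5)]
ordered = sorted(versions)  # the sort routines only call __lt__

# a > b falls back to the reflected b < a, so this works too.
gt_works = Version(3, 0) > Version(2, 6)

try:
    Version(2, 5) <= Version(2, 6)  # neither __le__ nor __ge__ is defined
    le_supported = True
except TypeError:
    le_supported = False
```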

As classes can be decorated in Python 3, you can write a decorator to
make a class totally ordered. Here is a very simplified proof of
concept of such a decorator:

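def totally_ordered(cls):
    # Every Python 3 class inherits a default __gt__ from object, so
    # hasattr() is always true; test for the inherited default instead.
    if cls.__gt__ is object.__gt__:
        def gt(self, other):
            return self != other and not self < other
        cls.__gt__ = gt
    # Do the same with __le__, __ge__
    return cls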

@totally_ordered
class Fraction:
    def __init__(self, num, den=1):
        assert den > 0, "denominator must be > 0"
        self.num = num
        self.den = den
    def __eq__(self, other):
        return self.num*other.den == self.den*other.num
    def __lt__(self, other):
        return self.num*other.den < self.den*other.num

>>> q12=Fraction(1, 2)
>>> q23=Fraction(2, 3)
>>> q12 < q23
True
>>> q12 > q23
False

Granted it's not as efficient as a __cmp__ function.
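
The "do the same with __le__, __ge__" placeholder could be filled in roughly as follows. This is only a sketch under the assumption that __eq__ and __lt__ describe a total order. The helper name missing is hypothetical, and the check compares against the defaults inherited from object, because in Python 3 hasattr() is true for every comparison method on every class.

```python
def totally_ordered(cls):
    """Fill in __le__, __gt__ and __ge__ from __eq__ and __lt__ (sketch)."""

    def missing(name):
        # True when cls still has the default inherited from object.
        return getattr(cls, name) is getattr(object, name)

    def le(self, other):
        return self < other or self == other

    def gt(self, other):
        return not (self < other or self == other)

    def ge(self, other):
        # Valid only for a total order: "not less than" means >=.
        return not self < other

    for name, func in (("__le__", le), ("__gt__", gt), ("__ge__", ge)):
        if missing(name):
            func.__name__ = name
            setattr(cls, name, func)
    return cls


@totally_ordered
class Fraction:
    def __init__(self, num, den=1):
        assert den > 0, "denominator must be > 0"
        self.num = num
        self.den = den

    def __eq__(self, other):
        return self.num * other.den == self.den * other.num

    def __lt__(self, other):
        return self.num * other.den < self.den * other.num
```

Each derived operator costs up to two calls into __lt__ and __eq__, which is the overhead being conceded relative to a single __cmp__.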

--
Arnaud

Colin J. Williams

2008-11-22 16:17:59 UTC

Post by Johannes Bauer

If it's not present then it would be worth reporting it as a 3.0 bug -
there's still time to get it in, as the release isn't due until early
December.

Seems it was removed on purpose - I'm sure there was a good reason for
that, but may I ask why? Instead of the sleek __cmp__ function I had
return self.__cmp__(other) < 0
return self.__cmp__(other) < 0
return self.__cmp__(other) > 0
return self.__cmp__(other) >= 0
Does anyone know the reason why __cmp__ was discarded?
Kind regards,
Johannes

Johannes,

Isn't the problem with your original
post that x and y are either of
different types or, if of the same
type, that the values of that type are
not strictly comparable? It seems that
the old approach was to make a
recursive comparison.

Colin W.

Terry Reedy

2008-11-21 00:16:22 UTC

Post by Johannes Bauer

If it's not present then it would be worth reporting it as a 3.0 bug -
there's still time to get it in, as the release isn't due until early
December.

Seems it was removed on purpose - I'm sure there was a good reason for
that, but may I ask why? Instead of the sleek __cmp__ function I had
return self.__cmp__(other) < 0
return self.__cmp__(other) < 0
return self.__cmp__(other) > 0
return self.__cmp__(other) >= 0
Does anyone know the reason why __cmp__ was discarded?

See previous threads, including recent one about sorting.

Duncan Booth

2008-11-21 09:09:05 UTC

Post by Johannes Bauer
Seems it was removed on purpose - I'm sure there was a good reason for
that, but may I ask why? Instead of the sleek __cmp__ function I had
return self.__cmp__(other) < 0
return self.__cmp__(other) < 0

I hope you actually have <= here.

Post by Johannes Bauer
return self.__cmp__(other) > 0
return self.__cmp__(other) >= 0
Does anyone know the reason why __cmp__ was discarded?

I think it was because __cmp__ was the backward compatible fallback for
the newer rich comparison methods and Python 3 cleans up a lot of stuff
left in just for backward compatibility. In this case it is a cleanup
too far as in most cases (i.e. those cases where you don't need the full
complexity of the rich comparisons) __cmp__ is a much simpler solution.

See http://mail.python.org/pipermail/python-dev/2003-March/034073.html
for Guido's original thoughts. Also, once upon a time pep-3000 referred
to the removal of __cmp__ but I can't find it in any of the current
peps. See
http://mail.python.org/pipermail/python-checkins/2004-August/042959.html
and
http://mail.python.org/pipermail/python-checkins/2004-August/042972.html
where the reference to removal of __cmp__ became "Comparisons other than
``==`` and ``!=`` between disparate types will raise an exception unless
explicitly supported by the type" and the reference to Guido's email
about removing __cmp__ was also removed.
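
The quoted wording can be illustrated with a short sketch, assuming Python 3 behaviour. The helper try_less_than is hypothetical and exists only to capture the exception.

```python
def try_less_than(a, b):
    """Return a < b, or a description of the TypeError it raises."""
    try:
        return a < b
    except TypeError as exc:
        return "TypeError: %s" % exc


# == and != between disparate types still return a result.
eq_result = (1 == "1")
ne_result = (1 != "1")

# Ordering comparisons between disparate types raise TypeError...
lt_result = try_less_than(1, "1")

# ...unless the types explicitly support it, as int and float do.
mixed_numeric = try_less_than(1, 2.5)
```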

--
Duncan Booth http://kupuguy.blogspot.com

George Sakkis

2008-11-21 01:01:09 UTC

    Johannes> Seems it was removed on purpose - I'm sure there was a good
    Johannes> reason for that, but may I ask why?
    http://www.mail-archive.com/python-3... at python.org/msg11474.html
    http://oakwinter.com/code/porting-setuptools-to-py3k/
Skip

Dropping __cmp__ without providing implicit or at least easy explicit
[1] total ordering is (was?) a mistake; it opens the door to subtle
bugs or redundant boilerplate code.

[1] E.g. http://code.activestate.com/recipes/576529/
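
For readers on later releases: Python 2.7 and 3.2 onward ship functools.total_ordering, a class decorator that gives this kind of easy explicit total ordering. A minimal sketch, reusing the Fraction example from this thread (the isinstance checks are an addition, not part of the original example):

```python
from functools import total_ordering


@total_ordering
class Fraction:
    def __init__(self, num, den=1):
        if den <= 0:
            raise ValueError("denominator must be > 0")
        self.num = num
        self.den = den

    def __eq__(self, other):
        if not isinstance(other, Fraction):
            return NotImplemented
        return self.num * other.den == self.den * other.num

    def __lt__(self, other):
        if not isinstance(other, Fraction):
            return NotImplemented
        return self.num * other.den < self.den * other.num


# __le__, __gt__ and __ge__ are supplied by the decorator.
half, two_thirds = Fraction(1, 2), Fraction(2, 3)
results = (half <= two_thirds, half > two_thirds, two_thirds >= half)
```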
