Discussion:
Python 3
Robert Collins
2010-06-22 21:47:36 UTC
I've done a couple of trivial patches moving in the direction of being
able to play with Python 3 and bzr.

I'd like to document some changes to our coding style, to make
eventual Python 3 migration easier - but they aren't all no-brainers
so I thought I'd raise stuff here and we can discuss it.

The no-brainers (simple, not terribly ugly, not used a lot in our code):
octal 0666 -> 0o666
print foo, bar -> print("%s %s" % (foo, bar))
exec foo, locals() -> exec(foo, locals())
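
A minimal sketch of the three no-brainers in a form that one source file can use on both lines of Python. The variable values and the `namespace` dict are made up for illustration. One assumption worth flagging: the `0o666` literal is only accepted from Python 2.6 onwards, so the sketch uses `int("666", 8)`, which avoids the literal entirely for code that still has to load on 2.4 or 2.5.

```python
foo, bar = "spam", "eggs"  # placeholder values for illustration

# One parenthesised expression: a print statement on 2.x, a function call on 3.x.
print("%s %s" % (foo, bar))

# 0o666 parses on 2.6+ and 3.x; int() with base 8 is a spelling that
# does not depend on literal syntax at all.
mode = int("666", 8)

# Passing an explicit dict keeps the executed names out of the caller's scope.
namespace = {}
exec("answer = 6 * 7", namespace)
```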

Somewhat ugly - we're going to see a lot of these:
except Foo, e:
->
except Foo:
    e = sys.exc_info()[1]
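
A minimal sketch of that idiom in a complete function. `parse_port` is a made-up helper, not something from bzrlib; the point is only that `sys.exc_info()[1]` returns the exception instance currently being handled, so the same source loads on 2.4 through 3.x.

```python
import sys


def parse_port(text):
    """Return text as an int, or None when it is not a number."""
    try:
        return int(text)
    except ValueError:
        # Portable stand-in for "except ValueError, e" (2.x only)
        # and "except ValueError as e" (2.6+ and 3.x only).
        e = sys.exc_info()[1]
        sys.stderr.write("ignoring bad port %r: %s\n" % (text, e))
        return None
```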

bytestrings:
This has the potential to slow load times slightly: in 3 to get a
bytestring one says b'foo', but you can't do that in 2, so for
single-source compatibility it gets written as:
_b('foo')

which on 2 is a no-op, and on 3 reencodes using latin-1 (so everything
works). Or we could split out separate files to import on 2 and 3, but
I think the extra seek would make it a wash perf wise.
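
One way such a `_b()` helper could look, as a sketch rather than the actual bzrlib code. It assumes the version check happens once at import time, so each use costs a single function call. Latin-1 is the encoding because it maps code points 0-255 to the same byte values, so escapes like '\xff' survive unchanged.

```python
import sys

if sys.version_info[0] >= 3:
    def _b(s):
        # Python 3: str literals are text, so encode them to bytes.
        return s.encode("latin-1")
else:
    def _b(s):
        # Python 2: a plain str literal already is a bytestring.
        return s

header = _b("foo")
high_byte = _b("\xff")
```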

really ugly, but very few occurrences:
raise type, val, tb
->
bzrlib.util.py._builtin._reraise(type, val, tb)
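
A sketch of how a helper like `_reraise` might be written so that one file loads on both 2.x and 3.x. This is an assumption about its shape, not the contents of bzrlib.util.py._builtin. The three-argument raise is a syntax error on 3, so the 2.x version has to be hidden inside a string and compiled with exec only when running on 2.

```python
import sys

if sys.version_info[0] >= 3:
    def _reraise(tp, value, tb):
        # Python 3: attach the saved traceback to the instance and raise it.
        raise value.with_traceback(tb)
else:
    # Python 2 only: this statement would not even parse on 3, hence exec.
    exec("def _reraise(tp, value, tb):\n    raise tp, value, tb\n")

# Usage: save the exception details now, raise them again later.
try:
    raise ValueError("boom")
except ValueError:
    saved = sys.exc_info()

try:
    _reraise(*saved)
    reraised = None
except ValueError:
    reraised = sys.exc_info()[1]
```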

Now, there are a lot of other things that we will have to solve and
talk about, this is really top level mechanical stuff and should not
be taken as the whole list or a magic bullet.

However, I think changing our coding style this much, and not much more, will
be enough to let interested people slowly push forward and get things
like:
- the test suite
- C / pyrex modules
and so forth working on 3.

I think that ideally, in a year or so we'd be in a position to make a
concerted push to make 3 a first class citizen (because 3 is getting
considerable upstream and in-distribution attention).

If the consensus from this thread is that this is ok, I'll update
HACKING docs appropriately.

-Rob
Barry Warsaw
2010-06-22 22:20:31 UTC
Post by Robert Collins
I've done a couple of trivial patches moving in the direction of being
able to play with Python 3 and bzr.
\o/

(quoting out of order...)
Post by Robert Collins
If the consensus from this thread is that this is ok, I'll update
HACKING docs appropriately.
HACKING.txt doesn't describe the minimum Python version for which
compatibility must be maintained. Perhaps it should? Maybe it's described in
a different document? It does *imply* in a few places that Python 2.4 is the
minimum version. Understandable of course, but darn.
Post by Robert Collins
which on 2 is a no-op, and on 3 reencodes using latin-1 (so everything
works). Or we could split out separate files to import on 2 and 3, but
I think the extra seek would make it a wash perf wise.
Or split it out for Python 2.4/2.6+ compatibility? At least you can write
idioms for the latter that 2to3 can more easily transform. E.g...
Post by Robert Collins
octal 0666 ->0o666
print foo, bar -> print("%s %s" % (foo, bar))
from __future__ import print_function
print(foo, bar)
Post by Robert Collins
exec foo, locals() -> exec(foo, locals())
->
e = sys.exc_info()[1]
This has the potential to slow load times slightly: in 3 to get a
bytestring one says b'foo', but you can't do that in 2, so for
_b('foo')
b'foo' FTW. :)
Post by Robert Collins
raise type, val, tb
->
bzrlib.util.py._builtin._reraise(type, val, tb)
raise

-Barry
Robert Collins
2010-06-22 22:40:38 UTC
Thanks for the feedback; some data inline below.
Post by Barry Warsaw
Post by Robert Collins
If the consensus from this thread is that this is ok, I'll update
HACKING docs appropriately.
HACKING.txt doesn't describe the minimum Python version for which
compatibility must be maintained.  Perhaps it should?  Maybe it's described in
a different document?  It does *imply* in a few places that Python 2.4 is the
minimum version.  Understandable of course, but darn.
Perhaps it should; certainly we should have it written down somewhere.
The overview and plugin-api docs reference 2.4 but there is no explicit
statement - we should also advise our users more clearly.
https://bugs.edge.launchpad.net/bzr/+bug/597473
Post by Barry Warsaw
Post by Robert Collins
which on 2 is a no-op, and on 3 reencodes using latin-1 (so everything
works). Or we could split out separate files to import on 2 and 3, but
I think the extra seek would make it a wash perf wise.
Or split it out for Python 2.4/2.6+ compatibility?  At least you can write
idioms for the latter that 2to3 can more easily transform.  E.g...
I think if we're going for separate files, we'd want a real perf win,
because adding (say) 10% more files to bzrlib would suck, and user
strings (vs protocol definitions) are fairly mutable. That is, we
tweak a lot.
Post by Barry Warsaw
Post by Robert Collins
octal 0666 ->0o666
print foo, bar -> print("%s %s" % (foo, bar))
from __future__ import print_function
print(foo, bar)
It's a little nicer, but for print statements this implies two copies
of the literal foo to update - see under maintenance burden.
Post by Barry Warsaw
Post by Robert Collins
exec foo, locals() -> exec(foo, locals())
that is nicer, but two copies of the same code isn't. If we could hook
into the python compiler and just monkey patch it on 2.4->2.5, that
would be nice.
Post by Barry Warsaw
Post by Robert Collins
->
   e = sys.exc_info()[1]
This has the potential to slow load times slightly: in 3 to get a
bytestring one says b'foo', but you can't do that in 2, so for
_b('foo')
b'foo' FTW. :)
Yes. For protocol stuff where this is relevant, we could split the
source - as above though, that's a pretty significant burden to wear.
Post by Barry Warsaw
Post by Robert Collins
   raise type, val, tb
->
   bzrlib.util.py._builtin._reraise(type, val, tb)
raise
No, doesn't do what we need in any python version :)

We use 'raise', where we need to, but we also do reraise in a couple
of places, and switching it to raise there wouldn't work. Or we'd use
'raise' already :).
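
To make the distinction concrete, here is a sketch of one situation where a bare raise cannot be used: the exception is saved, the handler is left, other work runs, and only then is the original error raised again. `run_then_cleanup` is a hypothetical function, not bzrlib code, and the sketch is written for Python 3 only (it uses `with_traceback`); on 2.x the last line would be the three-argument raise instead.

```python
import sys


def run_then_cleanup(operation, cleanup):
    """Run operation, always run cleanup, then surface operation's error."""
    saved = None
    try:
        operation()
    except Exception:
        saved = sys.exc_info()
    # We are outside the handler now, so there is no "current" exception
    # for a bare raise to pick up.
    cleanup()
    if saved is not None:
        raise saved[1].with_traceback(saved[2])


def failing():
    raise ValueError("boom")


events = []
try:
    run_then_cleanup(failing, lambda: events.append("cleanup"))
    outcome = "no error"
except ValueError:
    outcome = "reraised"
```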

-Rob
Barry Warsaw
2010-06-23 00:47:20 UTC
Post by Robert Collins
that is nicer, but two copies of the same code isn't. If we could hook
into the python compiler and just monkey patch it on 2.4->2.5, that
would be nice.
Indeed, that would be!
-Barry
John Arbash Meinel
2010-06-22 22:55:56 UTC
Post by Barry Warsaw
Post by Robert Collins
I've done a couple of trivial patches moving in the direction of being
able to play with Python 3 and bzr.
\o/
(quoting out of order...)
Post by Robert Collins
If the consensus from this thread is that this is ok, I'll update
HACKING docs appropriately.
HACKING.txt doesn't describe the minimum Python version for which
compatibility must be maintained. Perhaps it should? Maybe it's described in
a different document? It does *imply* in a few places that Python 2.4 is the
minimum version. Understandable of course, but darn.
Post by Robert Collins
which on 2 is a no-op, and on 3 reencodes using latin-1 (so everything
works). Or we could split out separate files to import on 2 and 3, but
I think the extra seek would make it a wash perf wise.
Or split it out for Python 2.4/2.6+ compatibility? At least you can write
idioms for the latter that 2to3 can more easily transform. E.g...
So we discussed moving to even 2.5 a while back, and basically Red Hat's
latest Enterprise version (RHEL5) is supporting python2.4 until 2014 or so...

http://www.redhat.com/docs/manuals/enterprise/
and
http://www.redhat.com/security/updates/errata/

And RHEL6 isn't even out yet (and will move to python2.6, IIRC).

Down at the bottom it shows that RHEL5 is in primary support until March
2011, and secondary/tertiary support until 2014. And if you track it
through, RHEL5 only has 2.4 available.

Being tied to 2.4 for at least 1 more year, if not 4 more years, is
really going to hurt switching to python3. As having 2.4+3 compatibility
is just hard.

John
=:->
Barry Warsaw
2010-06-23 00:42:40 UTC
Post by John Arbash Meinel
Being tied to 2.4 for at least 1 more year, if not 4 more years, is
really going to hurt switching to python3. As having 2.4+3 compatibility
is just hard.
Definitely. It doesn't just affect Bazaar though, it hurts every Python
application that wants to run on supported versions of RHEL5.

The only remotely possible solution would be to cut off Python < 2.6 support
in Bazaar at some point but ensure that newer versions are still wire
compatible with older versions for as long as you need. Then, RHEL5 users
could still run the old clients and servers while folks that talk to them with
newer versions of Bazaar would still be able to do so.

Not fun though.
-Barry
Robert Collins
2010-06-23 00:47:50 UTC
Post by John Arbash Meinel
Being tied to 2.4 for at least 1 more year, if not 4 more years, is
really going to hurt switching to python3. As having 2.4+3 compatibility
is just hard.
Definitely.  It doesn't just affect Bazaar though, it hurts every Python
application that wants to run on supported versions of RHEL5.
Yup. And see recent threads on Python-dev too. Even now, years later,
the complete break in Python 3 is still a source of angst.
The only remotely possible solution would be to cut off Python < 2.6 support
in Bazaar at some point but ensure that newer versions are still wire
compatible with older versions for as long as you need.  Then, RHEL5 users
could still run the old clients and servers while folks that talk to them with
newer versions of Bazaar would still be able to do so.
Not fun though.
Very not fun - it transfers a burden (work with older python) to a
burden (under no circumstances stop supporting XYZ network call); it
also adds a burden to plugin developers - 'if you want your plugin to
be usable with a contributor on RHEL, write two versions, because you
have to use two bzrlib versions'.

It would be nice to avoid that. Perhaps 2.8 or 3.3 could be more compatible?

-Rob
Barry Warsaw
2010-06-23 01:09:26 UTC
Post by Robert Collins
It would be nice to avoid that. Perhaps 2.8 or 3.3 could be more compatible?
If only. But there almost definitely won't be a Python 2.8.

-Barry
Stephen J. Turnbull
2010-06-23 02:36:30 UTC
Post by Robert Collins
Yup. And see recent threads on Python-dev too. Even now, years later,
the complete break in Python 3 is still a source of angst.
True, but the "even now" is inappropriate. This angst was predictable
(though nobody said "we expect much weeping and gnashing of teeth" out
loud that I recall), and *is expected to continue* for at least another
3-5 years as various libraries transition to Python 3 (thus enabling
new generations of app developers to make their own transitions and
start gnashing their teeth as they do).

I commend you personally for your forward-looking spirit. But Bazaar
as a project shouldn't hurry at this point.
Post by Robert Collins
It would be nice to avoid that. Perhaps 2.8 or 3.3 could be more compatible?
As Barry says, 2.8 is clearly a dead letter within python-dev, and as
yet I've seen no calls (not even trolling) for a separate project to
do a 2.8. Of course, 2.7 is a pretty nice language, most libraries
will maintain compatible versions at least on PyPI, so pressure for a
2.8 (from any source) is at least a year away, I think. It's not
clear what the theme of 3.3 is going to be, but with the ongoing
language freeze, I suspect compatibility efforts would not fare well.

The most common library strategy seems to be supporting two library
versions, with multiple versions of some files and a gradual
refactoring concentrating wrappers for the incompatible interfaces in
a small number of modules. As Robert proposed in his initial post.
So that's good.

But 2to3 gets better all the time, and I think 3to2 is expected for
3.2? So it's not at all a panacea, but instead of targeting "one code
base for Python 2 and Python 3", maybe the thing to do is to target
"one code base for Python 2 and 2to3", and try to autogenerate as much
of the Python 3 code base as possible. Presumably as 3to2 improves,
you should be able to transition that to target "one code base for
3to2 and Python 3".

But what worries me is the Python 2 str vs. Python 3 str set of
issues, because they're hard to flag, let alone automatically
transition. Do you have a sense for how pervasive that is in the
Bazaar codebase? Is Unicode usage mostly encapsulated in a few
functions, primarily used in the UI? Or is it more widespread? How
about functionality like diff? Is that affected?
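
A small sketch of why these issues are hard to flag, runnable on Python 3. The values are invented for illustration. None of the lines below raises an error on Python 3; they simply give different answers from the ones the same source gives on Python 2, where `str` is a bytestring.

```python
data = b"revision-id"   # bytes, e.g. as read from disk or the network
text = "revision-id"    # text on Python 3, a bytestring on Python 2

# Equality between bytes and text is silently False on Python 3.
same = (data == text)

# Indexing bytes yields an int on Python 3, a one-character str on Python 2.
first = data[0]

# A dict keyed by text never matches a bytes key, again without any error.
lookup = {text: 1}
found = data in lookup

# Only an explicit decode makes the two comparable.
decoded_same = (data.decode("ascii") == text)
```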

(Again, this isn't something you need to deal with immediately, a
Python 3 transition will take years, but I think it's worth thinking a
little about.)
Martin Pool
2010-06-23 00:58:00 UTC
Post by Barry Warsaw
Post by Robert Collins
If the consensus from this thread is that this is ok, I'll update
HACKING docs appropriately.
HACKING.txt doesn't describe the minimum Python version for which
compatibility must be maintained.  Perhaps it should?  Maybe it's described in
a different document?  It does *imply* in a few places that Python 2.4 is the
minimum version.  Understandable of course, but darn.
It is described in
<http://doc.bazaar.canonical.com/bzr.dev/developers/code-style.html#python-versions>.
I added that a while ago in response to some merge proposals that
were not 2.4 compatible because people didn't realize we needed it.

We had a discussion a while ago that settled that we wanted to keep
2.4 compatibility at least for a while.

Based on what I read at the time it seemed to me the next step was to
get things clean with -3, then perhaps test it remains so in Babune,
then try getting the tests to run under 2to3 in at least pure python
mode. Many of these cleanups will improve our code even under 2.x.

Robert said
Post by Robert Collins
- 2to3 means an indeterministic program is actually run
In what sense is it indeterministic?
--
Martin
Martin Pool
2010-06-23 01:01:42 UTC
 Many of these cleanups will improve our code even under 2.x.
On the other hand, Robert's patch has things like

-except locale.Error, e:
+except locale.Error:
+    e = sys.exc_info()[1]

which is clearly worse, and clearly also the kind of thing that can be
easily mechanically translated by 2to3. I wouldn't like to see that
kind of thing merged until we've decided we really want a single
codebase that works on both with no translation.
--
Martin
John Arbash Meinel
2010-06-23 01:02:57 UTC
Post by Martin Pool
Post by Martin Pool
Many of these cleanups will improve our code even under 2.x.
On the other hand, Robert's patch has things like
+ e = sys.exc_info()[1]
which is clearly worse, and clearly also the kind of thing that can be
easily mechanically translated by 2to3. I wouldn't like to see that
kind of thing merged until we've decided we really want a single
codebase that works on both with no translation.
I thought 2to3 doesn't actually handle that. As it expects you've
written python2.6 compatible code which uses:

except locale.Error as e:

Which we *can't* do for 2.4 or 2.5

John
=:->
Barry Warsaw
2010-06-23 01:07:43 UTC
Post by John Arbash Meinel
I thought 2to3 doesn't actually handle that. As it expects you've
Which we *can't* do for 2.4 or 2.5
Note that 2to3 is extensible, and it's not too hard to write custom fixers.
Not sure that will help here, but just FYI.

-Barry
Martin Pool
2010-06-23 01:13:23 UTC
Post by John Arbash Meinel
I thought 2to3 doesn't actually handle that. As it expects you've
Which we *can't* do for 2.4 or 2.5
It does actually handle this already:

--- /tmp/t.py (original)
+++ /tmp/t.py (refactored)
@@ -1,4 +1,4 @@
 try:
     pass
-except Exception, e:
-    print e
+except Exception as e:
+    print(e)
RefactoringTool: Files that need to be modified:
RefactoringTool: /tmp/t.py

If we need to make changes that either don't work under 2.[456] or
that make the code worse, and that are simple to automate, we would be
better off putting that into 2to3 than doing it once-off through a
search and replace.
--
Martin
Robert Collins
2010-06-23 01:42:08 UTC
2to3 is indeterministic in the sense that users will have different
versions of it, with different capabilities.

I think 2to3 is a fine tool for small projects with no serious quirks,
but we aren't one of those. I enumerated all the issues I see with
2to3 earlier in the thread, so I won't repeat that now; but I think
that they need to be addressed before we can consider using 2to3 for
anything other than mechanical translation. Fixing things that are
outside of what 2to3 can do is likely a good idea anyway.

For the bytes stuff I've mentioned - see
http://bugs.python.org/issue5425 - 2to3 converts str to 'string' not
'bytes', so we have to mark up everything, and only Python 2.6
supports b'foo'. If we change the importer to go to bytes for 'str',
then we end up with all our literals that are meant for humans as
bytes, which is also wrong. And while we support < 2.6, we can't
annotate strings properly for bytes recognition without something like
a domain specific language - e.g. a _b() function.

We're a library, with very different concerns about when we stop
supporting a version of Python to a simple end user application that
doesn't care about embedding or reuse - and typically ships an entire
Python stack anyway. AFAICT 2to3 is *entirely* driven by 'hey, you can
*switch* over easily', not by ongoing maintenance concerns; the 'add
2to3 as a compile step in your application' is deeply unsatisfying

http://www.slideshare.net/regebro/python-3-compatibility-pycon-2009
and http://wiki.python.org/moin/PortingDjangoTo3k are interesting
reads and had some hand in shaping my opinion on this. Note that the
latter is a patch - you can get an impression of how much stuff 2to3
*didn't* do there.

-Rob
Martin Pool
2010-06-23 02:11:57 UTC
Post by Robert Collins
2to3 is indeterministic in the sense that users will have different
versions of it, with different capabilities.
That's a funny definition of "indeterministic" but anyhow, it's true
they may vary in the same kind of way that people will have different
pythons, kernels, pyrexes, libcs, etc. What the magnitude of those
differences may be, I don't know. It first came in to python2.6
afaict so we only have that and then later updates in 3.0 and 3.1.

I suppose like for Pyrex we would have the choice of either
distributing the converted code or getting people to run it. If we
distribute the converted code then we could fix it to some particular
version. If we expect the user to run it then perhaps we could say
"bzr is supported on python3.0 using python3.0-2to3".
Post by Robert Collins
I think 2to3 is a fine tool for small projects with no serious quirks,
but we aren't one of those. I enumerated all the issues I see with
2to3 earlier in the thread, so I won't repeat that now; but I think
that they need to be addressed before we can consider using 2to3 for
anything other than mechanical translation. Fixing things that are
outside of what 2to3 can do is likely a good idea anyway.
You mention that
Post by Robert Collins
- folk will want 3 before there are no platforms that have python2.x and not python3.x support.
True but irrelevant
Post by Robert Collins
- 2to3 means an indeterministic program is actually run
It's an algorithm, it's not indeterministic. There may be different
versions with different bugs but I doubt it will be a larger problem
than we've had with other dependencies, and probably smaller.
Post by Robert Collins
- I like to be able to properly support things
Me too, but this is also irrelevant. We can say we support particular
2to3 versions in the same way we support particular interpreter
versions. But more to the point we can get things passing on 3 before
we commit to supporting anything.
Post by Robert Collins
- distros would prefer to ship a single 'bzr' package rather than two separate ones - and because they run with precompiled files not writable by the user...
This is going to be complicated for distros, but it's a somewhat
similar problem to needing binary extensions built for different
interpreters, which they can do.
Post by Robert Collins
For the bytes stuff I've mentioned - see
http://bugs.python.org/issue5425 - 2to3 converts str to 'string' not
'bytes', so we have to mark up everything, and only Python 2.6
supports b'foo'. If we change the importer to go to bytes for 'str',
then we end up with all our literals that are meant for humans as
bytes, which is also wrong. And while we support < 2.6, we can't
annotate strings properly for bytes recognition without something like
a domain specific language - e.g. a _b() function.
This seems like a problem with ever wanting a single codebase to run
across 2.4..3.x, whether you support 3.x directly or by 2to3, but
probably soluble along the lines discussed in the Django page: add a
new clearer marker for "really bytes".
Post by Robert Collins
We're a library, with very different concerns about when we stop
supporting a version of Python to a simple end user application that
doesn't care about embedding or reuse - and typically ships an entire
Python stack anyway. AFAICT 2to3 is *entirely* driven by 'hey, you can
*switch* over easily', not by ongoing maintenance concerns; the 'add
2to3 as a compile step in your application' is deeply unsatisfying
The document Barry quoted earlier is not suggesting that you switch,
but rather you run it as a compile step. Additional compile steps are
not great but not awful, and better than having bad code that you have
to actually edit.
Post by Robert Collins
http://www.slideshare.net/regebro/python-3-compatibility-pycon-2009
and http://wiki.python.org/moin/PortingDjangoTo3k are interesting
reads and had some hand in shaping my opinion on this. Note that the
latter is a patch - you can get an impression of how much stuff 2to3
*didn't* do there.
Slide 11 and slide 50 suggest libraries develop on 2 and use 2to3 to
support 3.

So the course I would like to follow is the one suggested by python.org:

* get things clean with -3
* get the test suite running under 2to3
* assess at this point whether we want to support actual use under 3
and/or have a separate branch and/or cut over to requiring 2.6 or 3
--
Martin
Stephen J. Turnbull
2010-06-23 04:55:21 UTC
Post by Martin Pool
That's a funny definition of "indeterministic" but anyhow, it's true
they may vary in the same kind of way that people will have different
pythons, kernels, pyrexes, libcs, etc. What the magnitude of those
differences may be, I don't know. It first came in to python2.6
afaict so we only have that and then later updates in 3.0 and 3.1.
The differences across versions were pretty large in my experience
(all pre-3.1, though; maybe it's more stable now). You might want to
distribute a particular version with Bazaar to get a reasonably
reliable output. I'm not sure how much the 2to3 application itself
changes, but the ruleset gets better.

You'd also probably really want to distribute specific customized
rules to handle bzr-ish idioms, anyway.
Post by Martin Pool
I suppose like for Pyrex we would have the choice of either
distributing the converted code or getting people to run it.
I don't think having people run 2to3 themselves is a reasonable idea.
It is not intended to be a Python2 to Python3 compiler the way Pyrex
is a Python to C compiler, at least not yet. At least at first (ie
during the alpha period when you're still learning about bzr idioms
that 2to3 doesn't handle OOTB and developing bzr-oriented custom
rules), there will need to be some post 2to3 hand-tweaking. Cf.
Martin von Löwis's Django port and patch (cited by Robert).
Post by Martin Pool
Post by Robert Collins
- distros would prefer to ship a single 'bzr' package rather than
two separate ones - and because they run with precompiled files
not writable by the user...
This is going to be complicated for distros, but it's a somewhat
similar problem to needing binary extensions built for different
interpreters, which they can do.
They're also going to need to do this for other libraries, such as
numpy, and probably a number of applications. The distros can handle
it.
Robert Collins
2010-06-23 05:54:08 UTC
On Wed, Jun 23, 2010 at 2:11 PM, Martin Pool <***@canonical.com> wrote:

Meta: I'm not pushing for 3, I did a quick test to see how hard it was
likely to be; we can stop this conversation and just wait and see, at
any point.

I'm fine with the steps you're proposing we do now; waiting and seeing
is a fine strategy, we certainly won't be significantly better or
worse off if we wait (though we may end up with more time-pressure if
3.0 support were to become an important issue in the future for some
reason).

However, I'm going to tug on the threads a little more, because I
think folk are misassessing some of the risks I see.
Post by Martin Pool
Post by Robert Collins
2to3 is indeterministic in the sense that users will have different
versions of it, with different capabilities.
That's a funny definition of "indeterministic" but anyhow, it's true
they may vary in the same kind of way that people will have different
pythons, kernels, pyrexes, libcs, etc.  What the magnitude of those
differences may be, I don't know.  It first came in to python2.6
afaict so we only have that and then later updates in 3.0 and 3.1.
I suppose like for Pyrex we would have the choice of either
distributing the converted code or getting people to run it.  If we
distribute the converted code then we could fix it to some particular
version.  If we expect the user to run it then perhaps we could say
"bzr is supported on python3.0 using python3.0-2to3".
From the perspective of bug reports, I think it fits the dictionary
definition reasonably well: when other people run 2to3, the output
does not depend purely on the input we supply (the bzr code). But
anyhow, the point was made and supported by Stephen - at least
historically, 2to3 has been a very variable thing.
Post by Martin Pool
Post by Robert Collins
- folk will want 3 before there are no platforms that have python2.x and not python3.x support.
True but irrelevant
I dispute the irrelevancy; I wasn't clear enough, but AIUI 2to3 only
helps with 2.6->3.x, so if we still support 2.4, we will have
significant trouble supporting 3 from a single source tree using 2to3;
other options like a ported-trunk may work better at that point (and
the port could be to 2.6 w/2to3 at that stage)
Post by Martin Pool
Post by Robert Collins
- 2to3 means an indeterministic program is actually run
It's an algorithm, it's not indeterministic.  There may be different
versions with different bugs but I doubt it will be a larger problem
than we've had with other dependencies, and probably smaller.
See above, 2to3 is a moving target, rather like rpython (what it can
handle is rpython, what it can't isn't).
Post by Martin Pool
Post by Robert Collins
- I like to be able to properly support things
Me too, but this is also irrelevant.  We can say we support particular
2to3 versions in the same way we support particular interpreter
versions.  But more to the point we can get things passing on 3 before
we commit to supporting anything.
We've had bugs with pyrex output already, and they took more time to
analyse than would have been ideal - because the user's pyrex version
output different, unexpected, things. I see this as similar - we have
an open task for someone to start versioning the pyrex files we put in
our tarballs, and that issue is part of it. So again, I don't see this
as irrelevant. It would be irrelevant if the migration and [python
version] support strategy we use had no impact on our ability to
diagnose and solve bugs.
Post by Martin Pool
Post by Robert Collins
- distros would prefer to ship a single 'bzr' package rather than two separate ones - and because they run with precompiled files not writable by the user...
This is going to be complicated for distros, but it's a somewhat
similar problem to needing binary extensions built for different
interpreters, which they can do.
Stephen says numpy will force such support, but I fear that it will be
spotty, hard on users, and not very nice. It is a similar problem, but
in a different domain (package dependencies) and I can pretty much
guarantee that at least the Debian answer will be 'not pretty'. At
best. Of course this isn't a dominating factor for us, but it's worth
thinking about a little.
Post by Martin Pool
Post by Robert Collins
For the bytes stuff I've mentioned - see
http://bugs.python.org/issue5425 - 2to3 converts str to 'string' not
'bytes', so we have to mark up everything, and only Python 2.6
supports b'foo'. If we change the importer to go to bytes for 'str',
then we end up with all our literals that are meant for humans as
bytes, which is also wrong. And while we support < 2.6, we can't
annotate strings properly for bytes recognition without something like
a domain specific language - e.g. a _b() function.
This seems like a problem with ever wanting a single codebase to run
across 2.4..3.x, whether you support 3.x directly or by 2to3, but
probably soluble along the lines discussed in the Django page: add a
new clearer marker for "really bytes".
It's quite solvable, yes, but using explicit markers - such as the _b() function.
Post by Martin Pool
Post by Robert Collins
We're a library, with very different concerns about when we stop
supporting a version of Python to a simple end user application that
doesn't care about embedding or reuse - and typically ships an entire
Python stack anyway. AFAICT 2to3 is *entirely* driven by 'hey, you can
*switch* over easily', not by ongoing maintenance concerns; the 'add
2to3 as a compile step in your application' is deeply unsatisfying
The document Barry quoted earlier is not suggesting that you switch,
but rather you run it as a compile step.  Additional compile steps are
not great but not awful, and better than having bad code that you have
to actually edit.
I think this depends on the degree to which you trust your compile
steps :). I don't have a huge degree of trust in 2to3, based on the
anecdotal evidence of lurking in #python-dev and on the main python
developer list.
Post by Martin Pool
 * get things clean with -3
I think that this is good to do regardless.
Post by Martin Pool
 * get the test suite running under 2to3
No problem with doing that either :- but I do think that close care
should be paid to the output of 2to3
Post by Martin Pool
 * assess at this point whether we want to support actual use under 3
and/or have a separate branch and/or cut over to requiring 2.6 or 3
I'm fine with this too.

I really do believe that the dragons under the hood are all going to
be bytes vs string type related, and that we're going to find out the
hard way - e.g. by our test suite being internally consistent, but
externally faulty, due to the test suite getting cast up to string
too.

-Rob
Martin Pool
2010-06-23 06:30:32 UTC
Permalink
Post by Robert Collins
Meta: I'm not pushing for 3, I did a quick test to see how hard it was
likely to be; we can stop this conversation and just wait and see, at
any point.
I'm fine with the steps you're proposing we do now; waiting and seeing
is a fine strategy, we certainly won't be significantly better or
worse off if we wait (though we may end up with more time-pressure if
3.0 support were to become an important issue in the future for some
reason).
I think it's good to start getting some back-burner experience with
python3. If nothing else it may improve the code in the same way gz's
non-cpython patches have.

It seems we have to choose between
0 - use 2to3 as a compile step, and update the code so that it works
on 2.[456] and 2to3

You may well be right that 2to3 doesn't work well, or doesn't work
consistently across all installations, or leaves enough hard problems
unsolved that it is infeasible to have a single codebase across
2.3..3.1. In that case we would have to choose between

1- having one branch written in the common subset of 2.3..3.1
2- dropping support for 2.4 (or also 2.5) in some bzr series
3- having a separate 3.x branch

#2 and #3 are clearly possible but a bit unattractive, especially #2.
#0 and #1 may be either impossible or hard.

I'm reluctant to merge ugly or intrusive changes to trunk until we
know that we do want to do #1. I think we can get closer before
deciding.
Post by Robert Collins
Post by Martin Pool
Post by Robert Collins
- folk will want 3 before there are no platforms that have python2.x and not python3.x support.
True but irrelevant
I dispute the irrelevancy; I wasn't clear enough, but AIUI 2to3 only
helps with 2.6->3.x, so if we still support 2.4, we will have
significant trouble supporting 3 from a single source tree using 2to3;
other options like a ported-trunk may work better at that point (and
the port could be to 2.6 w/2to3 at that stage)
I guess we'll have to see whether there are any constructs where the
only way to spell it in 2.4 can't be translated by 2to3.
Post by Robert Collins
Post by Martin Pool
 * get things clean with -3
I think that this is good to do regardless.
Post by Martin Pool
 * get the test suite running under 2to3
No problem with doing that either :- but I do think that close care
should be paid to the output of 2to3
Post by Martin Pool
 * assess at this point whether we want to support actual use under 3
and/or have a separate branch and/or cut over to requiring 2.6 or 3
I'm fine with this too.
Great, let's do this then.
Post by Robert Collins
I really do believe that the dragons under the hood are all going to
be bytes vs string type related, and that we're going to find out the
hard way - e.g. by our test suite being internally consistent, but
externally faulty, due to the test suite getting cast up to string
too.
Probably. On the bright side it could make things cleaner to have a
clearer distinction.
--
Martin
Andrew Bennetts
2010-06-23 06:54:23 UTC
Permalink
Martin Pool wrote:
[...]
Post by Martin Pool
You may well be right that 2to3 doesn't work well, or doesn't work
consistently across all installations, or leaves enough hard problems
unsolved that it is infeasible to have a single codebase across
2.3..3.1. In that case we would have to choose between
1- having one branch written in the common subset of 2.3..3.1
2- dropping support for 2.4 (or also 2.5) in some bzr series
3- having a separate 3.x branch
Or, if the problem is just “not consistent across all installations” (as
opposed to 2to3 not working well for us at all):

4- run 2to3 ourselves (using whichever version and extensions we
consider best), and distribute the output.

IIRC, we already distribute tarballs with .c files that we have
generated from our .pyx source. This doesn't strike me as much
different.

-Andrew.
Gordon Tyler
2010-06-23 14:10:54 UTC
Permalink
Post by Andrew Bennetts
Or, if the problem is just “not consistent across all installations” (as
4- run 2to3 ourselves (using whichever version and extensions we
consider best), and distribute the output.
IIRC, we already distribute tarballs with .c files that we have
generated from our .pyx source. This doesn't strike me as much
different.
This sounds like the best option to me.

Ciao,
Gordon
Stephen J. Turnbull
2010-06-23 08:42:33 UTC
Permalink
Post by Robert Collins
Post by Martin Pool
Post by Robert Collins
- distros would prefer to ship a single 'bzr' package rather
than two separate ones - and because they run with precompiled
files not writable by the user...
This is going to be complicated for distros, but it's a somewhat
similar problem to needing binary extensions built for different
interpreters, which they can do.
Stephen says numpy will force such support, but I fear that it will be
spotty, hard on users, and not very nice.
But note I also think the time frame for Bazaar to really get going on
Python 3 support is 2-3 years. Yes, Debian & Ubuntu will take a long
time (~5 years?) to shake down on this, but numpy is not the only such
package. They *have* to do something. (More precisely, their Python
cabals need to do something, or the distro as a whole will move on.)
Gordon Tyler
2010-06-23 14:04:45 UTC
Permalink
Post by Robert Collins
For the bytes stuff I've mentioned - see
http://bugs.python.org/issue5425 - 2to3 converts str to 'string' not
'bytes', so we have to mark up everything, and only Python 2.6
supports b'foo'. If we change the importer to go to bytes for 'str',
then we end up with all our literals that are meant for humans as
bytes, which is also wrong. And while we support < 2.6, we can't
annotate strings properly for bytes recognition without something like
a domain specific language - e.g. a _b() function.
Why aren't all string literals meant for users marked as unicode in the
current codebase?

Ciao,
Gordon
John Arbash Meinel
2010-06-23 15:45:46 UTC
Permalink
Post by Gordon Tyler
Post by Robert Collins
For the bytes stuff I've mentioned - see
http://bugs.python.org/issue5425 - 2to3 converts str to 'string' not
'bytes', so we have to mark up everything, and only Python 2.6
supports b'foo'. If we change the importer to go to bytes for 'str',
then we end up with all our literals that are meant for humans as
bytes, which is also wrong. And while we support < 2.6, we can't
annotate strings properly for bytes recognition without something like
a domain specific language - e.g. a _b() function.
Why aren't all string literals meant for users marked as unicode in the
current codebase?
Ciao,
Gordon
I would say the #1 reason (by a large margin) is that they "didn't need
to be". As such, someone needs to put in the time to evaluate whether it
is worthwhile.

There are also things like "unicode + str == unicode" so if you have
your hard-coded strings as plain 'str', then if you ever concatenate you
don't accidentally auto-cast.

There are also things like diff headers, where it is a bit more unclear
whether it is valid to have them as Unicode. (even though users see them)

I also don't know how long __unicode__ has existed, but certainly in the
*bzrlib* codebase we haven't done unicode(exception) we've done
str(exception).

This also gets a bit messy when dealing with OS errors. Where it is very
common to have the OS return a localized error message in some 8-bit
string, which we then want to combine with say a Unicode path.
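A rough sketch of the "decode at the boundary" handling this implies, written for Python 3 (or 2.6+, where `bytes` is an alias for `str`). The function name `describe_failure` and its parameters are hypothetical, and falling back to the locale's preferred encoding is an assumption, not what bzrlib does.

```python
import locale


def describe_failure(os_message, path, encoding=None):
    """Combine an OS error message with a Unicode path.

    os_message may arrive as an 8-bit string in the user's locale
    encoding; it is decoded once, here, before being mixed with text.
    """
    if isinstance(os_message, bytes):
        if encoding is None:
            # Assumption: the OS message uses the locale's encoding.
            encoding = locale.getpreferredencoding() or "ascii"
        # "replace" keeps a damaged message readable instead of raising.
        os_message = os_message.decode(encoding, "replace")
    return "%s: %s" % (path, os_message)
```

For example, `describe_failure(b"No such file", "notes.txt", "utf-8")` yields a single text string with both parts in it.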

John
=:->
Gordon Tyler
2010-06-23 17:28:19 UTC
Permalink
Post by John Arbash Meinel
There are also things like diff headers, where it is a bit more unclear
whether it is valid to have them as Unicode. (even though users see them)
This is perhaps due to my being a Java programmer for 10 years but it
seems quite logical to have the internal representation of a string in
Unicode and encode that to the required encoding when it "exits" the
application, i.e. written to stdout/file/socket/etc.

Of course, it may be tricky getting an external input into Unicode if
you're not certain what the encoding is.
Post by John Arbash Meinel
I also don't know how long __unicode__ has existed, but certainly in the
*bzrlib* codebase we haven't done unicode(exception) we've done
str(exception).
Apparently unicode(exception) won't do what you want it to do:
http://groups.google.com/group/comp.lang.python/browse_thread/thread/f3d0a1f554583eac
Post by John Arbash Meinel
This also gets a bit messy when dealing with OS errors. Where it is very
common to have the OS return a localized error message in some 8-bit
string, which we then want to combine with say a Unicode path.
The boilerplate to convert to unicode could be somewhat tiresome, I
suppose. Although, would it not be possible to monkeypatch Exception to
add a __unicode__ method that does the right thing?

Ciao,
Gordon
Martin Pool
2010-06-23 22:59:05 UTC
Permalink
Post by Gordon Tyler
Post by John Arbash Meinel
There are also things like diff headers, where it is a bit more unclear
whether it is valid to have them as Unicode. (even though users see them)
(I think the illuminating example there is to consider a UTF-16
encoded diff - istm the headers should be UTF-16, but whether patch
can actually read such a thing is a different question.)
Post by Gordon Tyler
This is perhaps due to my being a Java programmer for 10 years but it
seems quite logical to have the internal representation of a string in
Unicode and encode that to the required encoding when it "exits" the
application, i.e. written to stdout/file/socket/etc.
Of course, it may be tricky getting an external input into Unicode if
you're not certain what the encoding is.
Java was designed from the start with a clear separation between
strings which are Unicode and byte arrays. Python supported both but
in a kind of fuzzy way, with the default for literals being a byte
string in 2.0.

http://blog.labix.org/2009/07/02/screwing-up-python-compatibility-unicode-str-bytes

Our general approach is also to normally convert to/from byte encoding
on the boundary but the 2.x environment means that either sometimes we
couldn't follow that approach consistently, or there may be latent
inadvertent cases where we don't follow it. This is one reason why
getting the tests to pass under 2to3, even if we don't want to
officially support that, may find some interesting bugs.
--
Martin
John Arbash Meinel
2010-06-23 23:51:45 UTC
Permalink
...
Post by Martin Pool
http://blog.labix.org/2009/07/02/screwing-up-python-compatibility-unicode-str-bytes
Our general approach is also to normally convert to/from byte encoding
on the boundary but the 2.x environment means that either sometimes we
couldn't follow that approach consistently, or there may be latent
inadvertent cases where we don't follow it. This is one reason why
getting the tests to pass under 2to3, even if we don't want to
officially support that, may find some interesting bugs.
I'll also note that we used to have file-ids and revision-ids as Unicode
strings in memory, but intentionally switched back to 8-bit strings.
Mostly for performance/memory consumption. (Default Unicode on most
Unixes is UCS-4, which means we bloated every revision-id and file-id by
approximately 4x. And when you have 100,000+ of them, that gets pretty big)
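The 4x figure can be checked with a little arithmetic. This sketch uses a made-up revision-id and counts only the character payload (per-object overhead is ignored); UTF-32 stands in for a UCS-4 build's internal storage.

```python
# Hypothetical revision-id, ASCII only, as most of them are.
revision_id = "joe@example.com-20100623235145-0123456789abcdef"

as_bytes = revision_id.encode("ascii")      # one byte per character
as_ucs4 = revision_id.encode("utf-32-le")   # four bytes per character, no BOM

count = 100000                              # "100,000+ of them"
bytes_total = len(as_bytes) * count
ucs4_total = len(as_ucs4) * count
ratio = ucs4_total // bytes_total           # payload growth factor
```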


John
=:->
Benjamin Peterson
2010-06-23 02:08:45 UTC
Permalink
Post by John Arbash Meinel
I thought 2to3 doesn't actually handle that. As it expects you've
This is incorrect. 2to3 does translate it.
Russel Winder
2010-06-23 06:00:43 UTC
Permalink
Martin,
Post by Martin Pool
On the other hand, Robert's patch has things like
+ e = sys.exc_info()[1]
which is clearly worse, and clearly also the kind of thing that can be
easily mechanically translated by 2to3. I wouldn't like to see that
kind of thing merged until we've decided we really want a single
codebase that works on both with no translation.
I think this post and all subsequent issues regarding this code fragment
are based on the wrong assumption and so much of it is effectively
wrong. The recommended transformation is:

except locale.Error, e:

->

except locale.Error as e:

which is clearly more or less a no-op. If 2to3 generates anything else
then it is broken.
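For comparison, a small sketch of the single-source spelling under discussion. The `as` form is only accepted from Python 2.6 on, while `sys.exc_info()[1]` is accepted by 2.4 through 3.x. The function `parse_int` is a made-up example, not bzrlib code.

```python
import sys


def parse_int(text):
    """Return int(text), or a short message describing the failure."""
    try:
        return int(text)
    except ValueError:
        # Single-source way to get at the exception instance without
        # using either "except ValueError, e" or "except ValueError as e".
        e = sys.exc_info()[1]
        return "failed: %s" % (e,)
```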

Clearly the issue of supporting pre-2.6 Python has an impact here but
only if you are trying to use a single code base to deal with Pythons
1.5 -> 2.6. The SCons folk seem to be the real masters at doing this,
but they have just given up on Pythons 1.5, 2.0->2.3 and now have 2.4 as
a floor. The issue of Python 3 is now arising but not being addressed
just yet. They do have a very interesting approach to using a single
code base for all supported Pythons, that might be useful input to the
debate -- or not. Just a thought.
--
Russel.
=============================================================================
Dr Russel Winder t: +44 20 7585 2200 voip: sip:***@ekiga.net
41 Buckmaster Road m: +44 7770 465 077 xmpp: ***@russel.org.uk
London SW11 1EN, UK w: www.russel.org.uk skype: russel_winder
John Barstow
2010-06-22 22:32:56 UTC
Permalink
On Wed, Jun 23, 2010 at 9:47 AM, Robert Collins
Post by Robert Collins
I've done a couple of trivial patches moving in the direction of being
able to play with Python 3 and bzr.
Hmmm. These all seem to ignore the porting guidelines
[http://docs.python.org/py3k/whatsnew/3.0.html#porting-to-python-3-0]
There's actually very little we would need to do if we followed those
guidelines.
Post by Robert Collins
I'd like to document some changes to our coding style, to make
eventual Python 3 migration easier - but they aren't all no-brainers
so I thought I'd raise stuff here and we can discuss it.
octal 0666 ->0o666
print foo, bar -> print("%s %s" % (foo, bar))
exec foo, locals() -> exec(foo, locals())
These are handled by the numliterals, print, and exec fixers. If we
change our coding style we will need to disable these fixers when
using 2to3.
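As a sketch of how these three look when written so that one source file is accepted by both parsers: the variable names are invented for the example, and `int("666", 8)` is swapped in for the octal literal because the `0o666` spelling itself is only accepted from Python 2.6 on.

```python
foo, bar = "spam", 42

# print with one parenthesised argument is a statement on 2 and a
# function call on 3, with the same output either way.
line = "%s %s" % (foo, bar)
print(line)

# exec(code, namespace) is accepted by both: Python 2 treats the
# parenthesised form as "exec code in namespace".
namespace = {}
exec("answer = 6 * 7", namespace)

# A literal-free way to spell an octal mode on any version.
mode = int("666", 8)
```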
Post by Robert Collins
->
   e = sys.exc_info()[1]
Shouldn't that be "except Foo as e:"? Handled by the except fixer.
Post by Robert Collins
This has the potential to slow load times slightly: in 3 to get a
bytestring one says b'foo', but you can't do that in 2, so for
 _b('foo')
which on 2 is a no-op, and on 3 reencodes using latin-1 (so everything
works). Or we could split out separate files to import on 2 and 3, but
I think the extra seek would make it a wash perf wise.
Well, Python 2.6 supports bytestring literals, so you really only need
to do this if you're committed to supporting 2.4/2.5 going forward.
Personally I think it would make sense to simply drop support for
anything older than 2.6 when we decide we want to start supporting
3.x, then you could just use bytestring literals.
Post by Robert Collins
   raise type, val, tb
->
   bzrlib.util.py._builtin._reraise(type, val, tb)
The raises fixer will rewrite this as "raise
type(val).with_traceback(tb)"; personally I find your proposal much
uglier.
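For reference, a minimal sketch of how a `_reraise` helper of this kind can be written so that one file imports on both 2 and 3. This is an assumption about its shape, not the code from Robert's patch, and `run_with_cleanup` is a made-up caller. It assumes `value` is an exception instance.

```python
import sys

if sys.version_info[0] >= 3:
    def _reraise(tp, value, tb):
        # Python 3 spelling: attach the traceback to the instance.
        raise value.with_traceback(tb)
else:
    # The three-argument raise is a syntax error on Python 3, so the
    # Python 2 definition is hidden from the Python 3 parser in a string.
    exec("def _reraise(tp, value, tb):\n    raise tp, value, tb\n")


def run_with_cleanup(operation, cleanup):
    """Run operation(); on failure run cleanup() and re-raise as-is."""
    try:
        return operation()
    except Exception:
        exc_info = sys.exc_info()
        cleanup()
        # Re-raise the original exception with its original traceback.
        _reraise(*exc_info)
```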

So I guess that makes my counterproposal to adopt a minimum
requirement of Python 2.6 and use 2to3 until such time as we drop
2.x support.
Robert Collins
2010-06-22 22:54:06 UTC
Permalink
We could - barely - consider dropping 2.4 (note that the oldest
supported RHEL, as I understand it, is still python 2.3 only).

I'm very sure we would be shooting ourselves in the foot to drop 2.5.

As for the extension guidelines, 'meh'. For reference they are:
--
For porting existing Python 2.5 or 2.6 source code to Python 3.0, the
best strategy is the following:

0. (Prerequisite:) Start with excellent test coverage.
1. Port to Python 2.6. This should be no more work than the average port
   from Python 2.x to Python 2.(x+1). Make sure all your tests pass.
2. (Still using 2.6:) Turn on the -3 command line switch. This enables
   warnings about features that will be removed (or change) in 3.0. Run
   your test suite again, and fix code that you get warnings about until
   there are no warnings left, and all your tests still pass.
3. Run the 2to3 source-to-source translator over your source code tree.
   (See 2to3 - Automated Python 2 to 3 code translation for more on this
   tool.) Run the result of the translation under Python 3.0. Manually
   fix up any remaining issues, fixing problems until all tests pass
   again.

It is not recommended to try to write source code that runs unchanged
under both Python 2.6 and 3.0; you’d have to use a very contorted
coding style, e.g. avoiding print statements, metaclasses, and much
more. If you are maintaining a library that needs to support both
Python 2.6 and Python 3.0, the best approach is to modify step 3 above
by editing the 2.6 version of the source code and running the 2to3
translator again, rather than editing the 3.0 version of the source
code.
--

We haven't really ignored them - we are ported to 2.6, we have
fantastic test coverage. Folk who are doing Python 3 work can use the
-3 switch; I'm interested here in how we enable those folk to have an
easy life, not in the mechanics of actually doing the work - that is a
separate, albeit related, problem.
2to3 is a problem for us. Here's the problem:
- folk will want 3 before there are no platforms that have python2.x
and not python3.x support.
- 2to3 means a nondeterministic program is actually run
- I like to be able to properly support things
- distros would prefer to ship a single 'bzr' package rather than two
separate ones - and because they run with precompiled files not
writable by the user...

python2to3 is essentially useless to us. IMNSHO. It's a great tool for
understanding *the first pass* of what needs to be changed, but not
for actually having a supported result.

As for the 'not recommended' clause - the *Python 3* release manager
has submitted patches to another project - testtools - that does
precisely that, a single source that works. And it's actually pretty
nice.

-Rob
John Arbash Meinel
2010-06-22 22:57:19 UTC
Permalink
Post by Robert Collins
We could - barely - consider dropping 2.4 (note that the oldest
supported RHEL, as I understand it, is still python 2.3 only).
The problem isn't as much that the *oldest* is so old, but that the
*newest* actual release (RHEL5) only has 2.4. RHEL6 isn't due out for a
while yet.

John
=:->
John Arbash Meinel
2010-06-22 23:02:05 UTC
Permalink
Post by Robert Collins
--
...
Post by Robert Collins
It is not recommended to try to write source code that runs unchanged
under both Python 2.6 and 3.0; you’d have to use a very contorted
coding style, e.g. avoiding print statements, metaclasses, and much
more. If you are maintaining a library that needs to support both
Python 2.6 and Python 3.0, the best approach is to modify step 3 above
by editing the 2.6 version of the source code and running the 2to3
translator again, rather than editing the 3.0 version of the source
code.
--
a) We actually try to avoid 'print' since we really want to be writing
to specific file output (and print >>foo is pretty ugly anyway)

b) we don't really use metaclasses. I know we use __new__ in one or two
places (revspec), but really only because we wanted to compatibly raise
an error if someone used __init__ directly.
Post by Robert Collins
We haven't really ignored them - we are ported to 2.6, we have
fantastic test coverage. Folk who are doing Python 3 work can use the
-3 switch; I'm interested here in how we enable those folk to have an
easy life, not in the mechanics of actually doing the work - that is a
separate, albeit related, problem.
- folk will want 3 before there are no platforms that have python2.x
and not python3.x support.
- 2to3 means an indeterministic program is actually run
- I like to be able to properly support things
- distros would prefer to ship a single 'bzr' package rather than two
separate ones - and because they run with precompiled files not
writable by the user...
python2to3 is essentially useless to us. IMNSHO. Its a great tool for
understanding *the first pass* of what needs to be changed, but not
for actually having a supported result.
As for the 'not recommended' clause - the *Python 3* release manager
has submitted patches to another project - testtools - that does
precisely that, a single source that works. And its actually pretty
nice.
-Rob
I'll just note that PyQt has some v3 support, and it causes some real
headaches for the Windows installer. Because the code is listed in a
conditional import (something like: if sys.? > (2, 5, ...) import py3).
It means that py2exe wants to include that code, but can't even import
it without getting an exception.

I don't know what that would mean for us, but splitting the codebase
there wasn't nice for some 3rd party tools.

John
=:->
John Arbash Meinel
2010-06-22 22:50:36 UTC
Permalink
...
Post by Robert Collins
I think that ideally, in a year or so we'd be in a position to make a
concerted push to make 3 a first class citizen (because 3 is getting
considerable upstream and in-distribution attention).
Would you really be trying to do it by making the source code dual
compatible, rather than using either something like 2to3, or by
converting and using something like 3to2?
Post by Robert Collins
If the consensus from this thread is that this is ok, I'll update
HACKING docs appropriately.
-Rob
John
=:->
Robert Collins
2010-06-22 23:00:01 UTC
Permalink
On Wed, Jun 23, 2010 at 10:50 AM, John Arbash Meinel
...
Post by Robert Collins
I think that ideally, in a year or so we'd be in a position to make a
concerted push to make 3 a first class citizen (because 3 is getting
considerable upstream and in-distribution attention).
Would you really be trying to do it by making the source code dual
compatible, rather than using either something like 2to3, or by
converting and using something like 3to2?
There are a bunch of ways we could do it. We haven't chosen one as a
project - and I think that's fine. Python3 is over the horizon for us.

However, there is interest in it now from various places, and that has
been growing over the last couple of years.

As I see it we have 2 major issues to address, which 2to3 and a 3to2
totally bail on.

Ignore syntax, syntax is easy and largely irrelevant - it's noise.

C extensions are going to need an overhaul. We're going to want to
keep fixing two versions for a while; ifdefs and macros and pyrex are
good tools here; python2to3 doesn't help *at all*. See
http://docs.python.org/py3k/howto/cporting.html#cporting-howto

Secondly, we have the *huge*, appallingly huge, mess that is 'all
strings are unicode' in Python3. I expect a majority of the work is
going to be identifying which are which, annotating them appropriately
and dealing with the fallout of things like listdir eliding damaged
filenames (for some versions of 3.x). We may need to set a minimum
version of 3 to use - though I don't claim to have any idea here :).
Adding a _u() decorator - like Benjamin's patch in testtools - to make
that explicit would be a fine step, and one that slipped my mind at
the start of this thread. (Such a decorator is a no-op on 3, like _b()
is on 2).

-Rob
Ben Finney
2010-06-22 23:52:54 UTC
Permalink
Post by Robert Collins
Adding a _u() decorator - like Benjamins patch in testtools - to make
that explicit would be a fine step, and one that slipped my mind at
the start of this thread.
Why not simply use Python 2's “u'foo'” syntax?

This doesn't entail maintaining two separate code bases: rather the
Python 2 code base is what gets maintained. A run of ‘2to3’ followed by
the full test suite run under Python 3 can be an indicator of how ready
the code base is for Python 3.

(This doesn't cover C modules, but they're a separate headache either
way.)

Eventually, the decision is made to switch to Python 3. This is only
done some time after the ‘2to3’-output passes the full test suite under
Python 3.

So why would this ‘_u()’ shim be necessary or desirable?
--
\ “Software patents provide one more means of controlling access |
`\ to information. They are the tool of choice for the internet |
_o__) highwayman.” —Anthony Taylor |
Ben Finney
Robert Collins
2010-06-23 00:08:43 UTC
Permalink
Post by Ben Finney
Post by Robert Collins
Adding a _u() decorator - like Benjamins patch in testtools - to make
that explicit would be a fine step, and one that slipped my mind at
the start of this thread.
Why not simply use Python 2's “u'foo'” syntax?
It is incompatible with the Python 3 parser.

***@lifeless-64:~$ python3.1
Python 3.1.2 (r312:79147, Apr 15 2010, 15:35:48)
[GCC 4.4.3] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> u'\u1234'
File "<stdin>", line 1
u'\u1234'
^
SyntaxError: invalid syntax


This is the docstring from testtools.compat for its _u function (just
tidied up a few minutes ago):
+"""A function version of the 'u' prefix.
+
+This is needed because the u prefix is not usable in Python 3 but is required
+in Python 2 to get a unicode object.
+
+To migrate code that was written as u'\u1234' in Python 2 to 2+3 change
+it to be _u('\u1234'). The Python 3 interpreter will decode it
+appropriately and the no-op _u for Python 3 lets it through, in Python
+2 we then call unicode-escape in the _u function.
+"""
Post by Ben Finney
This doesn't entail maintaining two separate code bases: rather the
Python 2 code base is what gets maintained. A run of ‘2to3’ followed by
the full test suite run under Python 3 can be an indicator of how ready
the code base is for Python 3.
Modulo:
- 2to3 limits and issues
- users with unknown versions of 2to3
- every distro out there
Post by Ben Finney
(This doesn't cover C modules, but they're a separate headache either
way.)
And bytestrings.
Post by Ben Finney
Eventually, the decision is made to switch to Python 3. This is only
done some time after the ‘2to3’-output passes the full test suite under
Python 3.
So why would this ‘_u()’ shim be necessary or desirable?
Because 2to3 is not complete, and structuring 'work with Python3' as a
cutover event - 'switch to 3 only when it's ready' - does not fit in with
the evolutionary approach we've taken to supporting Python 2.5, then
2.6, and 2.7 now.

Cutover transitions are harsh, slower than expected, dilute community
support, leave users behind unnecessarily.

-Rob
Ben Finney
2010-06-23 00:46:15 UTC
Permalink
Post by Robert Collins
Post by Ben Finney
Why not simply use Python 2's “u'foo'” syntax?
It is incompatible with the Python 3 parser.
I thought the agreement was that the code base shouldn't try to target
Python 2 and Python 3 simultaneously? That's certainly the
recommendation from the Python folks.
Post by Robert Collins
Post by Ben Finney
This doesn't entail maintaining two separate code bases: rather the
Python 2 code base is what gets maintained. A run of ‘2to3’ followed by
the full test suite run under Python 3 can be an indicator of how ready
the code base is for Python 3.
- 2to3 limits and issues
- users with unknown versions of 2to3
- every distro out there
I'm not suggesting that Bazaar users do the conversion. Rather, that
those automated tools are a way for the Bazaar developers to measure how
close the code base is to being ready for the switch to Python 3.
Post by Robert Collins
Because 2to3 is not complete, and structuring 'work with Python3' as a
cutover event - 'switch to 3 only when its ready' does not fit in with
the evolutionary approach we've taken to supporting Python 2.5, then
2.6, and 2.7 now.
Cutover transitions are harsh, slower than expected, dilute community
support, leave users behind unnecessarily.
Well, I disagree with the direction, but I respect that it's not my
workload being increased.
--
\ “Unix is an operating system, OS/2 is half an operating system, |
`\ Windows is a shell, and DOS is a boot partition virus.” —Peter |
_o__) H. Coffin |
Ben Finney
Stephen J. Turnbull
2010-06-23 02:48:37 UTC
Permalink
Post by Robert Collins
Cutover transitions are harsh, slower than expected, dilute community
support, leave users behind unnecessarily.
Maybe the net work increase would be less if you cut over but provide
a bundled installation with Python 3 of appropriate version (optional,
of course)? At least for early versions of bzr/py3. I understand why
this hasn't been done for 2.x, but Python 3 is a horse of a different
color.

Of course that might not be feasible due to IT policy of RHEL users
etc.
Vincent Ladeuil
2010-06-23 07:47:20 UTC
Permalink
<snip/>
Post by Robert Collins
This is the docstring from testtools.compat for its _u function (just
+"""A function version of the 'u' prefix.
+
+This is needed because the u prefix is not usable in Python 3 but is required
+in Python 2 to get a unicode object.
+
+To migrate code that was written as u'\u1234' in Python 2 to 2+3 change
+it to be _u('\u1234'). The Python 3 interpreter will decode it
s/decode it/decode '\u1234'/ ?

I had to re-read to come to this conclusion ("Python3 will decode" and
"Python 3 lets it through" are contradictory otherwise).
Post by Robert Collins
+appropriately and the no-op _u for Python 3 lets it through, in Python
+2 we then call unicode-escape in the _u function.
+"""
Robert Collins
2010-06-23 07:48:54 UTC
Permalink
Post by Vincent Ladeuil
<snip/>
   > This is the docstring from testtools.compat for its _u function (just
   > +"""A function version of the 'u' prefix.
   > +
   > +This is needed because the u prefix is not usable in Python 3 but is required
   > +in Python 2 to get a unicode object.
   > +
   > +To migrate code that was written as u'\u1234' in Python 2 to 2+3 change
   > +it to be _u('\u1234'). The Python 3 interpreter will decode it
s/decode it/decode '\u1234'/ ?
I had to re-read to come to this conclusion ("Python3 will decode" and
"Python 3 lets it through" are contradictory otherwise).
Please put a patch forward to testtools - it can probably be improved.

There are two function definitions, one for 2.x and one for 3, and the
explanation needs to explain how you end up with the right stuff on
both versions, and why it's needed.

-Rob
Jelmer Vernooij
2010-06-23 00:58:09 UTC
Permalink
Post by Robert Collins
I've done a couple of trivial patches moving in the direction of being
able to play with Python 3 and bzr.
I'd like to document some changes to our coding style, to make
eventual Python 3 migration easier - but they aren't all no-brainers
so I thought I'd raise stuff here and we can discuss it.
octal 0666 ->0o666
print foo, bar -> print("%s %s" % (foo, bar))
exec foo, locals() -> exec(foo, locals())
->
e = sys.exc_info()[1]
This has the potential to slow load times slightly: in 3 to get a
bytestring one says b'foo', but you can't do that in 2, so for
_b('foo')
which on 2 is a no-op, and on 3 reencodes using latin-1 (so everything
works). Or we could split out separate files to import on 2 and 3, but
I think the extra seek would make it a wash perf wise.
raise type, val, tb
->
bzrlib.util.py._builtin._reraise(type, val, tb)
Now, there are a lot of other things that we will have to solve and
talk about, this is really top level mechanical stuff and should not
be taken as the whole list or a magic bullet.
However, I think changing our coding style this and not much more will
be enough to let interested people slowly push forward and get things
- the test suite
- C / pyrex modules
and so forth working on 3.
I think that ideally, in a year or so we'd be in a position to make a
concerted push to make 3 a first class citizen (because 3 is getting
considerable upstream and in-distribution attention).
If the consensus from this thread is that this is ok, I'll update
HACKING docs appropriately.
Is supporting the two versions (that are quite different, at least where
Bazaar is concerned) from the same code base really a good idea ?

I'm wondering if having a separate "python 3" branch of Bazaar to which
newer revisions are explicitly ported is perhaps less of a strain on
development than having to fix up portability issues in the main branch
all the time and by introducing various wrappers around standard
functions.

Cheers,

Jelmer
Andrew Bennetts
2010-06-23 02:15:18 UTC
Robert Collins wrote:
[...]
Post by Robert Collins
I think that ideally, in a year or so we'd be in a position to make a
concerted push to make 3 a first class citizen (because 3 is getting
considerable upstream and in-distribution attention).
To be honest, I'd be inclined to wait. I don't see much benefit to us
pushing towards Python 3 support at the moment, and conversely I don't
think it'll be any harder to start doing that later — in fact it may get
easier as tools like 2to3 (or 3to2) improve, and it gets easier for us
to consider dropping Python 2.4, etc.

My suspicion is that we simply don't have the resources to make 3 a
first class citizen without a negative impact on our 2.x support, if you
include “no significant performance hit” in the meaning of “first class
citizen”. The change to bytes and unicode handling seems like something
that will be difficult to support in 2.5 and 3.x from the same code base
without a performance hit, because b"" literals were only added in 2.6.

What benefits do you see? As far as I can find with Google we've seen a
single inquiry about bzr on Python 3, so there doesn't seem to be much
demand for a port so far.

-Andrew.
Martin Pool
2010-06-23 02:43:54 UTC
[...]
Post by Robert Collins
I think that ideally, in a year or so we'd be in a position to make a
concerted push to make 3 a first class citizen (because 3 is getting
considerable upstream and in-distribution attention).
To be honest, I'd be inclined to wait.  I don't see much benefit to us
pushing towards Python 3 support at the moment, and conversely I don't
think it'll be any harder to start doing that later — in fact it may get
easier as tools like 2to3 (or 3to2) improve, and it gets easier for us
to consider dropping Python 2.4, etc.
I don't see any rush to make 3 the default platform. I see some
benefit in testing on 3 so that we get earlier warning of any issues
and just because good testing across more environments can make some
hidden bugs easier to fix. I like the idea of gradually getting the
test suite to pass on 2to3 by making only clean changes to the main
trunk because it will not hurt people not using py3.
--
Martin
Stephen J. Turnbull
2010-06-23 05:11:14 UTC
The change to bytes and unicode handling seems like something that
will be difficult to support in 2.5 and 3.x from the same code base
without a performance hit, because b"" literals were only added in 2.6.
I think the _b() device should handle literals. If the 2.x compiler
doesn't already optimize _b('Andrew') to 'Andrew', some sort of
preprocessing step would be ugly but not impossible, or a monkey patch
to the compiler.
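
As a minimal sketch of the _b() device under discussion (the function body here is an assumption based on the description "a no-op on 2, reencodes using latin-1 on 3"; it is not taken from bzrlib):

```python
import sys

if sys.version_info[0] >= 3:
    def _b(s):
        # Python 3: 'foo' is text. latin-1 maps code points 0-255 one-to-one
        # onto byte values, so the bytes match what Python 2 would have had.
        return s.encode("latin-1")
else:
    def _b(s):
        # Python 2: 'foo' is already a byte string, so this is a no-op.
        return s

# Binding the result at module level pays the call cost once, at import.
HEADER = _b("foo\n")
```

Whether the remaining per-import cost matters is exactly the load-time question raised earlier in the thread; the sketch only shows the mechanism.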

The issue that worries me (as is being discussed in a different
context on python-dev) is that some modules in the stdlib now expect
to be passed str = unicode instead of str = bytes. To the extent that
such a module is on the critical path, you could take a big
performance hit there, or face the need to rewrite some of the stdlib
to provide a parallel bytes API.

OTOH, this would only affect Python 3, and you have plenty of time to
wait for or design performance improvements there. N.B. Guido has
expressed support for "polymorphism", which he defines to mean that
functions which process strings (bytes or unicode) return the type fed
to them (and error on inconsistency). I don't know if that applies to
Python 3.2, though (FLUFL, WDYT?)
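
To make the "polymorphism" definition concrete, here is a small Python 3 sketch. The join function is hypothetical (it is not a stdlib API); it returns the type it was fed and raises an error when the two types are mixed.

```python
def join(a, b):
    """Join two path segments, returning the same type that was passed in."""
    if type(a) is not type(b):
        # "error on inconsistency": refuse to guess an encoding.
        raise TypeError("cannot mix %s and %s"
                        % (type(a).__name__, type(b).__name__))
    sep = b"/" if isinstance(a, bytes) else "/"
    return a + sep + b
```

With this shape, join("a", "b") gives text and join(b"a", b"b") gives bytes, so a bytes-only caller never pays for a decode/encode round trip.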
Barry Warsaw
2010-06-23 12:55:50 UTC
Post by Stephen J. Turnbull
OTOH, this would only affect Python 3, and you have plenty of time to
wait for or design performance improvements there. N.B. Guido has
expressed support for "polymorphism", which he defines to mean that
functions which process strings (bytes or unicode) return the type fed
to them (and error on inconsistency). I don't know if that applies to
Python 3.2, though (FLUFL, WDYT?)
I'm not sure what there would be to do for Python 3.2, i.e. on the
enforcement/code change side. It sounds like Guido's polymorphism
recommendation is little more than just that, but TBH, lack of time is forcing
me to tune out on that particular thread for now. :/

-Barry
Russel Winder
2010-06-23 06:16:11 UTC
Andrew,

On Wed, 2010-06-23 at 12:15 +1000, Andrew Bennetts wrote:
[ . . . ]
Post by Andrew Bennetts
What benefits do you see? As far as I can find with Google we've seen a
single inquiry about bzr on Python 3, so there doesn't seem to be much
demand for a port so far.
Not entirely true, some people haven't asked because they know the
answer already and they just install 2.6 in order to use Bazaar, SCons,
etc. whilst using 3.1 for their own code.

There is a "chicken and egg" situation here, no-one is moving to 3.1
because no-one is moving to 3.1. The fact that major distros still
haven't caught up to 2.6 is probably the major factor.

If pre-2.6 is an important factor for Bazaar then thinking about 3.1 is
something to work on in the background. The question is whether pre-2.6
Pythons are used by users of Bazaar. Are there statistics or is this
just an assumption? I have no data, I am just asking the question.
--
Russel.
=============================================================================
Dr Russel Winder t: +44 20 7585 2200 voip: sip:***@ekiga.net
41 Buckmaster Road m: +44 7770 465 077 xmpp: ***@russel.org.uk
London SW11 1EN, UK w: www.russel.org.uk skype: russel_winder
John Szakmeister
2010-06-23 07:13:56 UTC
On Wed, Jun 23, 2010 at 2:16 AM, Russel Winder <***@russel.org.uk> wrote:
[snip]
Post by Russel Winder
If pre-2.6 is an important factor for Bazaar then thinking about 3.1 is
something to work on in the background.  The question is whether pre-2.6
Pythons are used by users of Bazaar.  Are there statistics or is this
just an assumption?  I have no data, I am just asking the question.
I'm not sure I can offer much here, but a number of dev systems I
see are running with Python < 2.6. Python 2.5 is really pervasive,
and we still see 2.4 on the server side quite a bit. That's just one
example, but hopefully you see that the concerns are real.

-John
Russel Winder
2010-06-23 05:52:52 UTC
Robert,

On Wed, 2010-06-23 at 09:47 +1200, Robert Collins wrote:
[ . . . ]
Post by Robert Collins
print foo, bar -> print("%s %s" % (foo, bar))
[ . . . ]

Just to protect against creation of misinformation:


print foo , bar -> print ( foo , bar )

Also the "%s" and % operator are deprecated in Python 3, so:

print ( "%s %s" % ( foo , bar ) ) -> print ( "{} {}".format ( foo , bar ) )
Post by Robert Collins
->
e = sys.exc_info()[1]
Why?

except Foo , e : -> except Foo as e :


I find that the following document is very helpful:

http://ptgmedia.pearsoncmg.com/imprint_downloads/informit/promotions/python/python2python3.pdf
--
Russel.
Robert Collins
2010-06-23 05:55:52 UTC
Post by Russel Winder
Robert,
[ . . . ]
Post by Robert Collins
print foo, bar -> print("%s %s" % (foo, bar))
[ . . . ]
print foo , bar  ->  print ( foo , bar )
print ( "%s %s" % ( foo , bar ) ) -> print ( "{} {}".format ( foo ,
bar ) )
Oh yes; still, in 3.1 they still work, which is good enough for now :).
working$ python2.5
Python 2.5.5 (r255:77872, Jan 31 2010, 21:34:29)
[GCC 4.4.3] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> try:
...     1/0
... except Exception as e:
<stdin>:3: Warning: 'as' will become a reserved keyword in Python 2.6
  File "<stdin>", line 3
    except Exception as e:
                      ^
SyntaxError: invalid syntax
Post by Russel Winder
http://ptgmedia.pearsoncmg.com/imprint_downloads/informit/promotions/python/python2python3.pdf
Thanks.

-Rob
Russel Winder
2010-06-23 06:09:32 UTC
On Wed, 2010-06-23 at 17:55 +1200, Robert Collins wrote:
[ . . . ]
Post by Robert Collins
working$ python2.5
Python 2.5.5 (r255:77872, Jan 31 2010, 21:34:29)
[GCC 4.4.3] on linux2
Type "help", "copyright", "credits" or "license" for more information.
... 1/0
<stdin>:3: Warning: 'as' will become a reserved keyword in Python 2.6
File "<stdin>", line 3
^
SyntaxError: invalid syntax
Indeed, and true for all pre-2.6 Pythons. All the 2.x -> 3.1 guides
effectively assume 2.6 as a floor. I have the luxury of treating 2.6 as
a floor and tend to forget sometimes the hassles of using anything that
doesn't have the multiprocessing package as standard (*).

Sorry for slipping in a googly (**).

(*) Insert long spiel about multicore, processor bound computing, use of
Python, the GIL, process-orientation, and CSP.

(**) Cricket metaphor nothing to do with anything else that might have
come to mind.
--
Russel.
Martin Pool
2010-06-23 06:11:14 UTC
Post by Russel Winder
Robert,
The message Russel is responding to here is Robert's proposal that we
make certain changes (shown by the arrows) to our coding style so that
we can have a single codebase that works on 2.4 .. 3.1.
Post by Russel Winder
[ . . . ]
Post by Robert Collins
print foo, bar -> print("%s %s" % (foo, bar))
[ . . . ]
print foo , bar  ->  print ( foo , bar )
Those two statements do different things in 2.x so no, the second is
not ok as a coding guideline for a single codebase.
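
A short sketch of why the two forms differ, and why formatting first is the safe single-codebase guideline (the variable names are just for illustration):

```python
foo, bar = "spam", 42

# Format first, then hand print exactly one parenthesised expression.
# "print (x)" as a Python 2 statement and "print(x)" as a Python 3
# function call then write the same text.
line = "%s %s" % (foo, bar)
print(line)

# By contrast, print(foo, bar) means different things:
#   Python 2: prints the tuple  ('spam', 42)
#   Python 3: prints            spam 42
```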
Post by Russel Winder
print ( "%s %s" % ( foo , bar ) ) -> print ( "{} {}".format ( foo ,
bar ) )
There's no .format in python2.4.
Post by Russel Winder
Post by Robert Collins
->
    e = sys.exc_info()[1]
Why?
What Robert proposes will work for a single codebase but there is no
"as e" on 2.4.
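
For reference, a minimal sketch of the proposed idiom. The helper name is hypothetical; the point is only that neither "except Foo, e" (invalid on 3) nor "except Foo as e" (invalid before 2.6) appears in the source.

```python
import sys

def caught_error(func):
    """Call func(); return the ZeroDivisionError it raised, or None."""
    try:
        func()
    except ZeroDivisionError:
        # Fetch the active exception without any version-specific syntax.
        e = sys.exc_info()[1]
        return e
    return None

err = caught_error(lambda: 1 / 0)
```

The cost is one extra line per handler that actually needs the exception object; handlers that ignore it need no change at all.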

One of the major points of this thread and previous ones is that we
are not prepared to give up 2.4 support yet. I do realize that it
would be easier to support 3.x if we dropped 2.4 but we're not going
there yet.
--
Martin
Russel Winder
2010-06-23 06:24:55 UTC
Martin,
[ . . . ]
Post by Martin Pool
Post by Russel Winder
print foo , bar -> print ( foo , bar )
Those two statements do different things in 2.x so no, the second is
not ok as a coding guideline for a single codebase.
Indeed. I was thinking of transformation rules rather than common code.
For common code, you might have to do something like:

print ( str ( foo ) + ' ' + str ( bar ) )


[ . . . ]
Post by Martin Pool
One of the major points of this thread and previous ones is that we
are not prepared to give up 2.4 support yet. I do realize that it
would be easier to support 3.x if we dropped 2.4 but we're not going
there yet.
As far as I am aware you have to go to 2.6 and drop all pre-2.6 idioms
to have any chance of 2to3 and 3to2 being useful. With 2.4 and 2.5
factors in the equation, then looking at 3 is for the future --
certainly this is the view in the SCons team.
--
Russel.
Ben Finney
2010-06-23 07:16:10 UTC
Post by Russel Winder
Post by Martin Pool
One of the major points of this thread and previous ones is that we
are not prepared to give up 2.4 support yet. I do realize that it
would be easier to support 3.x if we dropped 2.4 but we're not going
there yet.
As far as I am aware you have to go to 2.6 and drop all pre-2.6 idioms
to have any chance of 2to3 and 3to2 being useful. With 2.4 and 2.5
factors in the equation, then looking at 3 is for the future --
certainly this is the view in the SCons team.
+1.

I think Python 2.4 ↔ 3.1 is too broad a range to try supporting from a
single code base. This is exacerbated by the fact that Python 2.4 and
2.5 are not receiving upstream support, and Python 2.6 will soon be
receiving very little attention upstream (once Python 2.7 is released).
--
\ “If we don't believe in freedom of expression for people we |
`\ despise, we don't believe in it at all.” —Noam Chomsky, |
_o__) 1992-11-25 |
Ben Finney
Danny van Heumen
2010-06-24 18:19:30 UTC
(Small disclaimer: I'm not that familiar with python, but am interested
to learn and I think this thread is particularly interesting. The
following question came to my mind. Okay, so it might be rubbish - if so
just say it and I'll shut up :) ...)

I'm wondering. If 2.4 <--> 3.1 is too broad to support, and 2.6 is
better suited to convert to 3.x, since it supports some of the 3.x
constructs.

Would it be feasible to write code in a 2.6 compatible style and
automagically convert it to 2.4?

I mean, I think most of the issues I've seen raised are issues like:
'except Foo as e' is not recognized in python 2.4, and 'with x as f' is
not supported.
But, if I understood it correctly, all of these constructs can be
expanded to a form that is understandable by 2.4.

So, suppose that you should avoid (complex) modules that were introduced
in python 2.6, would it be feasible to just convert the basic constructs
to 2.4? (That way you can build to a more 3.x compatible code base.)
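
As an illustration of the kind of expansion being asked about, here is a sketch of a 'with' block next to a hand-expanded form. It is a simplification that assumes a file-like object: a real 'with' calls __enter__ and __exit__, which for files amounts to closing the object on the way out. The function names are made up for this example.

```python
import io

def read_all_with(stream):
    # Needs Python 2.6+ (or a __future__ import on 2.5); a syntax error on 2.4.
    with stream as f:
        return f.read()

def read_all_expanded(stream):
    # Hand-expanded equivalent for file-like objects; plain try/finally
    # is valid on Python 2.4.
    f = stream
    try:
        return f.read()
    finally:
        f.close()

# Demo on Python 3 using an in-memory stream.
first = io.StringIO("data")
second = io.StringIO("data")
results = (read_all_with(first), read_all_expanded(second))
```

A mechanical converter would have to produce the second form from the first, which is straightforward for files but harder for arbitrary context managers.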
Post by Russel Winder
Post by Martin Pool
One of the major points of this thread and previous ones is that we
are not prepared to give up 2.4 support yet. I do realize that it
would be easier to support 3.x if we dropped 2.4 but we're not going
there yet.
As far as I am aware you have to go to 2.6 and drop all pre-2.6 idioms
to have any chance of 2to3 and 3to2 being useful. With 2.4 and 2.5
factors in the equation, then looking at 3 is for the future --
certainly this is the view in the SCons team.
+1.
I think Python 2.4 ↔ 3.1 is too broad a range to try supporting from a
single code base. This is exacerbated by the fact that Python 2.4 and
2.5 are not receiving upstream support, and Python 2.6 will soon be
receiving very little attention upstream (once Python 2.7 is released).
John Barstow
2010-06-24 22:11:23 UTC
On Fri, Jun 25, 2010 at 6:19 AM, Danny van Heumen
Post by Danny van Heumen
(Small disclaimer: I'm not that familiar with python, but am interested
to learn and I think this thread is particularly interesting. The
following question came to my mind. Okay, so it might be rubbish - if so
just say it and I'll shut up :) ...)
I'm wondering. If 2.4 <--> 3.1 is too broad to support, and 2.6 is
better suited to convert to 3.x, since it supports some of the 3.x
constructs.
Really what this tends to point up is the need to branch the code when
we decide we want to start supporting 3.x

If I were a theoretical release manager doing hypothetical releases, I
would imagine something like this (using inflated release numbers to
drive home the point this is just an example!):

bzr 4.4 - last release supporting python 2.4/2.5 - you won't be able
to upgrade anymore until you upgrade your python version.
bzr 4.6 - first release to support python 3.x using 2to3. Requires
minimum version of python 2.6
bzr 4.4.1 - point release of legacy branch where someone was paid to
backport a feature from 4.6
bzr 5.10 - last release with python 2.x support - you won't be able to
upgrade anymore until you upgrade your python version.
bzr 5.12 - first release without python 2.x support *at all*...

Honestly, there's no downside to saying that release X is going to
require Python 2.6; people who depend on an older version of python
just don't upgrade their installation of bazaar. Anyone sufficiently
motivated can backport code to the legacy branch (say, to accommodate
a new wire protocol), upgrade python without official distribution
support (you *can* compile python on a RHEL box when you need a newer
version!), or petition for a compatibility mode (please provide a bzr
2.2 server on legacy.launchpad.net, here's a bundle of money...)

Frankly, things like native NTLM support can't be integrated cleanly
while you support Python 2.4 (you need the MD4 support in the newer
hashlib module for the pure python implementation). As it is, last
time I checked Bazaar 1.6 was the only version available in any RHEL5
repository, so I'm not sure waiting for RHEL6 is really going to make
a difference.

So are we reluctant to drop Python 2.4/2.5 support because we have no
guarantee of backwards compatibility in the wire protocol and/or
branch format? I don't really see us dropping 2a support just because
we make a move to Python 2.6. Or would we lose major contributors who
are unable to upgrade their version of Python?
John Arbash Meinel
2010-06-24 22:23:00 UTC
Post by John Barstow
Honestly, there's no downside to saying that release X is going to
require Python 2.6; people who depend on an older version of python
just don't upgrade their installation of bazaar. Anyone sufficiently
motivated can backport code to the legacy branch (say, to accommodate
a new wire protocol), upgrade python without official distribution
support (you *can* compile python on a RHEL box when you need a newer
version!), or petition for a compatibility mode (please provide a bzr
2.2 server on legacy.launchpad.net, here's a bundle of money...)
Frankly, things like native NTLM support can't be integrated cleanly
while you support Python 2.4 (you need the MD4 support in the newer
hashlib module for the pure python implementation). As it is, last
time I checked Bazaar 1.6 was the only version available in any RHEL5
repository, so I'm not sure waiting for RHEL6 is really going to make
a difference.
Toshio maintains an rpm for RHEL of bzr. So while it isn't in the
official repo, it is still trivially available.

I've often felt that "if you can upgrade bzr, you can upgrade python",
but apparently that feeling is not held. (One is considered a user-space
command, the other a system-wide infrastructure.)

My personal feeling is that we can do something like:

1) 2.2 will be python2.4 compatible
2) 2.3 will drop 2.4 (maybe 2.5) support.
3) We continue with our current methodology of supporting old releases
(2.0/2.1/soon to be 2.2) for some time. I don't know whether we'll drop
2.0 as a supported stable series now that 2.2 will be released.
Regardless, though, we'll certainly still support 2.1, and *that* will
stay 2.4 compatible.

The big concern with having a break at 2.3 is how difficult it will be
to do the fixes in the 2.1/2.2 branch, and have them still apply to the
2.3 branch.

At the moment, most code is not very diverged, so it isn't very hard.
I'm guessing that will mostly stay true. Martin has mentioned that he
likes "except Exception as e" syntax, so there will be some effort
there. If we start using a lot of context objects, that would also make
the code a fair bit different.
Post by John Barstow
So are we reluctant to drop Python 2.4/2.5 support because we have no
guarantee of backwards compatibility in the wire protocol and/or
branch format? I don't really see us dropping 2a support just because
we make a move to Python 2.6. Or would we lose major contributors who
are unable to upgrade their version of Python?
I think it is mostly that we are likely to introduce new RPCs where we
find edge cases that can perform better. Those won't be backported to
older releases (because it is not a bugfix/security fix).

That said, we are unlikely (though it has happened) to remove RPCs. In
which case older clients talking to newer servers will work like they
always have. And newer clients talking to older servers will know to
fall back to the older code path.
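
The fallback behaviour can be sketched roughly as follows. Everything here is hypothetical (class names, verb names, the exception type); it is not the bzrlib smart-server API, just the shape of "try the new RPC, fall back if the server does not know it".

```python
class UnknownMethod(Exception):
    """Raised when the server does not recognise an RPC verb."""

class OldServer(object):
    """Stand-in for an older server that only knows the original verb."""
    def call(self, verb):
        if verb == "get_revisions":
            return ["rev-1", "rev-2"]
        raise UnknownMethod(verb)

class NewServer(OldServer):
    """Stand-in for a newer server that adds a faster verb."""
    def call(self, verb):
        if verb == "get_revisions_batched":
            return ["rev-1", "rev-2"]
        return OldServer.call(self, verb)

def fetch_revisions(server):
    # Newer client: prefer the new verb, fall back to the older code path.
    try:
        return server.call("get_revisions_batched")
    except UnknownMethod:
        return server.call("get_revisions")
```

Because NewServer still answers the old verb, an older client that only ever sends "get_revisions" keeps working against it unchanged.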

We've had some stress points there, and haven't always gotten it 100%
correct, but generally I think we've done ok.

John
=:->
David Cournapeau
2010-06-23 14:18:14 UTC
Post by Russel Winder
Post by Martin Pool
One of the major points of this thread and previous ones is that we
are not prepared to give up 2.4 support yet. I do realize that it
would be easier to support 3.x if we dropped 2.4 but we're not going
there yet.
As far as I am aware you have to go to 2.6 and drop all pre-2.6 idioms
to have any chance of 2to3 and 3to2 being useful.  With 2.4 and 2.5
factors in the equation, then looking at 3 is for the future --
certainly this is the view in the SCons team.
+1.
I think Python 2.4 ↔ 3.1 is too broad a range to try supporting from a
single code base. This is exacerbated by the fact that Python 2.4 and
2.5 are not receiving upstream support, and Python 2.6 will soon be
receiving very little attention upstream (once Python 2.7 is released).
I am not claiming any strong relevance to bzr, but in numpy, we intend
to support python 2.4 -> python 3.1 (and we have a lot of C code :) ),
and it seems we will manage to do it.

Yes, it is quite painful - and I would guess not very useful for an
application like bzr. To be honest, I don't see the point in python 3.
The benefits seem ridiculously small compared to the backward
incompatibility pain, and the only reason why I am contributing to the py3
port of Numpy/Scipy is to be a good "python citizen".

David
Alexander Belchenko
2010-06-23 08:18:21 UTC
What is the real benefit of Python 3 over Python 2.5 for Bazaar?
Faster execution speed? Faster import speed?

Will it give bzr more bonus points to increase its adoption?

2.6 is already a headache for creating the Windows standalone installer. I
suppose moving to 3 would only increase the pain.
--
All the dude wanted was his rug back
Robert Collins
2010-06-23 08:25:38 UTC
Post by Alexander Belchenko
What is the real benefit of Python 3 over Python 2.5 for Bazaar?
Faster execution speed? Faster import speed?
Will it give bzr more bonus points to increase its adoption?
2.6 is already a headache for creating the Windows standalone installer. I
suppose moving to 3 would only increase the pain.
Moving to 3 would be a long way off - years+.

3 has a few interesting things for bzr:
Firstly, things like Pynie only run Python 3 - and it specifically claims
to be *much* faster at running code than CPython.
Secondly, if (I'm being generous :)) there are significant issues in
Python 3 that affect us, we stand a much better chance of having a
smooth transition if we file bugs - which requires us to actually
compile on the platform.
Lastly, eventually, Python 2 is going to fade away, and we should be
ready for the early stages of that, which is platforms such as Ubuntu
or RHEL that want to ship only Python 3, or Python 3 as default.

I'm not, nor do I think anyone was, talking about making installers in
Python 3 until it really is better for us. But right now it's a big
unknown, and it would be nice to reduce the risk involved in getting
across it.

-Rob
Toshio Kuratomi
2010-06-23 17:50:54 UTC
Post by Robert Collins
I've done a couple of trivial patches moving in the direction of being
able to play with Python 3 and bzr.
I'd like to document some changes to our coding style, to make
eventual Python 3 migration easier - but they aren't all no-brainers
so I thought I'd raise stuff here and we can discuss it.
octal 0666 ->0o666
This isn't python-2.4 compatible

-Toshio
John Arbash Meinel
2010-06-23 18:04:50 UTC
Post by Toshio Kuratomi
Post by Robert Collins
I've done a couple of trivial patches moving in the direction of being
able to play with Python 3 and bzr.
I'd like to document some changes to our coding style, to make
eventual Python 3 migration easier - but they aren't all no-brainers
so I thought I'd raise stuff here and we can discuss it.
octal 0666 ->0o666
This isn't python-2.4 compatible
-Toshio
Nor is it python2.5 compatible. So even when we decide we can drop 2.4,
I don't think that means we can drop 2.5 until even later.
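
One possible way around the octal literal problem, offered only as a sketch (it was not proposed in the thread, and the constant names are made up): build the value from a string, which parses the same on 2.4 through 3.x.

```python
# 0666 is a syntax error on Python 3; 0o666 is a syntax error on 2.4 and 2.5.
# int() with an explicit base avoids both literal forms.
MODE_RW_ALL = int("666", 8)     # 438 decimal
MODE_RWX_OWNER = int("700", 8)  # 448 decimal
```

The price is a function call at import time per constant and slightly less readable source, so it only makes sense while both literal forms have to be avoided.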

John
=:->
John Arbash Meinel
2010-06-30 20:13:14 UTC
Post by Robert Collins
I've done a couple of trivial patches moving in the direction of being
able to play with Python 3 and bzr.
In another thread Barry made reference to the 'six' package from
Benjamin Peterson, available at lp:python-six.

Something you might want to look at. It is specifically focused on
tools/constants/constructs that are meant to help write source code that
is 2 and 3 compatible.

Documentation here:
http://packages.python.org/six/

Also, I don't really know if he is going for all-of-2 (2.1, 2.2, 2.3) or
just 2.6+3.x compatibility.

John
=:->
John Arbash Meinel
2010-06-30 21:28:39 UTC
...
Post by John Arbash Meinel
http://packages.python.org/six/
Also, I don't really know if he is going for all-of-2 (2.1, 2.2, 2.3) or
just 2.6+3.x compatibility.
I just looked at the code, and it is pretty clear it isn't meant for all
versions of 2.x. It does stuff like:

from . import _util

Which seems to work under 2.5, but doesn't work on python2.4.

Though if we decided we could drop 2.4 support (and not 2.5) it might
work for us.

John
=:->
Max Kanat-Alexander
2010-06-30 21:49:25 UTC
Post by John Arbash Meinel
Though if we decided we could drop 2.4 support (and not 2.5) it might
work for us.
Dropping 2.4 support makes the baby sysadmin messiah cry.

(Or, more clearly--there are a lot of long-term support distros out
there that shipped with 2.4, and on which it is really complex to
install a separate or newer python.)

-Max
--
http://www.everythingsolved.com/
Competent, Friendly Bugzilla and Perl Services. Everything Else, too.
Barry Warsaw
2010-06-30 22:24:42 UTC
Post by Max Kanat-Alexander
Post by John Arbash Meinel
Though if we decided we could drop 2.4 support (and not 2.5) it might
work for us.
Dropping 2.4 support makes the baby sysadmin messiah cry.
(Or, more clearly--there are a lot of long-term support distros out
there that shipped with 2.4, and on which it is really complex to
install a separate or newer python.)
cx_freeze ftw?

-Barry
Martin Pool
2010-06-30 22:55:38 UTC
Post by John Arbash Meinel
Though if we decided we could drop 2.4 support (and not 2.5) it might
work for us.
       Dropping 2.4 support makes the baby sysadmin messiah cry.
This is really not on the cards for the foreseeable future. There are
clear benefits to supporting 2.4, and only minor benefits at present
to supporting 3 now. I think supporting 3 would be good but I
certainly wouldn't give up 2.4 to get there.

Perhaps we can send patches to six to make it 2.4 compatible.
--
Martin
Scott Aubrey
2010-07-01 09:02:49 UTC
Post by Martin Pool
Post by Max Kanat-Alexander
Post by John Arbash Meinel
Though if we decided we could drop 2.4 support (and not 2.5) it might
work for us.
Dropping 2.4 support makes the baby sysadmin messiah cry.
This is really not on the cards for the foreseeable future. There are
clear benefits to supporting 2.4, and only minor benefits at present
to supporting 3 now. I think supporting 3 would be good but I
certainly wouldn't give up 2.4 to get there.
[snip]
--
Martin
Why is that?

Not being a python developer myself, nor highly involved in the bazaar community, I feel I may be overstepping my bounds here, but here goes.

From a Redhat 5 person's perspective: I've got my bug-fix-only, no-new-features version of an OS that's stable, and bazaar has a bug-fix-only version that works with this python version. The newer version of bazaar doesn't work, but that's OK.

Why is that idea of dropping Python 2.4 (and Redhat 5 etc) such a bad thing? Even putting aside python 3 support, supporting newer features and versions of python must have some benefit -- at least it's fewer platforms to make sure new code works on? Combine that with the fully supported, bug-fix-only versions of bazaar available for those platforms, which will be supported for at least [insert timeframe]. Isn't that what these people expect from Redhat, why expect more from bazaar?

- Scott
Stephen J. Turnbull
2010-07-01 09:36:06 UTC
Post by Scott Aubrey
Why is that idea of dropping Python 2.4 (and Redhat 5 etc) such a
bad thing?
The short answer is that some customers want that support. And
(surprise) these are Customers, excuse me, *Mr.* Customers *Sirs* with
a capital C (C for "cash in hand"). (That's a joke; I don't know what
the real reason is, that's a plausible guess, though.)
Post by Scott Aubrey
Isn't that what these people expect from Redhat, why expect more
from bazaar?
Because they are different cases. Redhat is expected to leave you
with a stable, bugfix only version because their job is to provide a
platform on which *everything* works. This allows you (FSVO "you")
to choose (a) to upgrade nothing and have a system that becomes
monotonically more stable over time, with exactly the same features,
or (b) to upgrade only locally-developed software, which presumably
you understand pretty well, and you can restrict bug hunts to that
well-understood software, or (c) to upgrade a select few mission-
critical applications, and restrict bug hunts to that relatively well-
understood mission-critical set. It's your choice, Red Hat enables it
by providing that rock-solid platform.

Bazaar is an application, and while many people are perfectly happy
with Bazaar 0.9 still, others consider Bazaar a mission-critical
support tool, they greatly appreciate the new features and performance
improvements, and they want to upgrade to a more modern version once it
becomes stable. However, they may also demand a rock-solid platform
in other respects. It's perfectly reasonable for them to *want* to
run Bazaar 2.2 on Red Hat 5. The question for Bazaar devs is "should
they satisfy that desire?"

Martin's answer is clearly "yes", and you know what, I know he knows a
lot more about it than me, so there you go.
John Arbash Meinel
2010-07-01 13:59:13 UTC
Post by Martin Pool
Post by Max Kanat-Alexander
Post by John Arbash Meinel
Though if we decided we could drop 2.4 support (and not 2.5) it might
work for us.
Dropping 2.4 support makes the baby sysadmin messiah cry.
This is really not on the cards for the foreseeable future. There are
clear benefits to supporting 2.4, and only minor benefits at present
to supporting 3 now. I think supporting 3 would be good but I
certainly wouldn't give up 2.4 to get there.
Perhaps we can send patches to six to make it 2.4 compatible.
Certainly I don't see us doing it for any of bzr 2.0, 2.1 or 2.2. I was
thinking more along the lines of 2.3 (or 3.0 :). As you say, supporting
3.x is lower on the list right now than supporting python 2.4. However,
I think this will change at some point in the future, and we should be
aware of that, and of tech that might help us transition.

John
=:->
Martin Pool
2010-07-02 00:50:19 UTC
Post by John Arbash Meinel
Though if we decided we could drop 2.4 support (and not 2.5) it might
work for us.
       Dropping 2.4 support makes the baby sysadmin messiah cry.
This is really not on the cards for the foreseeable future.  There are
clear benefits to supporting 2.4, and only minor benefits at present
to supporting 3 now.  I think supporting 3 would be good but I
certainly wouldn't give up 2.4 to get there.
Perhaps we can send patches to six to make it 2.4 compatible.
Certainly I don't see us doing it for any of bzr 2.0, 2.1 or 2.2. I was
thinking more along the lines of 2.3 (or 3.0 :). As you say, supporting
3.x is lower on the list right now than supporting python 2.4. However,
I think this will change at some point in the future, and we should be
aware of that, and of tech that might help us transition.
+1

I'm delighted to take patches that make things work more cleanly with
-3 warnings or through 2to3 or directly on 3, as long as they don't
break 2.4 or make the code excessively ugly. Conversely if someone
feels they want to develop a parallel branch that does work on 3 that
might be an interesting experiment.

Every time the topic of dropping 2.4 support comes up several people
post that they still want it supported. I run into machines running
multi-year-old OS releases fairly often so I sympathize.

The byte vs unicode transition may be hard enough that we want to
start chewing on it.

We're going to branch off 2.2 and bump the trunk to 2.3 in about a
month so it's possible that we could say the 2.2 series will be the
last one supported on 2.4. I feel like the right time to do that is
when somebody says "I really want to do X and it's just not feasible
in 2.4" - whether that is using 2to3 or six or something else. We've
had some "it would be nice" cases like context managers but nothing
really compelling yet.
--
Martin
John Arbash Meinel
2010-07-02 00:59:17 UTC
Permalink
...
Post by Martin Pool
We're going to branch off 2.2 and bump the trunk to 2.3 in about a
month so it's possible that we could say the 2.2 series will be the
last one supported on 2.4. I feel like the right time to do that is
when somebody says "I really want to do X and it's just not feasible
in 2.4" - whether that is using 2to3 or six or something else. We've
had some "it would be nice" cases like context managers but nothing
really compelling yet.
I think most of the 2.4 vs newer comes down to code clarity sorts of
things. from . import foo, context managers, except X as e, defaultdict,
etc. I haven't seen anything that is 'must have', but quite a few things
that 'would be nice'.
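
To make the "would be nice" point concrete, here is a small sketch pairing three of those features with a spelling that still works on 2.4. All function names are made up for illustration. The `_new` versions need Python 2.5 or 2.6 and later; the `_24` versions use only what 2.4 has, and the `sys.exc_info()` form is the one Robert proposed at the top of the thread.

```python
import sys
import threading
from collections import defaultdict  # added in Python 2.5

_lock = threading.Lock()


def group_new(pairs):
    # defaultdict creates the missing list on first access.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return dict(groups)


def group_24(pairs):
    # 2.4 spelling: setdefault does the same job, less readably.
    groups = {}
    for key, value in pairs:
        groups.setdefault(key, []).append(value)
    return groups


def bump_new(counter):
    # Context manager: the 'with' statement is on by default from 2.6.
    with _lock:
        counter['n'] += 1


def bump_24(counter):
    # 2.4 spelling: explicit acquire plus try/finally.
    _lock.acquire()
    try:
        counter['n'] += 1
    finally:
        _lock.release()


def parse_new(text):
    try:
        return int(text)
    except ValueError as e:  # 'as' form needs 2.6 or later
        return str(e)


def parse_24(text):
    try:
        return int(text)
    except ValueError:
        # Works on 2.4 and on 3: fetch the exception explicitly.
        e = sys.exc_info()[1]
        return str(e)
```

Each pair behaves the same. The difference is only in how much the reader has to hold in their head, which matches the description of these as clarity wins and not missing capabilities.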

As I understand it, that is the good/bad of python 3 as well. Apparently
it is much cleaner as a language 'from the beginning', but there really
isn't anything you can do that you couldn't before. So everyone that has
worked out python-2 quirks doesn't benefit very much.

Anyway, I see your point. From my pov, there is always the
chicken-and-the-egg stuff. I would probably look more closely at making
sure we work with python-2.7 today, and once that is out, we'd probably
want to be close to ready for python3 around 1-year after python-2.7
is released (it will be supported for a long time, but after about a
year I would guess missing features would start to be noticeable.)

John
=:->
Stephen J. Turnbull
2010-07-02 03:14:41 UTC
Permalink
Post by John Arbash Meinel
Anyway, I see your point. From my pov, there is always the
chicken-and-the-egg stuff. I would probably look more closely at making
sure we work with python-2.7 today, and once that is out, we'd probably
want to be close to ready for python3 around 1-year after python-2.7
is released (it will be supported for a long time, but after about a
year I would guess missing features would start to be noticeable.)
I would not bet on that. Currently Python 3 (the language) is in a
long-term feature freeze (for 3.2 at least; possibly 3.3 as well), as
it is in catch-up mode for the stdlib even today (email needs a
complete rewrite because of the peculiar nature of email as a
bytes-oriented protocol which is nominally ASCII text but in fact is
only slightly less random than an MS-Word core dump, and there are
several other modules which need a fair amount of work). And large
3rd party frameworks like numpy and Twisted are not there yet (real
soon now, really, for numpy; Real Soon Now (uh, right, yup, not!) for
Twisted).

Python 3 is very attractive for new development because it's cleaner
and because it has a usable approach to Unicode. (All my own stuff
got converted a long time ago, and I've never missed Python 2 at all.)
Once people get their minds wrapped around it, I think the mail and
web-oriented stdlib modules are going to see quantum improvements. If
you have other interests in Python 3, sure, you can get started on
porting Bazaar to it now, I think, which wasn't true at all for Python
3.0, and was only marginally true for 3.1. But for general
programming and ongoing development of applications with a lot of
history (in which I would include Bazaar), there really is no hurry
AFAICS.
Russel Winder
2010-07-02 07:15:17 UTC
Permalink
Martin,
Post by Martin Pool
I'm delighted to take patches that make things work more cleanly with
-3 warnings or through 2to3 or directly on 3, as long as they don't
break 2.4 or make the code excessively ugly. Conversely if someone
feels they want to develop a parallel branch that does work on 3 that
might be an interesting experiment.
From the unmetricated stories from the field section, i.e. anecdotal
evidence to be treated as such.

SCons, like Bazaar, is sticking with 2.4 as a floor for the short- to
medium- term. So there is a consistency of approach amongst Python-based
frameworks that worry about how far back to go.

Waf 1.5.x requires Python 2.4 -> 2.6 to create the project specific
build code, but that generated code runs with any of Pythons 2.3 -> 3.1.
Thomas Nagy has written a document about plans for Waf 1.6.x and from
that document:

Python 3 is now the default syntax. Waf now runs unmodified for
2.6, 2.7, 3.0 and 3.1. Upon execution, a detection is performed
and the code is then converted for python 2.3, 2.4 and 2.5 if
necessary. It is far easier to modify Python3 code to run on
Python2 than porting Python 2 code to Python 3.

Which seems to show there is a consistency of approach to the Python 2 /
Python 3 issue across the board.
--
Russel.
=============================================================================
Dr Russel Winder t: +44 20 7585 2200 voip: sip:***@ekiga.net
41 Buckmaster Road m: +44 7770 465 077 xmpp: ***@russel.org.uk
London SW11 1EN, UK w: www.russel.org.uk skype: russel_winder