Discussion:
[RFC] Adding Python as a possible language and its usage
Martin Liška
2018-07-17 12:49:03 UTC
Hi.

I've recently touched the AWK option-generation machinery and it's quite unpleasant
to make any adjustments. My question is simple: can we start using a scripting
language like Python and replace usage of the AWK scripts? It's probably a question
for the Steering Committee, but I would like to see feedback from the community.

Here are some reasons why I would like to replace the current AWK scripts:

1) gcc/optc-save-gen.awk is full of copy&pasted code; due to the lack of flag type classes,
multiple global variables are created (var_opt_char, var_opt_string, ...)

2) the same happens in gcc/opth-gen.awk

3) we do a great many regex matches (mainly in gcc/opt-functions.awk); I believe
we should come up with a structured option format that will make parsing and
processing much simpler.

4) we can come up with new sanity checks of options:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81397

5) there are various targets that generate *.opt files, one example is ARM:
gcc/config/arm/parsecpu.awk

which transforms:
./gcc/config/arm/arm-cpus.in

I guess having a well-defined structured format for *.opt files will make
it easier to write generated opt files?
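
To illustrate point 1, here is a minimal Python sketch of how a single typed record could replace per-type globals such as var_opt_char and var_opt_string. The OptionVar class, its field names, and the example variable names are hypothetical; they are not taken from the attached prototype or from the existing AWK scripts.

```python
from collections import defaultdict


class OptionVar:
    """Hypothetical record for one option variable and its C type."""

    def __init__(self, name, c_type):
        self.name = name
        self.c_type = c_type


def group_by_type(variables):
    """Group variable names by C type, preserving input order."""
    groups = defaultdict(list)
    for var in variables:
        groups[var.c_type].append(var.name)
    return dict(groups)


# Made-up variables, for illustration only.
variables = [
    OptionVar("flag_example_a", "char"),
    OptionVar("str_example_b", "const char *"),
    OptionVar("flag_example_c", "char"),
]
grouped = group_by_type(variables)
```

With one dictionary keyed by type, adding a new variable type would not need a new global or another copy of the same loop.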

I'm attaching a prototype that can transform optionlist into options-save.c
that can be compiled and works.

I'm looking forward to your feedback.
Martin
Basile Starynkevitch
2018-07-17 17:13:00 UTC
Hello All,
Post by Martin Liška
I've recently touched AWK option generate machinery and it's quite unpleasant
to make any adjustments. My question is simple: can we starting using a scripting
language like Python and replace usage of the AWK scripts? It's probably question
for Steering committee, but I would like to see feedback from community.
I would suggest also (and perhaps instead) considering using GNU Guile
https://www.gnu.org/software/guile/

(personally, I prefer Guile to Python, but that is just my preference)

Guile is the preferred GNU scripting language (for example, Guile
is a GNU project, but AFAIK Python is not).

BTW, I dislike Python syntax (my personal taste is an allergy to
significant spaces, but I admit it is just a matter of taste and I could
contribute some Python code in the future if it becomes needed). Also, I
am noticing that these days the Python project might have some
governance issues (see e.g. https://lwn.net/Articles/759654/ in case you
have not heard about it).


However, the idea of depending more deeply on a good scripting language
in GCC is very pleasant.


Regards
--
Basile STARYNKEVITCH == http://starynkevitch.net/Basile
opinions are mine only - les opinions sont seulement miennes
Bourg La Reine, France
David Malcolm
2018-07-17 23:52:08 UTC
Post by Basile Starynkevitch
Hello All,
In https://gcc.gnu.org/ml/gcc/2018-07/msg00233.html Martin Liška
Post by Martin Liška
I've recently touched AWK option generate machinery and it's quite unpleasant
to make any adjustments. My question is simple: can we starting using a scripting
language like Python and replace usage of the AWK scripts? It's probably question
for Steering committee, but I would like to see feedback from community.
I would suggest also (and perhaps instead) considering using GNU Guile
https://www.gnu.org/software/guile/
(personally, I prefer Guile to Python, but that is just my
preference)
Since Guile is the preferred GNU scripting language (for example Guile
is a GNU project, but AFAIK Python is not).
BTW, I dislike Python syntax (my personal taste is an allergy to
significant spaces, but I admit it is just a matter of taste and I could
contribute some Python code in the future if it becomes needed). Also, I
am noticing that these days the Python project might have some
governance issues (see e.g. https://lwn.net/Articles/759654/ in case you
did not heard about it).
[disclosure: I'm a CPython core developer, albeit a rather dormant one]

"Governance issues" seems a little strong to me: yes, Guido is stepping
down as BDFL, but will still participate, and CPython is one of the
best-run FLOSS projects I've had the pleasure of participating in. I'm
sure that the project will continue to be well-run.
Post by Basile Starynkevitch
However, the idea of depending more deeply on a good scripting
language
in GCC is very pleasant.
Indeed. I'm a fan of Python in this regard, as you might have guessed
:)

Dave
David Niklas
2018-07-17 20:37:08 UTC
Post by Martin Liška
Hi.
I've recently touched AWK option generate machinery and it's quite
unpleasant to make any adjustments. My question is simple: can we
starting using a scripting language like Python and replace usage of
the AWK scripts? It's probably question for Steering committee, but I
would like to see feedback from community.
There are some bulletins why I would like to replace current AWK
1) gcc/optc-save-gen.awk is full of copy&pasted code, due to lack of
flags type classes multiple global variables are created (var_opt_char,
var_opt_string, ...)
2) similar happens in gcc/opth-gen.awk
3) we do very many regex matches (mainly in gcc/opt-functions.awk), I
believe we should come up with a structured option format that will
make parsing and processing much simpler.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81397
5) there are various targets that generate *.opt files, one example is
ARM: gcc/config/arm/parsecpu.awk
./gcc/config/arm/arm-cpus.in
I guess having a well-defined structured format for *.opt files will
make it easier to write generated opt files?
I'm attaching a prototype that can transform optionlist into
options-save.c that can be compiled and works.
I'm looking forward to a feedback.
Martin
<snip>

I was reading phoronix and came upon an article about this email.

As a FLOSS dev and someone who is familiar with both languages in
question, I'd like to point out that python is an unstable language. It
has matured and changed a lot over the years. Tools like Python's
2to3 have gained an infamous reputation.
OTOH, awk is very stable. I have been on the GNU variant's ML for some
time and I have noticed that when a question over implementation arises
they go looking at and, when necessary, consulting what the other awks are
doing. For Python there is only one implementation, thus only one way of
thinking about how it works unless you want to change something in the
core language.
Gentoo's portage is an excellent example of a good language gone bad
through less than ideal programming in python and it seems to me that,
based on the description above, the awk code in gcc needs a code base
cleanup and decrustification, not rewritten in the latest and greatest
language simply because it is *the fad* of the day. And yes, by spelling
python out as *the* language of choice without any other options Mr.
Martin is recommending to us what to choose without any reason whatsoever
given.
Why not ruby? Or Crystal? Or Mozart? Or *gasp* Fortran? Or Rust, (it's
also all the rage)? Or tex? Or SQL (that would at least be interesting to
read :) ?
A fast development cycle is the typical cry of python enthusiasts (and my
foolish self at one point in time), but there are plenty of other fast
development languages out there.
In my not so humble opinion, this ought to be approached with some degree
of wisdom and intelligence as opposed to a zest for something new for
newness's sake.

Sincerely,
David

PS: No, I am not volunteering myself.
David Malcolm
2018-07-18 00:23:36 UTC
Post by David Niklas
Post by Martin Liška
Hi.
I've recently touched AWK option generate machinery and it's quite
unpleasant to make any adjustments. My question is simple: can we
starting using a scripting language like Python and replace usage of
the AWK scripts? It's probably question for Steering committee, but I
would like to see feedback from community.
There are some bulletins why I would like to replace current AWK
1) gcc/optc-save-gen.awk is full of copy&pasted code, due to lack of
flags type classes multiple global variables are created
(var_opt_char,
var_opt_string, ...)
2) similar happens in gcc/opth-gen.awk
3) we do very many regex matches (mainly in gcc/opt-functions.awk), I
believe we should come up with a structured option format that will
make parsing and processing much simpler.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81397
5) there are various targets that generate *.opt files, one example is
ARM: gcc/config/arm/parsecpu.awk
./gcc/config/arm/arm-cpus.in
I guess having a well-defined structured format for *.opt files will
make it easier to write generated opt files?
I'm attaching a prototype that can transform optionlist into
options-save.c that can be compiled and works.
I'm looking forward to a feedback.
Martin
<snip>
I was reading phoronix and came upon an article about this email.
[disclosure: I'm a CPython core developer, albeit a rather dormant one,
and have made contributions to PyPy]
Post by David Niklas
As a FLOSS dev and someone who is familiar with both languages in
question, I'd like to point out that python is an unstable language. It
has matured and changed a lot over the years.
Depends on your meaning of "unstable". The changes are, IMHO,
extremely well-documented e.g.:

https://docs.python.org/3/whatsnew/3.7.html

and the documentation tells you precisely in which version each feature
became available; see e.g.:
https://docs.python.org/3/library/re.html#re.subn
for examples of this.
Post by David Niklas
The tools like python's
2to3 tool have gained an infamous reputation.
OTOH, awk is very stable. I have been on the GNU variant's ML for some
time and I have noticed that when a question over implementation arises
they go looking at and, when necessary, consulting what the other awks are
doing. For Python there is only one implementation, thus only one way of
thinking about how it works unless you want to change something in the
core language.
There are multiple implementations of Python.

CPython is the original one, but of the actively-developed
implementations there's also PyPy and IronPython, along with Jython,
and others. And yes, people talk to each other.
Post by David Niklas
Gentoo's portage is an excellent example of a good language gone bad
through less than ideal programming in python and it seems to me that,
based on the description above, the awk code in gcc needs a code base
cleanup and decrustification, not rewritten in the latest and
greatest
language simply because it is *the fad* of the day.
I get the impression you've had a bad experience with Python in the
past, and that this is why you sent this email.

If it's "the fad of the day", then according to:
https://www.tiobe.com/tiobe-index/python/
it's been the fad of the year in 2007 and 2010, and is currently the #4
programming language. Maybe there's some inherent quality underlying
that long-term popularity that makes it more than, say just a "fad".

Using a popular programming language will make it easier for GCC to get
new contributors.

Post by David Niklas
And yes, by spelling
python out as *the* language of choice without any other options Mr.
Martin is recommending to us what to choose without any reason
whatsoever
given.
Martin is offering to do the work (and, in fact, already has prototyped
it), and that counts for a lot in my book.
Post by David Niklas
Why not ruby? Or Crystal? Or Mozart? Or *gasp* Fortran? Or Rust, (it's
also all the rage)? Or tex? Or SQL (that would at least be
interesting to
read :) ?
Because I never want to maintain another non-trivial awk script if I
can help it, and the thought of being able to do more stuff in Python
makes me happy.

Oh, and Python is more likely to be available on the developer's
machine or build box than at least half of the languages you mention.

Admittedly there's the Python 2 vs Python 3 issue, but Python 2.6
onwards is broadly compatible with Python 3.*, and there's a well-known
common subset that works in both languages. Python 2.6 is almost 10
years old at this point.
Post by David Niklas
A fast development cycle is the typical cry of python enthusiasts (and my
foolish self at one point in time), but there are plenty of other fast
development languages out there.
And Python is superior to them all, in my opinion. For example, Python
makes it easy to embed unit tests in the support scripts. Also, the
Python standard library is "batteries included".
Post by David Niklas
In my not so humble opinion, this aught to be approached with some degree
of wisdom and intelligence as opposed to a zest for something new for
newnesses sake.
Python is older than Java, and is almost as old as GCC itself.
Post by David Niklas
Sincerely,
David
PS: No, I am not volunteering myself.
Quite.

Dave
Paul Koning
2018-07-18 00:37:55 UTC
Post by Martin Liška
Hi.
I've recently touched AWK option generate machinery and it's quite
unpleasant to make any adjustments. My question is simple: can we
starting using a scripting language like Python and replace usage of
the AWK scripts? It's probably question for Steering committee, but I
would like to see feedback from community....
I'm looking forward to a feedback.
Martin
David gave a number of good arguments. I support Martin's proposal, both as to replacing AWK and specifically the choice of Python for that purpose.

Python fits the bill very well in my experience. I've used it to write several large programs, including such non-obvious ones as two network protocol stack implementations.

In roughly 40 years, and roughly 40 programming languages, I've only twice encountered a language where I could go from knowing nothing at all to writing a substantial real world program in just one week: Pascal (in college) and Python (about 15 years ago). This is why Python became my language of choice whenever I don't need the speed or small memory footprint of C/C++.

paul
d***@mail.com
2018-07-18 16:41:26 UTC
On Tue, 17 Jul 2018 20:23:36 -0400
Post by David Malcolm
Post by David Niklas
Post by Martin Liška
Hi.
I've recently touched AWK option generate machinery and it's quite
unpleasant to make any adjustments. My question is simple: can we
starting using a scripting language like Python and replace usage of
the AWK scripts? It's probably question for Steering committee, but I
would like to see feedback from community.
<snip>
Post by David Malcolm
[disclosure: I'm a CPython core developer, albeit a rather dormant one,
and have made contributions to PyPy]
Very good.
Post by David Malcolm
Post by David Niklas
As a FLOSS dev and someone who is familiar with both languages in
question, I'd like to point out that python is an unstable
language.
It
has matured and changed a lot over the years.
Depends on your meaning of "unstable". The changes are, IMHO,
https://docs.python.org/3/whatsnew/3.7.html
and the documentation tells you precisely in which version each feature
https://docs.python.org/3/library/re.html#re.subn
for examples of this.
And that is what I mean. It changes. I have compiled C code from 20 years
ago and it works as expected. Many Python packages are still awaiting
migration from 2 to 3 and 3.x series does change things.
My argument is based on the fact that maintaining python code requires
much more work than some other langs.
Post by David Malcolm
Post by David Niklas
The tools like python's
2to3 tool have gained an infamous reputation.
OTOH, awk is very stable. I have been on the GNU variant's ML for some
time and I have noticed that when a question over implementation arises
they go looking at and, when necessary, consulting what the other awks are
doing. For Python there is only one implementation, thus only one way of
thinking about how it works unless you want to change something in the
core language.
There are multiple implementations of Python.
CPython is the original one, but of the actively-developed
implementations there's also PyPy and IronPython, along with Jython,
and others. And yes, people talk to each other.
If memory serves, ~1 year ago PyPy was not recommended by the Gentoo devs
for a python implementation because it was considered unstable. Jython
is integrating python with java so I did not consider it a "pure" python
implementation. I did not know that CPython was the original. I seem to
remember that it was intended to convert python to C and was not yet
complete. I can't comment about the IronPython, but it is good to know
that crosstalk does occur.
I use python3 when I need python.
Post by David Malcolm
Post by David Niklas
Gentoo's portage is an excellent example of a good language gone bad
through less than ideal programming in python and it seems to me that,
based on the description above, the awk code in gcc needs a code base
cleanup and decrustification, not rewritten in the latest and greatest
language simply because it is *the fad* of the day.
I get the impression you've had a bad experience with Python in the
past, and that this is why you sent this email.
Not really... For the curious, my story is this: I wanted to learn to
program and C was the dreaded language of the day. Ruby and Python3 were
recommended. I tried to learn first ruby and then python with little
success. I decided to try the hardest language I could find, since
2 years in, the "easy" ones were not working out. I learned C in no time,
even a perfect understanding of pointers came to me in 6 months time and I
realized that OO and my brain did not like each other. I can program in
python and other OO langs, but I am always running into 2 vs. 3 problems
and each version seems to add something that I know other users might not
have the correct version of python to support or breaks something that
may or may not require changing one's program. Awk (my 4th lang) is a
scripting language that I am also quite good at. I learned it because I
needed to develop simple things faster.
Post by David Malcolm
https://www.tiobe.com/tiobe-index/python/
it's been the fad of the year in 2007 and 2010, and is current the #4
programming language. Maybe there's some inherent quality underlying
that long-term popularity that makes it more than, say just a "fad".
Not to argue your point, but I have sadly witnessed language after
language being promoted by employers and educators, such that I fear the
number of devs interested in a particular language is often
skewed, rather than developers developing their interests organically.
Post by David Malcolm
Using a popular programming language will make it easier for GCC to get
new contributors.
Until it becomes less popular...
And gcc is for compiling C code, so we need more C devs than any other
language :)
Post by David Malcolm
Post by David Niklas
And yes, by spelling
python out as *the* language of choice without any other options Mr.
Martin is recommending to us what to choose without any reason whatsoever
given.
Martin is offering to do the work (and, in fact, already has prototyped
it), and that counts for a lot in my book.
Forgive the misinterpretation of your email on my part, it looked like
Martin was trying to prototype and then ask everyone else to do most of
the work for him.
Post by David Malcolm
Post by David Niklas
Why not ruby? Or Crystal? Or Mozart? Or *gasp* Fortran? Or Rust, (it's
also all the rage)? Or tex? Or SQL (that would at least be
interesting to
read :) ?
Because I never want to maintain another non-trivial awk script if I
can help it, and the thought of being able to do more stuff in Python
makes me happy.
Good enough.
Post by David Malcolm
Oh, and Python is more likely to be available on the developer's
machine or build box than at least half of the languages you mention.
Probably.
Post by David Malcolm
Admittedly there's the Python 2 vs Python 3 issue, but Python 2.6
onwards is broadly compatible with Python 3.*, and there's a well-known
common subset that works in both languages. Python 2.6 is almost 10
years old at this point.
Well known? Wish I knew. And I did read all the standard library and
included docs cover to cover, plus a bunch of internet tutorials too...
Post by David Malcolm
Post by David Niklas
A fast development cycle is the typical cry of python enthusiasts (and my
foolish self at one point in time), but there are plenty of other fast
development languages out there.
And Python is superior to them all, in my opinion. For example, Python
makes it easy to embed unit tests in the support scripts.
Yes, including unit tests is a big advantage to any program, if they ever
get written :)
Post by David Malcolm
Also, the
Python standard library is "batteries included".
...
Post by David Malcolm
Post by David Niklas
In my not so humble opinion, this aught to be approached with some degree
of wisdom and intelligence as opposed to a zest for something new for
newnesses sake.
Python is older than Java, and is almost as old as GCC itself.
<snip>

I make no objection. Your arguments are sound enough. Just bear in mind
that I worry that you will end up envying Sisyphus, or breaking things on
older platforms.

See for example, this bug I submitted:
https://bugs.gentoo.org/show_bug.cgi?id=634712
It remains unsolved. Furthermore, it was introduced in a recent version and
persists in the latest version of bind. Bind is not a trivial piece of
SW. Nor is it a small or infrequently used one. More than 50% of bugs I
find are in python packages. Yes, I did count at one point in time.
Yes, the fixes for these packages normally evade me.

Sincerely,
David
David Malcolm
2018-07-18 01:01:21 UTC
Post by Martin Liška
Hi.
I've recently touched AWK option generate machinery and it's quite unpleasant
to make any adjustments. My question is simple: can we starting using a scripting
language like Python and replace usage of the AWK scripts? It's probably question
for Steering committee, but I would like to see feedback from
community.
As you know, I'm a fan of Python. As I noted elsewhere in this thread,
one issue is Python 2 vs Python 3 (and minimum versions). Within
Python 2.*, Python 2.6 onwards is broadly compatible with Python 3.*,
and there's a well-known common subset that works in both languages.

Would this complicate bootstrap? (I don't think so, in
that it would appear to be just an external build-time dependency on
the build machine).

Would this make it harder for people to build GCC? It's one more
dependency, but CPython is widely available and relatively easy to
build. (I don't have experience of doing bring-up of a new
architecture, though).
Post by Martin Liška
1) gcc/optc-save-gen.awk is full of copy&pasted code, due to lack of
flags type classes multiple
global variables are created (var_opt_char, var_opt_string, ...)
2) similar happens in gcc/opth-gen.awk
3) we do very many regex matches (mainly in gcc/opt-functions.awk), I believe
we should come up with a structured option format that will make parsing and
processing much simpler.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81397
Having some kind of .opt linting sounds useful.
Post by Martin Liška
gcc/config/arm/parsecpu.awk
./gcc/config/arm/arm-cpus.in
I guess having a well-defined structured format for *.opt files will make
it easier to write generated opt files?
I'm attaching a prototype that can transform optionlist into
options-save.c
that can be compiled and works.
I'm looking forward to a feedback.
Martin
You named it "gcc-options.py", but I think we'll want something that
can be imported from other scripts, and this isn't valid to "import" as
a module, due to the "-". It should have a filename that either uses
an underscore, or no separator.
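
The import problem can be checked mechanically. The sketch below (Python 3) tests whether a file name's stem is a valid, non-keyword identifier; the helper name importable_module_name and the alternative file names are assumptions for illustration only.

```python
import keyword


def importable_module_name(filename):
    """Return True if the file's stem could be used in an `import` statement."""
    stem = filename[:-3] if filename.endswith(".py") else filename
    return stem.isidentifier() and not keyword.iskeyword(stem)


hyphenated_ok = importable_module_name("gcc-options.py")
underscore_ok = importable_module_name("gcc_options.py")
```

Here the hyphenated name is rejected because "-" would parse as a minus sign in `import gcc-options`, while the underscore variant passes.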
Post by Martin Liška
# parse content of optionlist
It's probably worth moving this into a class. Maybe:

class OptionList:
    def __init__(self, lines):
        self.lines = lines
        # etc

or similar.

"optimization_flags" could be a member of that class.
Post by Martin Liška
# start printing
This ought to be in a function, rather than at the top level.

Moving it into a function would allow for some unittest tests:
(a) tests of parsing some lines provided as string literals, to
unit-test the parser.

(b) integration tests of parsing the actual optionlist, maybe.

perhaps via a --unit-test command-line option to trigger
unittest.main().
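
A minimal sketch of that layout, assuming a hypothetical OptionList that merely strips lines and skips blank ones (the real optionlist format is not modelled here), with a --unit-test flag that hands control to unittest.main():

```python
import sys
import unittest


class OptionList:
    """Hypothetical container; it only strips lines and drops blank ones."""

    def __init__(self, lines):
        self.lines = [line.strip() for line in lines if line.strip()]


class OptionListTests(unittest.TestCase):
    def test_blank_lines_are_skipped(self):
        opts = OptionList(["foo\n", "\n", "  bar  \n"])
        self.assertEqual(opts.lines, ["foo", "bar"])


def main(argv):
    if "--unit-test" in argv:
        # Drop our own flag so unittest does not try to interpret it.
        remaining = [arg for arg in argv if arg != "--unit-test"]
        unittest.main(argv=remaining)  # exits the process when done
    else:
        print("normal generation would run here")


if __name__ == "__main__":
    main(sys.argv)
```

Running the script with --unit-test would execute the embedded tests; without it, the normal code path runs, so the tests travel with the script but cost nothing during a build.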


Maybe a way to ensure no semantic changes during the transition would
be to diff the .c/.h files generated by the new scripts against those
generated by the awk scripts, and verify that there are no significant
(non-whitespace) changes, for all supported configs?
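
One possible shape for such a comparison, as a sketch only: collapse whitespace, drop blank lines, and diff what remains with difflib. The function names are hypothetical, and note the caveat that collapsing whitespace would also hide changes inside C string literals.

```python
import difflib


def normalize(text):
    """Collapse runs of whitespace and drop blank lines."""
    lines = []
    for line in text.splitlines():
        collapsed = " ".join(line.split())
        if collapsed:
            lines.append(collapsed)
    return lines


def significant_diff(awk_output, python_output):
    """Return unified-diff lines that remain after whitespace normalization."""
    return list(
        difflib.unified_diff(
            normalize(awk_output),
            normalize(python_output),
            fromfile="awk",
            tofile="python",
            lineterm="",
        )
    )


same = significant_diff("int  x;\n\n", "int x;\n")
different = significant_diff("int x;\n", "int y;\n")
```

An empty result means the two generators agree up to whitespace; any remaining lines point at a real difference to investigate.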

Hope this is constructive.
Dave
Richard Biener
2018-07-18 09:51:36 UTC
Post by Martin Liška
Hi.
I've recently touched AWK option generate machinery and it's quite unpleasant
to make any adjustments. My question is simple: can we starting using a scripting
language like Python and replace usage of the AWK scripts? It's probably question
for Steering committee, but I would like to see feedback from community.
1) gcc/optc-save-gen.awk is full of copy&pasted code, due to lack of flags type classes multiple
global variables are created (var_opt_char, var_opt_string, ...)
2) similar happens in gcc/opth-gen.awk
3) we do very many regex matches (mainly in gcc/opt-functions.awk), I believe
we should come up with a structured option format that will make parsing and
processing much simpler.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81397
gcc/config/arm/parsecpu.awk
./gcc/config/arm/arm-cpus.in
I guess having a well-defined structured format for *.opt files will make
it easier to write generated opt files?
I'm attaching a prototype that can transform optionlist into options-save.c
that can be compiled and works.
I'm looking forward to a feedback.
I guess we either need to document python as build requirement in
install.texi then,
it currently has

@item A POSIX or SVR4 awk

Necessary for creating some of the generated source files for GCC@.
If in doubt, use a recent GNU awk version, as some of the older ones
are broken. GNU awk version 3.1.5 is known to work.

alternatively we could handle the generated files like those we still
need flex for:

@item --enable-generated-files-in-srcdir
Neither the .c and .h files that are generated from Bison and flex nor the
info manuals and man pages that are built from the .texi files are present
in the SVN development tree. When building GCC from that development tree,
or from one of our snapshots, those generated files are placed in your
build directory, which allows for the source to be in a readonly
directory.

If you configure with @option{--enable-generated-files-in-srcdir} then those
generated files will go into the source directory. This is mainly intended
for generating release or prerelease tarballs of the GCC sources, since it
is not a requirement that the users of source releases to have flex, Bison,
or makeinfo.

We already conditionally require Perl for building for some targets so I wonder
if using perl would be better ...

Do we get rid of the AWK build requirement with your changes?

Richard.
Post by Martin Liška
Martin
Richard Earnshaw (lists)
2018-07-18 10:03:50 UTC
Post by Richard Biener
Post by Martin Liška
Hi.
I've recently touched AWK option generate machinery and it's quite unpleasant
to make any adjustments. My question is simple: can we starting using a scripting
language like Python and replace usage of the AWK scripts? It's probably question
for Steering committee, but I would like to see feedback from community.
1) gcc/optc-save-gen.awk is full of copy&pasted code, due to lack of flags type classes multiple
global variables are created (var_opt_char, var_opt_string, ...)
2) similar happens in gcc/opth-gen.awk
3) we do very many regex matches (mainly in gcc/opt-functions.awk), I believe
we should come up with a structured option format that will make parsing and
processing much simpler.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81397
gcc/config/arm/parsecpu.awk
./gcc/config/arm/arm-cpus.in
I guess having a well-defined structured format for *.opt files will make
it easier to write generated opt files?
I'm attaching a prototype that can transform optionlist into options-save.c
that can be compiled and works.
I'm looking forward to a feedback.
I guess we either need to document python as build requirement in
install.texi then,
it currently has
@item A POSIX or SVR4 awk
If in doubt, use a recent GNU awk version, as some of the older ones
are broken. GNU awk version 3.1.5 is known to work.
alternatively we could handle the generated files like those we still
@item --enable-generated-files-in-srcdir
Neither the .c and .h files that are generated from Bison and flex nor the
info manuals and man pages that are built from the .texi files are present
in the SVN development tree. When building GCC from that development tree,
or from one of our snapshots, those generated files are placed in your
build directory, which allows for the source to be in a readonly
directory.
generated files will go into the source directory. This is mainly intended
for generating release or prerelease tarballs of the GCC sources, since it
is not a requirement that the users of source releases to have flex, Bison,
or makeinfo.
We already conditionally require Perl for building for some targets so I wonder
if using perl would be better ...
Do we get rid of the AWK build requirement with your changes?
Nope, the Arm port uses AWK for handling the CPU description tables. I
chose to use that specifically because it was already relied on for
other parts of the build system.

Please don't go down the Perl line, though...

R.
Post by Richard Biener
Richard.
Post by Martin Liška
Martin
David Malcolm
2018-07-18 10:56:31 UTC
Post by Richard Biener
Post by Martin Liška
Hi.
I've recently touched AWK option generate machinery and it's quite unpleasant
to make any adjustments. My question is simple: can we starting using a scripting
language like Python and replace usage of the AWK scripts? It's probably question
for Steering committee, but I would like to see feedback from community.
1) gcc/optc-save-gen.awk is full of copy&pasted code, due to lack
of flags type classes multiple
global variables are created (var_opt_char, var_opt_string, ...)
2) similar happens in gcc/opth-gen.awk
3) we do very many regex matches (mainly in gcc/opt-functions.awk), I believe
we should come up with a structured option format that will make parsing and
processing much simpler.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81397
gcc/config/arm/parsecpu.awk
./gcc/config/arm/arm-cpus.in
I guess having a well-defined structured format for *.opt files will make
it easier to write generated opt files?
I'm attaching a prototype that can transform optionlist into
options-save.c
that can be compiled and works.
I'm looking forward to a feedback.
I guess we either need to document python as build requirement in
install.texi then,
it currently has
@item A POSIX or SVR4 awk
If in doubt, use a recent GNU awk version, as some of the older ones
are broken. GNU awk version 3.1.5 is known to work.
alternatively we could handle the generated files like those we still
If we go down the "document Python as a build requirement" route, we
would need to decide on a minimum version, and what to do about Python
2 vs Python 3. We could restrict ourselves to the common subset of the
two languages, or to require Python 3 (or Python 2, I suppose).

If we want somewhat conservative minimum versions, one strategy might
be to require (Python 2.* (2.6 or later) OR Python 3 (3.3 or later)),
and code to the common subset of 2.6 and 3.3. Implicitly, this would
mean no 3rd-party modules; we'd be sticking to the Python standard
library.

Rationale:

Python 2.6 onwards is broadly compatible with Python 3.* and is about
to be 10 years old. (IIRC it was the system python implementation in
RHEL 6). I'm guessing that many older systems have Python 2 installed,
but not Python 3, and anything we write is likely to be compatible with
even older Python 2.* implementations.

Python 3.3 reintroduced the 'u' prefix for unicode string literals (PEP
414), which makes it much easier to write scripts that work with both
2.* and 3.*. Python 3.3 is almost 6 years old.
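
A minimal sketch of what coding to the common subset of 2.6 and 3.3 might look like. The names `version_supported` and `main`, the message text, and the `options` label are hypothetical and not taken from any existing GCC script; the minimum versions are the ones suggested above.

```python
from __future__ import print_function  # print() as a function on Python 2.6+

import sys

# Hypothetical minimums, taken from the suggestion above.
MIN_PY2 = (2, 6)
MIN_PY3 = (3, 3)


def version_supported(version_info):
    """Return True if (major, minor) falls in the supported set."""
    major_minor = tuple(version_info[:2])
    if major_minor[0] == 2:
        return major_minor >= MIN_PY2
    return major_minor >= MIN_PY3


def main():
    if not version_supported(sys.version_info):
        sys.stderr.write("error: need Python 2.6+ or 3.3+\n")
        return 1
    # u'' literals are accepted by 2.x and again by 3.3+ (PEP 414).
    label = u'options'
    # Explicit field indices work on 2.6; bare '{}' needs 2.7 or later.
    print("generating {0}".format(label))
    return 0
```

A script would typically end with `sys.exit(main())`; only the standard library is used, in line with the "no 3rd-party modules" point.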

(this is just a suggestion)

Dave
Post by Richard Biener
@item --enable-generated-files-in-srcdir
Neither the .c and .h files that are generated from Bison and flex nor the
info manuals and man pages that are built from the .texi files are present
in the SVN development tree. When building GCC from that development tree,
or from one of our snapshots, those generated files are placed in your
build directory, which allows for the source to be in a readonly
directory. If you configure with --enable-generated-files-in-srcdir then those
generated files will go into the source directory. This is mainly intended
for generating release or prerelease tarballs of the GCC sources, since it
is not a requirement that the users of source releases to have flex, Bison,
or makeinfo.
We already conditionally require Perl for building for some targets so I wonder
if using perl would be better ...
Do we get rid of the AWK build requirement with your changes?
Richard.
Post by Martin Liška
Martin
Jakub Jelinek
2018-07-18 11:08:17 UTC
Permalink
Post by David Malcolm
Post by Richard Biener
alternatively we could handle the generated files like those we still
We can't, because unlike the flex output, the option handling is heavily
target specific and storing in the tarball a collection of per-target
specially generated results would be a nightmare.
Post by David Malcolm
Python 2.6 onwards is broadly compatible with Python 3.*. and is about
to be 10 years old. (IIRC it was the system python implementation in
RHEL 6). I'm guessing that many older systems have Python 2 installed,
but not Python 3, and anything we write is likely to be compatible with
even older Python 2.* implementations.
Python 3.3 reintroduced the 'u' prefix for unicode string literals (PEP
414), which makes it much easier to write scripts that work with both
2.* and 3.*. Python 3.3 is almost 6 years old.
(this is just a suggestion)
Then the question is also whether to use python2, python3 or python
binaries. E.g. on some distros python without suffix generates ugly
warnings and that already affects dg-extract-results.sh which just runs
python -c ... rather than first looking for python2 or python3 and only
falling back to python if those don't exist. Some other contrib/ scripts
look only for python3.
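
The lookup order described above (specific names first, plain `python` only as a fallback) could be sketched as follows. This is an illustration only: the function name `find_python` and the candidate order are assumptions, and `shutil.which` needs Python 3.3 or later, so a real build would more likely do this probing in configure or shell.

```python
import shutil

# Candidate names in preference order; the order is an assumption.
CANDIDATES = ("python3", "python2", "python")


def find_python(candidates=CANDIDATES, which=shutil.which):
    """Return the path of the first interpreter found on PATH, or None."""
    for name in candidates:
        path = which(name)
        if path:
            return path
    return None
```

The `which` parameter exists only so the lookup can be exercised without depending on what is installed on the machine.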

Jakub
Eric S. Raymond
2018-07-18 12:06:25 UTC
Permalink
Post by David Malcolm
Python 2.6 onwards is broadly compatible with Python 3.*. and is about
to be 10 years old. (IIRC it was the system python implementation in
RHEL 6).
It is indeed. Without some regular testing with Python 2.6 it could be
easy to introduce code that doesn't actually work on that old version.
I did that recently, see PR 86112.
This isn't an objection to using Python (I like it, and anyway I don't
touch the parts of GCC that you're talking about using it for). Just a
caution that trying to restrict yourself to a portable subset isn't
always easy for casual users of a language (also a problem with C++98
vs C++11 vs C++14 as I'm sure many GCC devs are aware).
It's not very difficult to write "polyglot" Python that is indifferent
to which version it runs under. I had to solve this problem for
reposurgeon; techniques documented here...

http://www.catb.org/esr/faqs/practical-python-porting/
--
<a href="http://www.catb.org/~esr/">Eric S. Raymond</a>

My work is funded by the Internet Civil Engineering Institute: https://icei.org
Please visit their site and donate: the civilization you save might be your own.
Joel Sherrill
2018-07-18 12:49:58 UTC
Permalink
Post by Eric S. Raymond
Post by David Malcolm
Python 2.6 onwards is broadly compatible with Python 3.*. and is about
to be 10 years old. (IIRC it was the system python implementation in
RHEL 6).
It is indeed. Without some regular testing with Python 2.6 it could be
easy to introduce code that doesn't actually work on that old version.
I did that recently, see PR 86112.
This isn't an objection to using Python (I like it, and anyway I don't
touch the parts of GCC that you're talking about using it for). Just a
caution that trying to restrict yourself to a portable subset isn't
always easy for casual users of a language (also a problem with C++98
vs C++11 vs C++14 as I'm sure many GCC devs are aware).
It's not very difficult to write "polyglot" Python that is indifferent
to which version it runs under. I had to solve this problem for
reposurgeon; techniques documented here...
I don't see any mention of avoiding dict comprehensions (not supported
until 2.7, so unusable on RHEL6/CentOS6 and SLES 11).
I maintain it's easy to unwittingly use a feature (such as dict
comprehensions) which works fine on your machine, but aren't supported
by all versions you intend to support. Regular testing with the oldest
version is needed to prevent that (which was the point I was making).
I think the RTEMS Community may be a good precedent here. RTEMS is always
cross compiled and we are as host agnostic as possible. We use as close to
the latest release of GCC, binutils, gdb, and newlib as possible. Our host
side tools are in a combination of Python and C++. We use Sphinx for
documentation.

We are careful to use the Python on RHEL 6 as a baseline. You can build an
RTEMS environment there. But at least one of the Sphinx pieces requires a
Python of at least RHEL 7 vintage.

We have a lot of what I will politely call institutional and large
organization users who have to adhere to strict IT policies. I think RHEL 7
is common but can't swear there is no RHEL 6 out there and because of that,
we set Python 2.x as a minimum.

Yes these are old. And for native new distribution use, it doesn't matter.
But for cross and local upgrades, old distributions matter. Particularly
those targeting enterprise users. And those are glacially slow.

As an aside, it was not being able to build the RTEMS documentation that
pushed me off RHEL 6 as my primary personal environment last year. I wanted
to be using the oldest distribution I thought was in use in our community.

--joel
RTEMS
Matthias Klose
2018-07-18 14:29:42 UTC
Permalink
Post by Joel Sherrill
Post by Eric S. Raymond
Post by David Malcolm
Python 2.6 onwards is broadly compatible with Python 3.*. and is about
to be 10 years old. (IIRC it was the system python implementation in
RHEL 6).
It is indeed. Without some regular testing with Python 2.6 it could be
easy to introduce code that doesn't actually work on that old version.
I did that recently, see PR 86112.
This isn't an objection to using Python (I like it, and anyway I don't
touch the parts of GCC that you're talking about using it for). Just a
caution that trying to restrict yourself to a portable subset isn't
always easy for casual users of a language (also a problem with C++98
vs C++11 vs C++14 as I'm sure many GCC devs are aware).
It's not very difficult to write "polyglot" Python that is indifferent
to which version it runs under. I had to solve this problem for
reposurgeon; techniques documented here...
I don't see any mention of avoiding dict comprehensions (not supported
until 2.7, so unusable on RHEL6/CentOS6 and SLES 11).
I maintain it's easy to unwittingly use a feature (such as dict
comprehensions) which works fine on your machine, but aren't supported
by all versions you intend to support. Regular testing with the oldest
version is needed to prevent that (which was the point I was making).
I think the RTEMS Community may be a good precedence here. RTEMS is always
cross compiled and we are as host agnostic as possible. We use as close to
the latest release of GCC, binutils, gdb, and newlib as possible. Our host
side tools are in a combination of Python and C++. We use Sphinx for
documentation.
We are careful to use the Python on RHEL 6 as a baseline. You can build an
RTEMS environment there. But at least one of the Sphinx pieces requires a
Python of at least RHEL 7 vintage.
We have a lot of what I will politely call institutional and large
organization users who have to adhere to strict IT policies. I think RHEL 7
is common but can't swear there is no RHEL 6 out there and because of that,
we set the Python 2.x as a minimum.
Yes these are old. And for native new distribution use, it doesn't matter.
But for cross and local upgrades, old distributions matter. Particularly
those targeting enterprise users. And those are glacially slow.
As an aside, it was not being able to build the RTEMS documentation that
pushed me off RHEL 6 as my primary personal environment last year. I wanted
to be using the oldest distribution I thought was in use in our community.
Doesn't RHEL 6 have overlays for that very reason, to install a newer Python3?

Please don't start with Python2 anymore. It's discontinued in less than two
years and then you'll have distributions not having Python2 anymore. If you
don't have a recent Python3, then you can probably build it for your platform
yourself.

Python3 is also cross-buildable, and much easier to cross-build than guile or perl.

Matthias
Janne Blomqvist
2018-07-18 14:46:06 UTC
Permalink
Post by Joel Sherrill
Post by Eric S. Raymond
Post by David Malcolm
Python 2.6 onwards is broadly compatible with Python 3.*. and is about
to be 10 years old. (IIRC it was the system python implementation in
RHEL 6).
It is indeed. Without some regular testing with Python 2.6 it could be
easy to introduce code that doesn't actually work on that old version.
I did that recently, see PR 86112.
This isn't an objection to using Python (I like it, and anyway I don't
touch the parts of GCC that you're talking about using it for). Just a
caution that trying to restrict yourself to a portable subset isn't
always easy for casual users of a language (also a problem with C++98
vs C++11 vs C++14 as I'm sure many GCC devs are aware).
It's not very difficult to write "polyglot" Python that is indifferent
to which version it runs under. I had to solve this problem for
reposurgeon; techniques documented here...
I don't see any mention of avoiding dict comprehensions (not supported
until 2.7, so unusable on RHEL6/CentOS6 and SLES 11).
I maintain it's easy to unwittingly use a feature (such as dict
comprehensions) which works fine on your machine, but aren't supported
by all versions you intend to support. Regular testing with the oldest
version is needed to prevent that (which was the point I was making).
I think the RTEMS Community may be a good precedence here. RTEMS is always
cross compiled and we are as host agnostic as possible. We use as close to
the latest release of GCC, binutils, gdb, and newlib as possible. Our host
side tools are in a combination of Python and C++. We use Sphinx for
documentation.
We are careful to use the Python on RHEL 6 as a baseline. You can build an
RTEMS environment there. But at least one of the Sphinx pieces requires a
Python of at least RHEL 7 vintage.
We have a lot of what I will politely call institutional and large
organization users who have to adhere to strict IT policies. I think RHEL 7
is common but can't swear there is no RHEL 6 out there and because of that,
we set the Python 2.x as a minimum.
Yes these are old. And for native new distribution use, it doesn't matter.
But for cross and local upgrades, old distributions matter. Particularly
those targeting enterprise users. And those are glacially slow.
As an aside, it was not being able to build the RTEMS documentation that
pushed me off RHEL 6 as my primary personal environment last year. I wanted
to be using the oldest distribution I thought was in use in our community.
doesn't RHEL 6 has overlays for that very reason to install a newer Python3?
EPEL provides python 3.4 for RHEL6.

(EPEL is a non-official add-on repository, but I suspect the vast majority
who aren't running some single-task server have it enabled)

Don't know if there's something equivalent for SLES.
Please don't start with Python2 anymore. It's discontinued in less than two
years and then you'll have distributions not having Python2 anymore.
+1
--
Janne Blomqvist
Martin Liška
2018-07-20 09:49:05 UTC
Permalink
Post by Matthias Klose
Post by Joel Sherrill
Post by Eric S. Raymond
Post by David Malcolm
Python 2.6 onwards is broadly compatible with Python 3.*. and is about
to be 10 years old. (IIRC it was the system python implementation in
RHEL 6).
It is indeed. Without some regular testing with Python 2.6 it could be
easy to introduce code that doesn't actually work on that old version.
I did that recently, see PR 86112.
This isn't an objection to using Python (I like it, and anyway I don't
touch the parts of GCC that you're talking about using it for). Just a
caution that trying to restrict yourself to a portable subset isn't
always easy for casual users of a language (also a problem with C++98
vs C++11 vs C++14 as I'm sure many GCC devs are aware).
It's not very difficult to write "polyglot" Python that is indifferent
to which version it runs under. I had to solve this problem for
reposurgeon; techniques documented here...
I don't see any mention of avoiding dict comprehensions (not supported
until 2.7, so unusable on RHEL6/CentOS6 and SLES 11).
I maintain it's easy to unwittingly use a feature (such as dict
comprehensions) which works fine on your machine, but aren't supported
by all versions you intend to support. Regular testing with the oldest
version is needed to prevent that (which was the point I was making).
I think the RTEMS Community may be a good precedence here. RTEMS is always
cross compiled and we are as host agnostic as possible. We use as close to
the latest release of GCC, binutils, gdb, and newlib as possible. Our host
side tools are in a combination of Python and C++. We use Sphinx for
documentation.
We are careful to use the Python on RHEL 6 as a baseline. You can build an
RTEMS environment there. But at least one of the Sphinx pieces requires a
Python of at least RHEL 7 vintage.
We have a lot of what I will politely call institutional and large
organization users who have to adhere to strict IT policies. I think RHEL 7
is common but can't swear there is no RHEL 6 out there and because of that,
we set the Python 2.x as a minimum.
Yes these are old. And for native new distribution use, it doesn't matter.
But for cross and local upgrades, old distributions matter. Particularly
those targeting enterprise users. And those are glacially slow.
As an aside, it was not being able to build the RTEMS documentation that
pushed me off RHEL 6 as my primary personal environment last year. I wanted
to be using the oldest distribution I thought was in use in our community.
doesn't RHEL 6 has overlays for that very reason to install a newer Python3?
Please don't start with Python2 anymore. It's discontinued in less than two
years and then you'll have distributions not having Python2 anymore. If you
don't have a recent Python3, then you probably can build it for your platform
itself.
Fully agree with that. Coming up with new scripts written in Python 2 really
makes no sense. Even if we agree on the transition of the option scripts to Python,
I'm planning to do that in the time frame of the GCC 10 release.

Martin
Post by Matthias Klose
Python3 is also cross-buildable, and much easier to cross-build than guile or perl.
Matthias
Segher Boessenkool
2018-07-20 16:37:17 UTC
Permalink
Post by Martin Liška
Fully agree with that. Coming up with a new scripts written in python2 really
makes no sense.
Then python cannot be a build requirement for GCC, since some of our
primary targets do not ship python3.


Segher
Paul Koning
2018-07-20 16:54:36 UTC
Permalink
Post by Segher Boessenkool
Post by Martin Liška
Fully agree with that. Coming up with a new scripts written in python2 really
makes no sense.
Then python cannot be a build requirement for GCC, since some of our
primary targets do not ship python3.
Is it required that GCC must build with only the stock support elements on the primary target platforms? Or is it allowed to require installing prerequisites? Yes, some platforms are so far behind they still don't ship Python 3, but installing it is straightforward.

paul
Segher Boessenkool
2018-07-20 17:11:08 UTC
Permalink
Post by Paul Koning
Post by Segher Boessenkool
Post by Martin Liška
Fully agree with that. Coming up with a new scripts written in python2 really
makes no sense.
Then python cannot be a build requirement for GCC, since some of our
primary targets do not ship python3.
Is it required that GCC must build with only the stock support elements on the primary target platforms?
Not that I know. But why should we make it hugely harder for essentially
no benefit?

All the arguments against awk were arguments against *the current scripts*.

And yes, we can (and perhaps should) rewrite those build scripts as C code,
just like all the other gen* we have.
Post by Paul Koning
Or is it allowed to require installing prerequisites? Yes, some platforms are so far behind they still don't ship Python 3, but installing it is straightforward.
Installing it is not straightforward at all.


Segher
Konovalov, Vadim
2018-07-20 18:53:53 UTC
Permalink
From: Segher Boessenkool
Post by Paul Koning
Post by Segher Boessenkool
Post by Martin Liška
Fully agree with that. Coming up with a new scripts written in python2 really
makes no sense.
Then python cannot be a build requirement for GCC, since some of our
primary targets do not ship python3.
Is it required that GCC must build with only the stock support elements on the primary target platforms?
Not that I know. But why should we make it hugely harder for essentially
no benefit?
All the arguments against awk were arguments against *the current scripts*.
And yes, we can (and perhaps should) rewrite those build scripts as C code,
just like all the other gen* we have.
+1
Post by Paul Koning
Or is it allowed to require installing prerequisites? Yes,
some platforms are so far behind they still don't ship Python 3, but installing
Sometimes those platforms are not behind; they could lack Python for other
reasons. Maybe they are too far ahead and just don't have Python yet?
Post by Paul Koning
it is straightforward.
Installing it is not straightforward at all.
I also agree with this.

Please consider that neither Python 2 nor Python 3 supports being built
on Windows with GCC.

For me, it is a showstopper.
Matthias Klose
2018-07-20 19:06:37 UTC
Permalink
Post by Konovalov, Vadim
From: Segher Boessenkool
Post by Paul Koning
Post by Segher Boessenkool
Post by Martin Liška
Fully agree with that. Coming up with a new scripts written in python2 really
makes no sense.
Then python cannot be a build requirement for GCC, since some of our
primary targets do not ship python3.
Is it required that GCC must build with only the stock support elements on the primary target platforms?
Not that I know. But why should we make it hugely harder for essentially
no benefit?
All the arguments against awk were arguments against *the current scripts*.
And yes, we can (and perhaps should) rewrite those build scripts as C code,
just like all the other gen* we have.
+1
Post by Paul Koning
Or is it allowed to require installing prerequisites? Yes,
some platforms are so far behind they still don't ship Python 3, but installing
Sometimes those are not behind, those could have no python for other reasons -
maybe those are too forward? They just don't have python yet?
Post by Paul Koning
it is straightforward.
Installing it is not straightforward at all.
I also agree with this;
all == "Installing it is not straightforward" ?

I do question this. I mentioned elsewhere what is needed.
Post by Konovalov, Vadim
Please consider that both Python - 2 and 3 - they both do not
support build chain on Windows with GCC
for me, it is a showstopper
This seems to be a different issue. However, I have to say that I'm not booting
Windows on a regular basis. Does "build chain on Windows" mean Cygwin? If so,
there is surely a prebuilt Python available.

Matthias
Konovalov, Vadim
2018-07-20 20:09:30 UTC
Permalink
From: Matthias Klose
To: Konovalov, Vadim; Segher Boessenkool;
Post by Konovalov, Vadim
Sometimes those are not behind, those could have no python for other reasons -
maybe those are too forward? They just don't have python yet?
Post by Segher Boessenkool
Post by Paul Koning
it is straightforward.
Installing it is not straightforward at all.
I also agree with this;
all == "Installing it is not straightforward" ?
I do question this. I mentioned elsewhere what is needed.
What is needed is not always available.
Post by Konovalov, Vadim
Please consider that both Python - 2 and 3 - they both do not
support build chain on Windows with GCC
for me, it is a showstopper
This seems to be a different issue. However I have to say that I'm not booting
Windows on a regular basis. Does build chain on Windows means Cygwin? If yes,
there surely is Python available prebuilt.
Cygwin is a very different platform.
Rebuilding Python on Cygwin is supported, yes, but that is a very
different matter.

But I was talking about Windows, not Cygwin.

Rebuilding Python on Windows (without Cygwin) with GCC is not supported.
I was surprised to discover that, and I will gladly accept and use it
when it eventually supports a GCC+Windows rebuild.

There are some blogs on the Internet about someone who eventually
did a build on Windows with GCC, but
why wasn't this effort propagated into mainline Python?

Most of those blogs are from 2006 or 2008; they are rather obsolete and
cannot be easily reused.

https://wiki.python.org/moin/WindowsCompilers

mentions
GCC - MinGW (x86)
MinGW is an alternative C/C++ compiler that works with all Python versions up to 3.4.

BUT this is misleading: the instructions are unfinished and do not work even where they are supposed to work.
M
Eric S. Raymond
2018-07-18 18:11:35 UTC
Permalink
I don't see any mention of avoiding dict comprehensions (not supported
until 2.7, so unusable on RHEL6/CentOS6 and SLES 11).
That is correct. The HOWTO introduction does say that its techniques
won't guarantee 2.6 compatibility. That would have been a great deal more
difficult - some 3.x syntax backported into 2.7.2 makes a large difference
here.
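
To make the dict comprehension point concrete, here is a small illustration. The variable names come from the option scripts mentioned earlier in the thread; the C types paired with them are made up for the example.

```python
# Illustrative data only; the C types are assumptions.
pairs = [("var_opt_char", "char"), ("var_opt_string", "const char *")]

# Dict comprehension: needs Python 2.7 or 3.x, a SyntaxError on 2.6.
types_new = {name: ctype for name, ctype in pairs}

# Equivalent spelling that also works on 2.6: dict() over a generator.
types_old = dict((name, ctype) for name, ctype in pairs)

assert types_new == types_old
```

Because the first form fails at parse time on 2.6, the whole file is rejected there even if that line is never executed, which is why testing with the oldest interpreter catches it.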

In practice, no deployment of reposurgeon or src or doclifter or any
of the other polyglot Python code I maintain has tripped over this, or
at least I'm not seeing issue reports about it.

Python devteam support for Python 2.6 terminated in 2013.
I maintain it's easy to unwittingly use a feature (such as dict
comprehensions) which works fine on your machine, but aren't supported
by all versions you intend to support. Regular testing with the oldest
version is needed to prevent that (which was the point I was making).
Yes. This is why reposurgeon, doclifter, and cvs-fast-export all have
regression-test suites that exercise all Python code under both 2 and
3, a practice I strongly recommend.

Python 2.7 is scheduled for EOL in 2020. My plan is to retain 2.7 support
in my code until 2022.

I report that my practices are keeping the frequency of Python port
defects I hear about to zero. I understand that GCC may have different
constraints.
--
<a href="http://www.catb.org/~esr/">Eric S. Raymond</a>

My work is funded by the Internet Civil Engineering Institute: https://icei.org
Please visit their site and donate: the civilization you save might be your own.
Joseph Myers
2018-07-23 14:30:49 UTC
Permalink
Post by David Malcolm
Python 3.3 reintroduced the 'u' prefix for unicode string literals (PEP
414), which makes it much easier to write scripts that work with both
2.* and 3.*. Python 3.3 is almost 6 years old.
I can't see u'' as of any relevance to .opt parsing. Both the .opt files,
and the generated output from them, should be pure ASCII, and using native
str throughout (never using Python 2 unicode) should work fine.

(I don't see much value in declaring support for EOL versions of Python,
i.e. anything before 2.7 and 3.4, but if we do, I don't think u'' will be
a feature that controls which versions are supported.)
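
A rough sketch of the "native str throughout" approach, assuming the input really is pure ASCII. The helper names `to_native_str` and `read_ascii_lines` are hypothetical; the point is only that no u'' literal is needed on either major version.

```python
def to_native_str(data):
    """Convert ASCII bytes to the interpreter's native str type."""
    text = data.decode("ascii")  # raises UnicodeDecodeError on non-ASCII input
    if str is bytes:             # Python 2: native str is the byte string
        return data
    return text                  # Python 3: native str is the decoded text


def read_ascii_lines(path):
    """Read a file as bytes and return its lines as native str."""
    with open(path, "rb") as handle:
        return to_native_str(handle.read()).splitlines()
```

Reading in binary mode and decoding explicitly also keeps the result independent of the locale of the build machine.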
--
Joseph S. Myers
***@codesourcery.com
Segher Boessenkool
2018-07-18 21:28:39 UTC
Permalink
Post by Richard Biener
We already conditionally require Perl for building for some targets so I wonder
if using perl would be better ...
At least perl is GPL (Python is not).


What would the advantage of using Python be? I haven't heard any yet.
Awk may be a bit clunky but at least it is easily readable for anyone.


Segher
Florian Weimer
2018-07-19 11:30:19 UTC
Permalink
Post by Segher Boessenkool
What would the advantage of using Python be? I haven't heard any yet.
Awk may be a bit clunky but at least it is easily readable for anyone.
I'm not an experienced awk programmer, but I don't think plain awk
supports arrays of arrays, so there's really no good way to emulate
user-defined data types and structure the code.
Richard Earnshaw (lists)
2018-07-19 16:12:01 UTC
Permalink
Post by Florian Weimer
Post by Segher Boessenkool
What would the advantage of using Python be? I haven't heard any yet.
Awk may be a bit clunky but at least it is easily readable for anyone.
I'm not an experienced awk programmer, but I don't think plain awk
supports arrays of arrays, so there's really no good way to emulate
user-defined data types and structure the code.
You can do multi-dimensional arrays in awk. They're flattened a bit
like tcl ones are, but there are ways of iterating over a dimension.
See, for example, config/arm/parsecpu.awk which gets up to tricks like that.
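
For readers who have not seen the flattening trick, here is a Python sketch of it next to a genuinely nested structure. It mimics how awk stores `arr[a, b]` under one joined key; the separator shown is gawk's default SUBSEP, and the `cpu_*` and `feat_*` names are made up rather than taken from parsecpu.awk.

```python
SUBSEP = "\x1c"  # gawk's default SUBSEP ("\034")


def awk_key(*subscripts):
    """Build the flattened key awk would use for arr[a, b, ...]."""
    return SUBSEP.join(str(s) for s in subscripts)


def second_subscripts(table, first):
    """Iterate over one 'dimension': all second subscripts under `first`."""
    found = []
    for key in table:
        left, right = key.split(SUBSEP)
        if left == first:
            found.append(right)
    return sorted(found)


# awk-style flattened table (names are made up for illustration).
flat = {}
flat[awk_key("cpu_a", "feat_x")] = True
flat[awk_key("cpu_a", "feat_y")] = True
flat[awk_key("cpu_b", "feat_x")] = True

# The same data as arrays of arrays, which POSIX awk lacks.
nested = {"cpu_a": ["feat_x", "feat_y"], "cpu_b": ["feat_x"]}

assert second_subscripts(flat, "cpu_a") == nested["cpu_a"]
```

With the flattened form, every walk over one dimension scans and splits all keys; with the nested form it is a single lookup.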

R.
Michael Clark
2018-07-20 04:42:13 UTC
Permalink
Post by Richard Earnshaw (lists)
Post by Florian Weimer
Post by Segher Boessenkool
What would the advantage of using Python be? I haven't heard any yet.
Awk may be a bit clunky but at least it is easily readable for anyone.
I'm not an experienced awk programmer, but I don't think plain awk
supports arrays of arrays, so there's really no good way to emulate
user-defined data types and structure the code.
You can do multi-dimentional arrays in awk. They're flattened a bit
like tcl ones are, but there are ways of iterating over a dimention.
See, for example, config/arm/parsecpu.awk which gets up to tricks like that.
Has it occurred to anyone to write a lean and fast tool in the host C++ subset that is allowable for use in the host tools? I guess this is currently C++98/GNU++98. This adds no additional dependencies. Sure, it is a slightly higher level of effort than writing an awk script, but with modern C++ this is less the case than it has ever been in the past. I personally use C++11/14 as a substitute for Python-type programs that would normally be considered script language material, mostly due to fluency and the fact that modern C++ has grown more tractable as a replacement for “fast to code in” languages, given it is much faster to code in than C.

LLVM discussed changing the host compiler language feature dependencies to C++11/C++14. There are obvious but not insurmountable bootstrap requirements. i.e. for very old systems it will require an intermediate C++11/C++14 compiler to bootstrap LLVM 6.0. Here is LLVM's new compiler baseline and it seems to require CentOS 7.

- Clang 3.1
- GCC 4.8
- Visual Studio 2015 (Update 3)

[1] https://llvm.org/docs/GettingStarted.html#getting-a-modern-host-c-toolchain

I find I can be very productive and nearly as concise in C++11 as I can in many script languages due to modern versions of <vector>, <map>, <set>, <memory>, <string>, <regex>, auto, lambdas, etc. It’s relatively easy to write memory-clean code from the get-go using std::unique_ptr<> (and sparing amounts of std::shared_ptr<>), and the new C++11 range-based for loops, initializer lists and various other enhancements can make coding in “modern C++” relatively friendly and productive. In the words of Bjarne Stroustrup: it feels like a new language. I can point to examples of small text processing utilities that I’ve written that could be written in Python with a relatively similar amount of effort. The key is fluency with the STL and the use of lean idioms: STL containers and structs (data hiding is only a header tweak away from being broken in a language like C++, and the use of struct is similar to a language like Python, which resorts to underscores or “idiomatic enforcement”). That is, there are lightweight, fast and productive modern C++ idioms that work well with vectors, sets, maps and unique_ptr or shared_ptr automatic memory management. I find that with modern idioms these programs are almost always valgrind clean.

Would modern C++ (for generated files) be considered for GCC 9? The new idioms may make parts of the code considerably more concise and could allow use of some of the new automatic memory management features. The reason I’m suggesting this is that for anything other than a trivial command line invocation of sed or awk, I would tend to write a modern C++ program to do text processing rather than use a script language like Python. Firstly, it is faster. Secondly, I am proficient enough, and the set and map functionality combined with the new automatic memory management is sufficient, that complex nested data structures and text processing can be handled with relative ease. Note: I do tend to avoid iostream and instead use C stdio FILE * and fopen/fread/fwrite/fclose, or POSIX open/read/write/close if I want to avoid buffering. I find iostream performance is not that great.

How conservative are we? Is C++11 going to be available for use in GCC before C++2x in 202x? Indeed, <filesystem> would improve some of the Windows/UNIX interoperability. I’ve found that writing C++11/14 allows me to write in an idiomatic C/C++ subset that is quite stable across platforms. We now even have <cstdint> on Windows. There has been quite a bit of convergence.

Having the constraint that modern C++11/14 can only be used for generated files lessens the burden as the distribution build can maintain the same base compiler dependencies.

Michael.
Jeff Law
2018-07-19 14:47:03 UTC
Permalink
Post by Segher Boessenkool
Post by Richard Biener
We already conditionally require Perl for building for some targets so I wonder
if using perl would be better ...
At least perl is GPL (Python is not).
What would the advantage of using Python be? I haven't heard any yet.
Awk may be a bit clunky but at least it is easily readable for anyone.
I've found python *far* easier to read than awk. And you can actually
run a debugger on your python code to see what it's doing.
Jeff
Eric Gallager
2018-07-19 15:59:20 UTC
Permalink
Post by Jeff Law
Post by Segher Boessenkool
Post by Richard Biener
We already conditionally require Perl for building for some targets so I wonder
if using perl would be better ...
At least perl is GPL (Python is not).
What would the advantage of using Python be? I haven't heard any yet.
Awk may be a bit clunky but at least it is easily readable for anyone.
I've found python *far* easier to read than awk. And you can actually
run a debugger on your python code to see what it's doing.
Jeff
gawk comes with a debugger, too:
https://www.gnu.org/software/gawk/manual/html_node/Debugger.html
Martin Liška
2018-07-20 10:02:17 UTC
Permalink
Post by Jeff Law
Post by Segher Boessenkool
Post by Richard Biener
We already conditionally require Perl for building for some targets so I wonder
if using perl would be better ...
At least perl is GPL (Python is not).
What would the advantage of using Python be? I haven't heard any yet.
Awk may be a bit clunky but at least it is easily readable for anyone.
I've found python *far* easier to read than awk. And you can actually
run a debugger on your python code to see what it's doing.
Jeff
Yes, the main reason for using Python is its object-oriented programming paradigm.
It's handy to encapsulate functionality in methods, and one can unit-test
parts of the script. Currently the AWK scripts are a mix of input/output
transformation and various emissions of printf('#error..') sanity checks.
In general the scripts are not easily readable and contain multiple global arrays
that simulate encapsulation in classes.
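
To illustrate the encapsulation and unit-testing point, here is a minimal sketch in Python 3. The Option class, its fields and the x_ prefix are assumptions made up for this example, not the prototype's actual code. Each record knows how to emit its own struct field, and the sanity check returns a list of problems that a test can inspect instead of printing '#error' lines.

```python
import re

# A plain C identifier: letter or underscore, then letters, digits, underscores.
C_IDENTIFIER_RE = re.compile(r"[A-Za-z_][A-Za-z0-9_]*\Z")


class Option:
    """One command-line option record (hypothetical model for illustration)."""

    def __init__(self, name, var_type, var_name):
        self.name = name          # option name, e.g. "O"
        self.var_type = var_type  # C type of the backing variable
        self.var_name = var_name  # C identifier of the backing variable

    def struct_field(self):
        """Return the C struct field declaration for this option."""
        return "  {} x_{};".format(self.var_type, self.var_name)

    def sanity_errors(self):
        """Return a list of problems found in this record."""
        errors = []
        if not C_IDENTIFIER_RE.match(self.var_name):
            errors.append("invalid variable name: {!r}".format(self.var_name))
        return errors
```

Because emission and checking are separate methods, each can be exercised on a single record without running the whole generator.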

Martin
Boris Kolpackov
2018-07-18 15:13:47 UTC
Post by Martin Liška
My question is simple: can we starting using a scripting language like
Python and replace usage of the AWK scripts?
I wonder what will be the expected way to obtain a suitable version of
Python if one is not available on the build machine? With awk I can
build it from source pretty much anywhere. Is building newer versions
of Python on older targets a similarly straightforward process (somehow
I doubt it)? What about Windows?
Paul Koning
2018-07-18 16:56:02 UTC
Post by Boris Kolpackov
My question is simple: can we starting using a scripting language like
Python and replace usage of the AWK scripts?
I wonder what will be the expected way to obtain a suitable version of
Python if one is not available on the build machine? With awk I can
build it from source pretty much anywhere. Is building newer versions
of Python on older targets a similarly straightforward process (somehow
I doubt it)? What about Windows?
It's the same sort of thing: untar the sources, configure, make, make install. The code is larger than awk but the process is no more difficult.

For Windows there are pre-built kits. Ditto for a number of other popular operating systems.

paul
Boris Kolpackov
2018-07-18 17:22:12 UTC
Post by Paul Koning
Post by Boris Kolpackov
I wonder what will be the expected way to obtain a suitable version of
Python if one is not available on the build machine? With awk I can
build it from source pretty much anywhere. Is building newer versions
of Python on older targets a similarly straightforward process (somehow
I doubt it)? What about Windows?
It's the same sort of thing: untar the sources, configure, make, make
install.
Will this also install all the Python packages one might plausibly want
to use in GCC?
Paul Koning
2018-07-18 17:29:54 UTC
Post by Boris Kolpackov
Post by Paul Koning
Post by Boris Kolpackov
I wonder what will be the expected way to obtain a suitable version of
Python if one is not available on the build machine? With awk I can
build it from source pretty much anywhere. Is building newer versions
of Python on older targets a similarly straightforward process (somehow
I doubt it)? What about Windows?
It's the same sort of thing: untar the sources, configure, make, make
install.
Will this also install all the Python packages one might plausible want
to use in GCC?
It installs the entire standard Python library (corresponding to the 1800+ pages of the library manual). I expect that will easily cover anything GCC might want to do.

paul
Matthias Klose
2018-07-18 18:03:59 UTC
Post by Paul Koning
Post by Boris Kolpackov
Post by Paul Koning
Post by Boris Kolpackov
I wonder what will be the expected way to obtain a suitable version of
Python if one is not available on the build machine? With awk I can
build it from source pretty much anywhere. Is building newer versions
of Python on older targets a similarly straightforward process (somehow
I doubt it)? What about Windows?
It's the same sort of thing: untar the sources, configure, make, make
install.
Windows binaries and MacOSX binaries are available from upstream. The build
process on *ix targets is autoconf based and as easy as for awk/gawk.
Post by Paul Koning
Post by Boris Kolpackov
Will this also install all the Python packages one might plausible want
to use in GCC?
Some extension modules depend on external libraries, but even if those don't
exist, the build succeeds without building these extension modules. The sources
come with embedded libs for zlib, libmpdec, libexpat. They don't include
libffi (only in 3.7), libsqlite, libgdbm, libbluetooth, libdb. I suppose the
usage of such modules should be banned by policy. The only thing needed is either
libdb (Berkeley/Sleepycat) or gdbm to build the anydbm module, which might be
necessary.
Post by Paul Koning
It installs the entire standard Python library (corresponding to the 1800+ pages of the library manual). I expect that will easily cover anything GCC might want to do.
The current usage of awk and perl doesn't include any third party libraries.
That's where the usage of Python should start.
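
A policy like this could be checked mechanically. The following is only a sketch under stated assumptions: ALLOWED_MODULES is a hypothetical allow-list and disallowed_imports is an illustrative helper, not an existing GCC tool. It uses the standard ast module to list the packages a script imports that fall outside the list.

```python
import ast

# Hypothetical allow-list; a real policy would name the modules it permits.
ALLOWED_MODULES = {"argparse", "collections", "os", "re", "sys"}


def disallowed_imports(source):
    """Return the sorted names of imported packages not on the allow-list."""
    found = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            # "import a.b" counts as a use of package "a".
            found.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom):
            # Skip relative imports (level > 0), which stay inside the tree.
            if node.module and node.level == 0:
                found.add(node.module.split(".")[0])
    return sorted(found - ALLOWED_MODULES)
```

Run over each build script, an empty result would mean the script stays within the permitted subset.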

Matthias
Martin Liška
2018-07-20 10:07:02 UTC
Post by Matthias Klose
Post by Paul Koning
Post by Boris Kolpackov
Post by Paul Koning
Post by Boris Kolpackov
I wonder what will be the expected way to obtain a suitable version of
Python if one is not available on the build machine? With awk I can
build it from source pretty much anywhere. Is building newer versions
of Python on older targets a similarly straightforward process (somehow
I doubt it)? What about Windows?
It's the same sort of thing: untar the sources, configure, make, make
install.
Windows binaries and MacOSX binaries are available from upstream. The build
process on *ix targets is autoconf based and easy as for awk/gawk.
Post by Paul Koning
Post by Boris Kolpackov
Will this also install all the Python packages one might plausible want
to use in GCC?
some extension modules depend on external libraries, but even if those don't
exist, the build succeeds without building these extension modules. The sources
come with embedded libs for zlib, libmpdec, libexpat. They don't include
libffi (only in 3.7), libsqlite, libgdbm, libbluetooth, libdb. I suppose the
usage of such modules should be banned by policy. The only needed thing is any
of libdb (Berkley/SleepyCat) or gdbm to build the anydbm module which might be
necessary.
Post by Paul Koning
It installs the entire standard Python library (corresponding to the 1800+ pages of the library manual). I expect that will easily cover anything GCC might want to do.
The current usage of awk and perl doesn't include any third party libraries.
That's where the usage of Python should start with.
Thank you, Matthias, for explaining the dependency issues. I can confirm
that the option handling scripts can easily work without any fancy modules.

Martin
Post by Matthias Klose
Matthias
Konovalov, Vadim
2018-07-19 13:48:19 UTC
From: Paul Koning
Post by Boris Kolpackov
I wonder what will be the expected way to obtain a suitable version of
Python if one is not available on the build machine? With awk I can
build it from source pretty much anywhere. Is building newer versions
of Python on older targets a similarly straightforward process
(somehow I doubt it)? What about Windows?
It's the same sort of thing: untar
the sources, configure, make, make install. The code is larger than awk but
the process is no more difficult.
The Python build chain on Windows does not support building with gcc.

It was a surprise to me to discover that, but this is how it is.

Very inconvenient.
For Windows there are pre-built kits. Ditto
for a number of other popular operating systems.
This suits simple cases or "popular" ones, but greatly complicates things if it isn't
Karsten Merker
2018-07-19 20:20:31 UTC
Post by David Malcolm
Post by Martin Liška
I've recently touched AWK option generate machinery and it's
quite unpleasant to make any adjustments. My question is
simple: can we starting using a scripting language like Python
and replace usage of the AWK scripts? It's probably question
for Steering committee, but I would like to see feedback from
community.
As you know, I'm a fan of Python. As I noted elsewhere in this
thread, one issue is Python 2 vs Python 3 (and minimum
versions). Within Python 2.*, Python 2.6 onwards is broadly
compatible with Python 3.*, and there's a well-known common
subset that works in both languages.
To what extent would this complicate bootstrap? (I don't think
so, in that it would appear to be just an external build-time
dependency on the build machine).
Would this make it harder for people to build GCC? It's one
more dependency, but CPython is widely available and relatively
easy to build. (I don't have experience of doing bring-up of a
new architecture, though).
Hello,

I have recently been working on bringing up a new Debian port for
the riscv64 architecture from scratch, so I would like to add
some of my personal experiences here.

Adding a dependency on python for building gcc would make life
for distribution porters quite a bit harder. There are a bunch
of packages that are more or less essential for a modern Linux
distribution but at the same time extremely difficult to properly
cross-build. For a distribution porter trying to bootstrap a new
architecture, this means that one has to resort to native
building sooner or later, i.e. one has to build native toolchain
packages and then work forward from there. During the bootstrap
process it is often necessary to break dependency cycles and
natively rebuild toolchain packages with different build-profiles
enabled, or to build newer versions of the same toolchain packages
with bugfixes for the new architecture.

A dependency on python would mean that to be able to do a native
rebuild of the toolchain one would need a native python. The
problem here is that python has an enormous number of transitive
build-dependencies and not all of them are easily cross-buildable,
i.e. one needs a native compiler to build some of them in a
bootstrap scenario. This can lead to a catch-22-style situation
where one would need a native python package and its dependencies
for natively building the gcc package and a native gcc package
for building (some of) the dependencies of the python package.

With awk we don't have this problem as in contrast to python awk
doesn't pull in any dependencies that aren't required by gcc
anyway. From a distro porter's point of view I would therefore
appreciate very much if it would be possible to avoid adding a
python dependency to the gcc build process.

Regards,
Karsten

P.S.: I am not subscribed to the list, so it would be nice
if you could CC me on replies.
--
Pursuant to Section 28(4) of the German Federal Data Protection Act
(Bundesdatenschutzgesetz), I object to the use and disclosure of my personal
data for the purposes of advertising or market or opinion research.
Matthias Klose
2018-07-20 10:01:03 UTC
Post by Karsten Merker
Post by David Malcolm
Post by Martin Liška
I've recently touched AWK option generate machinery and it's
quite unpleasant to make any adjustments. My question is
simple: can we starting using a scripting language like Python
and replace usage of the AWK scripts? It's probably question
for Steering committee, but I would like to see feedback from
community.
As you know, I'm a fan of Python. As I noted elsewhere in this
thread, one issue is Python 2 vs Python 3 (and minimum
versions). Within Python 2.*, Python 2.6 onwards is broadly
compatible with Python 3.*, and there's a well-known common
subset that works in both languages.
To what extent would this complicate bootstrap? (I don't think
so, in that it would appear to be just an external build-time
dependency on the build machine).
Would this make it harder for people to build GCC? It's one
more dependency, but CPython is widely available and relatively
easy to build. (I don't have experience of doing bring-up of a
new architecture, though).
Hello,
I have recently been working on bringing up a new Debian port for
the riscv64 architecture from scratch, so I would like to add
some of my personal experiences here.
Adding a dependency on python for building gcc would make life
for distribution porters quite a bit harder. There are a bunch
of packages that are more or less essential for a modern Linux
distribution but at the same time extremely difficult to properly
cross-build. For a distribution porter trying to bootstrap a new
architecture, this means that one has to resort to native
building sooner or later, i.e. one has to build native toolchain
packages and then work forward from there. During the bootstrap
process it is often necessary to break dependency cycles and
natively rebuild toolchain packages with different build-profiles
enabled, or to build newer versions of the same toolchain packages
with bugfixes for the new architecture.
A dependency on python would mean that to be able to do a native
rebuild of the toolchain one would need a native python. The
problem here is that python has an enormous number of transitive
build-dependencies and not all of them are easily cross-buildable,
i.e. one needs a native compiler to build some of them in a
bootstrap scenario. This can lead to a catch-22-style situation
where one would need a native python package and its dependencies
for natively building the gcc package and a native gcc package
for building (some of) the dependencies of the python package.
With awk we don't have this problem as in contrast to python awk
doesn't pull in any dependencies that aren't required by gcc
anyway. From a distro porter's point of view I would therefore
appreciate very much if it would be possible to avoid adding a
python dependency to the gcc build process.
I don't see that as an issue. As said in another reply in this thread, you can
do a staged python build, which has the same build dependencies as awk (maybe
except the db/gdbm module). And if you need to, you can cross build python as
well, more easily than for example perl or guile.

Matthias
Martin Liška
2018-07-20 10:05:49 UTC
Post by Karsten Merker
Post by David Malcolm
Post by Martin Liška
I've recently touched AWK option generate machinery and it's
quite unpleasant to make any adjustments. My question is
simple: can we starting using a scripting language like Python
and replace usage of the AWK scripts? It's probably question
for Steering committee, but I would like to see feedback from
community.
As you know, I'm a fan of Python. As I noted elsewhere in this
thread, one issue is Python 2 vs Python 3 (and minimum
versions). Within Python 2.*, Python 2.6 onwards is broadly
compatible with Python 3.*, and there's a well-known common
subset that works in both languages.
To what extent would this complicate bootstrap? (I don't think
so, in that it would appear to be just an external build-time
dependency on the build machine).
Would this make it harder for people to build GCC? It's one
more dependency, but CPython is widely available and relatively
easy to build. (I don't have experience of doing bring-up of a
new architecture, though).
Hello,
I have recently been working on bringing up a new Debian port for
the riscv64 architecture from scratch, so I would like to add
some of my personal experiences here.
Adding a dependency on python for building gcc would make life
for distribution porters quite a bit harder. There are a bunch
of packages that are more or less essential for a modern Linux
distribution but at the same time extremely difficult to properly
cross-build. For a distribution porter trying to bootstrap a new
architecture, this means that one has to resort to native
building sooner or later, i.e. one has to build native toolchain
packages and then work forward from there. During the bootstrap
process it is often necessary to break dependency cycles and
natively rebuild toolchain packages with different build-profiles
enabled, or to build newer versions of the same toolchain packages
with bugfixes for the new architecture.
A dependency on python would mean that to be able to do a native
rebuild of the toolchain one would need a native python. The
problem here is that python has an enormous number of transitive
build-dependencies and not all of them are easily cross-buildable,
i.e. one needs a native compiler to build some of them in a
bootstrap scenario. This can lead to a catch-22-style situation
where one would need a native python package and its dependencies
for natively building the gcc package and a native gcc package
for building (some of) the dependencies of the python package.
Hi.

This issue is covered fairly thoroughly in this thread. You're not on CC, so
please take a look:

https://gcc.gnu.org/ml/gcc/2018-07/msg00233.html

So for your use case, cross compilation of python (without fancy
modules that have dependencies) should work for you to make a transition
to a native distribution.

Martin
Post by Karsten Merker
With awk we don't have this problem as in contrast to python awk
doesn't pull in any dependencies that aren't required by gcc
anyway. From a distro porter's point of view I would therefore
appreciate very much if it would be possible to avoid adding a
python dependency to the gcc build process.
Regards,
Karsten
P.S.: I am not subscribed to the list, so it would be nice
if you could CC me on replies.
Joseph Myers
2018-07-23 14:16:47 UTC
Post by Martin Liška
I've recently touched AWK option generate machinery and it's quite
unpleasant to make any adjustments. My question is simple: can we
starting using a scripting language like Python and replace usage of the
AWK scripts? It's probably question for Steering committee, but I would
like to see feedback from community.
I'd prefer Python to Awk for this code.
Post by Martin Liška
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81397
More generally, I don't think there are any checks that flags specified
for options are known flags at all; I expect a typo in a flag to result in
it being silently ignored.

Common code that reads .opt files into some logical data structure,
complete with validation including that all flags specified are in the
list of valid flags, followed by converting those structures to whatever
output is required, seems appropriate to me.
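
As a rough sketch of that shape (read, validate, then hand structures to an emitter), assuming a simplified model of the .opt format: blank-line-separated records whose first line is the option name, second line a space-separated property list, and remaining lines help text, with ';' starting a comment line. KNOWN_FLAGS is a deliberately tiny hypothetical subset, and parse_opt_text is an illustrative name, not existing GCC code.

```python
import re

# Hypothetical subset of known properties; the real list is much longer.
KNOWN_FLAGS = {"Common", "Var", "Init", "Optimization", "Joined",
               "RejectNegative", "Warning"}

# A property is a name with an optional parenthesised argument: Var(flag_foo)
PROPERTY_RE = re.compile(r"(\w+)(?:\(([^)]*)\))?")


def parse_opt_text(text):
    """Parse simplified .opt text into a list of option dicts."""
    options = []
    for record in re.split(r"\n\s*\n", text.strip()):
        # Drop comment lines, which start with ';'.
        lines = [l for l in record.splitlines() if not l.startswith(";")]
        if len(lines) < 2:
            continue  # not an option definition in this simplified model
        name, props = lines[0].strip(), lines[1]
        flags = {}
        for match in PROPERTY_RE.finditer(props):
            flag, arg = match.group(1), match.group(2)
            # A misspelled flag is reported instead of silently ignored.
            if flag not in KNOWN_FLAGS:
                raise ValueError("option %s: unknown flag %r" % (name, flag))
            flags[flag] = arg
        options.append({"name": name, "flags": flags,
                        "help": " ".join(lines[2:])})
    return options
```

With this structure a typo such as "Comon" fails at parse time, and any output generator only ever sees validated records.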
--
Joseph S. Myers
***@codesourcery.com
Michael Matz
2018-07-27 14:19:45 UTC
Hi,
Post by Martin Liška
1) gcc/optc-save-gen.awk is full of copy&pasted code, due to lack of flags type classes multiple
global variables are created (var_opt_char, var_opt_string, ...)
2) similar happens in gcc/opth-gen.awk
3) we do very many regex matches (mainly in gcc/opt-functions.awk), I believe
we should come up with a structured option format that will make parsing and
processing much simpler.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81397
gcc/config/arm/parsecpu.awk
./gcc/config/arm/arm-cpus.in
I guess having a well-defined structured format for *.opt files will make
it easier to write generated opt files?
I'm attaching a prototype that can transform optionlist into options-save.c
that can be compiled and works.
I'm looking forward to a feedback.
Using any python scripts as part of generally building GCC (i.e. where the
generated files aren't prepackaged) will introduce a python dependency for
distro packages. And for those distros that bootstrap a core cycle of
packages (e.g. *SUSE) this will include python (and all its dependencies)
into that bootstrap cycle.

That will be terrible.


Ciao,
Michael.
Michael Matz
2018-07-27 14:31:54 UTC
Hi,
Post by Michael Matz
Using any python scripts as part of generally building GCC (i.e. where
the generated files aren't prepackaged) will introduce a python
dependency for distro packages. And for those distros that bootstrap a
core cycle of packages (e.g. *SUSE) this will include python (and all
its dependencies) into that bootstrap cycle.
That will be terrible.
Oh, and of course, I haven't read any really convincing arguments for
why python would be so much better than awk to counter the disadvantages.

Building a compiler (especially one that regards itself as a
multi-target/host one) should have extremely few prerequisites (ideally
only a compiler and runtime for the language it's written in), and I
wouldn't call a full python distro that (no matter how much people claim
that getting the necessary subset of python is mostly trivial; compiling
any random awk is trivial, especially given a compiler you already need
anyway; python is not).

Hell, if anything I'd say we should rewrite the awk scripts into POSIX sh
(!). I'll concede that for text processing AWK is nicer ;-)

So, if it's only for a minor convenience of writing some text
processing scripts, no, that's not a good reason to complicate our
prerequisites. (The helper scripts in contrib/ as long as they aren't
used during GCC build can use any fancy language they want)


Ciao,
Michael.
Matthias Klose
2018-07-28 02:29:14 UTC
Post by Michael Matz
Hi,
Post by Michael Matz
Using any python scripts as part of generally building GCC (i.e. where
the generated files aren't prepackaged) will introduce a python
dependency for distro packages. And for those distros that bootstrap a
core cycle of packages (e.g. *SUSE) this will include python (and all
its dependencies) into that bootstrap cycle.
That will be terrible.
Oh, and of course, I haven't read any really convincing arguments for
why python would be so much better than awk to counter the disadvantages.
Building a compiler (especially one that regards itself as a
multi-target/host one) should have extremely few prerequisites (ideally
only a compiler and runtime for the language its written in), and I
wouldn't call a full python distro that (no matter how much people claim
that getting the necessary subset of python is mostly trivial. compiling
any random awk is trivial, especially given a compiler you already need
anyway; python is not).
That very much depends on your bootstrap system supporting staged builds. You
already have to do that for glibc/gcc anyway. But yes, if you think that adding
a staged python build is more complicated ...
Post by Michael Matz
Hell, if anything I'd say we should rewrite the awk scripts into POSIX sh
(!). I'll concede that for text processing AWK is nicer ;-)
So, if it's only for a minor convenience of writing some text
processing scripts, no, that's not a good reason to complicate our
prerequisites. (The helper scripts in contrib/ as long as they aren't
used during GCC build can use any fancy language they want)
Joseph Myers
2018-07-27 14:38:21 UTC
Post by Michael Matz
Using any python scripts as part of generally building GCC (i.e. where the
generated files aren't prepackaged) will introduce a python dependency for
distro packages. And for those distros that bootstrap a core cycle of
packages (e.g. *SUSE) this will include python (and all its dependencies)
into that bootstrap cycle.
I would have expected most concerns to be about builds on non-GNU hosts -
not about builds on GNU/Linux where Python is generally already available
(and differences in Python versions should definitely *not* affect the
generated output, so there should be no increases in the number of
iterations required for any bootstrap cycle to converge).
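
One way to make that property hold in practice is for the generator never to depend on container iteration order, which has differed between Python versions. A minimal sketch, in which emit_enum is a hypothetical helper and the emitted names are only illustrative:

```python
def emit_enum(option_vars):
    """Return C enum text for a mapping of option name -> variable name."""
    lines = ["enum opt_code {"]
    # sorted() fixes the order regardless of how the mapping was built.
    for name in sorted(option_vars):
        lines.append("  OPT_%s,  /* %s */" % (name, option_vars[name]))
    lines.append("  N_OPTS")
    lines.append("};")
    return "\n".join(lines) + "\n"
```

Two mappings with the same contents then produce byte-identical text, whatever order the entries were inserted in.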

We've been having a similar discussion for glibc, both about replacing
uses of perl (optional, but required to build the manual and to run
various tests - python is also already required to run various tests) with
python and about replacing uses of awk (required) with python as well, in
the interests of easier maintainability - and I didn't see any concerns
raised about such a change at all. Of course in the glibc case pretty
much all building is done on GNU hosts (although theoretically you can
cross-compile from non-GNU systems, in practice that's liable to be broken
with e.g. cross-rpcgen not building with random systems' headers, and
probable dependencies on GNU versions of various host tools).

Obviously if you're bootstrapping core packages and their build
dependencies, use in glibc is more or less equivalent to use in GCC. (But
if build dependencies include those involved in testing, you already have
python as one for glibc, and Tcl for GCC, for example.)
--
Joseph S. Myers
***@codesourcery.com
Michael Matz
2018-07-27 14:53:59 UTC
Hi,
Post by Joseph Myers
I would have expected most concerns to be about builds on non-GNU hosts -
not about builds on GNU/Linux where Python is generally already available
(and differences in Python versions should definitely *not* affect the
generated output, so there should be no increases in the number of
iterations required for any bootstrap cycle to converge).
We've been having a similar discussion for glibc, both about replacing
uses of perl (optional, but required to build the manual and to run
various tests - python is also already required to run various tests) with
python and about replacing uses of awk (required) with python as well, in
the interests of easier maintainability - and I didn't see any concerns
raised about such a change at all.
perl is currently included in the bootstrap set. There's no reason why
python couldn't be included as well, but we'd have to make it a limited
python (so that the additional builddeps become at least minimal), and
that leads to further work (decisions and implementation around the
existence of minimal-python and full-python).

And of course the build time of the bootstrap cycle lengthens
non-trivially. Maybe not by much, but still.

I don't know why no concerns were raised during those discussions, but
that can't be taken as an indication that everything is fine with going from
perl to python as part of the non-optional build dependencies. (Optional
deps are always fine; we're breaking out those parts, like testsuite, into
different packages that aren't then part of the bootstrap cycle).
Post by Joseph Myers
Obviously if you're bootstrapping core packages and their build
dependencies, use in glibc is more or less equivalent to use in GCC. (But
if build dependencies include those involved in testing, you already have
python as one for glibc, and Tcl for GCC, for example.)
Testsuites aren't part of the bootstrap cycle if they would have to
enlarge it unduly. Tcl and expect are, though (hmm, I wonder why), as is
perl; they all have trivial buildrequires.


Ciao,
Michael.
Paul Smith
2018-07-27 23:21:48 UTC
Post by Michael Matz
perl is currently included in the bootstrap set. There's no reason
why python couldn't be included as well,
If Perl is already in the bootstrap set and the awk scripts are hard to
maintain then why can't the awk scripts be rewritten in Perl instead of
Python? That would avoid adding more prerequisites and surely Perl is
sufficiently expressive that it can perform these translations just as
well as Python.

I understand some people have an issue with Perl's maintainability but
just because you CAN write difficult to maintain code in Perl doesn't
mean you HAVE to.

I've seen plenty of difficult to understand and maintain Python
scripting... just saying "use Python" is not a panacea for
supportability problems.
Joseph Myers
2018-07-30 14:28:07 UTC
Post by Paul Smith
If Perl is already in the bootstrap set and the awk scripts are hard to
maintain then why can't the awk scripts be rewritten in Perl instead of
Python? That would avoid adding more prerequisites and surely Perl is
sufficiently expressive that it can perform these translations just as
well as Python.
At least in the glibc community we find the current developers generally
prefer Python for such code, so using it in place of Perl (or Awk) works
better for maintainability now.
--
Joseph S. Myers
***@codesourcery.com
David Malcolm
2018-07-28 16:53:55 UTC
Post by Joseph Myers
Post by Michael Matz
Using any python scripts as part of generally building GCC (i.e. where the
generated files aren't prepackaged) will introduce a python dependency for
distro packages. And for those distros that bootstrap a core cycle of
packages (e.g. *SUSE) this will include python (and all its dependencies)
into that bootstrap cycle.
I would have expected most concerns to be about builds on non-GNU hosts -
not about builds on GNU/Linux where Python is generally already available
(and differences in Python versions should definitely *not* affect the
generated output, so there should be no increases in the number of
iterations required for any bootstrap cycle to converge).
We've been having a similar discussion for glibc, both about replacing
uses of perl (optional, but required to build the manual and to run
various tests - python is also already required to run various tests) with
python and about replacing uses of awk (required) with python as well, in
the interests of easier maintainability - and I didn't see any concerns
raised about such a change at all. Of course in the glibc case pretty
much all building is done on GNU hosts (although theoretically you can
cross-compile from non-GNU systems, in practice that's liable to be broken
with e.g. cross-rpcgen not building with random systems' headers, and
probable dependencies on GNU versions of various host tools).
I can certainly remember quite a number of painful issues getting
shaken out by python during the AArch64 bootstrap much before we
published the port upstream that not much other testing was able to
find. It was a good test of the toolchain but if it is required that
you need to have working python on the target *before* you get a
bootstrapped GCC on the system, I'm not sure how helpful / frustrating
that is really to folks trying to bring up a GNU / Linux system
natively. I am concerned that we are increasing the barrier on entry
for such developers.
It is not the majority of developers (but put another way) we do need
to answer the question whether the dependency on python makes it
harder for folks to bring up a new GNU/Linux system on a new
architecture even though it may make life easier in other areas for
working on the compiler.
What are the other areas where we envisage using python in the longer
term for GCC ? option processing is one area, where else ?
FWIW I have a Python module for working with the output of
-fsave-optimization-record (a JSON-based format). It's not clear to me if
that should live in the gcc source tree (and thus tarball) or as a part
of a 3rd-party repository.

Related to that, I'd like to use Python in the testsuite, for verifying
the output of -fsave-optimization-record.

See
https://gcc.gnu.org/ml/gcc-patches/2018-07/msg01546.html
for more info on both of these.

Dave
Post by Joseph Myers
Obviously if you're bootstrapping core packages and their build
dependencies, use in glibc is more or less equivalent to use in GCC. (But
if build dependencies include those involved in testing, you already have
python as one for glibc, and Tcl for GCC, for example.)
This implies that the decision for glibc has been made. while you
imply above that the discussion is still on going ?
regards
Ramana
Post by Joseph Myers
Joseph S. Myers
Joseph Myers
2018-07-30 14:34:34 UTC
Post by Joseph Myers
Obviously if you're bootstrapping core packages and their build
dependencies, use in glibc is more or less equivalent to use in GCC. (But
if build dependencies include those involved in testing, you already have
python as one for glibc, and Tcl for GCC, for example.)
This implies that the decision for glibc has been made. while you
imply above that the discussion is still on going ?
Python has been used for some glibc tests for some time. It's its usage to
replace other Perl and Awk scripts (and especially those required for the
build) that is under discussion - though as no-one has objected to
such a change we may effectively have consensus.

https://sourceware.org/ml/libc-alpha/2018-07/msg00559.html
--
Joseph S. Myers
***@codesourcery.com
Andreas Schwab
2018-07-30 15:13:12 UTC
Post by Joseph Myers
Python has been used for some glibc tests for some time.
Using it for tests is ok, since they are not part of the bootstrap
cycle.

Andreas.
--
Andreas Schwab, SUSE Labs, ***@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE 1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."
konsolebox
2018-07-28 00:26:20 UTC
Just another user here.

I'm not a fan of Python and I don't want it added as a dependency to my
favorite compiler. If I would build a minimal system with a toolchain, I
wouldn't want Python to be a mandatory component, so please don't. Thanks.

P.S. I don't mind Perl. It's a legacy tool next to Awk.
Post by Martin Liška
Hi.
I've recently touched AWK option generate machinery and it's quite unpleasant
to make any adjustments. My question is simple: can we starting using a scripting
language like Python and replace usage of the AWK scripts? It's probably question
for Steering committee, but I would like to see feedback from community.
1) gcc/optc-save-gen.awk is full of copy&pasted code, due to lack of flags
type classes multiple
global variables are created (var_opt_char, var_opt_string, ...)
2) similar happens in gcc/opth-gen.awk
3) we do very many regex matches (mainly in gcc/opt-functions.awk), I believe
we should come up with a structured option format that will make parsing and
processing much simpler.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81397
gcc/config/arm/parsecpu.awk
./gcc/config/arm/arm-cpus.in
I guess having a well-defined structured format for *.opt files will make
it easier to write generated opt files?
I'm attaching a prototype that can transform optionlist into options-save.c
that can be compiled and works.
I'm looking forward to a feedback.
Martin