Discussion:
[PATCH] Fix DCE REG_LIBCALL note moving from noop move insns (PR rtl-optimization/33644)
Jakub Jelinek
2007-10-11 18:49:06 UTC
Permalink
Hi!

The attached testcase fails on ppc with -m32 -O2 -ftrapv.
In *.unshare we have:
(insn 44 43 45 3 pr33644.c:15 (set (reg:SI 3 3)
(reg/f:SI 155 [ D.1244 ])) -1 (expr_list:REG_LIBCALL_ID (const_int 1 [0x1])
(insn_list:REG_LIBCALL 47 (nil))))

(insn 45 44 46 3 pr33644.c:15 (set (reg:SI 4 4)
(reg/f:SI 115 virtual-stack-vars)) -1 (expr_list:REG_LIBCALL_ID (const_int 1 [0x1])
(nil)))
...
(insn 47 46 48 3 pr33644.c:15 (set (reg:SI 156)
(reg:SI 3 3)) -1 (expr_list:REG_LIBCALL_ID (const_int 1 [0x1])
(insn_list:REG_RETVAL 44 (expr_list:REG_EQUAL (minus:SI (reg/f:SI 155 [ D.1244 ])
(reg/f:SI 115 virtual-stack-vars))
(nil)))))

First vregs inserts a new insn between 44 and 45 to instantiate
virtual-stack-vars:
(insn 44 43 84 3 pr33644.c:15 (set (reg:SI 3 3)
(reg/f:SI 155 [ D.1244 ])) 312 {*movsi_internal1} (expr_list:REG_LIBCALL_ID (const_int 1 [0x1])
(insn_list:REG_LIBCALL 47 (nil))))

(insn 84 44 45 3 pr33644.c:15 (set (reg:SI 169)
(plus:SI (reg/f:SI 113 sfp)
(const_int 8 [0x8]))) -1 (nil))

(insn 45 84 46 3 pr33644.c:15 (set (reg:SI 4 4)
(reg:SI 169)) 312 {*movsi_internal1} (expr_list:REG_LIBCALL_ID (const_int 1 [0x1])
(nil)))
and later on CSE finds sfp + 8 is already in pseudo 168:
(insn 44 43 84 2 pr33644.c:15 (set (reg:SI 3 3 [ D.1244 ])
(reg:SI 3 3 [ D.1244 ])) 312 {*movsi_internal1} (expr_list:REG_LIBCALL_ID (const_int 1 [0x1])
(insn_list:REG_LIBCALL 47 (nil))))

(insn 84 44 45 2 pr33644.c:15 (set (reg/f:SI 169)
(reg/f:SI 168)) 312 {*movsi_internal1} (expr_list:REG_EQUAL (plus:SI (reg/f:SI 113 sfp)
(const_int 8 [0x8]))
(nil)))

(insn 45 84 46 2 pr33644.c:15 (set (reg:SI 4 4)
(reg/f:SI 169)) 312 {*movsi_internal1} (expr_list:REG_LIBCALL_ID (const_int 1 [0x1])
(nil)))

and then forwprop replaces pseudo 169 with pseudo 168 in insn 45:
(insn 44 42 84 2 pr33644.c:15 (set (reg:SI 3 3 [ D.1244 ])
(reg:SI 3 3 [ D.1244 ])) 312 {*movsi_internal1} (expr_list:REG_LIBCALL_ID (const_int 1 [0x1])
(insn_list:REG_LIBCALL 47 (nil))))

(insn 84 44 45 2 pr33644.c:15 (set (reg/f:SI 169)
(reg/f:SI 168)) 312 {*movsi_internal1} (expr_list:REG_EQUAL (plus:SI (reg/f:SI 113 sfp)
(const_int 8 [0x8]))
(nil)))

(insn 45 84 46 2 pr33644.c:15 (set (reg:SI 4 4)
(reg/f:SI 168)) 312 {*movsi_internal1} (expr_list:REG_LIBCALL_ID (const_int 1 [0x1])
(nil)))

When DCE is run during GCSE, insn 44 is being deleted as it is a noop_move_p
and moves REG_LIBCALL note to the following insn (84). But that one doesn't have
REG_LIBCALL_ID note and in this case wasn't even marked, as it is unneeded
and so the next delete_unmarked_insns loop iteration kills even that insn.
This means we end up with a REG_RETVAL note without corresponding RET_LIBCALL
and ICE later on.

I believe it is not a bug to have non-REG_LIBCALL_ID insns within
a libcall sequence as long as they are either useless or something needed
by the libcall sequence insns (i.e. they can be deleted whenever the whole
libcall sequence is killed, but can be also deleted if they are not needed
individually). If so, attached are two possible patches which fix this,
one will move the REG_LIBCALL note to the first insn that is actually not
going to be deleted, the other to the first REG_LIBCALL_ID insn (which is supposed
to be marked by mark_libcall). I prefer the latter patch.

Ok for trunk?

Jakub
Eric Botcazou
2007-10-13 14:58:47 UTC
Permalink
Post by Jakub Jelinek
I believe it is not a bug to have non-REG_LIBCALL_ID insns within
a libcall sequence as long as they are either useless or something needed
by the libcall sequence insns (i.e. they can be deleted whenever the whole
libcall sequence is killed, but can be also deleted if they are not needed
individually).
I think this demonstrates that REG_LIBCALL_ID is a bad idea: we now have insns
between REG_LIBCALL and REG_RETVAL notes that don't seem to be part of the
libcall sequence. AFAICS it is used by a unique pass (dce.c) so I think the
pass should be rewritten to deal with libcalls like the other passes.
Post by Jakub Jelinek
If so, attached are two possible patches which fix this, one will move the
REG_LIBCALL note to the first insn that is actually not going to be deleted,
the other to the first REG_LIBCALL_ID insn (which is supposed to be marked
by mark_libcall). I prefer the latter patch.
Given the above remark, I'd say that the former patch is more correct though.
--
Eric Botcazou
Daniel Berlin
2007-10-13 18:42:20 UTC
Permalink
Post by Eric Botcazou
Post by Jakub Jelinek
I believe it is not a bug to have non-REG_LIBCALL_ID insns within
a libcall sequence as long as they are either useless or something needed
by the libcall sequence insns (i.e. they can be deleted whenever the whole
libcall sequence is killed, but can be also deleted if they are not needed
individually).
I think this demonstrates that REG_LIBCALL_ID is a bad idea: we now have insns
between REG_LIBCALL and REG_RETVAL notes that don't seem to be part of the
libcall sequence.
Which is simply a bug, not indicitative bad idea.
Post by Eric Botcazou
AFAICS it is used by a unique pass (dce.c) so I think the
pass should be rewritten to deal with libcalls like the other passes.
It's only a bad idea if you think doing linear walks all the time to
figure out whether an INSN is part of a libcall is a good idea.
IMHO, REG_LIBCALL and REG_RETVAL is the real bad idea.

Rewriting DCE to deal with libcalls like "every other pass" will add
several O(n) walks in bad places.
Eric Botcazou
2007-10-14 12:19:10 UTC
Permalink
Post by Daniel Berlin
Post by Eric Botcazou
I think this demonstrates that REG_LIBCALL_ID is a bad idea: we now have
insns between REG_LIBCALL and REG_RETVAL notes that don't seem to be part
of the libcall sequence.
Which is simply a bug, not indicitative bad idea.
Properly fixing the bug would mean adding code all over the RTL middle-end to
propagate REG_LIBCALL_ID . The semantics of libcalls is clear: all insns
between REG_LIBCALL and REG_RETVAL are part of the libcall sequence.
Post by Daniel Berlin
It's only a bad idea if you think doing linear walks all the time to
figure out whether an INSN is part of a libcall is a good idea.
Can't we just do it only one time? I'll give it a try.
Post by Daniel Berlin
IMHO, REG_LIBCALL and REG_RETVAL is the real bad idea.
Possibly, but I think that trying to undo it with REG_LIBCALL_ID doesn't help.
--
Eric Botcazou
Daniel Berlin
2007-10-14 15:19:53 UTC
Permalink
Post by Eric Botcazou
Post by Daniel Berlin
Post by Eric Botcazou
I think this demonstrates that REG_LIBCALL_ID is a bad idea: we now have
insns between REG_LIBCALL and REG_RETVAL notes that don't seem to be part
of the libcall sequence.
Which is simply a bug, not indicitative bad idea.
Properly fixing the bug would mean adding code all over the RTL middle-end to
propagate REG_LIBCALL_ID . The semantics of libcalls is clear: all insns
between REG_LIBCALL and REG_RETVAL are part of the libcall sequence.
Clear != useful :)

For the record, REG_LIBCALL_ID also has very clear semantics.
REG_LIBCALL_ID signifies the current instruction is part of a libcall
sequence. All insns that are part of the same libcall share the same
ID number.
Post by Eric Botcazou
Post by Daniel Berlin
It's only a bad idea if you think doing linear walks all the time to
figure out whether an INSN is part of a libcall is a good idea.
Can't we just do it only one time? I'll give it a try.
Why make DCE slower instead of fixing the bug?
Post by Eric Botcazou
Post by Daniel Berlin
IMHO, REG_LIBCALL and REG_RETVAL is the real bad idea.
Possibly, but I think that trying to undo it with REG_LIBCALL_ID doesn't help.
Why?
It is an incremental step towards fixing the underlying problem.
Kenneth Zadeck
2007-10-14 16:29:44 UTC
Permalink
Post by Daniel Berlin
Post by Eric Botcazou
Post by Daniel Berlin
Post by Eric Botcazou
I think this demonstrates that REG_LIBCALL_ID is a bad idea: we now have
insns between REG_LIBCALL and REG_RETVAL notes that don't seem to be part
of the libcall sequence.
Which is simply a bug, not indicitative bad idea.
Properly fixing the bug would mean adding code all over the RTL middle-end to
propagate REG_LIBCALL_ID . The semantics of libcalls is clear: all insns
between REG_LIBCALL and REG_RETVAL are part of the libcall sequence.
Clear != useful :)
For the record, REG_LIBCALL_ID also has very clear semantics.
REG_LIBCALL_ID signifies the current instruction is part of a libcall
sequence. All insns that are part of the same libcall share the same
ID number.
Post by Eric Botcazou
Post by Daniel Berlin
It's only a bad idea if you think doing linear walks all the time to
figure out whether an INSN is part of a libcall is a good idea.
Can't we just do it only one time? I'll give it a try.
Why make DCE slower instead of fixing the bug?
Post by Eric Botcazou
Post by Daniel Berlin
IMHO, REG_LIBCALL and REG_RETVAL is the real bad idea.
Possibly, but I think that trying to undo it with REG_LIBCALL_ID doesn't help.
Why?
It is an incremental step towards fixing the underlying problem.
Eric,

While, I belive that it will be possible for you to make dce not need
the LIB_CALL_ID, I think that it misses the points that danny was
trying to make.

We added the LIB_CALL_ID because we foresaw that as we modernize the
rtl passes, that more and more of these passes will be using either
use-def or def-use chains. For these optimizations, you rarely make
linear passes over the instructions, you simply follow the chains from
one insn of interest to another. The LIB_CALL_ID provides a simple
constant time test to find out if you have landed inside of a lib
call. Removing this structure will only make it harder to add passes
that only deal with the instructions of interest and essentially
forces us into the endless inefficient rescanning of blocks paradyme
that the rtl passes are famous for.

I have not looked at this bug. I am happy to if you want. I am sure
that it will be trivial to modify the pass that moved/created the insn
in the middle of the libcall to inherit the LIB_CALL_ID from the
previous insn.

Kenny
Jakub Jelinek
2007-10-14 21:45:21 UTC
Permalink
Post by Kenneth Zadeck
I have not looked at this bug. I am happy to if you want. I am sure
that it will be trivial to modify the pass that moved/created the insn
in the middle of the libcall to inherit the LIB_CALL_ID from the
previous insn.
That is not desirable, if anything in this case the insn should be
added before the whole libcall sequence rather than before the insn
that actually needs it. Otherwise, useless insns added to the libcall
sequences wouldn't be ever DCEd.

While it might be easy to modify the instantiate_virtual_regs, there
are dozens of other passes that do similar things, so at least for 4.3 it is
highly unlikely they will be all modified.

Jakub
Eric Botcazou
2007-10-17 21:01:23 UTC
Permalink
Post by Jakub Jelinek
That is not desirable, if anything in this case the insn should be
added before the whole libcall sequence rather than before the insn
that actually needs it. Otherwise, useless insns added to the libcall
sequences wouldn't be ever DCEd.
Right, libcalls are handled backwards, i.e. as optimization barriers.

Here's the patch I'm toying with.


* cfgcleanup.c: Do not include dce.h.
* cfgrtl.c (delete_insn_chain_and_edges): Resurrect.
* combine.c (distribute_notes): Delete REG_LIBCALL_ID case.
* dce.c (something_changed): Delete.
(libcall_dead_p): New predicate.
(delete_unmarked_insns): Use it to delete dead libcalls.
Deal with REG_LIBCALL and REG_RETVAL notes.
(prescan_insns_for_dce): Do not mark libcalls as needed.
(mark_reg_dependencies): Likewise.
(dce_process_block): Likewise.
(fast_dce): Delete unused local variable.
(run_fast_dce): Do not return a value.
* dce.h (run_fast_dce): Adjust prototype.
* optabs.c (libcall_id): Delete.
(maybe_encapsulate_block): Do not emit REG_LIBCALL_ID notes.
(emit_no_conflict_block): Do not look for REG_LIBCALL_ID notes.
* reload1.c (reload): Delete REG_LIBCALL_ID case.
* rtl.h (delete_insn_chain_and_edges): Resurrect prototype.
* see.c (see_update_relevancy): Look for REG_LIBCALL and REG_RETVAL
notes instead of REG_LIBCALL_ID notes.
* reg-notes.def (LIBCALL_ID): Delete.
* Makefile.in (see.o): Add dce.h dependency.
(cfgcleanup.o): Remove dce.h dependency.
--
Eric Botcazou
Eric Botcazou
2007-10-20 14:10:38 UTC
Permalink
Post by Eric Botcazou
Here's the patch I'm toying with.
Revised version attached, tested on i586-suse-linux, x86_64-suse-linux,
ia64-suse-linux and sparc-sun-solaris2.8; as previously stated, I won't
install it, so it's up to you.


2007-10-20 Eric Botcazou <***@libertysurf.fr>

PR rtl-optimization/33644
* cfgcleanup.c: Do not include dce.h.
* cfgrtl.c (delete_insn_chain_and_edges): Resurrect.
* combine.c (distribute_notes): Delete REG_LIBCALL_ID case.
* dce.c (something_changed): Delete.
(libcall_dead_p): New predicate.
(delete_unmarked_insns): Use it to delete dead libcalls.
Deal with REG_LIBCALL and REG_RETVAL notes.
(prescan_libcall_for_dce): New function.
(prescan_insns_for_dce): Use it to deal with libcalls.
(mark_reg_dependencies): Do nothing special for libcalls.
(dce_process_block): Likewise.
(fast_dce): Delete unused local variable.
(run_fast_dce): Do not return a value.
* dce.h (struct df): Delete.
(run_fast_dce): Adjust prototype.
* optabs.c (libcall_id): Delete.
(maybe_encapsulate_block): Do not emit REG_LIBCALL_ID notes.
(emit_no_conflict_block): Do not look for REG_LIBCALL_ID notes.
* reload1.c (reload): Delete REG_LIBCALL_ID case.
* rtl.h (delete_insn_chain_and_edges): Resurrect prototype.
* see.c (see_update_relevancy): Look for REG_LIBCALL and REG_RETVAL
notes instead of REG_LIBCALL_ID notes.
* reg-notes.def (LIBCALL_ID): Delete.
* Makefile.in (see.o): Add dce.h dependency.
(cfgcleanup.o): Remove dce.h dependency.
--
Eric Botcazou
Kenneth Zadeck
2007-10-22 16:49:15 UTC
Permalink
Post by Eric Botcazou
Post by Eric Botcazou
Here's the patch I'm toying with.
Revised version attached, tested on i586-suse-linux, x86_64-suse-linux,
ia64-suse-linux and sparc-sun-solaris2.8; as previously stated, I won't
install it, so it's up to you.
PR rtl-optimization/33644
* cfgcleanup.c: Do not include dce.h.
* cfgrtl.c (delete_insn_chain_and_edges): Resurrect.
* combine.c (distribute_notes): Delete REG_LIBCALL_ID case.
* dce.c (something_changed): Delete.
(libcall_dead_p): New predicate.
(delete_unmarked_insns): Use it to delete dead libcalls.
Deal with REG_LIBCALL and REG_RETVAL notes.
(prescan_libcall_for_dce): New function.
(prescan_insns_for_dce): Use it to deal with libcalls.
(mark_reg_dependencies): Do nothing special for libcalls.
(dce_process_block): Likewise.
(fast_dce): Delete unused local variable.
(run_fast_dce): Do not return a value.
* dce.h (struct df): Delete.
(run_fast_dce): Adjust prototype.
* optabs.c (libcall_id): Delete.
(maybe_encapsulate_block): Do not emit REG_LIBCALL_ID notes.
(emit_no_conflict_block): Do not look for REG_LIBCALL_ID notes.
* reload1.c (reload): Delete REG_LIBCALL_ID case.
* rtl.h (delete_insn_chain_and_edges): Resurrect prototype.
* see.c (see_update_relevancy): Look for REG_LIBCALL and REG_RETVAL
notes instead of REG_LIBCALL_ID notes.
* reg-notes.def (LIBCALL_ID): Delete.
* Makefile.in (see.o): Add dce.h dependency.
(cfgcleanup.o): Remove dce.h dependency.
Eric,

This patch is fine, as far as it goes. But the problem is that i do not
actually think that it fixes PR33644.

According to Jakub, the problem is that the bad insn should have been
deleted before the libcall rather than inside of it.

The problem as you have highlighted it, is that when dse sees a libcall
where the return value is dead, it is supposed to delete all insns
between the one with the libcall regnote and the retval regnote. If an
insn is improperly added to the middle of the libcall, then if that
libcall retval goes dead, that insn will also be deleted and we will
have a much harder to track down bug.

Now, of course, i could be wrong, and the proper place for this insn to
have been inserted was between the insn with the libcall note and the
retval note, and in that case you are free to commit this patch, but if
the net effect of this patch is simply to mask the problem, the patch is
not acceptable. All of this hinges on whether it really is should be
inserted into the middle of the libcall. I thought that the last word
on this was Jakub telling me to fix the bug by changing the insertion
point. But I could have easily missed something.

I was quite excited by the conversation on this thread last week. I
think that it sounds like we are very close to being able to just get
rid of lib calls altogether and I will be happy to contribute to this
effort as soon as stage 1 opens. But in the meantime, this bug does
need to be fixed.
Eric Botcazou
2007-10-22 18:37:20 UTC
Permalink
Post by Kenneth Zadeck
This patch is fine, as far as it goes. But the problem is that i do not
actually think that it fixes PR33644.
Of course it does, by deleting the dead insns and adjusting the notes.
Post by Kenneth Zadeck
I was quite excited by the conversation on this thread last week. I
think that it sounds like we are very close to being able to just get
rid of lib calls altogether and I will be happy to contribute to this
effort as soon as stage 1 opens.
Yes, that's also my impression.


So, Jakub, OK for the patch?
--
Eric Botcazou
Jakub Jelinek
2007-10-22 18:44:28 UTC
Permalink
Post by Eric Botcazou
Post by Kenneth Zadeck
This patch is fine, as far as it goes. But the problem is that i do not
actually think that it fixes PR33644.
Of course it does, by deleting the dead insns and adjusting the notes.
Post by Kenneth Zadeck
I was quite excited by the conversation on this thread last week. I
think that it sounds like we are very close to being able to just get
rid of lib calls altogether and I will be happy to contribute to this
effort as soon as stage 1 opens.
Yes, that's also my impression.
So, Jakub, OK for the patch?
Fine with me, not that I have much say in it though.

Jakub
Eric Botcazou
2007-10-22 19:30:34 UTC
Permalink
Post by Jakub Jelinek
Fine with me, not that I have much say in it though.
Thanks. I ended up committing a patch that I had said I wouldn't commit. :-)
--
Eric Botcazou
Eric Botcazou
2007-10-15 13:30:53 UTC
Permalink
Post by Kenneth Zadeck
We added the LIB_CALL_ID because we foresaw that as we modernize the
rtl passes, that more and more of these passes will be using either
use-def or def-use chains. For these optimizations, you rarely make
linear passes over the instructions, you simply follow the chains from
one insn of interest to another.
Define "rarely". :-) The only pass that uses REG_LIBCALL_ID is dce.c and the
first thing it does is a linear pass over the instructions...
Post by Kenneth Zadeck
The LIB_CALL_ID provides a simple constant time test to find out if you have
landed inside of a lib call. Removing this structure will only make it
harder to add passes that only deal with the instructions of interest and
essentially forces us into the endless inefficient rescanning of blocks
paradyme that the rtl passes are famous for.
OK, but it's half-backed work since only the new dce.c pass among the RTL
passes knows of it, and GCC is also famous for its half-back transitions.
Post by Kenneth Zadeck
I have not looked at this bug. I am happy to if you want. I am sure
that it will be trivial to modify the pass that moved/created the insn
in the middle of the libcall to inherit the LIB_CALL_ID from the
previous insn.
As I said to Daniel, it's probably only the tip of the iceberg.
--
Eric Botcazou
Kenneth Zadeck
2007-10-15 16:11:42 UTC
Permalink
Post by Eric Botcazou
Post by Kenneth Zadeck
We added the LIB_CALL_ID because we foresaw that as we modernize the
rtl passes, that more and more of these passes will be using either
use-def or def-use chains. For these optimizations, you rarely make
linear passes over the instructions, you simply follow the chains from
one insn of interest to another.
Define "rarely". :-) The only pass that uses REG_LIBCALL_ID is dce.c and the
first thing it does is a linear pass over the instructions...
Post by Kenneth Zadeck
The LIB_CALL_ID provides a simple constant time test to find out if you have
landed inside of a lib call. Removing this structure will only make it
harder to add passes that only deal with the instructions of interest and
essentially forces us into the endless inefficient rescanning of blocks
paradyme that the rtl passes are famous for.
OK, but it's half-backed work since only the new dce.c pass among the RTL
passes knows of it, and GCC is also famous for its half-back transitions.
There is nothing half baked at all about our implementation.
All of the insns between the one with the REG_LIBCALL and REG_RETVAL are
supposed to be part of the lib call. According to Jakub, this insn that
has been inserted into the middle of the libcall has nothing to do with
the lib call and so it is a bug.

I have to say that i am completely unrepentant about enforcing the
semantics of rtl.
Post by Eric Botcazou
Post by Kenneth Zadeck
I have not looked at this bug. I am happy to if you want. I am sure
that it will be trivial to modify the pass that moved/created the insn
in the middle of the libcall to inherit the LIB_CALL_ID from the
previous insn.
As I said to Daniel, it's probably only the tip of the iceberg.
This is not true, the libcall_id addition was done very early in the
dataflow branch development cycle. We fixed a lot of these bugs before
the dataflow merge was committed. I think that we can honestly say that
we are close to the end of this process rather than just beginning.

This code in dce is going to fail whenever someone inserts code into the
middle of libcall and after the branch being committed several months
ago, and basically working on every single platform, this is the first
bug that has surfaced since the commit.

Kenny
Eric Botcazou
2007-10-14 16:31:57 UTC
Permalink
Post by Daniel Berlin
For the record, REG_LIBCALL_ID also has very clear semantics.
REG_LIBCALL_ID signifies the current instruction is part of a libcall
sequence. All insns that are part of the same libcall share the same
ID number.
I understand, but we now have 2 concurrent semantics for libcall sequences.
Post by Daniel Berlin
Why make DCE slower instead of fixing the bug?
Because I think it's only the tip of the iceberg. All RTL passes would need
to be fixed to make them preserve both the old and the new semantics.
--
Eric Botcazou
Daniel Berlin
2007-10-14 18:04:04 UTC
Permalink
Post by Eric Botcazou
Post by Daniel Berlin
For the record, REG_LIBCALL_ID also has very clear semantics.
REG_LIBCALL_ID signifies the current instruction is part of a libcall
sequence. All insns that are part of the same libcall share the same
ID number.
I understand, but we now have 2 concurrent semantics for libcall sequences.
Post by Daniel Berlin
Why make DCE slower instead of fixing the bug?
Because I think it's only the tip of the iceberg. All RTL passes would need
to be fixed to make them preserve both the old and the new semantics.
As more passes use libcall_id, you simply fix those that are messing
it up behind it.

This is exactly how we've made every other far reaching change in the
backend in the past year.
Start somewhere, and slowly expand the reach of it until it is done
throughout the entire backend.
Eric Botcazou
2007-10-15 13:57:00 UTC
Permalink
Post by Daniel Berlin
As more passes use libcall_id, you simply fix those that are messing
it up behind it.
This is exactly how we've made every other far reaching change in the
backend in the past year.
Start somewhere, and slowly expand the reach of it until it is done
throughout the entire backend.
The only thing I can say is that your conception of the maintenance of a
production compiler sounds a little worrying to me. This work should
have been conducted on the branch before the merge, or at least a clear
plan should have been discussed with the relevant maintainers before it.
--
Eric Botcazou
Daniel Berlin
2007-10-15 14:27:49 UTC
Permalink
Post by Eric Botcazou
Post by Daniel Berlin
As more passes use libcall_id, you simply fix those that are messing
it up behind it.
This is exactly how we've made every other far reaching change in the
backend in the past year.
Start somewhere, and slowly expand the reach of it until it is done
throughout the entire backend.
The only thing I can say is that your conception of the maintenance of a
production compiler sounds a little worrying to me.
"Maintenance of a production compiler"? GCC releases are production
releases. GCC itself has 3 stages of development, not all of which
result in a production quality compiler. This has been true for many
years now. Nobody would claim that the result of stage1 is "a
production quality compiler", and it certainly hasn't been since at
least 4.0. You know this of course, i'm not sure why you are ignoring
it. If you really think the goal of every stage should be to produce
a production compiler, or see your job only as "maintenance of a
production compiler", i would respectfully suggest you need to take a
long look at the history of GCC and how that viewpoint turned out to
affect GCC.
Post by Eric Botcazou
This work should
have been conducted on the branch before the merge, or at least a clear
plan should have been discussed with the relevant maintainers before it.
You weren't even an RTL maintainer when this work was discussed.
It *was* in fact, discussed with the relevant maintainers.
The work *was* done on a branch.
Dataflow was a big enough piece of work without adding 73 more
dependencies. Sometimes you have to say "enough is enough" and merge
what you have. If we had kept going, we might as well have just
forked the compiler.

--Dan
Eric Botcazou
2007-10-15 14:15:44 UTC
Permalink
Post by Daniel Berlin
Rewriting DCE to deal with libcalls like "every other pass" will add
several O(n) walks in bad places.
Well, DCE already does O(n) walks everytime it deals with an instruction that
belongs to a libcall:


/* Mark all insns using DELETE_PARM in the libcall that contains
START_INSN. */
static void
mark_libcall (rtx start_insn, bool delete_parm)
{
rtx note = find_reg_note (start_insn, REG_LIBCALL_ID, NULL_RTX);
int id = INTVAL (XEXP (note, 0));
rtx insn;

mark_insn (start_insn, delete_parm);
insn = NEXT_INSN (start_insn);

/* There are tales, long ago and far away, of the mystical nested
libcall. No one alive has actually seen one, but other parts of
the compiler support them so we will here. */
for (insn = NEXT_INSN (start_insn); insn; insn = NEXT_INSN (insn))
{
if (INSN_P (insn))
{
/* Stay in the loop as long as we are in any libcall. */
if ((note = find_reg_note (insn, REG_LIBCALL_ID, NULL_RTX)))
{
if (id == INTVAL (XEXP (note, 0)))
{
mark_insn (insn, delete_parm);
if (dump_file)
fprintf (dump_file, "matching forward libcall %d[%d]\n",
INSN_UID (insn), id);
}
}
else
break;
}
}

for (insn = PREV_INSN (start_insn); insn; insn = PREV_INSN (insn))
{
if (INSN_P (insn))
{
/* Stay in the loop as long as we are in any libcall. */
if ((note = find_reg_note (insn, REG_LIBCALL_ID, NULL_RTX)))
{
if (id == INTVAL (XEXP (note, 0)))
{
mark_insn (insn, delete_parm);
if (dump_file)
fprintf (dump_file, "matching backward libcall %d[%d]\n",
INSN_UID (insn), id);
}
}
else
break;
}
}
}


So AFAICS the only purpose of REG_LIBCALL_ID is to answer the question "is
this insn part of a libcall?", what a mere bitmap populated during the
initial pass over the instructions would do.
--
Eric Botcazou
Kenneth Zadeck
2007-10-14 16:27:38 UTC
Permalink
Post by Daniel Berlin
Post by Eric Botcazou
Post by Daniel Berlin
Post by Eric Botcazou
I think this demonstrates that REG_LIBCALL_ID is a bad idea: we now have
insns between REG_LIBCALL and REG_RETVAL notes that don't seem to be part
of the libcall sequence.
Which is simply a bug, not indicitative bad idea.
Properly fixing the bug would mean adding code all over the RTL middle-end to
propagate REG_LIBCALL_ID . The semantics of libcalls is clear: all insns
between REG_LIBCALL and REG_RETVAL are part of the libcall sequence.
Clear != useful :)
For the record, REG_LIBCALL_ID also has very clear semantics.
REG_LIBCALL_ID signifies the current instruction is part of a libcall
sequence. All insns that are part of the same libcall share the same
ID number.
Post by Eric Botcazou
Post by Daniel Berlin
It's only a bad idea if you think doing linear walks all the time to
figure out whether an INSN is part of a libcall is a good idea.
Can't we just do it only one time? I'll give it a try.
Why make DCE slower instead of fixing the bug?
Post by Eric Botcazou
Post by Daniel Berlin
IMHO, REG_LIBCALL and REG_RETVAL is the real bad idea.
Possibly, but I think that trying to undo it with REG_LIBCALL_ID doesn't help.
Why?
It is an incremental step towards fixing the underlying problem.
Eric,

While, I belive that it will be possible for you to make dce not need
the LIB_CALL_ID, I think that it misses the points that danny was
trying to make.

We added the LIB_CALL_ID because we foresaw that as we modernize the
rtl passes, that more and more of these passes will be using either
use-def or def-use chains. For these optimizations, you rarely make
linear passes over the instructions, you simply follow the chains from
one insn of interest to another. The LIB_CALL_ID provides a simple
constant time test to find out if you have landed inside of a lib
call. Removing this structure will only make it harder to add passes
that only deal with the instructions of interest and essentially
forces us into the endless inefficient rescanning of blocks paradyme
that the rtl passes are famous for.

I have not looked at this bug. I am happy to if you want. I am sure
that it will be trivial to modify the pass that moved/created the insn
in the middle of the libcall to inherit the LIB_CALL_ID from the
previous insn.

Kenny
Steven Bosscher
2007-10-15 18:06:26 UTC
Permalink
xf. http://gcc.gnu.org/ml/gcc-patches/2007-10/msg00886.html
(sorry for breaking the thread)
Post by Eric Botcazou
Post by Kenneth Zadeck
The LIB_CALL_ID provides a simple constant time test to find out if you have
landed inside of a lib call. Removing this structure will only make it
harder to add passes that only deal with the instructions of interest and
essentially forces us into the endless inefficient rescanning of blocks
paradyme that the rtl passes are famous for.
OK, but it's half-backed work since only the new dce.c pass among the RTL
passes knows of it, and GCC is also famous for its half-back transitions.
I can't resist reacting to this.

First, the only reason why there are so many "half-back transitions"
is because no maintainer can do a full transition by him/herself. The
main reasons for this, as I see it, are that there is no coherent plan
for the development of GCC, and there are too many would-be
maintainers who only feel responsible for their own little niche
interest and not for the overall cleanliness of GCC. I don't see you,
Eric, help with e.g. the mapped-locations support, the transition to
define_peephole2, or this ugly libcall notes mess that's given me too
much gray hair already over the past five years. Only the bravest
developer with very thick skins still dare to make a significant
change to the GCC code base (I particularly admire Diego for this ;-).

FWIW an old and by now incomplete list of partial transitions is in
the wiki: http://gcc.gnu.org/wiki/Partial_Transitions.


But what I'm missing here (perhaps because I only half care about GCC
anymore myself, so I don't follow what is going on, but hey! ;-) is
why there is any reason to still have REG_LIBCALL and REG_RETVAL notes
at all.

The only reasons left to have these libcall notes, as I recall, were:

1) the old RTL loop optimizer could use them to move loop invariant libcalls.
Since the old RTL loop optimizer is no more, this is no longer a
reason to keep the old-style libcall notes.

2) the old DCE in flow.c could remove entire libcall sequences if the
libcall result was dead.
The new DCE uses REG_LIBCALL_ID notes, so again the reason to have
REG_LIBCALL and REG_RETVAL is gone here.

3) The pre-regalloc scheduling pass used to move libcall blocks as a whole.
I don't know if this is still the case, and I also don't really see
why this was ever necessary to begin with. But if this is still
necessary for some targets, then you should be able to schedule all
insns with the same REG_LIBCALL_ID notes as a block.


So, if anyone is considering to actually complete this transition, I'd
suggest removing REG_LIBCALL and REG_RETVAL and teaching the scheduler
to move insns with the same REG_LIBCALL_ID as a block. This should
even be quite safe for GCC 4.3.

My $0.02.

Gr.
Steven
Kenneth Zadeck
2007-10-15 18:37:05 UTC
Permalink
Post by Steven Bosscher
xf. http://gcc.gnu.org/ml/gcc-patches/2007-10/msg00886.html
(sorry for breaking the thread)
Post by Eric Botcazou
Post by Kenneth Zadeck
The LIB_CALL_ID provides a simple constant time test to find out if you have
landed inside of a lib call. Removing this structure will only make it
harder to add passes that only deal with the instructions of interest and
essentially forces us into the endless inefficient rescanning of blocks
paradyme that the rtl passes are famous for.
OK, but it's half-backed work since only the new dce.c pass among the RTL
passes knows of it, and GCC is also famous for its half-back transitions.
I can't resist reacting to this.
First, the only reason why there are so many "half-back transitions"
is because no maintainer can do a full transition by him/herself. The
main reasons for this, as I see it, are that there is no coherent plan
for the development of GCC, and there are too many would-be
maintainers who only feel responsible for their own little niche
interest and not for the overall cleanliness of GCC. I don't see you,
Eric, help with e.g. the mapped-locations support, the transition to
define_peephole2, or this ugly libcall notes mess that's given me too
much gray hair already over the past five years. Only the bravest
developer with very thick skins still dare to make a significant
change to the GCC code base (I particularly admire Diego for this ;-).
FWIW an old and by now incomplete list of partial transitions is in
the wiki: http://gcc.gnu.org/wiki/Partial_Transitions.
But what I'm missing here (perhaps because I only half care about GCC
anymore myself, so I don't follow what is going on, but hey! ;-) is
why there is any reason to still have REG_LIBCALL and REG_RETVAL notes
at all.
1) the old RTL loop optimizer could use them to move loop invariant libcalls.
Since the old RTL loop optimizer is no more, this is no longer a
reason to keep the old-style libcall notes.
2) the old DCE in flow.c could remove entire libcall sequences if the
libcall result was dead.
The new DCE uses REG_LIBCALL_ID notes, so again the reason to have
REG_LIBCALL and REG_RETVAL is gone here.
3) The pre-regalloc scheduling pass used to move libcall blocks as a whole.
I don't know if this is still the case, and I also don't really see
why this was ever necessary to begin with. But if this is still
necessary for some targets, then you should be able to schedule all
insns with the same REG_LIBCALL_ID notes as a block.
So, if anyone is considering to actually complete this transition, I'd
suggest removing REG_LIBCALL and REG_RETVAL and teaching the scheduler
to move insns with the same REG_LIBCALL_ID as a block. This should
even be quite safe for GCC 4.3.
My $0.02.
Gr.
Steven
I think that when danny and i started this, it was not considered an
option to remove libcall notes. I never understood the reason for them,
but in retrospect it most likely would have been easier to get rid of
them rather than take the route we did an actually make them work the
way they were supposed to.
Eric Botcazou
2007-10-17 19:17:35 UTC
Permalink
Post by Kenneth Zadeck
I think that when danny and i started this, it was not considered an
option to remove libcall notes. I never understood the reason for them,
Libcalls notes make it possible for the optimizers to know that the libcall
sequence has no side-effects, i.e. that is only used for its return value.
See for example delete_trivially_dead_insns.
--
Eric Botcazou
Daniel Berlin
2007-10-18 01:36:00 UTC
Permalink
Post by Eric Botcazou
Post by Kenneth Zadeck
I think that when danny and i started this, it was not considered an
option to remove libcall notes. I never understood the reason for them,
Libcalls notes make it possible for the optimizers to know that the libcall
sequence has no side-effects, i.e. that is only used for its return value.
See for example delete_trivially_dead_insns.
Which of course is information any sane backend would transmit without
adding special requirements all over the place for where you can place
insns, which insns you can move, and requiring walking *all the
instructions in the entire function to see if this is part of one of
the "special ones"*
Post by Eric Botcazou
--
Eric Botcazou
Eric Botcazou
2007-10-17 18:26:40 UTC
Permalink
Post by Steven Bosscher
First, the only reason why there are so many "half-back transitions"
Sorry for the typo, I meant "half-backed transitions".
Post by Steven Bosscher
is because no maintainer can do a full transition by him/herself. The
main reasons for this, as I see it, are that there is no coherent plan
for the development of GCC, and there are too many would-be
maintainers who only feel responsible for their own little niche
interest and not for the overall cleanliness of GCC.
I fully agree with you.
Post by Steven Bosscher
I don't see you, Eric, help with e.g. the mapped-locations support,
Look more carefully: http://gcc.gnu.org/ml/gcc-patches/2007-09/msg01976.html
Post by Steven Bosscher
But what I'm missing here (perhaps because I only half care about GCC
anymore myself, so I don't follow what is going on, but hey! ;-) is
why there is any reason to still have REG_LIBCALL and REG_RETVAL notes
at all.
1) the old RTL loop optimizer could use them to move loop invariant
libcalls. Since the old RTL loop optimizer is no more, this is no longer a
reason to keep the old-style libcall notes.
Right.
Post by Steven Bosscher
2) the old DCE in flow.c could remove entire libcall sequences if the
libcall result was dead.
The new DCE uses REG_LIBCALL_ID notes, so again the reason to have
REG_LIBCALL and REG_RETVAL is gone here.
Wrong. The new DCE handles libcalls backwards, i.e. as optimization barriers
(see Jakub's messages) instead of as optimization helpers.
Post by Steven Bosscher
3) The pre-regalloc scheduling pass used to move libcall blocks as a whole.
I don't know if this is still the case, and I also don't really see
why this was ever necessary to begin with. But if this is still
necessary for some targets, then you should be able to schedule all
insns with the same REG_LIBCALL_ID notes as a block.
These REG_LIBCALL_ID notes would need to be actively maintained throughout the
entire RTL middle-end and I don't see any practical advantages over what we
currently have. And of course someone needs to do the implementation.
Post by Steven Bosscher
So, if anyone is considering to actually complete this transition, I'd
suggest removing REG_LIBCALL and REG_RETVAL and teaching the scheduler
to move insns with the same REG_LIBCALL_ID as a block.
I'd suggest to take a look at the handling of libcalls in dce.c...
--
Eric Botcazou
Steven Bosscher
2007-10-17 21:55:01 UTC
Permalink
Post by Eric Botcazou
Post by Steven Bosscher
I don't see you, Eric, help with e.g. the mapped-locations support,
Look more carefully: http://gcc.gnu.org/ml/gcc-patches/2007-09/msg01976.html
Sorry I missed that. I carefully avoid looking over gcc-patches. I
just have some filters to keep messages with certain buzz-word to sort
out the messages I may find interesting. mapped-locations and Eric are
not buzz-word ;-)

In any case, thanks for having taken care of that..
Post by Eric Botcazou
Post by Steven Bosscher
2) the old DCE in flow.c could remove entire libcall sequences if the
libcall result was dead.
The new DCE uses REG_LIBCALL_ID notes, so again the reason to have
REG_LIBCALL and REG_RETVAL is gone here.
Wrong. The new DCE handles libcalls backwards, i.e. as optimization barriers
(see Jakub's messages) instead of as optimization helpers.
Hm, DCE doesn't do what I had expected it to do with REG_LIBCALL_ID
notes. More on that below...
Post by Eric Botcazou
Post by Steven Bosscher
3) The pre-regalloc scheduling pass used to move libcall blocks as a whole.
I don't know if this is still the case, and I also don't really see
why this was ever necessary to begin with. But if this is still
necessary for some targets, then you should be able to schedule all
insns with the same REG_LIBCALL_ID notes as a block.
These REG_LIBCALL_ID notes would need to be actively maintained throughout the
entire RTL middle-end
Yes, but that is equally true for REG_LIBCALL and REG_RETVAL, which
currently give GCC a lot of trouble, too.

Since May when I effectively stopped working on gcc, I already
proposed fixes for two libcall notes related bugs (both not in code
coming from the df-branch). There are two additional problems with
REG_LIBCALL and REG_RETVAL notes. The first is that they must appear
as a pair, which is not intuitive and complicates the notes
maintenance work (one of those two bugs was a case in lower-subreg
where a REG_RETVAL note was removed but not the the corresponding
REG_LIBCALL note. Or the other way around, whatever... :-). The second
is that REG_LIBCALL and REG_RETVAL only implicitly indicate which
insns belong to a libcall sequence, namely all insns appearing between
the libcall and the retval note, inclusive. ISTR a lot of RTL passes
just punt on libcall sequences to avoid having to figure out whether
an insn is part of a libcall or not.

I thought REG_LIBCALL_ID would have solved these problems, but that
doesn't appear to be the case right now.
Post by Eric Botcazou
and I don't see any practical advantages over what we
currently have.
After looking at what dce.c does, I have to agree with you. I still
see REG_LIBCALL_ID as a _potentially_ better solution, but what dce.c
does now is just as horrible as what other passes have to do with the
"old" libcall notes.

Thoughts dump follows...

Assume REG_LIBCALL and REG_RETVAL wouldn't exist anymore, and GCC
would only have REG_LIBCALL_ID notes. How would/should things work
then? (Honest open question, so please say what you think about this,
too!)
Post by Eric Botcazou
From my POV, one advantage of REG_LIBCALL_ID is that all insns that
are part of a libcall sequence are explicitly marked as such. This
should in theory make it easier for the optimizers to work with the
insns stream, because you can immediately tell whether an insn belongs
to a libcall sequence. This ought to make it possible to insert
non-libcall insns into the libcall sequence without making it part of
the libcall. That, in turn, would allow the compiler to move around
insns that are part of a libcall. Think e.g. code hoisting or PRE,
when an expression can be hoisted, but one insn is in a libcall and
the other is not -- gcc currently can't hoist in a case like this.
I'm sure similar restrictions due to libcall notes apply most other
RTL passes. With REG_LIBCALL_ID, most of these restrictions would be
lifted. Another common problem with the old libcall notes is that you
frequently have to move the notes to another insn in the libcall
sequence. With REG_LIBCALL_ID you wouldn't have this problem because
all libcall insns carry a libcall note already.


But one thing I don't understand about REG_LIBCALL_ID is how you can
find the result of the libcall sequence. With the old libcall notes,
when the set of the insn with the REG_RETVAL note is dead, you can
remove the entire libcall sequence (ok, cutting corners a bit here,
but that's more or less what gcc does). With REG_LIBCALL_ID it is not
quite as obvious when the whole libcall sequence is dead.

So problem #1 is: There is no way AFAICT to tell whether a libcall
sequence is dead if you only have REG_LIBCALL_ID to work with. One
way around this would be to only remove REG_LIBCALL and to change the
semantics of REG_RETVAL such that it is associated with a libcall ID,
i.e. all insn in a libcall sequence get a REG_LIBCALL_ID note, except
the insn that sets the return value, which gets a REG_RETVAL note
(also holding the libcall ID) instead.


Another thing I don't understand, is how you can find other insns in
the libcall sequence, assuming you would allow inserting non-libcall
insns into the original libcall sequence (and not give them a libcall
note, i.e. as in the case of the patch at the start of the thread I
broke ;-). In dce.c, as I understand it, when an insn with a
REG_LIBCALL_ID is marked as necessary, a linear walk over the insn
chain is done in forward and backward direction over all insn with the
same REG_LIBCALL_ID note value. This would fail (and apparently does
fail) when a non-libcall insn is inserted into the sequence. In other
words, passes that use REG_LIBCALL_ID now _already_ assume the libcall
sequence remains contiguous. This, IMHO, negates the most important
theoretical advantage of REG_LIBCALL_ID over the old libcall notes.

Which is them problem #2: Given a libcall ID, how can you find all
insns belonging to this libcall if the libcall sequence is _not_
contiguous anymore. One way would be to make the REG_LIBCALL_ID notes
form a double linked list to all other libcall insns. I think the
overhead of this would be acceptable. Obviously the trouble would be
to teach the compiler to update the chain. This is easy when everyone
uses the proper API (i.e. delete_insn and friends) but I'm not sure
this is done consistently everywhere. Other solutions are possible
(tables, VECs, etc.) but they all have to be maintained.

Of course another solution is to keep the libcall insns sequence
contiguous, but in that case REG_LIBCALL_ID is just overhead to keep
one pass happy. It would not solve any of the issues with the old
libcall notes.


Yet another solution is to implement TLS_ADDR_EXPR and remove libcall
notes altogether, but that is (and has been for years now)
pie-in-the-sky, it seems :-(

None of this helps solving the current problem for GCC 4.3, sadly...

Gr.
Steven
Eric Botcazou
2007-10-18 12:45:57 UTC
Permalink
Post by Steven Bosscher
Post by Eric Botcazou
These REG_LIBCALL_ID notes would need to be actively maintained
throughout the entire RTL middle-end
Yes, but that is equally true for REG_LIBCALL and REG_RETVAL, which
currently give GCC a lot of trouble, too.
No, not equally true, REG_LIBCALL and REG_RETVAL are already actively
maintained throughout the entire RTL middle-end. Granted, with more
or less trouble, but they are.
Post by Steven Bosscher
From my POV, one advantage of REG_LIBCALL_ID is that all insns that
are part of a libcall sequence are explicitly marked as such. This
should in theory make it easier for the optimizers to work with the
insns stream, because you can immediately tell whether an insn belongs
to a libcall sequence. This ought to make it possible to insert
non-libcall insns into the libcall sequence without making it part of
the libcall. That, in turn, would allow the compiler to move around
insns that are part of a libcall. Think e.g. code hoisting or PRE,
when an expression can be hoisted, but one insn is in a libcall and
the other is not -- gcc currently can't hoist in a case like this.
I'm sure similar restrictions due to libcall notes apply most other
RTL passes. With REG_LIBCALL_ID, most of these restrictions would be
lifted. Another common problem with the old libcall notes is that you
frequently have to move the notes to another insn in the libcall
sequence. With REG_LIBCALL_ID you wouldn't have this problem because
all libcall insns carry a libcall note already.
I agree that it could be convenient to allow REG_LIBCALL_IDed insns
interspersed with non-REG_LIBCALL_IDed insns. But I think that this
would need some serious thinking because you need to precisely define
when the REG_LIBCALL_ID flag is inherited or not (see Jakub's example).
Post by Steven Bosscher
Of course another solution is to keep the libcall insns sequence
contiguous, but in that case REG_LIBCALL_ID is just overhead to keep
one pass happy. It would not solve any of the issues with the old
libcall notes.
Right, I also think that requiring REG_LIBCALL_IDed insns to be contiguous
voids any advantages over REG_LIBCALL and REG_RETVAL in practice.
Post by Steven Bosscher
Yet another solution is to implement TLS_ADDR_EXPR and remove libcall
notes altogether, but that is (and has been for years now)
pie-in-the-sky, it seems :-(
Definitely, I'm personally not too thrilled about inventing yet another brand
new mechanism to deal with libcalls. If you have some time to divert to GCC
in the near future again, I'd be happy to explore some definitive solutions
on a branch.
--
Eric Botcazou
Daniel Berlin
2007-10-18 01:40:59 UTC
Permalink
Post by Eric Botcazou
These REG_LIBCALL_ID notes would need to be actively maintained throughout the
entire RTL middle-end and I don't see any practical advantages over what we
currently have. And of course someone needs to do the implementation.
If you don't see the practical advantage of being able to determine
whether an insn is part of a libcall without having to walk the entire
function to figure out whether you are allowed to touch an insn,
please just remove it.
God forbid we ever start somewhere and work towards a goal.
Really.
I'm completely done working on our backend or trying to improve it in any way.
I'm removing my name from the dataflow maintainer list.

--Dan
Mark Mitchell
2007-10-18 06:23:41 UTC
Permalink
Post by Daniel Berlin
Post by Eric Botcazou
These REG_LIBCALL_ID notes would need to be actively maintained throughout the
entire RTL middle-end and I don't see any practical advantages over what we
currently have. And of course someone needs to do the implementation.
If you don't see the practical advantage of being able to determine
whether an insn is part of a libcall without having to walk the entire
function to figure out whether you are allowed to touch an insn,
please just remove it.
God forbid we ever start somewhere and work towards a goal.
Really.
I'm completely done working on our backend or trying to improve it in any way.
I'm removing my name from the dataflow maintainer list.
Let's all take a deep breath here.

In.

Out.

In.

Out.

Wow, I feel so centered...

Losing our tempers won't help. There's an honest disagreement here
between two long-standing GCC contributors. No need for dramatics.

(I'd be willing to host/moderate a regular conference call for any/all
interested GCC developers to use to discuss Issues of the Day. I know
it's old-tech, but if people are interested, let me know. Phone still
seems to work better to resolve some issues than IRC/email.)

Now, to the design issue at hand. REG_LIBCALL_ID seems a better way to
represent libcalls than the longstanding begin/end structure for the
reason Danny stated: you can tell what instruction is in what libcall
without walking around in the instruction stream. So, we don't want to
be removing it, we want to be moving towards it.

But, because we're presently dependent on the old structures, it sounds
to me like we need to make sure that no non-libcall instructions are
inserted between REG_LIBCALL and REG_RETVAL. That's the bug. We should
find the code doing that, and make it stop -- until we complete this
transition. That's what Jakub is trying to do.

As for half-baked transitions (or, to use Zack's original phraseology,
"incomplete transitions"), yup, we have them. My experience is that
most other folks do to. I've certainly seen a fair number of
proprietary compilers with weird, half-used legacy bits in them. I'm
not sure there's a way around them; if you demand there be no incomplete
transitions at all, you raise the bar for changing things pretty far.
The dataflow bar was already pretty high, and Kenny, Danny, Seongbae,
et. al. all worked pretty hard for a long time to get that stuff in good
shape. As Danny says, at some point, you have to declare victory.

(We are making progress, though, by organizing things into smaller, more
self-contained units. For example, the various optimization passes are
much more replaceable than ever before. We're doing a better job
separating the front ends from the middle end. This is good.)

Danny, do you think Jakub's patch is correct?

Thanks,
--
Mark Mitchell
CodeSourcery
***@codesourcery.com
(650) 331-3385 x713
Richard Kenner
2007-10-18 11:20:39 UTC
Permalink
Post by Mark Mitchell
Now, to the design issue at hand. REG_LIBCALL_ID seems a better way to
represent libcalls than the longstanding begin/end structure for the
reason Danny stated: you can tell what instruction is in what libcall
without walking around in the instruction stream. So, we don't want to
be removing it, we want to be moving towards it.
For what it's worth, I agree. It has always been a nuisance that we
had such a poor structure to represent this and has always been the source
of confusion in the past.
Post by Mark Mitchell
But, because we're presently dependent on the old structures, it sounds
to me like we need to make sure that no non-libcall instructions are
inserted between REG_LIBCALL and REG_RETVAL. That's the bug.
I agree with this as well.
Kenneth Zadeck
2007-10-18 11:34:48 UTC
Permalink
Post by Richard Kenner
Post by Mark Mitchell
Now, to the design issue at hand. REG_LIBCALL_ID seems a better way to
represent libcalls than the longstanding begin/end structure for the
reason Danny stated: you can tell what instruction is in what libcall
without walking around in the instruction stream. So, we don't want to
be removing it, we want to be moving towards it.
For what it's worth, I agree. It has always been a nuisance that we
had such a poor structure to represent this and has always been the source
of confusion in the past.
Post by Mark Mitchell
But, because we're presently dependent on the old structures, it sounds
to me like we need to make sure that no non-libcall instructions are
inserted between REG_LIBCALL and REG_RETVAL. That's the bug.
I agree with this as well.
I think that this is the fundamental point. Danny and I have taken the
view that it is not correct to insert instructions into a libcall. And
during the dataflow branch developement, we corrected as many of these
as we encountered.

For a long time, the gcc community has been happy "knowing" that it has
been wrong to insert code into a libcall and has been willing to simply
fix wrong code problems as they have been reported. What we did was
make many of the cases where an insn is inserted into a libcall into a
compiler icing situation. What has been extremely frustrating has been
Eric's personal attacks, by first calling this a "bug" and then calling
it "half backed."

I will post a patch today that, rather than just waiting for dce to try
to delete a libcall with a problem, aggressively checks each pass to
make sure that no one has inserted such an insn. The community can then
test this to see how much of a problem this really is. This is less of
"half baked" solution but, fixing libcalls was not actually the focus of
the dataflow branch. I happen to be of the mind that libcalls are just
bad and that our time would have been better spent removing them rather
than trying to make something this busted work. But vision of such
things are a lot better in hindsight.

Kenny
Richard Kenner
2007-10-18 11:46:21 UTC
Permalink
Post by Kenneth Zadeck
I think that this is the fundamental point. Danny and I have taken the
view that it is not correct to insert instructions into a libcall. And
during the dataflow branch developement, we corrected as many of these
as we encountered.
I want to avoid confusion here and be clear that we're talking about
UNRELATED insns inside a libcall. Clearly if, for example, you have to
generate a reload for an insn inside a libcall, the appropriate place
for it is inside the libcall.

I'd phrase it this way: a libcall is a sequence of insns that implement
a particular "thing". It's perfectly valid to rearrange that sequence or
to expand or contract the number of insns in that sequence as long as they
are all part of the implementation of the "thing". What is NOT valid is
to move some of that sequence outside of the libcall range or to move
insns that are not part of that implementation inside the libcall range.

Unfortunately, this DOES produce the problem that Eric was talking about,
which is the need for any optimization that changes insns inside a libcall
to track the fact that they are part of the libcall. But I think that's
unavoidable.
Kenneth Zadeck
2007-10-18 12:03:26 UTC
Permalink
Post by Richard Kenner
Post by Kenneth Zadeck
I think that this is the fundamental point. Danny and I have taken the
view that it is not correct to insert instructions into a libcall. And
during the dataflow branch developement, we corrected as many of these
as we encountered.
I want to avoid confusion here and be clear that we're talking about
UNRELATED insns inside a libcall. Clearly if, for example, you have to
generate a reload for an insn inside a libcall, the appropriate place
for it is inside the libcall.
I assume that generally the optimizations that do this know that they
are doing.
The common code moving optimizations, i assume, want to treat libcalls
as black boxes.
Post by Richard Kenner
I'd phrase it this way: a libcall is a sequence of insns that implement
a particular "thing". It's perfectly valid to rearrange that sequence or
to expand or contract the number of insns in that sequence as long as they
are all part of the implementation of the "thing". What is NOT valid is
to move some of that sequence outside of the libcall range or to move
insns that are not part of that implementation inside the libcall range.
Unfortunately, this DOES produce the problem that Eric was talking about,
which is the need for any optimization that changes insns inside a libcall
to track the fact that they are part of the libcall. But I think that's
unavoidable.
I actually did not know that there were such optimizations. I thought
that most of the back end treated libcalls as black boxes.
Jakub Jelinek
2007-10-18 12:17:34 UTC
Permalink
Post by Kenneth Zadeck
I actually did not know that there were such optimizations. I thought
that most of the back end treated libcalls as black boxes.
AFAIK that's definitely not the case, most passes treat insns in libcall
sequences as any other insns. The libcall notes I believe are meant as
optimization rather than optimization inhibitor, so I'd say what dce.c
does with it (treats them as optimization inhibitor) is wrong.
As optimization these notes say - if you delete the REG_RETVAL marked
insn, you can delete the whole libcall sequence, as it is now useless,
no matter what other side effects (e.g. setting hard registers) it has.
dce.c treats them as if an insn is part of libcall sequence, it must not be
deleted.

Jakub
Steven Bosscher
2007-10-18 12:39:44 UTC
Permalink
Post by Jakub Jelinek
Post by Kenneth Zadeck
I actually did not know that there were such optimizations. I thought
that most of the back end treated libcalls as black boxes.
AFAIK that's definitely not the case, most passes treat insns in libcall
sequences as any other insns.
That's not true for at least all passes in gcse.c and for the RTL loop
optimizers. They all have to do special things for libcalls. In the
case of PRE and hoisting, the "special thing" they do is to punt. I
don't know about other passes.
Post by Jakub Jelinek
dce.c treats them as if an insn is part of libcall sequence, it must not be
deleted.
If an insn is part of a libcall *and* the insn is necessary, I
suppose? Blindly marking all libcall insns as necessary would be
quite dumb.

(But then again, how often can GCC really delete a libcall? My bet:
almost never.)

Gr.
Steven
Jakub Jelinek
2007-10-18 13:00:57 UTC
Permalink
Post by Steven Bosscher
Post by Jakub Jelinek
dce.c treats them as if an insn is part of libcall sequence, it must not be
deleted.
If an insn is part of a libcall *and* the insn is necessary, I
suppose? Blindly marking all libcall insns as necessary would be
quite dumb.
See prescan_insns_for_dce - whenever it sees an insn with REG_LIBCALL_ID
note, it marks all insns with REG_LIBCALL_ID notes before and after it
(why it does so when prescan_insns_for_dce walks all insns anyway
is beyond my understanding - simple mark_insn (insn, fast) would do
exactly the same and in mark_reg_dependencies there is no point in marking
what has been already marked), so from what I can see dce.c never deletes
any libcall insns except noop moves (those are deleted even when they are
marked as needed, see delete_unmarked_insns), no matter whether they are
or are not needed.

Jakub
Richard Kenner
2007-10-18 20:21:10 UTC
Permalink
Post by Jakub Jelinek
AFAIK that's definitely not the case, most passes treat insns in libcall
sequences as any other insns. The libcall notes I believe are meant as
optimization rather than optimization inhibitor, so I'd say what dce.c
does with it (treats them as optimization inhibitor) is wrong.
I agree.
Jakub Jelinek
2007-10-18 12:05:33 UTC
Permalink
Post by Richard Kenner
Post by Kenneth Zadeck
I think that this is the fundamental point. Danny and I have taken the
view that it is not correct to insert instructions into a libcall. And
during the dataflow branch developement, we corrected as many of these
as we encountered.
I want to avoid confusion here and be clear that we're talking about
UNRELATED insns inside a libcall. Clearly if, for example, you have to
generate a reload for an insn inside a libcall, the appropriate place
for it is inside the libcall.
That's even more confusing.
PR33644 is about virtual reg instantation pass adding RELATED insn inside
the libcall sequence (though without LIBCALL_ID note, from what I see
no pass after expand adds them, just combine redistributes them). That
RELATED insn is later on changed into UNRELATED by fwprop1.
So, is in this case acceptable to add the UNRELATED insn into the sequence
(and should each pass in that case add LIBCALL_ID notes)? The problem
with adding LIBCALL_ID note is that it won't then be deleted as useless
if it becomes useless.

If we decide that e.g. all insns in between REG_LIBCALL/REG_RETVAL noted
insns must have REG_LIBCALL_ID notes, then IMNSHO we should enforce it
with --enable-checking, that's the only reasonable way to make sure problems
are caught up quickly.

Jakub
Richard Kenner
2007-10-18 20:20:31 UTC
Permalink
Post by Jakub Jelinek
That's even more confusing.
PR33644 is about virtual reg instantation pass adding RELATED insn inside
the libcall sequence (though without LIBCALL_ID note, from what I see
no pass after expand adds them, just combine redistributes them). That
RELATED insn is later on changed into UNRELATED by fwprop1.
I don't see anything in that PR saying exactly how an optimizer is changing
a related insn to being unrelated (that seems wierd to me), so I can't
comment about the details here. However, I saw this email before I got on
a plane and thought about it during the flight.

A "related" insns really doesn't need to be in a libcall since, by
definition, it feeds into one of the insns there. So if the libcall is
moved or deleted, that insns will be processes similarly by whatever code
is moving or deleting the libcall.

In fact, I think we can view libcalls as anachronisms which we can replace
by a much simpler mechanism. In the original compiler, we didn't have any
sort of good flow infrastructure, so they were the only way to indicate
that if we delete or move some insn, we have to do the same to others.
But now there are much better ways: all we need to do is to have a REG_NOTE
on the last insn in a libcall (the one with the REG_EQUAL) that lists (like
LOG_LINKS) the insns that need to be deleted or moved with it and *perhaps*
those insns need to point to that ending insn. The rest can use existing
dataflow information.
Eric Botcazou
2007-10-18 13:39:43 UTC
Permalink
Post by Kenneth Zadeck
For a long time, the gcc community has been happy "knowing" that it has
been wrong to insert code into a libcall and has been willing to simply
fix wrong code problems as they have been reported. What we did was
make many of the cases where an insn is inserted into a libcall into a
compiler icing situation. What has been extremely frustrating has been
Eric's personal attacks, by first calling this a "bug" and then calling
it "half backed."
Of course, double typo, I meant "half-baked".

I'm sorry that you took my messages as personal attacks, they aren't. What
I do criticize is the handling of libcalls in dce.c and the introduction of
the REG_LIBCALL_ID note.
--
Eric Botcazou
Mark Mitchell
2007-10-18 16:18:33 UTC
Permalink
Post by Kenneth Zadeck
Post by Mark Mitchell
But, because we're presently dependent on the old structures, it sounds
to me like we need to make sure that no non-libcall instructions are
inserted between REG_LIBCALL and REG_RETVAL. That's the bug.
I think that this is the fundamental point. Danny and I have taken the
view that it is not correct to insert instructions into a libcall.
I will post a patch today that, rather than just waiting for dce to try
to delete a libcall with a problem, aggressively checks each pass to
make sure that no one has inserted such an insn.
That seems like a good thing; verification of invariants is nice.
Post by Kenneth Zadeck
I happen to be of the mind that libcalls are just
bad and that our time would have been better spent removing them rather
than trying to make something this busted work.
Let's be careful about terminology here. Libcalls aren't "busted".
Certainly, one can build a working compiler with the old libcall
representation, the new one, or no such representation at all. I think
one of the reasons we tend to get each other worked up is that we
declare things "hopelessly broken", "totally busted", etc. too often.
Some of us have worked hard to build these things now being declared
"busted". :-)

Anyhow, it would certainly be a priori better if libcall sequences in
RTL could just go away and the "magic" properties of libcall sequences
be otherwise expressed in some generic form. As Jakub says, part of the
idea is to be able to delete libcalls wholesale when you realize you
don't need their values; we might now be able to set things up so that
our ordinary code-removal optimizations just automatically do that.
We'd certainly need something to say that a function call instruction is
deletable if its value is unused; in general, of course, we assume
function calls have important side-effects.

Anyhow, I'm not seeing any reason for drastic action here. It seems to
me like REG_LIBCALL_ID gives us a path forward, so we want to keep it.
Meanwhile, we need to fix this bug. I haven't seen anything so far that
suggests that fixing the bug requires a major change to the
representation; it just sounds like a bug.

Thanks,
--
Mark Mitchell
CodeSourcery
***@codesourcery.com
(650) 331-3385 x713
Jakub Jelinek
2007-10-18 19:26:38 UTC
Permalink
Post by Mark Mitchell
Anyhow, I'm not seeing any reason for drastic action here. It seems to
me like REG_LIBCALL_ID gives us a path forward, so we want to keep it.
Meanwhile, we need to fix this bug. I haven't seen anything so far that
suggests that fixing the bug requires a major change to the
representation; it just sounds like a bug.
I think for 4.3 the best thing would be apply Eric's patch, as so far really
nothing uses REG_LIBCALL_ID notes (dce.c really doesn't need them ATM and
does a wrong thing with libcalls, where they inhibit optimization rather
than allowing eliminating whole unneeded libcall sequences) and almost no
pass maintains these notes, then once branched REG_LIBCALL_ID can be
reapplied together with addition of --enable-checking verification code
to make sure it is maintained through all the RTL passes and start using it
in various passes.

Jakub
Richard Kenner
2007-10-18 20:33:39 UTC
Permalink
Post by Mark Mitchell
We'd certainly need something to say that a function call instruction is
deletable if its value is unused; in general, of course, we assume
function calls have important side-effects.
I thought we have a flag for that now, but I'm not sure I remember which one.

I agree that it may well be the case that libcalls are ALREADY unnecessary,
but I'm not quite ready to assert that myself.
Steven Bosscher
2007-10-18 22:27:39 UTC
Permalink
Post by Richard Kenner
Post by Mark Mitchell
We'd certainly need something to say that a function call instruction is
deletable if its value is unused; in general, of course, we assume
function calls have important side-effects.
I thought we have a flag for that now, but I'm not sure I remember which one.
I agree that it may well be the case that libcalls are ALREADY unnecessary,
but I'm not quite ready to assert that myself.
The last time this came up, it was pretty much agreed that the main
reason to still have libcalls was to allow dynamic TLS addresses
(which involve calls) to be CSE-ed. I don't know if this actually
works as expected right now, I've never checked that. I don't think
anyone knows if CSE of other libcalls is useful/necessary/happening
for some target.

Otherwise, the tree optimizers should remove almost all dead code, so
the chances of a dead libcall are virtually 0.

Gr.
Steven
Richard Kenner
2007-10-19 01:51:21 UTC
Permalink
Post by Steven Bosscher
The last time this came up, it was pretty much agreed that the main
reason to still have libcalls was to allow dynamic TLS addresses
(which involve calls) to be CSE-ed. I don't know if this actually
works as expected right now, I've never checked that.
I'd expect this to be very similar, if not identical, to a call to
something like _muldi3. Why isn't it?
Steven Bosscher
2007-10-19 05:29:06 UTC
Permalink
Post by Richard Kenner
Post by Steven Bosscher
The last time this came up, it was pretty much agreed that the main
reason to still have libcalls was to allow dynamic TLS addresses
(which involve calls) to be CSE-ed. I don't know if this actually
works as expected right now, I've never checked that.
I'd expect this to be very similar, if not identical, to a call to
something like _muldi3. Why isn't it?
Things like a multiplication can be CSE-ed in tree-ssa, so there
should not be any libcalls of this kind to CSE in the generated RTL.

But TLS addressing is not exposed in GIMPLE so you can't CSE the TLS
addresses either.

static __thread int a, b;
int foo (void)
{
return a + b;
}

Compile with -fpic. There should only be one call to __tls_get_addr
to get a base address. Without libcalls there will be two calls. The
base address is not exposed in tree-ssa.

Search for TLS_ADDR_EXPR with Google ;-)

Gr.
Steven
Richard Kenner
2007-10-19 12:41:48 UTC
Permalink
Post by Steven Bosscher
Things like a multiplication can be CSE-ed in tree-ssa, so there
should not be any libcalls of this kind to CSE in the generated RTL.
But TLS addressing is not exposed in GIMPLE so you can't CSE the TLS
addresses either.
That's true, but we aren't depending *just* on cse'ing stuff at the tree
level, are we? We do cse multiplication libcalls at the RTL level and that
same approach should work for TLS libcalls.
Post by Steven Bosscher
Compile with -fpic. There should only be one call to __tls_get_addr
to get a base address. Without libcalls there will be two calls. The
base address is not exposed in tree-ssa.
But isn't the function marked const? (Is that the right flag?)
Steven Bosscher
2007-10-19 13:04:49 UTC
Permalink
Post by Richard Kenner
Post by Steven Bosscher
Things like a multiplication can be CSE-ed in tree-ssa, so there
should not be any libcalls of this kind to CSE in the generated RTL.
But TLS addressing is not exposed in GIMPLE so you can't CSE the TLS
addresses either.
That's true, but we aren't depending *just* on cse'ing stuff at the tree
level, are we? We do cse multiplication libcalls at the RTL level and that
same approach should work for TLS libcalls.
Yes.

I thought you were referring to my remark that we supposedly only need
libcalls for TLS address calls, but IIUC you in fact reacted to my
remark whether CSE of TLS libcalls actually works, right?

Anyway, yes, if you have libcall notes, then CSE of TLS libcalls should work.
Post by Richard Kenner
Post by Steven Bosscher
Compile with -fpic. There should only be one call to __tls_get_addr
to get a base address. Without libcalls there will be two calls. The
base address is not exposed in tree-ssa.
But isn't the function marked const? (Is that the right flag?)
__tls_get_addr is not const, and not pure either.

Gr.
Steven
Daniel Jacobowitz
2007-10-19 13:20:13 UTC
Permalink
Post by Steven Bosscher
Post by Richard Kenner
Post by Steven Bosscher
Compile with -fpic. There should only be one call to __tls_get_addr
to get a base address. Without libcalls there will be two calls. The
base address is not exposed in tree-ssa.
But isn't the function marked const? (Is that the right flag?)
__tls_get_addr is not const, and not pure either.
To expand on this:

__tls_get_addr can side effects. The only one I'm sure of offhand
is that it can allocate previously unallocated thread-local storage
for the current thread / module. I think in Alex's clever descriptor
stuff it was called indirectly through the GOT entry and could also
modify the GOT entry to call a simpler function.

But omitting those side effects is, from the POV of a conforming
program, fine. Everywhere except in the dynamic linker during TLS
setup, at least. Can't we get away with marking it const or pure?
--
Daniel Jacobowitz
CodeSourcery
Richard Kenner
2007-10-19 13:35:27 UTC
Permalink
Post by Daniel Jacobowitz
But omitting those side effects is, from the POV of a conforming
program, fine. Everywhere except in the dynamic linker during TLS
setup, at least. Can't we get away with marking it const or pure?
Usually you do in that sort of situation. An analogy is a memoized
function: technically it's not const or pure, but in practice you can
treat it as so.
Richard Kenner
2007-10-19 13:32:42 UTC
Permalink
Post by Steven Bosscher
__tls_get_addr is not const, and not pure either.
Why not? Doesn't it return the same value every time? If it DOESN'T,
then how can we CSE it? And if it DOES, why don't we mark it that way?
Richard Kenner
2007-10-19 13:33:23 UTC
Permalink
Post by Steven Bosscher
I thought you were referring to my remark that we supposedly only need
libcalls for TLS address calls, but IIUC you in fact reacted to my
remark whether CSE of TLS libcalls actually works, right?
Well, both since they are related.
Jakub Jelinek
2007-10-19 13:35:43 UTC
Permalink
Post by Steven Bosscher
Post by Richard Kenner
Post by Steven Bosscher
Compile with -fpic. There should only be one call to __tls_get_addr
to get a base address. Without libcalls there will be two calls. The
base address is not exposed in tree-ssa.
But isn't the function marked const? (Is that the right flag?)
__tls_get_addr is not const, and not pure either.
gcc actually marks them as CONST_OR_PURE_CALL_P (which is a little bit of
lie, as glibc needs to write some memory as part of resolving __tls_get_addr
for the first time, but a) nothing actually cares if it is not called at all
b) nothing cares if it is called just once or several times).

Just tried
static __thread int a, b;

int foo (void)
{
return a + b;
}

void bar (int x, int y)
{
a = x;
b = y;
}

with -O2 -m64 -fpic on x86_64 and -O2 -m{32,64} -fpic on ppc64 and while
4.1/4.2 have just one __tls_get_addr call in each fn, the trunk has 2 in each.
CSE does its job, replaces the pseudo set by second REG_RETVAL insn
by the pseudo set by first REG_RETVAL, but as dce.c doesn't handle libcalls
properly, nothing ever purges the dead libcall.

Jakub
Richard Kenner
2007-10-19 13:47:16 UTC
Permalink
Post by Jakub Jelinek
gcc actually marks them as CONST_OR_PURE_CALL_P (which is a little bit of
lie, as glibc needs to write some memory as part of resolving __tls_get_addr
for the first time, but a) nothing actually cares if it is not called at all
b) nothing cares if it is called just once or several times).
...
Post by Jakub Jelinek
CSE does its job, replaces the pseudo set by second REG_RETVAL insn
by the pseudo set by first REG_RETVAL, but as dce.c doesn't handle libcalls
properly, nothing ever purges the dead libcall.
I'm travelling now, so can't easily test this myself, but what happens if
it's NOT a libcall and if there's still two calls, why?
Steven Bosscher
2007-10-19 14:01:22 UTC
Permalink
Post by Jakub Jelinek
Post by Steven Bosscher
Post by Richard Kenner
Post by Steven Bosscher
Compile with -fpic. There should only be one call to __tls_get_addr
to get a base address. Without libcalls there will be two calls. The
base address is not exposed in tree-ssa.
But isn't the function marked const? (Is that the right flag?)
__tls_get_addr is not const, and not pure either.
gcc actually marks them as CONST_OR_PURE_CALL_P (which is a little bit of
lie, as glibc needs to write some memory as part of resolving __tls_get_addr
for the first time, but a) nothing actually cares if it is not called at all
b) nothing cares if it is called just once or several times).
I didn't know that.

So that would mean that you don't even need libcall notes to CSE
_tls_get_addr calls?
Post by Jakub Jelinek
Just tried
static __thread int a, b;
int foo (void)
{
return a + b;
}
void bar (int x, int y)
{
a = x;
b = y;
}
with -O2 -m64 -fpic on x86_64 and -O2 -m{32,64} -fpic on ppc64 and while
4.1/4.2 have just one __tls_get_addr call in each fn, the trunk has 2 in each.
CSE does its job, replaces the pseudo set by second REG_RETVAL insn
by the pseudo set by first REG_RETVAL, but as dce.c doesn't handle libcalls
properly, nothing ever purges the dead libcall.
What happens if you try this with Eric's patch?

And (more interesting) what happens if you stop gcc from wrapping the
__tls_get_addr sequence in libcall notes?

Gr.
Steven
Paolo Bonzini
2007-10-19 14:17:04 UTC
Permalink
Post by Steven Bosscher
So that would mean that you don't even need libcall notes to CSE
_tls_get_addr calls?
CALL_INSNs are never DCE'd:

static bool
deletable_insn_p (rtx insn, bool fast)
{
rtx body, x;
int i;

if (!NONJUMP_INSN_P (insn))
return false;


Paolo
Kenneth Zadeck
2007-10-19 15:54:26 UTC
Permalink
Post by Paolo Bonzini
Post by Steven Bosscher
So that would mean that you don't even need libcall notes to CSE
_tls_get_addr calls?
static bool
deletable_insn_p (rtx insn, bool fast)
{
rtx body, x;
int i;
if (!NONJUMP_INSN_P (insn))
return false;
Paolo
this could be too conservative, there is no reason not dce pure or const
calls.

kenny
Richard Kenner
2007-10-19 21:44:48 UTC
Permalink
Post by Kenneth Zadeck
this could be too conservative, there is no reason not dce pure or const
calls.
Agreed.
Jakub Jelinek
2007-10-19 15:54:52 UTC
Permalink
Post by Paolo Bonzini
Post by Steven Bosscher
So that would mean that you don't even need libcall notes to CSE
_tls_get_addr calls?
static bool
deletable_insn_p (rtx insn, bool fast)
{
rtx body, x;
int i;
if (!NONJUMP_INSN_P (insn))
return false;
Why is that? CONST_OR_PURE_CALL_P CALL_INSNs should be deletable...

Anyway, just tried (sorry for the delay) removing the libcall stuff, just
generate:
(call_insn/u 5 4 6 mm2.c:4 (parallel [
(set (reg:DI 0 ax)
(call:DI (mem:QI (symbol_ref:DI ("__tls_get_addr")) [0 S1 A8])
(const_int 0 [0x0])))
(unspec:DI [
(const_int 0 [0x0])
] 20)
]) -1 (nil)
(nil))

(insn 6 5 7 mm2.c:4 (set (reg:DI 60)
(reg:DI 0 ax)) -1 (nil))

but CSE isn't able to merge these. Will still try to add REG_EQUAL note to
the pseudo = %rax insn if that helps.

Jakub
Jakub Jelinek
2007-10-19 16:14:42 UTC
Permalink
Post by Jakub Jelinek
Anyway, just tried (sorry for the delay) removing the libcall stuff, just
(call_insn/u 5 4 6 mm2.c:4 (parallel [
(set (reg:DI 0 ax)
(call:DI (mem:QI (symbol_ref:DI ("__tls_get_addr")) [0 S1 A8])
(const_int 0 [0x0])))
(unspec:DI [
(const_int 0 [0x0])
] 20)
]) -1 (nil)
(nil))
(insn 6 5 7 mm2.c:4 (set (reg:DI 60)
(reg:DI 0 ax)) -1 (nil))
but CSE isn't able to merge these. Will still try to add REG_EQUAL note to
the pseudo = %rax insn if that helps.
Adding REG_EQUAL note was all that was needed for CSEing it, but dce.c still
won't delete it, because it punts on all UNSPECs. The UNSPECs are really
necessary for the __tls_get_addr call, as the call_insn has mandatory insn
sequence that needs to be output there (to make TLS transitions possible).
The comment says:
/* The UNSPEC case was added here because the ia-64 claims that
USEs do not work after reload and generates UNSPECS rather
than USEs. Since dce is run after reload we need to avoid
deleting these even if they are dead. If it turns out that
USEs really do work after reload, the ia-64 should be
changed, and the UNSPEC case can be removed. */
I wonder what kind of UNSPECs and at what places were the reasons to add it,
I very much doubt not dceing any UNSPECs is desirable, even on ia64.
Say speculative moves use UNSPEC_LDS{,A}, but they certainly should be dced.

--- gcc/config/i386/i386.c.jj 2007-10-19 14:39:55.000000000 +0200
+++ gcc/config/i386/i386.c 2007-10-19 18:05:57.000000000 +0200
@@ -7644,7 +7644,7 @@ legitimize_tls_address (rtx x, enum tls_

if (TARGET_64BIT && ! TARGET_GNU2_TLS)
{
- rtx rax = gen_rtx_REG (Pmode, AX_REG), insns;
+ rtx rax = gen_rtx_REG (Pmode, AX_REG), insns, last;

start_sequence ();
emit_call_insn (gen_tls_global_dynamic_64 (rax, x));
@@ -7652,7 +7652,10 @@ legitimize_tls_address (rtx x, enum tls_
end_sequence ();

CONST_OR_PURE_CALL_P (insns) = 1;
- emit_libcall_block (insns, dest, rax, x);
+/* emit_libcall_block (insns, dest, rax, x); */
+ emit_insn (insns);
+ emit_move_insn (dest, rax);
+ last = set_unique_reg_note (last, REG_EQUAL, x);
}
else if (TARGET_64BIT && TARGET_GNU2_TLS)
emit_insn (gen_tls_global_dynamic_64 (dest, x));
@@ -7673,7 +7676,7 @@ legitimize_tls_address (rtx x, enum tls_

if (TARGET_64BIT && ! TARGET_GNU2_TLS)
{
- rtx rax = gen_rtx_REG (Pmode, AX_REG), insns, note;
+ rtx rax = gen_rtx_REG (Pmode, AX_REG), insns, note, last;

start_sequence ();
emit_call_insn (gen_tls_local_dynamic_base_64 (rax));
@@ -7683,7 +7686,10 @@ legitimize_tls_address (rtx x, enum tls_
note = gen_rtx_EXPR_LIST (VOIDmode, const0_rtx, NULL);
note = gen_rtx_EXPR_LIST (VOIDmode, ix86_tls_get_addr (), note);
CONST_OR_PURE_CALL_P (insns) = 1;
- emit_libcall_block (insns, base, rax, note);
+/* emit_libcall_block (insns, base, rax, note); */
+ emit_insn (insns);
+ last = emit_move_insn (base, rax);
+ set_unique_reg_note (last, REG_EQUAL, note);
}
else if (TARGET_64BIT && TARGET_GNU2_TLS)
emit_insn (gen_tls_local_dynamic_base_64 (base));
--- gcc/dce.c.jj 2007-10-19 14:39:55.000000000 +0200
+++ gcc/dce.c 2007-10-19 18:01:08.000000000 +0200
@@ -102,7 +102,8 @@ deletable_insn_p (rtx insn, bool fast)
rtx body, x;
int i;

- if (!NONJUMP_INSN_P (insn))
+ if (!NONJUMP_INSN_P (insn)
+ && (!CALL_P (insn) || !CONST_OR_PURE_CALL_P (insn)))
return false;

body = PATTERN (insn);


Jakub
Kenneth Zadeck
2007-10-19 16:39:58 UTC
Permalink
Post by Jakub Jelinek
Post by Jakub Jelinek
Anyway, just tried (sorry for the delay) removing the libcall stuff, just
(call_insn/u 5 4 6 mm2.c:4 (parallel [
(set (reg:DI 0 ax)
(call:DI (mem:QI (symbol_ref:DI ("__tls_get_addr")) [0 S1 A8])
(const_int 0 [0x0])))
(unspec:DI [
(const_int 0 [0x0])
] 20)
]) -1 (nil)
(nil))
(insn 6 5 7 mm2.c:4 (set (reg:DI 60)
(reg:DI 0 ax)) -1 (nil))
but CSE isn't able to merge these. Will still try to add REG_EQUAL note to
the pseudo = %rax insn if that helps.
Adding REG_EQUAL note was all that was needed for CSEing it, but dce.c still
won't delete it, because it punts on all UNSPECs. The UNSPECs are really
necessary for the __tls_get_addr call, as the call_insn has mandatory insn
sequence that needs to be output there (to make TLS transitions possible).
/* The UNSPEC case was added here because the ia-64 claims that
USEs do not work after reload and generates UNSPECS rather
than USEs. Since dce is run after reload we need to avoid
deleting these even if they are dead. If it turns out that
USEs really do work after reload, the ia-64 should be
changed, and the UNSPEC case can be removed. */
I wonder what kind of UNSPECs and at what places were the reasons to add it,
I very much doubt not dceing any UNSPECs is desirable, even on ia64.
Say speculative moves use UNSPEC_LDS{,A}, but they certainly should be dced.
i think that it may be worthwhile to look into this case.
in ia64.md around line 5983.

;; As USE insns aren't meaningful after reload, this is used instead
;; to prevent deleting instructions setting registers for EH handling
(define_insn "prologue_use"
Post by Jakub Jelinek
--- gcc/config/i386/i386.c.jj 2007-10-19 14:39:55.000000000 +0200
+++ gcc/config/i386/i386.c 2007-10-19 18:05:57.000000000 +0200
@@ -7644,7 +7644,7 @@ legitimize_tls_address (rtx x, enum tls_
if (TARGET_64BIT && ! TARGET_GNU2_TLS)
{
- rtx rax = gen_rtx_REG (Pmode, AX_REG), insns;
+ rtx rax = gen_rtx_REG (Pmode, AX_REG), insns, last;
start_sequence ();
emit_call_insn (gen_tls_global_dynamic_64 (rax, x));
@@ -7652,7 +7652,10 @@ legitimize_tls_address (rtx x, enum tls_
end_sequence ();
CONST_OR_PURE_CALL_P (insns) = 1;
- emit_libcall_block (insns, dest, rax, x);
+/* emit_libcall_block (insns, dest, rax, x); */
+ emit_insn (insns);
+ emit_move_insn (dest, rax);
+ last = set_unique_reg_note (last, REG_EQUAL, x);
}
else if (TARGET_64BIT && TARGET_GNU2_TLS)
emit_insn (gen_tls_global_dynamic_64 (dest, x));
@@ -7673,7 +7676,7 @@ legitimize_tls_address (rtx x, enum tls_
if (TARGET_64BIT && ! TARGET_GNU2_TLS)
{
- rtx rax = gen_rtx_REG (Pmode, AX_REG), insns, note;
+ rtx rax = gen_rtx_REG (Pmode, AX_REG), insns, note, last;
start_sequence ();
emit_call_insn (gen_tls_local_dynamic_base_64 (rax));
@@ -7683,7 +7686,10 @@ legitimize_tls_address (rtx x, enum tls_
note = gen_rtx_EXPR_LIST (VOIDmode, const0_rtx, NULL);
note = gen_rtx_EXPR_LIST (VOIDmode, ix86_tls_get_addr (), note);
CONST_OR_PURE_CALL_P (insns) = 1;
- emit_libcall_block (insns, base, rax, note);
+/* emit_libcall_block (insns, base, rax, note); */
+ emit_insn (insns);
+ last = emit_move_insn (base, rax);
+ set_unique_reg_note (last, REG_EQUAL, note);
}
else if (TARGET_64BIT && TARGET_GNU2_TLS)
emit_insn (gen_tls_local_dynamic_base_64 (base));
--- gcc/dce.c.jj 2007-10-19 14:39:55.000000000 +0200
+++ gcc/dce.c 2007-10-19 18:01:08.000000000 +0200
@@ -102,7 +102,8 @@ deletable_insn_p (rtx insn, bool fast)
rtx body, x;
int i;
- if (!NONJUMP_INSN_P (insn))
+ if (!NONJUMP_INSN_P (insn)
+ && (!CALL_P (insn) || !CONST_OR_PURE_CALL_P (insn)))
return false;
body = PATTERN (insn);
Jakub
Paolo Bonzini
2007-10-19 16:49:35 UTC
Permalink
Post by Jakub Jelinek
Adding REG_EQUAL note was all that was needed for CSEing it, but dce.c still
won't delete it, because it punts on all UNSPECs. The UNSPECs are really
necessary for the __tls_get_addr call, as the call_insn has mandatory insn
sequence that needs to be output there (to make TLS transitions possible).
Say speculative moves use UNSPEC_LDS{,A}, but they certainly should be dced.
I would say that all non-trapping UNSPECs (not UNSPEC_VOLATILEs) should
be DCEd. Maxim Kuvyrkov recently added a target hook to mark some
unspecs as non-trapping, and speculative moves are exactly the once he
needed the hook for.

Paolo
Seongbae Park (박성배, 朴成培)
2007-10-19 16:55:42 UTC
Permalink
Post by Paolo Bonzini
Post by Jakub Jelinek
Adding REG_EQUAL note was all that was needed for CSEing it, but dce.c still
won't delete it, because it punts on all UNSPECs. The UNSPECs are really
necessary for the __tls_get_addr call, as the call_insn has mandatory insn
sequence that needs to be output there (to make TLS transitions possible).
Say speculative moves use UNSPEC_LDS{,A}, but they certainly should be dced.
I would say that all non-trapping UNSPECs (not UNSPEC_VOLATILEs) should
be DCEd. Maxim Kuvyrkov recently added a target hook to mark some
unspecs as non-trapping, and speculative moves are exactly the once he
needed the hook for.
Paolo
This was discussed before, and after some discussion,
we've agreed to the following semantic as described in rtl.def:

215 UNSPEC can occur all by itself in a PATTERN, as a component
of a PARALLEL,
216 or inside an expression.
217 UNSPEC by itself or as a component of a PARALLEL
218 is currently considered not deletable.
219
220 FIXME: Replace all uses of UNSPEC that appears by itself or
as a component
221 of a PARALLEL with USE.
222 */

284 USE can not appear as an operand of other rtx except for PARALLEL.
285 USE is not deletable, as it indicates that the operand
286 is used in some unknown way. */
287 DEF_RTL_EXPR(USE, "use", "e", RTX_EXTRA)

293 CLOBBER can not appear as an operand of other rtx except
for PARALLEL.
294 CLOBBER of a hard register appearing by itself (not within PARALLEL)
295 is considered undeletable before reload. */
296 DEF_RTL_EXPR(CLOBBER, "clobber", "e", RTX_EXTRA)


If we want to change the meaning or the interpretation to clean this up,
I'm all for it, but I think it's too late to do that for 4.3.

Seongbae
Paolo Bonzini
2007-10-19 17:23:05 UTC
Permalink
Post by Seongbae Park (박성배, 朴成培)
If we want to change the meaning or the interpretation to clean this up,
I'm all for it, but I think it's too late to do that for 4.3.
I don't think that this is 4.3 material anyway, except for Eric's DCE patch.

Paolo
Jakub Jelinek
2007-10-19 17:41:08 UTC
Permalink
Post by Paolo Bonzini
Post by Seongbae Park (박성배, 朴成培)
If we want to change the meaning or the interpretation to clean this up,
I'm all for it, but I think it's too late to do that for 4.3.
I don't think that this is 4.3 material anyway, except for Eric's DCE patch.
The removal of libcall notes from TLS sequences surely is not 4.3 material.
But not DCEing CONST_OR_PURE_CALL_P calls or insns with UNSPECs inside of
PARALLELs is IMHO a regression from 4.2 and as such should be fixed.

Jakub
Seongbae Park (박성배, 朴成培)
2007-10-19 18:03:00 UTC
Permalink
Post by Jakub Jelinek
Post by Paolo Bonzini
Post by Seongbae Park (박성배, 朴成培)
If we want to change the meaning or the interpretation to clean this up,
I'm all for it, but I think it's too late to do that for 4.3.
I don't think that this is 4.3 material anyway, except for Eric's DCE patch.
The removal of libcall notes from TLS sequences surely is not 4.3 material.
Agreed.
Post by Jakub Jelinek
But not DCEing CONST_OR_PURE_CALL_P calls or
Agreed.
Post by Jakub Jelinek
insns with UNSPECs inside of
PARALLELs is IMHO a regression from 4.2 and as such should be fixed.
I'm not sure on this. During the current DCE development,
we've asked for what UNSPEC means,
and the collective answer we got was what's described in rtl.def.

As the FIXME in rtl.def suggests, the longer term direction
we agreed to is to allow deleting UNSPEC regardless of where it appears
(as you suggests)
but last I heard, there are backends that are using UNSPEC
for places that it should have used USE
and for what it's worth, this was considered the existing practice.

If we think we should fix this in 4.3, that's fine by me
but certainly it would require some cleanup efforts from various backends.

Seongbae
Jakub Jelinek
2007-10-19 18:40:50 UTC
Permalink
Post by Seongbae Park (박성배, 朴成培)
Post by Jakub Jelinek
insns with UNSPECs inside of
PARALLELs is IMHO a regression from 4.2 and as such should be fixed.
I'm not sure on this. During the current DCE development,
we've asked for what UNSPEC means,
and the collective answer we got was what's described in rtl.def.
That asking was done on IRC, not on the ml.
Can you show which backend uses UNSPEC inside of PARALLEL that is not
deletable? Or was that part added just for symmetricity?
4.2 insn_dead_p certainly didn't do anything special about plain UNSPECs,
though it wasn't run late when ia64 adds those 3 UNSPECs into the RTL chain.

I have quickly skimmed i386, rs6000, sparc, alpha, mips, s390 md files.

UNSPEC alone
ia64/ia64.md
prologue_use
flushrs
bundle_selector

I agree about all these 3 that they shouldn't be deletable.

But the uses of UNSPECs inside of PARALLEL I'm pretty sure all are meant to
mean this insn does what is in the first PARALLEL entry, but does it
slightly differently (except maybe for s390 UNSPEC_EXECUTE, I'm really not
sure about that, but if it wouldn't be deletable, guess 4.2 would have
problems; then UNSPEC_EXECUTE is the only (unspec ) that appears on the
first position inside of the parallel). All the rest is IMNSHO deletable.

UNSPEC inside of PARALLEL
i386/i386.md
return_internal_long
tls_global_dynamic_64
tls_local_dynamic_base_64
alpha/alpha.md
sibcall_osf_1_er
sibcall_osf_1
rs6000/altivec.md
altivec_lve*
altivec_lvx*
altivec_stvx*
altivec_stve*
rs6000/spe.md
spe_evlhhesplat
spe_evlhhesplatx
spe_evlhhossplat
spe_evlhhossplatx
spe_evlhhousplat
spe_evlhhousplatx
spe_evlwhsplat
spe_evlwhsplatx
spe_evlwwsplat
spe_evlwwsplatx
spe_evldd
spe_evlddx
spe_evldh
spe_evldhx
spe_evldw
spe_evldwx
spe_evlwhe
spe_evlwhex
spe_evlwhos
spe_evlwhosx
spe_evlwhou
spe_evlwhoux
spe_evstdd
spe_evstddx
spe_evstdh
spe_evstdhx
spe_evstdw
spe_evstdwx
spe_evstwhe
spe_evstwhex
spe_evstwho
spe_evstwhox
spe_evstwwe
spe_evstwwex
spe_evstwwo
spe_evstwwox
s390/s390.md
UNSPEC_EXECUTE
UNSPEC_ROUND

Jakub
Seongbae Park (박성배, 朴成培)
2007-10-19 19:24:44 UTC
Permalink
Post by Jakub Jelinek
Post by Seongbae Park (박성배, 朴成培)
Post by Jakub Jelinek
insns with UNSPECs inside of
PARALLELs is IMHO a regression from 4.2 and as such should be fixed.
I'm not sure on this. During the current DCE development,
we've asked for what UNSPEC means,
and the collective answer we got was what's described in rtl.def.
That asking was done on IRC, not on the ml.
I'm afraid you might have missed the thread:

http://gcc.gnu.org/ml/gcc-patches/2007-07/msg00386.html

I guess the subject wasn't very informative.
Post by Jakub Jelinek
Can you show which backend uses UNSPEC inside of PARALLEL that is not
deletable?
I can't find the pattern in ia64.md,
but we used to have a regression caused by deleting UNSPEC inside PARALLEL.
Post by Jakub Jelinek
Or was that part added just for symmetricity?
Nope. It was explicitly to fix a regression,
though it's easier if we don't have to distinguish between top level
and inside parallel.
Post by Jakub Jelinek
4.2 insn_dead_p certainly didn't do anything special about plain UNSPECs,
though it wasn't run late when ia64 adds those 3 UNSPECs into the RTL chain.
I have quickly skimmed i386, rs6000, sparc, alpha, mips, s390 md files.
UNSPEC alone
ia64/ia64.md
prologue_use
flushrs
bundle_selector
I agree about all these 3 that they shouldn't be deletable.
But the uses of UNSPECs inside of PARALLEL I'm pretty sure all are meant to
mean this insn does what is in the first PARALLEL entry, but does it
slightly differently (except maybe for s390 UNSPEC_EXECUTE, I'm really not
sure about that, but if it wouldn't be deletable, guess 4.2 would have
problems; then UNSPEC_EXECUTE is the only (unspec ) that appears on the
first position inside of the parallel). All the rest is IMNSHO deletable.
UNSPEC inside of PARALLEL
i386/i386.md
return_internal_long
tls_global_dynamic_64
tls_local_dynamic_base_64
alpha/alpha.md
sibcall_osf_1_er
sibcall_osf_1
rs6000/altivec.md
altivec_lve*
altivec_lvx*
altivec_stvx*
altivec_stve*
rs6000/spe.md
spe_evlhhesplat
spe_evlhhesplatx
spe_evlhhossplat
spe_evlhhossplatx
spe_evlhhousplat
spe_evlhhousplatx
spe_evlwhsplat
spe_evlwhsplatx
spe_evlwwsplat
spe_evlwwsplatx
spe_evldd
spe_evlddx
spe_evldh
spe_evldhx
spe_evldw
spe_evldwx
spe_evlwhe
spe_evlwhex
spe_evlwhos
spe_evlwhosx
spe_evlwhou
spe_evlwhoux
spe_evstdd
spe_evstddx
spe_evstdh
spe_evstdhx
spe_evstdw
spe_evstdwx
spe_evstwhe
spe_evstwhex
spe_evstwho
spe_evstwhox
spe_evstwwe
spe_evstwwex
spe_evstwwo
spe_evstwwox
s390/s390.md
UNSPEC_EXECUTE
UNSPEC_ROUND
Jakub
I looked at those and I agree.
I just don't know this will work in ia64 - if ia64 maintainers can agree
to allow deleting insns with UNSPEC at the top of PARALLEL
(of course, provided the insn is dead otherwise), I'm fine with doing
it that way.
--
#pragma ident "Seongbae Park, compiler, http://seongbae.blogspot
Jakub Jelinek
2007-10-19 17:36:02 UTC
Permalink
Post by Paolo Bonzini
Post by Jakub Jelinek
Adding REG_EQUAL note was all that was needed for CSEing it, but dce.c still
won't delete it, because it punts on all UNSPECs. The UNSPECs are really
necessary for the __tls_get_addr call, as the call_insn has mandatory insn
sequence that needs to be output there (to make TLS transitions possible).
Say speculative moves use UNSPEC_LDS{,A}, but they certainly should be dced.
I would say that all non-trapping UNSPECs (not UNSPEC_VOLATILEs) should
be DCEd. Maxim Kuvyrkov recently added a target hook to mark some
unspecs as non-trapping, and speculative moves are exactly the once he
needed the hook for.
I doubt the new target hook would fit into what DCE needs.

But if the only problem is with ia64's prologue_use, or that and
bundle_selector, then perhaps that
if (GET_CODE (body) == UNSPEC) return false;
could move from deletable_insn_p_1 to deletable_insn_p and
thus avoid DCE only of UNSPECs in body of the insns, while allow
it in PARALLELs.

Jakub
Kenneth Zadeck
2007-10-19 19:19:39 UTC
Permalink
Post by Jakub Jelinek
Post by Paolo Bonzini
Post by Jakub Jelinek
Adding REG_EQUAL note was all that was needed for CSEing it, but dce.c still
won't delete it, because it punts on all UNSPECs. The UNSPECs are really
necessary for the __tls_get_addr call, as the call_insn has mandatory insn
sequence that needs to be output there (to make TLS transitions possible).
Say speculative moves use UNSPEC_LDS{,A}, but they certainly should be dced.
I would say that all non-trapping UNSPECs (not UNSPEC_VOLATILEs) should
be DCEd. Maxim Kuvyrkov recently added a target hook to mark some
unspecs as non-trapping, and speculative moves are exactly the once he
needed the hook for.
I doubt the new target hook would fit into what DCE needs.
But if the only problem is with ia64's prologue_use, or that and
bundle_selector, then perhaps that
if (GET_CODE (body) == UNSPEC) return false;
could move from deletable_insn_p_1 to deletable_insn_p and
thus avoid DCE only of UNSPECs in body of the insns, while allow
it in PARALLELs.
Jakub
There are a lot of issues with the usage (and abuse) of unspecs on the
ia64 that most likely need to be considered as a whole and done before
you go head with this.

I personally think that the unspec rather than use issue is something
that just needs to be fixed outright.

one of the big problems here is that it is fairly difficult to get ia64
changes reviewed in any reasonable manner.

Kenny
Jakub Jelinek
2007-10-19 19:25:26 UTC
Permalink
Post by Kenneth Zadeck
There are a lot of issues with the usage (and abuse) of unspecs on the
ia64 that most likely need to be considered as a whole and done before
you go head with this.
See my other mail, there are just 3 UNSPECs in insn body on ia64, and
0 UNSPECs directly inside of PARALLELs.
Post by Kenneth Zadeck
one of the big problems here is that it is fairly difficult to get ia64
changes reviewed in any reasonable manner.
Disagree here, Jim's reviews have been quite timely lately.

Jakub
Richard Kenner
2007-10-19 22:04:54 UTC
Permalink
Post by Paolo Bonzini
I would say that all non-trapping UNSPECs (not UNSPEC_VOLATILEs) should
be DCEd.
Can't even all trapping ones be DCEd? Is there any language where it's
required to include an operation if the only purpose of the operation would
be to trap in the case of an error?
Richard Kenner
2007-10-19 14:22:03 UTC
Permalink
Post by Steven Bosscher
So that would mean that you don't even need libcall notes to CSE
_tls_get_addr calls?
You shouldn't if everything else is working right. But, from what's been
reported here, you do, so it seems that something is broken.

As I said, it may well be that we already don't need libcall notes anymore
if some small bugs are fixed and it may well be better to fix those bugs
and kill the concept than to do further work on it.
Eric Botcazou
2007-10-19 14:04:13 UTC
Permalink
Post by Jakub Jelinek
Just tried
static __thread int a, b;
int foo (void)
{
return a + b;
}
void bar (int x, int y)
{
a = x;
b = y;
}
with -O2 -m64 -fpic on x86_64 and -O2 -m{32,64} -fpic on ppc64 and while
4.1/4.2 have just one __tls_get_addr call in each fn, the trunk has 2 in
each. CSE does its job, replaces the pseudo set by second REG_RETVAL insn
by the pseudo set by first REG_RETVAL, but as dce.c doesn't handle libcalls
properly, nothing ever purges the dead libcall.
Just to confirm that my patch fixes this pessimization. However the version I
posted is a little too aggressive (clean on x86 and x86-64 but optimizes away
some _Q_ functions on SPARC), I've a corrected version currently in testing.
--
Eric Botcazou
Eric Botcazou
2007-10-18 13:15:58 UTC
Permalink
Post by Daniel Berlin
If you don't see the practical advantage of being able to determine
whether an insn is part of a libcall without having to walk the entire
function to figure out whether you are allowed to touch an insn,
please just remove it.
I've posted a patch to that effect. It restores the correct interpretation of
libcall notes as optimization helpers rather than optimization barriers as
far as DCE is concerned and, thus, eliminates the need for being able to
determine whether an insn is part of a libcall in dce.c.

But I won't install it myself, given how controversial this issue has become.
I therefore propose that a 3rd party decides here, and I'd suggest Jakub.

Yes, I see theorical advantages over being able to determine whether an insn
is part of a libcall without having to walk the entire function.

No, I don't see any practical advantages given how the RTL passes are written,
including the new DCE pass which starts with a linear walk over the insns.
Post by Daniel Berlin
God forbid we ever start somewhere and work towards a goal.
Really.
I'm completely done working on our backend or trying to improve it in any
way. I'm removing my name from the dataflow maintainer list.
Everything I've written above doesn't change in any way my judgement on the
great work you have done with Kenneth and others on the dataflow branch:
http://gcc.gnu.org/ml/gcc-patches/2007-06/msg00798.html
--
Eric Botcazou
Paolo Bonzini
2007-10-18 14:01:37 UTC
Permalink
Post by Eric Botcazou
Post by Daniel Berlin
If you don't see the practical advantage of being able to determine
whether an insn is part of a libcall without having to walk the entire
function to figure out whether you are allowed to touch an insn,
please just remove it.
I've posted a patch to that effect. It restores the correct interpretation of
libcall notes as optimization helpers rather than optimization barriers as
far as DCE is concerned and, thus, eliminates the need for being able to
determine whether an insn is part of a libcall in dce.c.
But I won't install it myself, given how controversial this issue has become.
I therefore propose that a 3rd party decides here, and I'd suggest Jakub.
I agree that Jakub has a good hold of DCE but, if a DF reviewer is
needed, I'm happy to help.

I'm not extremely tied to REG_LIBCALL_ID, though in the long run I think
it is a better solution than REG_LIBCALL/REG_RETVAL. However, it looks
like there is no clean design for a new representation (for example,
might the best thing be REG_LIBCALL_ID/REG_RETVAL?) that is also usable
*for the entire transition* (covering the whole RTL pipeline). Also,
we're in stage3. For all these reasons, I'm happy to ditch it at least
momentarily.

Paolo
Kenneth Zadeck
2007-10-18 14:07:31 UTC
Permalink
Post by Paolo Bonzini
Post by Eric Botcazou
Post by Daniel Berlin
If you don't see the practical advantage of being able to determine
whether an insn is part of a libcall without having to walk the entire
function to figure out whether you are allowed to touch an insn,
please just remove it.
I've posted a patch to that effect. It restores the correct
interpretation of libcall notes as optimization helpers rather than
optimization barriers as far as DCE is concerned and, thus,
eliminates the need for being able to determine whether an insn is
part of a libcall in dce.c.
But I won't install it myself, given how controversial this issue has become.
I therefore propose that a 3rd party decides here, and I'd suggest Jakub.
I agree that Jakub has a good hold of DCE but, if a DF reviewer is
needed, I'm happy to help.
I'm not extremely tied to REG_LIBCALL_ID, though in the long run I
think it is a better solution than REG_LIBCALL/REG_RETVAL. However,
it looks like there is no clean design for a new representation (for
example, might the best thing be REG_LIBCALL_ID/REG_RETVAL?) that is
also usable *for the entire transition* (covering the whole RTL
pipeline). Also, we're in stage3. For all these reasons, I'm happy
to ditch it at least momentarily.
Paolo
thanks

kenny
Steven Bosscher
2007-10-18 14:06:53 UTC
Permalink
Post by Eric Botcazou
Post by Daniel Berlin
If you don't see the practical advantage of being able to determine
whether an insn is part of a libcall without having to walk the entire
function to figure out whether you are allowed to touch an insn,
please just remove it.
I've posted a patch to that effect. It restores the correct interpretation of
libcall notes as optimization helpers rather than optimization barriers as
far as DCE is concerned and, thus, eliminates the need for being able to
determine whether an insn is part of a libcall in dce.c.
But I won't install it myself, given how controversial this issue has become.
I therefore propose that a 3rd party decides here, and I'd suggest Jakub.
FWIW, I think your patch is the typical "less than ideal but the best
we can do now" kind of solution.

Gr.
Steven
Paolo Bonzini
2007-10-18 07:28:22 UTC
Permalink
Post by Steven Bosscher
1) the old RTL loop optimizer could use them to move loop invariant libcalls.
Since the old RTL loop optimizer is no more, this is no longer a
reason to keep the old-style libcall notes.
2) the old DCE in flow.c could remove entire libcall sequences if the
libcall result was dead.
The new DCE uses REG_LIBCALL_ID notes, so again the reason to have
REG_LIBCALL and REG_RETVAL is gone here.
3) The pre-regalloc scheduling pass used to move libcall blocks as a whole.
4) When -ftrapv is in effect, libcalls are "necessary" so that the
results of an operation can be propagated without making the call to the
libgcc functions dead. The attached patch causes a failure in pr30286.c.

Paolo
Richard Guenther
2007-10-18 07:57:24 UTC
Permalink
Post by Paolo Bonzini
Post by Steven Bosscher
1) the old RTL loop optimizer could use them to move loop invariant libcalls.
Since the old RTL loop optimizer is no more, this is no longer a
reason to keep the old-style libcall notes.
2) the old DCE in flow.c could remove entire libcall sequences if the
libcall result was dead.
The new DCE uses REG_LIBCALL_ID notes, so again the reason to have
REG_LIBCALL and REG_RETVAL is gone here.
3) The pre-regalloc scheduling pass used to move libcall blocks as a whole.
4) When -ftrapv is in effect, libcalls are "necessary" so that the
results of an operation can be propagated without making the call to the
libgcc functions dead. The attached patch causes a failure in pr30286.c.
One more reason to remove -ftrapv?

/me runs...
Steven Bosscher
2007-10-18 09:01:42 UTC
Permalink
Post by Paolo Bonzini
Post by Steven Bosscher
1) the old RTL loop optimizer could use them to move loop invariant libcalls.
Since the old RTL loop optimizer is no more, this is no longer a
reason to keep the old-style libcall notes.
2) the old DCE in flow.c could remove entire libcall sequences if the
libcall result was dead.
The new DCE uses REG_LIBCALL_ID notes, so again the reason to have
REG_LIBCALL and REG_RETVAL is gone here.
3) The pre-regalloc scheduling pass used to move libcall blocks as a whole.
4) When -ftrapv is in effect, libcalls are "necessary" so that the
results of an operation can be propagated without making the call to the
libgcc functions dead. The attached patch causes a failure in pr30286.c.
If the libcall is not pure/const, it should not be removed. If the
libcall is pure/const with -ftrapv, that's your bug. It seems
backwards that libcall notes now suddenly also exist to _avoid_
removing library calls...

Gr.
Steven
Eric Botcazou
2007-10-18 12:50:59 UTC
Permalink
Post by Steven Bosscher
If the libcall is not pure/const, it should not be removed. If the
libcall is pure/const with -ftrapv, that's your bug.
Yes, -ftrapv is known to be broken because of this problem.
Post by Steven Bosscher
It seems backwards that libcall notes now suddenly also exist to _avoid_
removing library calls...
Yes, that's also the misinterpretation of dce.c.
--
Eric Botcazou
Loading...