Discussion:
[m68k 03/13] boehm-gc fixes
z***@linux-m68k.org
2007-01-30 11:26:18 UTC
Hi,

A few fixes to get gc working on m68k:
- the thread suspend handler has to save all registers
- re-enable MPROTECT_VDB; it should work now, and if it ever failed it
was probably a kernel bug (I couldn't find a reference to the original
problem, nor did I notice any problem myself)
- change STACKBOTTOM to LINUX_STACKBOTTOM so it works with 2.6 kernels
(a rough sketch of what that means is below)
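
For reference, LINUX_STACKBOTTOM makes the collector discover the stack
base at run time instead of trusting a hard-coded 0xf0000000, which is
wrong with 2.6 kernels.  A minimal standalone sketch of the idea -- this
is not the actual boehm-gc code, and the field index in /proc/self/stat
is my assumption from proc(5):

---------------------------------------------------------------------------
/* Sketch: find the stack bottom by asking the kernel.  Field 28 of
   /proc/self/stat is "startstack" according to proc(5).  */
#include <stdio.h>

static void *guess_stack_bottom (void)
{
  FILE *f = fopen ("/proc/self/stat", "r");
  unsigned long start_stack = 0;
  int i;

  if (!f)
    return 0;
  /* Skip the first 27 fields; field 2 (the command name) contains no
     spaces for this program, so a plain %*s per field is good enough
     for a sketch.  */
  for (i = 1; i < 28; i++)
    if (fscanf (f, "%*s") == EOF)
      break;
  if (fscanf (f, "%lu", &start_stack) != 1)
    start_stack = 0;
  fclose (f);
  return (void *) start_stack;
}

int main (void)
{
  printf ("stack bottom ~ %p\n", guess_stack_bottom ());
  return 0;
}
---------------------------------------------------------------------------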


2007-01-30 Roman Zippel <***@linux-m68k.org>

* boehm-gc/include/private/gcconfig.h: Use LINUX_STACKBOTTOM so
it works with Linux 2.6; reactivate MPROTECT_VDB.
* boehm-gc/pthread_stop_world.c: Save all registers
on signal entry.

---
boehm-gc/include/private/gcconfig.h | 4 ++--
boehm-gc/pthread_stop_world.c | 2 +-
2 files changed, 3 insertions(+), 3 deletions(-)

Index: gcc/boehm-gc/include/private/gcconfig.h
===================================================================
--- gcc.orig/boehm-gc/include/private/gcconfig.h 2006-12-21 16:18:52.000000000 +0100
+++ gcc/boehm-gc/include/private/gcconfig.h 2006-12-21 16:22:22.000000000 +0100
@@ -665,10 +665,10 @@
# endif
# ifdef LINUX
# define OS_TYPE "LINUX"
-# define STACKBOTTOM ((ptr_t)0xf0000000)
+# define LINUX_STACKBOTTOM
# define USE_GENERIC_PUSH_REGS
/* We never got around to the assembly version. */
-/* # define MPROTECT_VDB - Reported to not work 9/17/01 */
+# define MPROTECT_VDB
# ifdef __ELF__
# define DYNAMIC_LOADING
# include <features.h>
Index: gcc/boehm-gc/pthread_stop_world.c
===================================================================
--- gcc.orig/boehm-gc/pthread_stop_world.c 2006-12-21 16:22:50.000000000 +0100
+++ gcc/boehm-gc/pthread_stop_world.c 2006-12-21 16:23:15.000000000 +0100
@@ -124,7 +124,7 @@ sem_t GC_suspend_ack_sem;

void GC_suspend_handler_inner(ptr_t sig_arg);

-#if defined(IA64) || defined(HP_PA)
+#if defined(IA64) || defined(HP_PA) || defined(M68K)
extern void GC_with_callee_saves_pushed();

void GC_suspend_handler(int sig)

--
z***@linux-m68k.org
2007-01-30 11:26:26 UTC
Hi,

This patch fixes another reload problem: symbolic_operand() recognizes
constant addresses, but they aren't recognized by GO_IF_LEGITIMATE_ADDRESS,
so reload fails.  Simply use symbolic_operand() there as well.
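
To make the mismatch concrete, here is a rough sketch of the two checks
(not the actual gcc sources; the helper names are mine and gcc's rtl.h
is assumed to be in scope).  symbolic_operand() also accepts a CONST
wrapping symbol+offset, which the old INDIRECTABLE_1_ADDRESS_P clauses
never matched:

---------------------------------------------------------------------------
/* Sketch only -- simplified from the real code.  */

/* What the old INDIRECTABLE_1_ADDRESS_P clauses accepted next to the
   PIC register: a bare symbol or label.  */
static int
old_pic_term_p (rtx x)
{
  return GET_CODE (x) == SYMBOL_REF || GET_CODE (x) == LABEL_REF;
}

/* What symbolic_operand() roughly recognizes: the bare forms plus a
   CONST of symbol+offset.  */
static int
symbolic_term_p (rtx x)
{
  if (GET_CODE (x) == SYMBOL_REF || GET_CODE (x) == LABEL_REF)
    return 1;
  return (GET_CODE (x) == CONST
          && GET_CODE (XEXP (x, 0)) == PLUS
          && (GET_CODE (XEXP (XEXP (x, 0), 0)) == SYMBOL_REF
              || GET_CODE (XEXP (XEXP (x, 0), 0)) == LABEL_REF)
          && GET_CODE (XEXP (XEXP (x, 0), 1)) == CONST_INT);
}

/* So for an address like
     (plus %a5 (const (plus (symbol_ref "foo") (const_int 3))))
   the old check says "not legitimate" while the operand predicates are
   happy with it, and reload has nowhere to go.  */
---------------------------------------------------------------------------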


2007-01-30 Roman Zippel <***@linux-m68k.org>

* config/m68k/m68k.h (INDIRECTABLE_1_ADDRESS_P): Use
symbolic_operand() to recognize addresses.

---

gcc/config/m68k/m68k.h | 4 +---
1 file changed, 1 insertion(+), 3 deletions(-)

Index: gcc-4.1/gcc/config/m68k/m68k.h
===================================================================
--- gcc-4.1.orig/gcc/config/m68k/m68k.h
+++ gcc-4.1/gcc/config/m68k/m68k.h
@@ -718,9 +718,7 @@ __transfer_from_trampoline () \
&& (TARGET_68020 \
|| ((unsigned) INTVAL (XEXP (X, 1)) + 0x8000) < 0x10000)) \
|| (GET_CODE (X) == PLUS && XEXP (X, 0) == pic_offset_table_rtx \
- && flag_pic && GET_CODE (XEXP (X, 1)) == SYMBOL_REF) \
- || (GET_CODE (X) == PLUS && XEXP (X, 0) == pic_offset_table_rtx \
- && flag_pic && GET_CODE (XEXP (X, 1)) == LABEL_REF))
+ && flag_pic && symbolic_operand (XEXP (X, 1), VOIDmode))) \

#define GO_IF_NONINDEXED_ADDRESS(X, ADDR) \
{ if (INDIRECTABLE_1_ADDRESS_P (X)) goto ADDR; }

--
Richard Sandiford
2007-01-30 13:06:22 UTC
Post by z***@linux-m68k.org
Hi,
This patch fixes another reload problem, symbolic_operand() recognizes
constant addresses, but aren't recognized by GO_IF_LEGITIMATE_ADDRESS,
so that reload fails, so simply use symbolic_operand() there as well.
* config/m68k/m68k.h (INDIRECTABLE_1_ADDRESS_P): Use
symbolic_operand() to recognize addresses.
---
gcc/config/m68k/m68k.h | 4 +---
1 file changed, 1 insertion(+), 3 deletions(-)
Index: gcc-4.1/gcc/config/m68k/m68k.h
===================================================================
--- gcc-4.1.orig/gcc/config/m68k/m68k.h
+++ gcc-4.1/gcc/config/m68k/m68k.h
@@ -718,9 +718,7 @@ __transfer_from_trampoline () \
&& (TARGET_68020 \
|| ((unsigned) INTVAL (XEXP (X, 1)) + 0x8000) < 0x10000)) \
|| (GET_CODE (X) == PLUS && XEXP (X, 0) == pic_offset_table_rtx \
- && flag_pic && GET_CODE (XEXP (X, 1)) == SYMBOL_REF) \
- || (GET_CODE (X) == PLUS && XEXP (X, 0) == pic_offset_table_rtx \
- && flag_pic && GET_CODE (XEXP (X, 1)) == LABEL_REF))
+ && flag_pic && symbolic_operand (XEXP (X, 1), VOIDmode))) \
#define GO_IF_NONINDEXED_ADDRESS(X, ADDR) \
{ if (INDIRECTABLE_1_ADDRESS_P (X)) goto ADDR; }
I'm not sure about this. We want to load the symbol or label
and add the offset separately, don't we? That's certainly
what the pre-reload code does.

Can you give a few more details about how this fails? How do
we end up with (plus pic_offset_table_rtx (const (plus ...)))?

Richard
Roman Zippel
2007-01-30 13:56:49 UTC
Hi,
Post by Richard Sandiford
I'm not sure about this. We want to load the symbol or label
and add the offset separately, don't we? That's certainly
what the pre-reload code does.
Can you give a few more details about how this fails? How do
we end up with (plus pic_offset_table_rtx (const (plus ...)))?
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=385327

The additional offset is generated later by converting the SI argument to
a QI argument, but I have to check the exact details again myself.
Note that all mentioned bugzilla references are different bugs.

bye, Roman
Jeffrey Law
2007-02-05 17:44:39 UTC
Post by Roman Zippel
Hi,
Post by Richard Sandiford
I'm not sure about this. We want to load the symbol or label
and add the offset separately, don't we? That's certainly
what the pre-reload code does.
Can you give a few more details about how this fails? How do
we end up with (plus pic_offset_table_rtx (const (plus ...)))?
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=385327
The additional offset is generated later by converting the SI argument to
a QI argument, but I have to check the exact details again myself.
Note that all mentioned bugzilla references are different bugs.
I don't see anything in that bug report which includes any analysis of
the problem. I'd really like to see some analysis instead of just a
patch.


Jeff
Jeffrey Law
2007-02-05 17:47:23 UTC
Post by Roman Zippel
Hi,
Post by Richard Sandiford
I'm not sure about this. We want to load the symbol or label
and add the offset separately, don't we? That's certainly
what the pre-reload code does.
Can you give a few more details about how this fails? How do
we end up with (plus pic_offset_table_rtx (const (plus ...)))?
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=385327
The additional offset is generated later by converting the SI argument to
a QI argument, but I have to check the exact details again myself.
Note that all mentioned bugzilla references are different bugs.
I also wonder if we should be trying to prevent reload from changing
the mode of the argument from SI to QI mode -- it seems like we are
having numerous problems when arguments get squished down to QImode.

I thought we had a macro to prevent this kind of behavior.

Jeff
Roman Zippel
2007-02-05 22:56:21 UTC
Hi,
Post by Jeffrey Law
Post by Roman Zippel
The additional offset is generated later by converting the SI argument to
a QI argument, but I have to check the exact details again myself.
Note that all mentioned bugzilla references are different bugs.
I also wonder if we should be trying to prevent reload from changing
the mode of the argument from SI to QI mode -- it seems like we are
having numerous problems when arguments get squished down to QImode.
Actually the argument already has QI mode. This is the reload in question:

Reloads for insn # 27
[..]
Reload 1: reload_in (QI) = (mem/u:QI (plus:SI (reg:SI 13 %a5)
(const:SI (plus:SI (symbol_ref:SI ("_Z12LispLessThanR15LispEnvironmenti") [flags 0x41] <function_decl 0xb7906620 LispLessThan>)
(const_int 3 [0x3])))) [0 S1 A8])
DATA_REGS, RELOAD_FOR_INPUT (opnum = 1), optional, can't combine
reload_in_reg: (subreg:QI (reg:SI 131) 3)

[..]

(insn 7 5 9 2 (set (reg:SI 131)
(mem/u:SI (plus:SI (reg:SI 13 %a5)
(symbol_ref:SI ("_Z12LispLessThanR15LispEnvironmenti") [flags 0x41] <function_decl 0xb7906620 LispLessThan>)) [0 S4 A8])) 31 {*m68k.md:659} (nil)
(expr_list:REG_EQUIV (mem/u:SI (plus:SI (reg:SI 13 %a5)
(symbol_ref:SI ("_Z12LispLessThanR15LispEnvironmenti") [flags 0x41] <function_decl 0xb7906620 LispLessThan>)) [0 S4 A8])
(nil)))

[..]

(insn 27 23 28 2 (set (mem/s:QI (plus:SI (reg:SI 134)
(const_int 26 [0x1a])) [0 Code_PositiveIntPower+26 S1 A8])
(subreg:QI (reg:SI 131) 3)) 38 {*m68k.md:731} (nil)
(nil))


I guess the alternative is to push it into a register, but this actually
works fine: the linker simply creates another GOT entry for it, so I don't
see why we should disallow it.

bye, Roman
Jeffrey Law
2007-02-06 17:31:51 UTC
Post by Roman Zippel
Hi,
Post by Jeffrey Law
Post by Roman Zippel
The additional offset is generated later by converting the SI argument to
a QI argument, but I have to check the exact details again myself.
Note that all mentioned bugzilla references are different bugs.
I also wonder if we should be trying to prevent reload from changing
the mode of the argument from SI to QI mode -- it seems like we are
having numerous problems when arguments get squished down to QImode.
Reloads for insn # 27
[..]
Reload 1: reload_in (QI) = (mem/u:QI (plus:SI (reg:SI 13 %a5)
(const:SI (plus:SI (symbol_ref:SI ("_Z12LispLessThanR15LispEnvironmenti") [flags 0x41] <function_decl 0xb7906620 LispLessThan>)
(const_int 3 [0x3])))) [0 S1 A8])
DATA_REGS, RELOAD_FOR_INPUT (opnum = 1), optional, can't combine
reload_in_reg: (subreg:QI (reg:SI 131) 3)
[..]
(insn 7 5 9 2 (set (reg:SI 131)
(mem/u:SI (plus:SI (reg:SI 13 %a5)
(symbol_ref:SI ("_Z12LispLessThanR15LispEnvironmenti") [flags 0x41] <function_decl 0xb7906620 LispLessThan>)) [0 S4 A8])) 31 {*m68k.md:659} (nil)
(expr_list:REG_EQUIV (mem/u:SI (plus:SI (reg:SI 13 %a5)
(symbol_ref:SI ("_Z12LispLessThanR15LispEnvironmenti") [flags 0x41] <function_decl 0xb7906620 LispLessThan>)) [0 S4 A8])
(nil)))
[..]
(insn 27 23 28 2 (set (mem/s:QI (plus:SI (reg:SI 134)
(const_int 26 [0x1a])) [0 Code_PositiveIntPower+26 S1 A8])
(subreg:QI (reg:SI 131) 3)) 38 {*m68k.md:731} (nil)
(nil))
I guess the alternative is to push it into a register, but this actually
works fine, the linker simply creates another GOT entry for it, so I don't
see why we should disallow it.
OK. It's not changing the mode of anything, it's simply reloading for
a subreg. So concerns about mode changes are really a red herring.

ISTM that you don't generally want gunk like <symbol>+offset in the GOT;
instead you'd prefer to just have the raw symbol in the GOT, then add in
the offset.

That opinion also matches the comment in G_I_L_A, which states "When
generating PIC, an address involving a SYMBOL_REF is legitimate if and
only if it is the sum of pic_offset_table_rtx and the SYMBOL_REF" (or
LABEL_REF).

What happens if you twiddle LEGITIMATE_PIC_OPERAND_P instead? Rather
than having it check for symbolic_operand, explicitly check for
just SYMBOL_REF or LABEL_REF, which would make it match the comments in
G_I_L_A. My concern with this approach is how it affects code
generation earlier in the compiler.

If twiddling LEGITIMATE_PIC_OPERAND_P turns out to be a bad idea, then
we can probably go forward with your patch to INDIRECTABLE_1_ADDRESS_P.
We can then ponder defining LEGITIMIZE_RELOAD_ADDRESS if we want to
twiddle the reloads generated to get better performance or try and
minimize the number of GOT entries if we're inclined to do so.

Jeff
Roman Zippel
2007-02-07 21:47:51 UTC
Hi,
Post by Jeffrey Law
Post by Roman Zippel
I guess the alternative is to push it into a register, but this actually
works fine, the linker simply creates another GOT entry for it, so I don't
see why we should disallow it.
OK. It's not changing the mode of anything, it's simply reloading for
a subreg. So concerns about mode changes are really a red herring.
ISTM that you don't generally want gunk like <symbol>+offset in the GOT,
instead you'd prefer to just have the raw symbol in the got, then add in
the offset.
Why is it so important to avoid this? I can understand it in the
-fpic/ColdFire case to minimize GOT entries, but for -fPIC it's not that
important, and I'd rather take a few more GOT entries if that allows
better code.
I wouldn't mind cleaning up the code to conditionally allow for both
cases, but then I'd rather wait for Richard's patch, which uninlines
G_I_L_A.

bye, Roman
Jeffrey Law
2007-02-12 17:22:05 UTC
Post by Roman Zippel
Hi,
Post by Jeffrey Law
Post by Roman Zippel
I guess the alternative is to push it into a register, but this actually
works fine, the linker simply creates another GOT entry for it, so I don't
see why we should disallow it.
OK. It's not changing the mode of anything, it's simply reloading for
a subreg. So concerns about mode changes are really a red herring.
ISTM that you don't generally want gunk like <symbol>+offset in the GOT,
instead you'd prefer to just have the raw symbol in the got, then add in
the offset.
Why is it so important to avoid this? I can understand it in the
-fpic/ColdFire case to minimize GOT entries, but for -fPIC it's not that
important and I'd rather take a few more GOT entries, if it allows better
code.
1. Even with -fPIC you want a compact GOT to keep your displacements
as small as possible. Just because you can reference a large GOT
doesn't make it a good thing.

2. If the GOT only contains the symbolic part of the address, then
references to the GOT can be CSE'd, which is very good as every
reference to the GOT is a memory operation.
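
A contrived illustration of point 2 (hypothetical code; the instruction
sequence in the comment only shows the shape, not exact compiler output):

---------------------------------------------------------------------------
/* With only the bare symbol in the GOT, both accesses below can share
   a single GOT load when compiled with -fPIC, e.g. roughly:

       move.l  foo@GOT(%a5),%a0     | one GOT load, CSE'd
       move.b  3(%a0),%d0
       move.b  7(%a0),%d1

   With foo+3 and foo+7 each getting their own GOT slot, every access
   needs its own load from the GOT.  */
extern char foo[16];

int
use_two_offsets (void)
{
  return foo[3] + foo[7];
}
---------------------------------------------------------------------------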

Jeff
z***@linux-m68k.org
2007-01-30 11:26:22 UTC
Hi,

This removes the workaround for some assemblers that addressed the jump
table with "Lnnn-LInnn-2.b(%pc)" and simply uses "Lnnn(%pc)" now, which
means the code to generate the extra label can go as well.
As a side effect it allows gas to pick the best addressing mode based on
the offset.


2007-01-30 Roman Zippel <***@linux-m68k.org>

* config/m68k/m68k.c (output_move_himode),
config/m68k/m68k.md: Remove jump table recognition.
* config/m68k/m68k.c (ASM_OUTPUT_CASE_FETCH): Use simple
PC-relative addressing.

---

gcc/config/m68k/m68k.c | 33 ++-------------------------------
gcc/config/m68k/m68k.md | 21 +--------------------
2 files changed, 3 insertions(+), 51 deletions(-)

Index: gcc/gcc/config/m68k/m68k.c
===================================================================
--- gcc.orig/gcc/config/m68k/m68k.c 2006-12-26 01:57:15.000000000 +0100
+++ gcc/gcc/config/m68k/m68k.c 2006-12-26 18:30:51.000000000 +0100
@@ -1819,25 +1819,6 @@ output_move_himode (rtx *operands)
}
else if (CONSTANT_P (operands[1]))
return "move%.l %1,%0";
- /* Recognize the insn before a tablejump, one that refers
- to a table of offsets. Such an insn will need to refer
- to a label on the insn. So output one. Use the label-number
- of the table of offsets to generate this label. This code,
- and similar code below, assumes that there will be at most one
- reference to each table. */
- if (GET_CODE (operands[1]) == MEM
- && GET_CODE (XEXP (operands[1], 0)) == PLUS
- && GET_CODE (XEXP (XEXP (operands[1], 0), 1)) == LABEL_REF
- && GET_CODE (XEXP (XEXP (operands[1], 0), 0)) != PLUS)
- {
- rtx labelref = XEXP (XEXP (operands[1], 0), 1);
- if (MOTOROLA)
- asm_fprintf (asm_out_file, "\t.set %LLI%d,.+2\n",
- CODE_LABEL_NUMBER (XEXP (labelref, 0)));
- else
- (*targetm.asm_out.internal_label) (asm_out_file, "LI",
- CODE_LABEL_NUMBER (XEXP (labelref, 0)));
- }
return "move%.w %1,%0";
}

@@ -2957,16 +2938,6 @@ print_operand (FILE *file, rtx op, int l
It is possible for PIC to generate a (plus (label_ref...) (reg...))
and we handle that just like we would a (plus (symbol_ref...) (reg...)).

- Some SGS assemblers have a bug such that "Lnnn-LInnn-2.b(pc,d0.l*2)"
- fails to assemble. Luckily "Lnnn(pc,d0.l*2)" produces the results
- we want. This difference can be accommodated by using an assembler
- define such "LDnnn" to be either "Lnnn-LInnn-2.b", "Lnnn", or any other
- string, as necessary. This is accomplished via the ASM_OUTPUT_CASE_END
- macro. See m68k/sgs.h for an example; for versions without the bug.
- Some assemblers refuse all the above solutions. The workaround is to
- emit "K(pc,d0.l*2)" with K being a small constant known to give the
- right behavior.
-
They also do not like things like "pea 1.w", so we simple leave off
the .w on small constants.

@@ -2977,10 +2948,10 @@ print_operand (FILE *file, rtx op, int l

#if MOTOROLA
# define ASM_OUTPUT_CASE_FETCH(file, labelno, regname) \
- asm_fprintf (file, "%LL%d-%LLI%d.b(%Rpc,%s.", labelno, labelno, regname)
+ asm_fprintf (file, "%LL%d(%Rpc,%s.", labelno, regname)
#else /* !MOTOROLA */
# define ASM_OUTPUT_CASE_FETCH(file, labelno, regname) \
- asm_fprintf (file, "%Rpc@(%LL%d-%LLI%d-2:b,%s:", labelno, labelno, regname)
+ asm_fprintf (file, "%Rpc@(%LL%d,%s:", labelno, regname)
#endif /* !MOTOROLA */

void
Index: gcc/gcc/config/m68k/m68k.md
===================================================================
--- gcc.orig/gcc/config/m68k/m68k.md 2006-12-26 13:10:54.000000000 +0100
+++ gcc/gcc/config/m68k/m68k.md 2006-12-26 18:30:51.000000000 +0100
@@ -6696,26 +6696,7 @@
[(set (match_operand:SI 0 "nonimmediate_operand" "=a")
(match_operand:QI 1 "address_operand" "p"))]
""
-{
- /* Recognize an insn that refers to a table of offsets. Such an insn will
- need to refer to a label on the insn. So output one. Use the
- label-number of the table of offsets to generate this label. This code,
- and similar code above, assumes that there will be at most one reference
- to each table. */
- if (GET_CODE (operands[1]) == PLUS
- && GET_CODE (XEXP (operands[1], 1)) == LABEL_REF
- && GET_CODE (XEXP (operands[1], 0)) != PLUS)
- {
- rtx labelref = XEXP (operands[1], 1);
- if (MOTOROLA)
- asm_fprintf (asm_out_file, "\\t.set %LLI%d,.+2\\n",
- CODE_LABEL_NUMBER (XEXP (labelref, 0)));
- else
- (*targetm.asm_out.internal_label) (asm_out_file, "LI",
- CODE_LABEL_NUMBER (XEXP (labelref, 0)));
- }
- return "lea %a1,%0";
-})
+ "lea %a1,%0")

;; This is the first machine-dependent peephole optimization.
;; It is useful when a floating value is returned from a function call

--
Jeffrey Law
2007-02-05 17:20:14 UTC
Post by z***@linux-m68k.org
Hi,
This removes the workaround for some assemblers to address the jump
table with "Lnnn-LInnn-2.b(%pc)" and simply uses "Lnnn(%pc)" now, this
means the code to generate the extra label can go now.
As a side effect it allows gas to generate the best addressing based on
the offset.
* config/m68k/m68k.c (output_move_himode)
config/m68k/m68k.md: remove jump table recognition
* config/m68k/m68k.c (ASM_OUTPUT_CASE_FETCH): use simple pc
relative addressing
OK.
Jeff
z***@linux-m68k.org
2007-01-30 11:26:25 UTC
Hi,

This is another patch which originally addressed a reload problem, by
avoiding generating a load from the constant pool in the first place.
It cleans up the predicate handling and rejects certain constants (that
we want in registers) early, but allows them again during reload for the
case that reload has no register left, so that it tries the immediate
argument first before generating a memory reference.
The fp compare patterns are now much closer to the real instructions, so
that the main work is done by combine and unnecessary reloads are avoided.
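
In plain C, the decision the new predicate makes boils down to roughly
the following; this is just a restatement of the fp_src_operand body in
the patch below, for clarity, not separate code:

---------------------------------------------------------------------------
/* Restatement of the fp_src_operand logic; op has already passed
   general_src_operand at this point.  */
static int
fp_src_operand_logic (rtx op)
{
  if (!CONSTANT_P (op))
    return 1;                       /* registers, memory: always ok */
  if (!standard_68881_constant_p (op))
    return 1;                       /* other constants stay as before */
  /* Constants the 68881 can generate directly (fmovecr) are rejected
     before reload so they get forced into an fp register early, but
     accepted again during/after reload so reload can still fall back
     to the immediate instead of inventing a memory reference.  */
  return reload_in_progress || reload_completed;
}
---------------------------------------------------------------------------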


2007-01-30 Roman Zippel <***@linux-m68k.org>

* config/m68k/m68k.c (notice_update_cc): Recognize fp compare
(moved from fp compare patterns).
* config/m68k/m68k.md (cmp?f): Clean up predicates to relieve
reload.
* gcc/config/m68k/predicates.md (fp_src_operand): New; reject
certain constants early.

---
gcc/config/m68k/m68k.c | 3 ++
gcc/config/m68k/m68k.md | 52 ++++++++++++------------------------------
gcc/config/m68k/predicates.md | 10 ++++++++
3 files changed, 28 insertions(+), 37 deletions(-)

Index: egcs/gcc/config/m68k/m68k.c
===================================================================
--- egcs.orig/gcc/config/m68k/m68k.c
+++ egcs/gcc/config/m68k/m68k.c
@@ -2872,6 +2872,9 @@ notice_update_cc (rtx exp, rtx insn)
if (((cc_status.value1 && FP_REG_P (cc_status.value1))
|| (cc_status.value2 && FP_REG_P (cc_status.value2))))
cc_status.flags = CC_IN_68881;
+ if (cc_status.value2 && GET_CODE (cc_status.value2) == COMPARE
+ && GET_MODE_CLASS (GET_MODE (XEXP (cc_status.value2, 0))) == MODE_FLOAT)
+ cc_status.flags = CC_IN_68881;
}

const char *
Index: egcs/gcc/config/m68k/m68k.md
===================================================================
--- egcs.orig/gcc/config/m68k/m68k.md
+++ egcs/gcc/config/m68k/m68k.md
@@ -406,50 +406,28 @@

(define_expand "cmp<mode>"
[(set (cc0)
- (compare (match_operand:FP 0 "general_operand" "")
- (match_operand:FP 1 "general_operand" "")))]
+ (compare (match_operand:FP 0 "register_operand" "")
+ (match_operand:FP 1 "fp_src_operand" "")))]
"TARGET_HARD_FLOAT"
-{
- m68k_last_compare_had_fp_operands = 1;
- if (TARGET_COLDFIRE && !reload_completed)
- operands[1] = force_reg (<MODE>mode, operands[1]);
-})
+ "m68k_last_compare_had_fp_operands = 1;")

-(define_insn "cmp<mode>_68881"
+(define_insn "*cmp<mode>_68881"
[(set (cc0)
- (compare (match_operand:FP 0 "general_operand" "f,m<FP:const>")
- (match_operand:FP 1 "general_operand" "f<FP:dreg>m<FP:const>,f")))]
+ (compare (match_operand:FP 0 "register_operand" "f,f")
+ (match_operand:FP 1 "fp_src_operand" "f,<FP:dreg>mF")))]
"TARGET_68881"
-{
- cc_status.flags = CC_IN_68881;
- if (FP_REG_P (operands[0]))
- {
- if (FP_REG_P (operands[1]))
- return "fcmp%.x %1,%0";
- else
- return "fcmp%.<FP:prec> %f1,%0";
- }
- cc_status.flags |= CC_REVERSED;
- return "fcmp%.<FP:prec> %f0,%1";
-})
+ "@
+ fcmp%.x %1,%0
+ fcmp%.<FP:prec> %f1,%0")

-(define_insn "cmp<mode>_cf"
+(define_insn "*cmp<mode>_cf"
[(set (cc0)
- (compare (match_operand:FP 0 "general_operand" "f,<FP:dreg><Q>U")
- (match_operand:FP 1 "general_operand" "f<FP:dreg><Q>U,f")))]
+ (compare (match_operand:FP 0 "register_operand" "f,f")
+ (match_operand:FP 1 "fp_src_operand" "f,f<FP:dreg><Q>U")))]
"TARGET_COLDFIRE_FPU"
-{
- cc_status.flags = CC_IN_68881;
- if (FP_REG_P (operands[0]))
- {
- if (FP_REG_P (operands[1]))
- return "fcmp%.d %1,%0";
- else
- return "fcmp%.<FP:prec> %f1,%0";
- }
- cc_status.flags |= CC_REVERSED;
- return "fcmp%.<FP:prec> %f0,%1";
-})
+ "@
+ fcmp%.d %1,%0
+ fcmp%.<FP:prec> %f1,%0")

;; Recognizers for btst instructions.

Index: egcs/gcc/config/m68k/predicates.md
===================================================================
--- egcs.orig/gcc/config/m68k/predicates.md
+++ egcs/gcc/config/m68k/predicates.md
@@ -181,3 +181,13 @@
(define_predicate "pre_dec_operand"
(and (match_code "mem")
(match_test "GET_CODE (XEXP (op, 0)) == PRE_DEC")))
+
+;; Special case of general_src_operand, which rejects a few fp
+;; constants (which we prefer in registers) before reload.
+
+(define_predicate "fp_src_operand"
+ (match_operand 0 "general_src_operand")
+{
+ return !CONSTANT_P (op) || !standard_68881_constant_p (op)
+ || reload_in_progress || reload_completed;
+})

--
Jeffrey Law
2007-02-05 17:27:28 UTC
Post by z***@linux-m68k.org
Hi,
This is another patch which originally addressed a reload problem by
avoiding to generate a load from the constant pool in the first place.
It cleans up the predicate handling and rejects certain constants (that
we want in registers) early, but allows them again during reload for the
case that reload has no register left and it tries the immediate
argument first, before generating a memory reference.
The fp compare pattern are now much closer to the real instructions, so
that the main work is done by combine and avoids unneccessary reloads.
* config/m68k/m68k.c (notice_update_cc): recognize fp compare
(moved from fp compare patterns).
* config/m68k/m68k.md (cmp?f): cleanup predicates to relieve
reload.
* gcc/config/m68k/predicates.md (fp_src_operand): New, reject
certain constants early.
OK.
Jeff
Richard Sandiford
2007-02-05 20:08:05 UTC
I like the idea here, but I'm a little worried about some details.
Post by z***@linux-m68k.org
(define_expand "cmp<mode>"
[(set (cc0)
- (compare (match_operand:FP 0 "general_operand" "")
- (match_operand:FP 1 "general_operand" "")))]
+ (compare (match_operand:FP 0 "register_operand" "")
+ (match_operand:FP 1 "fp_src_operand" "")))]
"TARGET_HARD_FLOAT"
-{
- m68k_last_compare_had_fp_operands = 1;
- if (TARGET_COLDFIRE && !reload_completed)
- operands[1] = force_reg (<MODE>mode, operands[1]);
-})
+ "m68k_last_compare_had_fp_operands = 1;")
It looks to me like you deleted the force_reg here without replacing
it with something else. This ColdFire code seems to have been a
half-measure that goes some way to doing what your patch does,
but it only did it for ColdFire, and (wrongly) didn't enforce the
same condition in the insn itself. (And I've no idea why it
checked reload_completed -- the compare optabs shouldn't be called
after reload.)

But the idea of forcing the operand into a register for ColdFire was a
good one, and we shouldn't lose it. The patch could be a performance
regression on ColdFire as-is.

We should either keep the force_reg, or restrict fp_src_operand to
TARGET_COLDFIRE_FPU. I think the latter is easier because...
Post by z***@linux-m68k.org
-(define_insn "cmp<mode>_cf"
+(define_insn "*cmp<mode>_cf"
[(set (cc0)
- (compare (match_operand:FP 0 "general_operand" "f,<FP:dreg><Q>U")
- (match_operand:FP 1 "general_operand" "f<FP:dreg><Q>U,f")))]
+ (compare (match_operand:FP 0 "register_operand" "f,f")
+ (match_operand:FP 1 "fp_src_operand" "f,f<FP:dreg><Q>U")))]
...it would then work as-is here too.

Nitpick: the second "f" in operand 1's constraints is redundant.
Post by z***@linux-m68k.org
-{
- cc_status.flags = CC_IN_68881;
- if (FP_REG_P (operands[0]))
- {
- if (FP_REG_P (operands[1]))
- return "fcmp%.d %1,%0";
- else
- return "fcmp%.<FP:prec> %f1,%0";
- }
- cc_status.flags |= CC_REVERSED;
- return "fcmp%.<FP:prec> %f0,%1";
-})
+ fcmp%.d %1,%0
+ fcmp%.<FP:prec> %f1,%0")
You didn't say why you were getting rid of the CC_REVERSED stuff.
Are you sure it isn't useful?

Richard
Roman Zippel
2007-02-05 20:56:44 UTC
Hi,
Post by Richard Sandiford
I like the idea here, but I'm a little worried about some details.
Post by z***@linux-m68k.org
(define_expand "cmp<mode>"
[(set (cc0)
- (compare (match_operand:FP 0 "general_operand" "")
- (match_operand:FP 1 "general_operand" "")))]
+ (compare (match_operand:FP 0 "register_operand" "")
+ (match_operand:FP 1 "fp_src_operand" "")))]
"TARGET_HARD_FLOAT"
-{
- m68k_last_compare_had_fp_operands = 1;
- if (TARGET_COLDFIRE && !reload_completed)
- operands[1] = force_reg (<MODE>mode, operands[1]);
-})
+ "m68k_last_compare_had_fp_operands = 1;")
It looks to me like you deleted the force_reg here without replacing
it with something else. This ColdFire code seems to have been a
half-measure that goes some way to doing what your patch does,
but it only did it for ColdFire, and (wrongly) didn't enforce the
same condition in the insn itself. (And I've no idea why it
checked reload_completed -- the compare optabs shouldn't be called
after reload.)
But the idea of forcing the operand into a register for ColdFire was a
good one, and we shouldn't lose it. The patch could be a performance
regression on ColdFire as-is.
One of the operands is still forced into a register.
I don't think this will cause a performance regression, as combine later
has all the freedom to put the value back.
Post by Richard Sandiford
We should either keep the force_reg, or restrict fp_src_operand to
TARGET_COLDFIRE_FPU. I think the latter is easier because...
Post by z***@linux-m68k.org
-(define_insn "cmp<mode>_cf"
+(define_insn "*cmp<mode>_cf"
[(set (cc0)
- (compare (match_operand:FP 0 "general_operand" "f,<FP:dreg><Q>U")
- (match_operand:FP 1 "general_operand" "f<FP:dreg><Q>U,f")))]
+ (compare (match_operand:FP 0 "register_operand" "f,f")
+ (match_operand:FP 1 "fp_src_operand" "f,f<FP:dreg><Q>U")))]
...it would then work as-is here too.
Why do you want to force _both_ values into a register? The old code
didn't do this, and I don't see why it would be any advantage.
Post by Richard Sandiford
Post by z***@linux-m68k.org
- cc_status.flags |= CC_REVERSED;
- return "fcmp%.<FP:prec> %f0,%1";
-})
+ fcmp%.d %1,%0
+ fcmp%.<FP:prec> %f1,%0")
You didn't say why you were getting rid of the CC_REVERSED stuff.
Are you sure it isn't useful?
Because it's done in combine now. With the old pattern the register
operand could be reloaded into either of the two operands, which isn't a
problem with the new pattern anymore.

bye, Roman
Richard Sandiford
2007-02-05 21:11:18 UTC
Post by Roman Zippel
Post by Richard Sandiford
I like the idea here, but I'm a little worried about some details.
Post by z***@linux-m68k.org
(define_expand "cmp<mode>"
[(set (cc0)
- (compare (match_operand:FP 0 "general_operand" "")
- (match_operand:FP 1 "general_operand" "")))]
+ (compare (match_operand:FP 0 "register_operand" "")
+ (match_operand:FP 1 "fp_src_operand" "")))]
"TARGET_HARD_FLOAT"
-{
- m68k_last_compare_had_fp_operands = 1;
- if (TARGET_COLDFIRE && !reload_completed)
- operands[1] = force_reg (<MODE>mode, operands[1]);
-})
+ "m68k_last_compare_had_fp_operands = 1;")
It looks to me like you deleted the force_reg here without replacing
it with something else. This ColdFire code seems to have been a
half-measure that goes some way to doing what your patch does,
but it only did it for ColdFire, and (wrongly) didn't enforce the
same condition in the insn itself. (And I've no idea why it
checked reload_completed -- the compare optabs shouldn't be called
after reload.)
But the idea of forcing the operand into a register for ColdFire was a
good one, and we shouldn't lose it. The patch could be a performance
regression on ColdFire as-is.
One of the operands is still forced into a register.
Yeah, but the other operand. ;) The operands aren't commutative after all.
Post by Roman Zippel
I don't think this will cause a perfomance regression, as combine later
had all the freedom to put the value back.
But it would only do that if the compare instruction was the only user.
combine wouldn't have got rid of the copy if there were _two_ users.

Do you disagree that fp_src_operand should reject constants for
TARGET_COLDFIRE_FPU?
Post by Roman Zippel
Post by Richard Sandiford
We should either keep the force_reg, or restrict fp_src_operand to
TARGET_COLDFIRE_FPU. I think the latter is easier because...
Post by z***@linux-m68k.org
-(define_insn "cmp<mode>_cf"
+(define_insn "*cmp<mode>_cf"
[(set (cc0)
- (compare (match_operand:FP 0 "general_operand" "f,<FP:dreg><Q>U")
- (match_operand:FP 1 "general_operand" "f<FP:dreg><Q>U,f")))]
+ (compare (match_operand:FP 0 "register_operand" "f,f")
+ (match_operand:FP 1 "fp_src_operand" "f,f<FP:dreg><Q>U")))]
...it would then work as-is here too.
Why do you want to force _both_ values into a register, the old code
didn't do this and I don't see a reason why it would be any advantage.
Well, like I say, the old code was a half-measure. (It was presumably
better than nothing though, otherwise it would never have been added.)
What we really want is to force one operand to be a nonimmediate_operand
and one to be a register_operand. As I say, the best way to do that
is in fp_src_operand. All I'm really saying is that fp_src_operand
should reject constants if TARGET_COLDFIRE_FPU.
Post by Roman Zippel
Post by Richard Sandiford
Post by z***@linux-m68k.org
- cc_status.flags |= CC_REVERSED;
- return "fcmp%.<FP:prec> %f0,%1";
-})
+ fcmp%.d %1,%0
+ fcmp%.<FP:prec> %f1,%0")
You didn't say why you were getting rid of the CC_REVERSED stuff.
Are you sure it isn't useful?
Because it's done in combine now. With the old pattern the register
operand could be reloaded into any of the two operands, which isn't a
problem with the new pattern anymore.
But why was it a problem? Consider the case where we have:

(set (cc0)
(compare (reg:SF PSEUDO1)
(reg:SF PSEUDO2)))

before reload. PSEUDO2 gets allocated a hard register H and PSEUDO1
gets spilled to the stack at address A. The old pattern allowed:

(set (cc0)
(compare (mem:SF A)
(reg:SF H)))

keeping the comparison at one instruction. Yours forces reload to
move (mem:SF A) into a temporary register T first, which is an
extra instruction and also risks kicking out whatever
the register allocators put into T.

Richard
Roman Zippel
2007-02-05 22:14:43 UTC
Hi,
Post by Richard Sandiford
Post by Roman Zippel
Post by Richard Sandiford
It looks to me like you deleted the force_reg here without replacing
it with something else. This ColdFire code seems to have been a
half-measure that goes some way to doing what your patch does,
but it only did it for ColdFire, and (wrongly) didn't enforce the
same condition in the insn itself. (And I've no idea why it
checked reload_completed -- the compare optabs shouldn't be called
after reload.)
But the idea of forcing the operand into a register for ColdFire was a
good one, and we shouldn't lose it. The patch could be a performance
regression on ColdFire as-is.
One of the operands is still forced into a register.
Yeah, but the other operand. ;) The operands aren't commutative after all.
What difference does it make at this point?
Post by Richard Sandiford
Post by Roman Zippel
I don't think this will cause a perfomance regression, as combine later
had all the freedom to put the value back.
But it would only do that if the compare instruction was the only user.
combine wouldn't have got rid of the copy if there were _two_ users.
AFAIK [g]cse should have put the value into a register in this case.
Post by Richard Sandiford
Post by Roman Zippel
Post by Richard Sandiford
We should either keep the force_reg, or restrict fp_src_operand to
TARGET_COLDFIRE_FPU. I think the latter is easier because...
Post by z***@linux-m68k.org
-(define_insn "cmp<mode>_cf"
+(define_insn "*cmp<mode>_cf"
[(set (cc0)
- (compare (match_operand:FP 0 "general_operand" "f,<FP:dreg><Q>U")
- (match_operand:FP 1 "general_operand" "f<FP:dreg><Q>U,f")))]
+ (compare (match_operand:FP 0 "register_operand" "f,f")
+ (match_operand:FP 1 "fp_src_operand" "f,f<FP:dreg><Q>U")))]
...it would then work as-is here too.
Why do you want to force _both_ values into a register, the old code
didn't do this and I don't see a reason why it would be any advantage.
Well, like I say, the old code was a half-measure. (It was presumably
better than nothing though, otherwise it would never have been added.)
What we really want is to force one operand to be a nonimmediate_operand
and one to be a register_operand. As I say, the best way to do that
is in fp_src_operand. All I'm really saying is that fp_src_operand
should reject constants if TARGET_COLDFIRE_FPU.
Any constants? Why?
Post by Richard Sandiford
Post by Roman Zippel
Post by Richard Sandiford
Post by z***@linux-m68k.org
- cc_status.flags |= CC_REVERSED;
- return "fcmp%.<FP:prec> %f0,%1";
-})
+ fcmp%.d %1,%0
+ fcmp%.<FP:prec> %f1,%0")
You didn't say why you were getting rid of the CC_REVERSED stuff.
Are you sure it isn't useful?
Because it's done in combine now. With the old pattern the register
operand could be reloaded into any of the two operands, which isn't a
problem with the new pattern anymore.
(set (cc0)
(compare (reg:SF PSEUDO1)
(reg:SF PSEUDO2)))
before reload. PSEUDO2 gets allocated a hard register H and PSEUDO1
(set (cc0)
(compare (mem:SF A)
(reg:SF H)))
keeping the comparison at one instruction. Yours forces reload to
move (mem:SF A) into a temporary register T first, which is both an
extra instruction, and has the possiblity of kicking out whatever
the register allocators put into T.
I'm trying to avoid unnecessary reloads by putting one of the values
into a register before reload. In your case reload already has too
few registers and has to do reloads anyway, and it can still make another
attempt at reducing reloads some other way. That makes it a relatively
special case, whereas I'm trying to do something for the general case.
I wouldn't mind helping reload for this case, but I'd prefer it if that
were possible without losing the more general improvements.

bye, Roman
Richard Sandiford
2007-02-05 22:47:34 UTC
Post by Roman Zippel
Post by Richard Sandiford
Post by Roman Zippel
Post by Richard Sandiford
It looks to me like you deleted the force_reg here without replacing
it with something else. This ColdFire code seems to have been a
half-measure that goes some way to doing what your patch does,
but it only did it for ColdFire, and (wrongly) didn't enforce the
same condition in the insn itself. (And I've no idea why it
checked reload_completed -- the compare optabs shouldn't be called
after reload.)
But the idea of forcing the operand into a register for ColdFire was a
good one, and we shouldn't lose it. The patch could be a performance
regression on ColdFire as-is.
One of the operands is still forced into a register.
Yeah, but the other operand. ;) The operands aren't commutative after all.
What difference does it make at this point?
The difference is that conditions are AIUI canonicalised at the tree level
in such a way that the second operand is likely to the non-constant once.
Consider:

int foo (float f) { return 1.0 > f; }

which becomes:

return f < 1.0e+0

So forcing constant first operands into a register is not as effective
as forcing constant second operands in a register.

The basic rule is: if the hardware insn needs a separate register load,
expose it as early as possible. I thought that was the point of your
patch (and I agree it's a good idea). I'm just trying to point out
a problem with the ColdFire side of things.
Post by Roman Zippel
Post by Richard Sandiford
Well, like I say, the old code was a half-measure. (It was presumably
better than nothing though, otherwise it would never have been added.)
What we really want is to force one operand to be a nonimmediate_operand
and one to be a register_operand. As I say, the best way to do that
is in fp_src_operand. All I'm really saying is that fp_src_operand
should reject constants if TARGET_COLDFIRE_FPU.
Any constants? Why?
Yes, and because the ColdFire FPU's comparison insns don't allow
constant operands. They're not as rich as the 68881 insns.
Post by Roman Zippel
Post by Richard Sandiford
Post by Roman Zippel
Post by Richard Sandiford
Post by z***@linux-m68k.org
- cc_status.flags |= CC_REVERSED;
- return "fcmp%.<FP:prec> %f0,%1";
-})
+ fcmp%.d %1,%0
+ fcmp%.<FP:prec> %f1,%0")
You didn't say why you were getting rid of the CC_REVERSED stuff.
Are you sure it isn't useful?
Because it's done in combine now. With the old pattern the register
operand could be reloaded into any of the two operands, which isn't a
problem with the new pattern anymore.
(set (cc0)
(compare (reg:SF PSEUDO1)
(reg:SF PSEUDO2)))
before reload. PSEUDO2 gets allocated a hard register H and PSEUDO1
(set (cc0)
(compare (mem:SF A)
(reg:SF H)))
keeping the comparison at one instruction. Yours forces reload to
move (mem:SF A) into a temporary register T first, which is both an
extra instruction, and has the possiblity of kicking out whatever
the register allocators put into T.
I'm trying to avoid unnecessary reloads by putting one of the values
already into a register before reload. In your case reload already has too
few registers and has to do reloads anyway and can still make another
try at reducing reloads some other way.
Well, AIUI, reload isn't smart enough to reverse comparisons.
Nor should it be. It's already too much of a kitchen-sink pass.
I don't think we can really expect reload to avoid a temporary
register in this case _unless_ we allow operand 0 to be a MEM.
Post by Roman Zippel
This makes it a relative special case, where I'm trying to do
something for the general case. I wouldn't mind to help reload for
this case, but I'd prefer if it were possible without losing the more
general improvements.
I don't think it has to be "one or the other". The standard idiom,
as used in the move patterns on most RISC targets, is to use the most
permissive applicable predicate for both operands, then use both the
constraints _and_ the insn condition to force one of the operands to be
a register. That way you get the desired restrictions before, during,
and after reload. E.g.:

---------------------------------------------------------------------------
(define_insn "*cmp<mode>_cf"
[(set (cc0)
(compare (match_operand:FP 0 "fp_src_operand" "f,<FP:dreg><Q>U")
(match_operand:FP 1 "fp_src_operand" "f<FP:dreg><Q>U,f")))]
"TARGET_COLDFIRE_FPU
&& (register_operand (operands[0], VOIDmode)
|| register_operand (operands[1], VOIDmode))"
{
cc_status.flags = CC_IN_68881;
if (FP_REG_P (operands[0]))
{
if (FP_REG_P (operands[1]))
return "fcmp%.d %1,%0";
else
return "fcmp%.<FP:prec> %f1,%0";
}
cc_status.flags |= CC_REVERSED;
return "fcmp%.<FP:prec> %f0,%1";
})
---------------------------------------------------------------------------

and similar for the m68k. (Untested.)

Richard
Roman Zippel
2007-02-06 03:26:33 UTC
Hi,
Post by Richard Sandiford
Post by Roman Zippel
What difference does it make at this point?
The difference is that conditions are AIUI canonicalised at the tree level
in such a way that the second operand is likely to the non-constant once.
Ah, I forgot about this.
Post by Richard Sandiford
Post by Roman Zippel
Post by Richard Sandiford
is in fp_src_operand. All I'm really saying is that fp_src_operand
should reject constants if TARGET_COLDFIRE_FPU.
Any constants? Why?
Yes, and because the ColdFire FPU's comparison insns don't allow
constant operands. They're not as rich as the 68881 insns.
Then it makes sense of course.
Post by Richard Sandiford
I don't think it has to be "one or the other". The standard idiom,
as used in the move patterns on most RISC targets, is to use the most
permissive applicable predicate for both operands, then use both the
constraints _and_ the insn condition to force one of the operands to be
a register. That way you get the desired restrictions before, during,
---------------------------------------------------------------------------
(define_insn "*cmp<mode>_cf"
[(set (cc0)
(compare (match_operand:FP 0 "fp_src_operand" "f,<FP:dreg><Q>U")
(match_operand:FP 1 "fp_src_operand" "f<FP:dreg><Q>U,f")))]
"TARGET_COLDFIRE_FPU
&& (register_operand (operands[0], VOIDmode)
|| register_operand (operands[1], VOIDmode))"
{
cc_status.flags = CC_IN_68881;
if (FP_REG_P (operands[0]))
{
if (FP_REG_P (operands[1]))
return "fcmp%.d %1,%0";
else
return "fcmp%.<FP:prec> %f1,%0";
}
cc_status.flags |= CC_REVERSED;
return "fcmp%.<FP:prec> %f0,%1";
})
---------------------------------------------------------------------------
and similar for the m68k. (Untested.)
Ok, below is an updated patch, which also rejects all constants for
ColdFire. (Lightly tested.)

bye, Roman

---
gcc/config/m68k/m68k.c | 7 ++++
gcc/config/m68k/m68k.md | 62 +++++++++++++++---------------------------
gcc/config/m68k/predicates.md | 11 +++++++
3 files changed, 41 insertions(+), 39 deletions(-)

Index: egcs/gcc/config/m68k/m68k.c
===================================================================
--- egcs.orig/gcc/config/m68k/m68k.c
+++ egcs/gcc/config/m68k/m68k.c
@@ -2875,6 +2875,13 @@ notice_update_cc (rtx exp, rtx insn)
if (((cc_status.value1 && FP_REG_P (cc_status.value1))
|| (cc_status.value2 && FP_REG_P (cc_status.value2))))
cc_status.flags = CC_IN_68881;
+ if (cc_status.value2 && GET_CODE (cc_status.value2) == COMPARE
+ && GET_MODE_CLASS (GET_MODE (XEXP (cc_status.value2, 0))) == MODE_FLOAT)
+ {
+ cc_status.flags = CC_IN_68881;
+ if (!FP_REG_P (XEXP (cc_status.value2, 0)))
+ cc_status.flags |= CC_REVERSED;
+ }
}

const char *
Index: egcs/gcc/config/m68k/m68k.md
===================================================================
--- egcs.orig/gcc/config/m68k/m68k.md
+++ egcs/gcc/config/m68k/m68k.md
@@ -406,50 +406,34 @@

(define_expand "cmp<mode>"
[(set (cc0)
- (compare (match_operand:FP 0 "general_operand" "")
- (match_operand:FP 1 "general_operand" "")))]
+ (compare (match_operand:FP 0 "register_operand" "")
+ (match_operand:FP 1 "fp_src_operand" "")))]
"TARGET_HARD_FLOAT"
-{
- m68k_last_compare_had_fp_operands = 1;
- if (TARGET_COLDFIRE && !reload_completed)
- operands[1] = force_reg (<MODE>mode, operands[1]);
-})
+ "m68k_last_compare_had_fp_operands = 1;")

-(define_insn "cmp<mode>_68881"
+(define_insn "*cmp<mode>_68881"
[(set (cc0)
- (compare (match_operand:FP 0 "general_operand" "f,m<FP:const>")
- (match_operand:FP 1 "general_operand" "f<FP:dreg>m<FP:const>,f")))]
- "TARGET_68881"
-{
- cc_status.flags = CC_IN_68881;
- if (FP_REG_P (operands[0]))
- {
- if (FP_REG_P (operands[1]))
- return "fcmp%.x %1,%0";
- else
- return "fcmp%.<FP:prec> %f1,%0";
- }
- cc_status.flags |= CC_REVERSED;
- return "fcmp%.<FP:prec> %f0,%1";
-})
+ (compare (match_operand:FP 0 "fp_src_operand" "f,f,<FP:dreg>mF")
+ (match_operand:FP 1 "fp_src_operand" "f,<FP:dreg>mF,f")))]
+ "TARGET_68881
+ && (register_operand (operands[0], <MODE>mode)
+ || register_operand (operands[1], <MODE>mode))"
+ "@
+ fcmp%.x %1,%0
+ fcmp%.<FP:prec> %f1,%0
+ fcmp%.<FP:prec> %0,%f1")

-(define_insn "cmp<mode>_cf"
+(define_insn "*cmp<mode>_cf"
[(set (cc0)
- (compare (match_operand:FP 0 "general_operand" "f,<FP:dreg><Q>U")
- (match_operand:FP 1 "general_operand" "f<FP:dreg><Q>U,f")))]
- "TARGET_COLDFIRE_FPU"
-{
- cc_status.flags = CC_IN_68881;
- if (FP_REG_P (operands[0]))
- {
- if (FP_REG_P (operands[1]))
- return "fcmp%.d %1,%0";
- else
- return "fcmp%.<FP:prec> %f1,%0";
- }
- cc_status.flags |= CC_REVERSED;
- return "fcmp%.<FP:prec> %f0,%1";
-})
+ (compare (match_operand:FP 0 "fp_src_operand" "f,f,<FP:dreg><Q>U")
+ (match_operand:FP 1 "fp_src_operand" "f,<FP:dreg><Q>U,f")))]
+ "TARGET_COLDFIRE_FPU
+ && (register_operand (operands[0], <MODE>mode)
+ || register_operand (operands[1], <MODE>mode))"
+ "@
+ fcmp%.d %1,%0
+ fcmp%.<FP:prec> %f1,%0
+ fcmp%.<FP:prec> %0,%f1")

;; Recognizers for btst instructions.

Index: egcs/gcc/config/m68k/predicates.md
===================================================================
--- egcs.orig/gcc/config/m68k/predicates.md
+++ egcs/gcc/config/m68k/predicates.md
@@ -181,3 +181,14 @@
(define_predicate "pre_dec_operand"
(and (match_code "mem")
(match_test "GET_CODE (XEXP (op, 0)) == PRE_DEC")))
+
+;; Special case of general_src_operand, which rejects a few fp
+;; constants (which we prefer in registers) before reload.
+
+(define_predicate "fp_src_operand"
+ (match_operand 0 "general_src_operand")
+{
+ return !CONSTANT_P (op) || (!TARGET_COLDFIRE_FPU
+ && !standard_68881_constant_p (op))
+ || reload_in_progress || reload_completed;
+})
Richard Sandiford
2007-02-06 07:24:42 UTC
Post by Roman Zippel
Post by Richard Sandiford
Post by Roman Zippel
What difference does it make at this point?
The difference is that conditions are AIUI canonicalised at the tree level
in such a way that the second operand is likely to the non-constant once.
Ah, I forgot about this.
Oops, sorry, I typoed. I meant: ...in such a way that the second operand
is likely to be the _constant one_. (I think you knew that, but just for
the record.)
Post by Roman Zippel
Ok, below is an updated patch, which also rejects all constants for
ColdFire. (Ligthly tested.)
+;; Special case of general_src_operand, which rejects a few fp
+;; constants (which we prefer in registers) before reload.
+
+(define_predicate "fp_src_operand"
+ (match_operand 0 "general_src_operand")
+{
+ return !CONSTANT_P (op) || (!TARGET_COLDFIRE_FPU
+ && !standard_68881_constant_p (op))
+ || reload_in_progress || reload_completed;
+})
...the ColdFire patterns shouldn't really accept constants at any
stage. The different pre-reload and mid/post-reload handling of
special constants doesn't apply to ColdFire either.

Also -- very minor -- TARGET_68881 would be better than
!TARGET_COLDFIRE_FPU, especially as you're calling a function
with 68881 in its name anyway.

So I think the predicate should look something like this:

-------------------------------------------------------------------------
(define_predicate "fp_src_operand"
(match_operand 0 "general_src_operand")
{
return (!CONSTANT_P (op)
|| (TARGET_68881
&& (!standard_68881_constant_p (op)
|| reload_in_progress
|| reload_completed)));
})
-------------------------------------------------------------------------

The patch looks good to me otherwise FWIW.

Richard
Roman Zippel
2007-02-07 21:25:44 UTC
Hi,
Post by Richard Sandiford
Post by Roman Zippel
Post by Richard Sandiford
Post by Roman Zippel
What difference does it make at this point?
The difference is that conditions are AIUI canonicalised at the tree level
in such a way that the second operand is likely to the non-constant once.
Ah, I forgot about this.
Oops, sorry, I typoed. I meant: ...in such a way that the second operand
is likely to the _constant one_. (I think you knew that, but just for
the record.)
Yes. :)
BTW I looked closer at the code: constants are currently always pushed
into a register, as prepare_cmp_insn() thinks they're too expensive.
Also it's a little annoying that long double constants are completely
rejected by LEGITIMATE_CONSTANT_P(), but maybe that's useful for ColdFire?
Post by Richard Sandiford
Post by Roman Zippel
Ok, below is an updated patch, which also rejects all constants for
ColdFire. (Ligthly tested.)
+;; Special case of general_src_operand, which rejects a few fp
+;; constants (which we prefer in registers) before reload.
+
+(define_predicate "fp_src_operand"
+ (match_operand 0 "general_src_operand")
+{
+ return !CONSTANT_P (op) || (!TARGET_COLDFIRE_FPU
+ && !standard_68881_constant_p (op))
+ || reload_in_progress || reload_completed;
+})
...the ColdFire patterns shouldn't really accept constants at any
stage. The different pre-reload and mid/post-reload handling of
special constants doesn't apply to ColdFire either.
Indeed, I'll change it.

bye, Roman
Jeffrey Law
2007-02-05 21:02:04 UTC
Post by Richard Sandiford
I like the idea here, but I'm a little worried about some details.
Post by z***@linux-m68k.org
(define_expand "cmp<mode>"
[(set (cc0)
- (compare (match_operand:FP 0 "general_operand" "")
- (match_operand:FP 1 "general_operand" "")))]
+ (compare (match_operand:FP 0 "register_operand" "")
+ (match_operand:FP 1 "fp_src_operand" "")))]
"TARGET_HARD_FLOAT"
-{
- m68k_last_compare_had_fp_operands = 1;
- if (TARGET_COLDFIRE && !reload_completed)
- operands[1] = force_reg (<MODE>mode, operands[1]);
-})
+ "m68k_last_compare_had_fp_operands = 1;")
It looks to me like you deleted the force_reg here without replacing
it with something else.
Good catch. Sorry I missed it. I guess the question now becomes
whether ColdFire requires op1 to be a register or just requires
one operand to be a register. The new code will handle
the latter, but not the former.
Post by Richard Sandiford
But the idea of forcing the operand into a register for ColdFire was a
good one, and we shouldn't lose it. The patch could be a performance
regression on ColdFire as-is.
Agreed.
Post by Richard Sandiford
We should either keep the force_reg, or restrict fp_src_operand to
TARGET_COLDFIRE_FPU. I think the latter is easier because...
I think you mean have fp_src_operand reject non-register operands for
TARGET_COLDFIRE, right? That was my first thought when I saw your
objection.

The !reload_completed condition certainly looks strange. You might be
able to ping Paul and/or Peter since I think they added that clause.

Jeff
z***@linux-m68k.org
2007-01-30 11:26:16 UTC
Hi,

gcc has problems generating conditional fp jumps in very large
functions, as it always uses the 16-bit variant. The easy solution is to
use the pseudo-opcodes gas provides, as we do with the normal jump opcodes,
and leave the work to gas.
This also removes many MOTOROLA tests: the jump opcodes used there weren't
really Motorola ops anyway; all of them are gas pseudo-opcodes, and this
now uses the officially documented variant.


2007-01-30 Roman Zippel <***@linux-m68k.org>

* config/m68k/m68k.md: Replace all fbcc with fjcc.

---
gcc/config/m68k/m68k.md | 92 ++++++++++++++----------------------------------
1 file changed, 28 insertions(+), 64 deletions(-)

Index: egcs/gcc/config/m68k/m68k.md
===================================================================
--- egcs.orig/gcc/config/m68k/m68k.md
+++ egcs/gcc/config/m68k/m68k.md
@@ -5777,10 +5777,7 @@
(pc)))]
""
{
- if (MOTOROLA)
- OUTPUT_JUMP ("jbeq %l0", "fbeq %l0", "jbeq %l0");
- else
- OUTPUT_JUMP ("jeq %l0", "fjeq %l0", "jeq %l0");
+ OUTPUT_JUMP ("jeq %l0", "fjeq %l0", "jeq %l0");
})

(define_insn "bne"
@@ -5791,10 +5788,7 @@
(pc)))]
""
{
- if (MOTOROLA)
- OUTPUT_JUMP ("jbne %l0", "fbne %l0", "jbne %l0");
- else
- OUTPUT_JUMP ("jne %l0", "fjne %l0", "jne %l0");
+ OUTPUT_JUMP ("jne %l0", "fjne %l0", "jne %l0");
})

(define_insn "bgt"
@@ -5805,10 +5799,7 @@
(pc)))]
""
{
- if (MOTOROLA)
- OUTPUT_JUMP ("jbgt %l0", "fbgt %l0", 0);
- else
- OUTPUT_JUMP ("jgt %l0", "fjgt %l0", 0);
+ OUTPUT_JUMP ("jgt %l0", "fjgt %l0", 0);
})

(define_insn "bgtu"
@@ -5830,10 +5821,7 @@
(pc)))]
""
{
- if (MOTOROLA)
- OUTPUT_JUMP ("jblt %l0", "fblt %l0", "jbmi %l0");
- else
- OUTPUT_JUMP ("jlt %l0", "fjlt %l0", "jmi %l0");
+ OUTPUT_JUMP ("jlt %l0", "fjlt %l0", "jmi %l0");
})

(define_insn "bltu"
@@ -5855,10 +5843,7 @@
(pc)))]
""
{
- if (MOTOROLA)
- OUTPUT_JUMP ("jbge %l0", "fbge %l0", "jbpl %l0");
- else
- OUTPUT_JUMP ("jge %l0", "fjge %l0", "jpl %l0");
+ OUTPUT_JUMP ("jge %l0", "fjge %l0", "jpl %l0");
})

(define_insn "bgeu"
@@ -5880,10 +5865,7 @@
(pc)))]
""
{
- if (MOTOROLA)
- OUTPUT_JUMP ("jble %l0", "fble %l0", 0);
- else
- OUTPUT_JUMP ("jle %l0", "fjle %l0", 0);
+ OUTPUT_JUMP ("jle %l0", "fjle %l0", 0);
})

(define_insn "bleu"
@@ -5905,7 +5887,7 @@
"TARGET_HARD_FLOAT"
{
gcc_assert (cc_prev_status.flags & CC_IN_68881);
- return MOTOROLA ? "fbor %l0" : "fjor %l0";
+ return "fjor %l0";
})

(define_insn "bunordered"
@@ -5916,7 +5898,7 @@
"TARGET_HARD_FLOAT"
{
gcc_assert (cc_prev_status.flags & CC_IN_68881);
- return MOTOROLA ? "fbun %l0" : "fjun %l0";
+ return "fjun %l0";
})

(define_insn "buneq"
@@ -5927,7 +5909,7 @@
"TARGET_HARD_FLOAT"
{
gcc_assert (cc_prev_status.flags & CC_IN_68881);
- return MOTOROLA ? "fbueq %l0" : "fjueq %l0";
+ return "fjueq %l0";
})

(define_insn "bunge"
@@ -5938,7 +5920,7 @@
"TARGET_HARD_FLOAT"
{
gcc_assert (cc_prev_status.flags & CC_IN_68881);
- return MOTOROLA ? "fbuge %l0" : "fjuge %l0";
+ return "fjuge %l0";
})

(define_insn "bungt"
@@ -5949,7 +5931,7 @@
"TARGET_HARD_FLOAT"
{
gcc_assert (cc_prev_status.flags & CC_IN_68881);
- return MOTOROLA ? "fbugt %l0" : "fjugt %l0";
+ return "fjugt %l0";
})

(define_insn "bunle"
@@ -5960,7 +5942,7 @@
"TARGET_HARD_FLOAT"
{
gcc_assert (cc_prev_status.flags & CC_IN_68881);
- return MOTOROLA ? "fbule %l0" : "fjule %l0";
+ return "fjule %l0";
})

(define_insn "bunlt"
@@ -5971,7 +5953,7 @@
"TARGET_HARD_FLOAT"
{
gcc_assert (cc_prev_status.flags & CC_IN_68881);
- return MOTOROLA ? "fbult %l0" : "fjult %l0";
+ return "fjult %l0";
})

(define_insn "bltgt"
@@ -5982,7 +5964,7 @@
"TARGET_HARD_FLOAT"
{
gcc_assert (cc_prev_status.flags & CC_IN_68881);
- return MOTOROLA ? "fbogl %l0" : "fjogl %l0";
+ return "fjogl %l0";
})

;; Negated conditional jump instructions.
@@ -5995,10 +5977,7 @@
(label_ref (match_operand 0 "" ""))))]
""
{
- if (MOTOROLA)
- OUTPUT_JUMP ("jbne %l0", "fbne %l0", "jbne %l0");
- else
- OUTPUT_JUMP ("jne %l0", "fjne %l0", "jne %l0");
+ OUTPUT_JUMP ("jne %l0", "fjne %l0", "jne %l0");
})

(define_insn ""
@@ -6009,10 +5988,7 @@
(label_ref (match_operand 0 "" ""))))]
""
{
- if (MOTOROLA)
- OUTPUT_JUMP ("jbeq %l0", "fbeq %l0", "jbeq %l0");
- else
- OUTPUT_JUMP ("jeq %l0", "fjeq %l0", "jeq %l0");
+ OUTPUT_JUMP ("jeq %l0", "fjeq %l0", "jeq %l0");
})

(define_insn ""
@@ -6023,10 +5999,7 @@
(label_ref (match_operand 0 "" ""))))]
""
{
- if (MOTOROLA)
- OUTPUT_JUMP ("jble %l0", "fbngt %l0", 0);
- else
- OUTPUT_JUMP ("jle %l0", "fjngt %l0", 0);
+ OUTPUT_JUMP ("jle %l0", "fjngt %l0", 0);
})

(define_insn ""
@@ -6048,10 +6021,7 @@
(label_ref (match_operand 0 "" ""))))]
""
{
- if (MOTOROLA)
- OUTPUT_JUMP ("jbge %l0", "fbnlt %l0", "jbpl %l0");
- else
- OUTPUT_JUMP ("jge %l0", "fjnlt %l0", "jpl %l0");
+ OUTPUT_JUMP ("jge %l0", "fjnlt %l0", "jpl %l0");
})

(define_insn ""
@@ -6073,10 +6043,7 @@
(label_ref (match_operand 0 "" ""))))]
""
{
- if (MOTOROLA)
- OUTPUT_JUMP ("jblt %l0", "fbnge %l0", "jbmi %l0");
- else
- OUTPUT_JUMP ("jlt %l0", "fjnge %l0", "jmi %l0");
+ OUTPUT_JUMP ("jlt %l0", "fjnge %l0", "jmi %l0");
})

(define_insn ""
@@ -6098,10 +6065,7 @@
(label_ref (match_operand 0 "" ""))))]
""
{
- if (MOTOROLA)
- OUTPUT_JUMP ("jbgt %l0", "fbnle %l0", 0);
- else
- OUTPUT_JUMP ("jgt %l0", "fjnle %l0", 0);
+ OUTPUT_JUMP ("jgt %l0", "fjnle %l0", 0);
})

(define_insn ""
@@ -6123,7 +6087,7 @@
"TARGET_HARD_FLOAT"
{
gcc_assert (cc_prev_status.flags & CC_IN_68881);
- return MOTOROLA ? "fbun %l0" : "fjun %l0";
+ return "fjun %l0";
})

(define_insn "*bunordered_rev"
@@ -6134,7 +6098,7 @@
"TARGET_HARD_FLOAT"
{
gcc_assert (cc_prev_status.flags & CC_IN_68881);
- return MOTOROLA ? "fbor %l0" : "fjor %l0";
+ return "fjor %l0";
})

(define_insn "*buneq_rev"
@@ -6145,7 +6109,7 @@
"TARGET_HARD_FLOAT"
{
gcc_assert (cc_prev_status.flags & CC_IN_68881);
- return MOTOROLA ? "fbogl %l0" : "fjogl %l0";
+ return "fjogl %l0";
})

(define_insn "*bunge_rev"
@@ -6156,7 +6120,7 @@
"TARGET_HARD_FLOAT"
{
gcc_assert (cc_prev_status.flags & CC_IN_68881);
- return MOTOROLA ? "fbolt %l0" : "fjolt %l0";
+ return "fjolt %l0";
})

(define_insn "*bungt_rev"
@@ -6167,7 +6131,7 @@
"TARGET_HARD_FLOAT"
{
gcc_assert (cc_prev_status.flags & CC_IN_68881);
- return MOTOROLA ? "fbole %l0" : "fjole %l0";
+ return "fjole %l0";
})

(define_insn "*bunle_rev"
@@ -6178,7 +6142,7 @@
"TARGET_HARD_FLOAT"
{
gcc_assert (cc_prev_status.flags & CC_IN_68881);
- return MOTOROLA ? "fbogt %l0" : "fjogt %l0";
+ return "fjogt %l0";
})

(define_insn "*bunlt_rev"
@@ -6189,7 +6153,7 @@
"TARGET_HARD_FLOAT"
{
gcc_assert (cc_prev_status.flags & CC_IN_68881);
- return MOTOROLA ? "fboge %l0" : "fjoge %l0";
+ return "fjoge %l0";
})

(define_insn "*bltgt_rev"
@@ -6200,7 +6164,7 @@
"TARGET_HARD_FLOAT"
{
gcc_assert (cc_prev_status.flags & CC_IN_68881);
- return MOTOROLA ? "fbueq %l0" : "fjueq %l0";
+ return "fjueq %l0";
})

;; Unconditional and other jump instructions

--
Jeffrey Law
2007-02-05 17:06:42 UTC
Permalink
Post by z***@linux-m68k.org
Hi,
gcc has problems generating conditional fp jumps in very large
functions, as it always uses the 16-bit variant. The easy solution is to
use the pseudo opcodes gas provides, as we do with the normal jump
opcodes, and leave the work to gas.
This also removes many MOTOROLA tests, as the jump opcodes used weren't
really Motorola ops; all of them are gas pseudo opcodes, and this now
uses the officially documented variant.
* config/m68k/m68k.md: replace all fbcc with fjcc
The obvious concern here is whether or not all the other m68k
assemblers accept fjcc. Can you comment on this?

Jeff
z***@linux-m68k.org
2007-01-30 11:26:23 UTC
Permalink
Hi,

This adds support for unwinding through linux signal trampolines.

2007-01-30 Roman Zippel <***@linux-m68k.org>

* config/m68k/linux.h (MD_UNWIND_SUPPORT): Define.
* config/m68k/linux-unwind.h: New file.

---

gcc/config/m68k/linux-unwind.h | 153 +++++++++++++++++++++++++++++++++++++++++
gcc/config/m68k/linux.h | 2
2 files changed, 155 insertions(+)

Index: egcs/gcc/config/m68k/linux-unwind.h
===================================================================
--- /dev/null
+++ egcs/gcc/config/m68k/linux-unwind.h
@@ -0,0 +1,153 @@
+/* DWARF2 EH unwinding support for Linux/m68k.
+ Copyright (C) 2006 Free Software Foundation, Inc.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify
+it under the terms of the GNU General Public License as published by
+the Free Software Foundation; either version 2, or (at your option)
+any later version.
+
+In addition to the permissions in the GNU General Public License, the
+Free Software Foundation gives you unlimited permission to link the
+compiled version of this file with other programs, and to distribute
+those programs without any restriction coming from the use of this
+file. (The General Public License restrictions do apply in other
+respects; for example, they cover modification of the file, and
+distribution when not linked into another program.)
+
+GCC is distributed in the hope that it will be useful,
+but WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+GNU General Public License for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING. If not, write to
+the Free Software Foundation, 51 Franklin Street, Fifth Floor,
+Boston, MA 02110-1301, USA. */
+
+/* Do code reading to identify a signal frame, and set the frame
+ state data appropriately. See unwind-dw2.c for the structs.
+ Don't use this at all if inhibit_libc is used. */
+
+#ifndef inhibit_libc
+
+#include <signal.h>
+
+/* <sys/ucontext.h> is unfortunaly broken right now */
+struct uw_ucontext {
+ unsigned long uc_flags;
+ struct ucontext *uc_link;
+ stack_t uc_stack;
+ mcontext_t uc_mcontext;
+ unsigned long uc_filler[80];
+ __sigset_t uc_sigmask;
+};
+
+#define MD_FALLBACK_FRAME_STATE_FOR m68k_fallback_frame_state
+
+#ifdef __mcoldfire__
+#define M68K_FP_SIZE 8
+#else
+#define M68K_FP_SIZE 12
+#endif
+
+static _Unwind_Reason_Code
+m68k_fallback_frame_state (struct _Unwind_Context *context,
+ _Unwind_FrameState *fs)
+{
+ unsigned short *pc = context->ra;
+ long cfa;
+
+ /* moveq #__NR_sigreturn,%d0; trap #0 */
+ if (pc[0] == 0x7077 && pc[1] == 0x4e40)
+ {
+ struct sigcontext *sc;
+
+ /* Context is passed as the 3rd argument. */
+ sc = *(struct sigcontext **) (context->cfa + 8);
+
+ cfa = sc->sc_usp;
+ fs->regs.cfa_how = CFA_REG_OFFSET;
+ fs->regs.cfa_reg = 15;
+ fs->regs.cfa_offset = cfa - (long) context->cfa;
+
+ fs->regs.reg[0].how = REG_SAVED_OFFSET;
+ fs->regs.reg[0].loc.offset = (long) &sc->sc_d0 - cfa;
+ fs->regs.reg[1].how = REG_SAVED_OFFSET;
+ fs->regs.reg[1].loc.offset = (long) &sc->sc_d1 - cfa;
+ fs->regs.reg[8].how = REG_SAVED_OFFSET;
+ fs->regs.reg[8].loc.offset = (long) &sc->sc_a0 - cfa;
+ fs->regs.reg[9].how = REG_SAVED_OFFSET;
+ fs->regs.reg[9].loc.offset = (long) &sc->sc_a1 - cfa;
+
+ fs->regs.reg[24].how = REG_SAVED_OFFSET;
+ fs->regs.reg[24].loc.offset = (long) &sc->sc_pc - cfa;
+
+ if (*(int *) sc->sc_fpstate)
+ {
+ int *fpregs = (int *) sc->sc_fpregs;
+
+ fs->regs.reg[16].how = REG_SAVED_OFFSET;
+ fs->regs.reg[16].loc.offset = (long) &fpregs[0] - cfa;
+ fs->regs.reg[17].how = REG_SAVED_OFFSET;
+ fs->regs.reg[17].loc.offset = (long) &fpregs[M68K_FP_SIZE/4] - cfa;
+ }
+ }
+#ifdef __mcoldfire__
+ /* move.l #__NR_rt_sigreturn,%d0; trap #0 */
+ else if (pc[0] == 0x203c && pc[1] == 0x0000 &&
+ pc[2] == 0x00ad && pc[3] == 0x4e40)
+#else
+ /* moveq #~__NR_rt_sigreturn,%d0; not.b %d0; trap #0 */
+ else if (pc[0] == 0x7052 && pc[1] == 0x4600 && pc[2] == 0x4e40)
+#endif
+ {
+ struct uw_ucontext *uc;
+ greg_t *gregs;
+ int i;
+
+ /* Context is passed as the 3rd argument. */
+ uc = *(struct uw_ucontext **) (context->cfa + 8);
+
+ gregs = uc->uc_mcontext.gregs;
+ cfa = gregs[15];
+ fs->regs.cfa_how = CFA_REG_OFFSET;
+ fs->regs.cfa_reg = 15;
+ fs->regs.cfa_offset = cfa - (long) context->cfa;
+
+ /* register %d0-%d7/%a0-%a6 */
+ for (i = 0; i <= 14; i++)
+ {
+ fs->regs.reg[i].how = REG_SAVED_OFFSET;
+ fs->regs.reg[i].loc.offset = (long) &gregs[i] - cfa;
+ }
+
+ /* return address */
+ fs->regs.reg[24].how = REG_SAVED_OFFSET;
+ fs->regs.reg[24].loc.offset = (long) &gregs[16] - cfa;
+
+#define uc_fpstate uc_filler[0]
+
+ if (uc->uc_fpstate)
+ {
+ long fpregs = (long) uc->uc_mcontext.fpregs.f_fpregs;
+
+ /* register %fp0-%fp7 */
+ for (i = 16; i <= 23; i++)
+ {
+ fs->regs.reg[i].how = REG_SAVED_OFFSET;
+ fs->regs.reg[i].loc.offset = fpregs - cfa;
+ fpregs += M68K_FP_SIZE;
+ }
+ }
+ }
+ else
+ return _URC_END_OF_STACK;
+
+ fs->retaddr_column = 24;
+ fs->signal_frame = 1;
+
+ return _URC_NO_REASON;
+}
+#endif /* ifdef inhibit_libc */
Index: egcs/gcc/config/m68k/linux.h
===================================================================
--- egcs.orig/gcc/config/m68k/linux.h
+++ egcs/gcc/config/m68k/linux.h
@@ -229,3 +229,5 @@ Boston, MA 02110-1301, USA. */
}

#define TARGET_ASM_FILE_END file_end_indicate_exec_stack
+
+#define MD_UNWIND_SUPPORT "config/m68k/linux-unwind.h"

--
Jeffrey Law
2007-02-05 17:24:04 UTC
Permalink
Post by z***@linux-m68k.org
Hi,
This adds support for unwinding through linux signal trampolines.
* config/m68k/linux.h (MD_UNWIND_SUPPORT): Define.
* config/m68k/linux-unwind.h: New file.
Didn't Richard S. post a patch for this already, or am I
mis-remembering?

jeff
Roman Zippel
2007-02-07 21:12:13 UTC
Permalink
Hi,
Post by Jeffrey Law
Post by z***@linux-m68k.org
This adds support for unwinding through linux signal trampolines.
* config/m68k/linux.h (MD_UNWIND_SUPPORT): Define.
* config/m68k/linux-unwind.h: New file.
Didn't Richard S. post a patch for this already, or am I
mis-remembering?
He did, but it had a few problems; this is my version of the patch with
the ColdFire bits added.

bye, Roman
Jeffrey Law
2007-02-12 17:23:51 UTC
Permalink
Post by Roman Zippel
Hi,
Post by Jeffrey Law
Post by z***@linux-m68k.org
This adds support for unwinding through linux signal trampolines.
* config/m68k/linux.h (MD_UNWIND_SUPPORT): Define.
* config/m68k/linux-unwind.h: New file.
Didn't Richard S. post a patch for this already, or am I
mis-remembering?
He did, but it had a few problems, this is my version of the patch with
the ColdFire bits added.
Richard -- any problems with going with Roman's unwind bits? I'm really
not that qualified to judge the implementation as I'm not a dwarf-unwind
or coldfire expert.

jeff
Richard Sandiford
2007-02-12 17:45:04 UTC
Permalink
Post by Jeffrey Law
Post by Roman Zippel
Hi,
Post by Jeffrey Law
Post by z***@linux-m68k.org
This adds support for unwinding through linux signal trampolines.
* config/m68k/linux.h (MD_UNWIND_SUPPORT): Define.
* config/m68k/linux-unwind.h: New file.
Didn't Richard S. post a patch for this already, or am I
mis-remembering?
He did, but it had a few problems, this is my version of the patch with
the ColdFire bits added.
Richard -- any problems with going with Roman's unwind bits? I'm really
not that qualified to judge the implementation as I'm not a dwarf-unwind
or coldfire expert.
I ran a GNU/Linux regression test with Roman's patch and it seemed
to work fine. I'd be happy to see it applied FWIW.

Richard
Jeffrey Law
2007-02-12 17:54:36 UTC
Permalink
Post by Richard Sandiford
Post by Jeffrey Law
Post by Roman Zippel
Hi,
Post by Jeffrey Law
Post by z***@linux-m68k.org
This adds support for unwinding through linux signal trampolines.
* config/m68k/linux.h (MD_UNWIND_SUPPORT): Define.
* config/m68k/linux-unwind.h: New file.
Didn't Richard S. post a patch for this already, or am I
mis-remembering?
He did, but it had a few problems, this is my version of the patch with
the ColdFire bits added.
Richard -- any problems with going with Roman's unwind bits? I'm really
not that qualified to judge the implementation as I'm not a dwarf-unwind
or coldfire expert.
I ran a GNU/Linux regression test with Roman's patch and it seemed
to work fine. I'd be happy to see it applied FWIW.
OK. Then let's go with Roman's patch.

Jeff
z***@linux-m68k.org
2007-01-30 11:26:28 UTC
Permalink
Hi,

When combine creates a bitfield instruction it assumes simple shift
operations, but that's not quite how it works on m68k: if the bitfield
crosses the register boundary it wraps around, so it actually is a
rotate operation.

This example demonstrates where it goes wrong (reduced from emacs):

#define XINT(v) (((int)(v) << 4) >> 4)
#define XUINT(v) (((unsigned)(v) << 4) >> 4)

int f(int v, int s)
{
return (XUINT(v) >> -XINT(s)) & 0x0fffffff;
}

gcc takes the shift and the mask and generates a bitfield op, but the
real bitfield instruction wraps around at the register end and doesn't
produce the expected result.
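
To illustrate (a rough C model of my own, not part of the patch): for a
register operand the offset of bfextu/bfexts is taken modulo 32, and a
field that runs past bit 31 wraps back around to bit 0, i.e. the
instruction behaves like a rotate followed by a mask:

/* Hypothetical model of the register form of bfextu; the offset counts
   from the most significant bit, width is 1..32.  */
unsigned int
bfextu_reg_model (unsigned int reg, unsigned int offset, unsigned int width)
{
  unsigned int rot, mask;

  /* The offset is taken modulo 32 for a register operand.  */
  offset &= 31;
  /* Rotate left so the field starts at bit 31; bits pushed out at the
     top reappear at the bottom - this is the wraparound.  */
  rot = (reg << offset) | (offset ? reg >> (32 - offset) : 0);
  /* Keep the topmost WIDTH bits and right-justify them.  */
  mask = width < 32 ? ~(~0U >> width) : ~0U;
  return (rot & mask) >> (32 - width);
}

combine instead expects plain (reg << offset) >> (32 - width) semantics,
so the results differ as soon as offset + width runs past the end of the
register.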

I guess it would be quite some work to teach combine about this, and I
don't know how these instructions behave on other ports, so the simple
solution below just disables dynamic offsets and widths for register
arguments; while I'm at it, I also cleaned up the predicates here as well.



2007-01-30 Roman Zippel <***@linux-m68k.org>

* config/m68k/m68k.md (extv,extzv,insv): disable dynamic
parameter for register bitfield operations, general predicates
cleanup

---

gcc/config/m68k/m68k.md | 62 ++++++++++++++++++++++++------------------------
1 file changed, 31 insertions(+), 31 deletions(-)

Index: gcc-4.1/gcc/config/m68k/m68k.md
===================================================================
--- gcc-4.1.orig/gcc/config/m68k/m68k.md
+++ gcc-4.1/gcc/config/m68k/m68k.md
@@ -4991,34 +4991,34 @@
;; so that its address is reloaded.

(define_expand "extv"
- [(set (match_operand:SI 0 "nonimmediate_operand" "")
+ [(set (match_operand:SI 0 "register_operand" "")
(sign_extract:SI (match_operand:SI 1 "general_operand" "")
- (match_operand:SI 2 "general_operand" "")
- (match_operand:SI 3 "general_operand" "")))]
+ (match_operand:SI 2 "const_int_operand" "")
+ (match_operand:SI 3 "const_int_operand" "")))]
"TARGET_68020 && TARGET_BITFIELD"
"")

(define_insn ""
- [(set (match_operand:SI 0 "nonimmediate_operand" "=d")
+ [(set (match_operand:SI 0 "register_operand" "=d")
(sign_extract:SI (match_operand:QI 1 "memory_operand" "o")
- (match_operand:SI 2 "general_operand" "dn")
- (match_operand:SI 3 "general_operand" "dn")))]
+ (match_operand:SI 2 "nonmemory_operand" "dn")
+ (match_operand:SI 3 "nonmemory_operand" "dn")))]
"TARGET_68020 && TARGET_BITFIELD"
"bfexts %1{%b3:%b2},%0")

(define_expand "extzv"
- [(set (match_operand:SI 0 "nonimmediate_operand" "")
+ [(set (match_operand:SI 0 "register_operand" "")
(zero_extract:SI (match_operand:SI 1 "general_operand" "")
- (match_operand:SI 2 "general_operand" "")
- (match_operand:SI 3 "general_operand" "")))]
+ (match_operand:SI 2 "const_int_operand" "")
+ (match_operand:SI 3 "const_int_operand" "")))]
"TARGET_68020 && TARGET_BITFIELD"
"")

(define_insn ""
- [(set (match_operand:SI 0 "nonimmediate_operand" "=d,d")
- (zero_extract:SI (match_operand:QI 1 "memory_operand" "o,d")
- (match_operand:SI 2 "general_operand" "dn,dn")
- (match_operand:SI 3 "general_operand" "dn,dn")))]
+ [(set (match_operand:SI 0 "register_operand" "=d")
+ (zero_extract:SI (match_operand:QI 1 "memory_operand" "o")
+ (match_operand:SI 2 "nonmemory_operand" "dn")
+ (match_operand:SI 3 "nonmemory_operand" "dn")))]
"TARGET_68020 && TARGET_BITFIELD"
{
if (GET_CODE (operands[2]) == CONST_INT)
@@ -5035,8 +5035,8 @@

(define_insn ""
[(set (zero_extract:SI (match_operand:QI 0 "memory_operand" "+o")
- (match_operand:SI 1 "general_operand" "dn")
- (match_operand:SI 2 "general_operand" "dn"))
+ (match_operand:SI 1 "nonmemory_operand" "dn")
+ (match_operand:SI 2 "nonmemory_operand" "dn"))
(xor:SI (zero_extract:SI (match_dup 0) (match_dup 1) (match_dup 2))
(match_operand 3 "const_int_operand" "n")))]
"TARGET_68020 && TARGET_BITFIELD
@@ -5050,8 +5050,8 @@

(define_insn ""
[(set (zero_extract:SI (match_operand:QI 0 "memory_operand" "+o")
- (match_operand:SI 1 "general_operand" "dn")
- (match_operand:SI 2 "general_operand" "dn"))
+ (match_operand:SI 1 "nonmemory_operand" "dn")
+ (match_operand:SI 2 "nonmemory_operand" "dn"))
(const_int 0))]
"TARGET_68020 && TARGET_BITFIELD"
{
@@ -5072,16 +5072,16 @@

(define_expand "insv"
[(set (zero_extract:SI (match_operand:SI 0 "nonimmediate_operand" "")
- (match_operand:SI 1 "general_operand" "")
- (match_operand:SI 2 "general_operand" ""))
+ (match_operand:SI 1 "const_int_operand" "")
+ (match_operand:SI 2 "const_int_operand" ""))
(match_operand:SI 3 "register_operand" ""))]
"TARGET_68020 && TARGET_BITFIELD"
"")

(define_insn ""
[(set (zero_extract:SI (match_operand:QI 0 "memory_operand" "+o")
- (match_operand:SI 1 "general_operand" "dn")
- (match_operand:SI 2 "general_operand" "dn"))
+ (match_operand:SI 1 "nonmemory_operand" "dn")
+ (match_operand:SI 2 "nonmemory_operand" "dn"))
(match_operand:SI 3 "register_operand" "d"))]
"TARGET_68020 && TARGET_BITFIELD"
"bfins %3,%0{%b2:%b1}")
@@ -5092,16 +5092,16 @@
(define_insn ""
[(set (match_operand:SI 0 "nonimmediate_operand" "=d")
(sign_extract:SI (match_operand:SI 1 "register_operand" "d")
- (match_operand:SI 2 "general_operand" "dn")
- (match_operand:SI 3 "general_operand" "dn")))]
+ (match_operand:SI 2 "const_int_operand" "n")
+ (match_operand:SI 3 "const_int_operand" "n")))]
"TARGET_68020 && TARGET_BITFIELD"
"bfexts %1{%b3:%b2},%0")

(define_insn ""
[(set (match_operand:SI 0 "nonimmediate_operand" "=d")
(zero_extract:SI (match_operand:SI 1 "register_operand" "d")
- (match_operand:SI 2 "general_operand" "dn")
- (match_operand:SI 3 "general_operand" "dn")))]
+ (match_operand:SI 2 "const_int_operand" "n")
+ (match_operand:SI 3 "const_int_operand" "n")))]
"TARGET_68020 && TARGET_BITFIELD"
{
if (GET_CODE (operands[2]) == CONST_INT)
@@ -5118,8 +5118,8 @@

(define_insn ""
[(set (zero_extract:SI (match_operand:SI 0 "register_operand" "+d")
- (match_operand:SI 1 "general_operand" "dn")
- (match_operand:SI 2 "general_operand" "dn"))
+ (match_operand:SI 1 "const_int_operand" "n")
+ (match_operand:SI 2 "const_int_operand" "n"))
(const_int 0))]
"TARGET_68020 && TARGET_BITFIELD"
{
@@ -5129,8 +5129,8 @@

(define_insn ""
[(set (zero_extract:SI (match_operand:SI 0 "register_operand" "+d")
- (match_operand:SI 1 "general_operand" "dn")
- (match_operand:SI 2 "general_operand" "dn"))
+ (match_operand:SI 1 "const_int_operand" "n")
+ (match_operand:SI 2 "const_int_operand" "n"))
(const_int -1))]
"TARGET_68020 && TARGET_BITFIELD"
{
@@ -5140,8 +5140,8 @@

(define_insn ""
[(set (zero_extract:SI (match_operand:SI 0 "register_operand" "+d")
- (match_operand:SI 1 "general_operand" "dn")
- (match_operand:SI 2 "general_operand" "dn"))
+ (match_operand:SI 1 "const_int_operand" "n")
+ (match_operand:SI 2 "const_int_operand" "n"))
(match_operand:SI 3 "register_operand" "d"))]
"TARGET_68020 && TARGET_BITFIELD"
{

--
Jeffrey Law
2007-02-05 17:49:47 UTC
Permalink
Post by z***@linux-m68k.org
Hi,
When combine creates a bitfield instruction it assumes simple shift
operations, but it doesn't work quite that way on m68k. If the bitfield
crosses the register boundary it wraps around, thus it actually is a
rotate operation.
#define XINT(v) (((int)(v) << 4) >> 4)
#define XUINT(v) (((unsigned)(v) << 4) >> 4)
int f(int v, int s)
{
return (XUINT(v) >> -XINT(s)) & 0x0fffffff;
}
gcc takes the shift and the mask and generates a bitfield op, but the
real bitfield instruction wraps around at the register end and doesn't
produce the expected result.
I guess it would be quite some work to teach combine about this and I
don't know how these behave on other ports, so the simple solution
below just disables dynamic offsets and widths for register arguments
and while I'm at it I also cleaned up the predicates here as well.
Yea, I don't think we should teach combine about this behavior :-)
Post by z***@linux-m68k.org
* config/m68k/m68k.md (extv,extzv,insv): disable dynamic
parameter for register bitfield operations, general predicates
cleanup
OK.

Jeff
z***@linux-m68k.org
2007-01-30 11:26:19 UTC
Permalink
Hi,

The first part of this patch fixes numerous test suite failures; in
particular, return value handling was broken for various types.
The second part adds closure support.

One question here might be about frame information handling: older
assemblers don't support the cfi directives, but I'd really like to
avoid hardcoding the frame information like other ports do, as that is
impossible to maintain. Is there maybe a way to make this conditional?


2007-01-30 Roman Zippel <***@linux-m68k.org>

* libffi/src/m68k/ffi.c (ffi_prep_args,ffi_prep_cif_machdep),
libffi/src/m68k/sysv.S (ffi_call_SYSV): fix numerous test
suite failures
* libffi/src/m68k/ffi.c (ffi_prep_incoming_args_SYSV,
ffi_closure_SYSV_inner,ffi_prep_closure), libffi/src/m68k/sysv.S
(ffi_closure_SYSV,ffi_closure_struct_SYSV): New, add closure support

---
libffi/src/m68k/ffi.c | 214 ++++++++++++++++++++++++++++++++------------
libffi/src/m68k/ffitarget.h | 3
libffi/src/m68k/sysv.S | 126 +++++++++++++++++++++++--
3 files changed, 273 insertions(+), 70 deletions(-)

Index: egcs/libffi/src/m68k/ffi.c
===================================================================
--- egcs.orig/libffi/src/m68k/ffi.c
+++ egcs/libffi/src/m68k/ffi.c
@@ -8,11 +8,24 @@
#include <ffi_common.h>

#include <stdlib.h>
+#include <unistd.h>
+#include <sys/syscall.h>
+#include <asm/cachectl.h>
+
+void ffi_call_SYSV (extended_cif *,
+ unsigned, unsigned,
+ void *, void (*fn) ());
+void *ffi_prep_args (void *stack, extended_cif *ecif);
+void ffi_closure_SYSV (ffi_closure *);
+void ffi_closure_struct_SYSV (ffi_closure *);
+unsigned int ffi_closure_SYSV_inner (ffi_closure *closure,
+ void *resp, void *args);
+

/* ffi_prep_args is called by the assembly routine once stack space has
been allocated for the function's arguments. */

-static void *
+void *
ffi_prep_args (void *stack, extended_cif *ecif)
{
unsigned int i;
@@ -24,7 +37,7 @@ ffi_prep_args (void *stack, extended_cif
argp = stack;

if (ecif->cif->rtype->type == FFI_TYPE_STRUCT
- && ecif->cif->rtype->size > 8)
+ && !ecif->cif->flags)
struct_value_ptr = ecif->rvalue;
else
struct_value_ptr = NULL;
@@ -37,44 +50,47 @@ ffi_prep_args (void *stack, extended_cif
{
size_t z;

- /* Align if necessary. */
- if (((*p_arg)->alignment - 1) & (unsigned) argp)
- argp = (char *) ALIGN (argp, (*p_arg)->alignment);
-
- z = (*p_arg)->size;
- if (z < sizeof (int))
+ z = (*p_arg)->size;
+ if (z < sizeof (int))
+ {
+ switch ((*p_arg)->type)
{
- switch ((*p_arg)->type)
- {
- case FFI_TYPE_SINT8:
- *(signed int *) argp = (signed int) *(SINT8 *) *p_argv;
- break;
-
- case FFI_TYPE_UINT8:
- *(unsigned int *) argp = (unsigned int) *(UINT8 *) *p_argv;
- break;
-
- case FFI_TYPE_SINT16:
- *(signed int *) argp = (signed int) *(SINT16 *) *p_argv;
- break;
-
- case FFI_TYPE_UINT16:
- *(unsigned int *) argp = (unsigned int) *(UINT16 *) *p_argv;
- break;
-
- case FFI_TYPE_STRUCT:
- memcpy (argp + sizeof (int) - z, *p_argv, z);
- break;
-
- default:
- FFI_ASSERT (0);
- }
- z = sizeof (int);
+ case FFI_TYPE_SINT8:
+ *(signed int *) argp = (signed int) *(SINT8 *) *p_argv;
+ break;
+
+ case FFI_TYPE_UINT8:
+ *(unsigned int *) argp = (unsigned int) *(UINT8 *) *p_argv;
+ break;
+
+ case FFI_TYPE_SINT16:
+ *(signed int *) argp = (signed int) *(SINT16 *) *p_argv;
+ break;
+
+ case FFI_TYPE_UINT16:
+ *(unsigned int *) argp = (unsigned int) *(UINT16 *) *p_argv;
+ break;
+
+ case FFI_TYPE_STRUCT:
+ memcpy (argp + sizeof (int) - z, *p_argv, z);
+ break;
+
+ default:
+ FFI_ASSERT (0);
}
- else
- memcpy (argp, *p_argv, z);
- p_argv++;
- argp += z;
+ z = sizeof (int);
+ }
+ else
+ {
+ memcpy (argp, *p_argv, z);
+
+ /* Align if necessary. */
+ if ((sizeof(int) - 1) & z)
+ z = ALIGN(z, sizeof(int));
+ }
+
+ p_argv++;
+ argp += z;
}

return struct_value_ptr;
@@ -86,7 +102,8 @@ ffi_prep_args (void *stack, extended_cif
#define CIF_FLAGS_DOUBLE 8
#define CIF_FLAGS_LDOUBLE 16
#define CIF_FLAGS_POINTER 32
-#define CIF_FLAGS_STRUCT 64
+#define CIF_FLAGS_STRUCT1 64
+#define CIF_FLAGS_STRUCT2 128

/* Perform machine dependent cif processing */
ffi_status
@@ -100,12 +117,24 @@ ffi_prep_cif_machdep (ffi_cif *cif)
break;

case FFI_TYPE_STRUCT:
- if (cif->rtype->size > 4 && cif->rtype->size <= 8)
- cif->flags = CIF_FLAGS_DINT;
- else if (cif->rtype->size <= 4)
- cif->flags = CIF_FLAGS_STRUCT;
- else
- cif->flags = 0;
+ switch (cif->rtype->size)
+ {
+ case 1:
+ cif->flags = CIF_FLAGS_STRUCT1;
+ break;
+ case 2:
+ cif->flags = CIF_FLAGS_STRUCT2;
+ break;
+ case 4:
+ cif->flags = CIF_FLAGS_INT;
+ break;
+ case 8:
+ cif->flags = CIF_FLAGS_DINT;
+ break;
+ default:
+ cif->flags = 0;
+ break;
+ }
break;

case FFI_TYPE_FLOAT:
@@ -137,11 +166,6 @@ ffi_prep_cif_machdep (ffi_cif *cif)
return FFI_OK;
}

-extern void ffi_call_SYSV (void *(*) (void *, extended_cif *),
- extended_cif *,
- unsigned, unsigned, unsigned,
- void *, void (*fn) ());
-
void
ffi_call (ffi_cif *cif, void (*fn) (), void *rvalue, void **avalue)
{
@@ -149,7 +173,7 @@ ffi_call (ffi_cif *cif, void (*fn) (), v

ecif.cif = cif;
ecif.avalue = avalue;
-
+
/* If the return value is a struct and we don't have a return value
address then we need to make one. */

@@ -159,13 +183,11 @@ ffi_call (ffi_cif *cif, void (*fn) (), v
ecif.rvalue = alloca (cif->rtype->size);
else
ecif.rvalue = rvalue;
-
-
- switch (cif->abi)
+
+ switch (cif->abi)
{
case FFI_SYSV:
- ffi_call_SYSV (ffi_prep_args, &ecif, cif->bytes,
- cif->flags, cif->rtype->size * 8,
+ ffi_call_SYSV (&ecif, cif->bytes, cif->flags,
ecif.rvalue, fn);
break;

@@ -174,3 +196,83 @@ ffi_call (ffi_cif *cif, void (*fn) (), v
break;
}
}
+
+static void
+ffi_prep_incoming_args_SYSV (char *stack, void **avalue, ffi_cif *cif)
+{
+ unsigned int i;
+ void **p_argv;
+ char *argp;
+ ffi_type **p_arg;
+
+ argp = stack;
+ p_argv = avalue;
+
+ for (i = cif->nargs, p_arg = cif->arg_types; (i != 0); i--, p_arg++)
+ {
+ size_t z;
+
+ z = (*p_arg)->size;
+ if (z <= 4)
+ {
+ *p_argv = (void *) (argp + 4 - z);
+
+ z = 4;
+ }
+ else
+ {
+ *p_argv = (void *) argp;
+
+ /* Align if necessary */
+ if ((sizeof(int) - 1) & z)
+ z = ALIGN(z, sizeof(int));
+ }
+
+ p_argv++;
+ argp += z;
+ }
+}
+
+unsigned int
+ffi_closure_SYSV_inner (ffi_closure *closure, void *resp, void *args)
+{
+ ffi_cif *cif;
+ void **arg_area;
+
+ cif = closure->cif;
+ arg_area = (void**) alloca (cif->nargs * sizeof (void *));
+
+ ffi_prep_incoming_args_SYSV(args, arg_area, cif);
+
+ (closure->fun) (cif, resp, arg_area, closure->user_data);
+
+ return cif->flags;
+}
+
+ffi_status
+ffi_prep_closure (ffi_closure* closure,
+ ffi_cif* cif,
+ void (*fun)(ffi_cif*,void*,void**,void*),
+ void *user_data)
+{
+ FFI_ASSERT (cif->abi == FFI_SYSV);
+
+ *(unsigned short *)closure->tramp = 0x207c;
+ *(void **)(closure->tramp + 2) = closure;
+ *(unsigned short *)(closure->tramp + 6) = 0x4ef9;
+ if (cif->rtype->type == FFI_TYPE_STRUCT
+ && !cif->flags)
+ *(void **)(closure->tramp + 8) = ffi_closure_struct_SYSV;
+ else
+ *(void **)(closure->tramp + 8) = ffi_closure_SYSV;
+
+ syscall(SYS_cacheflush, closure->tramp, FLUSH_SCOPE_LINE,
+ FLUSH_CACHE_BOTH, FFI_TRAMPOLINE_SIZE);
+
+ closure->cif = cif;
+ closure->user_data = user_data;
+ closure->fun = fun;
+
+ return FFI_OK;
+}
+
Index: egcs/libffi/src/m68k/ffitarget.h
===================================================================
--- egcs.orig/libffi/src/m68k/ffitarget.h
+++ egcs/libffi/src/m68k/ffitarget.h
@@ -40,7 +40,8 @@ typedef enum ffi_abi {

/* ---- Definitions for closures ----------------------------------------- */

-#define FFI_CLOSURES 0
+#define FFI_CLOSURES 1
+#define FFI_TRAMPOLINE_SIZE 16
#define FFI_NATIVE_RAW_API 0

#endif
Index: egcs/libffi/src/m68k/sysv.S
===================================================================
--- egcs.orig/libffi/src/m68k/sysv.S
+++ egcs/libffi/src/m68k/sysv.S
@@ -12,36 +12,44 @@

.globl ffi_call_SYSV
.type ffi_call_SYSV,@function
+ .align 4

ffi_call_SYSV:
+ .cfi_startproc
link %fp,#0
+ .cfi_offset 14,-8
+ .cfi_def_cfa 14,8
move.l %d2,-(%sp)
+ .cfi_offset 2,-12

| Make room for all of the new args.
- sub.l 16(%fp),%sp
+ sub.l 12(%fp),%sp

| Call ffi_prep_args
- move.l 12(%fp),-(%sp)
+ move.l 8(%fp),-(%sp)
pea 4(%sp)
- move.l 8(%fp),%a0
- jsr (%a0)
+#if !defined __PIC__
+ jsr ffi_prep_args
+#else
+ bsr.l ***@PLTPC
+#endif
addq.l #8,%sp

| Pass pointer to struct value, if any
move.l %a0,%a1

| Call the function
- move.l 32(%fp),%a0
+ move.l 24(%fp),%a0
jsr (%a0)

| Remove the space we pushed for the args
- add.l 16(%fp),%sp
+ add.l 12(%fp),%sp

| Load the pointer to storage for the return value
- move.l 28(%fp),%a1
+ move.l 20(%fp),%a1

| Load the return type code
- move.l 20(%fp),%d2
+ move.l 16(%fp),%d2

| If the return value pointer is NULL, assume no return value.
tst.l %a1
@@ -79,19 +87,111 @@ retlongdouble:

retpointer:
btst #5,%d2
- jbeq retstruct
+ jbeq retstruct1
move.l %a0,(%a1)
jbra epilogue

-retstruct:
+retstruct1:
btst #6,%d2
+ jbeq retstruct2
+ move.b %d0,(%a1)
+ jbra epilogue
+
+retstruct2:
+ btst #7,%d2
jbeq noretval
- move.l 24(%fp),%d2
- bfins %d0,(%a1){#0,%d2}
+ move.w %d0,(%a1)

noretval:
epilogue:
move.l (%sp)+,%d2
- unlk %a6
+ unlk %fp
rts
+ .cfi_endproc
.size ffi_call_SYSV,.-ffi_call_SYSV
+
+ .globl ffi_closure_SYSV
+ .type ffi_closure_SYSV, @function
+ .align 4
+
+ffi_closure_SYSV:
+ .cfi_startproc
+ link %fp,#-12
+ .cfi_offset 14,-8
+ .cfi_def_cfa 14,8
+ move.l %sp,-12(%fp)
+ pea 8(%fp)
+ pea -12(%fp)
+ move.l %a0,-(%sp)
+#if !defined __PIC__
+ jsr ffi_closure_SYSV_inner
+#else
+ bsr.l ***@PLTPC
+#endif
+
+ lsr.l #1,%d0
+ jne 1f
+ jcc .Lcls_epilogue
+ move.l -12(%fp),%d0
+.Lcls_epilogue:
+ unlk %fp
+ rts
+1:
+ lea -12(%fp),%a0
+ lsr.l #2,%d0
+ jne 1f
+ jcs .Lcls_ret_float
+ move.l (%a0)+,%d0
+ move.l (%a0),%d1
+ jra .Lcls_epilogue
+.Lcls_ret_float:
+ fmove.s (%a0),%fp0
+ jra .Lcls_epilogue
+1:
+ lsr.l #2,%d0
+ jne 1f
+ jcs .Lcls_ret_ldouble
+ fmove.d (%a0),%fp0
+ jra .Lcls_epilogue
+.Lcls_ret_ldouble:
+ fmove.x (%a0),%fp0
+ jra .Lcls_epilogue
+1:
+ lsr.l #2,%d0
+ jne .Lcls_ret_struct2
+ jcs .Lcls_ret_struct1
+ move.l (%a0),%a0
+ move.l %a0,%d0
+ jra .Lcls_epilogue
+.Lcls_ret_struct1:
+ move.b (%a0),%d0
+ jra .Lcls_epilogue
+.Lcls_ret_struct2:
+ move.w (%a0),%d0
+ jra .Lcls_epilogue
+ .cfi_endproc
+
+ .size ffi_closure_SYSV,.-ffi_closure_SYSV
+
+ .globl ffi_closure_struct_SYSV
+ .type ffi_closure_struct_SYSV, @function
+ .align 4
+
+ffi_closure_struct_SYSV:
+ .cfi_startproc
+ link %fp,#0
+ .cfi_offset 14,-8
+ .cfi_def_cfa 14,8
+ move.l %sp,-12(%fp)
+ pea 8(%fp)
+ move.l %a1,-(%sp)
+ move.l %a0,-(%sp)
+#if !defined __PIC__
+ jsr ffi_closure_SYSV_inner
+#else
+ bsr.l ***@PLTPC
+#endif
+ unlk %fp
+ rts
+ .cfi_endproc
+ .size ffi_closure_struct_SYSV,.-ffi_closure_struct_SYSV
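
For reference (my own annotation, not part of the patch), the trampoline
that ffi_prep_closure assembles above is

	0x207c  <4-byte closure pointer>   | move.l #<closure>,%a0
	0x4ef9  <4-byte entry address>     | jmp <entry>  (absolute long)

i.e. the closure pointer is loaded into %a0 (which is why the closure
entry points push %a0 as the first argument to ffi_closure_SYSV_inner)
and control jumps to either ffi_closure_SYSV or ffi_closure_struct_SYSV.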

--
Jeffrey Law
2007-02-06 18:36:37 UTC
Permalink
Post by z***@linux-m68k.org
Hi,
The first part of this patch fixes numerous test suite failures,
especially return value handling was broken for various types.
The second part adds closure support.
One question here might be about frame information handling, older
assemblers don't support the cfi instructions, but I'd really like to
avoid having to hardcode this like other ports, as this is impossible to
maintain. Is there maybe a way to make this conditional?
* libffi/src/m68k/ffi.c (ffi_prep_args,ffi_prep_cif_machdep),
libffi/src/m68k/sysv.S (ffi_call_SYSV): fix numerous test
suite failures
* libffi/src/m68k/ffi.c (ffi_prep_incoming_args_SYSV,
ffi_closure_SYSV_inner,ffi_prep_closure), libffi/src/m68k/sysv.S
(ffi_closure_SYSV,ffi_closure_struct_SYSV): New, add closure support
I'm really not the best person to review this.

Can someone more familiar with libffi step in, please?

Jeff
Andreas Schwab
2007-02-06 19:51:33 UTC
Permalink
Post by Jeffrey Law
Post by z***@linux-m68k.org
Hi,
The first part of this patch fixes numerous test suite failures,
especially return value handling was broken for various types.
The second part adds closure support.
One question here might be about frame information handling, older
assemblers don't support the cfi instructions, but I'd really like to
avoid having to hardcode this like other ports, as this is impossible to
maintain. Is there maybe a way to make this conditional?
* libffi/src/m68k/ffi.c (ffi_prep_args,ffi_prep_cif_machdep),
libffi/src/m68k/sysv.S (ffi_call_SYSV): fix numerous test
suite failures
* libffi/src/m68k/ffi.c (ffi_prep_incoming_args_SYSV,
ffi_closure_SYSV_inner,ffi_prep_closure), libffi/src/m68k/sysv.S
(ffi_closure_SYSV,ffi_closure_struct_SYSV): New, add closure support
I'm really not the best person to review this.
Can someone more familiar with libffi step in, please?
Looks ok to me, except that the changelog entry needs to be fixed to
conform to the standards.

2007-01-30 Roman Zippel <***@linux-m68k.org>

* libffi/src/m68k/ffi.c (ffi_prep_args,ffi_prep_cif_machdep): Fix
numerous test suite failures.
* libffi/src/m68k/sysv.S (ffi_call_SYSV): Likewise.

* libffi/src/m68k/ffi.c (ffi_prep_incoming_args_SYSV)
(ffi_closure_SYSV_inner, ffi_prep_closure): New, add closure
support.
* libffi/src/m68k/sysv.S (ffi_closure_SYSV)
(ffi_closure_struct_SYSV): Likewise.

Andreas.
--
Andreas Schwab, SuSE Labs, ***@suse.de
SuSE Linux Products GmbH, Maxfeldstraße 5, 90409 Nürnberg, Germany
PGP key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5
"And now for something completely different."
Eric Botcazou
2007-02-06 20:06:35 UTC
Permalink
Post by Andreas Schwab
Looks ok to me, except that the changelog entry needs to be fixed to
conform to the standards.
Almost. :-)
Post by Andreas Schwab
* libffi/src/m68k/ffi.c (ffi_prep_args,ffi_prep_cif_machdep): Fix
numerous test suite failures.
* libffi/src/m68k/sysv.S (ffi_call_SYSV): Likewise.
* libffi/src/m68k/ffi.c (ffi_prep_incoming_args_SYSV)
(ffi_closure_SYSV_inner, ffi_prep_closure): New, add closure
support.
* libffi/src/m68k/sysv.S (ffi_closure_SYSV)
(ffi_closure_struct_SYSV): Likewise.
Ditch the leading libffi/ (and put the whole thing in libffi/ChangeLog).
--
Eric Botcazou
Roman Zippel
2007-02-18 01:33:52 UTC
Permalink
Hi,
Post by Andreas Schwab
Post by Jeffrey Law
Can someone more familiar with libffi step in, please?
Looks ok to me, except that the changelog entry needs to be fixed to
conform to the standards.
* libffi/src/m68k/ffi.c (ffi_prep_args,ffi_prep_cif_machdep): Fix
numerous test suite failures.
* libffi/src/m68k/sysv.S (ffi_call_SYSV): Likewise.
* libffi/src/m68k/ffi.c (ffi_prep_incoming_args_SYSV)
(ffi_closure_SYSV_inner, ffi_prep_closure): New, add closure
support.
* libffi/src/m68k/sysv.S (ffi_closure_SYSV)
(ffi_closure_struct_SYSV): Likewise.
Ok to commit with fixed Changelog?

bye, Roman
Tom Tromey
2007-02-07 14:35:17 UTC
Permalink
Roman> One question here might be about frame information handling, older
Roman> assemblers don't support the cfi instructions, but I'd really like to
Roman> avoid having to hardcode this like other ports, as this is impossible to
Roman> maintain. Is there maybe a way to make this conditional?

You could use the preprocessor, at least if you worked out the
configury necessary to decide whether the assembler supports it.

Tom
Roman Zippel
2007-02-07 22:07:40 UTC
Permalink
Hi,
Post by Tom Tromey
Roman> One question here might be about frame information handling, older
Roman> assemblers don't support the cfi instructions, but I'd really like to
Roman> avoid having to hardcode this like other ports, as this is impossible to
Roman> maintain. Is there maybe a way to make this conditional?
You could use the preprocessor, at least if you worked out the
configury necessary to decide whether the assembler supports it.
Hmm, there is already a similar assembler test in configure.ac; based on
it, this should work:

AC_CACHE_CHECK([assembler .cfi pseudo-op support],
libffi_cv_as_cfi_pseudo_op, [
libffi_cv_as_cfi_pseudo_op=unknown
AC_TRY_COMPILE([asm (".cfi_startproc\n\t.cfi_endproc");],,
[libffi_cv_as_cfi_pseudo_op=yes],
[libffi_cv_as_cfi_pseudo_op=no])
])
if test "x$libffi_cv_as_cfi_pseudo_op" = xyes; then
AC_DEFINE(HAVE_AS_CFI_PSEUDO_OP, 1,
[Define if your assembler supports .cfi_* directives.])
fi
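
The assembly could then wrap the directives in macros from a small shared
header, roughly like this (the CFI_* names are just a sketch, only
HAVE_AS_CFI_PSEUDO_OP comes from the test above):

#ifdef HAVE_AS_CFI_PSEUDO_OP
# define CFI_STARTPROC()	.cfi_startproc
# define CFI_ENDPROC()		.cfi_endproc
# define CFI_OFFSET(reg,off)	.cfi_offset reg, off
# define CFI_DEF_CFA(reg,off)	.cfi_def_cfa reg, off
#else
# define CFI_STARTPROC()
# define CFI_ENDPROC()
# define CFI_OFFSET(reg,off)
# define CFI_DEF_CFA(reg,off)
#endif

so that sysv.S uses CFI_STARTPROC() etc. and an older assembler simply
sees nothing.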

If making it conditional like this is acceptable, I'll prepare a patch.

bye, Roman
Tom Tromey
2007-02-08 16:41:49 UTC
Permalink
Roman> if test "x$libffi_cv_as_cfi_pseudo_op" = xyes; then
Roman> AC_DEFINE(HAVE_AS_CFI_PSEUDO_OP, 1,
Roman> [Define if your assembler supports .cfi_* directives.])
Roman> fi

Roman> If making it conditional like this is acceptable, I'll prepare a patch.

Yes, this would be fine.

Tom
z***@linux-m68k.org
2007-01-30 11:26:21 UTC
Permalink
Hi,

combine sometimes tries to generate a byte push on the stack, which
isn't valid on m68k and will fail later in output_move_qimode(), so we
have to prevent this.


2007-01-30 Roman Zippel <***@linux-m68k.org>

* config/m68k/m68k.md: prevent generation of byte push on the stack

---
gcc/config/m68k/m68k.md | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)

Index: gcc-4.1/gcc/config/m68k/m68k.md
===================================================================
--- gcc-4.1.orig/gcc/config/m68k/m68k.md
+++ gcc-4.1/gcc/config/m68k/m68k.md
@@ -748,7 +748,10 @@
(define_insn ""
[(set (match_operand:QI 0 "nonimmediate_operand" "=d,*a,m")
(match_operand:QI 1 "general_src_operand" "dmSi*a,di*a,dmSi"))]
- "!TARGET_COLDFIRE"
+ "!TARGET_COLDFIRE
+ && (!MEM_P (operands[0])
+ || GET_CODE (XEXP (operands[0], 0)) != PRE_DEC
+ || XEXP (XEXP (operands[0], 0), 0) != stack_pointer_rtx)"
"* return output_move_qimode (operands);")

(define_insn ""

--
z***@linux-m68k.org
2007-01-30 11:26:27 UTC
Permalink
Hi,

If one of the -falign options is used with a larger alignment, no proper
nop instruction is inserted. The patch uses "move.l %a4,%a4" as a nop
(the real nop has synchronizing properties).


2007-01-30 Roman Zippel <***@linux-m68k.org>

* config/m68k/m68k.h (ASM_OUTPUT_ALIGN_WITH_NOP): New, use
"move.l %a4,%a4" to produce nops.

---
gcc/config/m68k/linux.h | 5 +++++
1 file changed, 5 insertions(+)

Index: gcc-4.1/gcc/config/m68k/linux.h
===================================================================
--- gcc-4.1.orig/gcc/config/m68k/linux.h
+++ gcc-4.1/gcc/config/m68k/linux.h
@@ -163,6 +163,11 @@ Boston, MA 02110-1301, USA. */
if ((LOG) > 0) \
fprintf ((FILE), "%s%u\n", ALIGN_ASM_OP, 1 << (LOG));

+/* Use "move.l %a4,%a4" to advance within code. */
+#define ASM_OUTPUT_ALIGN_WITH_NOP(FILE,LOG) \
+ if ((LOG) > 0) \
+ fprintf ((FILE), "\t.balignw %u,0x284c\n", 1 << (LOG));
+
/* If defined, a C expression whose value is a string containing the
assembler operation to identify the following data as uninitialized global
data. */
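
For reference (my own decoding, not part of the patch), the fill word
passed to .balignw breaks down as

	0x284c = 0010 100 001 001 100
	  0010  move.l
	  100   destination register %a4
	  001   destination mode: address register direct
	  001   source mode: address register direct
	  100   source register %a4

i.e. every padding word decodes to "move.l %a4,%a4" (really movea.l),
which doesn't touch the condition codes.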

--
Jeffrey Law
2007-02-05 17:04:18 UTC
Permalink
Post by z***@linux-m68k.org
Hi,
If using some of -falign options with a larger alignment, no proper nop
instruction is inserted. The patch uses "move.l %a4,%a4" as nop (the
real nop has synchronizing properties).
* config/m68k/m68k.h (ASM_OUTPUT_ALIGN_WITH_NOP): New, use
"move.l %a4,%a4" to produce nops.
My m68k is rusty -- presumably move.l <ea>,addrreg doesn't muck
the status bits, right?

Assuming that's the case my only concern would be whether or not
the .balign pseudo-op is accepted by the various m68k assemblers.

Jeff
Rask Ingemann Lambertsen
2007-02-05 21:58:24 UTC
Permalink
Post by Jeffrey Law
My m68k is rusty -- presumably move.l <ea>,addrreg doesn't muck
the status bits, right?
It doesn't. Instructions with an address register destination don't alter
status bits. The Motorola docs use mnemonics suffixed with 'a' (movea, adda,
...) to make it clearer. Note that cmpa does set the status bits.
--
Rask Ingemann Lambertsen
Roman Zippel
2007-02-07 21:09:36 UTC
Permalink
Hi,
Post by Jeffrey Law
Post by z***@linux-m68k.org
If using some of -falign options with a larger alignment, no proper nop
instruction is inserted. The patch uses "move.l %a4,%a4" as nop (the
real nop has synchronizing properties).
* config/m68k/m68k.h (ASM_OUTPUT_ALIGN_WITH_NOP): New, use
"move.l %a4,%a4" to produce nops.
My m68k is rusty -- presumably move.l <ea>,addrreg doesn't muck
the status bits, right?
Indeed.
Post by Jeffrey Law
Assuming that's the case my only concern would be whether or not
the .balign pseudo-op is accepted by the various m68k assemblers.
Currently gas is the only assembler left, and the .balignw is only
generated if one uses one of the -falign options.

bye, Roman
Jeffrey Law
2007-02-12 17:25:56 UTC
Permalink
Post by Roman Zippel
Post by Jeffrey Law
Assuming that's the case my only concern would be whether or not
the .balign pseudo-op is accepted by the various m68k assemblers.
You've made that assertion, but how do you know that? All the
world is not GNU.
Post by Roman Zippel
Currently gas is the only assembler left and it's only generated if one
uses one of -falign options.
But again, this hinges on your assumption that the only m68k assembler
in use is gas. Just asserting that gas is the only m68k assembler in
use isn't enough, you've got to convince us that's the case since this
kind of patch would break support for other assemblers.

jeff
Roman Zippel
2007-02-13 14:03:41 UTC
Permalink
Hi,
Post by Jeffrey Law
Post by Roman Zippel
Post by Jeffrey Law
Assuming that's the case my only concern would be whether or not
the .balign pseudo-op is accepted by the various m68k assemblers.
You've made that assertion, but how do you know that? All the
world is not GNU.
As I've shown previously, at least the m68k world has shrunk considerably.
Post by Jeffrey Law
Post by Roman Zippel
Currently gas is the only assembler left and it's only generated if one
uses one of -falign options.
But again, this hinges on your assumption that the only m68k assembler
in use is gas. Just asserting that gas is the only m68k assembler in
use isn't enough, you've got to convince us that's the case since this
kind of patch would break support for other assemblers.
Sorry, but how am I supposed to "prove" that???
Based on the currently supported m68k targets, all possible proprietary
targets are gone. By your logic we should never have deprecated the
other targets either, because they might still be in use somewhere. My
patch only takes the logical next step - supporting the configurations we
_know_ about, and that only includes gas.

If nobody speaks up in support of another assembler, I don't see any value
in keeping useless code. On the remote chance there is still someone out
there trying to use gcc with an alternative assembler, we likely won't
hear about it until it breaks somehow, and if that should happen, I'd
consider it a good thing, so we actually _know_ what kind of support is
needed. OTOH I'd consider it a really bad thing if this fear of the unknown
kept us from doing some very real and actually useful fixes.

I cannot prove beyond doubt that there is no alternative assembler left
(which is also usable with current gcc), just as you cannot prove there
is any other supported assembler left. We deprecated and removed support
for unsupported targets in the past, so why can't the same be done here?
Any explicit support for alternative assemblers has already been deprecated
by the target removals, so if there is any damage it has already been
done; my patches merely continue this deprecation process by further
cleaning up after it.

I don't mind support for alternative assemblers, but how are we supposed
to support something we don't know about?

bye, Roman
Jeffrey Law
2007-02-26 22:41:03 UTC
Permalink
Post by Roman Zippel
Hi,
Post by Jeffrey Law
Post by Roman Zippel
Post by Jeffrey Law
Assuming that's the case my only concern would be whether or not
the .balign pseudo-op is accepted by the various m68k assemblers.
You've made that assertion, but how do you know that? All the
world is not GNU.
As I've shown previously, at least the m68k world has shrunk considerably.
I'm not doubting that. The question is whether the world has shrunk
enough that you can simply remove support for an assembler variant that
we have supported for 20 years.
Post by Roman Zippel
Post by Jeffrey Law
Post by Roman Zippel
Currently gas is the only assembler left and it's only generated if one
uses one of -falign options.
But again, this hinges on your assumption that the only m68k assembler
in use is gas. Just asserting that gas is the only m68k assembler in
use isn't enough, you've got to convince us that's the case since this
kind of patch would break support for other assemblers.
Sorry, but how am I supposed to "prove" that???
You can't prove it, but just asserting it isn't sufficient. You
have to find a middle ground.

One thing you could do that would remove my objection would be to
show me that there's no way the MOTOROLA_SYNTAX code could possibly
work in the previous GCC release. Or that there's no way to actually
make GCC produce MOTOROLA_SYNTAX in the previous GCC release.

If that's not possible, then that gives you a direction you can
legitimately take. Disable MOTOROLA_SYNTAX, but keep the code
working for a release. When there aren't any complaints after
a suitable period of time after the next release, then zap all
MOTOROLA_SYNTAX support.
Post by Roman Zippel
If nobody speaks up in support of other assembler, I don't see any value
in keeping useless code. At the remote chance there is still someone out
there trying to use gcc with an alternative assembler, we likely won't
hear about it until it breaks somehow and if that should happen, I'd
consider it a good thing, so we actually _know_ what kind of support is
needed. OTOH I'd consider it a real bad thing, if this fear of the unknown
kept us of doing some very real and actually useful fixes.
And the way to get to this state is to disable it, but not otherwise
break it and wait for a period of time to give the end users a chance
to complain.

Jeff
Roman Zippel
2007-02-26 23:28:27 UTC
Permalink
Hi,
Post by Jeffrey Law
Post by Roman Zippel
As I've shown previously, at least the m68k world has shrunk considerably.
I'm not doubting that. The question is has the world shrunk enough that
you can simply remove support for an assembler variant that we have
supported for 20 years.
A lot of it has already been removed with the removals of many m68k
targets, e.g.:

http://gcc.gnu.org/viewcvs?view=rev&revision=68962
http://gcc.gnu.org/viewcvs?view=rev&revision=77518

Those remove explicit assembler support; seriously, there _is_ no other
explicit support left than for gas. I don't remove anything that hasn't
already been gone for at least 3 years...
Post by Jeffrey Law
One thing you could do that would remove my objection would be to
show me that there's no way the MOTOROLA_SYNTAX code could possibly
work in the previous GCC release. Or that there's no way to actually
make GCC produce MOTOROLA_SYNTAX in the previous GCC release.
If that's not possible, then that gives you a direction you can
legitimately take. Disable MOTOROLA_SYNTAX, but keep the code
working for a release. When there aren't any complaints after
a suitable period of time after the next release, then zap all
MOTOROLA_SYNTAX support.
Huh? I don't want to remove the Motorola syntax.
In this case there's a HAVE_GAS_BALIGN_AND_P2ALIGN macro I can use (see
the sketch below).
The other problem is the jump instructions, which don't have anything to
do with the real Motorola syntax. The "(MOTOROLA)" check doesn't select
the Motorola syntax at all.
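Something like the following (untested sketch) would keep the .balignw
use limited to gas while leaving the existing ASM_OUTPUT_ALIGN path as
the fallback:

#ifdef HAVE_GAS_BALIGN_AND_P2ALIGN
/* Use "move.l %a4,%a4" to advance within code.  */
#define ASM_OUTPUT_ALIGN_WITH_NOP(FILE,LOG)	\
  if ((LOG) > 0)				\
    fprintf ((FILE), "\t.balignw %u,0x284c\n", 1 << (LOG));
#endif
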
Post by Jeffrey Law
Post by Roman Zippel
If nobody speaks up in support of other assembler, I don't see any value
in keeping useless code. At the remote chance there is still someone out
there trying to use gcc with an alternative assembler, we likely won't
hear about it until it breaks somehow and if that should happen, I'd
consider it a good thing, so we actually _know_ what kind of support is
needed. OTOH I'd consider it a real bad thing, if this fear of the unknown
kept us of doing some very real and actually useful fixes.
And the way to get to this state is to disable it, but not otherwise
break it and wait for a period of time to give the end users a chance
to complain.
As long as it works nobody will complain...
Do you seriously want to delay _valid_ bug fixes for a release???

bye, Roman
Jeffrey Law
2007-02-26 23:49:38 UTC
Permalink
Post by Roman Zippel
Hi,
Post by Jeffrey Law
Post by Roman Zippel
As I've shown previously, at least the m68k world has shrunk considerably.
I'm not doubting that. The question is has the world shrunk enough that
you can simply remove support for an assembler variant that we have
supported for 20 years.
A lot of it has already been removed with the removals of many m68k
http://gcc.gnu.org/viewcvs?view=rev&revision=68962
http://gcc.gnu.org/viewcvs?view=rev&revision=77518
I don't care about CRDS or SGS, I care about the MOTOROLA_SYNTAX
support. That's all we're discussing at this point.

So I'll ask explicitly, can GCC currently generate MOTOROLA_SYNTAX
assembly code? If yes, is it broken so badly that anyone trying
to use it would find it useless?

If you can answer no to the first question or yes to the second
question then I won't object to your changes.

If the answer is yes to the first question and no to the second
question, then I'm going to object to your patch because it
changes the code we generate under MOTOROLA_SYNTAX. The path
to getting your patch installed would then be to ensure MOTOROLA_SYNTAX
will remain unchanged.
Post by Roman Zippel
Huh? I don't want to remove the Motorola syntax.
In this case there's a HAVE_GAS_BALIGN_AND_P2ALIGN I can use.
The other problem is the jump instructions, which don't have anything to
do with the real Motorola syntax. The "(MOTOROLA)" check doesn't select
the Motorola syntax at all.
But some of your patches will change, likely in incompatible ways, the
code we generate for MOTOROLA_SYNTAX. At least that's how I read your
change for the jump instructions.
Post by Roman Zippel
As long as it works nobody will complain...
Which is precisely why I suggested you could disable it for the
next release. If we believe nobody is using it, then disabling
it shouldn't affect anyone. You can then remove it in the next
release.
Post by Roman Zippel
Do you seriously want to delay _valid_ bug fixes for a release???
Not at all. I'm saying you have to find a way to fix the bugs
without breaking MOTOROLA_SYNTAX or you have to show me that
it's highly unlikely anyone can be using MOTOROLA_SYNTAX in
the most recent GCC release.


Jeff
Roman Zippel
2007-02-27 01:04:06 UTC
Permalink
Hi,
Post by Jeffrey Law
So I'll ask explicitly, can GCC currently generate MOTOROLA_SYNTAX
assembly code? If yes, is it broken so badly that anyone trying
to use it would find it useless?
If you can answer no to the first question or yes to the second
question then I won't object to your changes.
Um, your second question depends on your first question being answered
yes, so I don't know how to resolve this. :)
Post by Jeffrey Law
If the answer is yes to the first question and no to the section
question, then I'm going to object to your patch because it
changes the code we generate under MOTOROLA_SYNTAX. THe path
to getting your patch installed would then be to ensure MOTOROLA_SYNTAX
will remain unchanged.
That depends on how you define "MOTOROLA_SYNTAX"; we never generated pure
Motorola syntax, since the integer jump opcodes were always pseudo opcodes.
The only difference might be for the fp jump opcodes, which change e.g.
from "fbra" to "fjra", and the former is not exact Motorola syntax either,
as it lacks the size argument, and that's a bug if you need long jumps.
Post by Jeffrey Law
Post by Roman Zippel
As long as it works nobody will complain...
Which is precisely why I suggested you could disable it for the
next release. If we believe nobody is using it, then disabling
it should't affect anyone. You can then remove it the next
release.
Sorry, but this is an exercise in futility. :-(
I'm already quite conservative when it comes to preserving compatibility,
but you're in a whole different league. :-(
There is not a single sign that this change would break anything. We
already removed so much with _a_lot_ less fuss. Even on the remote chance
this breaks anything, the sooner we detect it the better, as we can then
support it properly.
I would understand this if we actually had a significant user base. The
only active (i.e. visible) users use binutils. I would understand this if
there were a significant chance it would break anything. There may be
some embedded users, but they'll stick to their known working executables
and they won't randomly upgrade their compiler.
There is absolutely no value in making this more complicated than
necessary, but there is value in cleaning this code up and making it
easier to maintain. There is simply more value in making the break now
than in unnecessarily delaying it, so if you want to reject the patch,
that's fine, but I'm not going to change it either, as it has absolutely
_no_ value.

bye, Roman
Roman Zippel
2007-02-27 10:56:52 UTC
Permalink
Hi,
Post by Jeffrey Law
So I'll ask explicitly, can GCC currently generate MOTOROLA_SYNTAX
assembly code? If yes, is it broken so badly that anyone trying
to use it would find it useless?
Apropos brokenness - the m68k port is currently not in the best shape.
With all due respect to your experience (and ten years ago I would have
agreed with you), I actually know what it means to make a current gcc
release usable on m68k. The current list of m68k-related patches in
Debian's gcc is my witness, and apparently the ColdFire guys hit many of
the same problems. Considering all these problems, this should somehow be
reflected in user reports, but there are no other users than the ones we
already know about (as much as I wish it weren't so). All my recent
practical experience tells me that your theoretical objections have no
practical relevance anymore. I'm sorry.
So to answer your question slightly differently - yes, gcc on m68k is
currently so badly broken that anyone seriously trying to use it would
find it useless.

bye, Roman

z***@linux-m68k.org
2007-01-30 11:26:17 UTC
Permalink
Hi,

This continues the previous patch and cleans up the remaining jump
opcodes. jcc and jbcc are both gas pseudo opcodes (and not real Motorola
opcodes), so this now uses the officially documented variant.
This also removes the last uses of USE_GAS; after the last rounds of
target deprecations only support for gas should be left now AFAICT (only
m68k-*-aout* and m68k*-*-openbsd* don't define USE_GAS). If we want to
support other assemblers, this needs to be added back properly.


2007-01-30 Roman Zippel <***@linux-m68k.org>

* config/m68k/m68k.c (override_options): remove USE_GAS
* config/m68k/m68k.c/config/m68k/m68k.md: replace all jbcc
alternatives with just jcc

---
gcc/config/m68k/m68k.c | 98 ++++++--------------------
gcc/config/m68k/m68k.md | 178 +++++++++---------------------------------------
2 files changed, 60 insertions(+), 216 deletions(-)

Index: egcs/gcc/config/m68k/m68k.c
===================================================================
--- egcs.orig/gcc/config/m68k/m68k.c
+++ egcs/gcc/config/m68k/m68k.c
@@ -543,13 +543,8 @@ override_options (void)

if (!flag_pic)
{
-#if MOTOROLA && !defined (USE_GAS)
- m68k_symbolic_call = "jsr %a0";
- m68k_symbolic_jump = "jmp %a0";
-#else
m68k_symbolic_call = "jbsr %a0";
m68k_symbolic_jump = "jra %a0";
-#endif
}
else if (TARGET_ID_SHARED_LIBRARY)
/* All addresses must be loaded from the GOT. */
@@ -558,18 +553,13 @@ override_options (void)
{
if (TARGET_PCREL)
{
- m68k_symbolic_call = "bsr.l %c0";
- m68k_symbolic_jump = "bra.l %c0";
+ m68k_symbolic_call = "bsr%.l %c0";
+ m68k_symbolic_jump = "bra%.l %c0";
}
else
{
-#if defined(USE_GAS)
- m68k_symbolic_call = "bsr.l %p0";
- m68k_symbolic_jump = "bra.l %p0";
-#else
- m68k_symbolic_call = "bsr %p0";
- m68k_symbolic_jump = "bra %p0";
-#endif
+ m68k_symbolic_call = "bsr%.l %p0";
+ m68k_symbolic_jump = "bra%.l %p0";
}
/* Turn off function cse if we are doing PIC. We always want
function call to be done as `bsr ***@PLTPC'. */
@@ -1386,73 +1376,43 @@ output_dbcc_and_branch (rtx *operands)
switch (GET_CODE (operands[3]))
{
case EQ:
- output_asm_insn (MOTOROLA
- ? "dbeq %0,%l1\n\tjbeq %l2"
- : "dbeq %0,%l1\n\tjeq %l2",
- operands);
+ output_asm_insn ("dbeq %0,%l1\n\tjeq %l2", operands);
break;

case NE:
- output_asm_insn (MOTOROLA
- ? "dbne %0,%l1\n\tjbne %l2"
- : "dbne %0,%l1\n\tjne %l2",
- operands);
+ output_asm_insn ("dbne %0,%l1\n\tjne %l2", operands);
break;

case GT:
- output_asm_insn (MOTOROLA
- ? "dbgt %0,%l1\n\tjbgt %l2"
- : "dbgt %0,%l1\n\tjgt %l2",
- operands);
+ output_asm_insn ("dbgt %0,%l1\n\tjgt %l2", operands);
break;

case GTU:
- output_asm_insn (MOTOROLA
- ? "dbhi %0,%l1\n\tjbhi %l2"
- : "dbhi %0,%l1\n\tjhi %l2",
- operands);
+ output_asm_insn ("dbhi %0,%l1\n\tjhi %l2", operands);
break;

case LT:
- output_asm_insn (MOTOROLA
- ? "dblt %0,%l1\n\tjblt %l2"
- : "dblt %0,%l1\n\tjlt %l2",
- operands);
+ output_asm_insn ("dblt %0,%l1\n\tjlt %l2", operands);
break;

case LTU:
- output_asm_insn (MOTOROLA
- ? "dbcs %0,%l1\n\tjbcs %l2"
- : "dbcs %0,%l1\n\tjcs %l2",
- operands);
+ output_asm_insn ("dbcs %0,%l1\n\tjcs %l2", operands);
break;

case GE:
- output_asm_insn (MOTOROLA
- ? "dbge %0,%l1\n\tjbge %l2"
- : "dbge %0,%l1\n\tjge %l2",
- operands);
+ output_asm_insn ("dbge %0,%l1\n\tjge %l2", operands);
break;

case GEU:
- output_asm_insn (MOTOROLA
- ? "dbcc %0,%l1\n\tjbcc %l2"
- : "dbcc %0,%l1\n\tjcc %l2",
- operands);
+ output_asm_insn ("dbcc %0,%l1\n\tjcc %l2", operands);
break;

case LE:
- output_asm_insn (MOTOROLA
- ? "dble %0,%l1\n\tjble %l2"
- : "dble %0,%l1\n\tjle %l2",
- operands);
+ output_asm_insn ("dble %0,%l1\n\tjle %l2", operands);
break;

case LEU:
- output_asm_insn (MOTOROLA
- ? "dbls %0,%l1\n\tjbls %l2"
- : "dbls %0,%l1\n\tjls %l2",
- operands);
+ output_asm_insn ("dbls %0,%l1\n\tjls %l2", operands);
break;

default:
@@ -1464,10 +1424,7 @@ output_dbcc_and_branch (rtx *operands)
switch (GET_MODE (operands[0]))
{
case SImode:
- output_asm_insn (MOTOROLA
- ? "clr%.w %0\n\tsubq%.l #1,%0\n\tjbpl %l1"
- : "clr%.w %0\n\tsubq%.l #1,%0\n\tjpl %l1",
- operands);
+ output_asm_insn ("clr%.w %0\n\tsubq%.l #1,%0\n\tjpl %l1", operands);
break;

case HImode:
@@ -1513,12 +1470,7 @@ output_scc_di (rtx op, rtx operand1, rtx
}
loperands[4] = gen_label_rtx ();
if (operand2 != const0_rtx)
- {
- output_asm_insn (MOTOROLA
- ? "cmp%.l %2,%0\n\tjbne %l4\n\tcmp%.l %3,%1"
- : "cmp%.l %2,%0\n\tjne %l4\n\tcmp%.l %3,%1",
- loperands);
- }
+ output_asm_insn ("cmp%.l %2,%0\n\tjne %l4\n\tcmp%.l %3,%1", loperands);
else
{
if (TARGET_68020 || TARGET_COLDFIRE || ! ADDRESS_REG_P (loperands[0]))
@@ -1526,7 +1478,7 @@ output_scc_di (rtx op, rtx operand1, rtx
else
output_asm_insn ("cmp%.w #0,%0", loperands);

- output_asm_insn (MOTOROLA ? "jbne %l4" : "jne %l4", loperands);
+ output_asm_insn ("jne %l4", loperands);

if (TARGET_68020 || TARGET_COLDFIRE || ! ADDRESS_REG_P (loperands[1]))
output_asm_insn ("tst%.l %1", loperands);
@@ -1552,8 +1504,7 @@ output_scc_di (rtx op, rtx operand1, rtx

case GT:
loperands[6] = gen_label_rtx ();
- output_asm_insn (MOTOROLA ? "shi %5\n\tjbra %l6" : "shi %5\n\tjra %l6",
- loperands);
+ output_asm_insn ("shi %5\n\tjra %l6", loperands);
(*targetm.asm_out.internal_label) (asm_out_file, "L",
CODE_LABEL_NUMBER (loperands[4]));
output_asm_insn ("sgt %5", loperands);
@@ -1569,8 +1520,7 @@ output_scc_di (rtx op, rtx operand1, rtx

case LT:
loperands[6] = gen_label_rtx ();
- output_asm_insn (MOTOROLA ? "scs %5\n\tjbra %l6" : "scs %5\n\tjra %l6",
- loperands);
+ output_asm_insn ("scs %5\n\tjra %l6", loperands);
(*targetm.asm_out.internal_label) (asm_out_file, "L",
CODE_LABEL_NUMBER (loperands[4]));
output_asm_insn ("slt %5", loperands);
@@ -1586,8 +1536,7 @@ output_scc_di (rtx op, rtx operand1, rtx

case GE:
loperands[6] = gen_label_rtx ();
- output_asm_insn (MOTOROLA ? "scc %5\n\tjbra %l6" : "scc %5\n\tjra %l6",
- loperands);
+ output_asm_insn ("scc %5\n\tjra %l6", loperands);
(*targetm.asm_out.internal_label) (asm_out_file, "L",
CODE_LABEL_NUMBER (loperands[4]));
output_asm_insn ("sge %5", loperands);
@@ -1603,8 +1552,7 @@ output_scc_di (rtx op, rtx operand1, rtx

case LE:
loperands[6] = gen_label_rtx ();
- output_asm_insn (MOTOROLA ? "sls %5\n\tjbra %l6" : "sls %5\n\tjra %l6",
- loperands);
+ output_asm_insn ("sls %5\n\tjra %l6", loperands);
(*targetm.asm_out.internal_label) (asm_out_file, "L",
CODE_LABEL_NUMBER (loperands[4]));
output_asm_insn ("sle %5", loperands);
@@ -3786,8 +3734,8 @@ m68k_output_mi_thunk (FILE *file, tree t
xops[0] = XEXP (xops[0], 0);

fmt = m68k_symbolic_jump;
- if (m68k_symbolic_jump == NULL)
- fmt = "move.l %%***@GOT(%%a5), %%a1\n\tjmp (%%a1)";
+ if (fmt == NULL)
+ fmt = "move.l %%***@GOT(%%a5),%%a1\n\tjmp (%%a1)";

output_asm_insn (fmt, xops);
}
Index: egcs/gcc/config/m68k/m68k.md
===================================================================
--- egcs.orig/gcc/config/m68k/m68k.md
+++ egcs/gcc/config/m68k/m68k.md
@@ -2045,7 +2045,7 @@
operands[2] = gen_rtx_MEM (SImode, XEXP (XEXP (operands[0], 0), 0));
}
output_asm_insn ("move%.l %1,%0", operands);
- output_asm_insn (MOTOROLA ? "jbpl %l3" : "jpl %l3", operands);
+ output_asm_insn ("jpl %l3", operands);
output_asm_insn ("addq%.l #1,%2", operands);
(*targetm.asm_out.internal_label) (asm_out_file, "L",
CODE_LABEL_NUMBER (operands[3]));
@@ -5576,19 +5576,14 @@
{
CC_STATUS_INIT;
if (which_alternative == 1)
- {
- if (MOTOROLA)
- return "move%.l %0,%2\;or%.l %0,%2\;jbeq %l1";
- else
- return "move%.l %0,%2\;or%.l %0,%2\;jeq %l1";
- }
+ return "move%.l %0,%2\;or%.l %0,%2\;jeq %l1";
if ((cc_prev_status.value1
&& rtx_equal_p (cc_prev_status.value1, operands[0]))
|| (cc_prev_status.value2
&& rtx_equal_p (cc_prev_status.value2, operands[0])))
{
cc_status = cc_prev_status;
- return MOTOROLA ? "jbeq %l1" : "jeq %l1";
+ return "jeq %l1";
}
if (GET_CODE (operands[0]) == REG)
operands[3] = gen_rtx_REG (SImode, REGNO (operands[0]) + 1);
@@ -5599,40 +5594,17 @@
if (reg_overlap_mentioned_p (operands[2], operands[0]))
{
if (reg_overlap_mentioned_p (operands[2], operands[3]))
- {
- if (MOTOROLA)
- return "or%.l %0,%2\;jbeq %l1";
- else
- return "or%.l %0,%2\;jeq %l1";
- }
+ return "or%.l %0,%2\;jeq %l1";
else
- {
- if (MOTOROLA)
- return "or%.l %3,%2\;jbeq %l1";
- else
- return "or%.l %3,%2\;jeq %l1";
- }
+ return "or%.l %3,%2\;jeq %l1";
}
- if (MOTOROLA)
- return "move%.l %0,%2\;or%.l %3,%2\;jbeq %l1";
- else
- return "move%.l %0,%2\;or%.l %3,%2\;jeq %l1";
+ return "move%.l %0,%2\;or%.l %3,%2\;jeq %l1";
}
operands[4] = gen_label_rtx();
if (TARGET_68020 || TARGET_COLDFIRE)
- {
- if (MOTOROLA)
- output_asm_insn ("tst%.l %0\;jbne %l4\;tst%.l %3\;jbeq %l1", operands);
- else
- output_asm_insn ("tst%.l %0\;jne %l4\;tst%.l %3\;jeq %l1", operands);
- }
+ output_asm_insn ("tst%.l %0\;jne %l4\;tst%.l %3\;jeq %l1", operands);
else
- {
- if (MOTOROLA)
- output_asm_insn ("cmp%.w #0,%0\;jbne %l4\;cmp%.w #0,%3\;jbeq %l1", operands);
- else
- output_asm_insn ("cmp%.w #0,%0\;jne %l4\;cmp%.w #0,%3\;jeq %l1", operands);
- }
+ output_asm_insn ("cmp%.w #0,%0\;jne %l4\;cmp%.w #0,%3\;jeq %l1", operands);
(*targetm.asm_out.internal_label) (asm_out_file, "L",
CODE_LABEL_NUMBER (operands[4]));
return "";
@@ -5653,7 +5625,7 @@
&& rtx_equal_p (cc_prev_status.value2, operands[0])))
{
cc_status = cc_prev_status;
- return MOTOROLA ? "jbne %l1" : "jne %l1";
+ return "jne %l1";
}
CC_STATUS_INIT;
if (GET_CODE (operands[0]) == REG)
@@ -5665,39 +5637,16 @@
if (reg_overlap_mentioned_p (operands[2], operands[0]))
{
if (reg_overlap_mentioned_p (operands[2], operands[3]))
- {
- if (MOTOROLA)
- return "or%.l %0,%2\;jbne %l1";
- else
- return "or%.l %0,%2\;jne %l1";
- }
+ return "or%.l %0,%2\;jne %l1";
else
- {
- if (MOTOROLA)
- return "or%.l %3,%2\;jbne %l1";
- else
- return "or%.l %3,%2\;jne %l1";
- }
+ return "or%.l %3,%2\;jne %l1";
}
- if (MOTOROLA)
- return "move%.l %0,%2\;or%.l %3,%2\;jbne %l1";
- else
- return "move%.l %0,%2\;or%.l %3,%2\;jne %l1";
+ return "move%.l %0,%2\;or%.l %3,%2\;jne %l1";
}
if (TARGET_68020 || TARGET_COLDFIRE)
- {
- if (MOTOROLA)
- return "tst%.l %0\;jbne %l1\;tst%.l %3\;jbne %l1";
- else
- return "tst%.l %0\;jne %l1\;tst%.l %3\;jne %l1";
- }
+ return "tst%.l %0\;jne %l1\;tst%.l %3\;jne %l1";
else
- {
- if (MOTOROLA)
- return "cmp%.w #0,%0\;jbne %l1\;cmp%.w #0,%3\;jbne %l1";
- else
- return "cmp%.w #0,%0\;jne %l1\;cmp%.w #0,%3\;jne %l1";
- }
+ return "cmp%.w #0,%0\;jne %l1\;cmp%.w #0,%3\;jne %l1";
})

(define_insn "bge0_di"
@@ -5714,14 +5663,7 @@
&& rtx_equal_p (cc_prev_status.value2, operands[0])))
{
cc_status = cc_prev_status;
- if (cc_status.flags & CC_REVERSED)
- {
- return MOTOROLA ? "jble %l1" : "jle %l1";
- }
- else
- {
- return MOTOROLA ? "jbpl %l1" : "jpl %l1";
- }
+ return cc_status.flags & CC_REVERSED ? "jle %l1" : "jpl %l1";
}
CC_STATUS_INIT;
if (TARGET_68020 || TARGET_COLDFIRE || ! ADDRESS_REG_P (operands[0]))
@@ -5731,7 +5673,7 @@
/* On an address reg, cmpw may replace cmpl. */
output_asm_insn("cmp%.w #0,%0", operands);
}
- return MOTOROLA ? "jbpl %l1" : "jpl %l1";
+ return "jpl %l1";
})

(define_insn "blt0_di"
@@ -5748,14 +5690,7 @@
&& rtx_equal_p (cc_prev_status.value2, operands[0])))
{
cc_status = cc_prev_status;
- if (cc_status.flags & CC_REVERSED)
- {
- return MOTOROLA ? "jbgt %l1" : "jgt %l1";
- }
- else
- {
- return MOTOROLA ? "jbmi %l1" : "jmi %l1";
- }
+ return cc_status.flags & CC_REVERSED ? "jgt %l1" : "jmi %l1";
}
CC_STATUS_INIT;
if (TARGET_68020 || TARGET_COLDFIRE || ! ADDRESS_REG_P (operands[0]))
@@ -5765,8 +5700,7 @@
/* On an address reg, cmpw may replace cmpl. */
output_asm_insn("cmp%.w #0,%0", operands);
}
-
- return MOTOROLA ? "jbmi %l1" : "jmi %l1";
+ return "jmi %l1";
})

(define_insn "beq"
@@ -5809,9 +5743,7 @@
(label_ref (match_operand 0 "" ""))
(pc)))]
""
-{
- return MOTOROLA ? "jbhi %l0" : "jhi %l0";
-})
+ "jhi %l0")

(define_insn "blt"
[(set (pc)
@@ -5831,9 +5763,7 @@
(label_ref (match_operand 0 "" ""))
(pc)))]
""
-{
- return MOTOROLA ? "jbcs %l0" : "jcs %l0";
-})
+ "jcs %l0")

(define_insn "bge"
[(set (pc)
@@ -5853,9 +5783,7 @@
(label_ref (match_operand 0 "" ""))
(pc)))]
""
-{
- return MOTOROLA ? "jbcc %l0" : "jcc %l0";
-})
+ "jcc %l0")

(define_insn "ble"
[(set (pc)
@@ -5875,9 +5803,7 @@
(label_ref (match_operand 0 "" ""))
(pc)))]
""
-{
- return MOTOROLA ? "jbls %l0" : "jls %l0";
-})
+ "jls %l0")

(define_insn "bordered"
[(set (pc)
@@ -6009,9 +5935,7 @@
(pc)
(label_ref (match_operand 0 "" ""))))]
""
-{
- return MOTOROLA ? "jbls %l0" : "jls %l0";
-})
+ "jls %l0")

(define_insn ""
[(set (pc)
@@ -6031,9 +5955,7 @@
(pc)
(label_ref (match_operand 0 "" ""))))]
""
-{
- return MOTOROLA ? "jbcc %l0" : "jcc %l0";
-})
+ "jcc %l0")

(define_insn ""
[(set (pc)
@@ -6053,9 +5975,7 @@
(pc)
(label_ref (match_operand 0 "" ""))))]
""
-{
- return MOTOROLA ? "jbcs %l0" : "jcs %l0";
-})
+ "jcs %l0")

(define_insn ""
[(set (pc)
@@ -6075,9 +5995,7 @@
(pc)
(label_ref (match_operand 0 "" ""))))]
""
-{
- return MOTOROLA ? "jbhi %l0" : "jhi %l0";
-})
+ "jhi %l0")

(define_insn "*bordered_rev"
[(set (pc)
@@ -6172,9 +6090,7 @@
[(set (pc)
(label_ref (match_operand 0 "" "")))]
""
-{
- return MOTOROLA ? "jbra %l0" : "jra %l0";
-})
+ "jra %l0")

(define_expand "tablejump"
[(parallel [(set (pc) (match_operand 0 "" ""))
@@ -6238,12 +6154,8 @@
if (DATA_REG_P (operands[0]))
return "dbra %0,%l1";
if (GET_CODE (operands[0]) == MEM)
- return MOTOROLA ?
- "subq%.w #1,%0\;jbcc %l1" :
- "subqw #1,%0\;jcc %l1";
- return MOTOROLA ?
- "subq%.w #1,%0\;cmp%.w #-1,%0\;jbne %l1" :
- "subqw #1,%0\;cmpw #-1,%0\;jne %l1";
+ return "subq%.w #1,%0\;jcc %l1";
+ return "subq%.w #1,%0\;cmp%.w #-1,%0\;jne %l1";
})

(define_insn ""
@@ -6260,16 +6172,10 @@
{
CC_STATUS_INIT;
if (DATA_REG_P (operands[0]))
- return MOTOROLA ?
- "dbra %0,%l1\;clr%.w %0\;subq%.l #1,%0\;jbcc %l1" :
- "dbra %0,%l1\;clr%.w %0\;subq%.l #1,%0\;jcc %l1";
+ return "dbra %0,%l1\;clr%.w %0\;subq%.l #1,%0\;jcc %l1";
if (GET_CODE (operands[0]) == MEM)
- return MOTOROLA ?
- "subq%.l #1,%0\;jbcc %l1" :
- "subq%.l #1,%0\;jcc %l1";
- return MOTOROLA ?
- "subq.l #1,%0\;cmp.l #-1,%0\;jbne %l1" :
- "subql #1,%0\;cmpl #-1,%0\;jne %l1";
+ return "subq%.l #1,%0\;jcc %l1";
+ return "subq%.l #1,%0\;cmp%.l #-1,%0\;jne %l1";
})

;; Two dbra patterns that use REG_NOTES info generated by strength_reduce.
@@ -6291,12 +6197,8 @@
if (DATA_REG_P (operands[0]))
return "dbra %0,%l1";
if (GET_CODE (operands[0]) == MEM)
- return MOTOROLA ?
- "subq%.w #1,%0\;jbcc %l1" :
- "subq%.w #1,%0\;jcc %l1";
- return MOTOROLA ?
- "subq.w #1,%0\;cmp.w #-1,%0\;jbne %l1" :
- "subqw #1,%0\;cmpw #-1,%0\;jne %l1";
+ return "subq%.w #1,%0\;jcc %l1";
+ return "subq%.w #1,%0\;cmp%.w #-1,%0\;jne %l1";
})

(define_expand "decrement_and_branch_until_zero"
@@ -6328,16 +6230,10 @@
{
CC_STATUS_INIT;
if (DATA_REG_P (operands[0]))
- return MOTOROLA ?
- "dbra %0,%l1\;clr%.w %0\;subq%.l #1,%0\;jbcc %l1" :
- "dbra %0,%l1\;clr%.w %0\;subql #1,%0\;jcc %l1";
+ return "dbra %0,%l1\;clr%.w %0\;subq%.l #1,%0\;jcc %l1";
if (GET_CODE (operands[0]) == MEM)
- return MOTOROLA ?
- "subq%.l #1,%0\;jbcc %l1" :
- "subql #1,%0\;jcc %l1";
- return MOTOROLA ?
- "subq.l #1,%0\;cmp.l #-1,%0\;jbne %l1" :
- "subql #1,%0\;cmpl #-1,%0\;jne %l1";
+ return "subq%.l #1,%0\;jcc %l1";
+ return "subq%.l #1,%0\;cmp%.l #-1,%0\;jne %l1";
})

;; Call subroutine with no return value.

--
Jeffrey Law
2007-02-05 17:11:27 UTC
Permalink
Post by z***@linux-m68k.org
Hi,
This continues the previous patch and cleans up the remaining jump
opcodes. jcc and jbcc are both gas pseudo opcodes (and not real Motorola
opcodes), so this now uses the officially documented variant.
So your claim is that a Motorola-based assembler syntax will
accept jcc? Have you tested this?

Jeff
Roman Zippel
2007-02-05 19:34:04 UTC
Permalink
Hi,
Post by Jeffrey Law
Post by z***@linux-m68k.org
This continues the previous patch and cleans up the remaining jump
opcodes. jcc and jbcc are both gas pseudo opcodes (and not real Motorola
opcodes), so this now uses the officially documented variant.
So your claim is that a Motorola-based assembler syntax will
accept jcc? Have you tested this?
How? Many targets that might have used an alternative assembler have been
deprecated; the following is a simple diff between 3.0 and trunk of
"grep '^m68[0k]' config.gcc | sort":

-m68000-*-*)
-m68000-att-sysv*)
-m68000-convergent-sysv*)
-m68000-hp-bsd*) # HP 9000/200 running BSD
-m68000-hp-hpux*) # HP 9000 series 300
-m68000-sun-sunos3*)
-m68000-sun-sunos4*)
+m68010-*-netbsdelf* | m68k*-*-netbsdelf*)
m68020-*-elf* | m68k-*-elf*)
-m68k*-*-netbsd*)
+m680[012]0-*-*)
m68k*-*-openbsd*)
+m68k-*-*)
m68k-*-aout*)
m68k-*-coff*)
m68k-*-linux*) # Motorola m68k's running GNU/Linux
-m68k-*-linux*aout*) # Motorola m68k's running GNU/Linux
-m68k-*-linux*libc1) # Motorola m68k's running GNU/Linux
-m68k-*-lynxos*)
-m68k-*-psos*)
-m68k-*-rtemscoff*)
-m68k-*-rtemself*|m68k-*-rtems*)
-m68k-*-sysv3*) # Motorola m68k's running system V.3
-m68k-*-sysv4*) # Motorola m68k's running system V.4
-m68k-altos-sysv*) # Altos 3068
-m68k-apollo-*)
-m68k-apple-aux*) # Apple Macintosh running A/UX
-m68k-atari-sysv4*) # Atari variant of V.4.
-m68k-bull-sysv*) # Bull DPX/2
-m68k-cbm-sysv4*) # Commodore variant of V.4.
-m68k-ccur-rtu)
-m68k-crds-unos*)
-m68k-hp-bsd*) # HP 9000/3xx running Berkeley Unix
-m68k-hp-bsd4.4*) # HP 9000/3xx running 4.4bsd
-m68k-hp-hpux*) # HP 9000 series 300
-m68k-hp-hpux7*) # HP 9000 series 300 running HPUX version 7.
-m68k-isi-bsd*)
-m68k-motorola-sysv*)
-m68k-ncr-sysv*) # NCR Tower 32 SVR3
-m68k-next-nextstep2*)
-m68k-next-nextstep[34]*)
-m68k-plexus-sysv*)
-m68k-sony-bsd* | m68k-sony-newsos*)
-m68k-sony-newsos3*)
-m68k-sun-mach*)
-m68k-sun-sunos*) # For SunOS 4 (the default).
-m68k-sun-sunos3*)
-m68k-tti-*)
-m68k-wrs-vxworks*)
+m68k-*-rtems*)
+m68k-*-uclinux*) # Motorola m68k/ColdFire running uClinux with uClibc

IMO it's pretty safe to assume that gas is the primary assembler now. If
there should be a need for an alternative assembler, we can add explicit
support for it (using USE_FOO instead of !USE_GAS).
z***@linux-m68k.org
2007-01-30 11:26:24 UTC
Permalink
Hi,

This converts many of the current text peepholes into rtl peepholes
(only peepholes involving cc0 are left). The primary reason for doing
this is that these cause corrupt frame information, as they manipulate
the stack pointer. I also adjusted the conditions, so they actually
match again.


2007-01-30 Roman Zippel <***@linux-m68k.org>

* config/m68k/m68k.c (strict_low_part_peephole_ok): don't leave
the basic block
* config/m68k/m68k.md: convert text peepholes into rtl peepholes
* config/m68k/m68k.h (REGNO_OK_FOR_FP_P): remove
(FP_REG_P): use FP_REGNO_P directly without reg_renumber

---

gcc/config/m68k/m68k.c | 18 ++-
gcc/config/m68k/m68k.h | 6 -
gcc/config/m68k/m68k.md | 219 ++++++++++++++++++------------------------------
3 files changed, 97 insertions(+), 146 deletions(-)

Index: egcs/gcc/config/m68k/m68k.h
===================================================================
--- egcs.orig/gcc/config/m68k/m68k.h
+++ egcs/gcc/config/m68k/m68k.h
@@ -766,10 +766,6 @@ __transfer_from_trampoline () \
(DATA_REGNO_P (REGNO) \
|| DATA_REGNO_P (reg_renumber[REGNO]))

-#define REGNO_OK_FOR_FP_P(REGNO) \
- (FP_REGNO_P (REGNO) \
- || FP_REGNO_P (reg_renumber[REGNO]))
-
/* Now macros that check whether X is a register and also,
strictly, whether it is in a specified class.

@@ -781,7 +777,7 @@ __transfer_from_trampoline () \
#define DATA_REG_P(X) (REG_P (X) && REGNO_OK_FOR_DATA_P (REGNO (X)))

/* 1 if X is an fp register. */
-#define FP_REG_P(X) (REG_P (X) && REGNO_OK_FOR_FP_P (REGNO (X)))
+#define FP_REG_P(X) (REG_P (X) && FP_REGNO_P (X))

/* 1 if X is an address register */
#define ADDRESS_REG_P(X) (REG_P (X) && REGNO_OK_FOR_BASE_P (REGNO (X)))
Index: egcs/gcc/config/m68k/m68k.md
===================================================================
--- egcs.orig/gcc/config/m68k/m68k.md
+++ egcs/gcc/config/m68k/m68k.md
@@ -653,13 +653,22 @@
}
})

-;; General case of fullword move. The register constraints
-;; force integer constants in range for a moveq to be reloaded
-;; if they are headed for memory.
+;; General case of fullword move.
(define_insn ""
;; Notes: make sure no alternative allows g vs g.
;; We don't allow f-regs since fixed point cannot go in them.
[(set (match_operand:SI 0 "nonimmediate_operand" "=g,d,a<")
+ (match_operand:SI 1 "general_src_operand" "daymSnT,n,i"))]
+ "!TARGET_COLDFIRE && reload_completed"
+{
+ return output_move_simode (operands);
+})
+
+;; Before reload is completed the register constraints
+;; force integer constants in range for a moveq to be reloaded
+;; if they are headed for memory.
+(define_insn ""
+ [(set (match_operand:SI 0 "nonimmediate_operand" "=g,d,a<")
(match_operand:SI 1 "general_src_operand" "daymSKT,n,i"))]

"!TARGET_COLDFIRE"
@@ -6634,153 +6643,97 @@
;; and then is moved into an FP register.
;; But it is mainly intended to test the support for these optimizations.

-(define_peephole
+(define_peephole2
[(set (reg:SI SP_REG) (plus:SI (reg:SI SP_REG) (const_int 4)))
- (set (match_operand:DF 0 "register_operand" "=f")
- (match_operand:DF 1 "register_operand" "ad"))]
- "FP_REG_P (operands[0]) && ! FP_REG_P (operands[1])"
-{
- rtx xoperands[2];
- xoperands[1] = gen_rtx_REG (SImode, REGNO (operands[1]) + 1);
- output_asm_insn ("move%.l %1,%@", xoperands);
- output_asm_insn ("move%.l %1,%-", operands);
- return "fmove%.d %+,%0";
-})
+ (set (match_operand:DF 0 "register_operand" "")
+ (match_operand:DF 1 "register_operand" ""))]
+ "FP_REG_P (operands[0]) && !FP_REG_P (operands[1])"
+ [(set (mem:SI (reg:SI SP_REG)) (match_dup 1))
+ (set (mem:SI (pre_dec:SI (reg:SI SP_REG))) (match_dup 2))
+ (set (match_dup 0) (mem:DF (post_inc:SI (reg:SI SP_REG))))]
+ "split_di(operands + 1, 1, operands + 1, operands + 2);")

;; Optimize a stack-adjust followed by a push of an argument.
;; This is said to happen frequently with -msoft-float
;; when there are consecutive library calls.

-(define_peephole
+(define_peephole2
+ [(set (reg:SI SP_REG) (plus:SI (reg:SI SP_REG) (const_int 4)))
+ (set (mem:SF (pre_dec:SI (reg:SI SP_REG)))
+ (match_operand:SF 0 "general_operand" ""))]
+ "!reg_mentioned_p (stack_pointer_rtx, operands[0])"
+ [(set (mem:SF (reg:SI SP_REG)) (match_dup 0))]
+ "")
+
+(define_peephole2
[(set (reg:SI SP_REG) (plus:SI (reg:SI SP_REG)
- (match_operand:SI 0 "const_int_operand" "n")))
- (set (match_operand:SF 1 "push_operand" "=m")
- (match_operand:SF 2 "general_operand" "rmfF"))]
- "INTVAL (operands[0]) >= 4
- && ! reg_mentioned_p (stack_pointer_rtx, operands[2])"
-{
- if (INTVAL (operands[0]) > 4)
- {
- rtx xoperands[2];
- xoperands[0] = stack_pointer_rtx;
- xoperands[1] = GEN_INT (INTVAL (operands[0]) - 4);
- if (INTVAL (xoperands[1]) <= 8)
- {
- if (!TARGET_COLDFIRE)
- output_asm_insn ("addq%.w %1,%0", xoperands);
- else
- output_asm_insn ("addq%.l %1,%0", xoperands);
- }
- else if (TUNE_CPU32 && INTVAL (xoperands[1]) <= 16)
- {
- xoperands[1] = GEN_INT (INTVAL (xoperands[1]) - 8);
- output_asm_insn ("addq%.w #8,%0\;addq%.w %1,%0", xoperands);
- }
- else if (INTVAL (xoperands[1]) <= 0x7FFF)
- {
- if (TUNE_68040)
- output_asm_insn ("add%.w %1,%0", xoperands);
- else if (MOTOROLA)
- output_asm_insn ("lea (%c1,%0),%0", xoperands);
- else
- output_asm_insn ("lea %0@(%c1),%0", xoperands);
- }
- else
- output_asm_insn ("add%.l %1,%0", xoperands);
- }
- if (FP_REG_P (operands[2]))
- return "fmove%.s %2,%@";
- return "move%.l %2,%@";
-})
+ (match_operand:SI 0 "const_int_operand" "")))
+ (set (mem:SF (pre_dec:SI (reg:SI SP_REG)))
+ (match_operand:SF 1 "general_operand" ""))]
+ "INTVAL (operands[0]) > 4
+ && !reg_mentioned_p (stack_pointer_rtx, operands[1])"
+ [(set (reg:SI SP_REG) (plus:SI (reg:SI SP_REG) (match_dup 0)))
+ (set (mem:SF (reg:SI SP_REG)) (match_dup 1))]
+ "operands[0] = GEN_INT (INTVAL (operands[0]) - 4);")

;; Speed up stack adjust followed by a fullword fixedpoint push.

-(define_peephole
+(define_peephole2
+ [(set (reg:SI SP_REG) (plus:SI (reg:SI SP_REG) (const_int 4)))
+ (set (mem:SI (pre_dec:SI (reg:SI SP_REG)))
+ (match_operand:SI 0 "general_operand" ""))]
+ "!reg_mentioned_p (stack_pointer_rtx, operands[0])"
+ [(set (mem:SI (reg:SI SP_REG)) (match_dup 0))]
+ "")
+
+(define_peephole2
[(set (reg:SI SP_REG) (plus:SI (reg:SI SP_REG)
- (match_operand:SI 0 "const_int_operand" "n")))
- (set (match_operand:SI 1 "push_operand" "=m")
- (match_operand:SI 2 "general_operand" "g"))]
- "INTVAL (operands[0]) >= 4
- && ! reg_mentioned_p (stack_pointer_rtx, operands[2])"
-{
- if (INTVAL (operands[0]) > 4)
- {
- rtx xoperands[2];
- xoperands[0] = stack_pointer_rtx;
- xoperands[1] = GEN_INT (INTVAL (operands[0]) - 4);
- if (INTVAL (xoperands[1]) <= 8)
- {
- if (!TARGET_COLDFIRE)
- output_asm_insn ("addq%.w %1,%0", xoperands);
- else
- output_asm_insn ("addq%.l %1,%0", xoperands);
- }
- else if (TUNE_CPU32 && INTVAL (xoperands[1]) <= 16)
- {
- xoperands[1] = GEN_INT (INTVAL (xoperands[1]) - 8);
- output_asm_insn ("addq%.w #8,%0\;addq%.w %1,%0", xoperands);
- }
- else if (INTVAL (xoperands[1]) <= 0x7FFF)
- {
- if (TUNE_68040)
- output_asm_insn ("add%.w %1,%0", xoperands);
- else if (MOTOROLA)
- output_asm_insn ("lea (%c1,%0),%0", xoperands);
- else
- output_asm_insn ("lea %0@(%c1),%0", xoperands);
- }
- else
- output_asm_insn ("add%.l %1,%0", xoperands);
- }
- if (operands[2] == const0_rtx)
- return "clr%.l %@";
- return "move%.l %2,%@";
-})
+ (match_operand:SI 0 "const_int_operand" "")))
+ (set (mem:SI (pre_dec:SI (reg:SI SP_REG)))
+ (match_operand:SI 1 "general_operand" ""))]
+ "INTVAL (operands[0]) > 4
+ && !reg_mentioned_p (stack_pointer_rtx, operands[1])"
+ [(set (reg:SI SP_REG) (plus:SI (reg:SI SP_REG) (match_dup 0)))
+ (set (mem:SI (reg:SI SP_REG)) (match_dup 1))]
+ "operands[0] = GEN_INT (INTVAL (operands[0]) - 4);")

;; Speed up pushing a single byte but leaving four bytes of space.

-(define_peephole
- [(set (mem:QI (pre_dec:SI (reg:SI SP_REG)))
- (match_operand:QI 1 "general_operand" "dami"))
- (set (reg:SI SP_REG) (minus:SI (reg:SI SP_REG) (const_int 2)))]
- "! reg_mentioned_p (stack_pointer_rtx, operands[1])"
-{
- rtx xoperands[4];
-
- if (GET_CODE (operands[1]) == REG)
- return "move%.l %1,%-";
+(define_peephole2
+ [(set (reg:SI SP_REG) (plus:SI (reg:SI SP_REG) (const_int -4)))
+ (set (mem:QI (plus:SI (reg:SI SP_REG) (const_int 3)))
+ (match_operand:QI 0 "register_operand" ""))]
+ "!reg_mentioned_p (stack_pointer_rtx, operands[0])"
+ [(set (mem:SI (pre_dec:SI (reg:SI SP_REG))) (match_dup 0))]
+ "operands[0] = gen_rtx_REG (SImode, REGNO (operands[0]));")
+
+(define_peephole2
+ [(set (mem:HI (pre_dec:SI (reg:SI SP_REG)))
+ (match_operand:HI 0 "register_operand" ""))
+ (set (reg:SI SP_REG) (plus:SI (reg:SI SP_REG) (const_int -2)))]
+ "!reg_mentioned_p (stack_pointer_rtx, operands[0])"
+ [(set (mem:SI (pre_dec:SI (reg:SI SP_REG))) (match_dup 0))]
+ "operands[0] = gen_rtx_REG (SImode, REGNO (operands[0]));")

- xoperands[1] = operands[1];
- xoperands[2]
- = gen_rtx_MEM (QImode, plus_constant (stack_pointer_rtx, 3));
- xoperands[3] = stack_pointer_rtx;
- if (!TARGET_COLDFIRE)
- output_asm_insn ("subq%.w #4,%3\;move%.b %1,%2", xoperands);
- else
- output_asm_insn ("subq%.l #4,%3\;move%.b %1,%2", xoperands);
- return "";
-})
+(define_peephole2
+ [(set (match_operand:SI 0 "register_operand" "")
+ (const_int 0))
+ (set (strict_low_part (match_operand:HI 1 "register_operand" ""))
+ (match_operand:HI 2 "general_operand" ""))]
+ "REGNO (operands[0]) == REGNO (operands[1])
+ && strict_low_part_peephole_ok (HImode, insn, operands[0])"
+ [(set (strict_low_part (match_dup 1)) (match_dup 2))]
+ "")

-(define_peephole
- [(set (match_operand:SI 0 "register_operand" "=d")
+(define_peephole2
+ [(set (match_operand:SI 0 "register_operand" "")
(const_int 0))
- (set (strict_low_part (subreg:HI (match_dup 0) 2))
- (match_operand:HI 1 "general_operand" "rmn"))]
- "strict_low_part_peephole_ok (HImode, prev_nonnote_insn (insn), operands[0])"
-{
- if (GET_CODE (operands[1]) == CONST_INT)
- {
- if (operands[1] == const0_rtx
- && (DATA_REG_P (operands[0])
- || GET_CODE (operands[0]) == MEM)
- /* clr insns on 68000 read before writing. */
- && ((TARGET_68010 || TARGET_COLDFIRE)
- || !(GET_CODE (operands[0]) == MEM
- && MEM_VOLATILE_P (operands[0]))))
- return "clr%.w %0";
- }
- return "move%.w %1,%0";
-})
+ (set (strict_low_part (match_operand:QI 1 "register_operand" ""))
+ (match_operand:QI 2 "general_operand" ""))]
+ "REGNO (operands[0]) == REGNO (operands[1])
+ && strict_low_part_peephole_ok (QImode, insn, operands[0])"
+ [(set (strict_low_part (match_dup 1)) (match_dup 2))]
+ "")

;; dbCC peepholes
;;
Index: egcs/gcc/config/m68k/m68k.c
===================================================================
--- egcs.orig/gcc/config/m68k/m68k.c
+++ egcs/gcc/config/m68k/m68k.c
@@ -2484,7 +2484,7 @@ force_mode (enum machine_mode mode, rtx
static int
fp_reg_operand (rtx op, enum machine_mode mode ATTRIBUTE_UNUSED)
{
- return reg_renumber && FP_REG_P (op);
+ return FP_REG_P (op);
}

/* Emit insns to move operands[1] into operands[0].
@@ -3438,14 +3438,18 @@ bool
strict_low_part_peephole_ok (enum machine_mode mode, rtx first_insn,
rtx target)
{
- rtx p;
+ rtx p = first_insn;

- p = prev_nonnote_insn (first_insn);
-
- while (p)
+ while ((p = PREV_INSN (p)))
{
+ if (NOTE_INSN_BASIC_BLOCK_P (p))
+ return false;
+
+ if (NOTE_P (p))
+ continue;
+
/* If it isn't an insn, then give up. */
- if (GET_CODE (p) != INSN)
+ if (!INSN_P (p))
return false;

if (reg_set_p (target, p))
@@ -3475,8 +3479,6 @@ strict_low_part_peephole_ok (enum machin
else
return false;
}
-
- p = prev_nonnote_insn (p);
}

return false;

--
Richard Sandiford
2007-01-30 13:27:41 UTC
Permalink
Post by z***@linux-m68k.org
This converts many of the current text peepholes into rtl peepholes
(only peepholes involving cc0 are left). The primary reason for doing
this is that these cause corrupt frame information, as they manipulate
the stack pointer. I also adjusted the conditions, so they actually
match again.
* config/m68k/m68k.c (strict_low_part_peephole_ok): don't leave
the basic block
* config/m68k/m68k.md: convert text peepholes into rtl peepholes
* config/m68k/m68k.h (REGNO_OK_FOR_FP_P): remove
(FP_REG_P): use FP_REGNO_P directly without reg_renumber
As you mentioned, this conflicts heavily with the unreviewed
peephole2 patch I posted earlier.
Post by z***@linux-m68k.org
-;; General case of fullword move. The register constraints
-;; force integer constants in range for a moveq to be reloaded
-;; if they are headed for memory.
+;; General case of fullword move.
(define_insn ""
;; Notes: make sure no alternative allows g vs g.
;; We don't allow f-regs since fixed point cannot go in them.
[(set (match_operand:SI 0 "nonimmediate_operand" "=g,d,a<")
+ (match_operand:SI 1 "general_src_operand" "daymSnT,n,i"))]
+ "!TARGET_COLDFIRE && reload_completed"
+{
+ return output_move_simode (operands);
+})
+
+;; Before reload is completed the register constraints
+;; force integer constants in range for a moveq to be reloaded
+;; if they are headed for memory.
+(define_insn ""
+ [(set (match_operand:SI 0 "nonimmediate_operand" "=g,d,a<")
(match_operand:SI 1 "general_src_operand" "daymSKT,n,i"))]
"!TARGET_COLDFIRE"
I have no strong feelings here (it's !TARGET_COLDFIRE ;)) but I think it
would be better stylistically to have a new constraint that behaves like
'K' before and during reload, and like 'n' after reload. That way,
if some post-reload pass looks at the constraints of a reloaded second
insn, it will know that moveq constants are now allowed. It's a minor
point though.
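A minimal sketch of what I mean, assuming the old-style
CONST_OK_FOR_LETTER_P mechanism in m68k.h and a made-up letter 'Q'
(sketch only, not something I've tested):

/* Acts like 'K' (reject moveq-range constants) before and during reload,
   and like 'n' (accept any integer constant) once reload has finished.  */
#define MOVEQ_CONST_OK_P(VALUE) ((VALUE) >= -0x80 && (VALUE) < 0x80)

#define CONST_OK_FOR_LETTER_Q(VALUE) \
  (reload_completed \
   ? 1                              /* 'n': any CONST_INT */ \
   : !MOVEQ_CONST_OK_P (VALUE))     /* 'K': outside the moveq range */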
Post by z***@linux-m68k.org
@@ -6634,153 +6643,97 @@
;; and then is moved into an FP register.
;; But it is mainly intended to test the support for these optimizations.
-(define_peephole
+(define_peephole2
[(set (reg:SI SP_REG) (plus:SI (reg:SI SP_REG) (const_int 4)))
- (set (match_operand:DF 0 "register_operand" "=f")
- (match_operand:DF 1 "register_operand" "ad"))]
- "FP_REG_P (operands[0]) && ! FP_REG_P (operands[1])"
-{
- rtx xoperands[2];
- xoperands[1] = gen_rtx_REG (SImode, REGNO (operands[1]) + 1);
- output_asm_insn ("move%.l %1,%-", operands);
- return "fmove%.d %+,%0";
-})
+ (set (match_operand:DF 0 "register_operand" "")
+ (match_operand:DF 1 "register_operand" ""))]
+ "FP_REG_P (operands[0]) && !FP_REG_P (operands[1])"
+ [(set (mem:SI (reg:SI SP_REG)) (match_dup 1))
+ (set (mem:SI (pre_dec:SI (reg:SI SP_REG))) (match_dup 2))
+ (set (match_dup 0) (mem:DF (post_inc:SI (reg:SI SP_REG))))]
+ "split_di(operands + 1, 1, operands + 1, operands + 2);")
One thing the patch I posted did was use adjust_address* family
of functions to create the new MEM. Hard-coding the MEM in the
output pattern will lose the alias information on the original MEM
(including that set up by get_frame_mem). The same comment applies
to the other peepholes.
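For instance (a sketch, not lifted from either patch, and the operand
number is arbitrary), the difference inside a peephole2 preparation
statement is roughly:

/* Building a fresh MEM discards the original MEM's attributes
   (alias set, MEM_OFFSET, volatility, ...):  */
rtx lossy = gen_rtx_MEM (SImode, stack_pointer_rtx);

/* replace_equiv_address keeps the attributes and only swaps in
   the new address:  */
rtx kept = replace_equiv_address (operands[1], stack_pointer_rtx);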

Richard
Roman Zippel
2007-02-06 03:32:36 UTC
Permalink
Hi,
Post by Richard Sandiford
Post by z***@linux-m68k.org
-;; General case of fullword move. The register constraints
-;; force integer constants in range for a moveq to be reloaded
-;; if they are headed for memory.
+;; General case of fullword move.
(define_insn ""
;; Notes: make sure no alternative allows g vs g.
;; We don't allow f-regs since fixed point cannot go in them.
[(set (match_operand:SI 0 "nonimmediate_operand" "=g,d,a<")
+ (match_operand:SI 1 "general_src_operand" "daymSnT,n,i"))]
+ "!TARGET_COLDFIRE && reload_completed"
+{
+ return output_move_simode (operands);
+})
+
+;; Before reload is completed the register constraints
+;; force integer constants in range for a moveq to be reloaded
+;; if they are headed for memory.
+(define_insn ""
+ [(set (match_operand:SI 0 "nonimmediate_operand" "=g,d,a<")
(match_operand:SI 1 "general_src_operand" "daymSKT,n,i"))]
"!TARGET_COLDFIRE"
I have no strong feelings here (it's !TARGET_COLDFIRE ;)) but I think it
would be better stylistically to have a new constraint that behaves like
'K' before and during reload, and like 'n' after reload. That way,
if some post-reload pass looks at the constraints of a reloaded second
insn, it will know that moveq constants are now allowed. It's a minor
point though.
BTW I think something similar is needed for ColdFire as well, if we want
to avoid the unspec (which I'd really like to).
Post by Richard Sandiford
One thing the patch I posted did was use adjust_address* family
of functions to create the new MEM. Hard-coding the MEM in the
output pattern will lose the alias information on the original MEM
(including that set up by get_frame_mem). The same comment applies
to the other peepholes.
Below is a new version of the patch which does this; I also added the CF
variant for a single-byte push. (Lightly tested.)

bye, Roman

---

gcc/config/m68k/m68k.c | 18 +--
gcc/config/m68k/m68k.h | 6 -
gcc/config/m68k/m68k.md | 252 ++++++++++++++++++++++--------------------------
3 files changed, 130 insertions(+), 146 deletions(-)

Index: egcs/gcc/config/m68k/m68k.h
===================================================================
--- egcs.orig/gcc/config/m68k/m68k.h
+++ egcs/gcc/config/m68k/m68k.h
@@ -773,10 +773,6 @@ __transfer_from_trampoline () \
(DATA_REGNO_P (REGNO) \
|| DATA_REGNO_P (reg_renumber[REGNO]))

-#define REGNO_OK_FOR_FP_P(REGNO) \
- (FP_REGNO_P (REGNO) \
- || FP_REGNO_P (reg_renumber[REGNO]))
-
/* Now macros that check whether X is a register and also,
strictly, whether it is in a specified class.

@@ -788,7 +784,7 @@ __transfer_from_trampoline () \
#define DATA_REG_P(X) (REG_P (X) && REGNO_OK_FOR_DATA_P (REGNO (X)))

/* 1 if X is an fp register. */
-#define FP_REG_P(X) (REG_P (X) && REGNO_OK_FOR_FP_P (REGNO (X)))
+#define FP_REG_P(X) (REG_P (X) && FP_REGNO_P (REGNO (X)))

/* 1 if X is an address register */
#define ADDRESS_REG_P(X) (REG_P (X) && REGNO_OK_FOR_BASE_P (REGNO (X)))
Index: egcs/gcc/config/m68k/m68k.md
===================================================================
--- egcs.orig/gcc/config/m68k/m68k.md
+++ egcs/gcc/config/m68k/m68k.md
@@ -653,13 +653,22 @@
}
})

-;; General case of fullword move. The register constraints
-;; force integer constants in range for a moveq to be reloaded
-;; if they are headed for memory.
+;; General case of fullword move.
(define_insn ""
;; Notes: make sure no alternative allows g vs g.
;; We don't allow f-regs since fixed point cannot go in them.
[(set (match_operand:SI 0 "nonimmediate_operand" "=g,d,a<")
+ (match_operand:SI 1 "general_src_operand" "daymSnT,n,i"))]
+ "!TARGET_COLDFIRE && reload_completed"
+{
+ return output_move_simode (operands);
+})
+
+;; Before reload is completed the register constraints
+;; force integer constants in range for a moveq to be reloaded
+;; if they are headed for memory.
+(define_insn ""
+ [(set (match_operand:SI 0 "nonimmediate_operand" "=g,d,a<")
(match_operand:SI 1 "general_src_operand" "daymSKT,n,i"))]

"!TARGET_COLDFIRE"
@@ -6634,153 +6643,130 @@
;; and then is moved into an FP register.
;; But it is mainly intended to test the support for these optimizations.

-(define_peephole
+(define_peephole2
[(set (reg:SI SP_REG) (plus:SI (reg:SI SP_REG) (const_int 4)))
- (set (match_operand:DF 0 "register_operand" "=f")
- (match_operand:DF 1 "register_operand" "ad"))]
- "FP_REG_P (operands[0]) && ! FP_REG_P (operands[1])"
-{
- rtx xoperands[2];
- xoperands[1] = gen_rtx_REG (SImode, REGNO (operands[1]) + 1);
- output_asm_insn ("move%.l %1,%@", xoperands);
- output_asm_insn ("move%.l %1,%-", operands);
- return "fmove%.d %+,%0";
-})
+ (set (match_operand:DF 0 "register_operand" "")
+ (match_operand:DF 1 "register_operand" ""))]
+ "FP_REG_P (operands[0]) && !FP_REG_P (operands[1])"
+ [(set (mem:SI (reg:SI SP_REG)) (match_dup 1))
+ (set (mem:SI (pre_dec:SI (reg:SI SP_REG))) (match_dup 2))
+ (set (match_dup 0) (mem:DF (post_inc:SI (reg:SI SP_REG))))]
+ "split_di(operands + 1, 1, operands + 1, operands + 2);")

;; Optimize a stack-adjust followed by a push of an argument.
;; This is said to happen frequently with -msoft-float
;; when there are consecutive library calls.

-(define_peephole
+(define_peephole2
+ [(set (reg:SI SP_REG) (plus:SI (reg:SI SP_REG) (const_int 4)))
+ (set (match_operand:SF 0 "push_operand" "")
+ (match_operand:SF 1 "general_operand" ""))]
+ "!reg_mentioned_p (stack_pointer_rtx, operands[0])"
+ [(set (match_dup 0) (match_dup 1))]
+ "operands[0] = replace_equiv_address (operands[0], stack_pointer_rtx);")
+
+(define_peephole2
[(set (reg:SI SP_REG) (plus:SI (reg:SI SP_REG)
- (match_operand:SI 0 "const_int_operand" "n")))
- (set (match_operand:SF 1 "push_operand" "=m")
- (match_operand:SF 2 "general_operand" "rmfF"))]
- "INTVAL (operands[0]) >= 4
- && ! reg_mentioned_p (stack_pointer_rtx, operands[2])"
-{
- if (INTVAL (operands[0]) > 4)
- {
- rtx xoperands[2];
- xoperands[0] = stack_pointer_rtx;
- xoperands[1] = GEN_INT (INTVAL (operands[0]) - 4);
- if (INTVAL (xoperands[1]) <= 8)
- {
- if (!TARGET_COLDFIRE)
- output_asm_insn ("addq%.w %1,%0", xoperands);
- else
- output_asm_insn ("addq%.l %1,%0", xoperands);
- }
- else if (TUNE_CPU32 && INTVAL (xoperands[1]) <= 16)
- {
- xoperands[1] = GEN_INT (INTVAL (xoperands[1]) - 8);
- output_asm_insn ("addq%.w #8,%0\;addq%.w %1,%0", xoperands);
- }
- else if (INTVAL (xoperands[1]) <= 0x7FFF)
- {
- if (TUNE_68040)
- output_asm_insn ("add%.w %1,%0", xoperands);
- else if (MOTOROLA)
- output_asm_insn ("lea (%c1,%0),%0", xoperands);
- else
- output_asm_insn ("lea %0@(%c1),%0", xoperands);
- }
- else
- output_asm_insn ("add%.l %1,%0", xoperands);
- }
- if (FP_REG_P (operands[2]))
- return "fmove%.s %2,%@";
- return "move%.l %2,%@";
+ (match_operand:SI 0 "const_int_operand" "")))
+ (set (match_operand:SF 1 "push_operand" "")
+ (match_operand:SF 2 "general_operand" ""))]
+ "INTVAL (operands[0]) > 4
+ && !reg_mentioned_p (stack_pointer_rtx, operands[2])"
+ [(set (reg:SI SP_REG) (plus:SI (reg:SI SP_REG) (match_dup 0)))
+ (set (match_dup 1) (match_dup 2))]
+{
+ operands[0] = GEN_INT (INTVAL (operands[0]) - 4);
+ operands[1] = replace_equiv_address (operands[1], stack_pointer_rtx);
})

;; Speed up stack adjust followed by a fullword fixedpoint push.

-(define_peephole
+(define_peephole2
+ [(set (reg:SI SP_REG) (plus:SI (reg:SI SP_REG) (const_int 4)))
+ (set (match_operand:SI 0 "push_operand" "")
+ (match_operand:SI 1 "general_operand" ""))]
+ "!reg_mentioned_p (stack_pointer_rtx, operands[1])"
+ [(set (match_dup 0) (match_dup 1))]
+ "operands[0] = replace_equiv_address (operands[0], stack_pointer_rtx);")
+
+(define_peephole2
[(set (reg:SI SP_REG) (plus:SI (reg:SI SP_REG)
- (match_operand:SI 0 "const_int_operand" "n")))
- (set (match_operand:SI 1 "push_operand" "=m")
- (match_operand:SI 2 "general_operand" "g"))]
- "INTVAL (operands[0]) >= 4
- && ! reg_mentioned_p (stack_pointer_rtx, operands[2])"
-{
- if (INTVAL (operands[0]) > 4)
- {
- rtx xoperands[2];
- xoperands[0] = stack_pointer_rtx;
- xoperands[1] = GEN_INT (INTVAL (operands[0]) - 4);
- if (INTVAL (xoperands[1]) <= 8)
- {
- if (!TARGET_COLDFIRE)
- output_asm_insn ("addq%.w %1,%0", xoperands);
- else
- output_asm_insn ("addq%.l %1,%0", xoperands);
- }
- else if (TUNE_CPU32 && INTVAL (xoperands[1]) <= 16)
- {
- xoperands[1] = GEN_INT (INTVAL (xoperands[1]) - 8);
- output_asm_insn ("addq%.w #8,%0\;addq%.w %1,%0", xoperands);
- }
- else if (INTVAL (xoperands[1]) <= 0x7FFF)
- {
- if (TUNE_68040)
- output_asm_insn ("add%.w %1,%0", xoperands);
- else if (MOTOROLA)
- output_asm_insn ("lea (%c1,%0),%0", xoperands);
- else
- output_asm_insn ("lea %0@(%c1),%0", xoperands);
- }
- else
- output_asm_insn ("add%.l %1,%0", xoperands);
- }
- if (operands[2] == const0_rtx)
- return "clr%.l %@";
- return "move%.l %2,%@";
+ (match_operand:SI 0 "const_int_operand" "")))
+ (set (match_operand:SI 1 "push_operand" "")
+ (match_operand:SI 2 "general_operand" ""))]
+ "INTVAL (operands[0]) > 4
+ && !reg_mentioned_p (stack_pointer_rtx, operands[2])"
+ [(set (reg:SI SP_REG) (plus:SI (reg:SI SP_REG) (match_dup 0)))
+ (set (match_dup 1) (match_dup 2))]
+{
+ operands[0] = GEN_INT (INTVAL (operands[0]) - 4);
+ operands[1] = replace_equiv_address (operands[1], stack_pointer_rtx);
+})
+
+;; Speed up pushing a single byte/two bytes but leaving four bytes of space
+;; (which differs slightly between m680x0 and ColdFire).
+
+(define_peephole2
+ [(set (reg:SI SP_REG) (plus:SI (reg:SI SP_REG) (const_int -4)))
+ (set (match_operand:QI 0 "memory_operand" "")
+ (match_operand:QI 1 "register_operand" ""))]
+ "!reg_mentioned_p (stack_pointer_rtx, operands[1])
+ && GET_CODE (XEXP (operands[0], 0)) == PLUS
+ && rtx_equal_p (XEXP (XEXP (operands[0], 0), 0), stack_pointer_rtx)
+ && CONST_INT_P (XEXP (XEXP (operands[0], 0), 1))
+ && INTVAL (XEXP (XEXP (operands[0], 0), 1)) == 3"
+ [(set (match_dup 0) (match_dup 1))]
+{
+ rtx addr = gen_rtx_PRE_DEC (Pmode, stack_pointer_rtx);
+ operands[0] = adjust_automodify_address (operands[0], SImode, addr, -3);
+ operands[1] = simplify_gen_subreg (SImode, operands[1], QImode, 0);
+})
+
+(define_peephole2
+ [(set (match_operand:QI 0 "push_operand" "")
+ (match_operand:QI 1 "register_operand" ""))
+ (set (reg:SI SP_REG) (plus:SI (reg:SI SP_REG) (const_int -3)))]
+ "!reg_mentioned_p (stack_pointer_rtx, operands[1])"
+ [(set (match_dup 0) (match_dup 1))]
+{
+ operands[0] = adjust_automodify_address (operands[0], SImode,
+ XEXP (operands[0], 0), -3);
+ operands[1] = simplify_gen_subreg (SImode, operands[1], QImode, 0);
+})
+
+(define_peephole2
+ [(set (match_operand:HI 0 "push_operand" "")
+ (match_operand:HI 1 "register_operand" ""))
+ (set (reg:SI SP_REG) (plus:SI (reg:SI SP_REG) (const_int -2)))]
+ "!reg_mentioned_p (stack_pointer_rtx, operands[1])"
+ [(set (match_dup 0) (match_dup 1))]
+{
+ operands[0] = adjust_automodify_address (operands[0], SImode,
+ XEXP (operands[0], 0), -2);
+ operands[1] = simplify_gen_subreg (SImode, operands[1], HImode, 0);
})

-;; Speed up pushing a single byte but leaving four bytes of space.
-
-(define_peephole
- [(set (mem:QI (pre_dec:SI (reg:SI SP_REG)))
- (match_operand:QI 1 "general_operand" "dami"))
- (set (reg:SI SP_REG) (minus:SI (reg:SI SP_REG) (const_int 2)))]
- "! reg_mentioned_p (stack_pointer_rtx, operands[1])"
-{
- rtx xoperands[4];
-
- if (GET_CODE (operands[1]) == REG)
- return "move%.l %1,%-";
+;; Optimize a series of strict_low_part assignments

- xoperands[1] = operands[1];
- xoperands[2]
- = gen_rtx_MEM (QImode, plus_constant (stack_pointer_rtx, 3));
- xoperands[3] = stack_pointer_rtx;
- if (!TARGET_COLDFIRE)
- output_asm_insn ("subq%.w #4,%3\;move%.b %1,%2", xoperands);
- else
- output_asm_insn ("subq%.l #4,%3\;move%.b %1,%2", xoperands);
- return "";
-})
+(define_peephole2
+ [(set (match_operand:SI 0 "register_operand" "")
+ (const_int 0))
+ (set (strict_low_part (match_operand:HI 1 "register_operand" ""))
+ (match_operand:HI 2 "general_operand" ""))]
+ "REGNO (operands[0]) == REGNO (operands[1])
+ && strict_low_part_peephole_ok (HImode, insn, operands[0])"
+ [(set (strict_low_part (match_dup 1)) (match_dup 2))]
+ "")

-(define_peephole
- [(set (match_operand:SI 0 "register_operand" "=d")
+(define_peephole2
+ [(set (match_operand:SI 0 "register_operand" "")
(const_int 0))
- (set (strict_low_part (subreg:HI (match_dup 0) 2))
- (match_operand:HI 1 "general_operand" "rmn"))]
- "strict_low_part_peephole_ok (HImode, prev_nonnote_insn (insn), operands[0])"
-{
- if (GET_CODE (operands[1]) == CONST_INT)
- {
- if (operands[1] == const0_rtx
- && (DATA_REG_P (operands[0])
- || GET_CODE (operands[0]) == MEM)
- /* clr insns on 68000 read before writing. */
- && ((TARGET_68010 || TARGET_COLDFIRE)
- || !(GET_CODE (operands[0]) == MEM
- && MEM_VOLATILE_P (operands[0]))))
- return "clr%.w %0";
- }
- return "move%.w %1,%0";
-})
+ (set (strict_low_part (match_operand:QI 1 "register_operand" ""))
+ (match_operand:QI 2 "general_operand" ""))]
+ "REGNO (operands[0]) == REGNO (operands[1])
+ && strict_low_part_peephole_ok (QImode, insn, operands[0])"
+ [(set (strict_low_part (match_dup 1)) (match_dup 2))]
+ "")

;; dbCC peepholes
;;
Index: egcs/gcc/config/m68k/m68k.c
===================================================================
--- egcs.orig/gcc/config/m68k/m68k.c
+++ egcs/gcc/config/m68k/m68k.c
@@ -2485,7 +2485,7 @@ force_mode (enum machine_mode mode, rtx
static int
fp_reg_operand (rtx op, enum machine_mode mode ATTRIBUTE_UNUSED)
{
- return reg_renumber && FP_REG_P (op);
+ return FP_REG_P (op);
}

/* Emit insns to move operands[1] into operands[0].
@@ -3439,14 +3439,18 @@ bool
strict_low_part_peephole_ok (enum machine_mode mode, rtx first_insn,
rtx target)
{
- rtx p;
+ rtx p = first_insn;

- p = prev_nonnote_insn (first_insn);
-
- while (p)
+ while ((p = PREV_INSN (p)))
{
+ if (NOTE_INSN_BASIC_BLOCK_P (p))
+ return false;
+
+ if (NOTE_P (p))
+ continue;
+
/* If it isn't an insn, then give up. */
- if (GET_CODE (p) != INSN)
+ if (!INSN_P (p))
return false;

if (reg_set_p (target, p))
@@ -3476,8 +3480,6 @@ strict_low_part_peephole_ok (enum machin
else
return false;
}
-
- p = prev_nonnote_insn (p);
}

return false;
Richard Sandiford
2007-02-06 07:38:27 UTC
Permalink
Post by Roman Zippel
Post by Richard Sandiford
I have no strong feelings here (it's !TARGET_COLDFIRE ;)) but I think it
would be better stylistically to have a new constraint that behaves like
'K' before and during reload, and like 'n' after reload. That way,
if some post-reload pass looks at the constraints of a reloaded second
insn, it will know that moveq constants are now allowed. It's a minor
point though.
BTW I think something similar is needed for ColdFire as well, if we want
to avoid the unspec (which I'd really like to).
I think it's 680x0 only. The ColdFire move patterns don't play the
same moveq trick as the 680x0 code. (Don't ask me why.)
Post by Roman Zippel
Post by Richard Sandiford
One thing the patch I posted did was use adjust_address* family
of functions to create the new MEM. Hard-coding the MEM in the
output pattern will lose the alias information on the original MEM
(including that set up by get_frame_mem). The same comment applies
to the other peepholes.
Below is a new version of the patch which does this; I also added the CF
variant for a single-byte push. (Lightly tested.)
-(define_peephole
+(define_peephole2
[(set (reg:SI SP_REG) (plus:SI (reg:SI SP_REG) (const_int 4)))
- (set (match_operand:DF 0 "register_operand" "=f")
- (match_operand:DF 1 "register_operand" "ad"))]
- "FP_REG_P (operands[0]) && ! FP_REG_P (operands[1])"
-{
- rtx xoperands[2];
- xoperands[1] = gen_rtx_REG (SImode, REGNO (operands[1]) + 1);
- output_asm_insn ("move%.l %1,%-", operands);
- return "fmove%.d %+,%0";
-})
+ (set (match_operand:DF 0 "register_operand" "")
+ (match_operand:DF 1 "register_operand" ""))]
+ "FP_REG_P (operands[0]) && !FP_REG_P (operands[1])"
+ [(set (mem:SI (reg:SI SP_REG)) (match_dup 1))
+ (set (mem:SI (pre_dec:SI (reg:SI SP_REG))) (match_dup 2))
+ (set (match_dup 0) (mem:DF (post_inc:SI (reg:SI SP_REG))))]
+ "split_di(operands + 1, 1, operands + 1, operands + 2);")
...here.

It's your call, but seeing as the patch I posted has been approved,
and is only waiting on an unapproved dependent patch, it might be
better to hold off this discussion until either (a) that patch gets
reviewed, or (b) the approval of the patch I posted is reversed.
(It's only a suggestion -- doing patches against mainline is obviously
the right thing to do, and I'm not trying to imply otherwise.
Also, if there's something you object to in that patch -- the UNSPEC? --
then I can try to fix it.)

Richard
Roman Zippel
2007-02-06 13:20:56 UTC
Permalink
Hi,
Post by Richard Sandiford
Post by Roman Zippel
BTW I think something similar is needed for ColdFire as well, if we want
to avoid the unspec (which I'd really like to).
I think it's 680x0 only. The ColdFire move patterns don't play the
same moveq trick as the 680x0 code. (Don't ask me why.)
I misremembered this patch; I thought it was about the mov3q instruction.
The R constraint was unused, but one of your patches activates it, so are
you sure that ColdFire doesn't have a similar problem now?
Post by Richard Sandiford
Post by Roman Zippel
Post by Richard Sandiford
One thing the patch I posted did was use adjust_address* family
of functions to create the new MEM. Hard-coding the MEM in the
output pattern will lose the alias information on the original MEM
(including that set up by get_frame_mem). The same comment applies
to the other peepholes.
Below is a new version of the patch which does this; I also added the CF
variant for a single-byte push. (Lightly tested.)
That one looks pretty much like in your version. :)
Post by Richard Sandiford
It's your call, but seeing as the patch I posted has been approved,
and is only waiting on an unapproved dependent patch, it might be
better to hold off this discussion until either (a) that patch gets
reviewed, or (b) the approval of the patch I posted is reversed.
(It's only a suggestion -- doing patches against mainline is obviously
the right thing to do, and I'm not trying to imply otherwise.
Well, I don't mind making the patch incremental to yours (and my patch has
a dependency on another patch as well), but my patch does a bit more, so
it's not completely redundant.
I'm also thinking about other possible improvements, e.g. currently it
converts a "pea 100" into a "move.l #100,(%sp)", which is only a win if
the previous addq.l is eliminated; otherwise it needs a scratch register
and a moveq just to match the original size.
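For reference, the byte counts I have in mind (assumed from the 680x0
encodings, not measured output):

/* Rough 680x0 instruction sizes behind the remark above (assumptions): */
enum {
  SIZE_ADDQ_L_SP     = 2,  /* addq.l #4,%sp                         */
  SIZE_PEA_ABS_W     = 4,  /* pea 100.w                             */
  SIZE_MOVE_L_IMM_SP = 6,  /* move.l #100,(%sp)                     */
  SIZE_MOVEQ         = 2,  /* moveq #100,%d0 (needs a scratch reg)  */
  SIZE_MOVE_L_REG_SP = 2   /* move.l %d0,(%sp)                      */
};
/* addq + pea is 6 bytes, the same as the converted move.l #100,(%sp),
   so the conversion only wins if the addq really goes away; with a
   scratch register, moveq + move.l is 4 bytes, i.e. only as small as
   the original pea.  */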
Post by Richard Sandiford
Also, if there's something you object to in that patch -- the UNSPEC? --
then I can try to fix it.)
The last peephole added is pretty much useless as it will never match.

bye, Roman
Richard Sandiford
2007-02-06 13:41:36 UTC
Permalink
Post by Roman Zippel
Post by Richard Sandiford
Post by Roman Zippel
BTW I think something similar is needed for ColdFire as well, if we want
to avoid the unspec (which I'd really like to).
I think it's 680x0 only. The ColdFire move patterns don't play the
same moveq trick as the 680x0 code. (Don't ask me why.)
I misremembered this patch; I thought it was about the mov3q instruction.
The R constraint was unused, but one of your patches activates it, so are
you sure that ColdFire doesn't have a similar problem now?
Yes. The 'R' constraint is an extra constraint that is allowed to be
the source of a memory move. It applies over and above what earlier
ColdFire targets allowed there. In other words, the 'R' stuff allows
_more_ alternatives, while the use of 'K' in the 680x0 insns is
specifically there to allow _fewer_ alternatives.

(mov3q is like a memory version of moveq, and lets you move constants
directly to offset addresses, which ColdFire's move.foo doesn't.)
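For the peepholes that test valid_mov3q_const, my understanding of the
condition -- sketched below, as an assumption rather than something
checked against m68k.c -- is that the 3-bit immediate encodes -1 as 0,
so the representable constants are -1 and 1..7:

static bool
valid_mov3q_const_sketch (HOST_WIDE_INT constval)
{
  /* Sketch only; the real test presumably also requires an
     ISA_B-or-later ColdFire target.  */
  return constval == -1 || (constval >= 1 && constval <= 7);
}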
Post by Roman Zippel
Post by Richard Sandiford
Post by Roman Zippel
Post by Richard Sandiford
One thing the patch I posted did was use adjust_address* family
of functions to create the new MEM. Hard-coding the MEM in the
output pattern will lose the alias information on the original MEM
(including that set up by get_frame_mem). The same comment applies
to the other peepholes.
Below is a new version of the patch which does this; I also added the CF
variant for a single-byte push. (Lightly tested.)
That one looks pretty much like in your version. :)
Oops ;) Serves me right for going off memory.

So have you incorporated all the address-changing stuff that was
in the patch I posted? Did you do it the same way, or differently?

If you did it the same way in each case, then I'm happy to yield to
your patch.

Richard
Jeffrey Law
2007-02-06 18:34:50 UTC
Permalink
Post by Richard Sandiford
So have you incorporated all the address-changing stuff that was
in the patch I posted? Did you do it the same way, or differently?
If you did it the same way in each case, then I'm happy to yield to
your patch.
Roman -- can you retest the latest version and post it? Based on
Richard's comments we'll go with yours first, then layer any
still-relevant bits from Richard's patch onto yours.

Thanks,
Jeff
Roman Zippel
2007-02-07 21:39:52 UTC
Permalink
Hi,
Post by Richard Sandiford
Yes. The 'R' constraint is an extra constraint that is allowed to be
the source of a memory move. It applies over and above what earlier
ColdFire targets allowed there. In other words, the 'R' stuff allows
_more_ alternatives, while the use of 'K' in the 680x0 insns is
specifically there to allow _fewer_ alternatives.
(mov3q is like a memory version of moveq, and lets you move constants
directly to offset addresses, which ColdFire's move.foo doesn't.)
ColdFire addressing modes are confusing. :)
Post by Richard Sandiford
Post by Roman Zippel
Post by Roman Zippel
Post by Richard Sandiford
One thing the patch I posted did was use adjust_address* family
of functions to create the new MEM. Hard-coding the MEM in the
output pattern will lose the alias information on the original MEM
(including that set up by get_frame_mem). The same comment applies
to the other peepholes.
Below is a new version of the patch which does this; I also added the CF
variant for a single-byte push. (Lightly tested.)
That one looks pretty much like in your version. :)
Oops ;) Serves me right for going off memory.
So have you incorporated all the address-changing stuff that was
in the patch I posted? Did you do it the same way, or differently?
It's pretty much the same way; I'm avoiding generating a new pre_dec where
possible:

addr = gen_rtx_PRE_DEC (Pmode, stack_pointer_rtx);
operands[2] = adjust_automodify_address (operands[0], SImode, addr, -3);

and just do:

operands[0] = adjust_automodify_address (operands[0], SImode,
XEXP (operands[0], 0), -2);

(If I read the source correctly, using NULL there does the same.)
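That is (a sketch of the equivalence I mean; the null-ADDR fallback to
the MEM's own address is my reading of emit-rtl.c, not something
verified against this tree):

/* These two calls should build the same MEM: */
rtx a = adjust_automodify_address (operands[0], SImode,
                                   XEXP (operands[0], 0), -2);
rtx b = adjust_automodify_address (operands[0], SImode, NULL_RTX, -2);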

Below is hopefully the final patch; it now tries to avoid changing "pea
X.w" into "move.l" (it can still happen, but only if it's a clear win) and
also tries to use moveq if a scratch register is available.

bye, Roman

---

gcc/config/m68k/m68k.c | 18 +--
gcc/config/m68k/m68k.h | 6 -
gcc/config/m68k/m68k.md | 285 +++++++++++++++++++++++++-----------------------
3 files changed, 163 insertions(+), 146 deletions(-)

Index: egcs/gcc/config/m68k/m68k.h
===================================================================
--- egcs.orig/gcc/config/m68k/m68k.h
+++ egcs/gcc/config/m68k/m68k.h
@@ -773,10 +773,6 @@ __transfer_from_trampoline () \
(DATA_REGNO_P (REGNO) \
|| DATA_REGNO_P (reg_renumber[REGNO]))

-#define REGNO_OK_FOR_FP_P(REGNO) \
- (FP_REGNO_P (REGNO) \
- || FP_REGNO_P (reg_renumber[REGNO]))
-
/* Now macros that check whether X is a register and also,
strictly, whether it is in a specified class.

@@ -788,7 +784,7 @@ __transfer_from_trampoline () \
#define DATA_REG_P(X) (REG_P (X) && REGNO_OK_FOR_DATA_P (REGNO (X)))

/* 1 if X is an fp register. */
-#define FP_REG_P(X) (REG_P (X) && REGNO_OK_FOR_FP_P (REGNO (X)))
+#define FP_REG_P(X) (REG_P (X) && FP_REGNO_P (REGNO (X)))

/* 1 if X is an address register */
#define ADDRESS_REG_P(X) (REG_P (X) && REGNO_OK_FOR_BASE_P (REGNO (X)))
Index: egcs/gcc/config/m68k/m68k.md
===================================================================
--- egcs.orig/gcc/config/m68k/m68k.md
+++ egcs/gcc/config/m68k/m68k.md
@@ -653,13 +653,22 @@
}
})

-;; General case of fullword move. The register constraints
-;; force integer constants in range for a moveq to be reloaded
-;; if they are headed for memory.
+;; General case of fullword move.
(define_insn ""
;; Notes: make sure no alternative allows g vs g.
;; We don't allow f-regs since fixed point cannot go in them.
[(set (match_operand:SI 0 "nonimmediate_operand" "=g,d,a<")
+ (match_operand:SI 1 "general_src_operand" "daymSnT,n,i"))]
+ "!TARGET_COLDFIRE && reload_completed"
+{
+ return output_move_simode (operands);
+})
+
+;; Before reload is completed the register constraints
+;; force integer constants in range for a moveq to be reloaded
+;; if they are headed for memory.
+(define_insn ""
+ [(set (match_operand:SI 0 "nonimmediate_operand" "=g,d,a<")
(match_operand:SI 1 "general_src_operand" "daymSKT,n,i"))]

"!TARGET_COLDFIRE"
@@ -6634,153 +6643,163 @@
;; and then is moved into an FP register.
;; But it is mainly intended to test the support for these optimizations.

-(define_peephole
+(define_peephole2
[(set (reg:SI SP_REG) (plus:SI (reg:SI SP_REG) (const_int 4)))
- (set (match_operand:DF 0 "register_operand" "=f")
- (match_operand:DF 1 "register_operand" "ad"))]
- "FP_REG_P (operands[0]) && ! FP_REG_P (operands[1])"
-{
- rtx xoperands[2];
- xoperands[1] = gen_rtx_REG (SImode, REGNO (operands[1]) + 1);
- output_asm_insn ("move%.l %1,%@", xoperands);
- output_asm_insn ("move%.l %1,%-", operands);
- return "fmove%.d %+,%0";
-})
+ (set (match_operand:DF 0 "register_operand" "")
+ (match_operand:DF 1 "register_operand" ""))]
+ "FP_REG_P (operands[0]) && !FP_REG_P (operands[1])"
+ [(set (mem:SI (reg:SI SP_REG)) (match_dup 1))
+ (set (mem:SI (pre_dec:SI (reg:SI SP_REG))) (match_dup 2))
+ (set (match_dup 0) (mem:DF (post_inc:SI (reg:SI SP_REG))))]
+ "split_di(operands + 1, 1, operands + 1, operands + 2);")

;; Optimize a stack-adjust followed by a push of an argument.
;; This is said to happen frequently with -msoft-float
;; when there are consecutive library calls.

-(define_peephole
+(define_peephole2
+ [(set (reg:SI SP_REG) (plus:SI (reg:SI SP_REG) (const_int 4)))
+ (set (match_operand:SF 0 "push_operand" "")
+ (match_operand:SF 1 "general_operand" ""))]
+ "!reg_mentioned_p (stack_pointer_rtx, operands[0])"
+ [(set (match_dup 0) (match_dup 1))]
+ "operands[0] = replace_equiv_address (operands[0], stack_pointer_rtx);")
+
+(define_peephole2
[(set (reg:SI SP_REG) (plus:SI (reg:SI SP_REG)
- (match_operand:SI 0 "const_int_operand" "n")))
- (set (match_operand:SF 1 "push_operand" "=m")
- (match_operand:SF 2 "general_operand" "rmfF"))]
- "INTVAL (operands[0]) >= 4
- && ! reg_mentioned_p (stack_pointer_rtx, operands[2])"
-{
- if (INTVAL (operands[0]) > 4)
- {
- rtx xoperands[2];
- xoperands[0] = stack_pointer_rtx;
- xoperands[1] = GEN_INT (INTVAL (operands[0]) - 4);
- if (INTVAL (xoperands[1]) <= 8)
- {
- if (!TARGET_COLDFIRE)
- output_asm_insn ("addq%.w %1,%0", xoperands);
- else
- output_asm_insn ("addq%.l %1,%0", xoperands);
- }
- else if (TUNE_CPU32 && INTVAL (xoperands[1]) <= 16)
- {
- xoperands[1] = GEN_INT (INTVAL (xoperands[1]) - 8);
- output_asm_insn ("addq%.w #8,%0\;addq%.w %1,%0", xoperands);
- }
- else if (INTVAL (xoperands[1]) <= 0x7FFF)
- {
- if (TUNE_68040)
- output_asm_insn ("add%.w %1,%0", xoperands);
- else if (MOTOROLA)
- output_asm_insn ("lea (%c1,%0),%0", xoperands);
- else
- output_asm_insn ("lea %0@(%c1),%0", xoperands);
- }
- else
- output_asm_insn ("add%.l %1,%0", xoperands);
- }
- if (FP_REG_P (operands[2]))
- return "fmove%.s %2,%@";
- return "move%.l %2,%@";
+ (match_operand:SI 0 "const_int_operand" "")))
+ (set (match_operand:SF 1 "push_operand" "")
+ (match_operand:SF 2 "general_operand" ""))]
+ "INTVAL (operands[0]) > 4
+ && !reg_mentioned_p (stack_pointer_rtx, operands[2])"
+ [(set (reg:SI SP_REG) (plus:SI (reg:SI SP_REG) (match_dup 0)))
+ (set (match_dup 1) (match_dup 2))]
+{
+ operands[0] = GEN_INT (INTVAL (operands[0]) - 4);
+ operands[1] = replace_equiv_address (operands[1], stack_pointer_rtx);
})

;; Speed up stack adjust followed by a fullword fixedpoint push.
+;; Constant operands need special care, as replacing a "pea X.w" with
+;; "move.l #X,(%sp)" is often not a win.

-(define_peephole
+;; Already done by the previous csa pass, left as reference.
+(define_peephole2
+ [(set (reg:SI SP_REG) (plus:SI (reg:SI SP_REG) (const_int 4)))
+ (set (match_operand:SI 0 "push_operand" "")
+ (match_operand:SI 1 "general_operand" ""))]
+ "!reg_mentioned_p (stack_pointer_rtx, operands[1])"
+ [(set (match_dup 0) (match_dup 1))]
+ "operands[0] = replace_equiv_address (operands[0], stack_pointer_rtx);")
+
+;; Try to use moveq, after stack push has been changed into a simple move.
+(define_peephole2
+ [(match_scratch:SI 2 "d")
+ (set (match_operand:SI 0 "memory_operand" "")
+ (match_operand:SI 1 "const_int_operand" ""))]
+ "GET_CODE (XEXP (operands[0], 0)) != PRE_DEC
+ && INTVAL (operands[1]) != 0
+ && IN_RANGE (INTVAL (operands[1]), -0x80, 0x7f)
+ && !valid_mov3q_const (INTVAL (operands[1]))"
+ [(set (match_dup 2) (match_dup 1))
+ (set (match_dup 0) (match_dup 2))])
+
+;; This sequence adds an instruction, but is two bytes shorter.
+(define_peephole2
+ [(match_scratch:SI 2 "d")
+ (set (reg:SI SP_REG) (plus:SI (reg:SI SP_REG) (const_int 12)))
+ (set (match_operand:SI 0 "push_operand" "")
+ (match_operand:SI 1 "const_int_operand" ""))]
+ "INTVAL (operands[1]) != 0
+ && IN_RANGE (INTVAL (operands[1]), -0x80, 0x7f)
+ && !valid_mov3q_const (INTVAL (operands[1]))"
+ [(set (reg:SI SP_REG) (plus:SI (reg:SI SP_REG) (const_int 8)))
+ (set (match_dup 2) (match_dup 1))
+ (set (match_dup 0) (match_dup 2))]
+ "operands[0] = replace_equiv_address (operands[0], stack_pointer_rtx);")
+
+;; Changing pea X.w into a move.l is no real win here.
+(define_peephole2
[(set (reg:SI SP_REG) (plus:SI (reg:SI SP_REG)
- (match_operand:SI 0 "const_int_operand" "n")))
- (set (match_operand:SI 1 "push_operand" "=m")
- (match_operand:SI 2 "general_operand" "g"))]
- "INTVAL (operands[0]) >= 4
- && ! reg_mentioned_p (stack_pointer_rtx, operands[2])"
-{
- if (INTVAL (operands[0]) > 4)
- {
- rtx xoperands[2];
- xoperands[0] = stack_pointer_rtx;
- xoperands[1] = GEN_INT (INTVAL (operands[0]) - 4);
- if (INTVAL (xoperands[1]) <= 8)
- {
- if (!TARGET_COLDFIRE)
- output_asm_insn ("addq%.w %1,%0", xoperands);
- else
- output_asm_insn ("addq%.l %1,%0", xoperands);
- }
- else if (TUNE_CPU32 && INTVAL (xoperands[1]) <= 16)
- {
- xoperands[1] = GEN_INT (INTVAL (xoperands[1]) - 8);
- output_asm_insn ("addq%.w #8,%0\;addq%.w %1,%0", xoperands);
- }
- else if (INTVAL (xoperands[1]) <= 0x7FFF)
- {
- if (TUNE_68040)
- output_asm_insn ("add%.w %1,%0", xoperands);
- else if (MOTOROLA)
- output_asm_insn ("lea (%c1,%0),%0", xoperands);
- else
- output_asm_insn ("lea %0@(%c1),%0", xoperands);
- }
- else
- output_asm_insn ("add%.l %1,%0", xoperands);
- }
- if (operands[2] == const0_rtx)
- return "clr%.l %@";
- return "move%.l %2,%@";
+ (match_operand:SI 0 "const_int_operand" "")))
+ (set (match_operand:SI 1 "push_operand" "")
+ (match_operand:SI 2 "general_operand" ""))]
+ "INTVAL (operands[0]) > 4
+ && !reg_mentioned_p (stack_pointer_rtx, operands[2])
+ && !(CONST_INT_P (operands[2]) && INTVAL (operands[2]) != 0
+ && IN_RANGE (INTVAL (operands[2]), -0x8000, 0x7fff)
+ && !valid_mov3q_const (INTVAL (operands[2])))"
+ [(set (reg:SI SP_REG) (plus:SI (reg:SI SP_REG) (match_dup 0)))
+ (set (match_dup 1) (match_dup 2))]
+{
+ operands[0] = GEN_INT (INTVAL (operands[0]) - 4);
+ operands[1] = replace_equiv_address (operands[1], stack_pointer_rtx);
+})
+
+;; Speed up pushing a single byte/two bytes but leaving four bytes of space
+;; (which differs slightly between m680x0 and ColdFire).
+
+(define_peephole2
+ [(set (reg:SI SP_REG) (plus:SI (reg:SI SP_REG) (const_int -4)))
+ (set (match_operand:QI 0 "memory_operand" "")
+ (match_operand:QI 1 "register_operand" ""))]
+ "!reg_mentioned_p (stack_pointer_rtx, operands[1])
+ && GET_CODE (XEXP (operands[0], 0)) == PLUS
+ && rtx_equal_p (XEXP (XEXP (operands[0], 0), 0), stack_pointer_rtx)
+ && CONST_INT_P (XEXP (XEXP (operands[0], 0), 1))
+ && INTVAL (XEXP (XEXP (operands[0], 0), 1)) == 3"
+ [(set (match_dup 0) (match_dup 1))]
+{
+ rtx addr = gen_rtx_PRE_DEC (Pmode, stack_pointer_rtx);
+ operands[0] = adjust_automodify_address (operands[0], SImode, addr, -3);
+ operands[1] = simplify_gen_subreg (SImode, operands[1], QImode, 0);
+})
+
+(define_peephole2
+ [(set (match_operand:QI 0 "push_operand" "")
+ (match_operand:QI 1 "register_operand" ""))
+ (set (reg:SI SP_REG) (plus:SI (reg:SI SP_REG) (const_int -3)))]
+ "!reg_mentioned_p (stack_pointer_rtx, operands[1])"
+ [(set (match_dup 0) (match_dup 1))]
+{
+ operands[0] = adjust_automodify_address (operands[0], SImode,
+ XEXP (operands[0], 0), -3);
+ operands[1] = simplify_gen_subreg (SImode, operands[1], QImode, 0);
+})
+
+(define_peephole2
+ [(set (match_operand:HI 0 "push_operand" "")
+ (match_operand:HI 1 "register_operand" ""))
+ (set (reg:SI SP_REG) (plus:SI (reg:SI SP_REG) (const_int -2)))]
+ "!reg_mentioned_p (stack_pointer_rtx, operands[1])"
+ [(set (match_dup 0) (match_dup 1))]
+{
+ operands[0] = adjust_automodify_address (operands[0], SImode,
+ XEXP (operands[0], 0), -2);
+ operands[1] = simplify_gen_subreg (SImode, operands[1], HImode, 0);
})

-;; Speed up pushing a single byte but leaving four bytes of space.
-
-(define_peephole
- [(set (mem:QI (pre_dec:SI (reg:SI SP_REG)))
- (match_operand:QI 1 "general_operand" "dami"))
- (set (reg:SI SP_REG) (minus:SI (reg:SI SP_REG) (const_int 2)))]
- "! reg_mentioned_p (stack_pointer_rtx, operands[1])"
-{
- rtx xoperands[4];
-
- if (GET_CODE (operands[1]) == REG)
- return "move%.l %1,%-";
+;; Optimize a series of strict_low_part assignments

- xoperands[1] = operands[1];
- xoperands[2]
- = gen_rtx_MEM (QImode, plus_constant (stack_pointer_rtx, 3));
- xoperands[3] = stack_pointer_rtx;
- if (!TARGET_COLDFIRE)
- output_asm_insn ("subq%.w #4,%3\;move%.b %1,%2", xoperands);
- else
- output_asm_insn ("subq%.l #4,%3\;move%.b %1,%2", xoperands);
- return "";
-})
+(define_peephole2
+ [(set (match_operand:SI 0 "register_operand" "")
+ (const_int 0))
+ (set (strict_low_part (match_operand:HI 1 "register_operand" ""))
+ (match_operand:HI 2 "general_operand" ""))]
+ "REGNO (operands[0]) == REGNO (operands[1])
+ && strict_low_part_peephole_ok (HImode, insn, operands[0])"
+ [(set (strict_low_part (match_dup 1)) (match_dup 2))]
+ "")

-(define_peephole
- [(set (match_operand:SI 0 "register_operand" "=d")
+(define_peephole2
+ [(set (match_operand:SI 0 "register_operand" "")
(const_int 0))
- (set (strict_low_part (subreg:HI (match_dup 0) 2))
- (match_operand:HI 1 "general_operand" "rmn"))]
- "strict_low_part_peephole_ok (HImode, prev_nonnote_insn (insn), operands[0])"
-{
- if (GET_CODE (operands[1]) == CONST_INT)
- {
- if (operands[1] == const0_rtx
- && (DATA_REG_P (operands[0])
- || GET_CODE (operands[0]) == MEM)
- /* clr insns on 68000 read before writing. */
- && ((TARGET_68010 || TARGET_COLDFIRE)
- || !(GET_CODE (operands[0]) == MEM
- && MEM_VOLATILE_P (operands[0]))))
- return "clr%.w %0";
- }
- return "move%.w %1,%0";
-})
+ (set (strict_low_part (match_operand:QI 1 "register_operand" ""))
+ (match_operand:QI 2 "general_operand" ""))]
+ "REGNO (operands[0]) == REGNO (operands[1])
+ && strict_low_part_peephole_ok (QImode, insn, operands[0])"
+ [(set (strict_low_part (match_dup 1)) (match_dup 2))]
+ "")

;; dbCC peepholes
;;
Index: egcs/gcc/config/m68k/m68k.c
===================================================================
--- egcs.orig/gcc/config/m68k/m68k.c
+++ egcs/gcc/config/m68k/m68k.c
@@ -2485,7 +2485,7 @@ force_mode (enum machine_mode mode, rtx
static int
fp_reg_operand (rtx op, enum machine_mode mode ATTRIBUTE_UNUSED)
{
- return reg_renumber && FP_REG_P (op);
+ return FP_REG_P (op);
}

/* Emit insns to move operands[1] into operands[0].
@@ -3439,14 +3439,18 @@ bool
strict_low_part_peephole_ok (enum machine_mode mode, rtx first_insn,
rtx target)
{
- rtx p;
+ rtx p = first_insn;

- p = prev_nonnote_insn (first_insn);
-
- while (p)
+ while ((p = PREV_INSN (p)))
{
+ if (NOTE_INSN_BASIC_BLOCK_P (p))
+ return false;
+
+ if (NOTE_P (p))
+ continue;
+
/* If it isn't an insn, then give up. */
- if (GET_CODE (p) != INSN)
+ if (!INSN_P (p))
return false;

if (reg_set_p (target, p))
@@ -3476,8 +3480,6 @@ strict_low_part_peephole_ok (enum machin
else
return false;
}
-
- p = prev_nonnote_insn (p);
}

return false;
Jeffrey Law
2007-02-06 17:01:11 UTC
Permalink
Post by Richard Sandiford
It's your call, but seeing as the patch I posted has been approved,
and is only waiting on an unapproved dependent patch, it might be
better to hold off this discussion until either (a) that patch gets
reviewed, or (b) the approval of the patch I posted is reversed.
(It's only a suggestion -- doing patches against mainline is obviously
the right thing to do, and I'm not trying to imply otherwise.
Also, if there's something you object to in that patch -- the UNSPEC? --
then I can try to fix it.)
Which pending patch is your peephole conversion patch dependent upon?

jeff
Richard Sandiford
2007-02-06 17:18:06 UTC
Permalink
Post by Jeffrey Law
Post by Richard Sandiford
It's your call, but seeing as the patch I posted has been approved,
and is only waiting on an unapproved dependent patch, it might be
better to hold off this discussion until either (a) that patch gets
reviewed, or (b) the approval of the patch I posted is reversed.
(It's only a suggestion -- doing patches against mainline is obviously
the right thing to do, and I'm not trying to imply otherwise.
Also, if there's something you object to in that patch -- the UNSPEC? --
then I can try to fix it.)
Which pending patch is your peephole conversion patch dependent upon?
Well, it was just a conflict: it only applies cleanly on top of the
rtl prologue/epilogue patch, which in turn depends on:

[ColdFire 27/63] Addressing mode changes and fixes
http://gcc.gnu.org/ml/gcc-patches/2007-01/msg00784.html

which in turn depends on:

[ColdFire 25/63] Define MODE_INDEX_REG_CLASS for m68k
http://gcc.gnu.org/ml/gcc-patches/2007-01/msg00782.html

Richard
Jeffrey Law
2007-02-06 17:32:54 UTC
Permalink
Post by Richard Sandiford
Post by Jeffrey Law
Post by Richard Sandiford
It's your call, but seeing as the patch I posted has been approved,
and is only waiting on an unapproved dependent patch, it might be
better to hold off this discussion until either (a) that patch gets
reviewed, or (b) the approval of the patch I posted is reversed.
(It's only a suggestion -- doing patches against mainline is obviously
the right thing to do, and I'm not trying to imply otherwise.
Also, if there's something you object to in that patch -- the UNSPEC? --
then I can try to fix it.)
Which pending patch is your peephole conversion patch dependent upon?
Well, it was just a conflict: it only applies cleanly on top of the
[ColdFire 27/63] Addressing mode changes and fixes
http://gcc.gnu.org/ml/gcc-patches/2007-01/msg00784.html
[ColdFire 25/63] Define MODE_INDEX_REG_CLASS for m68k
http://gcc.gnu.org/ml/gcc-patches/2007-01/msg00782.html
Ah. I'm still not comfortable with the MODE_INDEX_REG_CLASS
patch. Is there any way to break out the peephole patch
separately so that we can then go forward with Roman's
peephole patches?

jeff
Richard Sandiford
2007-02-06 17:39:17 UTC
Permalink
Post by Jeffrey Law
Ah. I'm still not comfortable with the MODE_INDEX_REG_CLASS
patch.
OK. Just so we're on the same page, what's the blocking issue here?
I've tried to answer the PA question.
Post by Jeffrey Law
Is there any way to break out the peephole patch separately so that we
can then go forward with Roman's peephole patches?
Sure, but Roman's patch is independent of mine. Rather than remove
the conflict in the patch I posted, and retest, then update Roman's
patch to apply on top of that, it would probably be easiest just
to go with Roman's patch as-is.

Richard
Jeffrey Law
2007-02-06 17:59:58 UTC
Permalink
Post by Richard Sandiford
Post by Jeffrey Law
Ah. I'm still not comfortable with the MODE_INDEX_REG_CLASS
patch.
OK. Just so we're on the same page, what's the blocking issue here?
I've tried to answer the PA question.
I haven't had a chance to look at your response yet. I've been on the
road some over the last couple of weeks and my GCC time is very limited
these days.
Post by Richard Sandiford
Post by Jeffrey Law
Is there any way to break out the peephole patch separately so that we
can then go forward with Roman's peephole patches?
Sure, but Roman's patch is independent of mine. Rather than remove
the conflict in the patch I posted, and retest, then update Roman's
patch to apply on top of that, it would probably be easiest just
to go with Roman's patch as-is.
OK. Whichever way y'all feel is best to proceed.

jeff
z***@linux-m68k.org
2007-01-30 11:26:20 UTC
Permalink
Hi,

Originally this patch fixed one of the reload problems, but in the
meantime a slightly different patch went in, so this patch mainly cleans
up the 64-bit shift handling.
It fixes many predicates to avoid unnecessary reloads; the expander
definitions now only accept register arguments and leave generating
memory arguments (plus a possible scratch register) to combine.
It also splits many constant shift operations into separate rtl
instructions where possible.
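For readers less familiar with the port, here is a minimal C sketch of the
transformation those splits implement for constant 64-bit left shifts on a
32-bit register pair. It is purely illustrative (the struct and function
names are invented) and is not part of the patch:

/* Purely illustrative sketch (not part of the patch): the effect of the
   new constant-shift splits for ashldi3, expressed in C on a big-endian
   32-bit register pair.  Assumes 1 <= n <= 63, the only constants the
   patterns accept.  */

#include <stdint.h>

struct di_halves { uint32_t hi, lo; };

static struct di_halves
ashldi_by_const (struct di_halves x, unsigned int n)
{
  if (n >= 32)
    {
      /* Shifts by 32..63: the low word moves into the high word
	 (shifted by n - 32) and the low word becomes zero.  */
      x.hi = x.lo << (n - 32);
      x.lo = 0;
    }
  else
    {
      /* Shifts by 1..31: an ordinary double-word shift.  */
      x.hi = (x.hi << n) | (x.lo >> (32 - n));
      x.lo <<= n;
    }
  return x;
}

The 8-, 16-, and 48-bit splits in the patch reach the same result with
rotates, swaps, and strict_low_part stores rather than long shifts.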


2007-01-30 Roman Zippel <***@linux-m68k.org>

* config/m68k/m68k.c (split_di): New.
* config/m68k/m68k-protos.h: Declare split_di.
* config/m68k/m68k.md (ashldi3*,ashrdi3*,lshrdi3*):
Improve predicate handling and split constant shifts.

---
gcc/config/m68k/m68k-protos.h | 3
gcc/config/m68k/m68k.c | 26 +
gcc/config/m68k/m68k.md | 611 ++++++++++++++++++++++++++++++------------
3 files changed, 476 insertions(+), 164 deletions(-)

Index: gcc/gcc/config/m68k/m68k-protos.h
===================================================================
--- gcc.orig/gcc/config/m68k/m68k-protos.h 2006-12-18 01:04:13.000000000 +0100
+++ gcc/gcc/config/m68k/m68k-protos.h 2006-12-26 01:57:15.000000000 +0100
@@ -22,5 +22,8 @@ Boston, MA 02110-1301, USA. */

#ifdef RTX_CODE
extern HOST_WIDE_INT m68k_initial_elimination_offset (int from, int to);
+
+extern void split_di (rtx[], int, rtx[], rtx[]);
+
extern bool valid_mov3q_const (HOST_WIDE_INT);
extern const char *output_move_simode_const (rtx *);
Index: gcc/gcc/config/m68k/m68k.c
===================================================================
--- gcc.orig/gcc/config/m68k/m68k.c 2006-12-25 17:18:21.000000000 +0100
+++ gcc/gcc/config/m68k/m68k.c 2006-12-26 18:31:12.000000000 +0100
@@ -2454,6 +2454,32 @@ emit_move_sequence (rtx *operands, enum
return 0;
}

+void
+split_di (rtx operands[], int num, rtx lo_half[], rtx hi_half[])
+{
+ while (num--)
+ {
+ rtx op = operands[num];
+
+ /* simplify_subreg refuse to split volatile memory addresses,
+ but we still have to handle it. */
+ if (GET_CODE (op) == MEM)
+ {
+ lo_half[num] = adjust_address (op, SImode, 4);
+ hi_half[num] = adjust_address (op, SImode, 0);
+ }
+ else
+ {
+ lo_half[num] = simplify_gen_subreg (SImode, op,
+ GET_MODE (op) == VOIDmode
+ ? DImode : GET_MODE (op), 4);
+ hi_half[num] = simplify_gen_subreg (SImode, op,
+ GET_MODE (op) == VOIDmode
+ ? DImode : GET_MODE (op), 0);
+ }
+ }
+}
+
/* Return a REG that occurs in ADDR with coefficient 1.
ADDR can be effectively incremented by incrementing REG. */

Index: gcc/gcc/config/m68k/m68k.md
===================================================================
--- gcc.orig/gcc/config/m68k/m68k.md 2006-12-26 01:43:42.000000000 +0100
+++ gcc/gcc/config/m68k/m68k.md 2006-12-26 18:31:13.000000000 +0100
@@ -1496,17 +1496,31 @@
})

(define_insn "extendsidi2"
- [(set (match_operand:DI 0 "nonimmediate_operand" "=d")
- (sign_extend:DI
- (match_operand:SI 1 "general_operand" "rm")))]
+ [(set (match_operand:DI 0 "register_operand" "=d")
+ (sign_extend:DI (match_operand:SI 1 "nonimmediate_src_operand" "rm")))]
""
{
CC_STATUS_INIT;
- operands[2] = gen_rtx_REG (SImode, REGNO (operands[0]) + 1);
if (TARGET_68020 || TARGET_COLDFIRE)
- return "move%.l %1,%2\;smi %0\;extb%.l %0";
+ return "move%.l %1,%R0\;smi %0\;extb%.l %0";
else
- return "move%.l %1,%2\;smi %0\;ext%.w %0\;ext%.l %0";
+ return "move%.l %1,%R0\;smi %0\;ext%.w %0\;ext%.l %0";
+})
+
+(define_insn "*extendsidi2_mem"
+ [(set (match_operand:DI 0 "memory_operand" "=o,<")
+ (sign_extend:DI (match_operand:SI 1 "nonimmediate_src_operand" "rm,rm")))
+ (clobber (match_scratch:SI 2 "=d,d"))]
+ ""
+{
+ CC_STATUS_INIT;
+ operands[3] = adjust_address (operands[0], SImode,
+ which_alternative == 0 ? 4 : 0);
+ operands[0] = adjust_address (operands[0], SImode, 0);
+ if (TARGET_68020 || TARGET_COLDFIRE)
+ return "move%.l %1,%3\;smi %2\;extb%.l %2\;move%.l %2,%0";
+ else
+ return "move%.l %1,%3\;smi %2\;ext%.w %2\;ext%.l %2\;move%.l %2,%0";
})

;; Special case when one can avoid register clobbering, copy and test
@@ -4095,69 +4109,168 @@
return "move%.w %1,%0\;sub%.l %R0,%R0";
})

-(define_insn "ashldi_const32"
- [(set (match_operand:DI 0 "nonimmediate_operand" "=rm")
- (ashift:DI (match_operand:DI 1 "general_operand" "ro")
- (const_int 32)))]
- ""
+(define_insn "*ashldi3_const1"
+ [(set (match_operand:DI 0 "register_operand" "=d")
+ (ashift:DI (match_operand:DI 1 "register_operand" "0")
+ (const_int 1)))]
+ "!TARGET_COLDFIRE"
+ "add%.l %R0,%R0\;addx%.l %0,%0")
+
+(define_split
+ [(set (match_operand:DI 0 "register_operand" "")
+ (ashift:DI (match_operand:DI 1 "register_operand" "")
+ (const_int 2)))]
+ "reload_completed && !TARGET_COLDFIRE"
+ [(set (match_dup 0)
+ (ashift:DI (match_dup 1) (const_int 1)))
+ (set (match_dup 0)
+ (ashift:DI (match_dup 0) (const_int 1)))]
+ "")
+
+(define_split
+ [(set (match_operand:DI 0 "register_operand" "")
+ (ashift:DI (match_operand:DI 1 "register_operand" "")
+ (const_int 3)))]
+ "reload_completed && !TARGET_COLDFIRE"
+ [(set (match_dup 0)
+ (ashift:DI (match_dup 1) (const_int 2)))
+ (set (match_dup 0)
+ (ashift:DI (match_dup 0) (const_int 1)))]
+ "")
+
+(define_split
+ [(set (match_operand:DI 0 "register_operand" "")
+ (ashift:DI (match_operand:DI 1 "register_operand" "")
+ (const_int 8)))]
+ "reload_completed && !TARGET_COLDFIRE"
+ [(set (match_dup 2)
+ (rotate:SI (match_dup 2) (const_int 8)))
+ (set (match_dup 3)
+ (rotate:SI (match_dup 3) (const_int 8)))
+ (set (strict_low_part (subreg:QI (match_dup 0) 3))
+ (subreg:QI (match_dup 0) 7))
+ (set (strict_low_part (subreg:QI (match_dup 0) 7))
+ (const_int 0))]
{
- CC_STATUS_INIT;
- if (GET_CODE (operands[1]) == REG)
- operands[3] = gen_rtx_REG (SImode, REGNO (operands[1]) + 1);
- else
- operands[3] = adjust_address (operands[1], SImode, 4);
- if (GET_CODE (operands[0]) == REG)
- operands[2] = gen_rtx_REG (SImode, REGNO (operands[0]) + 1);
- else if (GET_CODE (XEXP (operands[0], 0)) == PRE_DEC)
- return "clr%.l %0\;move%.l %3,%0";
- else if (GET_CODE (XEXP (operands[0], 0)) == POST_INC)
- return "move%.l %3,%0\;clr%.l %0";
- else
- operands[2] = adjust_address (operands[0], SImode, 4);
- if (ADDRESS_REG_P (operands[2]))
- return "move%.l %3,%0\;sub%.l %2,%2";
- else
- return "move%.l %3,%0\;clr%.l %2";
+ operands[2] = gen_highpart (SImode, operands[0]);
+ operands[3] = gen_lowpart (SImode, operands[0]);
})

-;; The predicate below must be general_operand, because ashldi3 allows that
-(define_insn "ashldi_const"
- [(set (match_operand:DI 0 "nonimmediate_operand" "=d")
- (ashift:DI (match_operand:DI 1 "general_operand" "0")
- (match_operand 2 "const_int_operand" "n")))]
- "(!TARGET_COLDFIRE
- && ((INTVAL (operands[2]) >= 1 && INTVAL (operands[2]) <= 3)
- || INTVAL (operands[2]) == 8 || INTVAL (operands[2]) == 16
- || (INTVAL (operands[2]) > 32 && INTVAL (operands[2]) <= 63)))"
+(define_split
+ [(set (match_operand:DI 0 "register_operand" "")
+ (ashift:DI (match_operand:DI 1 "register_operand" "")
+ (const_int 16)))]
+ "reload_completed && !TARGET_COLDFIRE"
+ [(set (match_dup 2)
+ (rotate:SI (match_dup 2) (const_int 16)))
+ (set (match_dup 3)
+ (rotate:SI (match_dup 3) (const_int 16)))
+ (set (strict_low_part (subreg:HI (match_dup 0) 2))
+ (subreg:HI (match_dup 0) 6))
+ (set (strict_low_part (subreg:HI (match_dup 0) 6))
+ (const_int 0))]
{
- operands[1] = gen_rtx_REG (SImode, REGNO (operands[0]) + 1);
- if (INTVAL (operands[2]) == 1)
- return "add%.l %1,%1\;addx%.l %0,%0";
- else if (INTVAL (operands[2]) == 8)
- return "rol%.l #8,%1\;rol%.l #8,%0\;move%.b %1,%0\;clr%.b %1";
- else if (INTVAL (operands[2]) == 16)
- return "swap %1\;swap %0\;move%.w %1,%0\;clr%.w %1";
- else if (INTVAL (operands[2]) == 48)
- return "mov%.l %1,%0\;swap %0\;clr%.l %1\;clr%.w %0";
- else if (INTVAL (operands[2]) == 2)
- return "add%.l %1,%1\;addx%.l %0,%0\;add%.l %1,%1\;addx%.l %0,%0";
- else if (INTVAL (operands[2]) == 3)
- return "add%.l %1,%1\;addx%.l %0,%0\;add%.l %1,%1\;addx%.l %0,%0\;add%.l %1,%1\;addx%.l %0,%0";
- else /* 32 < INTVAL (operands[2]) <= 63 */
- {
- operands[2] = GEN_INT (INTVAL (operands[2]) - 32);
- output_asm_insn (INTVAL (operands[2]) <= 8 ? "asl%.l %2,%1" :
- "moveq %2,%0\;asl%.l %0,%1", operands);
- return "mov%.l %1,%0\;moveq #0,%1";
- }
+ operands[2] = gen_highpart (SImode, operands[0]);
+ operands[3] = gen_lowpart (SImode, operands[0]);
+})
+
+(define_split
+ [(set (match_operand:DI 0 "pre_dec_operand" "")
+ (ashift:DI (match_operand:DI 1 "nonimmediate_operand" "")
+ (const_int 32)))]
+ "reload_completed"
+ [(set (match_dup 0) (const_int 0))
+ (set (match_dup 0) (match_dup 1))]
+{
+ operands[0] = adjust_address(operands[0], SImode, 0);
+ operands[1] = gen_lowpart(SImode, operands[1]);
+})
+
+(define_split
+ [(set (match_operand:DI 0 "post_inc_operand" "")
+ (ashift:DI (match_operand:DI 1 "nonimmediate_operand" "")
+ (const_int 32)))]
+ "reload_completed"
+ [(set (match_dup 0) (match_dup 1))
+ (set (match_dup 0) (const_int 0))]
+{
+ operands[0] = adjust_address(operands[0], SImode, 0);
+ operands[1] = gen_lowpart(SImode, operands[1]);
+})
+
+(define_insn_and_split "*ashldi3_const32"
+ [(set (match_operand:DI 0 "nonimmediate_operand" "=ro<>")
+ (ashift:DI (match_operand:DI 1 "nonimmediate_operand" "ro")
+ (const_int 32)))]
+ ""
+ "#"
+ "&& reload_completed"
+ [(set (match_dup 4) (match_dup 3))
+ (set (match_dup 2) (const_int 0))]
+ "split_di(operands, 2, operands + 2, operands + 4);")
+
+(define_split
+ [(set (match_operand:DI 0 "register_operand" "")
+ (ashift:DI (match_operand:DI 1 "register_operand" "")
+ (match_operand 2 "const_int_operand" "")))]
+ "reload_completed && !TARGET_COLDFIRE
+ && INTVAL (operands[2]) > 32 && INTVAL (operands[2]) <= 40"
+ [(set (match_dup 4) (ashift:SI (match_dup 4) (match_dup 2)))
+ (set (match_dup 3) (match_dup 4))
+ (set (match_dup 4) (const_int 0))]
+{
+ operands[2] = GEN_INT (INTVAL (operands[2]) - 32);
+ operands[3] = gen_highpart (SImode, operands[0]);
+ operands[4] = gen_lowpart (SImode, operands[0]);
+})
+
+(define_split
+ [(set (match_operand:DI 0 "register_operand" "")
+ (ashift:DI (match_operand:DI 1 "register_operand" "")
+ (const_int 48)))]
+ "reload_completed && !TARGET_COLDFIRE"
+ [(set (match_dup 2) (match_dup 3))
+ (set (match_dup 2)
+ (rotate:SI (match_dup 2) (const_int 16)))
+ (set (match_dup 3) (const_int 0))
+ (set (strict_low_part (subreg:HI (match_dup 0) 2))
+ (const_int 0))]
+{
+ operands[2] = gen_highpart (SImode, operands[0]);
+ operands[3] = gen_lowpart (SImode, operands[0]);
+})
+
+(define_split
+ [(set (match_operand:DI 0 "register_operand" "")
+ (ashift:DI (match_operand:DI 1 "register_operand" "")
+ (match_operand 2 "const_int_operand" "")))]
+ "reload_completed && !TARGET_COLDFIRE
+ && INTVAL (operands[2]) > 40 && INTVAL (operands[2]) <= 63"
+ [(set (match_dup 3) (match_dup 2))
+ (set (match_dup 4) (ashift:SI (match_dup 4) (match_dup 3)))
+ (set (match_dup 3) (match_dup 4))
+ (set (match_dup 4) (const_int 0))]
+{
+ operands[2] = GEN_INT (INTVAL (operands[2]) - 32);
+ operands[3] = gen_highpart (SImode, operands[0]);
+ operands[4] = gen_lowpart (SImode, operands[0]);
})

+(define_insn "*ashldi3"
+ [(set (match_operand:DI 0 "register_operand" "=d")
+ (ashift:DI (match_operand:DI 1 "register_operand" "0")
+ (match_operand 2 "const_int_operand" "n")))]
+ "!TARGET_COLDFIRE
+ && ((INTVAL (operands[2]) >= 1 && INTVAL (operands[2]) <= 3)
+ || INTVAL (operands[2]) == 8 || INTVAL (operands[2]) == 16
+ || (INTVAL (operands[2]) > 32 && INTVAL (operands[2]) <= 63))"
+ "#")
+
(define_expand "ashldi3"
- [(set (match_operand:DI 0 "nonimmediate_operand" "")
- (ashift:DI (match_operand:DI 1 "general_operand" "")
- (match_operand 2 "const_int_operand" "")))]
+ [(set (match_operand:DI 0 "register_operand" "")
+ (ashift:DI (match_operand:DI 1 "register_operand" "")
+ (match_operand 2 "const_int_operand" "")))]
"!TARGET_COLDFIRE"
- "
{
/* ??? This is a named pattern like this is not allowed to FAIL based
on its operands. */
@@ -4166,7 +4279,7 @@
&& INTVAL (operands[2]) != 8 && INTVAL (operands[2]) != 16
&& (INTVAL (operands[2]) < 32 || INTVAL (operands[2]) > 63)))
FAIL;
-} ")
+})

;; On most 68k models, this makes faster code in a special case.

@@ -4282,66 +4395,131 @@
return "move%.l %1,%0";
})

-(define_insn "ashrdi_const32"
- [(set (match_operand:DI 0 "nonimmediate_operand" "=d,o,<")
- (ashiftrt:DI (match_operand:DI 1 "general_operand" "ro,ro,ro")
+(define_insn "*ashrdi3_const1"
+ [(set (match_operand:DI 0 "register_operand" "=d")
+ (ashiftrt:DI (match_operand:DI 1 "register_operand" "0")
+ (const_int 1)))]
+ "!TARGET_COLDFIRE"
+{
+ operands[1] = gen_rtx_REG (SImode, REGNO (operands[0]) + 1);
+ return "asr%.l #1,%0\;roxr%.l #1,%1";
+})
+
+(define_split
+ [(set (match_operand:DI 0 "register_operand" "")
+ (ashiftrt:DI (match_operand:DI 1 "register_operand" "")
+ (const_int 2)))]
+ "reload_completed && !TARGET_COLDFIRE"
+ [(set (match_dup 0)
+ (ashiftrt:DI (match_dup 1) (const_int 1)))
+ (set (match_dup 0)
+ (ashiftrt:DI (match_dup 0) (const_int 1)))]
+ "")
+
+(define_split
+ [(set (match_operand:DI 0 "register_operand" "")
+ (ashiftrt:DI (match_operand:DI 1 "register_operand" "")
+ (const_int 3)))]
+ "reload_completed && !TARGET_COLDFIRE"
+ [(set (match_dup 0)
+ (ashiftrt:DI (match_dup 1) (const_int 2)))
+ (set (match_dup 0)
+ (ashiftrt:DI (match_dup 0) (const_int 1)))]
+ "")
+
+(define_split
+ [(set (match_operand:DI 0 "register_operand" "")
+ (ashiftrt:DI (match_operand:DI 1 "register_operand" "")
+ (const_int 8)))]
+ "reload_completed && !TARGET_COLDFIRE"
+ [(set (strict_low_part (subreg:QI (match_dup 0) 7))
+ (subreg:QI (match_dup 0) 3))
+ (set (match_dup 2)
+ (ashiftrt:SI (match_dup 2) (const_int 8)))
+ (set (match_dup 3)
+ (rotatert:SI (match_dup 3) (const_int 8)))]
+{
+ operands[2] = gen_highpart (SImode, operands[0]);
+ operands[3] = gen_lowpart (SImode, operands[0]);
+})
+
+(define_split
+ [(set (match_operand:DI 0 "register_operand" "")
+ (ashiftrt:DI (match_operand:DI 1 "register_operand" "")
+ (const_int 16)))]
+ "reload_completed && !TARGET_COLDFIRE"
+ [(set (strict_low_part (subreg:HI (match_dup 0) 6))
+ (subreg:HI (match_dup 0) 2))
+ (set (match_dup 2)
+ (rotate:SI (match_dup 2) (const_int 16)))
+ (set (match_dup 3)
+ (rotate:SI (match_dup 3) (const_int 16)))
+ (set (match_dup 2)
+ (sign_extend:SI (subreg:HI (match_dup 2) 2)))]
+{
+ operands[2] = gen_highpart (SImode, operands[0]);
+ operands[3] = gen_lowpart (SImode, operands[0]);
+})
+
+(define_insn "*ashrdi_const32"
+ [(set (match_operand:DI 0 "register_operand" "=d")
+ (ashiftrt:DI (match_operand:DI 1 "nonimmediate_src_operand" "ro")
+ (const_int 32)))]
+ ""
+{
+ CC_STATUS_INIT;
+ if (TARGET_68020)
+ return "move%.l %1,%R0\;smi %0\;extb%.l %0";
+ else
+ return "move%.l %1,%R0\;smi %0\;ext%.w %0\;ext%.l %0";
+})
+
+(define_insn "*ashrdi_const32_mem"
+ [(set (match_operand:DI 0 "memory_operand" "=o,<")
+ (ashiftrt:DI (match_operand:DI 1 "nonimmediate_src_operand" "ro,ro")
(const_int 32)))
- (clobber (match_scratch:SI 2 "=X,d,d"))]
+ (clobber (match_scratch:SI 2 "=d,d"))]
""
{
CC_STATUS_INIT;
- if (which_alternative == 0)
- {
- operands[2] = gen_rtx_REG (SImode, REGNO (operands[0]) + 1);
- if (TARGET_68020)
- return "move%.l %1,%2\;smi %0\;extb%.l %0";
- else
- return "move%.l %1,%2\;smi %0\;ext%.w %0\;ext%.l %0";
- }
+ operands[3] = adjust_address (operands[0], SImode,
+ which_alternative == 0 ? 4 : 0);
+ operands[0] = adjust_address (operands[0], SImode, 0);
+ if (TARGET_68020 || TARGET_COLDFIRE)
+ return "move%.l %1,%3\;smi %2\;extb%.l %2\;move%.l %2,%0";
else
- {
- if (which_alternative == 2)
- operands[3] = operands[0];
- else if (which_alternative == 1)
- operands[3] = adjust_address (operands[0], SImode, 4);
- if (TARGET_68020)
- return "move%.l %1,%3\;smi %2\;extb%.l %2\;move%.l %2,%0";
- else
- return "move%.l %1,%3\;smi %2\;ext%.w %2\;ext%.l %2\;move%.l %2,%0";
- }
+ return "move%.l %1,%3\;smi %2\;ext%.w %2\;ext%.l %2\;move%.l %2,%0";
})

+(define_split
+ [(set (match_operand:DI 0 "register_operand" "")
+ (ashiftrt:DI (match_operand:DI 1 "register_operand" "")
+ (const_int 63)))]
+ "reload_completed && !TARGET_COLDFIRE"
+ [(set (match_dup 3)
+ (ashiftrt:SI (match_dup 3) (const_int 31)))
+ (set (match_dup 2)
+ (match_dup 3))]
+ "split_di(operands, 1, operands + 2, operands + 3);")
+
;; The predicate below must be general_operand, because ashrdi3 allows that
(define_insn "ashrdi_const"
- [(set (match_operand:DI 0 "nonimmediate_operand" "=d")
- (ashiftrt:DI (match_operand:DI 1 "general_operand" "0")
- (match_operand 2 "const_int_operand" "n")))
- (clobber (match_scratch:SI 3 "=X"))]
- "(!TARGET_COLDFIRE
+ [(set (match_operand:DI 0 "register_operand" "=d")
+ (ashiftrt:DI (match_operand:DI 1 "register_operand" "0")
+ (match_operand 2 "const_int_operand" "n")))]
+ "!TARGET_COLDFIRE
&& ((INTVAL (operands[2]) >= 1 && INTVAL (operands[2]) <= 3)
|| INTVAL (operands[2]) == 8 || INTVAL (operands[2]) == 16
|| INTVAL (operands[2]) == 31
- || (INTVAL (operands[2]) > 32 && INTVAL (operands[2]) <= 63)))"
+ || (INTVAL (operands[2]) > 32 && INTVAL (operands[2]) <= 63))"
{
operands[1] = gen_rtx_REG (SImode, REGNO (operands[0]) + 1);
- if (INTVAL (operands[2]) == 63)
- return "add%.l %0,%0\;subx%.l %0,%0\;move%.l %0,%1";
CC_STATUS_INIT;
- if (INTVAL (operands[2]) == 1)
- return "asr%.l #1,%0\;roxr%.l #1,%1";
- else if (INTVAL (operands[2]) == 8)
- return "move%.b %0,%1\;asr%.l #8,%0\;ror%.l #8,%1";
- else if (INTVAL (operands[2]) == 16)
- return "move%.w %0,%1\;swap %0\;ext%.l %0\;swap %1";
- else if (INTVAL (operands[2]) == 48)
+ if (INTVAL (operands[2]) == 48)
return "swap %0\;ext%.l %0\;move%.l %0,%1\;smi %0\;ext%.w %0";
- else if (INTVAL (operands[2]) == 31)
+ if (INTVAL (operands[2]) == 31)
return "add%.l %1,%1\;addx%.l %0,%0\;move%.l %0,%1\;subx%.l %0,%0";
- else if (INTVAL (operands[2]) == 2)
- return "asr%.l #1,%0\;roxr%.l #1,%1\;asr%.l #1,%0\;roxr%.l #1,%1";
- else if (INTVAL (operands[2]) == 3)
- return "asr%.l #1,%0\;roxr%.l #1,%1\;asr%.l #1,%0\;roxr%.l #1,%1\;asr%.l #1,%0\;roxr%.l #1,%1";
- else /* 32 < INTVAL (operands[2]) <= 63 */
+ if (INTVAL (operands[2]) > 32 && INTVAL (operands[2]) <= 63)
{
operands[2] = GEN_INT (INTVAL (operands[2]) - 32);
output_asm_insn (INTVAL (operands[2]) <= 8 ? "asr%.l %2,%0" :
@@ -4350,15 +4528,14 @@
return INTVAL (operands[2]) >= 15 ? "ext%.w %d0" :
TARGET_68020 ? "extb%.l %0" : "ext%.w %0\;ext%.l %0";
}
+ return "#";
})

(define_expand "ashrdi3"
- [(parallel [(set (match_operand:DI 0 "nonimmediate_operand" "")
- (ashiftrt:DI (match_operand:DI 1 "general_operand" "")
- (match_operand 2 "const_int_operand" "")))
- (clobber (match_scratch:SI 3 ""))])]
+ [(set (match_operand:DI 0 "register_operand" "")
+ (ashiftrt:DI (match_operand:DI 1 "register_operand" "")
+ (match_operand 2 "const_int_operand" "")))]
"!TARGET_COLDFIRE"
- "
{
/* ??? This is a named pattern like this is not allowed to FAIL based
on its operands. */
@@ -4367,8 +4544,7 @@
&& INTVAL (operands[2]) != 8 && INTVAL (operands[2]) != 16
&& (INTVAL (operands[2]) < 31 || INTVAL (operands[2]) > 63)))
FAIL;
- operands[3] = gen_rtx_SCRATCH (SImode);
-} ")
+})

;; On all 68k models, this makes faster code in a special case.

@@ -4456,69 +4632,176 @@
return "move%.l %1,%0";
})

-(define_insn "lshrdi_const32"
- [(set (match_operand:DI 0 "nonimmediate_operand" "=ro,<,>")
- (lshiftrt:DI (match_operand:DI 1 "general_operand" "ro,ro,ro")
+(define_insn "*lshrdi3_const1"
+ [(set (match_operand:DI 0 "register_operand" "=d")
+ (lshiftrt:DI (match_operand:DI 1 "register_operand" "0")
+ (const_int 1)))]
+ "!TARGET_COLDFIRE"
+ "lsr%.l #1,%0\;roxr%.l #1,%R0")
+
+(define_split
+ [(set (match_operand:DI 0 "register_operand" "")
+ (lshiftrt:DI (match_operand:DI 1 "register_operand" "")
+ (const_int 2)))]
+ "reload_completed && !TARGET_COLDFIRE"
+ [(set (match_dup 0)
+ (lshiftrt:DI (match_dup 1) (const_int 1)))
+ (set (match_dup 0)
+ (lshiftrt:DI (match_dup 0) (const_int 1)))]
+ "")
+
+(define_split
+ [(set (match_operand:DI 0 "register_operand" "")
+ (lshiftrt:DI (match_operand:DI 1 "register_operand" "")
+ (const_int 3)))]
+ "reload_completed && !TARGET_COLDFIRE"
+ [(set (match_dup 0)
+ (lshiftrt:DI (match_dup 1) (const_int 2)))
+ (set (match_dup 0)
+ (lshiftrt:DI (match_dup 0) (const_int 1)))]
+ "")
+
+(define_split
+ [(set (match_operand:DI 0 "register_operand" "")
+ (lshiftrt:DI (match_operand:DI 1 "register_operand" "")
+ (const_int 8)))]
+ "reload_completed && !TARGET_COLDFIRE"
+ [(set (strict_low_part (subreg:QI (match_dup 0) 7))
+ (subreg:QI (match_dup 0) 3))
+ (set (match_dup 2)
+ (lshiftrt:SI (match_dup 2) (const_int 8)))
+ (set (match_dup 3)
+ (rotatert:SI (match_dup 3) (const_int 8)))]
+{
+ operands[2] = gen_highpart (SImode, operands[0]);
+ operands[3] = gen_lowpart (SImode, operands[0]);
+})
+
+(define_split
+ [(set (match_operand:DI 0 "register_operand" "")
+ (lshiftrt:DI (match_operand:DI 1 "register_operand" "")
+ (const_int 16)))]
+ "reload_completed && !TARGET_COLDFIRE"
+ [(set (strict_low_part (subreg:HI (match_dup 0) 6))
+ (subreg:HI (match_dup 0) 2))
+ (set (strict_low_part (subreg:HI (match_dup 0) 2))
+ (const_int 0))
+ (set (match_dup 3)
+ (rotate:SI (match_dup 3) (const_int 16)))
+ (set (match_dup 2)
+ (rotate:SI (match_dup 2) (const_int 16)))]
+{
+ operands[2] = gen_highpart (SImode, operands[0]);
+ operands[3] = gen_lowpart (SImode, operands[0]);
+})
+
+(define_split
+ [(set (match_operand:DI 0 "pre_dec_operand" "")
+ (lshiftrt:DI (match_operand:DI 1 "nonimmediate_operand" "")
+ (const_int 32)))]
+ "reload_completed"
+ [(set (match_dup 0) (match_dup 1))
+ (set (match_dup 0) (const_int 0))]
+{
+ operands[0] = adjust_address(operands[0], SImode, 0);
+ operands[1] = gen_highpart(SImode, operands[1]);
+})
+
+(define_split
+ [(set (match_operand:DI 0 "post_inc_operand" "")
+ (lshiftrt:DI (match_operand:DI 1 "nonimmediate_operand" "")
+ (const_int 32)))]
+ "reload_completed"
+ [(set (match_dup 0) (const_int 0))
+ (set (match_dup 0) (match_dup 1))]
+{
+ operands[0] = adjust_address(operands[0], SImode, 0);
+ operands[1] = gen_highpart(SImode, operands[1]);
+})
+
+(define_split
+ [(set (match_operand:DI 0 "nonimmediate_operand" "")
+ (lshiftrt:DI (match_operand:DI 1 "nonimmediate_operand" "")
+ (const_int 32)))]
+ "reload_completed"
+ [(set (match_dup 2) (match_dup 5))
+ (set (match_dup 4) (const_int 0))]
+ "split_di(operands, 2, operands + 2, operands + 4);")
+
+(define_insn "*lshrdi_const32"
+ [(set (match_operand:DI 0 "nonimmediate_operand" "=ro<>")
+ (lshiftrt:DI (match_operand:DI 1 "general_operand" "ro")
(const_int 32)))]
""
+ "#")
+
+(define_split
+ [(set (match_operand:DI 0 "register_operand" "")
+ (lshiftrt:DI (match_operand:DI 1 "register_operand" "")
+ (match_operand 2 "const_int_operand" "")))]
+ "reload_completed && !TARGET_COLDFIRE
+ && INTVAL (operands[2]) > 32 && INTVAL (operands[2]) <= 40"
+ [(set (match_dup 3) (lshiftrt:SI (match_dup 3) (match_dup 2)))
+ (set (match_dup 4) (match_dup 3))
+ (set (match_dup 3) (const_int 0))]
+{
+ operands[2] = GEN_INT (INTVAL (operands[2]) - 32);
+ operands[3] = gen_highpart (SImode, operands[0]);
+ operands[4] = gen_lowpart (SImode, operands[0]);
+})
+
+(define_split
+ [(set (match_operand:DI 0 "register_operand" "")
+ (lshiftrt:DI (match_operand:DI 1 "register_operand" "")
+ (const_int 48)))]
+ "reload_completed"
+ [(set (match_dup 3) (match_dup 2))
+ (set (strict_low_part (subreg:HI (match_dup 0) 6))
+ (const_int 0))
+ (set (match_dup 2) (const_int 0))
+ (set (match_dup 3)
+ (rotate:SI (match_dup 3) (const_int 16)))]
{
- CC_STATUS_INIT;
- if (which_alternative == 1)
- return "move%.l %1,%0\;clr%.l %0";
- if (which_alternative == 2)
- return "clr%.l %0\;move%.l %1,%0";
- if (GET_CODE (operands[0]) == REG)
- operands[2] = gen_rtx_REG (SImode, REGNO (operands[0]) + 1);
- else
- operands[2] = adjust_address (operands[0], SImode, 4);
- if (GET_CODE (operands[1]) == REG)
- operands[3] = gen_rtx_REG (SImode, REGNO (operands[1]) + 1);
- else
- operands[3] = adjust_address (operands[1], SImode, 4);
- if (ADDRESS_REG_P (operands[0]))
- return "move%.l %1,%2\;sub%.l %0,%0";
- else
- return "move%.l %1,%2\;clr%.l %0";
+ operands[2] = gen_highpart (SImode, operands[0]);
+ operands[3] = gen_lowpart (SImode, operands[0]);
})

-;; The predicate below must be general_operand, because lshrdi3 allows that
-(define_insn "lshrdi_const"
- [(set (match_operand:DI 0 "nonimmediate_operand" "=d")
- (lshiftrt:DI (match_operand:DI 1 "general_operand" "0")
+(define_split
+ [(set (match_operand:DI 0 "register_operand" "")
+ (lshiftrt:DI (match_operand:DI 1 "register_operand" "")
+ (match_operand 2 "const_int_operand" "")))]
+ "reload_completed && !TARGET_COLDFIRE
+ && INTVAL (operands[2]) > 40 && INTVAL (operands[2]) <= 62"
+ [(set (match_dup 4) (match_dup 2))
+ (set (match_dup 3) (lshiftrt:SI (match_dup 3) (match_dup 4)))
+ (set (match_dup 4) (match_dup 3))
+ (set (match_dup 3) (const_int 0))]
+{
+ operands[2] = GEN_INT (INTVAL (operands[2]) - 32);
+ operands[3] = gen_highpart (SImode, operands[0]);
+ operands[4] = gen_lowpart (SImode, operands[0]);
+})
+
+(define_insn "*lshrdi_const63"
+ [(set (match_operand:DI 0 "register_operand" "=d")
+ (lshiftrt:DI (match_operand:DI 1 "register_operand" "0")
+ (const_int 63)))]
+ ""
+ "add%.l %0,%0\;clr%.l %0\;clr%.l %R1\;addx%.l %R1,%R1")
+
+(define_insn "*lshrdi3_const"
+ [(set (match_operand:DI 0 "register_operand" "=d")
+ (lshiftrt:DI (match_operand:DI 1 "register_operand" "0")
(match_operand 2 "const_int_operand" "n")))]
"(!TARGET_COLDFIRE
- && ((INTVAL (operands[2]) >= 1 && INTVAL (operands[2]) <= 3)
+ && ((INTVAL (operands[2]) >= 2 && INTVAL (operands[2]) <= 3)
|| INTVAL (operands[2]) == 8 || INTVAL (operands[2]) == 16
|| (INTVAL (operands[2]) > 32 && INTVAL (operands[2]) <= 63)))"
-{
- operands[1] = gen_rtx_REG (SImode, REGNO (operands[0]) + 1);
- if (INTVAL (operands[2]) == 63)
- return "add%.l %0,%0\;clr%.l %0\;clr%.l %1\;addx%.l %1,%1";
- CC_STATUS_INIT;
- if (INTVAL (operands[2]) == 1)
- return "lsr%.l #1,%0\;roxr%.l #1,%1";
- else if (INTVAL (operands[2]) == 8)
- return "move%.b %0,%1\;lsr%.l #8,%0\;ror%.l #8,%1";
- else if (INTVAL (operands[2]) == 16)
- return "move%.w %0,%1\;clr%.w %0\;swap %1\;swap %0";
- else if (INTVAL (operands[2]) == 48)
- return "move%.l %0,%1\;clr%.w %1\;clr%.l %0\;swap %1";
- else if (INTVAL (operands[2]) == 2)
- return "lsr%.l #1,%0\;roxr%.l #1,%1\;lsr%.l #1,%0\;roxr%.l #1,%1";
- else if (INTVAL (operands[2]) == 3)
- return "lsr%.l #1,%0\;roxr%.l #1,%1\;lsr%.l #1,%0\;roxr%.l #1,%1\;lsr%.l #1,%0\;roxr%.l #1,%1";
- else /* 32 < INTVAL (operands[2]) <= 63 */
- {
- operands[2] = GEN_INT (INTVAL (operands[2]) - 32);
- output_asm_insn (INTVAL (operands[2]) <= 8 ? "lsr%.l %2,%0" :
- "moveq %2,%1\;lsr%.l %1,%0", operands);
- return "mov%.l %0,%1\;moveq #0,%0";
- }
-})
+ "#")

(define_expand "lshrdi3"
- [(set (match_operand:DI 0 "nonimmediate_operand" "")
- (lshiftrt:DI (match_operand:DI 1 "general_operand" "")
+ [(set (match_operand:DI 0 "register_operand" "")
+ (lshiftrt:DI (match_operand:DI 1 "register_operand" "")
(match_operand 2 "const_int_operand" "")))]
"!TARGET_COLDFIRE"
{

--
Jeffrey Law
2007-02-06 18:27:35 UTC
Permalink
Post by z***@linux-m68k.org
Hi,
Originally this patch fixed one of the reload problems, but in the
meantime a slightly different patch went in, so this patch mainly cleans
up the 64-bit shift handling.
It fixes many predicates to avoid unnecessary reloads; the expander
definitions now only accept register arguments and leave generating
memory arguments (plus a possible scratch register) to combine.
It also splits many constant shift operations into separate rtl
instructions where possible.
* config/m68k/m68k.c (split_di): New.
* config/m68k/m68k-protos.h: Declare split_di.
Improve predicate handling and split constant shifts.
This looks pretty good. I did spot checking on the new splits to
verify that they look reasonable relative to the raw code we used
to generate. I'm going to assume the subreg word numbers are
correct (they looked reasonable, but I didn't look at each and
every one closely).

A few minor comments:

1. It might have been better to split this patch up. Yes,
it's fairly mechanical in the sense that you applied the
same transformation to each of the three 64-bit shift patterns.
Submitting each one separately would have been a little
easier to review.

2. You need a function comment before split_di. Also in
split_di s/refuse/refuses/

3. You modified extendsidi2, but it's not mentioned in the
ChangeLog. Did you mean to include it? (Arguably it
could have been in its own patch).

4. Do you need a CC_STATUS_INIT in your new *ashrdi3_const1?
Similarly for the other shift by one patterns.

5. Are the CC status bits correct when we split something
like a 32bit shift into moves?

Assuming #4 and #5 aren't problems that require changing the
functionality of your patch, consider the patch pre-approved
once you address issues #2 and #3.


jeff
Roman Zippel
2007-02-07 22:04:06 UTC
Permalink
Hi,
Post by Jeffrey Law
2. You need a function comment before split_di. Also in
split_di s/refuse/refuses/
Hmm, I copied that from i386. :)
Post by Jeffrey Law
3. You modified extendsidi2, but it's not mentioned in the
ChangeLog. DId you mean to include it? (Arguably it
could have been in its own patch).
Yes, I'll add it. (It sort of belongs to ashrdi_const32, as it generates the
same code sequence.)
Post by Jeffrey Law
4. Do you need a CC_STATUS_INIT in your new *ashrdi3_const1?
Similarly for the other shift by one patterns.
5. Are the CC status bits correct when we split something
like a 32bit shift into moves?
All CC status bits for any shift operation are always cleared by
notice_update_cc().
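
For readers unfamiliar with the cc0 machinery, a rough sketch of what Roman
means (the helper name is invented; this is not the literal m68k.c code):
when an insn's SET source is a shift, notice_update_cc throws away whatever
it knew about the condition codes via CC_STATUS_INIT.

/* Sketch only -- not the actual m68k.c code.  The port's NOTICE_UPDATE_CC
   hook conceptually does something like this for shift insns.  */
static void
forget_cc_for_shifts (rtx exp)
{
  if (GET_CODE (exp) == SET)
    switch (GET_CODE (SET_SRC (exp)))
      {
      case ASHIFT:
      case ASHIFTRT:
      case LSHIFTRT:
      case ROTATE:
      case ROTATERT:
	CC_STATUS_INIT;		/* cc no longer describes any register.  */
	break;
      default:
	break;
      }
}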

bye, Roman
Jeffrey Law
2007-02-12 17:19:37 UTC
Permalink
Post by Roman Zippel
Hi,
Post by Jeffrey Law
2. You need a function comment before split_di. Also in
split_di s/refuse/refuses/
Hmm, I copied that from i386. :)
Whoever added it to i386.c should have included a comment. When
you cobble a comment together, could you please add it to i386.c
as well? (Consider it pre-approved.)
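
For reference, one possible wording for such a comment, sketched here
(this is not the text that was eventually committed):

/* Split the DImode (or, for constant integers, VOIDmode) operands in
   OPERANDS into their SImode halves.  NUM operands are processed; the
   low halves go into LO_HALF and the high halves into HI_HALF.  MEM
   operands are split with adjust_address, since simplify_subreg
   refuses to split volatile memory references.  */
void split_di (rtx operands[], int num, rtx lo_half[], rtx hi_half[]);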
Post by Roman Zippel
Post by Jeffrey Law
3. You modified extendsidi2, but it's not mentioned in the
ChangeLog. DId you mean to include it? (Arguably it
could have been in its own patch).
Yes, I'll add it. (It sort of belongs to ashrdi_const32, as it generates the
same code sequence.)
OK. Thanks.
Post by Roman Zippel
Post by Jeffrey Law
4. Do you need a CC_STATUS_INIT in your new *ashrdi3_const1?
Similarly for the other shift by one patterns.
5. Are the CC status bits correct when we split something
like a 32bit shift into moves?
All CC status bits for any shift operation are always cleared by
notice_update_cc().
OK.

Based on your replies, I think you can go ahead and install the
patch (after adding the missing comment and ChangeLog entry, of
course).

Thanks,
jeff
Martin Michlmayr
2007-01-30 16:13:41 UTC
Permalink
Here is a series of patches, which are also currently part of m68k
Debian gcc (all but one).
The bugs in the Debian bug tracker have preprocessed source and some
even smaller testcases, but none of your patches add any testcases.
It would be nice if you would add some.
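
A minimal sketch of the kind of testcase being requested, invented here
purely for illustration (it is not taken from the Debian bug reports and
was not part of any of the posted patches):

/* { dg-do compile } */
/* { dg-options "-O2" } */

/* Constant 64-bit shifts that exercise the new m68k shift patterns.  */

unsigned long long
shl33 (unsigned long long x)
{
  return x << 33;
}

long long
asr48 (long long x)
{
  return x >> 48;
}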
--
Martin Michlmayr
***@cyrius.com
Tom Tromey
2007-01-30 19:20:20 UTC
Permalink
Post by z***@linux-m68k.org
- the thread suspend handler has to save all registers
- reenable MPROTECT_VDB, it should work, otherwise it probably was a
kernel bug (and I couldn't find a reference to the original problem,
neither did I notice any problem)
- change STACKBOTTOM to LINUX_STACKBOTTOM so it works with 2.6 kernel
This seems unobjectionable to me, but we prefer that non-urgent GC
patches go in upstream before we check them in to GCC.

Tom
Roman Zippel
2007-01-31 12:41:36 UTC
Permalink
Hi,
Post by Tom Tromey
Post by z***@linux-m68k.org
- the thread suspend handler has to save all registers
- reenable MPROTECT_VDB, it should work, otherwise it probably was a
kernel bug (and I couldn't find a reference to the original problem,
neither did I notice any problem)
- change STACKBOTTOM to LINUX_STACKBOTTOM so it works with 2.6 kernel
This seems unobjectionable to me, but we prefer that non-urgent GC
Hmm, two of these changes are needed to get it working properly for
linux-m68k.
Post by Tom Tromey
patches go in upstream before we check them in to GCC.
Sure, but there is no explicit information about where that is (and the one
mailing list I found is subscriber-only :( ).

bye, Roman
Roman Zippel
2007-01-31 14:01:50 UTC
Permalink
Hi,
Post by Tom Tromey
This seems unobjectionable to me, but we prefer that non-urgent GC
patches go in upstream before we check them in to GCC.
I found the CVS sources at SourceForge, and these changes are already
included there via Debian.

bye, Roman
Tom Tromey
2007-02-05 16:06:43 UTC
Permalink
Tom> This seems unobjectionable to me, but we prefer that non-urgent
Tom> GC patches go in upstream before we check them in to GCC.

Roman> I found the CVS sources at sourceforge and there these changes
Roman> are already included via Debian.

Great. I'm checking this in to svn trunk.

Tom