Post by s***@casperkitty.com
Post by Robert Wessel
And while x86-64 can handle multiple precision code fairly well, a lot
of RISCs and other machines make it rather more tedious. Lack of a
carry flag or double width multiplication are common. Both can
obviously be worked around, but the difference between
(mov/add/mov/adc/mov/adc/mov/adc) to implement a quad add on x86 and
what you have to do on Alpha (without a carry flag) is pretty
substantial.
On systems that lack add-with-carry, it's tough to get a correct carry
out of the second word in both cases where the second word of an input
operand has all bits set and there's a carry out of the first word.
Clumsy and with excessive overhead, but not actually tough. On Alpha,
for example, you'd do something like:
;$1=carry in/out
;$2=A (in)
;$3=B (in)
;$4=sum (out)
;$5=work
addq $2,$3,$4
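cmpult $4,$3,$5 ;unsigned compare: carry out of A+B
addq $1,$4,$4
cmpult $4,$1,$1 ;unsigned compare: carry out of adding carry in
bis $1,$5,$1 ;combine the two carries into $1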
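For anyone who would rather read that in C, here is a minimal sketch of the same compare-based carry recovery. The names add_limb and add_quad are made up for illustration, and the limbs are assumed to be 64 bits wide and stored least significant first. The comparisons have to be unsigned for this to work.

```c
#include <stdint.h>

/* Add two 64-bit limbs plus an incoming carry (0 or 1).
   Returns the low 64 bits of the sum and stores the outgoing carry. */
static uint64_t add_limb(uint64_t a, uint64_t b, unsigned carry_in,
                         unsigned *carry_out)
{
    uint64_t sum = a + b;             /* wraps modulo 2^64 */
    unsigned c1 = sum < b;            /* wrapped iff the result is below an addend */
    uint64_t total = sum + carry_in;
    unsigned c2 = total < sum;        /* wrapped iff adding the carry overflowed */
    *carry_out = c1 | c2;             /* at most one of the two can be set */
    return total;
}

/* Four-limb (256-bit) add, least significant limb first.
   Returns the carry out of the top limb. */
static unsigned add_quad(uint64_t r[4], const uint64_t a[4],
                         const uint64_t b[4])
{
    unsigned carry = 0;
    for (int i = 0; i < 4; i++)
        r[i] = add_limb(a[i], b[i], carry, &carry);
    return carry;
}
```

The all-bits-set case is covered because the two wraparounds cannot both happen in the same limb: if a + b wrapped, the wrapped sum is at most 2^64 - 2, so adding a carry of 1 cannot wrap again.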
On S/360 it was really painful, since the only way you could really do
this was to do a conditional branch after each add:
;R1=carry in/out
;R2=A (in)
;R3=B (in)
;R4=sum (out)
;R5=work
;R6=1 (constant)
XR R5,R5 ;zero R5
LR R4,R2
ALR R4,R3
BC 12,NC1 ;branch no carry (CC 0 or 1)
LR R5,R6
NC1:
ALR R4,R1
BC 12,NC2 ;branch no carry (CC 0 or 1)
OR R5,R6
NC2:
LR R1,R5
Not actually difficult, just really painful.
The alternative of retrieving the condition code value and processing
that was even worse, because the only way to do that was to issue a
subroutine call instruction, which returned it in the high byte of the
return address register. That would have looked like the following
IPM-based sample, but with the IPMs replaced with a subroutine call to
the following instruction (IOW "BALR Rx,0").
Somewhere at or just before the start of the 31-bit (XA) era a
specific instruction to do that was added (insert program mask). At
least theoretically you could do a shift and mask on that, and end up
with something like:
;R1=carry in/out
;R2=A (in)
;R3=B (in)
;R4=sum (out)
;R5=work
;R6=1 (constant)
LR R4,R2
ALR R4,R3 ;logical add, so the CC reflects carry
IPM R5 ;CC lands in bits 2-3; bit 2 indicates carry after ALR
SRL R5,29 ;a shift count of 29 brings bit 2 down to the low bit
ALR R4,R1
IPM R1
SRL R1,29
OR R1,R5
NR R1,R6
That's still ugly as sin, but at least it avoids *two* conditional
branches.
It was always tempting to do this with halfword sized limbs instead.
On S/370 and later, 24-bit limbs were plausible too.
Now, of course, you can do it in a single instruction.
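As a rough illustration of why halfword limbs were tempting: if each limb is only 16 bits but the arithmetic is done in a 32-bit register, the carry simply shows up as bit 16 of the sum, and no condition code is needed at all. A minimal C sketch, where add_halfword_limbs is a made-up name and the limbs are assumed to be stored least significant first:

```c
#include <stddef.h>
#include <stdint.h>

/* Multi-precision add using 16-bit limbs, least significant limb first.
   Each step is carried out in 32 bits, so the carry appears as bit 16
   of the intermediate sum. Returns the carry out of the top limb. */
static unsigned add_halfword_limbs(uint16_t *r, const uint16_t *a,
                                   const uint16_t *b, size_t n)
{
    uint32_t carry = 0;
    for (size_t i = 0; i < n; i++) {
        uint32_t t = (uint32_t)a[i] + b[i] + carry;  /* at most 0x1FFFF */
        r[i] = (uint16_t)(t & 0xFFFFu);              /* keep the low halfword */
        carry = t >> 16;                             /* 0 or 1 */
    }
    return (unsigned)carry;
}
```

The cost is twice as many limbs for the same precision, which is the trade-off against the branch-per-add approach.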
Post by s***@casperkitty.com
For applications which need to yield an arithmetically correct result
if no overflow occurred, or report that an overflow occurred if the
result would not be arithmetically correct, but which are not required
to report overflows that don't end up affecting the results, it may be
helpful to have a type which could, at the compiler's leisure, either
keep some precision beyond a normal type or truncate such precision and
set an error flag if doing so would change the value. Given code like
int x=(y+z)/2;
having a compiler do an add followed by a rotate right through carry may
be cheaper than checking whether the addition overflowed and trapping if
so.
Assuming there *is* a rotate through carry.
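Where there is no rotate through carry, the portable way to get the same effect from C is to keep the extra bit in a wider intermediate. A minimal sketch, assuming 32-bit operands and a 64-bit type being available; average_wide is a made-up name:

```c
#include <stdint.h>

/* (y + z) / 2 for 32-bit inputs. The sum is formed in 64 bits, so it
   cannot overflow, and the halved result always fits back in 32 bits. */
static int32_t average_wide(int32_t y, int32_t z)
{
    int64_t sum = (int64_t)y + (int64_t)z;  /* range -2^32 .. 2^32 - 2 */
    return (int32_t)(sum / 2);              /* range -2^31 .. 2^31 - 1 */
}
```

Whether a compiler turns that into an add plus rotate, a 64-bit add, or something else is up to the target. The division truncates toward zero, as C requires.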
Post by s***@casperkitty.com
Likewise, when adding a bunch of numbers it may be easier to do an
extended-precision add and then check whether the result is in range of
the target type, than to check for overflow after every step. Some
platforms are good at overflow checking and bad at multi-precision math,
while others handle multi-precision math well but aren't as good at
overflow checking. Letting a compiler pick which approach would be better
in any given situation would allow the required semantics to be achieved
more quickly than if the programmer had to force it.
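The "extended-precision add, then one range check" variant might look roughly like the following in C. This is only a sketch: sum_checked is a made-up name, the inputs are assumed to be 32-bit, and the element count is assumed to be small enough (below 2^31) that the 64-bit accumulator cannot itself overflow.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Sum n 32-bit values in a 64-bit accumulator and check the range once
   at the end. Intermediate totals may leave the 32-bit range without
   causing a failure, as long as the final total fits.
   Assumption: n < 2^31, so the accumulator cannot overflow. */
static bool sum_checked(const int32_t *v, size_t n, int32_t *out)
{
    int64_t acc = 0;
    for (size_t i = 0; i < n; i++)
        acc += v[i];
    if (acc < INT32_MIN || acc > INT32_MAX)
        return false;               /* result would not be arithmetically correct */
    *out = (int32_t)acc;
    return true;
}
```

Note that this deliberately does not report an overflow for something like INT32_MAX + 1 - 2, which is exactly the "overflows that don't end up affecting the results" case.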
In many respects Cobol did that approximately right. You could
generally ask for overflow (really out-of-range) checking on any
computation, and it was up to the compiler to figure out how to do it:
compute x = (y + z) / 2
on size error
...error handling code...
As a general concept, it would trigger the on size error clause when
the result would not fit in the destination. And in general the
assumption was that the result *would* be computed correctly (subject
to the as-if rule, of course). So if you defined x as a two (decimal)
digit field, and y and z as 18 digits, you'd expect the result to be
equivalent to computing a 19 digit sum, dividing that by two, and then
checking if that result fits in two digits.
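To make that concrete, here is a rough C model of what the two-digit example would have to be equivalent to. It is only a sketch under assumptions not in the original: the fields are taken to be signed integers, compute_x is a made-up name, and the false return stands in for taking the on size error branch. Two 18-digit values always sum to something that fits in 64 bits, so int64_t is wide enough for the 19 digit intermediate.

```c
#include <stdbool.h>
#include <stdint.h>

/* Model of "compute x = (y + z) / 2 on size error ...":
   y and z hold up to 18 decimal digits, x holds two.
   Returns false where the size error clause would be triggered. */
static bool compute_x(int64_t y, int64_t z, int8_t *x)
{
    int64_t sum = y + z;        /* at most 19 digits, within int64_t */
    int64_t q = sum / 2;        /* fraction dropped (no rounding requested) */
    if (q < -99 || q > 99)
        return false;           /* does not fit in two decimal digits */
    *x = (int8_t)q;
    return true;
}
```

The point is that only the final value is checked against the destination, however large the intermediate sum gets.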
You could omit the on size error clause, and then you'd usually get
some sort of odd truncation, often depending on the formats of the
types being used (for example if x and the intermediate result were
binary, you might just end up with the low 16 bits of the result in
x, despite that being rather out of range). Note that the original
Cobol spec allowed only decimal truncation, except that was ignored by
basically everyone as the overhead was too high - newer versions
explicitly allow other truncation modes.