1. All six condition codes -- (OF, SF, ZF, AF, PF, CF)
2. Two classes of five condition codes -- (OF,SF,ZF,PF,CF), (OF, SF,
3. One class of two condition codes -- (OF,CF)
just handle the first case i.e. specify that if an instruction is
register to be a source register.
Post by Gabe Black
Yeah, I think we've talked about this topic in the past, but it was a
while ago and I don't remember exactly what all we talked about or the
conclusion(s) we reached.
The problem at the ISA level is that there are lots of instructions in
x86 which are pretty basic and used a lot (adds, subtracts, etc.) which
compute condition codes every time in case you need them. That, combined
with the fact that the instructions which update the condition codes
update somewhat erratic combinations of bits, means that lots of
instructions write the condition code bits, and those same common
instructions read them too so they can do a partial update.
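
To make the partial-update problem concrete, here is a minimal Python sketch (not gem5 code). The bit positions are the architectural EFLAGS positions; the helper names `merge_flags` and `reads_old_flags` are made up for illustration. It assumes all six flags live in a single register, and shows why an instruction that writes only some of them (INC, for example, leaves CF alone) ends up reading the old flags as well.

```python
# Architectural bit positions of the six x86 status flags in EFLAGS.
CF, PF, AF, ZF, SF, OF = 1 << 0, 1 << 2, 1 << 4, 1 << 6, 1 << 7, 1 << 11
ALL_FLAGS = CF | PF | AF | ZF | SF | OF


def merge_flags(old_flags, new_flags, write_mask):
    """Flags after an instruction that updates only the bits in write_mask.

    Hypothetical helper: bits outside write_mask are carried over from
    old_flags, which is what forces the read of the old value.
    """
    return (old_flags & ~write_mask) | (new_flags & write_mask)


def reads_old_flags(write_mask):
    """True if the writer must also read the flags register (partial update)."""
    return (write_mask & ALL_FLAGS) != ALL_FLAGS


# INC updates OF, SF, ZF, AF and PF but preserves CF.
INC_MASK = ALL_FLAGS & ~CF
```

With a single flags register, `reads_old_flags(INC_MASK)` is true, so every INC depends on whichever instruction last wrote the flags, even though it only needs to carry CF through.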
This has happened to a lesser extent before, where control-like bits and
condition-code-like bits share a register. To my knowledge that's
happened at least on SPARC, ARM, and x86. It's dealt with by splitting
the condition code bits out into their own register, which is treated as
a renamed integer register, while the control bits are treated as a misc
reg with all the overhead and special precautions.
That doesn't entirely work on x86, though, because even among the
condition code bits there are a lot of partial accesses as described
above. The cc bits could be broken down into individual bits, but that's
pretty cumbersome since there are, including the two artificial ones for
microcode, 8 of them I believe? That would be a lot of registers to
rename, would slow down the simulator, wouldn't be that realistic, etc.
What real CPUs do, according to someone in the know at AMD I talked to,
is gather up one group of flags, about 4 if I recall, and treat those
as a unit. The others are handled individually. The group of 4 is still
not 100% treated as a unit since some instructions modify just one of
them, for instance, but it's pretty close, optimizes for the common
case, and the odd cases can still work like they do today.
The difficulty implementing this is that exactly which condition code
bits to set and which to check for conditional microops are decided at
the microcode level and are arbitrary combinations. They don't need to
be completely arbitrary, but it does mean that microops effectively only
know which condition code registers they need, how many, etc., at
construction time as opposed to compile time. So what we'd need to do is
to allow the constructor for a microop to look at the flags it was being
given and to use that to more programmatically figure out which registers
it had as sources or destinations, and how many. The body of the
instructions themselves would need to be sophisticated enough to pull
together the different source registers, whatever they are, and to
process them appropriately with a consistent bit of code (and not 18
different parameters to some function where 14 aren't used at any
particular time). It would also have to know how to split things back up
again when writing out the results.
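
A rough Python sketch of what "figuring out sources and destinations at construction time" could look like. This is not gem5 code: the `MicroOp` class, the register names, and above all the grouping in `FLAG_REGS` are assumptions chosen for illustration (the thread does not say which flags belong to the group).

```python
# Architectural bit positions of the six x86 status flags in EFLAGS.
CF, PF, AF, ZF, SF, OF = 1 << 0, 1 << 2, 1 << 4, 1 << 6, 1 << 7, 1 << 11

# ASSUMPTION: a hypothetical split into one grouped register plus two
# individually renamed bits. The real grouping may well differ.
FLAG_REGS = {
    "cc_group": ZF | SF | PF | AF,
    "cf": CF,
    "of": OF,
}


class MicroOp:
    """Hypothetical microop that derives its flag operands in the constructor."""

    def __init__(self, name, write_mask=0, read_mask=0):
        self.name = name
        self.src_regs = []
        self.dest_regs = []
        for reg, bits in FLAG_REGS.items():
            written = bits & write_mask
            if written:
                self.dest_regs.append(reg)
            # A register is a source if the microop reads it explicitly, or
            # writes only part of it and so has to merge with the old value.
            if (bits & read_mask) or (written and written != bits):
                self.src_regs.append(reg)


inc = MicroOp("inc", write_mask=OF | SF | ZF | AF | PF)
set_zf = MicroOp("set_zf", write_mask=ZF)
```

Under this assumed grouping, `inc` writes `cc_group` and `of` in full and has no flag sources at all, while the odd case `set_zf` still falls back to a read-modify-write of `cc_group`.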
What I did to move us a little bit in this direction is to make the
types of operands much more flexible so that we can have structures,
typedefs, etc. What we'd still need is truly composite operand types
where a single operand, for instance the condition code bits, is built
from a set of registers (determined in some way appropriate to the
operand) and/or written back to a set of registers, but which could be
handled easily as a single value inside the code blob. Then we can avoid
having hundreds of versions of microops for all the different combinations
of flag bits, which would be a terrible thing to have to live with.
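
As a sketch of the composite operand idea, assuming the same hypothetical register split as a plain dict-based register file: the `CompositeFlags` class and its `read`/`write` methods are invented names, not an existing gem5 interface. The point is only that the code blob sees one value while the operand gathers from and scatters to several registers.

```python
# Architectural bit positions of the six x86 status flags in EFLAGS.
CF, PF, AF, ZF, SF, OF = 1 << 0, 1 << 2, 1 << 4, 1 << 6, 1 << 7, 1 << 11

# ASSUMPTION: hypothetical split of the flags across renamed registers.
FLAG_REGS = {
    "cc_group": ZF | SF | PF | AF,
    "cf": CF,
    "of": OF,
}


class CompositeFlags:
    """Hypothetical operand: one flags value backed by several registers."""

    def __init__(self, reg_file, mask):
        self.reg_file = reg_file
        self.mask = mask
        # Only registers holding at least one requested bit take part.
        self.regs = [r for r, bits in FLAG_REGS.items() if bits & mask]

    def read(self):
        """Gather the requested bits into a single EFLAGS-style value."""
        value = 0
        for reg in self.regs:
            value |= self.reg_file[reg] & FLAG_REGS[reg]
        return value & self.mask

    def write(self, value):
        """Scatter the requested bits back, preserving unrequested bits."""
        for reg in self.regs:
            bits = FLAG_REGS[reg] & self.mask
            self.reg_file[reg] = (self.reg_file[reg] & ~bits) | (value & bits)


reg_file = {"cc_group": ZF | PF, "cf": CF, "of": 0}
flags = CompositeFlags(reg_file, ZF | SF | CF)
```

Here `flags.read()` yields `ZF | CF`, and `flags.write(SF)` clears ZF and CF, sets SF, and leaves PF and the `of` register untouched, so one microop body can serve any combination of flag bits.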
As far as easier ways to deal with this, there is only one, which is what
I was alluding to in what I think was my earliest email: just hack
around it so that the instructions you know you're using in the
performance-sensitive part behave incorrectly in general, but do
what you expect for the benchmark. Maybe they'd even have to know where
they were running from, that they were in a range of ticks, etc. A gross
and terrible hack unfit to check in, but something that would get the
poster unstuck for now. Doing things the "right" way will take some
infrastructure work, and that may not be very quick. I don't think
there's any real shortcut around doing the infrastructure work that
doesn't have a pretty heavy cost (like blowing up the number of microop
classes 100-fold).
Gabe
Post by Watanabe, Yasuko
Hi Gabe,
Your earlier email said "I've made some changes over time which should make
it easier to do this like a real x86 CPU would". Could you expand on that?
It sounded like you had some sort of plan or direction at least. If we're
going to start working on this ourselves, it would be best if we can
benefit from whatever insights you've had or preliminary work you've done.
I see your later email says "I don't have any ideas for how to make it much
simpler", but that seems to contradict what you said at first. In
particular, you also earlier said "If you have an idea of how to get it to
do what you want locally, feel free. That will get you going, and when I
get it fixed for real then you can start using that.". I'd like to
explicitly reject that idea... for one thing, I'm not sure what a "local"
solution would look like, and more importantly, this issue seems
complicated enough that us doing some sort of temporary or stopgap solution
like you're implying, only to throw it away once you've done it "for real",
seems like a huge waste of effort. So overall I'd like to be sure we're in
sync with whatever you're thinking to make sure that our efforts are
additive and complementary and not redundant.
Thanks,
Steve
Post by Watanabe, Yasuko
_______________________________________________
gem5-dev mailing list
http://m5sim.org/mailman/listinfo/gem5-dev