Re: optimizing predictable branches on x86
Mon Jul 07 20:01:06 2008
On Tuesday 26 February 2008 21:14, Jan Hubicka wrote:
> > Core2 follows a similar pattern, although it's not seeing any
> > slowdown in the "no deps, predictable, jmp" case like K8 does.
> > Any comments? (please cc me) Should gcc be using conditional jumps
> > more often eg. in the case of __builtin_expect())?
> The problem is that in general GCC's branch prediction algorithms are
> very poor at predicting the predictability of a branch: they are pretty
> good at guessing the outcome, but that is all.
Yes, I guess this would be tricky. I wonder if there would be use
in having a __builtin_predictable() type of thing. I know there
are cases where we could use it in Linux (eg. we have a lot of
tunable things, but usually they aren't changed often). Then again,
maybe the benefit of doing these annotations would be too small
to bother about.
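
For context, a minimal sketch of what can be expressed today. __builtin_expect (wrapped here in Linux-style likely/unlikely macros) only tells GCC which way a branch usually goes; it says nothing about how predictable the branch is, which is the gap a __builtin_predictable()-style hint would fill. The tunable clamp_enabled and the helper apply_limit are made-up names for illustration, not anything from the kernel.

```c
#include <assert.h>

/* Linux-style hint macros built on GCC's __builtin_expect. */
#define likely(x)   __builtin_expect(!!(x), 1)
#define unlikely(x) __builtin_expect(!!(x), 0)

/* Hypothetical tunable: set rarely, read on a hot path. */
static int clamp_enabled = 0;

/* Hypothetical hot-path helper. The hint states the expected
 * direction of the tunable test, not its predictability. */
static long apply_limit(long value, long limit)
{
    assert(limit >= 0);  /* sketch-level sanity check */
    if (unlikely(clamp_enabled) && value > limit)
        value = limit;
    return value;
}
```

Whether the compiler emits a conditional jump or a cmov for the assignment depends on the GCC version, the target, and the optimization flags.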
Linux generally has reasonable likely/unlikely annotations, so I
guess we can first wait to see if predictable branch optimization
gives any benefit there. (or for those doing benchmarks with
> Only cases we do so quite reliably IMO are:
> 1) loop branches that are not interesting for cmov conversion
> 2) branches leading to noreturn calls, also not interesting
> 3) builtin_expect mentioned.
> 4) when profile feedback is around to some degree (ie we know when the
> branch is very likely or very unlikely. We don't simulate what
> hardware will do on it).
At least on x86 it should also be a good idea to know which way
the branch is going to go. Because x86 doesn't have explicit branch
hints, you really want to be able to optimize for the cold branch
predictor case when converting from cmov to conditional branches.
> I guess we can implement the machinery for 3 and 4 (in fact I once
> played with adding an EDGE_PREDICTABLE_P predicate that basically tested
> if the estimated probability of the branch is <5% or >95%) but never got
> really noticeable improvements out of it and gave up.
> It was before Core2 times, so it might be helping now. But it needs
> updating for the backend cost interface, as ifcvt is a bit inflexible in this.
> I had BRANCH_COST and PREDICTABLE_BRANCH_COST macros.
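
The predicate described above could look roughly like the following. This is a standalone sketch, not GCC's code: the 10000-based probability scale, the function names, and the idea of passing the two costs as plain integers are all assumptions made for illustration. Only the 5%/95% thresholds and the two macro names come from the quoted text.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed fixed-point scale: PROB_BASE corresponds to 100%. */
#define PROB_BASE 10000

/* True when the estimated probability of taking the edge is
 * below 5% or above 95%, i.e. the branch is heavily biased. */
static bool edge_predictable_p(int prob)
{
    assert(prob >= 0 && prob <= PROB_BASE);
    return prob < PROB_BASE * 5 / 100 || prob > PROB_BASE * 95 / 100;
}

/* Pick a cost for the branch: the cheaper "predictable" cost when
 * the edge is heavily biased, the ordinary branch cost otherwise.
 * Both cost values are supplied by the caller (hypothetical). */
static int branch_cost_for(int prob, int branch_cost,
                           int predictable_branch_cost)
{
    return edge_predictable_p(prob) ? predictable_branch_cost
                                    : branch_cost;
}
```

An if-conversion pass would compare this cost against the cost of the cmov sequence and keep the jump only when the branch comes out cheaper.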
cmov performance seems to be pretty robust (I was surprised it is so
good), so it definitely seems like the right thing to do by default.
It will be hard to beat, but I hope there is room for some improvement.
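
As an illustration of the two code shapes being compared in this thread, here is a small sketch. The function names are made up, and the comments describe what a compiler may choose, not what any particular GCC version is guaranteed to emit.

```c
#define unlikely(x) __builtin_expect(!!(x), 0)

/* Select form: a simple conditional move candidate. On x86 the
 * compiler may emit cmp + cmov here, with no branch to predict. */
static long max_select(long a, long b)
{
    return (a > b) ? a : b;
}

/* Branch form with a direction hint: the caller claims b > a is
 * rare. The compiler may keep a conditional jump, or may still
 * if-convert this into a cmov. */
static long max_branch(long a, long b)
{
    if (unlikely(b > a))
        return b;
    return a;
}
```

Both functions return the same value for every input; comparing the generated assembly (for example with gcc -O2 -S) shows which form the compiler picked on a given target.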