RE: Inefficient loop unrolling.

Author: Bingfeng Mei
Date:  
To: Paolo Bonzini
CC: Steven Bosscher, gcc
Subject: RE: Inefficient loop unrolling.
Paolo,
Thanks for the reply. However, I am not sure it is a simple folding
issue.

For example,

B1 = B + 4;
= [A, B1]
B2 = B + 8;
= [A, B2]
B3 = B + 12;
= [A, B3]

Should be transformed to
C = A + B
= [C, 4]
= [C, 8]
= [C, 12]

The loop exit condition needs to be changed accordingly.
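
To make the intent concrete, here is a rough C sketch of the same rewrite
at source level. It is only an illustration, not what the compiler emits:
load32, sum_indexed and sum_based are hypothetical names, and it assumes
n is a multiple of 12 and that the buffer holds at least n + 4 bytes.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: 32-bit load from base + byte offset, i.e. [base, off]. */
static uint32_t load32(const unsigned char *base, size_t off)
{
    uint32_t v;
    memcpy(&v, base + off, sizeof v);
    return v;
}

/* Before: the index B is the induction variable, and every access
   needs its own ADD (B1 = B + 4, B2 = B + 8, B3 = B + 12). */
uint32_t sum_indexed(const unsigned char *a, size_t n)
{
    uint32_t s = 0;
    for (size_t b = 0; b < n; b += 12) {
        s += load32(a, b + 4);
        s += load32(a, b + 8);
        s += load32(a, b + 12);
    }
    return s;
}

/* After: C = A + B is the induction variable, the accesses use constant
   offsets, and the exit test compares C against an end pointer. */
uint32_t sum_based(const unsigned char *a, size_t n)
{
    uint32_t s = 0;
    const unsigned char *end = a + n;
    for (const unsigned char *c = a; c < end; c += 12) {
        s += load32(c, 4);
        s += load32(c, 8);
        s += load32(c, 12);
    }
    return s;
}
```

Both functions read the same bytes; the second form is the one that maps
onto base+offset addressing without extra ADDs in the loop body.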

BTW, I just added an experimental tree-level loop unrolling pass to my
port, right before the ivopts pass. The results are very promising except
for a few quirks, which I believe to be problems in ivopts. The
generated assembly code is now as good as with manual unrolling.

Cheers,
Bingfeng


-----Original Message-----
From: Paolo Bonzini [mailto:paolo.bonzini@???] On Behalf Of Paolo Bonzini
Sent: 10 July 2008 13:34
To: Bingfeng Mei
Cc: Steven Bosscher; gcc@???
Subject: Re: Inefficient loop unrolling.

Bingfeng Mei wrote:
> Steven,
> I just created a bug report. You should receive a CCed mail now.
>
> I can see these issues are solvable at RTL-level, but require lots of
> efforts. The main optimization in loop unrolling pass, split iv, can
> reduce dependence chain but not extra ADDs and alias issue. What is the
> main reason that loop unrolling should belong to RTL level? Is it
> fundamental?


No, it is just effectiveness of the code size expansion heuristics.
Ivopts is already complex enough on the tree level, that doing it on RTL
would be insane. But other low-level loop optimizations had already
been written on the RTL level and since there were no compelling
reasons, they were left there.

That said, this is a bug -- fwprop should have folded the ADDs, at the
very least. I'll look at the PR.

Paolo