llvm.org GIT mirror: llvm / 7ede51b
add a note
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@45397 91177308-0d34-0410-b5e6-96231b3b80d8
Chris Lattner
1 changed file with 41 additions and 1 deletion.
	addl	(%edx,%edi,4), %ebx
	addl	%ebx, (%ecx,%edi,4)

Here is another interesting example:

void vertical_compose97iH1(int *b0, int *b1, int *b2, int width){
    int i;
    for(i=0; i<width; i++)
        b1[i] -= (1*(b0[i] + b2[i])+0)>>0;
}

We miss the r/m/w opportunity here by using 2 subs instead of an add+sub[mem]:

LBB9_2:	# bb
	movl	(%ecx,%edi,4), %ebx
	subl	(%esi,%edi,4), %ebx
	subl	(%edx,%edi,4), %ebx
	movl	%ebx, (%ecx,%edi,4)
	incl	%edi
	cmpl	%eax, %edi
	jne	LBB9_2	# bb

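Something like this would use the r/m/w form instead (a hand-written sketch,
not verified compiler output; it assumes %esi/%edx still hold b0/b2 and %ecx
holds b1, as in the code above):

LBB9_2:	# bb
	movl	(%esi,%edi,4), %ebx	# load b0[i]
	addl	(%edx,%edi,4), %ebx	# add b2[i]
	subl	%ebx, (%ecx,%edi,4)	# b1[i] -= ..., read/modify/write
	incl	%edi
	cmpl	%eax, %edi
	jne	LBB9_2	# bb
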
Additionally, LSR should rewrite the exit condition of these loops to use
a stride-4 IV, which would allow all the scales in the loop to go away.
This would result in smaller code and more efficient microops.
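
For example, the rewritten loop could look like this (a sketch that assumes
the induction variable counts bytes and the trip-count bound is pre-scaled,
i.e. %eax holds width*4 on entry):

LBB9_2:	# bb
	movl	(%esi,%edi), %ebx	# no scale needed: %edi counts bytes
	addl	(%edx,%edi), %ebx
	subl	%ebx, (%ecx,%edi)
	addl	$4, %edi	# stride-4 IV
	cmpl	%eax, %edi	# %eax = width*4
	jne	LBB9_2	# bb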

//===---------------------------------------------------------------------===//

We should be smarter about conversion from fpstack to XMM regs.

double foo();
void bar(double *P) { *P = foo(); }

We compile that to:

_bar:
	subl	$12, %esp
	call	L_foo$stub
	fstpl	(%esp)
	movl	16(%esp), %eax
	movsd	(%esp), %xmm0
	movsd	%xmm0, (%eax)
	addl	$12, %esp
	ret

The magic to/from the stack is unneeded: foo's result comes back on the x87
stack, and it could be stored to *P directly from there.
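
Something like this should suffice (hand-written sketch, not verified output):

_bar:
	subl	$12, %esp
	call	L_foo$stub
	movl	16(%esp), %eax	# load P
	fstpl	(%eax)	# store the x87 result straight to *P
	addl	$12, %esp
	ret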

//===---------------------------------------------------------------------===//