So far we have been looking at debug-mode assembly on iOS and the Raspberry Pi. If one passes -O2 to the compiler on the Pi (and switches to a release build on the iPhone), the optimized version of the square function reduces to the following two ARM instructions:

mul r0, r0, r0 //r0 = r0*r0
bx lr //branch back to caller
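For reference, here is a minimal sketch of C source that would be expected to reduce to those two instructions. The declaration of square is not shown in this section, so the int-in, int-out signature is an assumption, and check_square is a hypothetical helper added only for illustration.

```c
#include <assert.h>

/* Assumed signature: one int argument, one int result.
   Under the ARM calling convention the first argument arrives in r0 and
   the return value leaves in r0, so no stack traffic is needed. */
int square(int x)
{
    return x * x;   /* corresponds to: mul r0, r0, r0 ; bx lr */
}

/* Hypothetical sanity check; call it from main() if desired. */
void check_square(void)
{
    assert(square(0) == 0);
    assert(square(-3) == 9);
    assert(square(12) == 144);
}
```

On the Pi, something like `gcc -O2 -S square.c` writes the generated assembly to square.s, so the two-instruction body can be inspected directly.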

 
 
1. This is sad because, for an ARM learning platform, nothing could be more important than a toolchain that supports Thumb.

2. If we force Xcode to produce ARM instructions by specifying -mno-thumb, we see the following code for the square function:

0x101af4:  sub    sp, sp, #4
0x101af8:  str    r0, [sp]
0x101afc:  ldr    r0, [sp]
0x101b00:  ldr    r1, [sp]
0x101b04:  mul    r0, r0, r1
0x101b08:  add    sp, sp, #4
0x101b0c:  bx     lr

3. The other interesting bit to note here is the 12-byte reservation when just 4 bytes of locals are needed. It is not apparent to me why that is (if any of you know more about this, please leave a comment below). The ARM AAPCS/EABI requires at most 8-byte alignment, which a 4-byte reservation would still have preserved: push {r11} has already decremented our presumably 8-byte-aligned sp by 4, so reserving 4 more bytes would bring it back to an 8-byte boundary.
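The alignment arithmetic in the note above can be checked with a small sketch. The helper names and the entry sp value used in the usage note are hypothetical; the only facts taken from the text are the 4-byte push {r11} and the 4-byte versus 12-byte reservations.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True when sp is a multiple of alignment.
   alignment must be a nonzero power of two. */
bool is_aligned(uint32_t sp, uint32_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (sp & (alignment - 1)) == 0;
}

/* Value of sp after "push {r11}" (4 bytes) followed by
   "sub sp, sp, #locals". The stack grows downward. */
uint32_t sp_after_prologue(uint32_t entry_sp, uint32_t locals)
{
    return entry_sp - 4u - locals;
}
```

Starting from a hypothetical 8-byte-aligned entry sp such as 0x7efff000, is_aligned(sp_after_prologue(0x7efff000u, 4u), 8u) and is_aligned(sp_after_prologue(0x7efff000u, 12u), 8u) both hold. In other words, the 8-byte requirement alone does not account for the extra 8 bytes.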

4. Apparently, Windows also uses r11 as the frame pointer when emitting ARM instructions. In Thumb mode, Windows uses r7 as the frame pointer.
