Inline Assembly (for the GNU assembler): the differences between AT&T and Intel syntax

Source: Internet. Posted by: 第一位女程序员. Editor: 程序博客网. Date: 2024/06/07 15:23

Intel and AT&T Syntax.

Intel and AT&T syntax assembly language look very different from each other, which leads to confusion when one first comes across AT&T syntax after having learnt Intel syntax, or vice versa. So let's start with the basics.

Prefixes.

In Intel syntax there are no register prefixes or immediate prefixes. In AT&T syntax, however, registers are prefixed with a '%' and immediates are prefixed with a '$'. Intel syntax hexadecimal and binary immediate data are suffixed with 'h' and 'b' respectively. Also, if the first hexadecimal digit is a letter then the value is prefixed with a '0'.

Example:

Intel Syntax

mov     eax,1

mov     ebx,0ffh

int     80h

AT&T Syntax

movl    $1,%eax

movl    $0xff,%ebx

int     $0x80

Direction of Operands.

The direction of the operands in Intel syntax is the opposite of that in AT&T syntax. In Intel syntax the first operand is the destination and the second operand is the source, whereas in AT&T syntax the first operand is the source and the second operand is the destination. The advantage of AT&T syntax in this situation is obvious. We read from left to right, we write from left to right, so this way is only natural.

Example:

Intel Syntax

instr   dest,source

mov     eax,[ecx]

AT&T Syntax

instr   source,dest

movl    (%ecx),%eax

Memory Operands.

Memory operands, as seen above, are different too. In Intel syntax the base register is enclosed in '[' and ']', whereas in AT&T syntax it is enclosed in '(' and ')'.

Example:

Intel Syntax

mov     eax,[ebx]

mov     eax,[ebx+3]

AT&T Syntax

movl    (%ebx),%eax

movl    3(%ebx),%eax

The AT&T form for instructions involving complex operations is very obscure compared to Intel syntax. The Intel syntax form of these is segreg:[base+index*scale+disp]. The AT&T syntax form is %segreg:disp(base,index,scale).

Index, scale, disp and segreg are all optional and can simply be left out. Scale, if not specified while index is specified, defaults to 1. Segreg depends on the instruction and on whether the app is being run in real mode or protected mode. In real mode it depends on the instruction, whereas in protected mode it's unnecessary. Immediate data should not be prefixed with '$' in AT&T syntax when used for scale or disp.

Example:

Intel Syntax

instr   foo,segreg:[base+index*scale+disp]

mov     eax,[ebx+20h]

add     eax,[ebx+ecx*2h]

lea     eax,[ebx+ecx]

sub     eax,[ebx+ecx*4h-20h]

AT&T Syntax

instr   %segreg:disp(base,index,scale),foo

movl    0x20(%ebx),%eax

addl    (%ebx,%ecx,0x2),%eax

leal    (%ebx,%ecx),%eax

subl    -0x20(%ebx,%ecx,0x4),%eax

As you can see, AT&T is very obscure. [base+index*scale+disp] makes more sense at a glance than disp(base,index,scale).
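Whichever notation is used, the operand names the same address. As a rough illustration, here is a minimal C sketch of that arithmetic. The function name `effective_address` and its parameter names are made up for this example, and segment registers are left out entirely.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: computes disp + base + index*scale, the 32-bit
 * address that both [base+index*scale+disp] and disp(base,index,scale)
 * describe. Segment registers are ignored in this sketch. */
uint32_t effective_address(uint32_t base, uint32_t index,
                           uint32_t scale, int32_t disp)
{
    /* x86 only encodes scale factors of 1, 2, 4 or 8. */
    assert(scale == 1 || scale == 2 || scale == 4 || scale == 8);
    /* Unsigned arithmetic wraps modulo 2^32, like the hardware does. */
    return base + index * scale + (uint32_t)disp;
}
```

With ebx = 0x1000 and ecx = 3, the operand -0x20(%ebx,%ecx,0x4) from the last example corresponds to `effective_address(0x1000, 3, 4, -0x20)`, that is 0x1000 + 3*4 - 0x20 = 0xfec.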

Suffixes.

As you may have noticed, the AT&T syntax mnemonics have a suffix. The significance of this suffix is that of operand size. 'l' is for long, 'w' is for word, and 'b' is for byte. Intel syntax has similar directives for use with memory operands, i.e. byte ptr, word ptr, dword ptr. "dword" of course corresponds to "long". This is similar to type casting in C, but it doesn't seem to be necessary since the size of the registers used is the assumed datatype.

Example:

Intel Syntax

mov     al,bl

mov     ax,bx

mov     eax,ebx

mov     eax, dword ptr [ebx]

AT&T Syntax

movb    %bl,%al

movw    %bx,%ax

movl    %ebx,%eax

movl    (%ebx),%eax

Brennan's Guide to Inline Assembly

by Brennan "Bas" Underwood

This document has disappeared from its previous resting place. Apparently Brennan has changed ISPs. If you know where it is, please let me know.

Document version 1.1.2.2

Ok. This is meant to be an introduction to inline assembly under DJGPP. DJGPP is based on GCC, so it uses the AT&T/UNIX syntax and has a somewhat unique method of inline assembly. I spent many hours figuring some of this stuff out and told Info that I hate it, many times.

Hopefully if you already know Intel syntax, the examples will be helpful to you. I've put variable names, register names and other literals in bold type.

The Syntax

So, DJGPP uses the AT&T assembly syntax. What does that mean to you?

Register naming:
Register names are prefixed with "%". To reference eax:
```
AT&T:  %eax
Intel: eax
```
Source/Destination Ordering:
In AT&T syntax (which is the UNIX standard, BTW) the source is always on the left, and the destination is always on the right.
So let's load ebx with the value in eax:
```
AT&T:  movl %eax, %ebx
Intel: mov ebx, eax
```
Constant value/immediate value format:
You must prefix all constant/immediate values with "$".
Let's load eax with the address of the "C" variable booga, which is static.
```
AT&T:  movl $_booga, %eax
Intel: mov eax, _booga
```
Now let's load ebx with 0xd00d:
```
AT&T:  movl $0xd00d, %ebx
Intel: mov ebx, 0d00dh
```
Operator size specification:
You must suffix the instruction with one of b, w, or l to specify the width of the destination register as a byte, word or longword. If you omit this, GAS (GNU assembler) will attempt to guess. You don't want GAS to guess, and guess wrong! Don't forget it.
```
AT&T:  movw %ax, %bx
Intel: mov bx, ax
```
The equivalent forms for Intel are byte ptr, word ptr, and dword ptr, but that is for when you are...
Referencing memory:
DJGPP uses 386-protected mode, so you can forget all that real-mode addressing junk, including the restrictions on which register has what default segment and which registers can be base or index pointers. Now, we just get 6 general purpose registers. (7 if you use ebp, but be sure to restore it yourself or compile with -fomit-frame-pointer.)
Here is the canonical format for 32-bit addressing:
```
AT&T:  immed32(basepointer,indexpointer,indexscale)
Intel: [basepointer + indexpointer*indexscale + immed32]
```
You could think of the formula to calculate the address as:
```
  immed32 + basepointer + indexpointer * indexscale
```
You don't have to use all those fields, but you do have to have at least 1 of immed32, basepointer and you MUST add the size suffix to the operator!
Let's see some simple forms of memory addressing:
- Addressing a particular C variable:
```
AT&T:  _booga
Intel: [_booga]
```
  Note: the underscore ("_") is how you get at static (global) C variables from assembler. This only works with global variables. Otherwise, you can use extended asm to have variables preloaded into registers for you. I address that farther down.
- Addressing what a register points to:
```
AT&T:  (%eax)
Intel: [eax]
```
- Addressing a variable offset by a value in a register:
```
AT&T:  _variable(%eax)
Intel: [eax + _variable]
```
- Addressing a value in an array of integers (scaling up by 4):
```
AT&T:  _array(,%eax,4)
Intel: [eax*4 + _array]
```
- You can also do offsets with the immediate value:
```
C code: *(p+1) where p is a char *
AT&T:  1(%eax) where eax has the value of p
Intel: [eax + 1]
```
- You can do some simple math on the immediate value:
```
AT&T: _struct_pointer+8
```
  I assume you can do that with Intel format as well.
- Addressing a particular char in an array of 8-character records:
  eax holds the number of the record desired. ebx has the wanted char's offset within the record.
```
AT&T:  _array(%ebx,%eax,8)
Intel: [ebx + eax*8 + _array]
```
Whew. Hopefully that covers all the addressing you'll need to do. As a note, you can put esp into the address, but only as the base register.
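To tie the last addressing form back to C, here is a small sketch of the address that _array(%ebx,%eax,8) names. The `records` table, its size of 4 entries, and the function `record_char` are all invented for illustration; only the scale of 8 comes from the example.

```c
#include <assert.h>
#include <stddef.h>

#define RECORD_SIZE 8   /* matches the scale factor in _array(%ebx,%eax,8) */

/* Hypothetical table of 8-character records, standing in for _array. */
char records[4][RECORD_SIZE];

/* Returns the address that _array(%ebx,%eax,8) would name:
 * displacement (the array's address) + base (offset within the
 * record) + index (record number) * scale (record size). */
char *record_char(size_t record /* eax */, size_t offset /* ebx */)
{
    assert(record < 4 && offset < RECORD_SIZE);
    return (char *)records + offset + record * RECORD_SIZE;
}
```

In other words, `record_char(2, 5)` is the same pointer as `&records[2][5]`, which is the expression a C compiler would turn into this kind of operand.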

Basic inline assembly

The format for basic inline assembly is very simple, and much like Borland's method.

asm ("statements");

Pretty simple, no? So

asm ("nop");

will do nothing of course, and

asm ("cli");

will stop interrupts, with

asm ("sti");

of course enabling them. You can use __asm__ instead of asm if the keyword asm conflicts with something in your program.

When it comes to simple stuff like this, basic inline assembly is fine. You can even push your registers onto the stack, use them, and put them back.

asm ("pushl %eax\n\t"
     "movl $0, %eax\n\t"
     "popl %eax");

(The \n's and \t's are there so the .s file that GCC generates and hands to GAS comes out right when you've got multiple statements per asm.)
It's really meant for issuing instructions for which there is no equivalent in C and which don't touch the registers.

But if you do touch the registers, and don't fix things at the end of your asm statement, like so:

asm ("movl %eax, %ebx");
asm ("xorl %ebx, %edx");
asm ("movl $0, _booga");

then your program will probably blow things to hell. This is because GCC hasn't been told that your asm statement clobbered ebx and edx and booga, which it might have been keeping in a register, and might plan on using later. For that, you need:

Extended inline assembly

The basic format of the inline assembly stays much the same, but now gets Watcom-like extensions to allow input arguments and output arguments.

Here is the basic format:

asm ( "statements" : output_registers : input_registers : clobbered_registers);

Let's just jump straight to a nifty example, which I'll then explain:

asm ("cld\n\t"
     "rep\n\t"
     "stosl"
     : /* no output registers */
     : "c" (count), "a" (fill_value), "D" (dest)
     : "%ecx", "%edi" );

The above stores the value in fill_value count times to the pointer dest.

Let's look at this bit by bit.

asm ("cld\n\t"

We are clearing the direction flag in the flags register. (Intel syntax uses the same mnemonic, cld; cltd is a different instruction, the AT&T name for cdq.) You never know what state the flag has been left in, and clearing it costs you all of 1 or 2 cycles.

     "rep\n\t"
     "stosl"

Notice that GAS requires the rep prefix to occupy a line of its own. Notice also that stos has the l suffix to make it move longwords.

     : /* no output registers */

Well, there aren't any in this function.

     : "c" (count), "a" (fill_value), "D" (dest)

Here we load ecx with count, eax with fill_value, and edi with dest. Why make GCC do it instead of doing it ourselves? Because GCC, in its register allocating, might be able to arrange for, say, fill_value to already be in eax. If this is in a loop, it might be able to preserve eax through the loop, and save a movl once per loop.

     : "%ecx", "%edi" );

And here's where we specify to GCC, "you can no longer count on the values you loaded into ecx or edi to be valid." This doesn't mean they will be reloaded for certain. This is the clobberlist.

Seem funky? Well, it really helps when optimizing, when GCC can know exactly what you're doing with the registers before and after. It folds your assembly code into the code it generates (whose rules for generation look remarkably like the above) and then optimizes. It's even smart enough to know that if you tell it to put (x+1) in a register, then if you don't clobber it, and later C code refers to (x+1), and it was able to keep that register free, it will reuse the computation. Whew.
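Here is a sketch of the same fill wrapped in a complete function. It departs from the example above in one respect: current GCC releases reject a clobber that names a register also used as an operand, so the registers that rep stosl modifies are declared as read-write operands with the "+" modifier instead, and "memory" is listed because the statement writes through dest. The function name `fill_dwords`, the use of size_t for the count, and the portable fallback are my own assumptions, not part of the original guide.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical wrapper: store fill_value into count consecutive
 * 32-bit words starting at dest. */
void fill_dwords(uint32_t *dest, uint32_t fill_value, size_t count)
{
    assert(dest != NULL || count == 0);
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    __asm__ __volatile__ (
        "cld\n\t"
        "rep\n\t"
        "stosl"
        : "+D" (dest), "+c" (count)   /* edi/ecx are read and modified */
        : "a" (fill_value)            /* eax is only read */
        : "memory", "cc");            /* writes memory, changes flags */
#else
    /* Portable fallback for other compilers and architectures. */
    while (count--)
        *dest++ = fill_value;
#endif
}
```

Calling `fill_dwords(buf, 0xd00d, 4)` on a five-element array should overwrite the first four words and leave the fifth alone.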

Here's the list of register loading codes that you'll be likely to use:

a        eax
b        ebx
c        ecx
d        edx
S        esi
D        edi
I        constant value (0 to 31)
q,r      dynamically allocated register (see below)

Note that you can't directly refer to the byte registers (ah, al, etc.) or the word registers (ax, bx, etc.) when you're loading this way. Once you've got it in there, though, you can specify ax or whatever all you like.

The codes have to be in quotes, and the expressions to load in have to be in parentheses.

When you do the clobber list, you specify the registers as above with the %. If you write to a variable, you must include "memory" as one of The Clobbered. This is in case you wrote to a variable that GCC thought it had in a register. This is the same as clobbering all registers. While I've never run into a problem with it, you might also want to add "cc" as a clobber if you change the condition codes (the bits in the flags register the jnz, je, etc. operators look at.)
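As a small illustration of the "memory" and "cc" clobbers, the sketch below increments an int through a pointer. The function name `increment_through_pointer` is hypothetical, and the plain C fallback is only there so the example also builds on other architectures.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: add 1 to the int that p points to. */
void increment_through_pointer(int *p)
{
    assert(p != NULL);
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    __asm__ __volatile__ (
        "incl (%0)"        /* read-modify-write the int at address p */
        :                  /* no output registers */
        : "r" (p)          /* p in any general register */
        : "memory", "cc"); /* we wrote to memory and changed the flags */
#else
    ++*p;                  /* portable fallback */
#endif
}
```

Without "memory" in the clobber list, the compiler would be free to assume that the variable behind p still holds its old value after the statement.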

Now, that's all fine and good for loading specific registers. But what if you specify, say, ebx, and ecx, and GCC can't arrange for the values to be in those registers without having to stash the previous values? It's possible to let GCC pick the register(s). You do this:

asm ("leal (%1,%1,4), %0"
     : "=r" (x)
     : "0" (x) );

The above example multiplies x by 5 really quickly (1 cycle on the Pentium). Now, we could have specified, say, eax. But unless we really need a specific register (like when using rep movsl or rep stosl, which are hardcoded to use ecx, edi, and esi), why not let GCC pick an available one? So when GCC generates the output code for GAS, %0 will be replaced by the register it picked.

And where did "q" and "r" come from? Well, "q" causes GCC to allocate from eax, ebx, ecx, and edx. "r" lets GCC also consider esi and edi. So make sure, if you use "r", that it would be possible to use esi or edi in that instruction. If not, use "q".

Now, you might wonder how the %n tokens get allocated to the arguments. It's a straightforward first-come-first-served, left-to-right thing, mapping to the "q"'s and "r"'s. But if you want to reuse a register allocated with a "q" or "r", you use "0", "1", "2"... etc.

You don't need to put a GCC-allocated register on the clobberlist, as GCC knows that you're messing with it.
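A minimal sketch of the compiler-chosen-register idea as a complete function. The name `times5` and the unsigned types are assumptions made for this example (unsigned arithmetic wraps the same way the lea does), and the non-x86 branch is just a fallback so the code builds elsewhere.

```c
#include <assert.h>

/* Hypothetical helper: multiply x by 5 with a single lea, letting
 * the compiler choose both registers via the "r" constraint. */
unsigned times5(unsigned x)
{
    unsigned result;
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    __asm__ ("leal (%1,%1,4), %0"   /* %0 = %1 + %1*4 */
             : "=r" (result)        /* %0: any general register */
             : "r" (x));            /* %1: any general register */
#else
    result = x * 5u;                /* portable fallback */
#endif
    /* Sanity check against plain C; unsigned math wraps the same way. */
    assert(result == x * 5u);
    return result;
}
```

Because no register is named, nothing needs to go on the clobber list here, and there is no volatile, so the compiler may merge or drop repeated calls with the same argument.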

Now for output registers.

asm ("leal (%1,%1,4), %0"
     : "=r" (x_times_5)
     : "r" (x) );

Note the use of = to specify an output register. You just have to do it that way. If you want 1 variable to stay in 1 register for both in and out, you have to respecify the register allocated to it on the way in with the "0" type codes as mentioned above.

asm ("leal (%0,%0,4), %0"
     : "=r" (x)
     : "0" (x) );

This also works, by the way:

asm ("leal (%%ebx,%%ebx,4), %%ebx"
     : "=b" (x)
     : "b" (x) );

2 things here:

Note that we don't have to put ebx on the clobberlist; GCC knows it goes into x. Therefore, since it can know the value of ebx, it isn't considered clobbered.
Notice that in extended asm, you must prefix registers with %% instead of just %. Why, you ask? Because as GCC parses along for %0's and %1's and so on, it would interpret %edx as a %e parameter, see that that's non-existent, and ignore it. Then it would bitch about finding a symbol named dx, which isn't valid because it's not prefixed with % and it's not the one you meant anyway.

Important note: If your assembly statement must execute where you put it (i.e. must not be moved out of a loop as an optimization), put the keyword volatile after asm and before the ()'s. To be ultra-careful, use
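To make the %% rule concrete, here is a hedged sketch of the explicit-register form as a whole function. I use eax where the example above uses ebx, only because some toolchains reserve ebx in 32-bit position-independent code; that substitution, the name `times5_in_eax`, and the fallback branch are my own choices.

```c
#include <assert.h>

/* Hypothetical helper: multiply x by 5 in a register named explicitly.
 * eax stands in for the ebx of the example above. */
unsigned times5_in_eax(unsigned x)
{
    unsigned result;
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    /* Operands are present, so each literal register needs "%%". */
    __asm__ ("leal (%%eax,%%eax,4), %%eax"
             : "=a" (result)   /* result comes back in eax */
             : "a" (x));       /* x is loaded into eax first */
#else
    result = x * 5u;           /* portable fallback */
#endif
    assert(result == x * 5u);  /* cross-check against plain C */
    return result;
}
```

Writing a single % before eax in this statement would not assemble as intended, since the compiler treats a lone % as the start of an operand reference.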

__asm__ __volatile__ (...whatever...);

However, I would like to point out that if your assembly's only purpose is to calculate the output registers, with no other side effects, you should leave off the volatile keyword so your statement will be processed into GCC's common subexpression elimination optimization.

Some useful examples

#define disable() __asm__ __volatile__ ("cli");
#define enable() __asm__ __volatile__ ("sti");

Of course, libc has these defined too.

#define times3(arg1, arg2) \
__asm__ ( \
  "leal (%0,%0,2),%0" \
  : "=r" (arg2) \
  : "0" (arg1) );

#define times5(arg1, arg2) \
__asm__ ( \
  "leal (%0,%0,4),%0" \
  : "=r" (arg2) \
  : "0" (arg1) );

#define times9(arg1, arg2) \
__asm__ ( \
  "leal (%0,%0,8),%0" \
  : "=r" (arg2) \
  : "0" (arg1) );

These multiply arg1 by 3, 5, or 9 and put the result in arg2. You should be ok to do:

times5(x,x);

as well.

#define rep_movsl(src, dest, numwords) \
__asm__ __volatile__ ( \
  "cld\n\t" \
  "rep\n\t" \
  "movsl" \
  : : "S" (src), "D" (dest), "c" (numwords) \
  : "%ecx", "%esi", "%edi" )

Helpful Hint: If you say memcpy() with a constant length parameter, GCC will inline it to a rep movsl like above. But if you need a variable length version that inlines and you're always moving dwords, there ya go.

#define rep_stosl(value, dest, numwords) \
__asm__ __volatile__ ( \
  "cld\n\t" \
  "rep\n\t" \
  "stosl" \
  : : "a" (value), "D" (dest), "c" (numwords) \
  : "%ecx", "%edi" )

Same as above but for memset(), which doesn't get inlined no matter what (for now.)

The End

"The End"?! Yah, I guess so.

If you're wondering, I personally am a big fan of AT&T/UNIX syntax now. (It might have helped that I cut my teeth on SPARC assembly. Of course, that machine actually had a decent number of general registers.) It might seem weird to you at first, but it's really more logical than Intel format, and has no ambiguities.

If I still haven't answered a question of yours, look in the Info pages for more information, particularly on the input/output registers. You can do some funky stuff like use "A" to allocate two registers at once for 64-bit math or "m" for static memory locations, and a bunch more that aren't really used as much as "q" and "r".

Alternately, mail me, and I'll see what I can do. (If you find any errors in the above, please, e-mail me and tell me about it! It's frustrating enough to learn without buggy docs!) Or heck, mail me to say "boogabooga."

It's the least you can do.

0 0