X86 instruction listings
The x86 instruction set has been extended several times, introducing wider registers and datatypes and/or new functionality.
SSE2
Added with Pentium 4
Also see integer instructions added with Pentium 4
ADDPD, ADDSD, ANDNPD, ANDPD, CMPPD, CMPSD*, COMISD, CVTDQ2PD, CVTDQ2PS, CVTPD2DQ, CVTPD2PI, CVTPD2PS, CVTPI2PD, CVTPS2DQ, CVTPS2PD, CVTSD2SI, CVTSD2SS, CVTSI2SD, CVTSS2SD, CVTTPD2DQ, CVTTPD2PI, CVTTPS2DQ, CVTTSD2SI, DIVPD, DIVSD, MAXPD, MAXSD, MINPD, MINSD, MOVAPD, MOVHPD, MOVLPD, MOVMSKPD, MOVSD*, MOVUPD, MULPD, MULSD, ORPD, SHUFPD, SQRTPD, SQRTSD, SUBPD, SUBSD, UCOMISD, UNPCKHPD, UNPCKLPD, XORPD
* CMPSD and MOVSD have the same names as the string instruction mnemonics CMPSD (CMPS) and MOVSD (MOVS); however, the former refer to scalar double-precision floating-point values, whereas the latter refer to doubleword strings.
SSE3
Added with Pentium 4 supporting SSE3
Also see integer and floating-point instructions added with Pentium 4 SSE3
SSSE3
Added with Xeon 5100 series and initial Core 2 processors
Intel AVX
which are implemented on the chips but not listed in some official documents. They can be found in various sources across the Internet, such as Ralf Brown's Interrupt List and at http://sandpile.org.
x86 integer instructions
This is the full 8086/8088 instruction set. Most, if not all, of these instructions are also available in 32-bit mode, where they operate on 32-bit registers (eax, ebx, etc.) and values instead of their 16-bit (ax, bx, etc.) counterparts. See also x86 assembly language for a quick tutorial for this processor family.
The updated instruction set is also grouped according to architecture (i386, i486, i686) and more generally is referred to as x86_32 and x86_64 (also known as AMD64).
Original 8086/8088 instructions
Instruction | Meaning | Notes |
---|---|---|
AAA | ASCII adjust AL after addition | used with unpacked binary coded decimal |
AAD | ASCII adjust AX before division | The 8086/8088 datasheet documents only the base 10 version of the AAD instruction (opcode 0xD5 0x0A), but any other base will work. Later Intel documentation describes the generic form as well. NEC V20 and V30 (and possibly other NEC V-series CPUs) always use base 10 and ignore the argument, causing a number of incompatibilities |
AAM | ASCII adjust AX after multiplication | Only base 10 version is documented, see notes for AAD |
AAS | ASCII adjust AL after subtraction | |
ADC | Add with carry | destination := destination + source + carry flag |
ADD | Add | |
AND | Logical AND | |
CALL | Call procedure | |
CBW | Convert byte to word | |
CLC | Clear carry flag | |
CLD | Clear direction flag | |
CLI | Clear interrupt flag | |
CMC | Complement carry flag | |
CMP | Compare operands | |
CMPSB | Compare bytes in memory | |
CMPSW | Compare words | |
CWD | Convert word to doubleword | |
DAA | Decimal adjust AL after addition | (used with packed binary coded decimal) |
DAS | Decimal adjust AL after subtraction | |
DEC | Decrement by 1 | |
DIV | Unsigned divide | |
ESC | Used with floating-point unit | |
HLT | Enter halt state | |
IDIV | Signed divide | |
IMUL | Signed multiply | |
IN | Input from port | |
INC | Increment by 1 | |
INT | Call to interrupt | |
INTO | Call to interrupt if overflow | |
IRET | Return from interrupt | |
Jxx | Jump if condition | (JA, JAE, JB, JBE, JC, JCXZ, JE, JG, JGE, JL, JLE, JNA, JNAE, JNB, JNBE, JNC, JNE, JNG, JNGE, JNL, JNLE, JNO, JNP, JNS, JNZ, JO, JP, JPE, JPO, JS, JZ) |
JMP | Jump | |
LAHF | Load flags into AH register | |
LDS | Load pointer using DS | |
LEA | Load Effective Address | |
LES | Load ES with pointer | |
LOCK | Assert BUS LOCK# signal | (for multiprocessing) |
LODSB | Load string byte | |
LODSW | Load string word | |
LOOP/LOOPx | Loop control | (LOOPE, LOOPNE, LOOPNZ, LOOPZ) |
MOV | Move | |
MOVSB | Move byte from string to string | |
MOVSW | Move word from string to string | |
MUL | Unsigned multiply | |
NEG | Two's complement negation | |
NOP | No operation | opcode 0x90, equivalent to XCHG AX, AX (XCHG EAX, EAX in 32-bit mode) |
NOT | Negate the operand, logical NOT | |
OR | Logical OR | |
OUT | Output to port | |
POP | Pop data from stack | POP CS (opcode 0x0F) works only on 8086/8088. Later CPUs use 0x0F as a prefix for newer instructions. |
POPF | Pop data into flags register | |
PUSH | Push data onto stack | |
PUSHF | Push flags onto stack | |
RCL | Rotate left (with carry) | |
RCR | Rotate right (with carry) | |
REPxx | Repeat MOVS/STOS/CMPS/LODS/SCAS | (REP, REPE, REPNE, REPNZ, REPZ) |
RET | Return from procedure | |
RETN | Return from near procedure | |
RETF | Return from far procedure | |
ROL | Rotate left | |
ROR | Rotate right | |
SAHF | Store AH into flags | |
SAL | Shift Arithmetically left (signed shift left) | |
SAR | Shift Arithmetically right (signed shift right) | |
SBB | Subtraction with borrow | |
SCASB | Compare byte string | |
SCASW | Compare word string | |
SHL | Shift left (unsigned shift left) | |
SHR | Shift right (unsigned shift right) | |
STC | Set carry flag | |
STD | Set direction flag | |
STI | Set interrupt flag | |
STOSB | Store byte in string | |
STOSW | Store word in string | |
SUB | Subtraction | |
TEST | Logical compare (AND) | |
WAIT | Wait until not busy | Waits until BUSY# pin is inactive (used with floating-point unit) |
XCHG | Exchange data | |
XLAT | Table look-up translation | |
XOR | Exclusive OR |
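The notes on AAD and AAM mention that the immediate byte acts as a numeric base, with 10 being the only documented value on the 8086/8088. The following Python sketch models that behaviour on a 16-bit AX value. The function names `aam` and `aad` and the choice to pass AX as a plain integer are assumptions made for illustration, and flag updates are deliberately left out.

```python
def aam(ax: int, base: int = 10) -> int:
    """Model AAM: AH = AL // base, AL = AL % base. Returns the new AX."""
    if base == 0:
        # Real hardware raises a divide error for a zero immediate.
        raise ZeroDivisionError("AAM with base 0")
    al = ax & 0xFF
    return ((al // base) << 8) | (al % base)


def aad(ax: int, base: int = 10) -> int:
    """Model AAD: AL = (AL + AH * base) mod 256, AH = 0. Returns the new AX."""
    ah = (ax >> 8) & 0xFF
    al = ax & 0xFF
    return (al + ah * base) & 0xFF
```

With the default base, `aam(0x002F)` splits the binary value 47 into the unpacked digits 4 and 7, and `aad` reverses the step. Passing `base=16` shows the generic form that a CPU which ignores the operand, as the NEC V20 and V30 are described above, would not reproduce.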
Added with 80186/80188
Instruction | Meaning | Notes |
---|---|---|
BOUND | Check array index against bounds | raises software interrupt 5 if test fails |
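ENTER | Enter stack frame | with nesting level 0, equivalent to PUSH BP; MOV BP, SP; SUB SP, n |
INS | Input from port to string | equivalent to IN AL/AX, DX; MOV ES:[DI], AL/AX; then DI is adjusted by the operand size according to DF |
LEAVE | Leave stack frame | equivalent to MOV SP, BP; POP BP |
OUTS | Output string to port | equivalent to MOV AL/AX, DS:[SI]; OUT DX, AL/AX; then SI is adjusted by the operand size according to DF |
POPA | Pop all general purpose registers from stack | equivalent to POP DI, SI, BP, SP, BX, DX, CX, AX (the value popped for SP is discarded) |
PUSHA | Push all general purpose registers onto stack | equivalent to PUSH AX, CX, DX, BX, SP, BP, SI, DI (the SP value pushed is the one from before PUSHA) |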
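The register order used by PUSHA and POPA is easier to follow with a small model. The sketch below is a simplified illustration in Python, not an emulator: the register file is a dictionary, the stack is a dictionary keyed by the SP value, and the names `REG_ORDER`, `pusha` and `popa` are hypothetical. It assumes the pushed SP slot holds the SP value from before the instruction and that POPA skips that slot instead of loading it.

```python
# Order in which PUSHA stores the 16-bit general purpose registers.
REG_ORDER = ["AX", "CX", "DX", "BX", "SP", "BP", "SI", "DI"]


def pusha(regs: dict, memory: dict) -> None:
    """Push all eight registers; the SP slot receives the original SP."""
    original_sp = regs["SP"]
    for name in REG_ORDER:
        value = original_sp if name == "SP" else regs[name]
        regs["SP"] = (regs["SP"] - 2) & 0xFFFF
        memory[regs["SP"]] = value


def popa(regs: dict, memory: dict) -> None:
    """Pop in reverse order; the stored SP value is read past, not loaded."""
    for name in reversed(REG_ORDER):
        value = memory[regs["SP"]]
        regs["SP"] = (regs["SP"] + 2) & 0xFFFF
        if name != "SP":
            regs[name] = value
```

Calling `pusha` followed by `popa` restores every register, and SP returns to its starting value purely through the eight 2-byte increments.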
Added with 80286
Instruction | Meaning | Notes |
---|---|---|
ARPL | Adjust RPL field of selector | |
CLTS | Clear task-switched flag in register CR0 | |
LAR | Load access rights byte | |
LGDT | Load global descriptor table | |
LIDT | Load interrupt descriptor table | |
LLDT | Load local descriptor table | |
LMSW | Load machine status word | |
LOADALL | Load all CPU registers, including internal ones such as GDT | Undocumented, (80)286 and 386 only |
LSL | Load segment limit | |
LTR | Load task register | |
SGDT | Store global descriptor table | |
SIDT | Store interrupt descriptor table | |
SLDT | Store local descriptor table | |
SMSW | Store machine status word | |
STR | Store task register | |
VERR | Verify a segment for reading | |
VERW | Verify a segment for writing |
Added with 80386
Instruction | Meaning | Notes |
---|---|---|
BSF | Bit scan forward | |
BSR | Bit scan reverse | |
BT | Bit test | |
BTC | Bit test and complement | |
BTR | Bit test and reset | |
BTS | Bit test and set | |
CDQ | Convert double-word to quad-word | Sign-extends EAX into EDX, forming the quad-word EDX:EAX. Since (I)DIV uses EDX:EAX as its dividend, CDQ must be executed after setting EAX and before (I)DIV unless EDX is initialized manually (as in 64/32 division). |
CMPSD | Compare string double-word | Compares ES:[(E)DI] with DS:[(E)SI] |
CWDE | Convert word to double-word | Unlike CWD, CWDE sign-extends AX to EAX instead of AX to DX:AX |
INSB, INSW, INSD | Input from port to string with explicit size | same as INS |
IRETx | Interrupt return; D suffix means 32-bit return, F suffix means do not generate epilogue code (i.e. LEAVE instruction) | Use IRETD rather than IRET in 32-bit situations |
JCXZ, JECXZ | Jump if register (E)CX is zero | |
LFS, LGS | Load far pointer | |
LSS | Load stack segment | |
LODSD | Load string | can be prefixed with REP |
LOOPW, LOOPD | Loop | Loop; counter register is (E)CX |
LOOPEW, LOOPED | Loop while equal | |
LOOPZW, LOOPZD | Loop while zero | |
LOOPNEW, LOOPNED | Loop while not equal | |
LOOPNZW, LOOPNZD | Loop while not zero | |
MOVSW, MOVSD | Move data from string to string | |
MOVSX | Move with sign-extend | |
MOVZX | Move with zero-extend | |
POPAD | Pop all double-word (32-bit) registers from stack | Does not pop register ESP off of stack |
POPFD | Pop data into EFLAGS register | |
PUSHAD | Push all double-word (32-bit) registers onto stack | |
PUSHFD | Push EFLAGS register onto stack | |
SCASD | Scan string data double-word | |
SETx | Set byte to one on condition | (SETA, SETAE, SETB, SETBE, SETC, SETE, SETG, SETGE, SETL, SETLE, SETNA, SETNAE, SETNB, SETNBE, SETNC, SETNE, SETNG, SETNGE, SETNL, SETNLE, SETNO, SETNP, SETNS, SETNZ, SETO, SETP, SETPE, SETPO, SETS, SETZ) |
SHLD | Shift left double-word | |
SHRD | Shift right double-word | |
STOSx | Store string |
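The CDQ note above explains why EDX has to be prepared before a signed 32-bit division. A minimal Python sketch of that interaction follows. The helper names `to_signed`, `cdq` and `idiv32` are assumptions for illustration, registers are modelled as unsigned 32-bit integers, and the divide-error exception is represented by ordinary Python exceptions.

```python
def to_signed(value: int, bits: int) -> int:
    """Interpret the low `bits` bits of value as a two's complement number."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def cdq(eax: int) -> int:
    """Model CDQ: return the EDX value that sign-extends EAX."""
    return 0xFFFFFFFF if eax & 0x80000000 else 0x00000000


def idiv32(edx: int, eax: int, divisor: int) -> tuple:
    """Model IDIV r/m32: divide EDX:EAX, return (quotient, remainder)."""
    dividend = to_signed((edx << 32) | (eax & 0xFFFFFFFF), 64)
    d = to_signed(divisor, 32)
    if d == 0:
        raise ZeroDivisionError("divide error")
    # x86 division truncates toward zero, unlike Python's floor division.
    quotient = abs(dividend) // abs(d)
    if (dividend < 0) != (d < 0):
        quotient = -quotient
    remainder = dividend - quotient * d
    if not -(1 << 31) <= quotient < (1 << 31):
        raise OverflowError("quotient does not fit in EAX")
    return quotient & 0xFFFFFFFF, remainder & 0xFFFFFFFF
```

Dividing -7 by 2 with `edx = cdq(eax)` gives the quotient -3 and remainder -1 in two's complement form. Leaving EDX at zero makes the same EAX bit pattern look like a large positive dividend, so the division silently returns an unrelated result.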
Added with 80486
Instruction | Meaning | Notes |
---|---|---|
BSWAP | Byte Swap | Only works for 32 bit registers. |
CMPXCHG | CoMPare and eXCHanGe | |
INVD | Invalidate Internal Caches | |
INVLPG | Invalidate TLB Entry | |
WBINVD | Write Back and Invalidate Cache | |
XADD | Exchange and Add |
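The three data-manipulating 80486 additions can be summarised by their effect on values. The Python sketch below is illustrative only: memory is a plain dictionary, the function names are hypothetical, and atomicity (which on real hardware needs the LOCK prefix for CMPXCHG and XADD on memory operands) is not modelled.

```python
def bswap32(value: int) -> int:
    """Model BSWAP: reverse the byte order of a 32-bit value."""
    return int.from_bytes((value & 0xFFFFFFFF).to_bytes(4, "little"), "big")


def cmpxchg(memory: dict, addr: int, eax: int, src: int) -> tuple:
    """Model CMPXCHG: if memory equals EAX, store src; else load EAX.

    Returns (zf, new_eax), where zf is True when the exchange happened.
    """
    if memory[addr] == eax:
        memory[addr] = src
        return True, eax
    return False, memory[addr]


def xadd(memory: dict, addr: int, reg: int) -> int:
    """Model XADD: add reg to memory and return the old memory value."""
    old = memory[addr]
    memory[addr] = (old + reg) & 0xFFFFFFFF
    return old
```

A typical use of the `cmpxchg` pattern is a retry loop: read the current value, compute the update, and repeat if the comparison fails because another writer changed the location in between.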
Added with Pentium
Instruction | Meaning | Notes |
---|---|---|
CPUID | CPU IDentification | This was also added to later 80486 processors. |
CMPXCHG8B | CoMPare and eXCHanGe 8 bytes | |
RDMSR | ReaD from Model-Specific Register | |
RDTSC | ReaD Time Stamp Counter | |
WRMSR | WRite to Model-Specific Register | |
RSM | Resume from System Management Mode (SMM) | Introduced with the i386SL and also present in the i486SL and later processors (see http://www.softeng.rl.ac.uk/st/archive/SoftEng/SESP/html/SoftwareTools/vtune/users_guide/mergedProjects/analyzer_ec/mergedProjects/reference_olh/mergedProjects/instructions/instruct32_hh/vc279.htm) |
Added with Pentium MMX
Instruction | Meaning | Notes |
---|---|---|
RDPMC | Read the PMC [Performance Monitoring Counter] | Reads the counter specified in the ECX register into registers EDX:EAX |
MMX registers and MMX support instructions were also added. The registers are used for integer operations (MMX) and, with later extensions such as 3DNow!, for floating-point operations; see below.
Added with Pentium Pro
Conditional MOV: CMOVA, CMOVAE, CMOVB, CMOVBE, CMOVC, CMOVE, CMOVG, CMOVGE, CMOVL, CMOVLE, CMOVNA, CMOVNAE, CMOVNB, CMOVNBE, CMOVNC, CMOVNE, CMOVNG, CMOVNGE, CMOVNL, CMOVNLE, CMOVNO, CMOVNP, CMOVNS, CMOVNZ, CMOVO, CMOVP, CMOVPE, CMOVPO, CMOVS, CMOVZ
SYSENTER (SYStem call ENTER), SYSEXIT (SYStem call EXIT), UD2
Added with SSE
MASKMOVQ, MOVNTPS, MOVNTQ, PREFETCHT0, PREFETCHT1, PREFETCHT2, PREFETCHNTA, SFENCE (for cacheability and memory ordering)
Added with SSE2
CLFLUSH, LFENCE, MASKMOVDQU, MFENCE, MOVNTDQ, MOVNTI, MOVNTPD, PAUSE (for cacheability)
Added with x86-64
CMPXCHG16B (CoMPare and eXCHanGe 16 Bytes), RDTSCP (ReaD Time Stamp Counter and Processor ID)
Added with SSE3
LDDQU (for video encoding)
MONITOR, MWAIT (for thread synchronization; only on processors supporting Hyper-threading and some dual-core processors like Core 2, Phenom and others)
Added with AMD-V
CLGI, SKINIT, STGI, VMLOAD, VMMCALL, VMRUN, VMSAVE (SVM instructions of AMD-V)
Added with Intel VT-x
VMPTRLD, VMPTRST, VMCLEAR, VMREAD, VMWRITE, VMCALL, VMLAUNCH, VMRESUME, VMXOFF, VMXON
Original 8087 instructions
Instruction | Meaning | Notes |
---|---|---|
F2XM1 | 2^x - 1 | more precise than 2^x for x close to zero |
FABS | Absolute value | |
FADD | Add | |
FADDP | Add and pop | |
FBLD | Load BCD | |
FBSTP | Store BCD and pop | |
FCHS | Change sign | |
FCLEX | Clear exceptions | |
FCOM | Compare | |
FCOMP | Compare and pop | |
FCOMPP | Compare and pop twice | |
FDECSTP | Decrement floating point stack pointer | |
FDISI | Disable interrupts | 8087 only, otherwise FNOP |
FDIV | Divide | Pentium FDIV bug |
FDIVP | Divide and pop | |
FDIVR | Divide reversed | |
FDIVRP | Divide reversed and pop | |
FENI | Enable interrupts | 8087 only, otherwise FNOP |
FFREE | Free register | |
FIADD | Integer add | |
FICOM | Integer compare | |
FICOMP | Integer compare and pop | |
FIDIV | Integer divide | |
FIDIVR | Integer divide reversed | |
FILD | Load integer | |
FIMUL | Integer multiply | |
FINCSTP | Increment floating point stack pointer | |
FINIT | Initialize floating point processor | |
FIST | Store integer | |
FISTP | Store integer and pop | |
FISUB | Integer subtract | |
FISUBR | Integer subtract reversed | |
FLD | Floating point load | |
FLD1 | Load 1.0 onto stack | |
FLDCW | Load control word | |
FLDENV | Load environment state | |
FLDENVW | Load environment state, 16-bit | |
FLDL2E | Load log2(e) onto stack | |
FLDL2T | Load log2(10) onto stack | |
FLDLG2 | Load log10(2) onto stack | |
FLDLN2 | Load ln(2) onto stack | |
FLDPI | Load π onto stack | |
FLDZ | Load 0.0 onto stack | |
FMUL | Multiply | |
FMULP | Multiply and pop | |
FNCLEX | Clear exceptions, no wait | |
FNDISI | Disable interrupts, no wait | 8087 only, otherwise FNOP |
FNENI | Enable interrupts, no wait | 8087 only, otherwise FNOP |
FNINIT | Initialize floating point processor, no wait | |
FNOP | No operation | |
FNSAVE | Save FPU state, no wait, 8-bit | |
FNSAVEW | Save FPU state, no wait, 16-bit | |
FNSTCW | Store control word, no wait | |
FNSTENV | Store FPU environment, no wait | |
FNSTENVW | Store FPU environment, no wait, 16-bit | |
FNSTSW | Store status word, no wait | |
FPATAN | Partial arctangent | |
FPREM | Partial remainder | |
FPTAN | Partial tangent | |
FRNDINT | Round to integer | |
FRSTOR | Restore saved state | |
FRSTORW | Restore saved state | Perhaps not actually available in 8087 |
FSAVE | Save FPU state | |
FSAVEW | Save FPU state, 16-bit | |
FSCALE | Scale by factor of 2 | |
FSQRT | Square root | |
FST | Floating point store | |
FSTCW | Store control word | |
FSTENV | Store FPU environment | |
FSTENVW | Store FPU environment, 16-bit | |
FSTP | Store and pop | |
FSTSW | Store status word | |
FSUB | Subtract | |
FSUBP | Subtract and pop | |
FSUBR | Reverse subtract | |
FSUBRP | Reverse subtract and pop | |
FTST | Test for zero | |
FWAIT | Wait while FPU is executing | |
FXAM | Examine condition flags | |
FXCH | Exchange registers | |
FXTRACT | Extract exponent and significand | |
FYL2X | y * log2(x) | if y = log_b(2), then the base-b logarithm is computed |
FYL2XP1 | y * log2(x+1) | more precise than computing log2(x+1) directly if x is close to zero |
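The notes on F2XM1, FYL2X and FYL2XP1 describe numerical properties that can be checked with ordinary double-precision arithmetic. The Python sketch below mirrors the mathematical definitions using the standard `math` module. It is an analogy under stated assumptions, not a model of x87 extended precision or of the operand ranges the hardware accepts, and the function names are chosen here for illustration.

```python
import math


def f2xm1(x: float) -> float:
    """2^x - 1 computed without cancellation, via expm1(x * ln 2)."""
    return math.expm1(x * math.log(2.0))


def fyl2x(y: float, x: float) -> float:
    """y * log2(x); with y = log_b(2) this yields the base-b logarithm."""
    return y * math.log2(x)


def fyl2xp1(y: float, x: float) -> float:
    """y * log2(x + 1) computed via log1p so tiny x is not rounded away."""
    return y * math.log1p(x) / math.log(2.0)
```

For a tiny argument such as `1e-20`, the naive expression `2.0**x - 1.0` evaluates to zero because `2.0**x` rounds to exactly 1.0, whereas `f2xm1` keeps the small result. Likewise, `fyl2x(math.log10(2.0), 1000.0)` computes a base-10 logarithm, which is the pairing suggested by the FLDLG2 constant in the table.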
Added with 80387
FCOS, FLDENVD, FNSAVED, FNSTENVD, FPREM1, FRSTORD, FSAVED, FSIN, FSINCOS, FSTENVD, FUCOM, FUCOMP, FUCOMPP
Added with Pentium Pro
- FCMOV variants: FCMOVB, FCMOVBE, FCMOVE, FCMOVNB, FCMOVNBE, FCMOVNE, FCMOVNU, FCMOVU
- FCOMI variants: FCOMI, FCOMIP, FUCOMI, FUCOMIP
Added with SSE
FXRSTOR, FXSAVE
These are also supported on later Pentium IIs which do not contain SSE support
Added with SSE3
FISTTP (x87 to integer conversion with truncation regardless of status word)
Added with Pentium MMX
EMMS, MOVD, MOVQ, PACKSSDW, PACKSSWB, PACKUSWB, PADDB, PADDD, PADDSB, PADDSW, PADDUSB, PADDUSW, PADDW, PAND, PANDN, PCMPEQB, PCMPEQD, PCMPEQW, PCMPGTB, PCMPGTD, PCMPGTW, PMADDWD, PMULHW, PMULLW, POR, PSLLD, PSLLQ, PSLLW, PSRAD, PSRAW, PSRLD, PSRLQ, PSRLW, PSUBB, PSUBD, PSUBSB, PSUBSW, PSUBUSB, PSUBUSW, PSUBW, PUNPCKHBW, PUNPCKHDQ, PUNPCKHWD, PUNPCKLBW, PUNPCKLDQ, PUNPCKLWD, PXOR
Added with Athlon
Same as the SSE SIMD integer instructions which operated on MMX registers.
EMMI instructions - added with 6x86MX from Cyrix, deprecated now
PAVEB, PADDSIW, PMAGW, PDISTIB, PSUBSIW, PMVZB, PMULHRW, PMVNZB, PMVLZB, PMVGEZB, PMULHRIW, PMACHRIW
Added with K6-2
FEMMS, PAVGUSB, PF2ID, PFACC, PFADD, PFCMPEQ, PFCMPGE, PFCMPGT, PFMAX, PFMIN, PFMUL, PFRCP, PFRCPIT1, PFRCPIT2, PFRSQIT1, PFRSQRT, PFSUB, PFSUBR, PI2FD, PMULHRW, PREFETCH, PREFETCHW
SSE SIMD floating-point instructions
ADDPS, ADDSS, CMPPS, CMPSS, COMISS, CVTPI2PS, CVTPS2PI, CVTSI2SS, CVTSS2SI, CVTTPS2PI, CVTTSS2SI, DIVPS, DIVSS, LDMXCSR, MAXPS, MAXSS, MINPS, MINSS, MOVAPS, MOVHLPS, MOVHPS, MOVLHPS, MOVLPS, MOVMSKPS, MOVNTPS, MOVSS, MOVUPS, MULPS, MULSS, RCPPS, RCPSS, RSQRTPS, RSQRTSS, SHUFPS, SQRTPS, SQRTSS, STMXCSR, SUBPS, SUBSS, UCOMISS, UNPCKHPS, UNPCKLPS
SSE SIMD integer instructions
ANDNPS, ANDPS, ORPS, PAVGB, PAVGW, PEXTRW, PINSRW, PMAXSW, PMAXUB, PMINSW, PMINUB, PMOVMSKB, PMULHUW, PSADBW, PSHUFW, XORPS
Instruction | Opcode | Meaning | Notes |
---|---|---|---|
MOVUPS xmm1, xmm2/m128 | 0F 10 /r | Move Unaligned Packed Single-Precision Floating-Point Values | |
MOVSS xmm1, xmm2/m32 | F3 0F 10 /r | Move Scalar Single-Precision Floating-Point Values | |
MOVUPS xmm2/m128, xmm1 | 0F 11 /r | Move Unaligned Packed Single-Precision Floating-Point Values | |
MOVSS xmm2/m32, xmm1 | F3 0F 11 /r | Move Scalar Single-Precision Floating-Point Values | |
MOVLPS xmm, m64 | 0F 12 /r | Move Low Packed Single-Precision Floating-Point Values | |
MOVHLPS xmm1, xmm2 | 0F 12 /r | Move Packed Single-Precision Floating-Point Values High to Low | |
MOVLPS m64, xmm | 0F 13 /r | Move Low Packed Single-Precision Floating-Point Values | |
UNPCKLPS xmm1, xmm2/m128 | 0F 14 /r | Unpack and Interleave Low Packed Single-Precision Floating-Point Values | |
UNPCKHPS xmm1, xmm2/m128 | 0F 15 /r | Unpack and Interleave High Packed Single-Precision Floating-Point Values | |
MOVHPS xmm, m64 | 0F 16 /r | Move High Packed Single-Precision Floating-Point Values | |
MOVLHPS xmm1, xmm2 | 0F 16 /r | Move Packed Single-Precision Floating-Point Values Low to High | |
MOVHPS m64, xmm | 0F 17 /r | Move High Packed Single-Precision Floating-Point Values | |
PREFETCHNTA | 0F 18 /0 | Prefetch Data Into Caches (non-temporal data with respect to all cache levels) | |
PREFETCHT0 | 0F 18 /1 | Prefetch Data Into Caches (temporal data) | |
PREFETCHT1 | 0F 18 /2 | Prefetch Data Into Caches (temporal data with respect to first level cache) | |
PREFETCHT2 | 0F 18 /3 | Prefetch Data Into Caches (temporal data with respect to second level cache) | |
NOP | 0F 1F /0 | No Operation | |
MOVAPS xmm1, xmm2/m128 | 0F 28 /r | Move Aligned Packed Single-Precision Floating-Point Values | |
MOVAPS xmm2/m128, xmm1 | 0F 29 /r | Move Aligned Packed Single-Precision Floating-Point Values | |
CVTPI2PS xmm, mm/m64 | 0F 2A /r | Convert Packed Dword Integers to Packed Single-Precision FP Values | |
CVTSI2SS xmm, r/m32 | F3 0F 2A /r | Convert Dword Integer to Scalar Single-Precision FP Value | |
MOVNTPS m128, xmm | 0F 2B /r | Store Packed Single-Precision Floating-Point Values Using Non-Temporal Hint | |
CVTTPS2PI mm, xmm/m64 | 0F 2C /r | Convert with Truncation Packed Single-Precision FP Values to Packed Dword Integers | |
CVTTSS2SI r32, xmm/m32 | F3 0F 2C /r | Convert with Truncation Scalar Single-Precision FP Value to Dword Integer | |
CVTPS2PI mm, xmm/m64 | 0F 2D /r | Convert Packed Single-Precision FP Values to Packed Dword Integers | |
CVTSS2SI r32, xmm/m32 | F3 0F 2D /r | Convert Scalar Single-Precision FP Value to Dword Integer | |
UCOMISS xmm1, xmm2/m32 | 0F 2E /r | Unordered Compare Scalar Single-Precision Floating-Point Values and Set EFLAGS | |
COMISS xmm1, xmm2/m32 | 0F 2F /r | Compare Scalar Ordered Single-Precision Floating-Point Values and Set EFLAGS | |
SQRTPS xmm1, xmm2/m128 | 0F 51 /r | Compute Square Roots of Packed Single-Precision Floating-Point Values | |
SQRTSS xmm1, xmm2/m32 | F3 0F 51 /r | Compute Square Root of Scalar Single-Precision Floating-Point Value | |
RSQRTPS xmm1, xmm2/m128 | 0F 52 /r | Compute Reciprocal of Square Root of Packed Single-Precision Floating-Point Value | |
RSQRTSS xmm1, xmm2/m32 | F3 0F 52 /r | Compute Reciprocal of Square Root of Scalar Single-Precision Floating-Point Value | |
RCPPS xmm1, xmm2/m128 | 0F 53 /r | Compute Reciprocal of Packed Single-Precision Floating-Point Values | |
RCPSS xmm1, xmm2/m32 | F3 0F 53 /r | Compute Reciprocal of Scalar Single-Precision Floating-Point Values | |
ANDPS xmm1, xmm2/m128 | 0F 54 /r | Bitwise Logical AND of Packed Single-Precision Floating-Point Values | |
ANDNPS xmm1, xmm2/m128 | 0F 55 /r | Bitwise Logical AND NOT of Packed Single-Precision Floating-Point Values | |
ORPS xmm1, xmm2/m128 | 0F 56 /r | Bitwise Logical OR of Single-Precision Floating-Point Values | |
XORPS xmm1, xmm2/m128 | 0F 57 /r | Bitwise Logical XOR for Single-Precision Floating-Point Values | |
ADDPS xmm1, xmm2/m128 | 0F 58 /r | Add Packed Single-Precision Floating-Point Values | |
ADDSS xmm1, xmm2/m32 | F3 0F 58 /r | Add Scalar Single-Precision Floating-Point Values | |
MULPS xmm1, xmm2/m128 | 0F 59 /r | Multiply Packed Single-Precision Floating-Point Values | |
MULSS xmm1, xmm2/m32 | F3 0F 59 /r | Multiply Scalar Single-Precision Floating-Point Values | |
SUBPS xmm1, xmm2/m128 | 0F 5C /r | Subtract Packed Single-Precision Floating-Point Values | |
SUBSS xmm1, xmm2/m32 | F3 0F 5C /r | Subtract Scalar Single-Precision Floating-Point Values | |
MINPS xmm1, xmm2/m128 | 0F 5D /r | Return Minimum Packed Single-Precision Floating-Point Values | |
MINSS xmm1, xmm2/m32 | F3 0F 5D /r | Return Minimum Scalar Single-Precision Floating-Point Values | |
DIVPS xmm1, xmm2/m128 | 0F 5E /r | Divide Packed Single-Precision Floating-Point Values | |
DIVSS xmm1, xmm2/m32 | F3 0F 5E /r | Divide Scalar Single-Precision Floating-Point Values | |
MAXPS xmm1, xmm2/m128 | 0F 5F /r | Return Maximum Packed Single-Precision Floating-Point Values | |
MAXSS xmm1, xmm2/m32 | F3 0F 5F /r | Return Maximum Scalar Single-Precision Floating-Point Values | |
PSHUFW mm1, mm2/m64, imm8 | 0F 70 /r ib | Shuffle Packed Words | |
LDMXCSR m32 | 0F AE /2 | Load MXCSR Register State | |
STMXCSR m32 | 0F AE /3 | Store MXCSR Register State | |
SFENCE | 0F AE /7 | Store Fence | |
CMPPS xmm1, xmm2/m128, imm8 | 0F C2 /r ib | Compare Packed Single-Precision Floating-Point Values | |
CMPSS xmm1, xmm2/m32, imm8 | F3 0F C2 /r ib | Compare Scalar Single-Precision Floating-Point Values | |
PINSRW mm, r32/m16, imm8 | 0F C4 /r ib | Insert Word | |
PEXTRW r32, mm, imm8 | 0F C5 /r ib | Extract Word | |
SHUFPS xmm1, xmm2/m128, imm8 | 0F C6 /r ib | Shuffle Packed Single-Precision Floating-Point Values | |
PMOVMSKB r32, mm | 0F D7 /r | Move Byte Mask | |
PMINUB mm1, mm2/m64 | 0F DA /r | Minimum of Packed Unsigned Byte Integers | |
PMAXUB mm1, mm2/m64 | 0F DE /r | Maximum of Packed Unsigned Byte Integers | |
PAVGB mm1, mm2/m64 | 0F E0 /r | Average Packed Integers | |
PAVGW mm1, mm2/m64 | 0F E3 /r | Average Packed Integers | |
PMULHUW mm1, mm2/m64 | 0F E4 /r | Multiply Packed Unsigned Integers and Store High Result | |
MOVNTQ m64, mm | 0F E7 /r | Store of Quadword Using Non-Temporal Hint | |
PMINSW mm1, mm2/m64 | 0F EA /r | Minimum of Packed Signed Word Integers | |
PMAXSW mm1, mm2/m64 | 0F EE /r | Maximum of Packed Signed Word Integers | |
PSADBW mm1, mm2/m64 | 0F F6 /r | Compute Sum of Absolute Differences | |
MASKMOVQ mm1, mm2 | 0F F7 /r | Store Selected Bytes of Quadword |
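The integer rows of this table work lane by lane on 64-bit MMX registers. As a rough illustration, the Python sketch below models PAVGB and PSADBW on lists of eight unsigned bytes. The function names and the list representation are assumptions made for this example and do not come from any instruction set reference.

```python
def pavgb(a, b):
    """Model PAVGB: per-byte unsigned average, rounded up: (x + y + 1) >> 1."""
    assert len(a) == len(b) == 8
    return [(x + y + 1) >> 1 for x, y in zip(a, b)]


def psadbw(a, b):
    """Model PSADBW: sum of absolute differences of the eight unsigned bytes.

    On real hardware the sum is written to the low word of the destination
    and the remaining bits are cleared; here it is returned as a plain int.
    """
    assert len(a) == len(b) == 8
    return sum(abs(x - y) for x, y in zip(a, b))
```

The intermediate sum in PAVGB is wider than a byte, which is why averaging 255 with 255 still yields 255 rather than wrapping around.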
SSE2 instructions
Added with Pentium 4
Also see integer instructions added with Pentium 4
SSE2 SIMD floating-point instructions
Instruction | Opcode | Meaning |
---|---|---|
ADDPD xmm1, xmm2/m128 | 66 0F 58 /r | Add Packed Double-Precision Floating-Point Values |
ADDSD xmm1, xmm2/m64 | F2 0F 58 /r | Add Low Double-Precision Floating-Point Value |
ANDNPD xmm1, xmm2/m128 | 66 0F 55 /r | Bitwise Logical AND NOT |
CMPPD xmm1, xmm2/m128, imm8 | 66 0F C2 /r ib | Compare Packed Double-Precision Floating-Point Values |
CMPSD xmm1, xmm2/m64, imm8 | F2 0F C2 /r ib | Compare Low Double-Precision Floating-Point Values |
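For the arithmetic and compare rows listed in the SSE and SSE2 tables, the packed/scalar and single/double variants share the same `0F xx` opcode and differ only in a mandatory prefix byte. The sketch below captures that pattern in Python. The helper names are hypothetical, the ModRM byte and any immediate are omitted, and the mapping should not be assumed to hold for every SSE instruction.

```python
# Mandatory prefix that selects the data type for a shared 0F xx opcode.
PREFIX_BY_SUFFIX = {
    "PS": (),        # packed single-precision: no prefix (SSE)
    "SS": (0xF3,),   # scalar single-precision (SSE)
    "PD": (0x66,),   # packed double-precision (SSE2)
    "SD": (0xF2,),   # scalar double-precision (SSE2)
}


def opcode_bytes(suffix, opcode):
    """Return prefix + 0F escape + opcode byte (ModRM and immediates omitted)."""
    return bytes(PREFIX_BY_SUFFIX[suffix] + (0x0F, opcode))


def show(suffix, opcode):
    """Format the encoding the way the tables print it, e.g. '66 0F 58'."""
    return " ".join(f"{b:02X}" for b in opcode_bytes(suffix, opcode))
```

For example, `show("PD", 0x58)` reproduces the ADDPD row and `show("SS", 0x58)` reproduces the ADDSS row.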
ADDPD, ADDSD, ANDNPD, ANDPD, CMPPD, CMPSD*, COMISD, CVTDQ2PD, CVTDQ2PS, CVTPD2DQ, CVTPD2PI, CVTPD2PS, CVTPI2PD, CVTPS2DQ, CVTPS2PD, CVTSD2SI, CVTSD2SS, CVTSI2SD, CVTSS2SD, CVTTPD2DQ, CVTTPD2PI, CVTTPS2DQ, CVTTSD2SI, DIVPD, DIVSD, MAXPD, MAXSD, MINPD, MINSD, MOVAPD, MOVHPD, MOVLPD, MOVMSKPD, MOVSD*, MOVUPD, MULPD, MULSD, ORPD, SHUFPD, SQRTPD, SQRTSD, SUBPD, SUBSD, UCOMISD, UNPCKHPD, UNPCKLPD, XORPD
* CMPSD and MOVSD have the same names as the string instruction mnemonics CMPSD (CMPS) and MOVSD (MOVS); however, the former refer to scalar double-precision floating-point values, whereas the latter refer to doubleword strings.
SSE2 SIMD integer instructions
MOVDQ2Q, MOVDQA, MOVDQU, MOVQ2DQ, PADDQ, PSUBQ, PMULUDQ, PSHUFHW, PSHUFLW, PSHUFD, PSLLDQ, PSRLDQ, PUNPCKHQDQ, PUNPCKLQDQ
SSE3 instructions
Added with Pentium 4 supporting SSE3
Also see integer and floating-point instructions added with Pentium 4 SSE3
SSE3 SIMD floating-point instructions
- ADDSUBPD, ADDSUBPS (for Complex Arithmetic)
- HADDPD, HADDPS, HSUBPD, HSUBPS (for Graphics)
- MOVDDUP, MOVSHDUP, MOVSLDUP (for Complex Arithmetic)
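The lane behaviour of ADDSUBPS and HADDPS can be sketched in Python as follows. This is only a model: the function names are made up for the example, and Python floats are double-precision, whereas the real instructions operate on four single-precision lanes.

```python
def addsubps(a, b):
    """Model ADDSUBPS: subtract in even lanes (0, 2), add in odd lanes (1, 3)."""
    assert len(a) == len(b) == 4
    return [x - y if i % 2 == 0 else x + y
            for i, (x, y) in enumerate(zip(a, b))]


def haddps(a, b):
    """Model HADDPS: sums of adjacent pairs, first from a, then from b."""
    assert len(a) == len(b) == 4
    return [a[0] + a[1], a[2] + a[3], b[0] + b[1], b[2] + b[3]]
```

The alternating subtract/add pattern matches the real and imaginary parts of a complex product, which is presumably why the list groups ADDSUBPS under complex arithmetic.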
SSSE3 instructions
Added with Xeon 5100 series and initial Core 2
- PSIGNW, PSIGND, PSIGNB
- PSHUFB
- PMULHRSW, PMADDUBSW
- PHSUBW, PHSUBSW, PHSUBD
- PHADDW, PHADDSW, PHADDD
- PALIGNR
- PABSW, PABSD, PABSB
SSE4.1
Added with Core 2 manufactured in 45nm
- MPSADBW
- PHMINPOSUW
- PMULLD, PMULDQ
- DPPS, DPPD
- BLENDPS, BLENDPD, BLENDVPS, BLENDVPD, PBLENDVB, PBLENDW
- PMINSB, PMAXSB, PMINUW, PMAXUW, PMINUD, PMAXUD, PMINSD, PMAXSD
- ROUNDPS, ROUNDSS, ROUNDPD, ROUNDSD
- INSERTPS, PINSRB, PINSRD/PINSRQ, EXTRACTPS, PEXTRB, PEXTRW, PEXTRD/PEXTRQ
- PMOVSXBW, PMOVZXBW, PMOVSXBD, PMOVZXBD, PMOVSXBQ, PMOVZXBQ, PMOVSXWD, PMOVZXWD, PMOVSXWQ, PMOVZXWQ, PMOVSXDQ, PMOVZXDQ
- PTEST
- PCMPEQQ
- PACKUSDW
- MOVNTDQA
SSE4a
Added with Phenom processors
- LZCNT, POPCNT (POPulation CouNT) - advanced bit manipulation
- EXTRQ/INSERTQ
- MOVNTSD/MOVNTSS
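As a minimal sketch of what the two bit-manipulation instructions compute, the Python functions below model POPCNT and LZCNT for a given operand width. The `width` parameter and the function names are assumptions introduced for this example.

```python
def popcnt(value, width=32):
    """Model POPCNT: number of set bits in the operand."""
    mask = (1 << width) - 1
    return bin(value & mask).count("1")


def lzcnt(value, width=32):
    """Model LZCNT: number of leading zero bits; equals width for a zero input."""
    value &= (1 << width) - 1
    return width - value.bit_length()
```

Unlike the older BSR instruction, whose result is undefined for a zero source, LZCNT has a defined result for zero: the operand size in bits.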
SSE4.2
Added with Nehalem processors
- CRC32
- PCMPESTRI
- PCMPESTRM
- PCMPISTRI
- PCMPISTRM
- PCMPGTQ
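The CRC32 instruction accumulates a CRC using the Castagnoli polynomial (CRC-32C), not the polynomial used by zlib or Ethernet. A bit-by-bit software sketch of the same checksum is given below. It is written for clarity rather than speed, and the function names as well as the initial and final inversion in `crc32c` follow the usual software convention, not the instruction itself.

```python
CASTAGNOLI_REFLECTED = 0x82F63B78  # bit-reflected form of polynomial 0x1EDC6F41


def crc32c_update(crc, data):
    """Fold bytes into a running CRC, one bit at a time (no inversion applied)."""
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = (crc >> 1) ^ (CASTAGNOLI_REFLECTED if crc & 1 else 0)
    return crc


def crc32c(data):
    """Conventional CRC-32C: start from all ones and invert the final value."""
    return crc32c_update(0xFFFFFFFF, data) ^ 0xFFFFFFFF
```

With this convention, `crc32c(b"123456789")` gives the standard CRC-32C check value `0xE3069283`.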
Intel AVX FMA instructions
Instruction | Opcode | Meaning | Notes |
---|---|---|---|
VFMADDPD xmm0, xmm1, xmm2, xmm3 | C4E3 WvvvvL01 69 /r /is4 | Fused Multiply-Add of Packed Double-Precision Floating-Point Values | |
VFMADDPS xmm0, xmm1, xmm2, xmm3 | C4E3 WvvvvL01 68 /r /is4 | Fused Multiply-Add of Packed Single-Precision Floating-Point Values | |
VFMADDSD xmm0, xmm1, xmm2, xmm3 | C4E3 WvvvvL01 6B /r /is4 | Fused Multiply-Add of Scalar Double-Precision Floating-Point Values | |
VFMADDSS xmm0, xmm1, xmm2, xmm3 | C4E3 WvvvvL01 6A /r /is4 | Fused Multiply-Add of Scalar Single-Precision Floating-Point Values | |
VFMADDSUBPD xmm0, xmm1, xmm2, xmm3 | C4E3 WvvvvL01 5D /r /is4 | Fused Multiply-Alternating Add/Subtract of Packed Double-Precision Floating-Point Values | |
VFMADDSUBPS xmm0, xmm1, xmm2, xmm3 | C4E3 WvvvvL01 5C /r /is4 | Fused Multiply-Alternating Add/Subtract of Packed Single-Precision Floating-Point Values | |
VFMSUBADDPD xmm0, xmm1, xmm2, xmm3 | C4E3 WvvvvL01 5F /r /is4 | Fused Multiply-Alternating Subtract/Add of Packed Double-Precision Floating-Point Values | |
VFMSUBADDPS xmm0, xmm1, xmm2, xmm3 | C4E3 WvvvvL01 5E /r /is4 | Fused Multiply-Alternating Subtract/Add of Packed Single-Precision Floating-Point Values | |
VFMSUBPD xmm0, xmm1, xmm2, xmm3 | C4E3 WvvvvL01 6D /r /is4 | Fused Multiply-Subtract of Packed Double-Precision Floating-Point Values | |
VFMSUBPS xmm0, xmm1, xmm2, xmm3 | C4E3 WvvvvL01 6C /r /is4 | Fused Multiply-Subtract of Packed Single-Precision Floating-Point Values | |
VFMSUBSD xmm0, xmm1, xmm2, xmm3 | C4E3 WvvvvL01 6F /r /is4 | Fused Multiply-Subtract of Scalar Double-Precision Floating-Point Values | |
VFMSUBSS xmm0, xmm1, xmm2, xmm3 | C4E3 WvvvvL01 6E /r /is4 | Fused Multiply-Subtract of Scalar Single-Precision Floating-Point Values | |
VFNMADDPD xmm0, xmm1, xmm2, xmm3 | C4E3 WvvvvL01 79 /r /is4 | Fused Negative Multiply-Add of Packed Double-Precision Floating-Point Values | |
VFNMADDPS xmm0, xmm1, xmm2, xmm3 | C4E3 WvvvvL01 78 /r /is4 | Fused Negative Multiply-Add of Packed Single-Precision Floating-Point Values | |
VFNMADDSD xmm0, xmm1, xmm2, xmm3 | C4E3 WvvvvL01 7B /r /is4 | Fused Negative Multiply-Add of Scalar Double-Precision Floating-Point Values | |
VFNMADDSS xmm0, xmm1, xmm2, xmm3 | C4E3 WvvvvL01 7A /r /is4 | Fused Negative Multiply-Add of Scalar Single-Precision Floating-Point Values | |
VFNMSUBPD xmm0, xmm1, xmm2, xmm3 | C4E3 WvvvvL01 7D /r /is4 | Fused Negative Multiply-Subtract of Packed Double-Precision Floating-Point Values | |
VFNMSUBPS xmm0, xmm1, xmm2, xmm3 | C4E3 WvvvvL01 7C /r /is4 | Fused Negative Multiply-Subtract of Packed Single-Precision Floating-Point Values | |
VFNMSUBSD xmm0, xmm1, xmm2, xmm3 | C4E3 WvvvvL01 7F /r /is4 | Fused Negative Multiply-Subtract of Scalar Double-Precision Floating-Point Values | |
VFNMSUBSS xmm0, xmm1, xmm2, xmm3 | C4E3 WvvvvL01 7E /r /is4 | Fused Negative Multiply-Subtract of Scalar Single-Precision Floating-Point Values |
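What distinguishes a fused multiply-add from a separate multiply and add is that the result is rounded only once. The sketch below models the four scalar variants in Python using exact rational arithmetic. It assumes finite inputs, uses double precision throughout, and reads the four-operand form in the table as `xmm0 = xmm1 * xmm2 + xmm3`; the function names are hypothetical.

```python
from fractions import Fraction


def fmadd(a, b, c):
    """a*b + c computed exactly, then rounded once to a double."""
    return float(Fraction(a) * Fraction(b) + Fraction(c))


def fmsub(a, b, c):
    """a*b - c with a single rounding."""
    return fmadd(a, b, -c)


def fnmadd(a, b, c):
    """-(a*b) + c with a single rounding."""
    return fmadd(-a, b, c)


def fnmsub(a, b, c):
    """-(a*b) - c with a single rounding."""
    return fmadd(-a, b, -c)
```

With `a = 1 + 2**-30` and `b = 1 - 2**-30`, the unfused expression `a * b - 1.0` evaluates to `0.0` because the product is first rounded to `1.0`, whereas `fmsub(a, b, 1.0)` keeps the `-2**-60` term.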
Intel AES instructions
6 new instructions.
Instruction | Description |
---|---|
AESENC | Perform one round of an AES encryption flow |
AESENCLAST | Perform the last round of an AES encryption flow |
AESDEC | Perform one round of an AES decryption flow |
AESDECLAST | Perform the last round of an AES decryption flow |
AESKEYGENASSIST | Assist in AES round key generation |
AESIMC | Assist in AES Inverse Mix Columns |
Undocumented instructions
The x86 CPUs contain undocumented instructions which are implemented on the chips but not listed in some official documents. They can be found in various sources across the Internet, such as Ralf Brown's Interrupt List and at http://sandpile.org.
Mnemonic | Opcode | Description | Status |
---|---|---|---|
AAM imm8 | D4 imm8 | Divide AL by imm8, put the quotient in AH, and the remainder in AL | Available beginning with 8086, documented since Pentium (earlier documentation lists no arguments) |
AAD imm8 | D5 imm8 | Multiplication counterpart of AAM | Available beginning with 8086, documented since Pentium (earlier documentation lists no arguments) |
SALC | D6 | Set AL depending on the value of the Carry Flag (a 1-byte alternative of SBB AL, AL) | Available beginning with 8086, but only documented since Pentium Pro. |
HCF | F0 0F C7 C8 | Halt and Catch Fire: causes the CPU to lock, forcing the user to hard-reboot. | This was considered a bug by Intel and has been fixed in Pentium Pro step myB2 and later processors. |
UD1 | 0F B9 | Intentionally undefined instruction, but unlike UD2 this was not published | |
ICEBP | F1 | Single byte single-step exception / Invoke ICE (in-circuit emulator) | Available beginning with 80386, documented (as INT1) since Pentium Pro |
LOADALL | 0F 05 | Loads All Registers from Memory Address 0x000800 | Only available on 80286 |
Unknown mnemonic | 0F 04 | Exact purpose unknown; causes a CPU hang (the only way out is a CPU reset). In some implementations, emulated through BIOS as a halting sequence. | Only available on 80286 |
LOADALLD | 0F 07 | Loads All Registers from Memory Address ES:EDI | Only available on 80386 |
POP CS | 0F | Pop top of the stack into CS Segment register (causing a far jump) | Only available on earliest models of 8086. Beginning with 80286 this opcode is used as a prefix for 2-Byte-Instructions |
MOV CS,r/m | 8E/1 | Moves a value from register/memory into CS Segment register (causing a far jump) | Only available on earliest models of 8086. Beginning with 80286 this opcode causes an invalid opcode exception |
MOV ES,r/m | 8E/4 | Moves a value from register/memory into ES segment register | Only available on earliest models of 8086. On 80286 this opcode causes an invalid opcode exception. Beginning with 80386 the value is moved into the FS segment register. |
MOV CS,r/m | 8E/5 | Moves a value from register/memory into CS Segment register | Only available on earliest models of 8086. On 80286 this opcode causes an invalid opcode exception. Beginning with 80386 the value is moved into the GS segment register. |
MOV SS,r/m | 8E/6 | Moves a value from register/memory into SS Segment register | Only available on earliest models of 8086. Beginning with 80286 this opcode causes an invalid opcode exception |
MOV DS,r/m | 8E/7 | Moves a value from register/memory into DS Segment register | Only available on earliest models of 8086. Beginning with 80286 this opcode causes an invalid opcode exception |
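The first three rows of the table describe simple register arithmetic, which can be modelled in a few lines of Python. This is a sketch under the descriptions given in the table; the function names and the tuple return convention `(AH, AL)` are assumptions made for the example, and flag updates are not modelled.

```python
def aam(al, imm8=10):
    """Model AAM imm8: AH = AL // imm8, AL = AL % imm8. Returns (AH, AL)."""
    if imm8 == 0:
        # Real hardware raises a divide error here; the sketch raises instead.
        raise ZeroDivisionError("AAM with a zero immediate")
    return al // imm8, al % imm8


def aad(ah, al, imm8=10):
    """Model AAD imm8: AL = (AL + AH * imm8) mod 256, AH = 0. Returns (AH, AL)."""
    return 0, (al + ah * imm8) & 0xFF


def salc(carry_flag):
    """Model SALC: AL becomes 0xFF if the carry flag is set, otherwise 0x00."""
    return 0xFF if carry_flag else 0x00
```

The default immediate of 10 corresponds to the argument-free form found in earlier documentation; passing another base, such as `aam(0xFF, 16)`, splits AL into two hexadecimal digits instead.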
See also
- CLMUL
- XOP
- CVT16
- FMA
- RdRand
- Larrabee extensions
- Advanced Vector Extensions 2
External links
- The 8086 / 80286 / 80386 / 80486 Instruction Set
- Free IA-32 and x86-64 documentation, provided by Intel
- Netwide Assembler Instruction List (from Netwide Assembler)
- X86 Opcode and Instruction Reference