Inside the 'i386'
The term ‘i386’ in the title does not refer to the actual Intel 80386 processor; it stands for the Intel 32-bit architecture (IA-32) as a whole. I prefer ‘i386’ to ‘IA32’, just as the Linux kernel does: you can find an ‘i386’ folder in the $(linux-2.6.x_dir)/arch directory. This article covers some basic knowledge of the ‘i386’, which may be kinda useful to those who want to research or develop operating systems.
We know that ‘i386’ processors are the most widely used and supported today, and Linux itself was born on one. As researchers or developers, we wonder what the ‘i386’ processor offers us. In brief, it offers an execution environment consisting of a set of registers and several mechanisms for accessing memory. Today's mainstream Intel processors, such as the Pentium series, are almost all based on the 80386, which first introduced 32-bit registers and paging into the family. Let us have a look at the resources supplied by the ‘i386’ processor. Parts of the content below are quoted from the "Intel Architecture Software Developer’s Manual, Volume 1: Basic Architecture".
1. Memory accessing
Any operating system or executive designed to work with an ‘i386’ processor will use the processor’s memory management facilities to access memory. With these facilities in use, programs do not address physical memory directly. Instead, they access memory through one of the three memory models that ‘i386’ processors support: flat, segmented, or real-address mode.
With the flat memory model, memory appears to a program as a single, continuous, byte-addressable address space called the ‘linear address space’. It runs contiguously from 0 to (4G - 1). When using this model, code (a program’s instructions), data, and the procedure stacks are all contained in this one address space.
With the segmented memory model, memory appears to a program as a group of independent address spaces called segments. When using this model, code, data, and stacks are typically contained in separate segments. To address a byte in a segment, a program must issue a logical address, which consists of a segment selector and an offset. Internally, the processor translates each logical address into a linear address to access a memory location, and this translation is transparent to the application program.
With either the flat or the segmented model, the ‘i386’ processor provides facilities for dividing the linear address space into pages and mapping the pages into virtual memory. If an operating system or executive uses the paging mechanism, the existence of the pages is transparent to an application program. The whole translation can be summarized as follows:
Logical Address (segmented model) --> [Segmentation Unit] --> Linear Address (flat model) --> [Paging Unit] --> Physical Address
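To make the two stages a little more concrete, here is a minimal Python sketch of the pipeline. It is only a toy model under stated assumptions: 4 KB pages with the usual two-level 10/10/12-bit split of a 32-bit linear address, and page tables represented as nested dictionaries. The function names and the sample mapping are made up for illustration and are not taken from the Intel manual.

```python
def logical_to_linear(segment_base, segment_limit, offset):
    """Segmentation unit: limit check, then base + offset (32-bit wrap)."""
    if offset > segment_limit:
        raise ValueError("offset exceeds segment limit (a real CPU would fault)")
    return (segment_base + offset) & 0xFFFFFFFF


def split_linear(linear):
    """Split a 32-bit linear address into (directory index, table index, page offset)."""
    return (linear >> 22) & 0x3FF, (linear >> 12) & 0x3FF, linear & 0xFFF


def linear_to_physical(page_directory, linear):
    """Paging unit: walk a toy two-level table stored as nested dicts."""
    dir_index, table_index, page_offset = split_linear(linear)
    try:
        frame_base = page_directory[dir_index][table_index]
    except KeyError:
        raise LookupError("page not present (a real CPU would page-fault)") from None
    return frame_base + page_offset


# Hypothetical mapping: linear page number 1 lives in the physical frame at 0x0009A000.
page_directory = {0: {1: 0x0009A000}}

# Flat-style segment: base 0, limit 4G - 1, so linear address == offset.
linear = logical_to_linear(segment_base=0, segment_limit=0xFFFFFFFF, offset=0x1234)
physical = linear_to_physical(page_directory, linear)
print(hex(linear), hex(physical))
```

With a base of 0 and a limit of 4G - 1 the segmentation step changes nothing, which is exactly how a flat model is usually built on top of segmentation.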
The remaining model, real-address mode, uses the memory model of the Intel 8086 processor, the first processor in the x86 family. It is a specific implementation of segmented memory in which the linear address space for the program and the operating system/executive consists of an array of segments, each of which is up to 64K bytes in size. The maximum size of the linear address space in real-address mode is 1M bytes.
Here we have to say something about the ‘operating mode’. The ‘i386’ processor supports three operating modes, which determine which instructions and architectural features are accessible.
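The 64K and 1M figures follow from how a real-mode address is formed: the 16-bit segment value is shifted left by four bits and added to the 16-bit offset. A small Python sketch of that arithmetic is below. The function name is hypothetical, and the wrap to 20 bits models the original 8086; later processors can behave differently at the 1M boundary.

```python
def real_mode_linear(segment, offset, address_bits=20):
    """Compute segment * 16 + offset, wrapped to the given number of address bits."""
    if not (0 <= segment <= 0xFFFF and 0 <= offset <= 0xFFFF):
        raise ValueError("segment and offset must both fit in 16 bits")
    return ((segment << 4) + offset) & ((1 << address_bits) - 1)


# Two different segment:offset pairs can name the same byte.
print(hex(real_mode_linear(0x07C0, 0x0000)))
print(hex(real_mode_linear(0x0000, 0x7C00)))

# The highest pair runs past 1M and wraps around on a 20-bit address bus.
print(hex(real_mode_linear(0xFFFF, 0x0010)))
```

Since the offset is only 16 bits wide, one segment can span at most 64K bytes, and since segments start every 16 bytes, neighbouring segments overlap heavily.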
(1) Protected mode
This is the native state of the processor. In this mode all instructions and architectural features are available, providing the highest performance and capability. It is the recommended mode for all new applications and operating systems. When in this mode, the processor can use any of the memory models described above. (The real-address mode memory model is ordinarily used only when the processor is in virtual-8086 mode.)
(2) Real-address mode
This mode provides the programming environment of the Intel 8086 processor with a few extensions (such as the ability to switch to protected or system management mode). The processor is placed in real-address mode following power-up or a reset. When in this mode, the processor supports only the real-address mode memory model. As we know, the first stage of booting from disk runs in this mode.
(3) System management mode
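This mode is unfamiliar to most of us, so we will not go into it here. In short, the processor enters it through a system management interrupt (SMI), and it lets the platform implement features such as power management in a way that is transparent to the operating system.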
2. Registers
The registers in ‘i386’ processors can be grouped into three types: ‘general-purpose data registers’, ‘segment registers’ and ‘status and control registers’. Details as follows:
(1) General-Purpose data registers
There are eight 32-bit registers available for general purposes, such as storing operands and pointers. In theory you can pick any of them for whatever you want to do, but many instructions assign specific registers to hold their operands. The following is a summary of these special uses:
EAX – Accumulator for operands and results data.
EBX – Pointer to data in the DS segment.
ECX – Counter for string and loop operations.
EDX – I/O pointer.
ESI – Pointer to data in the segment pointed to by the DS register; source pointer for string operations.
EDI – Pointer to data (or destination) in the segment pointed to by the ES register; destination pointer for string operations.
ESP – Stack pointer (in the SS segment).
EBP – Pointer to data on the stack (in the SS segment).
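As an illustration of these implicit roles, the string-copy instruction ‘rep movsb’ uses ESI as the source pointer, EDI as the destination pointer and ECX as the repeat counter, without naming any of them as operands. The Python sketch below imitates that behaviour. It is a simplified model that assumes a flat memory layout (DS and ES both based at 0) and a bytearray standing in for memory; the function name is made up for illustration.

```python
def rep_movsb(memory, esi, edi, ecx, df=0):
    """Copy ECX bytes from [ESI] to [EDI], stepping up (DF=0) or down (DF=1)."""
    step = -1 if df else 1
    while ecx != 0:
        memory[edi] = memory[esi]          # move one byte
        esi = (esi + step) & 0xFFFFFFFF    # advance the source pointer
        edi = (edi + step) & 0xFFFFFFFF    # advance the destination pointer
        ecx -= 1                           # count down the repeat counter
    return esi, edi, ecx


memory = bytearray(16)
memory[0:5] = b"hello"
esi, edi, ecx = rep_movsb(memory, esi=0, edi=8, ecx=5)
print(bytes(memory[8:13]), esi, edi, ecx)
```

The direction of the pointer updates is controlled by the DF flag in EFLAGS, which is why the sketch takes it as a parameter.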
(2) Segment registers
There are six 16-bit registers for holding segment selectors. A segment selector is a special pointer that identifies a segment in memory. To access a particular segment in memory, the segment selector for that segment must be present in the appropriate segment register. So, although a system can define thousands of segments, only six are available for immediate use. Other segments can be made available by loading their segment selectors into these registers during program execution.
Every segment register has a ‘visible’ part (the 16-bit selector) and a ‘hidden’ part. (The hidden part is sometimes referred to as a ‘descriptor cache’ or a ‘shadow register’.) When a segment selector is loaded into the visible part of a segment register, the processor also loads the hidden part with the base address, segment limit, and access control information from the segment descriptor pointed to by the segment selector.
Some instructions, such as ‘mov’ and ‘pop’, load segment registers explicitly. Others, such as the far forms of ‘call’, ‘jmp’, and ‘ret’, change the contents of the CS register (and sometimes other segment registers) as an incidental part of their operation. How the segment registers are used depends on the memory model that the operating system or executive is using.
We just mentioned ‘segment descriptors’. A segment descriptor is a data structure in a GDT or LDT that provides the processor with the size and location of a segment, as well as access control and status information. Segment descriptors are typically created by compilers, linkers, loaders, or the operating system or executive, but not by application programs.
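The 16-bit selector itself is a small packed structure. In protected mode, as documented in Intel's manuals, the upper 13 bits index a descriptor table, one bit chooses between the GDT and the LDT, and the low two bits carry the requested privilege level. A short Python sketch of the decoding follows; the type and function names are invented here for illustration.

```python
from collections import namedtuple

Selector = namedtuple("Selector", ["index", "table", "rpl"])


def decode_selector(value):
    """Split a 16-bit protected-mode segment selector into its three fields."""
    if not 0 <= value <= 0xFFFF:
        raise ValueError("a segment selector is 16 bits wide")
    index = value >> 3                            # bits 15..3: descriptor index
    table = "LDT" if (value >> 2) & 1 else "GDT"  # bit 2: table indicator
    rpl = value & 0b11                            # bits 1..0: requested privilege level
    return Selector(index, table, rpl)


print(decode_selector(0x0008))
print(decode_selector(0x002B))
```

A 13-bit index gives up to 8192 descriptors per table, which is where the "thousands of segments" come from.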
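For readers curious how the size and location are stored, the Python sketch below unpacks an 8-byte descriptor, given as a 64-bit integer, into its base, limit and access fields. The bit layout follows the descriptor format in Intel's manuals; the function name, the dictionary keys and the sample value (a flat 4G code segment of the kind many kernels use) are assumptions chosen for illustration.

```python
def decode_descriptor(raw):
    """Unpack a 64-bit code/data segment descriptor into a dict of fields."""
    limit = (raw & 0xFFFF) | (((raw >> 48) & 0xF) << 16)             # 20-bit limit
    base = ((raw >> 16) & 0xFFFFFF) | (((raw >> 56) & 0xFF) << 24)   # 32-bit base
    access = (raw >> 40) & 0xFF
    granular = bool((raw >> 55) & 1)       # G flag: limit counted in 4 KB units
    if granular:
        limit = (limit << 12) | 0xFFF      # scale to a byte-granular limit
    return {
        "base": base,
        "limit": limit,                    # highest valid offset in the segment
        "present": bool(access & 0x80),    # P flag
        "dpl": (access >> 5) & 0b11,       # descriptor privilege level
        "code": bool(access & 0x08),       # code (True) or data (False) segment
        "default_32bit": bool((raw >> 54) & 1),  # D/B flag
    }


flat_code = decode_descriptor(0x00CF9A000000FFFF)
print(hex(flat_code["base"]), hex(flat_code["limit"]), flat_code["dpl"])
```

The scattered base and limit bits are a leftover of the 80286 descriptor format, which the 80386 extended rather than replaced.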
(3) Status and control registers
These registers report and allow modification of the state of the processor and of the program being executed. E.g. the 32-bit EFLAGS register contains a group of status flags, a control flag, and a group of system flags. Details as follows:
CF – Carry Flag
PF – Parity Flag
AF – Auxiliary Carry Flag
ZF – Zero Flag
SF – Sign Flag
TF – Trap Flag
IF – Interrupt Enable Flag
DF – Direction Flag
OF – Overflow Flag
IOPL – I/O Privilege Level
NT – Nested Task
RF – Resume Flag
VM – Virtual-8086 Mode
AC – Alignment Check
VIF – Virtual Interrupt Flag
VIP – Virtual Interrupt Pending
ID – ID Flag
Some of the flags in the EFLAGS register can be modified directly using special-purpose instructions (for example, ‘stc’/‘clc’ for CF, ‘std’/‘cld’ for DF, and ‘sti’/‘cli’ for IF).
The ‘i386’ processor is so complex that we cannot list all of its features here. If you are interested, the ‘i386’ manuals are thick, but reading them will make everything clear.
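Each flag listed above occupies a fixed bit position in EFLAGS, so a raw register value can be decoded with a few shifts and masks. The Python sketch below does this using the bit positions given in Intel's manuals; the table and function names are made up for illustration. IOPL is handled separately because it is a two-bit field rather than a single flag.

```python
# Bit position of each single-bit flag in the 32-bit EFLAGS register.
EFLAGS_BITS = {
    "CF": 0, "PF": 2, "AF": 4, "ZF": 6, "SF": 7, "TF": 8, "IF": 9,
    "DF": 10, "OF": 11, "NT": 14, "RF": 16, "VM": 17, "AC": 18,
    "VIF": 19, "VIP": 20, "ID": 21,
}


def decode_eflags(value):
    """Return (names of the flags that are set, IOPL) for a raw EFLAGS value."""
    set_flags = [name for name, bit in EFLAGS_BITS.items() if (value >> bit) & 1]
    iopl = (value >> 12) & 0b11   # bits 13..12: I/O privilege level
    return set_flags, iopl


print(decode_eflags(0x00000246))
```

Bit 1 of EFLAGS is reserved and always reads as 1, which is why a value such as 0x246 has it set even though it corresponds to no flag in the list.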