Articles tagged Kernel

Outline 'memory layout'

So far we have arrived at the gate leading to the real kernel. Before going further, let us take a short break and review what has been done to memory up to this point.

What we want to do is draw some pictures describing the layout of memory in each phase. Because the layout depends on the bootloader, the discussion is based on the following assumption:
The machine has two systems installed (Windows XP and Linux) and uses LILO as the bootloader. Let us look at the LILO configuration:

/* LILO Configuration - /etc/lilo.conf */
boot=/dev/hda
map=/boot/map
install=/boot/boot.b
prompt
timeout=100
compact
default=Linux
image=/boot/vmlinuz-2.6.15.3
         label=Linux
         root=/dev/hda2
         read-only
other=/dev/hda1
         label=WindowsXP

Here ‘boot=/dev/hda’ indicates that LILO is installed in the MBR of the first hard disk. ‘root=/dev/hda2’ indicates that the Linux system is installed on the second partition of the first disk, and ‘other=/dev/hda1’ indicates that the Windows system is installed on the first partition of the first disk. Since lilo.conf is not read at boot time, the MBR needs to be "refreshed" whenever the file is changed; otherwise none of your changes to lilo.conf will take effect at the next startup. Just as when LILO was first installed into the MBR, you need to run ‘$ /sbin/lilo -v -v’. The ‘-v -v’ flags give you very verbose output.
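
To make the structure of this file concrete, here is a minimal Python sketch that splits lilo.conf-style text into global options and per-system sections. The function name parse_lilo_conf is a hypothetical helper for illustration, not part of LILO, and it only understands the ‘key=value’ and bare-flag lines used in the example above.

```python
def parse_lilo_conf(text):
    """Split lilo.conf-style text into (global_options, sections).

    Hypothetical helper for illustration only; it handles just the
    'key=value' and bare-flag forms shown in the example above.
    """
    global_opts, sections = {}, []
    current = global_opts
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()   # drop '#' comments
        if not line:
            continue
        key, sep, value = line.partition("=")
        key, value = key.strip(), value.strip()
        if key in ("image", "other"):
            # 'image=' and 'other=' each open a new per-system section.
            current = {"kind": key, "path": value}
            sections.append(current)
        else:
            # Bare flags such as 'prompt' are stored as True.
            current[key] = value if sep else True
    return global_opts, sections


CONF = """\
boot=/dev/hda
prompt
timeout=100
default=Linux
image=/boot/vmlinuz-2.6.15.3
    label=Linux
    root=/dev/hda2
    read-only
other=/dev/hda1
    label=WindowsXP
"""

opts, systems = parse_lilo_conf(CONF)
print(opts["boot"])                     # device whose MBR receives LILO
print([s["label"] for s in systems])    # entries offered at the boot prompt
```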

Now we can switch on the machine! (‘<->’ means ‘begins at … and ends before …’)
1. Power on <-> BIOS routine
Chaos: that is the state of memory at this time.

2. BIOS routine <-> Bootloader 1st stage (MBR)
The BIOS routine finishes and prepares to execute the code loaded from the MBR, which contains the 1st stage of the LILO bootloader.

       |                        |
0A0000 +------------------------+
       |                        |
010000 +------------------------+
       |          MBR           | <- MBR (07C00 ~ 07E00)
001000 +------------------------+
       |                        |
000600 +------------------------+
       |      BIOS use only     |
000000 +------------------------+

3. Bootloader 1st stage(MBR) <-> Bootloader 2nd stage
The bootloader 1st stage moves itself to 0x09A000, sets up the Real Mode stack (ranging from 0x09B000 down to 0x09A200) and loads the 2nd stage of LILO at 0x09B000.

       |                        |
0A0000 +------------------------+
       |      2nd bootloader    |
09B000 +------------------------+
       |     Real mode stack    |
09A200 +------------------------+
       |     1st bootloader     |
09A000 +------------------------+
       |                        |
010000 +------------------------+
       |      MBR(useless)      | <- MBR (07C00 ~ 07E00)
001000 +------------------------+
       |  Reserved for MBR/BIOS |
000800 +------------------------+
       |  Typically used by MBR |
000600 +------------------------+
       |      BIOS use only     |
000000 +------------------------+
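
All the addresses in these pictures are physical addresses, while real-mode code works with segment:offset pairs. The small Python sketch below shows the conversion (segment times 16 plus offset). The segment values 0x07C0, 0x9A00 and 0x9B00 are assumptions derived from the addresses in the diagram, not values quoted from the LILO source.

```python
def phys(segment, offset):
    """Real-mode physical address: segment * 16 + offset.

    The 20-bit wrap-around of the address bus is ignored in this sketch.
    """
    return (segment << 4) + offset


print(hex(phys(0x07C0, 0x0000)))  # where the BIOS places the MBR
print(hex(phys(0x9A00, 0x0000)))  # 1st-stage bootloader after it moves itself
print(hex(phys(0x9B00, 0x0000)))  # load address of the 2nd-stage bootloader
```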

4. Bootloader 2nd stage <-> setup.S
The 2nd bootloader copies the integrated boot loader of the kernel image to address 0x090000, the setup() code to address 0x090200, and the rest of the kernel image to address 0x00010000 (called the ‘low address’, for small kernel images compiled with ‘make zImage’) or 0x00100000 (the ‘high address’, for big kernel images compiled with ‘make bzImage’).

zImage:

       |                        |
0A0000 +------------------------+
       |      2nd bootloader    |
09B000 +------------------------+
       |     Real mode stack    |
09A200 +------------------------+
       |     1st bootloader     |
09A000 +------------------------+
       |  Stack/heap/cmdline    | For use by the kernel real-mode code.
098000 +------------------------+
       |         Kernel setup   | The kernel real-mode code.
090200 +------------------------+
       |    Kernel boot sector  | The kernel legacy boot sector.
090000 +------------------------+
       |          zImage        | The bulk of the kernel image.
010000 +------------------------+
       |       MBR(useless)     | <- MBR (07C00 ~ 07E00)
001000 +------------------------+
       |  Reserved for MBR/BIOS |
000800 +------------------------+
       |  Typically used by MBR |
000600 +------------------------+
       |      BIOS use only     |
000000 +------------------------+

bzImage:

       +------------------------+
       |          bzImage       |
100000 +------------------------+
       |                        |
0A0000 +------------------------+
       |      2nd bootloader    |
09B000 +------------------------+
       |     Real mode stack    |
09A200 +------------------------+
       |     1st bootloader     |
09A000 +------------------------+
       |    Stack/heap/cmdline  | For use by the kernel real-mode code.
098000 +------------------------+
       |        Kernel setup    | The kernel real-mode code.
090200 +------------------------+
       |  Kernel boot sector    | The kernel legacy boot sector.
090000 +------------------------+
       |                        |
010000 +------------------------+
       |       MBR(useless)     | <- MBR (07C00 ~ 07E00)
001000 +------------------------+
       |  Reserved for MBR/BIOS |
000800 +------------------------+
       |  Typically used by MBR |
000600 +------------------------+
       |      BIOS use only     |
000000 +------------------------+
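
The bzImage picture can also be written down as a table and queried. The following Python sketch is only a restatement of the diagram above; the names BZIMAGE_LAYOUT and region_of are made up for this illustration, and the labels "unused" and "not labelled in the diagram" stand for the empty boxes.

```python
# (start, end_exclusive, description), copied from the bzImage diagram.
BZIMAGE_LAYOUT = [
    (0x000000, 0x000600, "BIOS use only"),
    (0x000600, 0x000800, "Typically used by MBR"),
    (0x000800, 0x001000, "Reserved for MBR/BIOS"),
    (0x001000, 0x010000, "MBR (no longer needed)"),
    (0x010000, 0x090000, "unused"),
    (0x090000, 0x090200, "Kernel boot sector"),
    (0x090200, 0x098000, "Kernel setup"),
    (0x098000, 0x09A000, "Stack/heap/cmdline"),
    (0x09A000, 0x09A200, "1st bootloader"),
    (0x09A200, 0x09B000, "Real mode stack"),
    (0x09B000, 0x0A0000, "2nd bootloader"),
    (0x0A0000, 0x100000, "not labelled in the diagram"),
]


def region_of(addr):
    """Return the description of the region that contains addr."""
    for start, end, name in BZIMAGE_LAYOUT:
        if start <= addr < end:
            return name
    # Everything from 1 MB upwards belongs to the bzImage in the diagram.
    return "bzImage" if addr >= 0x100000 else None


for addr in (0x7C00, 0x90000, 0x90200, 0x100000):
    print(hex(addr), "->", region_of(addr))
```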

5. setup.S <-> head.S
setup() checks where the kernel image was loaded in RAM. If it was loaded "low" (a zImage, at physical address 0x00010000), it is moved down to physical address 0x00001000. If it is a bzImage, already loaded "high" at physical address 0x00100000, it is NOT moved anywhere. In the zImage case the system therefore ends up in its rightful place in low memory, below 0x090000. setup() also stores the system parameters it collects in the range 0x090000 to 0x090200, which previously held the legacy boot sector.

       +------------------------+
       |         bzImage        |
100000 +------------------------+
       |                        |
098000 +------------------------+
       |      Kernel setup      |
090200 +------------------------+
       |     System parameters  | collected by setup()
090000 +------------------------+
       |                        |
       |                        |
       |          System        |
       |                        |
       |                        |
000000 +------------------------+
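
The relocation rule of this phase fits in a few lines. The Python sketch below only restates the addresses given in the text; the constant and function names are hypothetical. It also shows why a zImage has to be small: it must fit between its load address and the boot sector at 0x090000.

```python
ZIMAGE_LOAD_ADDR = 0x010000    # "low" load address used for a zImage
ZIMAGE_FINAL_ADDR = 0x001000   # where setup() moves a zImage
BZIMAGE_LOAD_ADDR = 0x100000   # "high" load address used for a bzImage
BOOT_SECTOR_ADDR = 0x090000    # kernel boot sector / system parameters


def final_kernel_address(load_addr):
    """Where the rest of the kernel image sits once setup() is done."""
    if load_addr == ZIMAGE_LOAD_ADDR:
        return ZIMAGE_FINAL_ADDR      # zImage: copied down
    if load_addr == BZIMAGE_LOAD_ADDR:
        return BZIMAGE_LOAD_ADDR      # bzImage: left where it is
    raise ValueError("unexpected load address: %#x" % load_addr)


def max_zimage_bytes():
    """Room between the low load address and the kernel boot sector."""
    return BOOT_SECTOR_ADDR - ZIMAGE_LOAD_ADDR


print(hex(final_kernel_address(ZIMAGE_LOAD_ADDR)))
print(hex(final_kernel_address(BZIMAGE_LOAD_ADDR)))
print(max_zimage_bytes() // 1024, "KB available for a zImage")
```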

OK, the picture is much clearer now, and we can walk through the door to the real kernel!

Begin 'setup.S'

It is time for ‘setup.S’ to show its power. ‘setup.S’ is loaded by the bootloader and, strictly speaking, belongs neither to the ‘bootstrap’ routine nor to the kernel program proper, although it is a portion of the kernel image. The source of ‘setup.S’ is rather big, but what it does can be summarized in one sentence: "‘setup.S’ is responsible for establishing the environment in which the kernel program executes".

Once ‘setup.S’ starts running, the bootloader that loaded it into memory has served its purpose, and the space it occupied becomes available. ‘setup.S’ consists of a setup header and a setup body. The setup header is part of the ‘Real-mode kernel header’, which must follow the layout described in ‘$(Linux-2.6.15.3_dir)/Documentation/i386/boot.txt’.
The ‘Real-mode kernel header’ looks like this:

Offset/Size  Proto    Name             Meaning

01F1/1       ALL(1    setup_sects      The size of the setup in sectors
01F2/2       ALL      root_flags       If set, the root is mounted readonly
01F4/4       2.04+(2  syssize          The size of the 32-bit code in 16-byte paras
01F8/2       ALL      ram_size         DO NOT USE - for bootsect.S use only
01FA/2       ALL      vid_mode         Video mode control
01FC/2       ALL      root_dev         Default root device number
01FE/2       ALL      boot_flag        0xAA55 magic number
0200/2       2.00+    jump             Jump instruction
0202/4       2.00+    header           Magic signature "HdrS"
0206/2       2.00+    version          Boot protocol version supported
0208/4       2.00+    realmode_swtch   Boot loader hook (see below)
020C/2       2.00+    start_sys        The load-low segment (0x1000) (obsolete)
020E/2       2.00+    kernel_version   Pointer to kernel version string
0210/1       2.00+    type_of_loader   Boot loader identifier
0211/1       2.00+    loadflags        Boot protocol option flags
0212/2       2.00+    setup_move_size  Move to high memory size (used with hooks)
0214/4       2.00+    code32_start     Boot loader hook (see below)
0218/4       2.00+    ramdisk_image    initrd load address (set by boot loader)
021C/4       2.00+    ramdisk_size     initrd size (set by boot loader)
0220/4       2.00+    bootsect_kludge  DO NOT USE - for bootsect.S use only
0224/2       2.01+    heap_end_ptr     Free memory after setup end
0226/2       N/A      pad1             Unused
0228/4       2.02+    cmd_line_ptr     32-bit pointer to the kernel command line
022C/4       2.03+    initrd_addr_max  Highest legal initrd address
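
Because every field has a fixed offset and size, the header is easy to decode outside the kernel. The Python sketch below unpacks the fields listed in the table from the first 0x230 bytes of an image, assuming little-endian values with no padding. The names FIELDS, parse_header and looks_valid are made up for this illustration, and the demo runs on a synthetic buffer rather than a real kernel image.

```python
import struct

# (field name, struct code) in table order, starting at offset 0x1F1.
FIELDS = [
    ("setup_sects", "B"), ("root_flags", "H"), ("syssize", "I"),
    ("ram_size", "H"), ("vid_mode", "H"), ("root_dev", "H"),
    ("boot_flag", "H"), ("jump", "2s"), ("header", "4s"),
    ("version", "H"), ("realmode_swtch", "I"), ("start_sys", "H"),
    ("kernel_version", "H"), ("type_of_loader", "B"), ("loadflags", "B"),
    ("setup_move_size", "H"), ("code32_start", "I"), ("ramdisk_image", "I"),
    ("ramdisk_size", "I"), ("bootsect_kludge", "I"), ("heap_end_ptr", "H"),
    ("pad1", "H"), ("cmd_line_ptr", "I"), ("initrd_addr_max", "I"),
]
HEADER_OFFSET = 0x1F1
# "<" selects little-endian byte order without alignment padding.
HEADER = struct.Struct("<" + "".join(code for _, code in FIELDS))


def parse_header(image):
    """Decode the real-mode kernel header into a dict keyed by field name."""
    values = HEADER.unpack_from(image, HEADER_OFFSET)
    return dict(zip([name for name, _ in FIELDS], values))


def looks_valid(hdr):
    """Check the two magic values listed in the table."""
    return hdr["boot_flag"] == 0xAA55 and hdr["header"] == b"HdrS"


# Synthetic example image: only a handful of fields are filled in.
image = bytearray(0x230)
image[0x1F1] = 9                                  # setup_sects
struct.pack_into("<H", image, 0x1FE, 0xAA55)      # boot_flag
image[0x202:0x206] = b"HdrS"                      # header signature
struct.pack_into("<H", image, 0x206, 0x0204)      # version
struct.pack_into("<I", image, 0x214, 0x100000)    # code32_start

hdr = parse_header(image)
print(looks_valid(hdr), hex(hdr["version"]), hex(hdr["code32_start"]))
```

For a real kernel, the same function could be applied to the first 0x230 bytes read from the image file in binary mode.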

The ‘Real-mode kernel header’ is checked by both the bootloader and the setup routine; setup will not proceed correctly unless the data in the header are valid. The label ‘start’ is the main entry point of ‘setup.S’, where the setup process begins. The first instruction executed there is a jump whose destination is the label ‘start_of_setup’, located immediately after the setup header. Our analysis also starts from this label. The code in ‘setup.S’ performs the following operations:

1. Check code integrity
Since the ‘setup.S’ code may not be contiguously loaded, we have to check code integrity first.

/*
 * ! Get the disk type - Int 13H & AH = 0x15
 * ! I wonder why this is done here.
 */
# Bootlin depends on this being done early
 movw $0x01500, %ax
 movb $0x81, %dl
 int $0x13

/* ! Reset the disk system - Int 13H & AH = 0x00 */
#ifdef SAFE_RESET_DISK_CONTROLLER
# Reset the disk controller.
 movw $0x0000, %ax
 movb $0x80, %dl
 int $0x13
#endif

# Set %ds = %cs, we know that SETUPSEG = %cs at this point
 movw %cs, %ax  # aka SETUPSEG
 movw %ax, %ds

 /*
  * ! if ((setup_sig1 != SIG1) || (setup_sig2 != SIG2)) {
  * !   goto bad_sig;
  * ! }
  * ! goto good_sig1;
  *
  * ! If the image is loaded by 'bootsect-loader',
  * ! the 'bad_sig' routine is never reached, since 'bootsect-loader'
  * ! loads the image contiguously.
  */
# Check signature at end of setup
 cmpw $SIG1, setup_sig1
 jne bad_sig

 cmpw $SIG2, setup_sig2
 jne bad_sig

 jmp good_sig1

If the signature is not found, execution reaches ‘bad_sig’. Let us have a look at how it finds the rest of the setup code and data.

bad_sig:
 movw %cs, %ax   # SETUPSEG
 subw $DELTA_INITSEG, %ax  # INITSEG
 movw %ax, %ds
 xorb %bh, %bh

 /*
  * ! ds:[497] <=> 0x9000:[497] -> %bl
  * ! rest of the code in words <=> (%bx - 4) << 8 -> %cx
  * ! (%bx >> 3) + SYSSEG -> start_sys_seg
  */
 movb (497), %bl   # get setup sect from bootsect
 subw $4, %bx    # LILO loads 4 sectors of setup
 shlw $8, %bx    # convert to words (1sect=2^8 words)
 movw %bx, %cx
 shrw $3, %bx    # convert to segment
 addw $SYSSEG, %bx
 movw %bx, %cs:start_sys_seg

# Move rest of setup code/data to here
 /*
  * ! move %ds:%si to %es:%di (%cx words) <=>
  * ! move SYSSEG:0 to cs:0800 (%cx*2 bytes)
  * ! with the 'rep' instruction prefix
  */
 movw $2048, %di   # four sectors loaded by LILO
 subw %si, %si
 pushw %cs
 popw %es
 movw $SYSSEG, %ax
 movw %ax, %ds
 rep
 movsw
 movw %cs, %ax   # aka SETUPSEG
 movw %ax, %ds
 cmpw $SIG1, setup_sig1
 jne no_sig

 cmpw $SIG2, setup_sig2
 jne no_sig

 jmp good_sig

Now the variable start_sys_seg points to where the real system code starts. If ‘bad_sig’ is never reached, start_sys_seg keeps its initial value, SYSSEG.
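
The arithmetic in ‘bad_sig’ is easier to follow with numbers. The Python sketch below mirrors the subw/shlw/shrw/addw sequence: one 512-byte sector is 256 words, and one 16-byte paragraph (one segment unit) is 8 words. SYSSEG is assumed to be 0x1000 here, matching the ‘load-low segment’ value in the header table; the function name and the sample value of 10 sectors are made up for the example.

```python
SYSSEG = 0x1000            # assumed value of SYSSEG (load-low segment)
LILO_SETUP_SECTORS = 4     # sectors of setup that LILO loads with setup itself


def rest_of_setup(setup_sects):
    """Return (words_to_copy, start_sys_seg) as computed at bad_sig."""
    remaining = setup_sects - LILO_SETUP_SECTORS    # subw $4, %bx
    words = (remaining << 8) & 0xFFFF               # shlw $8, %bx -> %cx
    paragraphs = words >> 3                         # shrw $3, %bx
    start_sys_seg = (SYSSEG + paragraphs) & 0xFFFF  # addw $SYSSEG, %bx
    return words, start_sys_seg


words, seg = rest_of_setup(10)
print(words, "words to copy, system code starts at segment", hex(seg))
```

With 10 setup sectors, 6 sectors (1536 words, 3072 bytes) still sit at SYSSEG:0, so the system code begins 0xC0 paragraphs later, at segment 0x10C0.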

2. Check bootloader type
The code at label ‘good_sig’ checks whether the loader is compatible with the image.

/*
 * ! if ((loadflags & LOADED_HIGH) && !type_of_loader)
 * !  goto no_sig_loop
 */
good_sig:
 movw %cs, %ax   # aka SETUPSEG
 subw $DELTA_INITSEG, %ax   # aka INITSEG
 movw %ax, %ds
# Check if an old loader tries to load a big-kernel
 testb $LOADED_HIGH, %cs:loadflags # Do we have a big kernel?
 jz loader_ok   # No, no danger for old loaders.

 cmpb $0, %cs:type_of_loader   # Do we have a loader that
      # can deal with us?
 jnz loader_ok   # Yes, continue.

 pushw %cs    # No, we have an old loader,
 popw %ds    # die. ! %ds = %cs now
 lea loader_panic_mess, %si
 call prtstr

 jmp no_sig_loop
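
The check performed at ‘good_sig’ can be restated as a short Python sketch. LOADED_HIGH is taken to be bit 0 of loadflags; the function name loader_ok simply reuses the label from the listing and the sample loader identifier 0x21 is an arbitrary nonzero value chosen for the example.

```python
LOADED_HIGH = 0x01   # bit 0 of loadflags: the kernel was loaded high


def loader_ok(loadflags, type_of_loader):
    """True if the loader may boot this image, mirroring good_sig."""
    if not (loadflags & LOADED_HIGH):
        return True              # small kernel: no danger for old loaders
    # Big kernel: only loaders that identify themselves can handle it.
    return type_of_loader != 0


print(loader_ok(0x00, 0x00))          # zImage, old loader
print(loader_ok(LOADED_HIGH, 0x21))   # bzImage, loader with an identifier
print(loader_ok(LOADED_HIGH, 0x00))   # bzImage, old loader: setup gives up
```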

3. Get memory size
According to the comments in the code, three different memory detection schemes are tried in order to get the extended memory size (above 1M) in KB. First e820h, which lets us assemble a memory map; then e801h, which returns a 32-bit memory size; and finally 88h, which returns 0-64M.
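
To give an idea of what "assembling a memory map" means, here is a Python sketch that sums the usable memory above 1M from a list of e820h-style entries. It assumes the common entry layout of a 64-bit base, a 64-bit length and a 32-bit type, with type 1 meaning usable RAM; the sample map is invented and the function name is hypothetical. It does not reproduce the code in ‘setup.S’.

```python
import struct

E820_ENTRY = struct.Struct("<QQI")   # base, length, type (20 bytes)
E820_RAM = 1                         # type value assumed to mean usable RAM
ONE_MB = 0x100000


def usable_kb_above_1m(raw_map):
    """Sum the usable RAM above 1 MB, in KB, from packed map entries."""
    total = 0
    for base, length, kind in E820_ENTRY.iter_unpack(raw_map):
        if kind != E820_RAM:
            continue
        start = max(base, ONE_MB)    # ignore the part below 1 MB
        end = base + length
        if end > start:
            total += end - start
    return total // 1024


# Invented sample map: low RAM, a reserved hole, then 127 MB of RAM at 1 MB.
sample = b"".join([
    E820_ENTRY.pack(0x00000000, 0x0009FC00, 1),
    E820_ENTRY.pack(0x0009FC00, 0x00000400, 2),
    E820_ENTRY.pack(0x00100000, 0x07F00000, 1),
])
print(usable_kb_above_1m(sample), "KB of extended memory")
```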

4. Hardware support
Several hardware devices are checked, and some of them are reset here. Although the BIOS has already initialized most hardware devices, Linux does not rely on that; it reinitializes the devices in its own manner to enhance portability and robustness.
(1) Keyboard
Call int $0x16 to set the keyboard repeat rate to the maximum.

(2) Video adapter
The video() code in ‘$(Linux-2.6.15.3_dir)/arch/i386/boot/video.S’ does the job.

(3) Hard disk
The code here copies the hd0 data to INITSEG:0080 (16 bytes) and the hd1 data to INITSEG:0090 (16 bytes). After that it checks whether hd1 exists with ‘Int 13H/AH=0x15’, which has already been called once before.

(4) Micro Channel (MCA) bus
(5) ROM configuration table
(6) PS/2 pointing device

5. Advanced Power Management(APM) BIOS support
Nothing to say.

6. Enhanced Disk Drive(EDD)
It lives in another file, ‘$(Linux-2.6.15.3_dir)/arch/i386/boot/edd.S’. Its job is to build a table in RAM describing the hard disks available in the system, using the appropriate BIOS procedures. If you are interested, you can dig deeper into that code.

7. Prepare for protected mode
(1) Disable interrupts and mask NMI

# This is the default real mode switch routine.
# to be called just before protected mode transition
default_switch:
 cli     # no interrupts allowed !
 movb $0x80, %al   # disable NMI for bootup
      # sequence
 outb %al, $0x70
 lret

(2) Relocate the code
/*
 * ! Do (long)code32 = code32_start, since code32_start
 * ! may have been changed by the loader.
 */
# we get the code32 start address and modify the below 'jmpi'
# (loader may have changed it)
 movl %cs:code32_start, %eax
 movl %eax, %cs:code32

code32_start is initialized to 0x1000 for a zImage or 0x100000 for a bzImage. This value will be used when passing control to ‘$(Linux-2.6.15.3_dir)/arch/i386/boot/compressed/head.S’.

The next piece of code moves the system to its rightful place if the loaded kernel is a zImage: in that case vmlinux is relocated to 0100:0. If we boot a bzImage, bvmlinux remains at start_sys_seg:0. The code then relocates bbootsect and bsetup from CS-DELTA_INITSEG:0 to INITSEG:0 if necessary (for downward compatibility with version <=201).

8. Enable A20
Everybody hates A20 and really nobody wants it, but it continues to haunt us. Nothing more is said about it here.

9. Switch to protected mode
Following ‘IA-32 Intel Architecture Software Developer’s Manual’, several operations should be done during the switching:
(1) Prepare GDT with a null descriptor in the first GDT entry, one code and one data segment descriptor;
(2) Disable interrupts, including maskable hardware interrupts and NMI (this has been done);
(3) Load the base address and limit of the GDT to GDTR register, using LGDT instruction;
(4) Set PE flag in CR0 register, using MOV CR0 (Intel386 and up) or LMSW instruction (for compatibility with Intel 286);
(5) Immediately execute a far JMP or a far CALL instruction.

# jump to startup_32 in arch/i386/boot/compressed/head.S

# NOTE: For high loaded big kernels we need a
# jmpi    0x100000,__BOOT_CS
#
# but we yet haven't reloaded the CS register, so the default size
# of the target offset still is 16 bit.
# However, using an operand prefix (0x66), the CPU will properly
# take our 48 bit far pointer. (INTeL 80386 Programmer's Reference
# Manual, Mixing 16-bit and 32-bit code, page 16-6)

 /*
  * ! 0x66 - operand-size prefix
  * ! 0xea - jmp instruction
  */
 .byte 0x66, 0xea   # prefix + jmpi-opcode

The far jmp instruction (0xea) updates the CS register. The contents of the remaining segment registers (DS, SS, ES, FS and GS) should be reloaded later. Control is now passed to ‘$(Linux-2.6.15.3_dir)/arch/i386/boot/compressed/head.S:startup_32’. For a zImage it is at address 0x1000; for a bzImage it is at 0x100000.
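
The hand-assembled jump is just eight bytes: the 0x66 prefix, the 0xea opcode, a 32-bit offset and a 16-bit selector, which together with the opcode bytes carry the 48-bit far pointer mentioned in the comment. The Python sketch below builds that byte sequence. The selector value 0x10 used for __BOOT_CS is an assumption made for the example, and far_jump32 is a hypothetical helper name.

```python
import struct

BOOT_CS_SELECTOR = 0x10   # assumed value of __BOOT_CS for this example


def far_jump32(offset, selector):
    """Encode 'prefix 0x66, opcode 0xea, 32-bit offset, 16-bit selector'."""
    return struct.pack("<BBIH", 0x66, 0xEA, offset, selector)


print(far_jump32(0x100000, BOOT_CS_SELECTOR).hex(" "))   # bzImage target
print(far_jump32(0x001000, BOOT_CS_SELECTOR).hex(" "))   # zImage target
```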

Supporting functions and variables exist in the tail of ‘setup.S’.
