Articles tagged Assembly

Retired 'bootsect.S'

The latest Linux kernel series is 2.6.x, and it boots differently from the older kernels. ‘bootsect.S’, which used to make a kernel image on a floppy disk directly bootable, no longer serves that purpose in Linux 2.6.x, although it is still part of the kernel image.

‘bootsect.S’ usually occupies the first 512 bytes of the kernel image and is installed in the first sector of the medium that holds the image, typically a hard disk (or its active partition) or a floppy disk. As the minimal ‘bootloader’ built into kernel images up to Linux 2.4, ‘bootsect.S’ had one job when booting from a floppy disk: copy the rest of the kernel image from the medium into main memory and then jump to the loaded code. When booting from a hard disk, ‘bootsect.S’ does nothing active; it is only checked by other boot code stored in the BIOS (Basic Input/Output System) or the MBR (Master Boot Record). Today, if you want to boot Linux 2.6.x from a floppy disk, you have to choose a suitable bootloader yourself, just as you do when booting from a hard disk, because ‘bootsect.S’ has retired.

Below is the source code of ‘bootsect.S’ with some comments of my own (my comments usually follow the symbol ‘!’). Let us see what the retired ‘bootsect.S’ really does!

/*
 * bootsect.S  Copyright (C) 1991, 1992 Linus Torvalds
 *
 * modified by Drew Eckhardt
 * modified by Bruce Evans (bde)
 * modified by Chris Noe (May 1999) (as86 -> gas)
 * gutted by H. Peter Anvin (Jan 2003)
 *
 * BIG FAT NOTE: We're in real mode using 64k segments.  Therefore segment
 * addresses must be multiplied by 16 to obtain their respective linear
 * addresses. To avoid confusion, linear addresses are written using leading
 * hex while segment addresses are written as segment:offset.
 *
 * ! $(linux-2.6.15.3_dir)/arch/i386/boot/bootsect.S
 */

/* ! I found this header file in $(linux-2.6.15.3_dir)/include/asm-i386 */
#include <asm/boot.h>

/*
 * ! DEF_INITSEG   0x9000
 * ! DEF_SYSSEG    0x1000
 * ! DEF_SETUPSEG  0x9020
 * ! DEF_SYSSIZE   0x7F00
 * ! The macros above are defined in ‘boot.h’.
 * ! The first three are segment values that
 * ! used to be loaded into segment registers such as ‘cs’.
 */
SETUPSECTS = 4   /* default nr of setup-sectors */
BOOTSEG  = 0x07C0  /* original address of boot-sector */
INITSEG  = DEF_INITSEG  /* we move boot here – out of the way */
SETUPSEG = DEF_SETUPSEG  /* setup starts here */
SYSSEG  = DEF_SYSSEG  /* system loaded at 0x10000 (65536) */
SYSSIZE  = DEF_SYSSIZE  /* system size: # of 16-byte clicks */
                                            /* to be loaded */
/*
 * ! The value given to ‘ROOT_DEV’ here does not matter:
 * ! it is overwritten when the kernel image is built,
 * ! and so is ‘SWAP_DEV’.
 * ! ‘ROOT_DEV’ is a variable that identifies the type of device
 * ! on which the root file system is stored.
 * ! ‘ROOT_DEV = 0’ means the same type of floppy as the boot device.
 */
ROOT_DEV = 0    /* ROOT_DEV is now written by "build" */
SWAP_DEV = 0   /* SWAP_DEV is now written by "build" */

#ifndef SVGA_MODE
#define SVGA_MODE ASK_VGA
#endif

#ifndef RAMDISK
#define RAMDISK 0
#endif

#ifndef ROOT_RDONLY
#define ROOT_RDONLY 1
#endif

/*
 * ! We are now running in 16-bit real mode,
 * ! not in 32-bit protected mode.
 */
.code16
.text

.global _start
_start:

 /*
  * ! jmpl here is a far jump, that is, a jump
  * ! between segments.
  * ! The instruction below loads the
  * ! immediate ‘$BOOTSEG’ into the ‘CS’
  * ! register and the offset of label
  * ! ‘start2’ into the instruction pointer, so
  * ! execution continues at label ‘start2’.
  * ! Now, R[%cs] = $BOOTSEG = 0x07C0
  */
 # Normalize the start address
 jmpl $BOOTSEG, $start2

start2:
 /*
  * ! initialize the segment registers and the stack pointer
  * ! R[%ds] = R[%es] = R[%ss] = 0x07C0
  * ! R[%sp] = 0x7c00
  */
 movw %cs, %ax
 movw %ax, %ds
 movw %ax, %es
 movw %ax, %ss
 movw $0x7c00, %sp

 /*
  * ! sti - set the interrupt flag
  * ! cld - clear ‘df’ (the direction flag). After it executes,
  * !       string operations increment the index
  * !       registers (si and/or di) that they use
  */
 sti
 cld

 /*
  * ! store the address of ‘bugger_off_msg’
  * ! into register ‘si’(source-index register)
  */
 movw $bugger_off_msg, %si

 /*
  * ! this loop prints ‘bugger_off_msg’ on the screen,
  * ! one character at a time, and jumps to the ‘die’
  * ! label when it reaches the terminating zero byte.
  */
msg_loop:
 /*
  * ! lodsb loads the ‘al’ register with the single memory
  * ! byte pointed to by the ‘si’ register. After it
  * ! executes, ‘si’ is automatically incremented or
  * ! decremented according to ‘df’.
  */
 lodsb
 andb %al, %al
 jz die
 /*
  * ! int 10h with ah = 0x0e is the BIOS teletype
  * ! output service: it prints the character in ‘al’
  */
 movb $0xe, %ah
 movw $7, %bx
 int $0x10
 jmp msg_loop

 /*
  * ! the computer dies and you have to reboot.
  */
die:
 # Allow the user to press a key, then reboot
 xorw %ax, %ax

 /*
  * ! int 16h with ah = 0 is the BIOS keyboard service
  * ! that waits for the user to press a key
  */
 int $0x16
 int $0x19

 # int 0x19 should never return.  In case it does anyway,
 # invoke the BIOS reset code...
 ljmp $0xf000,$0xfff0

bugger_off_msg:
 .ascii "Direct booting from floppy is no longer supported.\r\n"
 .ascii "Please use a boot loader program instead.\r\n"
 .ascii "\n"
 .ascii "Remove disk and press any key to reboot . . .\r\n"
 .byte 0

 # Kernel attributes; used by setup

 /*
  * ! the variables below are important because
  * ! they are referred to by ‘setup.S’.
  * ! The total size of these variables is
  * ! 15 bytes, 497 + 15 = 512 :)
  * ! the last word is ‘0xAA55’, which marks
  * ! this as a boot sector
  */
 .org 497
setup_sects: .byte SETUPSECTS
root_flags: .word ROOT_RDONLY
syssize: .word SYSSIZE
swap_dev: .word SWAP_DEV
ram_size: .word RAMDISK
vid_mode: .word SVGA_MODE
root_dev: .word ROOT_DEV
boot_flag: .word 0xAA55

/* ! end of bootsect.S */

So the retired ‘bootsect.S’ now does just one thing: it tells us that it has retired.

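The Road to Assembly: Reviewing Stack Operations

I have to admit that my previous post on stack frames and stack operations was rather sketchy, so here is a "supplement", euphemistically called a "review".

The following example covers almost everything related to stack operations.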
void dummy()
{
        int     i = 12;
        int     j = 13;
        char    c = 'a';
}

int main()
{
        dummy();
        return 0;
}

Below is the code as disassembled by MDB (see Note [1]):
> main::dis
main:                           pushl   %ebp
main+1:                         movl    %esp,%ebp
main+3:                         subl    $8,%esp
main+6:                         andl    $0xf0,%esp
main+9:                         movl    $0,%eax
main+0xe:                       subl    %eax,%esp
main+0x10:                      call    -0x2a          
main+0x15:                      movl    $0,%eax
main+0x1a:                      leave
main+0x1b:                      ret

> dummy::dis
dummy:                          pushl   %ebp
dummy+1:                        movl    %esp,%ebp
dummy+3:                        subl    $0xc,%esp
dummy+6:                        movl    $0xc,-4(%ebp)
dummy+0xd:                      movl    $0xd,-8(%ebp)
dummy+0x14:                     movb    $0x61,-9(%ebp)
dummy+0x18:                     leave
dummy+0x19:                     ret

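To analyze the assembly above, we need to answer the following questions:
1. The standard pattern of a procedure call
We know that the instruction that performs a procedure call is call, but what does call actually do? Each procedure above also ends with a leave instruction; what does that do? Let us trace how a stack frame is formed, and the answers will follow.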

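(1) We start at main + 0x10, which is a call instruction. At this point the active stack frame is main's, and dummy's stack frame has not been formed yet:
+          + 0xffffffff
|          |
+----------+
|          | return address of main; belongs to the frame of main's caller
+----------+ ---------------------------
|    A     | bottom of main's frame <-- %ebp
+----------+
|    B     |
+----------+
|    C     | top of main's frame <-- %esp
+----------+
|          |
+          + 0x00000000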

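(2) After the call instruction has executed but before dummy starts running, main's stack frame is complete, and %eip holds the address of dummy's first instruction, ready to execute.
+          + 0xffffffff
|          |
+----------+
|          | return address of main; belongs to the frame of main's caller
+----------+ ---------------------------
|    A     | bottom of main's frame <-- %ebp
+----------+
|    B     |
+----------+
|    C     |
+----------+
|          | return address of dummy; top of main's frame <-- %esp
+----------+ ---------------------------
|          |
+          + 0x00000000
As we can see, call first pushes the return address for the function that main calls (dummy here) onto the stack, which forms the last part of main's stack frame, and then jumps to the start of dummy. So call is equivalent to the following two instructions:
pushl %eip  // push the address of the next instruction (conceptual only: %eip is not a valid pushl operand)
jmp dummy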

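(3) dummy's stack frame is formed
dummy first saves the bottom of main's frame and then sets up its own.
+          + 0xffffffff
|          |
+----------+
|          | return address of dummy; belongs to main's frame
+----------+ ---------------------------
|    D     | bottom of dummy's frame <-- %ebp; holds the bottom of main's frame
+----------+
|    E     |
+----------+
|    F     | top of dummy's frame <-- %esp
+----------+ ---------------------------
|          |
+          + 0x00000000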

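(4) dummy returns
The first instruction dummy executes on return is leave, which is equivalent to the following two instructions:
Instruction 1: movl %ebp, %esp // move %esp to the start of dummy's stack frame

After this instruction executes, the state is as follows:
+          + 0xffffffff
|          |
+----------+
|          | return address of dummy; belongs to main's frame
+----------+ ---------------------------
|    D     | bottom of dummy's frame <-- %esp <-- %ebp
+----------+
|    E     |
+----------+
|    F     | top of dummy's frame
+----------+ ---------------------------
|          |
+          + 0x00000000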

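Instruction 2: popl %ebp
After this instruction executes, the state is as follows:
+          + 0xffffffff
|          |
+----------+
|          | return address of main; belongs to the frame of main's caller
+----------+ ---------------------------
|    A     | bottom of main's frame <-- %ebp
+----------+
|    B     |
+----------+
|    C     |
+----------+
|          | return address of dummy; top of main's frame <-- %esp
+----------+ ---------------------------
|    D     | bottom of dummy's frame
+----------+
|    E     |
+----------+
|    F     | top of dummy's frame
+----------+ ---------------------------
|          |
+          + 0x00000000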

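The second instruction dummy executes on return is ret, which is equivalent to popl %eip (again conceptual, since %eip cannot be named as an operand). After it executes, the stack looks like this:
+          + 0xffffffff
|          |
+----------+
|          | return address of main; belongs to the frame of main's caller
+----------+ ---------------------------
|    A     | bottom of main's frame <-- %ebp
+----------+
|    B     |
+----------+
|    C     | <-- %esp top of main's frame
+----------+
|          | return address of dummy
+----------+ ---------------------------
|    D     | bottom of dummy's frame
+----------+
|    E     |
+----------+
|    F     | top of dummy's frame
+----------+ ---------------------------
|          |
+          + 0x00000000

At this point, main's stack frame has been restored.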

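From the analysis above, the standard pattern of a procedure call is:
pushl %ebp
movl %esp, %ebp

// procedure body

leave
ret
Here ret pairs with call, while leave pairs with the first two instructions.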

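2. Accessing local variables
In dummy's assembly we can clearly see the assignments to the three local variables i, j and c:
movl    $0xc,-4(%ebp)
movl    $0xd,-8(%ebp)
movb    $0x61,-9(%ebp)
What the three have in common is that each local variable is accessed through an offset from %ebp.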

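3. Allocating local variables
Allocating more than one local variable on the stack raises the question of stack alignment, and dummy's code illustrates it well. We placed two ints and one char in dummy's stack frame, which actually need 9 bytes. Did the compiler allocate exactly 9 bytes?
movl    %esp,%ebp
subl    $0xc,%esp
movl    $0xc,-4(%ebp)

We can see that subl $0xc,%esp reserves 12 bytes on the stack: 3 extra bytes are allocated beyond char c, so that accesses to the addresses of whatever comes next stay aligned.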

4. Allocating and accessing variables of heterogeneous (struct) types
For example:
struct test_t {
        int i;
        int j;
        int a[3];
};

void dummy()
{
        struct test_t t;
        t.i = 11;
        t.j = 12;
        t.a[0] = 'a';
        t.a[1] = 'b';
        t.a[2] = 'c';
}

int main()
{
        dummy();
        return 0;
}

> dummy::dis
dummy:                          pushl   %ebp
dummy+1:                        movl    %esp,%ebp
dummy+3:                        subl    $0x28,%esp
dummy+6:                        movl    $0xb,-0x28(%ebp)
dummy+0xd:                      movl    $0xc,-0x24(%ebp)
dummy+0x14:                     movl    $0x61,-0x20(%ebp)
dummy+0x1b:                     movl    $0x62,-0x1c(%ebp)
dummy+0x22:                     movl    $0x63,-0x18(%ebp)
dummy+0x29:                     leave
dummy+0x2a:                     ret

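Unlike the previous example, to store a single test_t structure the stack reserves 0x28 (40 decimal) bytes this time, leaving 0x14 (20) bytes unused between the end of t.a[2] and %ebp. The reason is unclear to me. If it is for alignment, the cost is certainly not small.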

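[Note 1]
On Solaris 9 for x86, GDB's disassembly syntax differs slightly from ours, whereas MDB (The Modular Debugger), which ships with Solaris, matches the assembly syntax we use. By the way, MDB is a powerful debugging tool, and Sun's website has detailed documentation on how to use it.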
