Unix - Tony Bai

March 10, 2007

C programmers and C++ programmers often declare null pointers differently.
C programmers typically write:
int *ptr = NULL;

C++ programmers, following the advice of Bjarne Stroustrup and other C++ experts, firmly write:
int *ptr = 0;

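Neither side is necessarily right or wrong; it may simply be a matter of habit. C is the older language, and early C programmers were perhaps reluctant to scatter a magic number like 0 through their code, so they used #define NULL ((void*)0) as a uniform way to declare or assign null pointers.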

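'Effective C++' explicitly advises avoiding macros, so C++ devotees naturally put NULL out of mind. Assigning 0 to a pointer to signal a null pointer gradually became the habit and was passed on.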

Again, neither is right or wrong. In section 5.1.1 of 'The C++ Programming Language Special Edition', Bjarne Stroustrup takes fewer than 200 words to address the question of 0 versus NULL. The passage is a classic, so let us revisit it:

Zero (0) is an int. Because of standard conversions, 0 can be used as a constant of any integral, floating-point, pointer, or pointer-to-member type. The type of zero will be determined by context. Zero (0) will typically (but not necessarily) be represented by the bit pattern all-zeros of the appropriate size. No object is allocated with the address 0. Consequently, 0 acts as a pointer literal, indicating that a pointer doesn't refer to an object.

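In other words, 0 is an integer, but through the standard conversions it can serve as a constant of any integral, floating-point, pointer, or pointer-to-member type; which type it takes is determined by context. It is usually, though not necessarily, represented by an all-zero bit pattern of the appropriate size. No object is ever allocated at address 0, so 0 works as a pointer literal meaning that the pointer does not refer to any object.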

For example:
int i = 0; // integral type
long l = 0; // integral type
float f = 0; // floating-point type
double d = 0; // floating-point type

int *p = 0; // pointer to int
double *dp = 0; // pointer to double

class T {
public :
int func(int a){…};
};

T *pT = 0; // pointer to a user-defined type
int (T::*PTR)(int) = 0; // pointer-to-member type
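
One way to see that a plain 0 really is an int first, and a pointer only by conversion, is overload resolution. The following is a minimal sketch, not taken from the book; the names which, describe_zero and describe_cast are hypothetical and exist only for illustration.

```cpp
#include <string>

// Two overloads: one takes an integer, the other takes a pointer.
std::string which(int)  { return "int"; }
std::string which(int*) { return "pointer"; }

// Hypothetical helper: passes a plain 0.
// 0 is an int, so which(int) is an exact match and wins over the
// null-pointer conversion that which(int*) would require.
std::string describe_zero() { return which(0); }

// Hypothetical helper: passes a null pointer with an explicit pointer type,
// so only which(int*) is an exact match.
std::string describe_cast() { return which(static_cast<int*>(0)); }
```

Calling describe_zero() selects the int overload, while describe_cast() selects the pointer overload; the context decides what 0 turns into, and without a pointer context it stays an int.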

Bjarne Stroustrup goes on to describe the difference in habit between C and C++ programmers and offers his own advice:
In C, it has been popular to define a macro NULL to represent the zero pointer. Because of C++’s tighter type checking, the use of plain 0, rather than any suggested NULL macro, leads to fewer problems. If you feel you must define NULL, use const int NULL = 0;
The const qualifier prevents accidental redefinition of NULL and ensures that NULL can be used where a constant is required.

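That is, defining a NULL macro for the null pointer is a popular practice in C, but because C++ checks types more strictly, plain 0 causes fewer problems than a NULL macro. If you insist on having NULL, define it as const int NULL = 0; the const qualifier prevents NULL from being accidentally redefined and guarantees that NULL can be used wherever a constant is required.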

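Seen this way, 0 is the more flexible and adaptable choice in C++. Which one to use remains a matter of personal preference, and nobody can force anybody ^_^.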

C++ Close Reading: 'Hijack const'

One evening I was idly leafing through page 94 of Bjarne Stroustrup's 'The C++ Programming Language Special Edition' (English edition), section 5.4 Constants, and came across this sentence: 'C++ offers the concept of a user-defined constant, a const, to express the notion that a value doesn't change directly.' The key word is 'directly': if the value cannot be changed directly, let me try changing it indirectly.

That attempt at an indirect change is where the puzzle surfaced. The code:

#include <iostream>

int main() {
    const int a = 2007;              // a constant, which we 'cannot change directly' ^_^
    int *p = const_cast<int*>(&a);   // so we hijack it another way
    *p = 2008;                       // tamper with it

    std::cout << "a = " << a << std::endl;   // expecting 2008
    std::cout << "*p = " << *p << std::endl;
    std::cout << "&a = " << &a << std::endl;
    std::cout << "p = " << p << std::endl;

    return 0;
}

I first compiled it with MinGW g++ on Windows, and the output startled me:
a = 2007
*p = 2008
&a = 0x23ff74
p = 0x23ff74

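I had expected a to be hijacked, yet a was untouched. What is more, the last two lines show that the address of a and the value of p are the same location. Does C++ really protect constants that well, that cleverly? Unconvinced, I tried another platform: I moved the source to Solaris, again with g++, and the output was the same.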

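After much fruitless head-scratching I kept reading the section closely and suddenly noticed these sentences: 'If the compiler knows every use of the const, it need not allocate space to hold it.' … 'The simple and common case is the one in which the value of the constant is known at compile time and no storage needs to be allocated.' Thinking it over, this means that in some places a is handled much like a macro. In std::cout << "a = " << a << std::endl; the operand is a constant expression, so the compiler presumably substitutes 2007 for a directly, which makes the statement equivalent to std::cout << "a = " << 2007 << std::endl;. The statement int *p = const_cast<int*>(&a); on the other hand takes the address of a, so storage has to be allocated for a at that point.

One might object that a is printed after its storage has been allocated and overwritten, so why is 2007 still printed? Look at it from the compiler's side. On parsing const int a = 2007 it sees a constant and first records it in its table of constant symbols; on parsing const_cast<int*>(&a) it allocates memory for a on the stack. When it reaches the statement that prints a, it still consults the constant table first, whereas printing &a is an address-of operation, so the stack address allocated earlier is used there. Strictly speaking, writing to an object that was declared const is undefined behavior in standard C++, so this output shows what g++ happened to do rather than anything the language guarantees.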
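
For contrast, writing through a pointer obtained with const_cast is well-defined when the object it points to was not itself declared const; only the access path is const in that case. A minimal sketch of that situation follows. The names set_through_const and demo_non_const_object are hypothetical helpers introduced for this illustration.

```cpp
// Hypothetical helper: receives a pointer-to-const and writes through it
// after casting the const away. This is only well-defined when the
// pointed-to object was NOT declared const.
void set_through_const(const int* cp, int value) {
    *const_cast<int*>(cp) = value;
}

// Hypothetical demo: the object is an ordinary int, so the write is legal
// and the new value is visible through the original name.
int demo_non_const_object() {
    int year = 2007;                 // not a const object
    set_through_const(&year, 2008);  // fine: the underlying object is modifiable
    return year;
}
```

Here demo_non_const_object() returns 2008, because the compiler has no licence to treat year as a compile-time constant.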

Let us look at another example:

#include <iostream>

int main() {
    int i = 2006;
    const int a = i + 1;             // initialized from a non-constant expression
    int *p = const_cast<int*>(&a);
    *p = 2008;                       // tamper with it

    std::cout << "a = " << a << std::endl;   // expecting 2008
    std::cout << "*p = " << *p << std::endl;
    std::cout << "&a = " << &a << std::endl;
    std::cout << "p = " << p << std::endl;

    return 0;
}

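In this example, const int a = i + 1; initializes the constant a with a non-constant expression, so according to Bjarne Stroustrup storage must be allocated for a. My guess was that the compiler would then not keep an entry for a in its constant table, and indeed, when a is printed below, it really has been hijacked.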

Output:
a = 2008
*p = 2008
&a = 0x23ff70
p = 0x23ff70

And one more example:
#include <iostream>

int main() {
    const int i = 2006;
    const int a = i + 1;             // constant expression, known at compile time
    int *p = const_cast<int*>(&a);
    *p = 2008;                       // tamper with it

    std::cout << "a = " << a << std::endl;   // expecting 2008
    std::cout << "*p = " << *p << std::endl;
    std::cout << "&a = " << &a << std::endl;
    std::cout << "p = " << p << std::endl;

    return 0;
}

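On parsing const int i = 2006 the compiler first records i as a constant in its constant table. const int a = i + 1 is then effectively const int a = 2006 + 1; the compiler folds the expression, obtains a = 2007 as a constant, and records it in the constant table as well. From there on, everything proceeds as in the first example.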
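
Whether the compiler treats a const as a compile-time constant can be checked without any tampering: only a constant expression is accepted as a template argument. The sketch below illustrates the distinction between the last two examples; IntTag, folded_value and runtime_value are hypothetical names used only here.

```cpp
// Hypothetical helper: a template that only accepts a compile-time integer.
template <int N>
struct IntTag {
    static const int value = N;
};

// Both i and a are initialized from constant expressions, so the compiler
// knows a == 2007 at compile time and accepts it as a template argument.
int folded_value() {
    const int i = 2006;
    const int a = i + 1;
    return IntTag<a>::value;
}

// Here i is an ordinary variable, so a is only computed at run time.
// Uncommenting the IntTag line would be a compile error, because a is
// not a constant expression.
int runtime_value() {
    int i = 2006;
    const int a = i + 1;
    // return IntTag<a>::value;   // ill-formed: a is not a compile-time constant
    return a;
}
```

Both functions return 2007, but only in folded_value() is a usable where the language demands a constant, which matches the cases where the printed a stayed at 2007 above.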
