如果不是内存地址,C指针究竟是什么?

关于C的信誉来源,讨论&运算符后给出以下信息:

……有些遗憾的是,这个术语[地址]仍然存在,因为它混淆了那些不知道地址是关于什么的人,并且误导了那些做了这些事情的人:把指针看作是地址,通常会导致悲伤。 。

我曾经读过的其他材料(从同样有名的来源,我会说)总是毫不掩饰地指向指针, &运算符提供内存地址。 我很想继续寻找事情的真相,但是当信誉良好的消息来源不同意的时候,这是有点困难的。

现在我有点困惑 – 什么是一个指针,那么,如果不是一个内存地址?

PS

作者后来说: 我会继续使用“地址”这个词,因为发明一个不同的地方 会更糟。

C标准没有定义内部指针是什么,它是如何在内部工作的。 这是有意的,以便不限制平台的数量,其中C可以作为编译或解释语言来实现。

指针值可以是某种ID或句柄,也可以是多个ID的组合(例如,对x86段和偏移量的问候),而不一定是真实的内存地址。 这个ID可以是任何东西,甚至是固定大小的文本string。 非地址表示可能对C解释器特别有用。

我不确定你的来源,但你描述的语言types来自C标准:

6.5.3.2地址和间接运算符
[…]
3. 一元&运算符产生其操作数的地址。 […]

所以…是的,指针指向内存地址。 至less这就是C标准所暗示的意思。

说得更清楚一点,指针是一个variables,它保存了某个地址 。 一个对象(可以存储在指针中)的地址与一元&运算符一起返回。

我可以将地址“42 Wallaby Way,Sydney”存储在一个variables中(并且该variables将是一个“指针”,但由于这不是一个内存地址,所以不是我们正确调用“指针”的东西)。 您的计算机具有地址来存储内存。 指针存储地址的值(即指针存储值“42 Wallaby Way,Sydney”,这是一个地址)。

编辑:我想扩大Alexey Frunze的评论。

什么是指针? 我们来看看C标准:

6.2.5types
[…]
20. […]
指针types可以从函数types或对象types派生,称为引用types 。 指针types描述了一个对象,它的值提供了引用types实体的引用。 从引用typesT派生的指针types有时被称为“指向T的指针”。 从引用types构造指针types称为“指针types推导”。 指针types是一个完整的对象types。

本质上,指针存储一个提供对某个对象或函数的引用的值。 有点。 指针旨在存储提供对某个对象或函数的引用的值,但情况并非总是如此:

6.3.2.3指针
[…]
5.整数可以转换为任何指针types。 除了之前指定的,结果是实现定义的,可能没有正确alignment,可能不指向引用types的实体,并且可能是陷阱表示。

上面的引用说我们可以把一个整数变成一个指针。 如果我们这样做(也就是说,如果我们将一个整数值填充到一个指针中,而不是指向一个对象或函数的特定引用),那么指针“可能不指向引用types的实体”(即它可能不提供引用一个对象或函数)。 它可能会提供给我们其他的东西。 这是一个地方,你可能会在指针中附加某种句柄或ID(例如,指针不是指向一个对象;它存储了一个代表某事物的值,但该值可能不是一个地址)。

所以是的,正如Alexey Frunze所说,指针可能不会将地址存储到对象或函数中。 指针可能会存储某种“句柄”或ID,而且可以通过为指针指定一些任意的整数值来实现这一点。 这个句柄或ID代表什么取决于系统/环境/上下文。 只要你的系统/实现能够理解价值,你就可以很好的形成(但是取决于具体的价值和特定的系统/实现)。

通常 ,指针将地址存储到对象或函数。 如果它没有存储一个实际的地址(对一个对象或函数),结果是实现定义的(这意味着到底发生了什么以及指针现在代表什么取决于你的系统和实现,所以它可能是一个句柄或ID一个特定的系统,但在另一个系统上使用相同的代码/值可能会导致程序崩溃)。

最终比我想象的要长

指针与变量

在这幅图片中,

pointer_p是位于0x12345的指针,指向0x34567处的variablesvariable_v。

把一个指针看作一个地址是一个近似值 。 像所有的近似值一样,有时候有用也是有用的,但也不是确切的,这意味着依靠它会造成麻烦。

指针就像是一个地址,它指明了在哪里find一个对象。 这种类比的一个直接的限制是并不是所有的指针实际上都包含一个地址。 NULL是一个不是地址的指针。 指针variables的内容实际上可以是以下三种之一:

  • 一个对象的地址 ,可以被解除引用(如果p包含x的地址,则expression式*p的值与x相同);
  • 一个空指针 ,其中NULL是一个例子;
  • 无效的内容,它不指向一个对象(如果p不具有有效的值,那么*p可以做任何事情(“未定义的行为”),崩溃程序是一个相当普遍的可能性)。

此外,说一个指针(如果有效的和非空的) 包含一个地址会更准确:一个指针指示在哪里find一个对象,但是有更多的信息与它相关联。

特别是一个指针有一个types。 在大多数平台上,指针的types在运行时没有任何影响,但在编译时其影响超出了types。 如果p是一个指向intint *p; )的指针,那么p + 1指向p之后的sizeof(int)个字节的整数(假设p + 1仍然是一个有效的指针)。 如果q是指向与pchar *q = p; )相同的地址的char指针,则q + 1p + 1地址不同。 如果将指针看作是地址,那么对于指向同一位置的不同指针,“下一个地址”是不同的。

在某些环境中,可能有多个指针值与不同的表示(内存中的不同位模式)指向内存中的相同位置。 你可以把它们看作是拥有相同地址的不同地址,或者是同一地址的不同地址 – 在这种情况下,隐喻是不清楚的。 ==运算符总是告诉你两个操作数是否指向相同的位置,所以在这些环境中,即使pq有不同的位模式,也可以有p == q

甚至有指针携带地址之外的其他信息的环境,例如types或权限信息。 你可以很容易地通过你的程序员而不会遇到这些。

有不同types的指针有不同的表示的环境。 您可以将其视为具有不同表示forms的不同types的地址。 例如,一些体系结构具有字节指针和字指针,或者对象指针和函数指针。

总而言之,只要你牢记这一点,把指针当作地址就不算太坏

  • 它只是有效的,非空的地址指针;
  • 你可以有多个地址的相同的位置;
  • 你不能对地址进行算术运算,而且也没有对它们的顺序。
  • 指针还带有types信息。

反过来是更麻烦的。 不是所有看起来像地址的东西都可以成为指针 。 某处深处的任何指针都表示为一个可以作为整数读取的位模式,并且可以说这个整数是一个地址。 但是换个angular度来看,并不是每个整数都是一个指针。

首先有一些众所周知的局限性; 例如,指定程序地址空间外的位置的整数不能是有效的指针。 未alignment的地址不会为需要alignment的数据types生成有效的指针; 例如,在int需要4字节alignment的平台上,0x7654321不能是有效的int*值。

然而,这远远不止如此,因为当你把一个指针变成一个整数时,你就陷入了一个麻烦的世界。 这个麻烦的很大一部分是优化编译器在微观优化方面比大多数程序员期望的要好得多,因此他们关于程序如何工作的心理模型是非常错误的。 只是因为你有指向同一地址的指针并不意味着它们是相同的。 例如,请考虑下面的代码片段:

 unsigned int x = 0; unsigned short *p = (unsigned short*)&x; p[0] = 1; printf("%u = %u\n", x, *p); 

你可能会期望在sizeof(int)==4sizeof(short)==2上打印1 = 1? (小端)还是65536 = 1? (大端)。 但在我的64位Linux PC上使用GCC 4.4:

 $ c99 -O2 -Wall ac && ./a.out ac: In function 'main': ac:6: warning: dereferencing pointer 'p' does break strict-aliasing rules ac:5: note: initialized from here 0 = 1? 

海湾合作委员会是善意的警告我们这个简单的例子出了什么问题 – 在更复杂的例子,编译器可能不会注意到。 由于p&x有不同的types,因此改变p指向的内容不会影响&x指向的内容(除了一些明确定义的exception外)。 因此,编译器可以自由地将x的值保存在一个寄存器中,而不会随着*p改变而更新这个寄存器。 该程序将两个指针解引用到相同的地址,并获得两个不同的值!

这个例子的道德是,只要你保持在C语言的精确规则之内,把一个(非空有效)指针作为一个地址的想法是很好的。 硬币的另一面是,C语言的规则是复杂的,很难得到一个直观的感觉,除非你知道发生了什么。 而底层是指针和地址之间的联系有些松散,既支持“异国情调”的处理器体系结构,也支持优化编译器。

所以把指针作为地址作为你理解的第一步,但不要太直觉。

一个指针是一个variables,它提供了内存地址,而不是地址本身。 但是,您可以取消引用指针 – 并访问内存位置。

例如:

 int q = 10; /*say q is at address 0x10203040*/ int *p = &q; /*means let p contain the address of q, which is 0x10203040*/ *p = 20; /*set whatever is at the address pointed by "p" as 20*/ 

而已。 就这么简单。

在这里输入图像说明

一个程序来演示我在说什么和它的输出在这里:

http://ideone.com/rcSUsb

该程序:

 #include <stdio.h> int main(int argc, char *argv[]) { /* POINTER AS AN ADDRESS */ int q = 10; int *p = &q; printf("address of q is %p\n", (void *)&q); printf("p contains %p\n", (void *)p); p = NULL; printf("NULL p now contains %p\n", (void *)p); return 0; } 

很难准确地说出这些书的作者究竟是什么意思。 指针是否包含地址取决于您如何定义地址以及如何定义指针。

从所有写的答案来看,有人认为(1)一个地址必须是一个整数,(2)一个指针不需要是在规范中没有说的虚拟的。 有了这些假设,那么明确的指针就不一定包含地址。

然而,我们可以看到,虽然(2)可能是正确的,但(1)可能不一定是正确的。 而根据@CyStart的回答,&被称为操作符地址的事实是什么? 这是否意味着规范的作者打算使用一个包含地址的指针?

所以我们可以说,指针包含一个地址,但地址不一定是一个整数? 也许。

我认为所有这些都是乱七八糟的迂腐的语义谈话。 这实际上是毫无价值的。 你能想到一个编译器以这样的方式生成代码:指针的值不是地址吗? 如果是这样,什么? 我也这么想…

我想这本书的作者(指出指针不一定只是地址的第一个摘录)可能指的是指针带有固有的types信息。

例如,

  int x; int* y = &x; char* z = &x; 

y和z都是指针,但是y + 1和z + 1是不同的。 如果他们是内存地址,这些expression式不会给你相同的值?

而在这里, 关于指针思考就好像它们是地址一样,通常会导致悲伤 。 错误是由于人们把指针看作是地址产生的 ,这通常会导致悲伤

55555可能不是一个指针,虽然它可能是一个地址,但是(int *)55555是一个指针。 55555 + 1 = 55556,但是(int *)55555 + 1是55559(+/-在sizeof(int)方面的差异)。

那么,指针是代表内存位置的抽象 。 请注意,引用并不是说把指针看作是内存地址是错误的,它只是说它“通常会导致悲伤”。 换句话说,它会导致你有不正确的期望。

最可能的悲伤源头当然是指针算术,这实际上是C的优势之一。 如果一个指针是一个地址,你会希望指针算术是地址算术; 但事实并非如此。 例如,给地址加10应该给你一个大于10个寻址单元的地址; 但是向一个指针加10会使它增加10倍于它指向的对象的大小(甚至不是实际的大小,而是向上取整到一个alignment边界)。 在一个32位整数的普通体系结构上,如果加一个int * ,那么它将增加40个寻址单元(字节)。 有经验的C程序员已经意识到这一点,并与之共存,但你的作者显然不是马虎的隐喻的粉丝。

还有一个额外的问题, 指针的内容如何表示内存位置:正如许多答案所解释的,地址并不总是一个int(或long)。 在某些体系结构中,地址是“段”加上偏移量。 一个指针可能甚至只包含当前段(“near”指针)的偏移量,它本身不是唯一的内存地址。 而硬件了解它的指针内容可能只与内存地址有间接关系。 但引用的作者甚至没有提及表示,所以我认为这是他们想到的概念上的对等,而不是表示。

以下是我过去向一些困惑的人解释过的:指针有两个属性影响它的行为。 它具有一个 (在典型的环境中)一个内存地址和一个types ,它告诉你指向的对象的types和大小。

例如,给出:

 union { int i; char c; } u; 

你可以有三个不同的指针指向同一个对象:

 void *v = &u; int *i = &u.i; char *c = &u.c; 

如果你比较这些指针的值,它们都是平等的:

 v==i && i==c 

但是,如果您增加每个指针,您将看到它们指向的types变得相关。

 i++; c++; // You can't perform arithmetic on a void pointer, so no v++ i != c 

variablesic在这一点上将有不同的值,因为i++导致i包含下一个可访问整数的地址,而c++导致c指向下一个可寻址的字符。 通常情况下,整数比字符占用更多的内存,所以i会最终得到一个比c增大后的值大于c

像C中的任何其他variables一样,指针基本上是可以由一个或多个级联的unsigned char值表示的位的集合(与任何其他types的可指派一样, sizeof(some_variable)将指示unsigned char值的数量) 。 使指针与其他variables不同的是,C编译器会将指针中的位解释为识别variables可能存储的位置。 在C语言中,与其他语言不同的是,可以为多个variables请求空间,然后将该指针中的任何值转换为该集合中任何其他variables的指针。

许多编译器通过使用它们的位来存储实际的机器地址来实现指针,但这不是唯一可能的实现。 一个实现可以保留一个数组(用户代码无法访问),列出程序使用的所有内存对象(variables集)的硬件地址和分配的大小,并且每个指针都包含一个索引与该指数的偏移。 这样的devise将允许系统不仅将代码限制为只对其拥有的内存进行操作,而且还确保指向一个内存项目的指针不会被意外地转换为指向另一个内存项目的指针(在使用硬件的系统中地址,如果foobar是连续存储在内存中的10个项目的数组,则指向foo的“第11个”项目的指针可以指向第一个项目,但是在每个“指针”是对象的系统中ID和偏移量,如果代码尝试将指针指向foo超出其分配的范围,则系统可能会陷入陷阱)。 这样的系统也可能消除内存碎片问题,因为与任何指针相关的物理地址都可以移动。

请注意,虽然指针有些抽象,但它们不够抽象,不足以允许完全符合标准的C编译器实现垃圾回收器。 C编译器指定每个variables(包括指针)都表示为一个unsigned char值序列。 给定任何variables,可以将其分解为一系列数字,然后将该数字序列转换回原始types的variables。 因此,程序可以calloc一些存储(接收一个指针),在那里存储某些东西,将指针分解成一系列字节,在屏幕上显示它们,然后删除所有对它们的引用。 如果程序从键盘上接受了一些数字,将它们重新组合成一个指针,然后试图从指针中读取数据,如果用户input的数字与之前显示的程序相同,那么程序将被要求输出数据已经存储在calloc的内存中。 由于计算机不可能知道用户是否已经复制了所显示的数字,所以计算机可能不知道是否将来可能访问上述存储器。

Mark Bessey already said it, but this needs to be re-emphasised until understood.

Pointer has as much to do with a variable than a literal 3.

Pointer is a tuple of a value (of an address) and a type (with additional properties, such as read only). The type (and the additional parameters if any) can further define or restrict the context; 例如。 __far ptr, __near ptr : what is the context of the address: stack, heap, linear address, offset from somewhere, physical memory or what.

It's the property of type that makes pointer arithmetic a bit different to integer arithmetic.

The counter examples of a pointer of not being a variable are too many to ignore

  • fopen returning a FILE pointer. (where's the variable)
  • stack pointer or frame pointer being typically unaddressable registers

    *(int *)0x1231330 = 13; — casting an arbitrary integer value to a pointer_of_integer type and writing/reading an integer without ever introducing a variable

In the lifetime of a C-program there will be many other instances of temporary pointers that do not have addresses — and therefore they are not variables, but expressions/values with a compile time associated type.

You are right and sane. Normally, a pointer is just an address, so you can cast it to integer and do any arithmetics.

But sometimes pointers are only a part of an address. On some architectures a pointer is converted to an address with addition of base or another CPU register is used.

But these days, on PC and ARM architecture with a flat memory model and C language natively compiled, it's OK to think that a pointer is an integer address to some place in one-dimensional addressable RAM.

A pointer is a variable type that is natively available in C/C++ and contains a memory address. Like any other variable it has an address of its own and takes up memory (the amount is platform specific).

One problem you will see as a result of the confusion is trying to change the referent within a function by simply passing the pointer by value. This will make a copy of the pointer at function scope and any changes to where this new pointer "points" will not change the referent of the pointer at the scope that invoked the function. In order to modify the actual pointer within a function one would normally pass a pointer to a pointer.

BRIEF SUMMARY (which I will also put at the top):

(0) Thinking of pointers as addresses is often a good learning tool, and is often the actual implementation for pointers to ordinary data types.

(1) But on many, perhaps most, compilers pointers to functions are not addresses, but are bigger than an address (typically 2x, sometimes more), or are actually pointers to a struct in memory than contains the addresses of function and stuff like a constant pool.

(2) Pointers to data members and pointers to methods are often even stranger.

(3) Legacy x86 code with FAR and NEAR pointer issues

(4) Several examples, most notably the IBM AS/400, with secure "fat pointers".

I am sure you can find more.

DETAIL:

UMMPPHHH!!!!! Many of the answers so far are fairly typical "programmer weenie" answers – but not compiler weenie or hardware weenie. Since I pretend to be a hardware weenie, and often work with compiler weenies, let me throw in my two cents:

On many, probably most, C compilers, a pointer to data of type T is, in fact, the address of T .

Fine.

But, even on many of these compilers, certain pointers are NOT addresses. You can tell this by looking at sizeof(ThePointer) .

For example, pointers to functions are sometimes quite a lot bigger than ordinary addresses. Or, they may involve a level of indirection. This article provides one description, involving the Intel Itanium processor, but I have seen others. Typically, to call a function you must know not only the address of the function code, but also the address of the function's constant pool – a region of memory from which constants are loaded with a single load instruction, rather than the compiler having to generate a 64 bit constant out of several Load Immediate and Shift and OR instructions. So, rather than a single 64 bit address, you need 2 64 bit addresses. Some ABIs (Application Binary Interfaces) move this around as 128 bits, whereas others use a level of indirection, with the function pointer actually being the address of a function descriptor that contains the 2 actual addresses just mentioned. Which is better? Depends on your point of view: performance, code size, and some compatibility issues – often code assumes that a pointer can be cast to a long or a long long, but may also assume that the long long is exactly 64 bits. Such code may not be standards compliant, but nevertheless customers may want it to work.

Many of us have painful memories of the old Intel x86 segmented architecture, with NEAR POINTERs and FAR POINTERS. Thankfully these are nearly extinct by now, so only a quick summary: in 16 bit real mode, the actual linear address was

 LinearAddress = SegmentRegister[SegNum].base << 4 + Offset 

Whereas in protected mode, it might be

 LinearAddress = SegmentRegister[SegNum].base + offset 

with the resulting address being checked against a limit set in the segment. Some programs used not really standard C/C++ FAR and NEAR pointer declarations, but many just said *T — but there were compiler and linker switches so, for example, code pointers might be near pointers, just a 32 bit offset against whatever is in the CS (Code Segment) register, while the data pointers might be FAR pointers, specifying both a 16 bit segment number and a 32 bit offset for a 48 bit value. Now, both of these quantities are certainly related to the address, but since they aren't the same size, which of them is the address? Moreover, the segments also carried permissions – read-only, read-write, executable – in addition to stuff related to the actual address.

A more interesting example, IMHO, is (or, perhaps, was) the IBM AS/400 family. This computer was one of the first to implement an OS in C++. Pointers on this machime were typically 2X the actual address size – eg as this presentation says, 128 bit pointers, but the actual addresses were 48-64 bits, and, again, some extra info, what is called a capability, that provided permissions such as read, write, as well as a limit to prevent buffer overflow. Yes: you can do this compatibly with C/C++ — and if this were ubiquitous, the Chinese PLA and slavic mafia would not be hacking into so many Western computer systems. But historically most C/C++ programming has neglected security for performance. Most interestingly, the AS400 family allowed the operating system to create secure pointers, that could be given to unprivileged code, but which the unprivileged code could not forge or tamper with. Again, security, and while standards compliant, much sloppy non-standards compliant C/C++ code will not work in such a secure system. Again, there are official standards, and there are de-facto standards.

Now, I'll get off my security soapbox, and mention some other ways in which pointers (of various types) are often not really addresses: Pointers to data members, pointers to member functions methods, and the static versions thereof are bigger than an ordinary address. As this post says:

There are many ways of solving this [problems related to single versus multiple inheitance, and virtual inheritance]. Here's how the Visual Studio compiler decides to handle it: A pointer to a member function of a multiply-inherited class is really a structure." And they go on to say "Casting a function pointer can change its size!".

As you can probably guess from my pontificating on (in)security, I've been involved in C/C++ hardware/software projects where a pointer was treated more like a capability than a raw address.

I could go on, but I hope you get the idea.

BRIEF SUMMARY (which I will also put at the top):

(0) thinking of pointers as addresses is often a good learning tool, and is often the actual implementation for pointers to ordinary data types.

(1) But on many, perhaps most, compilers pointers to functions are not addresses, but are bigger than an address (typically 2X, sometimes more), or are actually pointers to a struct in memory than contains the addresses of function and stuff like a constant pool.

(2) Pointers to data members and pointers to methods are often even stranger.

(3) Legacy x86 code with FAR and NEAR pointer issues

(4) Several examples, most notably the IBM AS/400, with secure "fat pointers".

I am sure you can find more.

A pointer is just another variable which is used to hold the address of a memory location (usually the memory address of another variable).

You can see it this way. A pointer is a value that represents an address in the addressable memory space.

A pointer is just another variable that can contain memory address usually of another variable. A pointer being a variable it too has an memory address.

AC pointer is very similar to a memory address but with machine-dependent details abstracted away, as well as some features not found in the lower level instruction set.

For example, a C pointer is relatively richly typed. If you increment a pointer through an array of structures, it nicely jumps from one structure to the other.

Pointers are subject to conversion rules and provide compile time type checking.

There is a special "null pointer" value which is portable at the source code level, but whose representation may differ. If you assign an integer constant whose value is zero to a pointer, that pointer takes on the null pointer value. Ditto if you initialize a pointer that way.

A pointer can be used as a boolean variable: it tests true if it is other than null, and false if it is null.

In a machine language, if the null pointer is a funny address like 0xFFFFFFFF, then you may have to have explicit tests for that value. C hides that from you. Even if the null pointer is 0xFFFFFFFF, you can test it using if (ptr != 0) { /* not null! */} .

Uses of pointers which subvert the type system lead to undefined behavior, whereas similar code in machine language might be well defined. Assemblers will assemble the instructions you have written, but C compilers will optimize based on the assumption that you haven't done anything wrong. If a float *p pointer points to a long n variable, and *p = 0.0 is executed, the compiler is not required to handle this. A subsequent use of n will not necessary read the bit pattern of the float value, but perhaps, it will be an optimized access which is based on the "strict aliasing" assumption that n has not been touched! That is, the assumption that the program is well-behaved, and so p should not be pointing at n .

In C, pointers to code and pointers to data are different, but on many architectures, the addresses are the same. C compilers can be developed which have "fat" pointers, even though the target architecture does not. Fat pointers means that pointers are not just machine addresses, but contain other info, such as information about the size of the object being pointed at, for bounds checking. Portably written programs will easily port to such compilers.

So you can see, there are many semantic differences between machine addresses and C pointers.

Before understanding pointers we need to understand objects. Objects are entities which exist and has a location specifier called an address. A pointer is just a variable like any other variables in C with a type called pointer whose content is interpreted as the address of an object which supports the following operation.

 + : A variable of type integer (usually called offset) can be added to yield a new pointer - : A variable of type integer (usually called offset) can be subtracted to yield a new pointer : A variable of type pointer can be subtracted to yield an integer (usually called offset) * : De-referencing. Retrieve the value of the variable (called address) and map to the object the address refers to. ++: It's just `+= 1` --: It's just `-= 1` 

A pointer is classified based on the type of object it is currently referring. The only part of the information it matters is the size of the object.

Any object supports an operation, & (address of), which retrieves the location specifier (address) of the object as a pointer object type. This should abate the confusion surrounding the nomenclature as this would make sense to call & as an operation of an object rather than a pointer whose resultant type is a pointer of the object type.

Note Throughout this explanation, I have left out the concept of memory.

An address is used to identify a piece of fixed-size storage, usually for each bytes, as an integer. This is precisely called as byte address , which is also used by the ISO C. There can be some other methods to construct an address, eg for each bit. However, only byte address is so often used, we usually omit "byte".

Technically, an address is never a value in C, because the definition of term "value" in (ISO) C is:

precise meaning of the contents of an object when interpreted as having a specific type

(Emphasized by me.) However, there is no such "address type" in C.

Pointer is not the same. Pointer is a kind of type in the C language. There are several distinct pointer types. They does not necessarily obey to identical set of rules of the language, eg the effect of ++ on a value of type int* vs. char* .

A value in C can be of a pointer type. This is called a pointer value . To be clear, a pointer value is not a pointer in the C language. But we are accustomed to mix them together, because in C it is not likely to be ambiguous: if we call an expression p as a "pointer", it is merely a pointer value but not a type, since a named type in C is not expressed by an expression , but by a type-name or a typedef-name .

Some other things are subtle. As a C user, firstly, one should know what object means:

region of data storage in the execution environment, the contents of which can represent values

An object is an entity to represent values, which are of a specific type. A pointer is an object type . So if we declare int* p; , then p means "an object of pointer type", or an "pointer object".

Note there is no "variable" normatively defined by the standard (in fact it is never being used as a noun by ISO C in normative text). However, informally, we call an object a variable, as some other language does. (But still not so exactly, eg in C++ a variable can be of reference type normatively, which is not an object.) The phrases "pointer object" or "pointer variable" are sometimes treated like "pointer value" as above, with a probable slight difference. (One more set of examples is "array".)

Since pointer is a type, and address is effectively "typeless" in C, a pointer value roughly "contains" an address. And an expression of pointer type can yield an address, eg

ISO C11 6.5.2.3

3 The unary & operator yields the address of its operand.

Note this wording is introduced by WG14/N1256, ie ISO C99:TC3. In C99 there is

3 The unary & operator returns the address of its operand.

It reflects the committee's opinion: an address is not a pointer value returned by the unary & operator.

Despite the wording above, there are still some mess even in the standards.

ISO C11 6.6

9 An address constant is a null pointer, a pointer to an lvalue designating an object of static storage duration, or a pointer to a function designator

ISO C++11 5.19

3 … An address constant expression is a prvalue core constant expression of pointer type that evaluates to the address of an object with static storage duration, to the address of a function, or to a null pointer value, or a prvalue core constant expression of type std::nullptr_t . …

(Recent C++ standard draft uses another wording so there is no this problem.)

Actually both "address constant" in C and "address constant expression" in C++ are constant expression of pointer types (or at least "pointer-like" types since C++11).

And the builtin unary & operator is called as "address-of" in C and C++; similarily, std::addressof is introduced in C++11.

These naming may bring misconception. The resulted expression is of pointer type, so they'd be interpreted as: the result contains/yields an address, rather than is an address.

It says "because it confuses those who don't know what addresses are about" – also, it's true: if you learn what addresses are about, you'll be not confused. Theoretically, pointer is a variable which points to another, practically holds an address, which is the address of the variable it points to. I don't know why should hide this fact, it's not a rocket science. If you understand pointers, you'll one step closer to understand how computers work. Go ahead!

Come to think about it, I think it's a matter of semantics. I don't think the author is right, since the C standard refers to a pointer as holding an address to the referenced object as others have already mentioned here. However, address!=memory address. An address can be really anything as per C standard although it will eventually lead to a memory address, the pointer itself can be an id, an offset + selector (x86), really anything as long as it can describe (after mapping) any memory address in the addressable space.

One other way in which a C or C++ pointer differs from a simple memory address due to the different pointer types I haven't seen in the other answers (altrhough given their total size, I may have overlooked it). But it is probably the most important one, because even experienced C/C++ programmers can trip over it:

The compiler may assume that pointers of incompatible types do not point to the same address even if they clearly do, which may give behaviour that would no be possible with a simple pointer==address model. Consider the following code (assuming sizeof(int) = 2*sizeof(short) ):

 unsigned int i = 0; unsigned short* p = (unsigned short*)&i; p[0]=p[1]=1; if (i == 2 + (unsigned short)(-1)) { // you'd expect this to execute, but it need not } if (i == 0) { // you'd expect this not to execute, but it actually may do so } 

Note that there's an exception for char* , so manipulating values using char* is possible (although not very portable).

Quick summary: AC address is a value, typically represented as a machine-level memory address, with a specific type.

The unqualified word "pointer" is ambiguous. C has pointer objects (variables), pointer types , pointer expressions , and pointer values .

It's very common to use the word "pointer" to mean "pointer object", and that can lead to some confusion — which is why I try to use "pointer" as an adjective rather than as a noun.

The C standard, at least in some cases, uses the word "pointer" to mean "pointer value". For example, the description of malloc says it "returns either a null pointer or a pointer to the allocated space".

So what's an address in C? It's a pointer value, ie, a value of some particular pointer type. (Except that a null pointer value is not necessarily referred to as an "address", since it isn't the address of anything).

The standard's description of the unary & operator says it "yields the address of its operand". Outside the C standard, the word "address" is commonly used to refer to a (physical or virtual) memory address, typically one word in size (whatever a "word" is on a given system).

AC "address" is typically implemented as a machine address — just as a C int value is typically implemented as a machine word. But a C address (pointer value) is more than just a machine address. It's a value typically represented as a machine address, and it's a value with some specific type .

A pointer value is an address. A pointer variable is an object that can store an address. This is true because that's what the standard defines a pointer to be. It's important to tell it to C novices because C novices are often unclear on the difference between a pointer and the thing it points to (that is to say, they don't know the difference between an envelope and a building). The notion of an address (every object has an address and that's what a pointer stores) is important because it sorts that out.

However, the standard talks at a particular level of abstraction. Those people the author talks about who "know what addresses are about", but who are new to C, must necessarily have learned about addresses at a different level of abstraction — perhaps by programming assembly language. There is no guarantee that the C implementation uses the same representation for addresses as the CPUs opcodes use (referred to as "the store address" in this passage), that these people already know about.

He goes on to talk about "perfectly reasonable address manipulation". As far as the C standard is concerned there's basically no such thing as "perfectly reasonable address manipulation". Addition is defined on pointers and that is basically it. Sure, you can convert a pointer to integer, do some bitwise or arithmetic ops, and then convert it back. This is not guaranteed to work by the standard, so before writing that code you'd better know how your particular C implementation represents pointers and performs that conversion. It probably uses the address representation you expect, but it it doesn't that's your fault because you didn't read the manual. That's not confusion, it's incorrect programming procedure 😉

In short, C uses a more abstract concept of an address than the author does.

The author's concept of an address of course is also not the lowest-level word on the matter. What with virtual memory maps and physical RAM addressing across multiple chips, the number that you tell the CPU is "the store address" you want to access has basically nothing to do with where the data you want is actually located in hardware. It's all layers of indirection and representation, but the author has chosen one to privilege. If you're going to do that when talking about C, choose the C level to privilege !

Personally I don't think the author's remarks are all that helpful, except in the context of introducing C to assembly programmers. It's certainly not helpful to those coming from higher level languages to say that pointer values aren't addresses. It would be far better to acknowledge the complexity than it is to say that the CPU has the monopoly on saying what an address is and thus that C pointer values "are not" addresses. They are addresses, but they may be written in a different language from the addresses he means. Distinguishing the two things in the context of C as "address" and "store address" would be adequate, I think.

Simply to say pointers are actually offset part of the segmentation mechanism which translate to Linear Address after segmentation and then to Physical address after paging. Physical Addresses are actually addressed from you ram.

  Selector +--------------+ +-----------+ ---------->| | | | | Segmentation | ------->| Paging | Offset | Mechanism | | Mechanism | ---------->| | | | +--------------+ +-----------+ Virtual Linear Physical