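Why are mutable structs "evil"?

Following the discussions here, I have already read several times the remark that mutable structs are "evil" (like in the answer to this question).

What's the actual problem with mutability and structs in C#?

Structs are value types, which means they are copied when they are passed around.

So if you change a copy, you are changing only that copy, not the original and not any other copies that might be around.

If your struct is immutable, then all automatic copies resulting from being passed by value will be the same.

If you want to change it, you have to do it consciously by creating a new instance of the struct with the modified data (not a copy).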
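A minimal sketch of the point above. The type names `MutablePoint` and `ImmutablePoint` and the `WithX` helper are made up for illustration; they are not from any library.

```csharp
using System;

// Hypothetical mutable struct, for illustration only.
public struct MutablePoint
{
    public int X;
    public int Y;
}

// Hypothetical immutable counterpart: "changing" it means building a new value.
public readonly struct ImmutablePoint
{
    public int X { get; }
    public int Y { get; }

    public ImmutablePoint(int x, int y) { X = x; Y = y; }

    // Returns a new instance with the modified data; the receiver is untouched.
    public ImmutablePoint WithX(int x) => new ImmutablePoint(x, Y);
}

public static class CopyDemo
{
    // The parameter is a copy, so the caller's value is not affected.
    public static void TryToMove(MutablePoint p) { p.X = 100; }

    public static void Main()
    {
        var original = new MutablePoint { X = 1, Y = 2 };
        TryToMove(original);
        Console.WriteLine(original.X); // 1: only the copy inside TryToMove changed

        var a = new ImmutablePoint(1, 2);
        var b = a.WithX(100);
        Console.WriteLine(a.X); // 1
        Console.WriteLine(b.X); // 100
    }
}
```

With the immutable version there is no way to write the silent no-op that `TryToMove` performs; the only way to get a different value is to ask for a new one.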

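Where to start ;-p

Eric Lippert's blog is always good for a quote:

This is yet another reason why mutable value types are evil. Try to always make value types immutable.

First, you tend to lose changes quite easily... for example, when getting things out of a list: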

 Foo foo = list[0];
 foo.Name = "abc";

What did that change? Nothing useful...

The same with properties:

 myObj.SomeProperty.Size = 22; // the compiler spots this one 

forcing you to do:

 Bar bar = myObj.SomeProperty;
 bar.Size = 22;
 myObj.SomeProperty = bar;

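Less critically, there is a size issue; mutable objects tend to have multiple properties; yet if you have a struct with two ints, a string, a DateTime and a bool, you can very quickly burn through a lot of memory. With a class, multiple callers can share a reference to the same instance (references are small).

I wouldn't say evil, but mutability is often a sign of overeagerness on the part of the programmer to provide a maximum of functionality. In reality, this is often not needed, and leaving it out makes the interface smaller, easier to use and harder to use wrong (= more robust).

One example of this is read/write and write/write conflicts in race conditions. These simply can't occur in immutable structures, since a write is not a valid operation.

Also, I claim that mutability is almost never actually needed; the programmer just thinks that it might be in the future. For example, it simply doesn't make sense to change a date. Rather, create a new date based off the old one. This is a cheap operation, so performance is not a consideration.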
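As a small illustration of "create a new date based off the old one": the framework's own `DateTime` works this way. The `NextDay` helper below is just a made-up wrapper around `DateTime.AddDays`.

```csharp
using System;

public static class DateDemo
{
    // Returns the day after the given date without modifying the input.
    public static DateTime NextDay(DateTime date) => date.AddDays(1);

    public static void Main()
    {
        var release = new DateTime(2020, 2, 28);
        var next = NextDay(release);
        Console.WriteLine(release.Day); // 28: the original value is unchanged
        Console.WriteLine(next.Day);    // 29: 2020 is a leap year
    }
}
```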

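Mutable structs are not evil.

They are absolutely necessary in high-performance circumstances, for example when cache lines and/or garbage collection become a bottleneck.

I would not call the use of a mutable struct in these perfectly valid use cases "evil".

I can see the point that C#'s syntax does not help to distinguish between accessing a member of a value type and a member of a reference type, so I am all for preferring immutable structs, which enforce immutability, over mutable structs.

However, rather than simply labelling mutable structs as "evil", I would advise embracing the language and advocating more helpful and constructive rules of thumb.

For example: "structs are value types, which are copied by default; you need a reference if you don't want to copy them" or "try to work with readonly structs first".

Structs with public mutable fields or properties are not evil.

Struct methods (as distinct from property setters) which mutate "this" are somewhat evil, only because .net doesn't provide a means of distinguishing them from methods which do not. Struct methods that do not mutate "this" should be invokable even on read-only structs without any need for defensive copying. Methods which do mutate "this" should not be invokable at all on read-only structs. Since .net doesn't want to forbid struct methods that don't modify "this" from being invoked on read-only structs, but doesn't want to allow read-only structs to be mutated, it defensively copies structs in read-only contexts, arguably getting the worst of both worlds.

Despite the problems with the handling of self-mutating methods in read-only contexts, however, mutable structs often offer semantics far superior to mutable class types. Consider the following three method signatures: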

 struct PointyStruct {public int x,y,z;};
 class PointyClass {public int x,y,z;};

 void Method1(PointyStruct foo);
 void Method2(ref PointyStruct foo);
 void Method3(PointyClass foo);

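For each method, answer the following questions:

  1. Assuming the method doesn't use any "unsafe" code, might it modify foo?
  2. If no outside references to "foo" exist before the method is called, could an outside reference exist after?

Answers:

Question 1:
Method1(): no (clear intent)
Method2(): yes (clear intent)
Method3(): yes (uncertain intent)
Question 2:
Method1(): no
Method2(): no (unless unsafe)
Method3(): yes

Method1 can't modify foo, and never gets a reference. Method2 gets a short-lived reference to foo, which it can use to modify the fields of foo any number of times, in any order, until it returns, but it can't persist that reference. Before Method2 returns, unless it uses unsafe code, any and all copies that might have been made of its "foo" reference will have disappeared. Method3, unlike Method2, gets a promiscuously-sharable reference to foo, and there's no telling what it might do with it. It might not change foo at all, it might change foo and then return, or it might give a reference to foo to another thread which might mutate it in some arbitrary way at some arbitrary future time. The only way to limit what Method3 might do to a mutable class object passed into it would be to encapsulate the mutable object into a read-only wrapper, which is ugly and cumbersome.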
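To make the three cases concrete, here is a rough sketch with bodies filled in. The bodies and the static `Retained` field are invented for illustration; the original only gives the signatures.

```csharp
using System;

public struct PointyStruct { public int x, y, z; }
public class PointyClass { public int x, y, z; }

public static class SignatureDemo
{
    // Stands in for "somewhere else in the program" that keeps a reference.
    public static PointyClass Retained;

    // Mutates a private copy; the caller never sees it.
    public static void Method1(PointyStruct foo) { foo.x = 1; }

    // Mutates the caller's variable, but only until the method returns.
    public static void Method2(ref PointyStruct foo) { foo.x = 2; }

    // Mutates the object and also keeps a reference for later.
    public static void Method3(PointyClass foo) { foo.x = 3; Retained = foo; }

    public static void Main()
    {
        var s = new PointyStruct();
        Method1(s);
        Console.WriteLine(s.x); // 0
        Method2(ref s);
        Console.WriteLine(s.x); // 2

        var c = new PointyClass();
        Method3(c);
        Retained.x = 99;        // a later change through the retained reference
        Console.WriteLine(c.x); // 99
    }
}
```

Nothing at the call site of `Method3` tells you that the object can still change after the call returns, which is the "uncertain intent" in the answers above.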

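Arrays of structures offer wonderful semantics. Given RectArray[500] of type Rectangle, it's clear and obvious how to e.g. copy element 123 to element 456 and then some time later set the width of element 123 to 555, without disturbing element 456: "RectArray[456] = RectArray[123]; ...; RectArray[123].Width = 555;". Knowing that Rectangle is a struct with an integer field called Width will tell one all one needs to know about the above statements.

Now suppose RectClass was a class with the same fields as Rectangle and one wanted to do the same operations on a RectClassArray[500] of type RectClass. Perhaps the array is supposed to hold 500 pre-initialized immutable references to mutable RectClass objects. In that case, the proper code would be something like "RectClassArray[321].SetBounds(RectClassArray[456]); ...; RectClassArray[321].X = 555;". Perhaps the array is assumed to hold instances that aren't going to change, so the proper code would be more like "RectClassArray[321] = RectClassArray[456]; ...; RectClassArray[321] = new RectClass(RectClassArray[321]); RectClassArray[321].X = 555;". To know what one is supposed to do, one would have to know a lot more both about RectClass (e.g. does it support a copy constructor, a copy-from method, etc.) and about the intended usage of the array. Nowhere near as clean as using a struct.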
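A rough sketch of the difference, assuming a cut-down `RectStruct` and `RectClass` with just two fields each (both are made-up stand-ins for the Rectangle and RectClass types discussed above):

```csharp
using System;

public struct RectStruct { public int X, Width; }
public class RectClass { public int X, Width; }

public static class ArrayDemo
{
    // Returns the width of element 456 after element 123 is changed.
    public static int StructArrayResult()
    {
        var rects = new RectStruct[500];
        rects[123].Width = 10;
        rects[456] = rects[123];  // copies the value
        rects[123].Width = 555;   // writes into element 123 only
        return rects[456].Width;
    }

    // Same statements with a class: the assignment copies a reference.
    public static int ClassArrayResult()
    {
        var rects = new RectClass[500];
        rects[123] = new RectClass { Width = 10 };
        rects[456] = rects[123];  // both slots now point at one object
        rects[123].Width = 555;   // visible through both slots
        return rects[456].Width;
    }

    public static void Main()
    {
        Console.WriteLine(StructArrayResult()); // 10
        Console.WriteLine(ClassArrayResult());  // 555
    }
}
```

The statements are the same in both methods; only the struct version leaves element 456 undisturbed without any extra knowledge about the type.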

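To be sure, there is unfortunately no nice way for any container class other than an array to offer the clean semantics of a struct array. The best one could do, if one wanted a collection to be indexed with e.g. a string, would probably be to offer a generic "ActOnItem" method which would accept a string for the index, a generic parameter, and a delegate which would be passed by reference both the generic parameter and the collection item. That would allow nearly the same semantics as struct arrays, but unless the vb.net and C# people can be persuaded to offer a nice syntax, the code is going to be clunky-looking even if it performs reasonably (passing a generic parameter would allow for use of a static delegate and would avoid any need to create any temporary class instances).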
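For what it's worth, here is one way such an "ActOnItem" method could look. Everything here (`ActByRef`, `StructTable`, `Rect`) is a hypothetical sketch written for this example, not an existing .NET API.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical delegate: receives the stored item and an extra argument, both by reference.
public delegate void ActByRef<TItem, TParam>(ref TItem item, ref TParam param);

// Hypothetical string-indexed collection of structs, for illustration only.
public class StructTable<TItem> where TItem : struct
{
    private readonly Dictionary<string, int> _slots = new Dictionary<string, int>();
    private TItem[] _items = new TItem[4];
    private int _count;

    public void Add(string key, TItem item)
    {
        if (_count == _items.Length) Array.Resize(ref _items, _count * 2);
        _items[_count] = item;
        _slots.Add(key, _count++);
    }

    // Hands the stored element itself (not a copy) to the delegate.
    public void ActOnItem<TParam>(string key, ref TParam param, ActByRef<TItem, TParam> action)
    {
        action(ref _items[_slots[key]], ref param);
    }

    // Returns a copy of the stored element.
    public TItem Get(string key) => _items[_slots[key]];
}

public struct Rect { public int X, Width; }

public static class ActOnItemDemo
{
    public static void Main()
    {
        var table = new StructTable<Rect>();
        table.Add("a", new Rect { X = 1, Width = 10 });

        int newWidth = 555;
        // The lambda captures nothing; the value travels through the generic parameter.
        table.ActOnItem("a", ref newWidth, (ref Rect r, ref int w) => r.Width = w);

        Console.WriteLine(table.Get("a").Width); // 555
    }
}
```

Because the lambda gets `newWidth` through the extra parameter instead of capturing it, no closure object is needed, which is the point about static delegates made above. It works, but it is clearly clunkier than `array[i].Width = 555`.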

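Personally, I'm peeved at the hatred Eric Lippert et al. spew regarding mutable value types. They offer much cleaner semantics than the promiscuous reference types that are used all over the place. Despite some of the limitations with .net's support for value types, there are many cases where mutable value types are a better fit than any other kind of entity.

Value types basically represent immutable concepts. For example, it makes no sense to have a mathematical value such as an integer, a vector etc. and then be able to modify it. That would be like redefining the meaning of a value. Instead of changing a value type, it makes more sense to assign another unique value. Think about the fact that value types are compared by comparing all the values of their properties. The point is that if the properties are the same, then it is the same universal representation of that value.

As Konrad mentions, it doesn't make sense to change a date either, as the value represents that unique point in time and not an instance of a time object which has any state or context dependency.

Hope this makes sense to you. It is more about the concept you try to capture with value types than about practical details, to be sure.

There are a couple of other corner cases that could lead to unpredictable behavior from the programmer's point of view. Here are a couple of them.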

  1. Mutable value types and readonly fields

 using System;

 // Simple mutable structure.
 // Method IncrementI mutates current state.
 struct Mutable
 {
     public Mutable(int i) : this() { I = i; }
     public void IncrementI() { I++; }
     public int I { get; private set; }
 }

 // Simple class that contains Mutable structure
 // as readonly field
 class SomeClass
 {
     public readonly Mutable mutable = new Mutable(5);
 }

 // Simple class that contains Mutable structure
 // as ordinary (non-readonly) field
 class AnotherClass
 {
     public Mutable mutable = new Mutable(5);
 }

 class Program
 {
     static void Main()
     {
         // Case 1. Mutable readonly field
         var someClass = new SomeClass();
         someClass.mutable.IncrementI();
         // Still 5, not 6, because the SomeClass.mutable field is readonly
         // and the compiler creates a temporary copy every time you
         // access this field
         Console.WriteLine(someClass.mutable.I);

         // Case 2. Mutable ordinary field
         var anotherClass = new AnotherClass();
         anotherClass.mutable.IncrementI();
         // Prints 6, because the AnotherClass.mutable field is not readonly
         Console.WriteLine(anotherClass.mutable.I);
     }
 }

  2. Mutable value types and arrays

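Suppose we have an array of our Mutable struct and we call the IncrementI method on the first element of that array. What behavior do you expect from this call? Should it change the value in the array or only a copy?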

 Mutable[] arrayOfMutables = new Mutable[1];
 arrayOfMutables[0] = new Mutable(5);

 // Now we are actually accessing a reference to the first element
 // without making any additional copy
 arrayOfMutables[0].IncrementI();

 // Prints 6!!
 Console.WriteLine(arrayOfMutables[0].I);

 // Every array implements the IList<T> interface
 IList<Mutable> listOfMutables = arrayOfMutables;

 // But accessing values through this interface leads
 // to different behavior: the IList indexer returns a copy
 // instead of a managed reference
 listOfMutables[0].IncrementI(); // Should change I to 7

 // Nope! We still have 6, because the previous line of code
 // mutated a copy instead of a list value
 Console.WriteLine(listOfMutables[0].I);

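So, mutable structs are not evil as long as you and the rest of the team clearly understand what you are doing. But there are too many corner cases where program behavior differs from what is expected, and that can lead to subtle errors that are hard to reproduce and hard to understand.

If you have ever programmed in a language like C/C++, structs are fine to use as mutable. Just pass them around by ref and there is nothing that can go wrong. The only problems I find are the restrictions of the C# compiler and that, in some cases, I am unable to force the stupid thing to use a reference to the struct instead of a copy (like when a struct is part of a C# class).

So, mutable structs are not evil; C# has made them evil. I use mutable structs in C++ all the time and they are very convenient and intuitive. In contrast, C# has made me completely abandon structs as members of classes because of the way they handle objects. Their convenience has cost us ours.

Imagine you have an array of 1,000,000 structs. Each struct represents an equity with stuff like bid_price, offer_price (perhaps decimals) and so on, and the array is created by C#/VB.

Imagine that the array is created in a block of memory allocated in the unmanaged heap so that some other native code thread is able to concurrently access the array (perhaps some high-perf code doing math).

Imagine the C#/VB code is listening to a market feed of price changes; that code may have to access some element of the array (for whichever security) and then modify some price field(s).

Imagine this is being done tens or even hundreds of thousands of times per second.

Well, let's face facts: in this case we really do want these structs to be mutable. They need to be because they are being shared by some other native code, so creating copies isn't going to help; they need to be because making a copy of some 120-byte struct at these rates is lunacy, especially when an update may actually affect just a byte or two.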
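A much simplified sketch of the in-place update. It assumes an ordinary managed array instead of unmanaged memory, and a made-up `Quote` struct that reuses the field names from above; the real layout would be larger.

```csharp
using System;

// Hypothetical layout; only the two field names come from the text above.
public struct Quote
{
    public decimal bid_price;
    public decimal offer_price;
}

public static class QuoteDemo
{
    // Updates one field of one element in place; no Quote is copied.
    public static void UpdateBid(Quote[] quotes, int index, decimal newBid)
    {
        ref Quote q = ref quotes[index]; // ref local: an alias for the array slot
        q.bid_price = newBid;
    }

    public static void Main()
    {
        var quotes = new Quote[1_000_000];
        UpdateBid(quotes, 42, 101.25m);
        // The element itself was updated, not a temporary copy.
        Console.WriteLine(quotes[42].bid_price == 101.25m); // True
    }
}
```

With an immutable struct the same update would mean reading the whole element, building a new one, and writing it back, which is exactly the copying this answer wants to avoid.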

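Hugo

If you stick to what structs are intended for (in C#, Visual Basic 6, Pascal/Delphi, C++ struct types (or classes) when they are not used as pointers), you will find that a structure is nothing more than a compound variable. This means you will treat them as a packed set of variables under a common name (a record variable you reference members from).

I know that would confuse a lot of people deeply used to OOP, but that's not enough reason to say such things are inherently evil, if used correctly. Some structures are immutable by design (this is the case for Python's namedtuple), but that is another paradigm to consider.

Yes, structs take up memory, but you will not use more of it by doing: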

 point.x = point.x + 1 

compared to:

 point = Point(point.x + 1, point.y) 

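The memory consumption will be at least the same, or even more in the immutable case (although that extra would be temporary, on the current stack, depending on the language).

But, finally, structures are structures, not objects. In OOP, the main property of an object is its identity, which most of the time is nothing more than its memory address. Struct stands for data structure (not a proper object, so it doesn't have identity anyhow), and data can be modified. In other languages, record (instead of struct, as is the case for Pascal) is the word and serves the same purpose: just a data record variable, intended to be read from files, modified, and dumped into files (that is the main use and, in many languages, you can even define data alignment in the record, while that's not necessarily the case for properly called objects).

Want a good example? Structs are used to read files easily. Python has this library because, since it is object-oriented and has no support for structs, it had to implement them in another way, which is somewhat ugly. Languages implementing structs have that feature built in. Try reading a bitmap header with an appropriate struct in languages like Pascal or C. It will be easy (if the struct is properly built and aligned; in Pascal you would not use record-based access but functions to read arbitrary binary data). So, for files and direct (local) memory access, structs are better than objects. As for today, we're used to JSON and XML, and so we forget the use of binary files (and as a side effect, the use of structs). But yes: they exist, and have a purpose.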

They are not evil. Just use them for the right purpose.

If you think in terms of hammers, you will want to treat screws as nails, to find screws are harder to plunge in the wall, and it will be screws' fault, and they will be the evil ones.

When something can be mutated, it gains a sense of identity.

 struct Person
 {
     public string name; // mutable
     public Point position = new Point(0, 0); // mutable

     public Person(string name, Point position) { ... }
 }

 Person eric = new Person("Eric Lippert", new Point(4, 2));

Because Person is mutable, it's more natural to think about changing Eric's position than cloning Eric, moving the clone, and destroying the original . Both operations would succeed in changing the contents of eric.position , but one is more intuitive than the other. Likewise, it's more intuitive to pass Eric around (as a reference) for methods to modify him. Giving a method a clone of Eric is almost always going to be surprising. Anyone wanting to mutate Person must remember to ask for a reference to Person or they'll be doing the wrong thing.

If you make the type immutable, the problem goes away; if I can't modify eric , it makes no difference to me whether I receive eric or a clone of eric . More generally, a type is safe to pass by value if all of its observable state is held in members that are either:

  • immutable
  • reference types
  • safe to pass by value

If those conditions are met then a mutable value type behaves like a reference type because a shallow copy will still allow the receiver to modify the original data.

The intuitiveness of an immutable Person depends on what you're trying to do though. If Person just represents a set of data about a person, there's nothing unintuitive about it; Person variables truly represent abstract values , not objects. (In that case, it'd probably be more appropriate to rename it to PersonData .) If Person is actually modeling a person itself, the idea of constantly creating and moving clones is silly even if you've avoided the pitfall of thinking you're modifying the original. In that case it'd probably be more natural to simply make Person a reference type (that is, a class.)

Granted, as functional programming has taught us there are benefits to making everything immutable (no one can secretly hold on to a reference to eric and mutate him), but since that's not idiomatic in OOP it's still going to be unintuitive to anyone else working with your code.

It doesn't have anything to do with structs (and not with C#, either) but in Java you might get problems with mutable objects when they are eg keys in a hash map. If you change them after adding them to a map and it changes its hash code , evil things might happen.
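The same thing can be reproduced in C# with a `Dictionary`. This is a minimal sketch, assuming a made-up `MutableKey` class whose hash code depends on a mutable field:

```csharp
using System;
using System.Collections.Generic;

// Hypothetical mutable key whose hash code depends on mutable state.
public class MutableKey
{
    public int Id;

    public override bool Equals(object obj) => obj is MutableKey other && other.Id == Id;
    public override int GetHashCode() => Id;
}

public static class HashKeyDemo
{
    // Returns whether the key can still be found after it was mutated.
    public static bool LookupAfterMutation()
    {
        var key = new MutableKey { Id = 1 };
        var map = new Dictionary<MutableKey, string> { [key] = "value" };

        key.Id = 2; // the hash code recorded at insertion time is now stale
        return map.ContainsKey(key);
    }

    public static void Main()
    {
        Console.WriteLine(LookupAfterMutation()); // False
    }
}
```

The entry is still in the dictionary, but it can no longer be found through the very object that was used to insert it.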

Personally when I look at code the following looks pretty clunky to me:

data.value.set ( data.value.get () + 1 ) ;

rather than simply

data.value++ ; or data.value = data.value + 1 ;

Data encapsulation is useful when passing a class around and you want to ensure the value is modified in a controlled fashion. However when you have public set and get functions that do little more than set the value to what ever is passed in, how is this an improvement over simply passing a public data structure around?

When I create a private structure inside a class, I created that structure to organize a set of variables into one group. I want to be able to modify that structure within the class scope, not get copies of that structure and create new instances.

To me this prevents a valid use of structures being used to organize public variables, if I wanted access control I'd use a class.

There are many advantages and disadvantages to mutable data. The million-dollar disadvantage is aliasing. If the same value is being used in multiple places, and one of them changes it, then it will appear to have magically changed to the other places that are using it. This is related to, but not identical with, race conditions.

The million-dollar advantage is modularity, sometimes. Mutable state can allow you to hide changing information from code that doesn't need to know about it.

The Art of the Interpreter goes into these trade offs in some detail, and gives some examples.

There are several issues with Mr. Eric Lippert's example. It is contrived to illustrate the point that structs are copied and how that could be a problem if you are not careful. Looking at the example I see it as a result of a bad programming habit and not really a problem with either struct or the class.

  1. A struct is supposed to have only public members and should not require any encapsulation. If it does then it really should be a type/class. You really do not need two constructs to say the same thing.

  2. If you have class enclosing a struct, you would call a method in the class to mutate the member struct. This is what I would do as a good programming habit.

A proper implementation would be as follows.

 struct Mutable
 {
     public int x;
 }

 class Test
 {
     private Mutable m = new Mutable();

     // Mutates the member struct through the enclosing class.
     public int mutate()
     {
         m.x = m.x + 1;
         return m.x;
     }
 }

 class Program
 {
     static void Main(string[] args)
     {
         Test t = new Test();
         System.Console.WriteLine(t.mutate());
         System.Console.WriteLine(t.mutate());
         System.Console.WriteLine(t.mutate());
     }
 }

It looks like it is an issue with programming habit as opposed to an issue with struct itself. Structs are supposed to be mutable, that is the idea and intent.

The result of the changes voila behaves as expected:

 1
 2
 3
 Press any key to continue . . .

I don't believe they're evil if used correctly. I wouldn't put it in my production code, but I would for something like structured unit testing mocks, where the lifespan of a struct is relatively small.

Using the Eric example, perhaps you want to create a second instance of that Eric, but make adjustments, as that's the nature of your test (ie duplication, then modifying). It doesn't matter what happens with the first instance of Eric if we're just using Eric2 for the remainder of the test script, unless you're planning on using him as a test comparison.

This would be mostly useful for testing or modifying legacy code that shallowly defines a particular object (the point of structs), but an immutable struct annoyingly prevents this usage.

According to the C# Cookbook (Jay Hilyard et al.), we get a warning about using structs, as they often lead to the generated IL containing boxing and unboxing instructions. This might have been fixed in later versions of C# IL generation, but the stigma may persist. As it stands, there isn't a lot of need for structs in C# when you have classes. And for those who might argue with that I would say … let the compiler figure it out.

The following can help prevent or eliminate boxing:

  1. Use classes instead of structures…This change can dramatically improve performance.