Strange behavior with floats and string conversion

I typed this into the Python shell:

    >>> 0.1*0.1
    0.010000000000000002

I expected that 0.1 * 0.1 would not be exactly 0.01, because I know that 0.1 in base 10 is periodic in base 2.

    >>> len(str(0.1*0.1))
    4

I expected to get 20, since I saw 20 characters above. Why do I get 4?

    >>> str(0.1*0.1)
    '0.01'

OK, this explains why len gives me 4, but why does str return '0.01'?

    >>> repr(0.1*0.1)
    '0.010000000000000002'

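Why does str round, but repr does not? (I have read this answer, but I would like to know how they decided when str rounds a float and when it doesn't.)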

    >>> str(0.01) == str(0.0100000000001)
    False
    >>> str(0.01) == str(0.01000000000001)
    True

So it seems to be a problem with the accuracy of floats. I thought Python would use IEEE 754 single-precision floats, so I checked it like this:

    #include <stdint.h>
    #include <stdio.h> // printf

    union myUnion {
        uint32_t i; // unsigned integer 32-bit type (on every machine)
        float f;    // a type you want to play with
    };

    int main() {
        union myUnion testVar;
        testVar.f = 0.01000000000001f;
        printf("%f\n", testVar.f);

        testVar.f = 0.01000000000000002f;
        printf("%f\n", testVar.f);

        testVar.f = 0.01f*0.01f;
        printf("%f\n", testVar.f);
    }

I got:

    0.010000
    0.010000
    0.000100

Python gives me:

    >>> 0.01000000000001
    0.010000000000009999
    >>> 0.01000000000000002
    0.010000000000000019
    >>> 0.01*0.01
    0.0001

Why does Python give me these results?

(I use Python 2.6.5. If you know of differences between Python versions, I would also be interested in them.)

The crucial requirement on repr is that it should round-trip; that is, eval(repr(f)) == f should give True in all cases.
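The round-trip requirement can be spot-checked directly. This is only a sketch: it uses float() instead of eval() to parse the string, and the sample values are arbitrary finite floats picked for illustration (NaN is deliberately left out, since it never compares equal to itself).

```python
import random

# A few hand-picked values plus random ones; all are finite floats.
samples = [0.1, 0.01, 0.1 * 0.1, 1e-300, 123456789.123456789]
samples += [random.uniform(-1e6, 1e6) for _ in range(1000)]

# repr() must produce a string that parses back to exactly the same float.
round_trips = all(float(repr(x)) == x for x in samples)
print(round_trips)
```

A passing check on a sample is of course not a proof; the guarantee itself comes from how repr is implemented.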

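In Python 2.x (before 2.7), repr works by doing a printf with format %.17g and discarding trailing zeroes. This is guaranteed correct (for 64-bit floats) by IEEE-754. Since 2.7 and 3.1, Python uses a more intelligent algorithm that can find shorter representations in some cases where %.17g gives unnecessary non-zero terminal digits or terminal nines. See What's new in 3.1? and issue 1580.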
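To see the two behaviours side by side on a current Python, one can format with %.17g by hand and compare against repr. This sketch assumes a Python 2.7+ or 3.1+ interpreter, so that repr already uses the shortest-string algorithm; the variable names are only for illustration.

```python
values = (0.1, 0.01000000000001, 0.1 * 0.1)

# Old behaviour: 17 significant digits via printf-style formatting.
legacy = ['%.17g' % x for x in values]
# New behaviour: the shortest string that still round-trips.
shortest = [repr(x) for x in values]

for old, new in zip(legacy, shortest):
    # Both strings parse back to the same float; only the text differs.
    print(old, new, float(old) == float(new))
```

The first two values show the "unnecessary terminal digit" and "terminal nines" cases; for 0.1 * 0.1 both forms agree, because no shorter string identifies that float.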

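Even under Python 2.7, repr(0.1 * 0.1) gives "0.010000000000000002". This is because 0.1 * 0.1 == 0.01 is False under IEEE-754 parsing and arithmetic; that is, the nearest 64-bit floating-point value to 0.1, when multiplied by itself, yields a 64-bit floating-point value that is not the nearest 64-bit floating-point value to 0.01: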

    >>> 0.1.hex()
    '0x1.999999999999ap-4'
    >>> (0.1 * 0.1).hex()
    '0x1.47ae147ae147cp-7'
    >>> 0.01.hex()
    '0x1.47ae147ae147bp-7'
                     ^ 1 ulp difference
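The 1 ulp gap can also be measured numerically by comparing the raw bit patterns. This is a minimal sketch; float_bits is a helper name made up for this example, and the subtraction only counts ulps because both values are positive.

```python
import struct

def float_bits(x):
    """Return the raw IEEE-754 binary64 bit pattern of x as an integer."""
    return struct.unpack('<Q', struct.pack('<d', x))[0]

product = 0.1 * 0.1   # result of the floating-point multiplication
nearest = 0.01        # double nearest to the decimal literal 0.01

# For two positive doubles, the difference of their bit patterns is the
# number of representable values between them.
ulp_gap = float_bits(product) - float_bits(nearest)
print(ulp_gap)
```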

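The difference between repr and str (pre-2.7/3.1) is that str formats with 12 significant digits as opposed to 17, which is not round-trippable but produces more readable results in many cases.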
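The 12-digit rounding reproduces the results from the question. The sketch below imitates the old str with %.12g; legacy_str is a hypothetical helper for this example, not a real Python function, and it ignores details such as appending ".0" to integral values.

```python
def legacy_str(x):
    # Hypothetical helper: approximates pre-2.7 str() for finite floats
    # by rounding to 12 significant digits.
    return '%.12g' % x

a = legacy_str(0.1 * 0.1)          # 17th digit is rounded away
b = legacy_str(0.0100000000001)    # 12 significant digits: kept
c = legacy_str(0.01000000000001)   # 13 significant digits: last one lost
print(a, b, c)

# The shortened string no longer identifies the original float.
lossy = float(a) != 0.1 * 0.1
print(lossy)
```

This is why str(0.01) == str(0.01000000000001) was True in the question while the comparison with 0.0100000000001 was False: the extra 1 sits in the 13th significant digit in the first case and in the 12th in the second.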

I can confirm your behaviour:

    ActivePython 2.6.4.10 (ActiveState Software Inc.) based on
    Python 2.6.4 (r264:75706, Jan 22 2010, 17:24:21) [MSC v.1500 64 bit (AMD64)] on win32
    Type "help", "copyright", "credits" or "license" for more information.
    >>> repr(0.1)
    '0.10000000000000001'
    >>> repr(0.01)
    '0.01'

Now, the docs claim that in Python <2.7

the value of repr(1.1) was computed as format(1.1, '.17g')

This is a slight simplification.

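Note that this is all to do with the string formatting code. In memory, all Python floats are just stored as C doubles, so there is never going to be a difference between them.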
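One way to convince yourself that Python floats are doubles and not single-precision values is to push a float through both storage formats with the struct module. This is a sketch that assumes a platform with IEEE-754 doubles, which covers practically every machine CPython runs on.

```python
import struct
import sys

x = 0.1 * 0.1

# Store and reload the value as a 64-bit double and as a 32-bit float.
as_double = struct.unpack('d', struct.pack('d', x))[0]
as_single = struct.unpack('f', struct.pack('f', x))[0]

print(sys.float_info.mant_dig)  # mantissa bits of a Python float
print(as_double == x)           # a double holds the value exactly
print(as_single == x)           # a single-precision float does not
```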

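Also, it's rather unpleasant to work with the full-length string for a float when you know that there is a better one. Indeed, in modern Pythons a new algorithm is used for float formatting, which picks the shortest representation in a smart way.

I spent a while looking this up in the source code, so I'll include the details here in case you're interested. You can skip this section.

In floatobject.c, we see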

    static PyObject *
    float_repr(PyFloatObject *v)
    {
        char buf[100];
        format_float(buf, sizeof(buf), v, PREC_REPR);
        return PyString_FromString(buf);
    }

which leads us to look at format_float. Omitting the NaN/inf special cases, it is:

    format_float(char *buf, size_t buflen, PyFloatObject *v, int precision)
    {
        register char *cp;
        char format[32];
        int i;

        /* Subroutine for float_repr and float_print.
           We want float numbers to be recognizable as such,
           ie, they should contain a decimal point or an exponent.
           However, %g may print the number as an integer;
           in such cases, we append ".0" to the string. */

        assert(PyFloat_Check(v));
        PyOS_snprintf(format, 32, "%%.%ig", precision);
        PyOS_ascii_formatd(buf, buflen, format, v->ob_fval);
        cp = buf;
        if (*cp == '-')
            cp++;
        for (; *cp != '\0'; cp++) {
            /* Any non-digit means it's not an integer;
               this takes care of NAN and INF as well. */
            if (!isdigit(Py_CHARMASK(*cp)))
                break;
        }
        if (*cp == '\0') {
            *cp++ = '.';
            *cp++ = '0';
            *cp++ = '\0';
            return;
        }

        <some NaN/inf stuff>
    }

So this first initialises some variables and checks that v is a well-formed float. It then prepares a format string:

    PyOS_snprintf(format, 32, "%%.%ig", precision);

Now PREC_REPR is defined elsewhere in floatobject.c to be 17, so this computes to "%.17g". Now we call

    PyOS_ascii_formatd(buf, buflen, format, v->ob_fval);

With the end of the tunnel in sight, we look up PyOS_ascii_formatd and discover that it uses snprintf internally.
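For readers who would rather follow the logic in Python than in C, the steps above can be approximated as below. This is only a rough sketch of the old behaviour, not the actual implementation: legacy_repr is a made-up name, and the NaN/inf handling that the C code does separately is simply left to the % operator.

```python
def legacy_repr(x, precision=17):
    """Approximate pre-2.7 float repr: %.17g, plus '.0' for integers."""
    # Same effect as building the "%.17g" format string in format_float.
    text = '%.*g' % (precision, x)
    # Skip a leading minus sign, then check whether only digits remain.
    digits = text[1:] if text.startswith('-') else text
    if digits.isdigit():
        # %g printed an integer; append ".0" so it still reads as a float.
        text += '.0'
    return text

for value in (0.1, 0.01, 0.1 * 0.1, 1.0, -2.0):
    print(legacy_repr(value))
```

On the values from the question and the session above, this gives the same strings as the 2.6 interpreter did.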

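From the Python tutorial:

In versions prior to Python 2.7 and Python 3.1, Python rounded this value to 17 significant digits, giving '0.10000000000000001'. In current versions, Python displays a value based on the shortest decimal fraction that rounds correctly back to the true binary value, resulting simply in '0.1'.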
