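What is the best way in PHP to read the last lines of a file?

In my PHP application I need to read multiple lines starting from the end of many files (mostly logs). Sometimes I need only the last one, sometimes I need tens or hundreds. Basically, I want something as flexible as the Unix tail command.

There are questions here about how to get the single last line from a file (but I need N lines), and different solutions were given. I am not sure which one is the best and which performs better.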

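Methods overview

Searching on the internet, I came across different solutions. I can group them into three approaches:

naive ones that use the file() PHP function;
cheating ones that run the tail command on the system;
mighty ones that happily jump around an opened file using fseek().

I ended up choosing (or writing) five solutions: a naive one, a cheating one and three mighty ones.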

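1. The most concise naive solution, using built-in array functions.
2. The only possible solution based on the tail command, which has a little problem: it does not run if tail is not available, i.e. on non-Unix systems (Windows) or in restricted environments that do not allow system functions.
3. The single-byte solution, which searches for (and counts) newline characters starting from the end of the file.
4. The multi-byte buffered solution, optimized for large files.
5. A slightly modified version of solution #4 in which the buffer length is dynamic, decided according to the number of lines to retrieve.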
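The first two approaches are short enough to sketch in a few lines. The following is only an illustration, assuming PHP 7 or later and "\n" line endings; the function names tailNaive and tailCommand are hypothetical and are not the code that was benchmarked.

```php
<?php
// Hypothetical helper: naive tail that loads the whole file with file().
function tailNaive(string $filepath, int $lines = 1)
{
    if ($lines < 1) {
        return '';
    }
    // file() reads the entire file into memory, one array entry per line.
    $all = @file($filepath, FILE_IGNORE_NEW_LINES);
    if ($all === false) {
        return false;
    }
    // A negative offset keeps only the last $lines entries.
    return implode("\n", array_slice($all, -$lines));
}

// Hypothetical helper: delegate the work to the system's tail command.
function tailCommand(string $filepath, int $lines = 1)
{
    // Unavailable when shell_exec() is disabled or tail is not installed.
    if (!function_exists('shell_exec')) {
        return false;
    }
    $out = shell_exec('tail -n ' . (int) $lines . ' ' . escapeshellarg($filepath));
    return is_string($out) ? rtrim($out, "\n") : false;
}
```

Calling tailNaive('/var/log/syslog', 20) would return the last twenty lines as one string, at the cost of reading the whole file first.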

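All solutions work, in the sense that they return the expected result from any file and for any number of lines we ask for (except solution #1, which can hit the PHP memory limit on large files and then returns nothing). But which one is better?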

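Performance tests

To answer the question I ran tests. That is how these things are done, isn't it?

I prepared a sample 100 KB file by joining together different files found in my /var/log directory. Then I wrote a PHP script that uses each of the five solutions to retrieve 1, 2, ..., 10, 20, ..., 100, 200, ..., 1000 lines from the end of the file. Each single test is repeated ten times (that is something like 5 × 28 × 10 = 1400 tests), measuring the average elapsed time in microseconds.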
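The test matrix described above can be sketched roughly as follows. This is not the original benchmark script, only an illustration assuming PHP 7 or later; benchmarkLineCounts and averageMicroseconds are hypothetical names.

```php
<?php
// Hypothetical helper: the 28 line counts used in the tests,
// i.e. 1..10, then 20..100 in steps of 10, then 200..1000 in steps of 100.
function benchmarkLineCounts(): array
{
    return array_merge(range(1, 10), range(20, 100, 10), range(200, 1000, 100));
}

// Hypothetical helper: average elapsed time of one solution, in microseconds.
// $solution is any callable taking (file path, number of lines).
function averageMicroseconds(callable $solution, string $filepath, int $lines, int $repeat = 10): float
{
    $total = 0.0;
    for ($i = 0; $i < $repeat; $i++) {
        $start = microtime(true);          // seconds, as a float
        $solution($filepath, $lines);
        $total += microtime(true) - $start;
    }
    return $total / $repeat * 1e6;         // seconds to microseconds
}
```

Looping over five callables and over benchmarkLineCounts() with the default of ten repetitions gives the 5 × 28 × 10 runs mentioned above.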

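I ran the script on my local development machine (Xubuntu 12.04, PHP 5.3.10, 2.70 GHz dual core CPU, 2 GB RAM) using the PHP command line interpreter. Here are the results:

Figure: execution time on the sample 100 KB log file

Solutions #1 and #2 seem to be the worst ones. Solution #3 is good only when we need to read a few lines. Solutions #4 and #5 seem to be the best ones. Note how the dynamic buffer size can optimize the algorithm: execution time is a little smaller when only a few lines are requested, because of the reduced buffer.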
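To make the idea of a dynamic buffer concrete, here is a minimal sketch of a buffered backward read. It is not the benchmarked code: the name tailBuffered is hypothetical, the buffer thresholds (64, 512, 4096 bytes) are borrowed from the tailWithSkip function quoted in this thread, and the sketch assumes PHP 7 or later and "\n" line endings.

```php
<?php
// Hypothetical helper: read chunks backwards from the end of the file
// until enough newlines have been seen, then keep the last $lines lines.
function tailBuffered(string $filepath, int $lines = 1, bool $adaptive = true)
{
    if ($lines < 1) {
        return '';
    }
    $f = @fopen($filepath, 'rb');
    if ($f === false) {
        return false;
    }
    // Smaller buffer when only a few lines are wanted.
    $buffer = !$adaptive ? 4096 : ($lines < 2 ? 64 : ($lines < 10 ? 512 : 4096));

    fseek($f, 0, SEEK_END);
    $pos = ftell($f);
    $output = '';
    $newlines = 0;

    // More than $lines newlines guarantees $lines complete lines,
    // even when the file ends with a trailing newline.
    while ($pos > 0 && $newlines <= $lines) {
        $seek = min($pos, $buffer);
        $pos -= $seek;
        fseek($f, $pos, SEEK_SET);
        $chunk = fread($f, $seek);
        $output = $chunk . $output;
        $newlines += substr_count($chunk, "\n");
    }
    fclose($f);

    // Ignore a single trailing newline, then keep the last $lines lines.
    if (substr($output, -1) === "\n") {
        $output = substr($output, 0, -1);
    }
    return implode("\n", array_slice(explode("\n", $output), -$lines));
}
```

With $lines = 1 the loop usually stops after a single 64-byte read, which is where the small gain for few lines comes from.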

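Let's try with a bigger file. What if we have to read a 10 MB log file?

Figure: execution time on the sample 10 MB log file

Now solution #1 is by far the worst one: in fact, loading the whole 10 MB file into memory is not a great idea. I also ran the tests on 1 MB and 100 MB files, and the situation is practically the same.

And for tiny log files? This is the graph for a 10 KB file:

Figure: execution time on the sample 10 KB log file

Solution #1 is the best one now! Loading 10 KB into memory is not a big deal for PHP. Solutions #4 and #5 also perform well. However, this is an edge case: a 10 KB log means something like 150 to 200 lines.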

You can download all my test files, sources and results here.

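Final thoughts

Solution #5 is strongly recommended for the general use case: it works well with every file size and performs particularly well when reading a few lines.

Avoid solution #1 if you have to read files bigger than 10 KB.

Solutions #2 and #3 were not the best ones in any test I ran: #2 never runs in less than 2 ms, and #3 is heavily influenced by the number of lines you ask for (it works quite well only with 1 or 2 lines).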
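Why solution #3 slows down as more lines are requested is easier to see in code. The sketch below is not the benchmarked implementation, only an illustration of the single-byte technique under the assumption of PHP 7 or later and "\n" line endings; the name tailSingleByte is hypothetical.

```php
<?php
// Hypothetical helper: walk backwards one byte at a time, counting newlines.
function tailSingleByte(string $filepath, int $lines = 1)
{
    if ($lines < 1) {
        return '';
    }
    $f = @fopen($filepath, 'rb');
    if ($f === false) {
        return false;
    }
    fseek($f, 0, SEEK_END);
    $pos = ftell($f);
    $end = $pos;

    // Ignore a single trailing newline at the very end of the file.
    if ($pos > 0) {
        fseek($f, $pos - 1);
        if (fgetc($f) === "\n") {
            $pos--;
            $end = $pos;
        }
    }

    // One fseek() and one fgetc() per byte: cost grows with the text returned.
    $start = 0;
    $found = 0;
    while ($pos > 0) {
        $pos--;
        fseek($f, $pos);
        if (fgetc($f) === "\n" && ++$found === $lines) {
            $start = $pos + 1;   // first byte after the delimiting newline
            break;
        }
    }

    $output = '';
    if ($end > $start) {
        fseek($f, $start);
        $output = fread($f, $end - $start);
    }
    fclose($f);
    return $output;
}
```

Every byte of the requested lines costs a seek and a read, so asking for hundreds of lines means tens of thousands of tiny operations, while one or two lines stay cheap.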

Here is a modified version of the buffered solution that can also skip lines at the end of the file:

    /**
     * Modified version of http://www.geekality.net/2011/05/28/php-tail-tackling-large-files/ and of https://gist.github.com/lorenzos/1711e81a9162320fde20
     * @author Kinga the Witch (Trans-dating.com), Torleif Berger, Lorenzo Stanco
     * @link http://stackoverflow.com/a/15025877/995958
     * @license http://creativecommons.org/licenses/by/3.0/
     */
    function tailWithSkip($filepath, $lines = 1, $skip = 0, $adaptive = true)
    {
        // Open file (check the handle before trying to lock it)
        $f = @fopen($filepath, "rb");
        if ($f === false) return false;
        if (@flock($f, LOCK_SH) === false) { fclose($f); return false; }

        // Sets buffer size, according to the number of lines to retrieve.
        // This gives a performance boost when reading a few lines from the file.
        $max = max($lines, $skip);
        if (!$adaptive) $buffer = 4096;
        else $buffer = ($max < 2 ? 64 : ($max < 10 ? 512 : 4096));

        // Jump to last character
        fseek($f, -1, SEEK_END);

        // Read it and adjust line number if necessary
        // (Otherwise the result would be wrong if file doesn't end with a blank line)
        if (fread($f, 1) == "\n") {
            if ($skip > 0) { $skip++; $lines--; }
        } else {
            $lines--;
        }

        // Start reading
        $output = '';
        $chunk = '';

        // While we would like more
        while (ftell($f) > 0 && $lines >= 0) {
            // Figure out how far back we should jump
            $seek = min(ftell($f), $buffer);

            // Do the jump (backwards, relative to where we are)
            fseek($f, -$seek, SEEK_CUR);

            // Read a chunk
            $chunk = fread($f, $seek);

            // Calculate chunk parameters
            $count = substr_count($chunk, "\n");
            $strlen = mb_strlen($chunk, '8bit');

            // Move the file pointer
            fseek($f, -$strlen, SEEK_CUR);

            if ($skip > 0) { // There are some lines to skip
                if ($skip > $count) {
                    // Chunk contains fewer newline symbols than lines to skip
                    $skip -= $count;
                    $chunk = '';
                } else {
                    $pos = 0;
                    while ($skip > 0) {
                        if ($pos > 0) $offset = $pos - $strlen - 1; // Calculate the offset - NEGATIVE position of last new line symbol
                        else $offset = 0; // First search (without offset)

                        $pos = strrpos($chunk, "\n", $offset); // Search for last (including offset) new line symbol

                        if ($pos !== false) $skip--; // Found new line symbol - skip the line
                        else break; // Protection against infinite loop (just in case)
                    }
                    $chunk = substr($chunk, 0, $pos); // Truncated chunk
                    $count = substr_count($chunk, "\n"); // Count new line symbols in truncated chunk
                }
            }

            if (strlen($chunk) > 0) {
                // Add chunk to the output
                $output = $chunk . $output;
                // Decrease our line counter
                $lines -= $count;
            }
        }

        // While we have too many lines
        // (Because of buffer size we might have read too many)
        while ($lines++ < 0) {
            // Find first newline and remove all text before that
            $output = substr($output, strpos($output, "\n") + 1);
        }

        // Close file and return
        @flock($f, LOCK_UN);
        fclose($f);
        return trim($output);
    }

This will also work:

    $file = new SplFileObject("/path/to/file");
    $file->seek(PHP_INT_MAX); // cheap trick to seek to EoF
    $total_lines = $file->key(); // last line number

    // output the last twenty lines
    // (max() keeps the offset from going negative on files shorter than twenty lines)
    $reader = new LimitIterator($file, max(0, $total_lines - 20));
    foreach ($reader as $line) {
        echo $line; // includes newlines
    }

Or, without LimitIterator:

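    $file = new SplFileObject($filepath);
    $file->seek(PHP_INT_MAX); // jump to the end of the file
    $total_lines = $file->key(); // last line number

    // go back twenty lines, but never before the first line
    $file->seek(max(0, $total_lines - 20));
    while (!$file->eof()) {
        echo $file->current();
        $file->next();
    }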

Unfortunately, your test case segfaults on my machine, so I cannot tell how it performs.
