PHP x86: how to get the file size of a > 2 GB file without an external program?

I need to get the size of a file larger than 2 GB (testing on a 4.6 GB file). Is there any way to do this without an external program?

Current status:

  • filesize(), stat() and fseek() fail
  • fread() and feof() work

It is possible to get the file size by reading the file content (extremely slow!).

    $size = (float) 0;
    $chunksize = 1024 * 1024;
    while (!feof($fp)) {
        fread($fp, $chunksize);
        $size += (float) $chunksize;
    }
    return $size;

I know how to do it on 64-bit platforms (using fseek($fp, 0, SEEK_END) and ftell()), but I need a solution for 32-bit platforms.
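For reference, the 64-bit approach mentioned in the question can be sketched as follows. This is only a minimal sketch: the function name `sizeBySeek` is hypothetical, and the result is only trustworthy for large files where `PHP_INT_SIZE` is 8.

```php
<?php
// Hypothetical helper: file size via fseek()/ftell().
// Only reliable for > 2 GB files when PHP integers are 64-bit.
function sizeBySeek($path)
{
    $fp = @fopen($path, 'rb');
    if ($fp === false) {
        return false;
    }
    $size = false;
    if (fseek($fp, 0, SEEK_END) === 0) {
        $size = ftell($fp); // position at end of file equals the size in bytes
    }
    fclose($fp);
    return $size;
}

echo sizeBySeek(__FILE__), " bytes\n";
```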


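Solution: I have started an open source project for this.

Big File Tools

Big File Tools is a collection of hacks needed to manipulate files over 2 GB in PHP (even on 32-bit systems).

  • Answer: https://stackoverflow.com/a/35233556/631369
  • GitHub: https://github.com/jkuchar/BigFileTools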

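Here is the approach I used in SoloAdmin (an old FreeBSD-licensed project).

It first tries a platform-appropriate shell command (the Windows shell substitution modifier or the *nix/Mac stat command). If that fails, it tries COM (if on Windows), and finally falls back to filesize().

The original function can be found here: index.php:line 1866 (SVN)

I am posting a slightly modified version here, edited to remove dependencies on other project-specific functions.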

    function filesize64($file)
    {
        static $iswin;
        if (!isset($iswin)) {
            $iswin = (strtoupper(substr(PHP_OS, 0, 3)) == 'WIN');
        }

        static $exec_works;
        if (!isset($exec_works)) {
            $exec_works = (function_exists('exec') && !ini_get('safe_mode') && @exec('echo EXEC') == 'EXEC');
        }

        // try a shell command
        if ($exec_works) {
            $cmd = ($iswin) ? "for %F in (\"$file\") do @echo %~zF" : "stat -c%s \"$file\"";
            @exec($cmd, $output);
            if (is_array($output) && ctype_digit($size = trim(implode("\n", $output)))) {
                return $size;
            }
        }

        // try the Windows COM interface
        if ($iswin && class_exists("COM")) {
            try {
                $fsobj = new COM('Scripting.FileSystemObject');
                $f = $fsobj->GetFile( realpath($file) );
                $size = $f->Size;
            } catch (Exception $e) {
                $size = null;
            }
            if (ctype_digit($size)) {
                return $size;
            }
        }

        // if all else fails
        return filesize($file);
    }

    <?php
    ######################################################################
    # Human size for files smaller or bigger than 2 GB on 32 bit Systems #
    # size.php - 1.1 - 17.01.2012 - Alessandro Marinuzzi - www.alecos.it #
    ######################################################################
    function showsize($file) {
        if (strtoupper(substr(PHP_OS, 0, 3)) == 'WIN') {
            if (class_exists("COM")) {
                $fsobj = new COM('Scripting.FileSystemObject');
                $f = $fsobj->GetFile(realpath($file));
                $file = $f->Size;
            } else {
                $file = trim(exec("for %F in (\"" . $file . "\") do @echo %~zF"));
            }
        } elseif (PHP_OS == 'Darwin') {
            $file = trim(shell_exec("stat -f %z " . escapeshellarg($file)));
        } elseif ((PHP_OS == 'Linux') || (PHP_OS == 'FreeBSD') || (PHP_OS == 'Unix') || (PHP_OS == 'SunOS')) {
            $file = trim(shell_exec("stat -c%s " . escapeshellarg($file)));
        } else {
            $file = filesize($file);
        }
        if ($file < 1024) {
            echo $file . ' Byte';
        } elseif ($file < 1048576) {
            echo round($file / 1024, 2) . ' KB';
        } elseif ($file < 1073741824) {
            echo round($file / 1048576, 2) . ' MB';
        } elseif ($file < 1099511627776) {
            echo round($file / 1073741824, 2) . ' GB';
        } elseif ($file < 1125899906842624) {
            echo round($file / 1099511627776, 2) . ' TB';
        } elseif ($file < 1152921504606846976) {
            echo round($file / 1125899906842624, 2) . ' PB';
        } elseif ($file < 1180591620717411303424) {
            echo round($file / 1152921504606846976, 2) . ' EB';
        } elseif ($file < 1208925819614629174706176) {
            echo round($file / 1180591620717411303424, 2) . ' ZB';
        } else {
            echo round($file / 1208925819614629174706176, 2) . ' YB';
        }
    }
    ?>

Use it as follows:

    <?php include("php/size.php"); ?>

And where you want the size shown:

    <?php showsize("files/VeryBigFile.rar"); ?>

If you want to improve it, you are welcome!

I found a nice, slim solution for Linux/Unix to get the size of large files with 32-bit PHP.

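    $file = "/path/to/my/file.tar.gz";
    // escapeshellarg() keeps spaces and shell metacharacters in the path from breaking the command
    $filesize = exec("stat -c %s " . escapeshellarg($file));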

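You should handle $filesize as a string. If the file size is greater than PHP_INT_MAX, trying to cast it to int results in filesize = PHP_INT_MAX.

When handled as a string, however, the following human-readable formatting algorithm works: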

    function formatBytes($size, $precision = 2)
    {
        $base = log($size) / log(1024);
        $suffixes = array('', 'k', 'M', 'G', 'T');
        return round(pow(1024, $base - floor($base)), $precision) . $suffixes[(int) floor($base)];
    }

    echo formatBytes($filesize);

So my output for a file larger than 4 GB is:

 4.46G 
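Since the size is kept as a string, comparisons against a limit should avoid integer casts as well. The following is a minimal sketch of one way to do that; the helper name `compareSizeStrings` is hypothetical and it assumes both inputs are non-negative decimal digit strings.

```php
<?php
// Hypothetical helper: compare two non-negative decimal strings without casting to int.
// Returns -1, 0 or 1.
function compareSizeStrings($a, $b)
{
    $a = ltrim($a, '0');
    $b = ltrim($b, '0');
    // A longer digit string (without leading zeros) is always the larger number.
    if (strlen($a) !== strlen($b)) {
        return strlen($a) < strlen($b) ? -1 : 1;
    }
    // Same length: lexicographic order equals numeric order.
    $cmp = strcmp($a, $b);
    return $cmp < 0 ? -1 : ($cmp > 0 ? 1 : 0);
}

var_dump(compareSizeStrings('4789123456', '2147483647')); // int(1)
```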

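I have started a project called Big File Tools. It is proven to work on Linux, Mac and Windows (even the 32-bit variants). It provides byte-precise results even for huge files (> 4 GB). Internally it uses brick/math, an arbitrary-precision arithmetic library.

Install it using composer.

    composer require jkuchar/bigfiletools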

And use it:

    <?php
    $file = BigFileTools\BigFileTools::createDefault()->getFile(__FILE__);
    echo $file->getSize() . " bytes\n";

The result is a BigInteger, so you can do calculations with it:

    $sizeInBytes = $file->getSize();
    $sizeInMegabytes = $sizeInBytes->toBigDecimal()->dividedBy(1024*1024, 2, \Brick\Math\RoundingMode::HALF_DOWN);
    echo "Size is $sizeInMegabytes megabytes\n";

Big File Tools internally uses drivers to reliably determine the exact file size on all platforms. Here is the list of available drivers (updated 2016-02-05):

| Driver           | Time (s) ↓          | Runtime requirements | Platform             |
| ---------------- | ------------------- | -------------------- | -------------------- |
| CurlDriver       | 0.00045299530029297 | CURL extension       | -                    |
| NativeSeekDriver | 0.00052094459533691 | -                    | -                    |
| ComDriver        | 0.0031449794769287  | COM+.NET extension   | Windows only         |
| ExecDriver       | 0.042937040328979   | exec() enabled       | Windows, Linux, OS X |
| NativeRead       | 2.7670161724091     | -                    | -                    |

You can use BigFileTools with any of them, or let it pick the fastest available one by default (BigFileTools::createDefault()).

    use BigFileTools\BigFileTools;
    use BigFileTools\Driver;

    $bigFileTools = new BigFileTools(new Driver\CurlDriver());

    $file_size=sprintf("%u",filesize($working_dir."\\".$file));

This works for me on a Windows box.

I was browsing the bug log at https://bugs.php.net/bug.php?id=63618 and found this solution.

If you have an FTP server, you can use fsockopen:

    $socket = fsockopen($hostName, 21);
    $t = fgets($socket, 128);
    fwrite($socket, "USER $myLogin\r\n");
    $t = fgets($socket, 128);
    fwrite($socket, "PASS $myPass\r\n");
    $t = fgets($socket, 128);
    fwrite($socket, "SIZE $fileName\r\n");
    $t = fgets($socket, 128);
    $fileSize=floatval(str_replace("213 ","",$t));
    echo $fileSize;
    fwrite($socket, "QUIT\r\n");
    fclose($socket);

(Found as a comment on the ftp_size page.)
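The reply to the FTP SIZE command can also be parsed without going through a float, which keeps the value byte-exact as a string. This is a minimal sketch; the helper name `parseFtpSizeReply` is hypothetical, and it assumes the server answers a successful SIZE with a single line of the form `213 <digits>`.

```php
<?php
// Hypothetical helper: extract the size from an FTP "213 <size>" reply as a digit string.
// Returns null for any other reply (for example an error such as "550 ...").
function parseFtpSizeReply($line)
{
    if (preg_match('/^213\s+(\d+)\s*$/', $line, $m)) {
        return $m[1];
    }
    return null;
}

var_dump(parseFtpSizeReply("213 4789123456\r\n")); // string(10) "4789123456"
```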

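You may want to add alternatives to the functions you use, such as calling system commands like "dir" or "ls" and getting the information from there. These are subject to security restrictions, of course, which you can check for, eventually reverting to the slow method as a last resort.

One option is to seek to the 2 GB mark and then read the length from there...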
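Checking whether shell commands are allowed at all, before trying them, could look like the sketch below. The helper name `execIsUsable` is hypothetical; it only inspects `function_exists()` and the `disable_functions` setting and does not prove that a particular command will succeed.

```php
<?php
// Hypothetical helper: is exec() available in this PHP configuration?
function execIsUsable()
{
    if (!function_exists('exec')) {
        return false;
    }
    // disable_functions is a comma-separated list of function names
    $disabled = array_map('trim', explode(',', (string) ini_get('disable_functions')));
    return !in_array('exec', $disabled, true);
}

echo execIsUsable() ? "exec() available\n" : "exec() disabled, fall back to the slow method\n";
```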

    function getTrueFileSize($filename)
    {
        $size = filesize($filename);
        if ($size === false) {
            $fp = fopen($filename, 'r');
            if (!$fp) {
                return false;
            }
            $offset = PHP_INT_MAX - 1;
            $size = (float) $offset;
            // fseek() returns 0 on success and -1 on failure
            if (fseek($fp, $offset) !== 0) {
                fclose($fp);
                return false;
            }
            $chunksize = 8192;
            while (!feof($fp)) {
                $size += strlen(fread($fp, $chunksize));
            }
            fclose($fp);
        } elseif ($size < 0) {
            // Handle overflowed integer...
            $size = sprintf("%u", $size);
        }
        return $size;
    }

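So basically, this seeks to the largest positive signed integer representable in PHP (2 GB on 32-bit systems) and then reads from there using 8 KB blocks (which should be a fair trade-off between memory efficiency and disk transfer efficiency).

Also note that I am not adding $chunksize to the size. The reason is that fread may actually return fewer bytes than $chunksize, depending on a number of possibilities. So instead, use strlen to determine the length of the returned string.

When using IEEE doubles (most systems), file sizes up to 2^53 bytes (about 9 PB) fit into a float exactly, so there should be no loss of precision when using standard arithmetic operations below that limit.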
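The exactness limit of a double can be checked directly. The snippet below is a small illustration, assuming PHP floats are IEEE 754 doubles (53-bit significand), which is the case on common platforms.

```php
<?php
// 2^53 is the point where consecutive integers stop being representable as doubles.
$limit = (float) pow(2, 53);

var_dump($limit - 1.0 !== $limit); // bool(true): just below the limit, one byte still makes a difference
var_dump($limit + 1.0 === $limit); // bool(true): above the limit, adding one byte is lost to rounding
```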

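Contrary to what some answers suggest, you cannot reliably get the size of a file on a 32-bit system by checking whether filesize() returns a negative number. This is because on a 32-bit system a file between 4 and 6 gigs will report a positive size, then a negative one from 6 to 8, then positive again from 8 to 10, and so on. It loops, in a manner of speaking.

So you are stuck with using an external command that works reliably on 32-bit systems.

However, one very useful tool is the ability to check whether a file is larger than a certain size, and this can be done reliably even on very big files.

The following seeks to 50 megs and tries to read one byte. It is very fast on my low-spec test machine and works reliably even when the size is much greater than 2 gigs.

You can use it to check whether a file is larger than 2147483647 bytes (2147483647 is the maximum integer on 32-bit systems), and then handle the file differently or have your application issue a warning.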
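The looping behaviour can be simulated with floats on any platform. This is only an illustration; the helper name `simulateInt32Wrap` is hypothetical and it models a plain signed 32-bit wraparound, not the exact behaviour of any particular PHP build.

```php
<?php
// Hypothetical helper: the value a signed 32-bit integer would hold for a given true size.
function simulateInt32Wrap($trueSize)
{
    $m = fmod((float) $trueSize, 4294967296.0);          // reduce modulo 2^32
    return $m >= 2147483648.0 ? $m - 4294967296.0 : $m;  // the upper half reads as negative
}

$gb = 1024 * 1024 * 1024;
foreach (array(1, 3, 5, 7) as $n) {
    // The sign alternates every 2 GB, so a positive result says nothing about the true size.
    printf("%d GB -> %.0f\n", $n, simulateInt32Wrap($n * $gb));
}
```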

    function isTooBig($file){
        $fh = @fopen($file, 'r');
        if(! $fh){ return false; }
        $offset = 50 * 1024 * 1024; //50 megs
        $tooBig = false;
        if(fseek($fh, $offset, SEEK_SET) === 0){
            if(strlen(fread($fh, 1)) === 1){
                $tooBig = true;
            }
        } //Otherwise we couldn't seek there so it must be smaller
        fclose($fh);
        return $tooBig;
    }
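The same idea can be written with the threshold as a parameter. This is a sketch under assumptions: the name `isLargerThan` is hypothetical, `$offset` must itself fit in a PHP integer, and whether fseek() accepts offsets near PHP_INT_MAX on a given 32-bit build should be verified on that build.

```php
<?php
// Hypothetical generalisation of the check above:
// returns true if the file holds more than $offset bytes.
function isLargerThan($file, $offset)
{
    $fh = @fopen($file, 'rb');
    if (!$fh) {
        return false;
    }
    $larger = false;
    if (fseek($fh, $offset, SEEK_SET) === 0) {
        // A byte exists at position $offset only if the size exceeds $offset.
        $larger = (strlen((string) fread($fh, 1)) === 1);
    }
    fclose($fh);
    return $larger;
}
```

For example, `isLargerThan($file, PHP_INT_MAX)` on a 32-bit build asks whether the file is larger than 2147483647 bytes.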

The code below works for any file size on any version of PHP / OS / web server / platform.

    // HTTP HEAD request to a local file to get the file size
    $opts = array('http'=>array('method'=>'HEAD'));
    $context = stream_context_create($opts);

    // Change the URL below to the URL of your file. DO NOT change it to a file path.
    // You MUST use an http:// URL for your file for an HTTP request to work.
    // SECURITY - you must add a .htaccess rule which denies all requests for this database file
    // except those coming from local IP 127.0.0.1.
    // $tmp will contain 0 bytes, since it is a HEAD request only, so no data is actually downloaded;
    // we only want the file size.
    $tmp = file_get_contents('http://127.0.0.1/pages-articles.xml.bz2', false, $context);
    $tmp = $http_response_header;

    foreach ($tmp as $rcd) {
        if (stripos(trim($rcd), "Content-Length:") === 0) {
            $size = floatval(trim(str_ireplace("Content-Length:", "", $rcd)));
        }
    }
    echo "File size = $size bytes";

    // example output
    // File size = 10082006833 bytes

I iterated on the BigFileTools class/answer:
- option to disable the curl method, because some platforms (e.g. Synology NAS) do not support the FTP protocol for curl
- extra non-POSIX, but more accurate, implementation of sizeExec: the actual file size is returned by using stat, instead of the size on disk returned by du
- correct size results for big files (> 4 GB), and almost as fast, for sizeNativeSeek
- debug messages option

    <?php
    /**
     * Class for manipulating files bigger than 2GB
     * (currently supports only getting filesize)
     *
     * @author Honza Kuchař
     * @license New BSD
     * @encoding UTF-8
     * @copyright Copyright (c) 2013, Jan Kuchař
     */
    class BigFileTools
    {
        /**
         * Absolute file path
         * @var string
         */
        protected $path;

        /**
         * Use in BigFileTools::$mathLib if you want to use BCMath for mathematical operations
         */
        const MATH_BCMATH = "BCMath";

        /**
         * Use in BigFileTools::$mathLib if you want to use GMP for mathematical operations
         */
        const MATH_GMP = "GMP";

        /**
         * Which mathematical library to use for mathematical operations
         * @var string (one of the constants BigFileTools::MATH_*)
         */
        public static $mathLib;

        /**
         * If none of the fast modes is available to compute the file size, BigFileTools can use a very slow
         * method: reading the file from byte 0 to the end. If you want to enable this behavior,
         * switch fastMode to false (default is true)
         * @var bool
         */
        public static $fastMode = true;

        // On some platforms, like Synology NAS DS214+ DSM 5.1, the FTP protocol for curl is not working or disabled;
        // you will get an error like "Protocol file not supported or disabled in libcurl"
        public static $FTPProtocolCurlEnabled = false;

        public static $debug = false; // shows some debug/error messages

        // More portable, but it reports the size on disk, not the actual file size,
        // so it is less accurate: 0..clustersize bytes of inaccuracy
        public static $posix = true;

        /**
         * Initialization of class
         * Do not call directly.
         */
        static function init()
        {
            if (function_exists("bcadd")) {
                self::$mathLib = self::MATH_BCMATH;
            } elseif (function_exists("gmp_add")) {
                self::$mathLib = self::MATH_GMP;
            } else {
                throw new BigFileToolsException("You have to install BCMath or GMP. These mathematical libraries are used for size computation.");
            }
        }

        /**
         * Create BigFileTools from $path
         * @param string $path
         * @return BigFileTools
         */
        static function fromPath($path)
        {
            return new self($path);
        }

        static function debug($msg)
        {
            if (self::$debug) echo $msg;
        }

        /**
         * Gets basename of file (example: for file.txt will return "file.txt")
         * @return string
         */
        public function getBaseName()
        {
            return pathinfo($this->path, PATHINFO_BASENAME);
        }

        /**
         * Gets extension of file (example: for file.txt will return "txt")
         * @return string
         */
        public function getExtension()
        {
            return pathinfo($this->path, PATHINFO_EXTENSION);
        }

        /**
         * Gets name of file without extension (example: for file.txt will return "file")
         * @return string
         */
        public function getFilename()
        {
            return pathinfo($this->path, PATHINFO_FILENAME);
        }

        /**
         * Gets path to file (example: for file.txt will return path to file.txt, eg /home/test/)
         * ! This will call absolute path!
         * @return string
         */
        public function getDirname()
        {
            return pathinfo($this->path, PATHINFO_DIRNAME);
        }

        /**
         * Gets md5 checksum of file content
         * @return string
         */
        public function getMd5()
        {
            return md5_file($this->path);
        }

        /**
         * Gets sha1 checksum of file content
         * @return string
         */
        public function getSha1()
        {
            return sha1_file($this->path);
        }

        /**
         * Constructor - do not call directly
         * @param string $path
         */
        function __construct($path, $absolutizePath = true)
        {
            if (!static::isReadableFile($path)) {
                throw new BigFileToolsException("File not found at $path");
            }
            if ($absolutizePath) {
                $this->setPath($path);
            } else {
                $this->setAbsolutePath($path);
            }
        }

        /**
         * Tries to absolutize path and then updates instance state
         * @param string $path
         */
        function setPath($path)
        {
            $this->setAbsolutePath(static::absolutizePath($path));
        }

        /**
         * Sets absolute path
         * @param string $path
         */
        function setAbsolutePath($path)
        {
            $this->path = $path;
        }

        /**
         * Gets current filepath
         * @return string
         */
        function getPath($a = "")
        {
            if ($a != "") {
                trigger_error("getPath with absolutizing argument is deprecated!", E_USER_DEPRECATED);
            }
            return $this->path;
        }

        /**
         * Converts relative path to absolute
         */
        static function absolutizePath($path)
        {
            $path = realpath($path);
            if (!$path) {
                // TODO: use hack like http://stackoverflow.com/questions/4049856/replace-phps-realpath
                // or http://www.php.net/manual/en/function.realpath.php#84012
                // probably as an optional feature that can be turned on when you know what you are doing
                throw new BigFileToolsException("Not possible to resolve absolute path.");
            }
            return $path;
        }

        static function isReadableFile($file)
        {
            // Do not use is_file
            // @link https://bugs.php.net/bug.php?id=27792
            // $readable = is_readable($file); // does not always return correct value for directories
            $fp = @fopen($file, "r"); // must be file and must be readable
            if ($fp) {
                fclose($fp);
                return true;
            }
            return false;
        }

        /**
         * Moves file to new location / rename
         * @param string $dest
         */
        function move($dest)
        {
            if (move_uploaded_file($this->path, $dest)) {
                $this->setPath($dest);
                return TRUE;
            } else {
                @unlink($dest); // needed in PHP < 5.3 & Windows; intentionally @
                if (rename($this->path, $dest)) {
                    $this->setPath($dest);
                    return TRUE;
                } else {
                    if (copy($this->path, $dest)) {
                        unlink($this->path); // delete file
                        $this->setPath($dest);
                        return TRUE;
                    }
                    return FALSE;
                }
            }
        }

        /**
         * Changes path of this file object
         * @param string $dest
         */
        function relocate($dest)
        {
            trigger_error("Relocate is deprecated!", E_USER_DEPRECATED);
            $this->setPath($dest);
        }

        /**
         * Size of file
         *
         * Profiling results:
         *  sizeCurl        0.00045299530029297
         *  sizeNativeSeek  0.00052094459533691
         *  sizeCom         0.0031449794769287
         *  sizeExec        0.042937040328979
         *  sizeNativeRead  2.7670161724091
         *
         * @return string | float
         * @throws BigFileToolsException
         */
        public function getSize($float = false)
        {
            if ($float == true) {
                return (float) $this->getSize(false);
            }

            $return = $this->sizeCurl();
            if ($return) {
                $this->debug("sizeCurl succeeded");
                return $return;
            }
            $this->debug("sizeCurl failed");

            $return = $this->sizeNativeSeek();
            if ($return) {
                $this->debug("sizeNativeSeek succeeded");
                return $return;
            }
            $this->debug("sizeNativeSeek failed");

            $return = $this->sizeCom();
            if ($return) {
                $this->debug("sizeCom succeeded");
                return $return;
            }
            $this->debug("sizeCom failed");

            $return = $this->sizeExec();
            if ($return) {
                $this->debug("sizeExec succeeded");
                return $return;
            }
            $this->debug("sizeExec failed");

            if (!self::$fastMode) {
                $return = $this->sizeNativeRead();
                if ($return) {
                    $this->debug("sizeNativeRead succeeded");
                    return $return;
                }
                $this->debug("sizeNativeRead failed");
            }

            throw new BigFileToolsException("Can not size of file $this->path !");
        }

        // <editor-fold defaultstate="collapsed" desc="size* implementations">

        /**
         * Returns file size by using native fseek function
         * @see http://www.php.net/manual/en/function.filesize.php#79023
         * @see http://www.php.net/manual/en/function.filesize.php#102135
         * @return string | bool (false when fail)
         */
        protected function sizeNativeSeek()
        {
            $fp = fopen($this->path, "rb");
            if (!$fp) {
                return false;
            }
            flock($fp, LOCK_SH);
            $result = fseek($fp, 0, SEEK_END);
            if ($result === 0) {
                if (PHP_INT_SIZE < 8) { // 32bit
                    $return = 0.0;
                    $step = 0x7FFFFFFF;
                    while ($step > 0) {
                        if (0 === fseek($fp, - $step, SEEK_CUR)) {
                            $return += floatval($step);
                        } else {
                            $step >>= 1;
                        }
                    }
                } else { // 64bit
                    $return = ftell($fp);
                }
            } else {
                $return = false;
            }
            flock($fp, LOCK_UN);
            fclose($fp);
            return $return;
        }

        /**
         * Returns file size by using native fread function
         * @see http://stackoverflow.com/questions/5501451/php-x86-how-to-get-filesize-of-2gb-file-without-external-program/5504829#5504829
         * @return string | bool (false when fail)
         */
        protected function sizeNativeRead()
        {
            $fp = fopen($this->path, "rb");
            if (!$fp) {
                return false;
            }
            flock($fp, LOCK_SH);
            rewind($fp);
            $offset = PHP_INT_MAX - 1;
            $size = (string) $offset;
            if (fseek($fp, $offset) !== 0) {
                flock($fp, LOCK_UN);
                fclose($fp);
                return false;
            }
            $chunksize = 1024 * 1024;
            while (!feof($fp)) {
                $read = strlen(fread($fp, $chunksize));
                if (self::$mathLib == self::MATH_BCMATH) {
                    $size = bcadd($size, $read);
                } elseif (self::$mathLib == self::MATH_GMP) {
                    $size = gmp_add($size, $read);
                } else {
                    throw new BigFileToolsException("No mathematical library available");
                }
            }
            if (self::$mathLib == self::MATH_GMP) {
                $size = gmp_strval($size);
            }
            flock($fp, LOCK_UN);
            fclose($fp);
            return $size;
        }

        /**
         * Returns file size using curl module
         * @see http://www.php.net/manual/en/function.filesize.php#100434
         * @return string | bool (false when fail or cUrl module not available)
         */
        protected function sizeCurl()
        {
            // curl solution - cross platform and really cool :)
            if (self::$FTPProtocolCurlEnabled && function_exists("curl_init")) {
                $ch = curl_init("file://" . $this->path);
                curl_setopt($ch, CURLOPT_NOBODY, true);
                curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
                curl_setopt($ch, CURLOPT_HEADER, true);
                $data = curl_exec($ch);
                if ($data == "" || empty($data)) $this->debug(stripslashes(curl_error($ch)));
                curl_close($ch);
                if ($data !== false && preg_match('/Content-Length: (\d+)/', $data, $matches)) {
                    return (string) $matches[1];
                }
            }
            // curl disabled, not available, or no Content-Length found
            return false;
        }

        /**
         * Returns file size by using external program (exec needed)
         * @see http://stackoverflow.com/questions/5501451/php-x86-how-to-get-filesize-of-2gb-file-without-external-program/5502328#5502328
         * @return string | bool (false when fail or exec is disabled)
         */
        protected function sizeExec()
        {
            // filesize using exec
            if (function_exists("exec")) {
                if (strtoupper(substr(PHP_OS, 0, 3)) == 'WIN') { // Windows
                    // Try using the NT substitution modifier %~z
                    $escapedPath = escapeshellarg($this->path);
                    $size = trim(exec("for %F in ($escapedPath) do @echo %~zF"));
                } else { // other OS
                    // If the platform is not Windows, use the stat command (should work for *nix and MacOS)
                    if (self::$posix) {
                        // du returns blocks/KB
                        $tmpsize = trim(exec("du " . escapeshellarg($this->path) . " | cut -f1"));
                        $size = (int) $tmpsize * 1024; // make it bytes
                    } else {
                        $size = trim(exec('stat ' . escapeshellarg($this->path) . ' | grep -i -o -E "Size: ([0-9]+)" | cut -d" " -f2'));
                    }
                }
                if (self::$debug) var_dump($size);
                // returned for both branches, so the Windows result is no longer discarded
                return $size;
            }
            return false;
        }

        /**
         * Returns file size by using Windows COM interface
         * @see http://stackoverflow.com/questions/5501451/php-x86-how-to-get-filesize-of-2gb-file-without-external-program/5502328#5502328
         * @return string | bool (false when fail or COM not available)
         */
        protected function sizeCom()
        {
            if (class_exists("COM")) {
                // Use the Windows COM interface
                $fsobj = new COM('Scripting.FileSystemObject');
                if (dirname($this->path) == '.') {
                    $this->path = ((substr(getcwd(), -1) == DIRECTORY_SEPARATOR)
                        ? getcwd() . basename($this->path)
                        : getcwd() . DIRECTORY_SEPARATOR . basename($this->path));
                }
                $f = $fsobj->GetFile($this->path);
                return (string) $f->Size;
            }
            return false;
        }

        // </editor-fold>
    }

    BigFileTools::init();

    class BigFileToolsException extends Exception{}

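Well, the easiest way to do that would be to simply add a maximum value to your number. On an x86 platform this means the long number plus 2^32:

    if($size < 0) $size = pow(2,32) + $size;

Example: Big_File.exe, 3.30 GB (3,554,287,616 B). Your function returns -740679680, so you add 2^32 (4294967296) and get 3554287616.

You get a negative number because your system reserves one bit of memory for the sign, so you are left with 2^31 (2,147,483,648 = 2G) as the limit for negative or positive values. When the system reaches this maximum it does not stop; it simply overwrites that last reserved bit, and your number is now forced to be negative. In simpler words, when you exceed the maximum positive number you are forced to the maximum negative number, so 2147483647 + 1 = -2147483648. Further addition moves toward zero and then again toward the maximum number.

As you can see, it is like a circle with the highest and lowest numbers closing the loop.

The total maximum number that the x86 architecture can "digest" in one tick is 2^32 = 4294967296 = 4G, so as long as your number is lower than that, this simple trick will always work. For higher numbers you must know how many times you have passed the looping point, then simply multiply that count by 2^32 and add it to your result:

    $size = pow(2,32) * $loops_count + $size;

Of course, with the basic PHP functions this is quite hard to do, because no function will tell you how many times it has passed the looping point, so this will not work for files over 4 gigs.

I wrote a function which returns the file size exactly and is quite fast: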

    function file_get_size($file)
    {
        //open file
        $fh = fopen($file, "r");
        //declare some variables
        $size = "0";
        $char = "";
        //set file pointer to 0; I'm a little bit paranoid, you can remove this
        fseek($fh, 0, SEEK_SET);
        //set multiplicator to zero
        $count = 0;
        while (true) {
            //jump 1 MB forward in file
            fseek($fh, 1048576, SEEK_CUR);
            //check if we actually left the file
            if (($char = fgetc($fh)) !== false) {
                //if not, go on
                $count ++;
            } else {
                //else jump back where we were before leaving and exit loop
                fseek($fh, -1048576, SEEK_CUR);
                break;
            }
        }
        //we could make $count jumps, so the file is at least $count * 1.000001 MB large
        //1048577 because we jump 1 MB and fgetc goes 1 B forward too
        $size = bcmul("1048577", $count);
        //now count the last few bytes; they're always less than 1048576 so it's quite fast
        $fine = 0;
        while(false !== ($char = fgetc($fh))) {
            $fine ++;
        }
        //and add them
        $size = bcadd($size, $fine);
        fclose($fh);
        return $size;
    }