Performance-wise, what is the point of wrapping a FileOutputStream in a BufferedOutputStream?

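I have a module that reads, processes, and writes bytes to disk. The bytes arrive over UDP, and once the individual datagrams have been assembled, the final byte array that is processed and written to disk is typically between 200 bytes and 500,000 bytes. Occasionally an assembled byte array exceeds 500,000 bytes, but that is relatively rare.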

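I am currently using FileOutputStream's write(byte[]) method. I am also experimenting with wrapping the FileOutputStream in a BufferedOutputStream, including the constructor that accepts a buffer size as a parameter.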

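Using BufferedOutputStream appears to trend slightly better, but I have only just begun to experiment with different buffer sizes. I only have a limited set of sample data to work with (two data sets from sample runs that I can pipe through my application). Is there a general rule of thumb for calculating the optimal buffer size, so as to reduce the number of disk writes and maximize disk write performance, given what I know about the data I am writing?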

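BufferedOutputStream helps when the individual writes are smaller than the buffer size, e.g. 8 KB. For larger writes it does not help, nor does it make things any worse. If all of your writes are larger than the buffer size, or you always call flush() after every write, I would not use a buffer. However, if a good portion of your writes are smaller than the buffer size and you do not flush() every time, it is worth having.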
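To see this without touching a disk, here is a minimal sketch that counts how many write calls actually reach the underlying stream. `BufferedWriteCountDemo`, `CountingSink`, and `writeChunks` are hypothetical names introduced for illustration; the sink stands in for a FileOutputStream, where each call would be a native write.

```java
import java.io.BufferedOutputStream;
import java.io.IOException;
import java.io.OutputStream;

public class BufferedWriteCountDemo {

    /** Hypothetical sink: discards the data but counts the write calls that reach it. */
    public static final class CountingSink extends OutputStream {
        public long calls;
        public long bytes;

        @Override
        public void write(int b) {
            calls++;
            bytes++;
        }

        @Override
        public void write(byte[] b, int off, int len) {
            calls++;
            bytes += len;
        }
    }

    /** Writes chunkCount chunks of chunkSize bytes through a buffer and returns the sink. */
    public static CountingSink writeChunks(int chunkSize, int chunkCount, int bufferSize)
            throws IOException {
        CountingSink sink = new CountingSink();
        byte[] chunk = new byte[chunkSize];
        try (OutputStream out = new BufferedOutputStream(sink, bufferSize)) {
            for (int i = 0; i < chunkCount; i++) {
                out.write(chunk);
            }
        } // close() flushes whatever is still sitting in the buffer
        return sink;
    }

    public static void main(String[] args) throws IOException {
        CountingSink small = writeChunks(200, 1000, 8 * 1024);
        CountingSink large = writeChunks(500_000, 10, 8 * 1024);
        System.out.println("1000 x 200 B:     " + small.calls + " underlying writes");
        System.out.println("10 x 500,000 B:   " + large.calls + " underlying writes");
    }
}
```

With an 8 KB buffer, the 200-byte chunks should be coalesced into far fewer underlying writes than the 1000 calls made, while each 500,000-byte chunk should pass straight through as a single write. For the sizes in the question, that suggests the buffer mostly pays off at the small end of the range.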

You may find that increasing the buffer size to 32 KB or larger gives you a marginal improvement, or makes things worse. YMMV.
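Since the best size depends on your data and machine, measuring is more reliable than any rule of thumb. The following is a rough sketch of a buffer-size sweep, not a rigorous benchmark: `BufferSizeSweep`, `timeOnce`, and `sweep` are hypothetical names, the payload is placeholder zeros, and the timings include opening and closing the file. The writes will typically land in the OS page cache rather than on the physical disk.

```java
import java.io.BufferedOutputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;

public class BufferSizeSweep {

    /** Writes the payload in chunkSize pieces through a buffer of bufferSize; returns elapsed nanos. */
    public static long timeOnce(Path file, byte[] payload, int chunkSize, int bufferSize)
            throws IOException {
        long begin = System.nanoTime();
        try (OutputStream out =
                new BufferedOutputStream(new FileOutputStream(file.toFile()), bufferSize)) {
            for (int off = 0; off < payload.length; off += chunkSize) {
                out.write(payload, off, Math.min(chunkSize, payload.length - off));
            }
        }
        return System.nanoTime() - begin;
    }

    /** Returns the median elapsed nanos for each buffer size over the given number of runs. */
    public static Map<Integer, Long> sweep(byte[] payload, int chunkSize, int[] bufferSizes, int runs)
            throws IOException {
        Path file = Files.createTempFile("buffer-sweep", ".bin");
        try {
            Map<Integer, Long> medians = new LinkedHashMap<>();
            for (int bufferSize : bufferSizes) {
                long[] samples = new long[runs];
                for (int i = 0; i < runs; i++) {
                    samples[i] = timeOnce(file, payload, chunkSize, bufferSize);
                }
                Arrays.sort(samples);
                medians.put(bufferSize, samples[runs / 2]);
            }
            return medians;
        } finally {
            Files.deleteIfExists(file);
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] payload = new byte[4 * 1024 * 1024]; // placeholder data: 4 MiB of zeros
        int[] sizes = {1024, 8 * 1024, 32 * 1024, 128 * 1024};
        sweep(payload, 200, sizes, 5); // warm-up pass, results discarded
        sweep(payload, 200, sizes, 21).forEach((size, nanos) ->
                System.out.println("buffer=" + size + " B, median=" + nanos / 1_000 + " us"));
    }
}
```

Replacing the placeholder payload and the 200-byte chunk size with recorded data from the application would make the comparison more representative.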


You might find the code for BufferedOutputStream.write useful:

    /**
     * Writes <code>len</code> bytes from the specified byte array
     * starting at offset <code>off</code> to this buffered output stream.
     *
     * <p> Ordinarily this method stores bytes from the given array into this
     * stream's buffer, flushing the buffer to the underlying output stream as
     * needed.  If the requested length is at least as large as this stream's
     * buffer, however, then this method will flush the buffer and write the
     * bytes directly to the underlying output stream.  Thus redundant
     * <code>BufferedOutputStream</code>s will not copy data unnecessarily.
     *
     * @param      b     the data.
     * @param      off   the start offset in the data.
     * @param      len   the number of bytes to write.
     * @exception  IOException  if an I/O error occurs.
     */
    public synchronized void write(byte b[], int off, int len) throws IOException {
        if (len >= buf.length) {
            /* If the request length exceeds the size of the output buffer,
               flush the output buffer and then write the data directly.
               In this way buffered streams will cascade harmlessly. */
            flushBuffer();
            out.write(b, off, len);
            return;
        }
        if (len > buf.length - count) {
            flushBuffer();
        }
        System.arraycopy(b, off, buf, count, len);
        count += len;
    }

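I have recently been exploring IO performance. From what I have observed, writing directly to a FileOutputStream has led to better results, which I attribute to FileOutputStream's native call for write(byte[], int, int). I have also observed that when BufferedOutputStream's latency begins to converge towards that of the direct FileOutputStream, it fluctuates a lot more, i.e. it can abruptly even double (I have not yet been able to find out why).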

P.S. I am using Java 8 and cannot comment right now on whether my observations hold for previous Java versions.

Here is the code I tested, where my input was a file of roughly 10 KB:

    // The original post did not show its imports; the logging imports below assume Log4j 2.
    import java.io.BufferedInputStream;
    import java.io.BufferedOutputStream;
    import java.io.ByteArrayOutputStream;
    import java.io.FileInputStream;
    import java.io.FileOutputStream;
    import java.io.IOException;
    import java.io.OutputStream;
    import java.nio.charset.StandardCharsets;

    import org.apache.logging.log4j.LogManager;
    import org.apache.logging.log4j.Logger;

    public class WriteCombinationsOutputStreamComparison {
        private static final Logger LOG = LogManager.getLogger(WriteCombinationsOutputStreamComparison.class);

        public static void main(String[] args) throws IOException {
            // Read the whole input file into memory before timing any writes.
            final ByteArrayOutputStream byteArrayOutputStream = new ByteArrayOutputStream();
            try (BufferedInputStream input = new BufferedInputStream(
                    new FileInputStream("src/main/resources/inputStream1.txt"), 4 * 1024)) {
                int data = input.read();
                while (data != -1) {
                    byteArrayOutputStream.write(data); // everything comes in memory
                    data = input.read();
                }
            }
            final byte[] bytesRead = byteArrayOutputStream.toByteArray();

            /*
             * 1. WRITE USING A STREAM DIRECTLY with entire byte array
             *    --> FileOutputStream directly uses a native call and writes
             */
            try (OutputStream outputStream = new FileOutputStream("src/main/resources/outputStream1.txt")) {
                final long begin = System.nanoTime();
                outputStream.write(bytesRead);
                outputStream.flush();
                final long end = System.nanoTime();
                LOG.info("Total time taken for file write, writing entire array [nanos=" + (end - begin)
                        + "], [bytesWritten=" + bytesRead.length + "]");
                if (LOG.isDebugEnabled()) {
                    LOG.debug("File reading result was: \n" + new String(bytesRead, StandardCharsets.UTF_8));
                }
            }

            /*
             * 2. WRITE USING A BUFFERED STREAM, write entire array
             */
            // changed the buffer size to different combinations
            // --> write latency fluctuates a lot for same buffer size over multiple runs
            try (BufferedOutputStream outputStream = new BufferedOutputStream(
                    new FileOutputStream("src/main/resources/outputStream1.txt"), 16 * 1024)) {
                final long begin = System.nanoTime();
                outputStream.write(bytesRead);
                outputStream.flush();
                final long end = System.nanoTime();
                LOG.info("Total time taken for buffered file write, writing entire array [nanos=" + (end - begin)
                        + "], [bytesWritten=" + bytesRead.length + "]");
                if (LOG.isDebugEnabled()) {
                    LOG.debug("File reading result was: \n" + new String(bytesRead, StandardCharsets.UTF_8));
                }
            }
        }
    }

OUTPUT:

    2017-01-30 23:38:59.064 [INFO] [main] [WriteCombinationsOutputStream] - Total time taken for file write, writing entire array [nanos=100990], [bytesWritten=11059]
    2017-01-30 23:38:59.086 [INFO] [main] [WriteCombinationsOutputStream] - Total time taken for buffered file write, writing entire array [nanos=142454], [bytesWritten=11059]
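A single timed write per variant is sensitive to JIT warm-up, class loading, and OS caching, which may account for some of the fluctuation. As a hedged sketch of one way to look at the spread instead of a single number, the following repeats the same whole-array write many times and reports the minimum, median, and maximum. `WriteLatencySpread`, `sample`, and `summarize` are hypothetical names, and the data is placeholder zeros of the same length as in the run above.

```java
import java.io.BufferedOutputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

public class WriteLatencySpread {

    /** Times `runs` whole-array writes and returns the samples in nanoseconds, sorted ascending. */
    public static long[] sample(Path file, byte[] data, int bufferSize, int runs) throws IOException {
        long[] nanos = new long[runs];
        for (int i = 0; i < runs; i++) {
            OutputStream raw = new FileOutputStream(file.toFile());
            // bufferSize <= 0 means "write straight to the FileOutputStream"
            try (OutputStream out = bufferSize > 0 ? new BufferedOutputStream(raw, bufferSize) : raw) {
                long begin = System.nanoTime();
                out.write(data);
                out.flush();
                nanos[i] = System.nanoTime() - begin;
            }
        }
        Arrays.sort(nanos);
        return nanos;
    }

    /** Formats min, median, and max of an ascending-sorted sample array. */
    public static String summarize(String label, long[] sorted) {
        return label + ": min=" + sorted[0]
                + " median=" + sorted[sorted.length / 2]
                + " max=" + sorted[sorted.length - 1] + " ns";
    }

    public static void main(String[] args) throws IOException {
        byte[] data = new byte[11_059]; // placeholder zeros, same length as the run above
        Path file = Files.createTempFile("latency-spread", ".bin");
        try {
            sample(file, data, 0, 2_000);         // warm-up, results discarded
            sample(file, data, 16 * 1024, 2_000); // warm-up, results discarded
            System.out.println(summarize("direct  ", sample(file, data, 0, 1_000)));
            System.out.println(summarize("buffered", sample(file, data, 16 * 1024, 1_000)));
        } finally {
            Files.deleteIfExists(file);
        }
    }
}
```

Note that an 11,059-byte array written through a 16 KB buffer is first copied into the buffer and then written on flush(), so in this particular test the buffered variant does strictly more work than the direct one. For anything beyond a quick check, a harness such as JMH is likely a better fit.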