Reading a file line by line (readline) with Java memory mapping

heipark

Blog category: Java

The code below reads a log file in two ways: once through a stream and once through a memory mapping:

import java.io.BufferedReader;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.CharBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.Charset;
import java.nio.charset.CharsetDecoder;
import java.util.Scanner;
import java.util.zip.GZIPInputStream;

public class Test {
	static String path = "D:\\log\\proclog\\loganalysis\\0108161331_CHN-MY-1_1021235501.log";

	public static void main(String[] s) throws IOException {
		stream();
		mem();
	}

	public static void stream() throws IOException {
		long startTime = System.currentTimeMillis();
		// try-with-resources closes the reader even if readLine() throws
		try (BufferedReader reader = getReader(new File(path))) {
			while (reader.readLine() != null) {
				// idle loop: each line is read and discarded
			}
		}
		long estimatedTime = System.currentTimeMillis() - startTime;
		System.out.printf("stream Diff: %d ms\n", estimatedTime);
	}

	public static BufferedReader getReader(File f) throws FileNotFoundException, IOException {
		BufferedReader reader = null;
		if (f.getName().endsWith(".gz")) {
			reader = new BufferedReader(new InputStreamReader(new GZIPInputStream(new FileInputStream(f))));
		} else {
			reader = new BufferedReader(new InputStreamReader(new FileInputStream(f)));
		}
		return reader;
	}

	public static void mem() throws IOException {
		long startTime = System.currentTimeMillis();
		// try-with-resources closes the stream and its channel even if mapping or decoding fails
		try (FileInputStream in = new FileInputStream(path);
				FileChannel fc = in.getChannel()) {
			// map the whole file read-only; a single mapping cannot exceed Integer.MAX_VALUE bytes
			MappedByteBuffer byteBuffer = fc.map(FileChannel.MapMode.READ_ONLY, 0, fc.size());
			//Charset charset = Charset.forName("US-ASCII");
			Charset charset = Charset.forName("iso-8859-1");
			CharsetDecoder decoder = charset.newDecoder();
			// decode() copies the entire file into a newly allocated CharBuffer
			CharBuffer charBuffer = decoder.decode(byteBuffer);
			// the delimiter string is compiled as a regular expression
			Scanner sc = new Scanner(charBuffer).useDelimiter(System.getProperty("line.separator"));
			while (sc.hasNext()) {
				sc.next();
			}
		}
		long estimatedTime = System.currentTimeMillis() - startTime;
		System.out.printf("mem Diff: %d ms\n", estimatedTime);
	}
}

Output:

stream Diff: 147 ms
mem Diff: 2470 ms

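Judging from the output, the stream approach is far faster than reading through a memory mapping, so for reading text line by line it seems best to keep using the stream API.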

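PS. The test file is 23 MB. With a 100 MB file, the mem approach fails with an out-of-memory error, which is a bit awkward; I am noting it here for now.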
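A possible explanation for the out-of-memory error, offered as an assumption since the post does not show the stack trace: `CharsetDecoder.decode(ByteBuffer)` returns a newly allocated `CharBuffer` holding the whole decoded file, and each `char` takes two bytes, so a 100 MB ISO-8859-1 file needs roughly 200 MB of heap in one piece. The sketch below decodes the mapped buffer in fixed-size chunks instead, so heap use stays bounded regardless of file size. The class name `ChunkedDecode`, the method `countNewlines`, and the 8192-char chunk size are illustrative choices, not part of the original code.

```java
import java.io.IOException;
import java.nio.CharBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CoderResult;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class ChunkedDecode {

    /** Counts '\n' characters while decoding the mapped file one small chunk at a time. */
    public static long countNewlines(Path file) throws IOException {
        try (FileChannel fc = FileChannel.open(file, StandardOpenOption.READ)) {
            // A single mapping is limited to Integer.MAX_VALUE bytes.
            MappedByteBuffer in = fc.map(FileChannel.MapMode.READ_ONLY, 0, fc.size());
            CharsetDecoder decoder = StandardCharsets.ISO_8859_1.newDecoder();
            CharBuffer out = CharBuffer.allocate(8192); // the only heap buffer used for decoding
            long newlines = 0;
            CoderResult result;
            do {
                // All input is already in the buffer, so endOfInput is true on every call.
                result = decoder.decode(in, out, true);
                if (result.isError()) {
                    result.throwException();
                }
                newlines += drain(out);
            } while (result.isOverflow()); // overflow means "out is full, call again"
            decoder.flush(out);
            newlines += drain(out);
            return newlines;
        }
    }

    /** Consumes the decoded chars in the buffer and resets it for the next chunk. */
    private static long drain(CharBuffer out) {
        long n = 0;
        out.flip();
        while (out.hasRemaining()) {
            if (out.get() == '\n') {
                n++;
            }
        }
        out.clear();
        return n;
    }
}
```

The same loop could hand each chunk to a line splitter instead of counting newlines; the point is only that the decoded text never has to exist in memory all at once.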

References:

http://hi.baidu.com/limin040206/blog/item/92763dfcd301ff0008244d48.html

http://jiangzhengjun.iteye.com/blog/515745

http://stackoverflow.com/questions/1045632/bufferedreader-for-large-bytebuffer

http://www.javakb.com/Uwe/Forum.aspx/java-programmer/7117/Reading-lines-of-text-from-a-MappedByteBuffer

-- end --


2011-08-25 11:21

Comments (7)

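#7 cyb_rc 2013-11-18

PS. The test file is 23 MB. With a 100 MB file, the mem approach fails with an out-of-memory error, which is a bit awkward; I am noting it here for now.

As for this: the default size of mem is the same as the heap size, and you can set the mem size yourself.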

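#6 cyb_rc 2013-11-18

As for this

PS. The test file is 23 MB. With a 100 MB file, the mem approach fails with an out-of-memory error, which is a bit awkward; I am noting it here for now.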

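#5 cyb_rc 2013-11-18

In the mapping approach, the part that produces the output takes too much of the time.

Mapped files are much faster than streams, and the larger the file, the more obvious the difference.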

#4 test_lockxxx 2012-02-08

My guess is that it gets stuck here:

Scanner sc = new Scanner(charBuffer).useDelimiter(System.getProperty("line.separator"));
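To test that suspicion, the `Scanner` and the whole-file decode can be taken out of the picture. The sketch below walks the mapped bytes directly, splits on `\n` (dropping a trailing `\r`), and decodes one line at a time as ISO-8859-1, the charset used in the post. It is a minimal illustration rather than a drop-in replacement for `BufferedReader`: the class name `MappedLineReader` and the method `forEachLine` are hypothetical, it assumes a single-byte charset, and it does not treat a lone `\r` as a line break.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Arrays;
import java.util.function.Consumer;

public class MappedLineReader {

    /** Passes every line of the file (without its line terminator) to the callback. */
    public static void forEachLine(Path file, Consumer<String> action) throws IOException {
        try (FileChannel fc = FileChannel.open(file, StandardOpenOption.READ)) {
            long size = fc.size();
            if (size == 0) {
                return; // nothing to read
            }
            // A single mapping is limited to Integer.MAX_VALUE bytes.
            MappedByteBuffer buf = fc.map(FileChannel.MapMode.READ_ONLY, 0, size);
            byte[] line = new byte[8192]; // grows if a line is longer than this
            int len = 0;
            while (buf.hasRemaining()) {
                byte b = buf.get();
                if (b == '\n') {
                    action.accept(toLine(line, len));
                    len = 0;
                } else {
                    if (len == line.length) {
                        line = Arrays.copyOf(line, len * 2);
                    }
                    line[len++] = b;
                }
            }
            if (len > 0) {
                action.accept(toLine(line, len)); // last line without a trailing newline
            }
        }
    }

    /** Decodes one line, dropping the '\r' of a "\r\n" terminator if present. */
    private static String toLine(byte[] bytes, int len) {
        if (len > 0 && bytes[len - 1] == '\r') {
            len--;
        }
        return new String(bytes, 0, len, StandardCharsets.ISO_8859_1);
    }
}
```

Calling `MappedLineReader.forEachLine(file, line -> { })` gives an idle loop comparable to the one in `stream()`. Timing that call against the `Scanner` version would show how much of the 2470 ms is spent in `Scanner` and in decoding rather than in the mapping itself.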

#3 test_lockxxx 2012-02-08

Comparing BufferedReader with MappedByteBuffer is unfair to begin with.

At the very least, BufferedInputStream should be compared with MappedByteBuffer.
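A byte-level comparison of the kind suggested here could look like the following sketch: both methods do the same work (counting `\n` bytes) with no charset decoding, so the only difference is how the bytes are fetched. The class and method names are hypothetical. A single timed run like this is only a rough indication, since JIT warm-up and the operating system's file cache both affect the numbers.

```java
import java.io.BufferedInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

public class ByteLevelComparison {

    /** Counts newline bytes by reading through a BufferedInputStream. */
    public static long countWithStream(Path file) throws IOException {
        long n = 0;
        try (InputStream in = new BufferedInputStream(Files.newInputStream(file))) {
            int b;
            while ((b = in.read()) != -1) {
                if (b == '\n') {
                    n++;
                }
            }
        }
        return n;
    }

    /** Counts newline bytes by reading through a read-only memory mapping. */
    public static long countWithMapping(Path file) throws IOException {
        long n = 0;
        try (FileChannel fc = FileChannel.open(file, StandardOpenOption.READ)) {
            // A single mapping is limited to Integer.MAX_VALUE bytes.
            MappedByteBuffer buf = fc.map(FileChannel.MapMode.READ_ONLY, 0, fc.size());
            while (buf.hasRemaining()) {
                if (buf.get() == '\n') {
                    n++;
                }
            }
        }
        return n;
    }

    public static void main(String[] args) throws IOException {
        Path file = Paths.get(args[0]); // path to the log file, passed on the command line
        long t0 = System.nanoTime();
        long viaStream = countWithStream(file);
        long t1 = System.nanoTime();
        long viaMapping = countWithMapping(file);
        long t2 = System.nanoTime();
        System.out.printf("stream:  %d newlines, %d ms%n", viaStream, (t1 - t0) / 1_000_000);
        System.out.printf("mapping: %d newlines, %d ms%n", viaMapping, (t2 - t1) / 1_000_000);
    }
}
```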

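#2 heipark 2011-10-18

lindakun wrote:

Is there a faster way to read a file line by line?

So far I don't know of anything other than BufferedReader. Sun/Oracle have already optimized it, so just use it with confidence. If you still have a problem, you may need to rethink your approach to the problem.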

#1 lindakun 2011-09-29

Is there a faster way to read a file line by line?
