mmap vs. malloc: strange performance

I'm writing some code that parses log files; the catch is that these files are compressed and have to be decompressed at run time. This is a somewhat performance-sensitive piece of code, so I'm trying out various approaches to find the right one. No matter how many threads I use, I essentially have as much memory available as the program needs.

I've found one approach that seems to perform quite well, and I'm trying to understand why it gives better performance.

Both approaches have a reader thread that reads from a piped gunzip process and writes into a large buffer. When the next log line is requested, the buffer is parsed lazily, returning a struct of pointers to the different fields within that buffer.

The code is in D, but it is very similar to C or C++.
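
For concreteness, the struct of pointers returned by the lazy parser could look roughly like the sketch below. The field names here are hypothetical, not taken from the original code; the point is that each member is a slice (pointer plus length) into the shared buffer, so parsing copies nothing:

    // Hypothetical shape of a lazily parsed record: every field is a
    // slice into the shared decompression buffer rather than a copy.
    struct LogLine {
        char[] timestamp; // points at the timestamp bytes inside the buffer
        char[] level;     // points at the severity field inside the buffer
        char[] message;   // points at the rest of the line
    }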

Shared variables:

    shared(bool)  _stream_empty = false;
    shared(ulong) upper_bound   = 0;
    shared(ulong) curr_index    = 0;

Parsing code:

    // Lazily parse the buffer
    void construct_next_elem() {
        while(1) {
            // Spin to stop us from getting ahead of the reader thread
            buffer_empty = curr_index >= upper_bound - 1 && _stream_empty;
            if(curr_index >= upper_bound && !_stream_empty) {
                continue;
            }
            // Parsing logic
            .....
        }
    }

Approach 1: malloc a buffer large enough to hold the decompressed file.

    char[] buffer;                  // Same as vector<char> in C++
    buffer.length = buffer_length;  // Same as vector reserve in C++ or malloc
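
If you wanted to take the "or malloc" comparison literally and bypass the GC heap entirely, a minimal sketch could look like the following. This is my assumption, not code from the original program; buffer_length is the same value used in the snippets here:

    import core.stdc.stdlib : free, malloc;

    // Allocate the buffer straight from the C heap; the memory is not
    // scanned by the GC and is released explicitly with free(buffer.ptr).
    char[] allocate_buffer(size_t buffer_length) {
        char* p = cast(char*) malloc(buffer_length);
        assert(p !is null, "malloc failed");
        return p[0 .. buffer_length]; // slice view over the raw allocation
    }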

Approach 2: use an anonymous memory mapping as the buffer.

    MmFile buffer;
    buffer = new MmFile(null, MmFile.Mode.readWrite, // PROT_READ || PROT_WRITE
                        buffer_length, null);        // MAP_ANON || MAP_PRIVATE
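
How the MmFile buffer is consumed afterwards isn't shown, so this is an assumption on my part: one reasonable pattern is to slice the mapping once into a plain char[], so that the reader and parser index raw memory instead of going through MmFile.opIndex/opSlice on every access:

    import std.mmfile : MmFile;

    // Keep the MmFile object alive for as long as the view is used, but
    // slice it once so later reads and writes touch the mapped memory directly.
    MmFile map    = new MmFile(null, MmFile.Mode.readWrite, buffer_length, null);
    char[] buffer = cast(char[]) map[]; // opSlice() over the entire mapping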

Reader thread:

    ulong buffer_length = get_gzip_length(file_path);
    pipe = pipeProcess(["gunzip", "-c", file_path], Redirect.stdout);
    stream = pipe.stdout();

    static void stream_data() {
        while(!stream.eof()) {
            // Splice is a reference inside the buffer
            char[] splice = buffer[upper_bound .. upper_bound + READ_SIZE];
            ulong read = stream.rawRead(splice).length;
            upper_bound += read;
        }
        // Clean up
    }

    void start_stream() {
        auto t = task!stream_data();
        t.executeInNewThread();
        construct_next_elem();
    }

Approach 1 performs noticeably better, by a wide margin. The first set of numbers below is for approach 1 (the malloc'd buffer), the second for approach 2 (the anonymous MmFile mapping):

    User time (seconds): 112.22
    System time (seconds): 38.56
    Percent of CPU this job got: 151%
    Elapsed (wall clock) time (h:mm:ss or m:ss): 1:39.40
    Average shared text size (kbytes): 0
    Average unshared data size (kbytes): 0
    Average stack size (kbytes): 0
    Average total size (kbytes): 0
    Maximum resident set size (kbytes): 3784992
    Average resident set size (kbytes): 0
    Major (requiring I/O) page faults: 0
    Minor (reclaiming a frame) page faults: 5463
    Voluntary context switches: 90707
    Involuntary context switches: 2838
    Swaps: 0
    File system inputs: 0
    File system outputs: 0
    Socket messages sent: 0
    Socket messages received: 0
    Signals delivered: 0
    Page size (bytes): 4096
    Exit status: 0

    User time (seconds): 275.92
    System time (seconds): 73.92
    Percent of CPU this job got: 117%
    Elapsed (wall clock) time (h:mm:ss or m:ss): 4:58.73
    Average shared text size (kbytes): 0
    Average unshared data size (kbytes): 0
    Average stack size (kbytes): 0
    Average total size (kbytes): 0
    Maximum resident set size (kbytes): 3777336
    Average resident set size (kbytes): 0
    Major (requiring I/O) page faults: 0
    Minor (reclaiming a frame) page faults: 944779
    Voluntary context switches: 89305
    Involuntary context switches: 9836
    Swaps: 0
    File system inputs: 0
    File system outputs: 0
    Socket messages sent: 0
    Socket messages received: 0
    Signals delivered: 0
    Page size (bytes): 4096
    Exit status: 0

Approach 2 incurs far more minor page faults (944,779 vs. 5,463).

Can anyone help me explain why there is such a marked performance penalty when using mmap?

And if anyone knows a better way to approach this problem, I'd be very glad to hear it.

Edit ----

Changed approach 2 to do:

    char* buffer = cast(char*)mmap(cast(void*)null,
                                   buffer_length,
                                   PROT_READ | PROT_WRITE,
                                   MAP_ANON | MAP_PRIVATE,
                                   -1,
                                   0);
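
For completeness, a slightly fuller sketch of the same call, with the import, an error check, and the matching munmap; these additions are mine (assuming the POSIX bindings in core.sys.posix.sys.mman), not part of the original program:

    import core.sys.posix.sys.mman;

    // Anonymous, private, read/write mapping the size of the decompressed file.
    void* p = mmap(null, buffer_length,
                   PROT_READ | PROT_WRITE,
                   MAP_ANON | MAP_PRIVATE,
                   -1, 0);
    assert(p != MAP_FAILED, "mmap failed");
    char[] buffer = (cast(char*) p)[0 .. buffer_length];

    // ... stream into the buffer and parse it ...

    munmap(p, buffer_length); // release the mapping when finished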

This now gives roughly a 3x performance improvement over using the plain MmFile. I'm trying to figure out what could cause such a markedly different performance profile, given that MmFile is essentially just a wrapper around mmap.

Perf numbers when using the raw char* mmap instead of MmFile; note the far lower page fault count:

    User time (seconds): 109.99
    System time (seconds): 36.11
    Percent of CPU this job got: 151%
    Elapsed (wall clock) time (h:mm:ss or m:ss): 1:36.20
    Average shared text size (kbytes): 0
    Average unshared data size (kbytes): 0
    Average stack size (kbytes): 0
    Average total size (kbytes): 0
    Maximum resident set size (kbytes): 3777896
    Average resident set size (kbytes): 0
    Major (requiring I/O) page faults: 0
    Minor (reclaiming a frame) page faults: 2771
    Voluntary context switches: 90827
    Involuntary context switches: 2999
    Swaps: 0
    File system inputs: 0
    File system outputs: 0
    Socket messages sent: 0
    Socket messages received: 0
    Signals delivered: 0
    Page size (bytes): 4096
    Exit status: 0

You are getting the page faults and the slowdown because, by default, mmap only loads a page once you try to access it.

read(), on the other hand, knows that you are reading sequentially, so it prefetches pages before you ask for them.

Have a look at the madvise call: it is there to tell the kernel how you intend to access the mmapped memory, and it lets you set different policies for different parts of the mapping. For example, you might have an index block that you want to keep resident [MADV_WILLNEED] while the contents are accessed randomly and on demand [MADV_RANDOM], or you might be scanning through the memory sequentially [MADV_SEQUENTIAL].
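
Applied to the char* mapping from the edit above, that might look like the sketch below. This is my sketch, assuming the Linux bindings in core.sys.linux.sys.mman are available; posix_madvise with POSIX_MADV_SEQUENTIAL from core.sys.posix.sys.mman would be the portable spelling:

    import core.sys.linux.sys.mman;

    // Hint that the mapping will be filled and scanned front to back, so the
    // kernel can fault pages in ahead of the access pattern instead of one at
    // a time. MADV_WILLNEED and MADV_RANDOM express the other policies
    // mentioned above.
    madvise(buffer, buffer_length, MADV_SEQUENTIAL);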

The OS is, however, completely free to ignore whatever policy you set, so YMMV.