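How to write x64 machine code into virtual memory and execute it in C++ on Windows

I have been wondering how the V8 JavaScript engine and other JIT compilers execute the code they generate.

These are the articles I read while writing a small demo.

http://eli.thegreenplace.net/2013/11/05/how-to-jit-an-introduction
http://nullprogram.com/blog/2015/03/19/

I know very little assembly, so I initially used http://gcc.godbolt.org/ to write a function and get the disassembled output, but that code did not work on Windows.

Then I wrote a small C++ program, compiled it with -g -Og, and used gdb to get the disassembled output.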

    #include <stdio.h>

    int square(int num) {
        return num * num;
    }

    int main() {
        printf("%d\n", square(10));
        return 0;
    }

Output:

    Dump of assembler code for function square(int):
    => 0x00000000004015b0 <+0>: imul %ecx,%ecx
       0x00000000004015b3 <+3>: mov %ecx,%eax
       0x00000000004015b5 <+5>: retq

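I copy-pasted the output (with the '%' removed) into an online x86 assembler and got { 0x0F, 0xAF, 0xC9, 0x89, 0xC1, 0xC3 }.

Here is my final code. If I compile it with gcc, I always get 1. If I compile it with VC++, I get random numbers. What is going on?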

    #include <stdio.h>
    #include <string.h>
    #include <stdlib.h>
    #include <windows.h>

    typedef unsigned char byte;
    typedef int (*int0_int)(int);

    const byte square_code[] = { 0x0f, 0xaf, 0xc9, 0x89, 0xc1, 0xc3 };

    int main() {
        byte* buf = reinterpret_cast<byte*>(VirtualAlloc(0, 1 << 8, MEM_RESERVE | MEM_COMMIT, PAGE_READWRITE));
        if (buf == nullptr) return 0;
        memcpy(buf, square_code, sizeof(square_code));
        {
            DWORD old;
            VirtualProtect(buf, 1 << 8, PAGE_EXECUTE_READ, &old);
        }
        int0_int square = reinterpret_cast<int0_int>(buf);
        int ans = square(100);
        printf("%d\n", ans);
        VirtualFree(buf, 0, MEM_RELEASE);
        return 0;
    }

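Note

I am trying to learn how a JIT works, so please do not suggest that I use LLVM or any other library. I promise I will use a proper JIT library in a real project rather than writing one from scratch.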

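Note: as Ben Voigt pointed out in the comments, this really only applies to x86, not x86_64. For x86_64, you simply have an error in your assembly (one that would be wrong on x86 as well), which Ben Voigt also points out in his answer.

This happens because the compiler may see both ends of the function call when it generates the assembly. Since the compiler controls code generation for both the caller and the callee, it is not obliged to follow the cdecl calling convention.

MSVC's default calling convention is cdecl. Basically, function arguments are pushed onto the stack in the reverse of the order in which they are listed, so a call to foo(10, 100) might result in the assembly: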

    push 100
    push 10
    call foo(int, int)

In your case, the compiler will generate something like the following at the call site:

    push 100
    call esi ; assuming the address of your code is in the register esi

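That is not what your code expects. Your code expects its argument to be passed in the register ecx, not on the stack.

The compiler used what looks like the fastcall calling convention. If I compile a similar program (I get slightly different assembly), I get the expected result: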

    #include <stdio.h>
    #include <string.h>
    #include <stdlib.h>
    #include <windows.h>

    typedef unsigned char byte;
    typedef int (_fastcall *int0_int)(int);

    const byte square_code[] = { 0x8b, 0xc1, 0x0f, 0xaf, 0xc0, 0xc3 };

    int main() {
        byte* buf = reinterpret_cast<byte*>(VirtualAlloc(0, 1 << 8, MEM_RESERVE | MEM_COMMIT, PAGE_READWRITE));
        if (buf == nullptr) return 0;
        memcpy(buf, square_code, sizeof(square_code));
        {
            DWORD old;
            VirtualProtect(buf, 1 << 8, PAGE_EXECUTE_READ, &old);
        }
        int0_int square = reinterpret_cast<int0_int>(buf);
        int ans = square(100);
        printf("%d\n", ans);
        VirtualFree(buf, 0, MEM_RELEASE);
        return 0;
    }

Note that I told the compiler to use the _fastcall calling convention. If you want to use cdecl, the assembly needs to look more like this:

    push ebp
    mov ebp, esp
    mov eax, DWORD PTR _n$[ebp]
    imul eax, eax
    pop ebp
    ret 0

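(Poster's note: I am not good at assembly; this was generated by Visual Studio.)

"I copy-pasted the output (with the '%' removed)"

Well, that means your second instruction is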

    mov ecx, eax

which makes no sense at all (it overwrites the result of the multiplication with the uninitialized return value).

On the other hand,

    mov eax, foo
    ret

is a very common pattern for ending a function with a non-void return type.
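To see why the bytes 0x89, 0xC1 mean `mov ecx, eax` rather than `mov eax, ecx`, it can help to decode the second byte (the ModRM byte) by hand. The following is only a minimal sketch: `decode_mov_89` and `kReg32` are hypothetical names introduced for illustration, and the function handles only the register-to-register form of opcode 0x89.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// 32-bit register names in x86 encoding order (index 0..7).
static const char* const kReg32[8] = {
    "eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"
};

// Decode the ModRM byte that follows opcode 0x89 (MOV r/m32, r32) and
// return the instruction in Intel syntax.
// Precondition: the top two bits (mod) are 11, i.e. both operands are registers.
std::string decode_mov_89(std::uint8_t modrm) {
    assert((modrm >> 6) == 3 && "only register-to-register forms are handled");
    unsigned src = (modrm >> 3) & 7;  // 'reg' field: source for opcode 0x89
    unsigned dst = modrm & 7;         // 'rm' field: destination for opcode 0x89
    return std::string("mov ") + kReg32[dst] + ", " + kReg32[src];
}
```

With this sketch, `decode_mov_89(0xC1)` gives `mov ecx, eax` (the byte pair in the question), while `decode_mov_89(0xC8)` gives `mov eax, ecx`, which is what the gdb listing actually describes.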

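The difference between the two assembly syntaxes (AT&T style vs. Intel style) is more than just the % marker: the operand order is reversed, and pointers and offsets are written very differently as well.

You will want to issue a set disassembly-flavor intel command in gdb.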
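As a rough illustration of the operand-order point, here is a small sketch that rewrites a two-operand, register-only AT&T instruction in Intel order. `att_to_intel` is a hypothetical helper written for this answer, not part of any tool; it deliberately ignores memory operands, immediates, and AT&T size suffixes such as the `q` in `retq`.

```cpp
#include <cassert>
#include <string>

// Convert e.g. "mov %ecx,%eax" (AT&T: source first) into
// "mov eax, ecx" (Intel: destination first).
// Assumes exactly one mnemonic followed by two register operands.
std::string att_to_intel(const std::string& att) {
    const std::string::size_type space = att.find(' ');
    const std::string::size_type comma = att.find(',');
    assert(space != std::string::npos && comma != std::string::npos && space < comma);

    // Remove '%' prefixes and stray spaces from an operand.
    auto strip = [](const std::string& s) {
        std::string out;
        for (char c : s) {
            if (c != '%' && c != ' ') out += c;
        }
        return out;
    };

    const std::string mnemonic = att.substr(0, space);
    const std::string src = strip(att.substr(space + 1, comma - space - 1));
    const std::string dst = strip(att.substr(comma + 1));
    return mnemonic + " " + dst + ", " + src;  // operands swapped
}
```

Applied to the gdb listing in the question, `mov %ecx,%eax` becomes `mov eax, ecx`. Simply deleting the `%` characters leaves the operands in AT&T order, which an Intel-syntax assembler reads the other way round.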
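Putting this together for the 64-bit case: the gdb listing shows the argument arriving in ecx and the result leaving in eax, so the only byte that needs to change is the ModRM byte of the `mov` (0xC8 instead of 0xC1). The following is a sketch under that assumption. `run_square` is a hypothetical wrapper around the same Win32 calls the question uses, and it is compiled only for 64-bit Windows builds.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#if defined(_WIN32)
#include <windows.h>
#endif

typedef unsigned char byte;
typedef int (*int0_int)(int);

// imul ecx, ecx   0f af c9
// mov  eax, ecx   89 c8     (the question had 89 c1, i.e. mov ecx, eax)
// ret             c3
const byte square_code[] = { 0x0f, 0xaf, 0xc9, 0x89, 0xc8, 0xc3 };

#if defined(_WIN32) && (defined(_M_X64) || defined(__x86_64__))
// Copy square_code into a fresh page, make the page executable, and call it.
// Returns false if VirtualAlloc or VirtualProtect fails; on success the
// result of the generated code is stored in *out.
bool run_square(int n, int* out) {
    assert(out != nullptr);
    const SIZE_T size = 1 << 8;
    void* mem = VirtualAlloc(nullptr, size, MEM_RESERVE | MEM_COMMIT, PAGE_READWRITE);
    if (mem == nullptr) return false;
    std::memcpy(mem, square_code, sizeof(square_code));

    DWORD old;
    const bool ok = VirtualProtect(mem, size, PAGE_EXECUTE_READ, &old) != 0;
    if (ok) {
        // Make sure the CPU does not execute stale instruction-cache contents.
        FlushInstructionCache(GetCurrentProcess(), mem, size);
        int0_int square = reinterpret_cast<int0_int>(mem);
        *out = square(n);
    }
    VirtualFree(mem, 0, MEM_RELEASE);
    return ok;
}
#endif
```

On a 64-bit Windows build, calling `run_square(100, &result)` should leave 10000 in `result`. On 32-bit x86 the calling-convention issue described in the other answer still applies, so these bytes would need a matching convention on the function pointer type.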