I ran into this problem on Windows XP while trying to write a function like GetTickCount64, which does not exist on that platform. Here is my test code:
    #include <windows.h>
    #include <boost/assert.hpp>
    #include <cstdint>
    #include <cstdio>
    #include <ctime>

    uint64_t GetTickCountEx()
    {
    #if _WIN32_WINNT > _WIN32_WINNT_WINXP
        return GetTickCount64();
    #else
        // http://msdn.microsoft.com/en-us/library/windows/desktop/dn553408.aspx
        LARGE_INTEGER Frequency = {};
        LARGE_INTEGER Counter = {};
        BOOST_VERIFY(QueryPerformanceFrequency(&Frequency));
        BOOST_VERIFY(QueryPerformanceCounter(&Counter));
        // Convert performance-counter ticks to milliseconds.
        return 1000 * Counter.QuadPart / Frequency.QuadPart;
    #endif
    }

    int main()
    {
        // Take one sample every 30 seconds.
        for (int i = 0; ++i < 1000; Sleep(30000))
        {
            const auto utc = time(nullptr);    // System time
            const auto xp = GetTickCount();    // API of Windows XP SP3
            const auto ex = GetTickCountEx();  // Performance counter
            const auto diff = ex - xp;
            printf("%lld %I32u %I64u %I64u \n", utc, xp, ex, diff);
        }
    }
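I cannot make sense of the results below. Judging by this post, Angstrom's reply seems to be incorrect. The last column shows that the gap between GTC (GetTickCount) and QPC (QueryPerformanceCounter) keeps shrinking as time goes on! ...Will it reach zero after a few hours?
So my question is: is my implementation of GetTickCount64 correct, and why or why not?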
    1401778679 503258484 503355416 96932
    1401778709 503288484 503385374 96890
    1401778739 503318484 503415354 96870
    1401778769 503348484 503445289 96805
    1401778799 503378484 503475274 96790
    1401778829 503408484 503505272 96788
    1401778859 503438484 503535245 96761
    1401778889 503468500 503565210 96710
    1401778919 503498500 503595143 96643
    1401778949 503528500 503625137 96637
    1401778979 503558500 503655100 96600
    1401779009 503588500 503685069 96569
    1401779039 503618500 503715069 96569
    1401779069 503648500 503745006 96506
    1401779099 503678500 503774951 96451
    1401779129 503708500 503804958 96458
    1401779159 503738500 503834943 96443
    1401779189 503768500 503864911 96411
    1401779219 503798500 503894792 96292
    1401779249 503828500 503924759 96259
    1401779279 503858500 503954607 96107
    1401779309 503888500 503984607 96107
    1401779339 503918500 504014392 95892
    1401779369 503948500 504044362 95862
CPU core information as reported by coreinfo.exe:

    Coreinfo v3.21 - Dump information on system CPU and memory topology
    Copyright (C) 2008-2013 Mark Russinovich
    Sysinternals - www.sysinternals.com
    Intel(R) Core(TM) i3 CPU M 380 @ 2.53GHz
    x86 Family 6 Model 37 Stepping 5, GenuineIntel
    HTT            * Hyperthreading enabled
    HYPERVISOR     - Hypervisor is present
    VMX            * Supports Intel hardware-assisted virtualization
    SVM            - Supports AMD hardware-assisted virtualization
    EM64T          * Supports 64-bit mode
    SMX            - Supports Intel trusted execution
    SKINIT         - Supports AMD SKINIT
    NX             * Supports no-execute page protection
    SMEP           - Supports Supervisor Mode Execution Prevention
    SMAP           - Supports Supervisor Mode Access Prevention
    PAGE1GB        - Supports 1 GB large pages
    PAE            * Supports > 32-bit physical addresses
    PAT            * Supports Page Attribute Table
    PSE            * Supports 4 MB pages
    PSE36          * Supports > 32-bit address 4 MB pages
    PGE            * Supports global bit in page tables
    SS             * Supports bus snooping for cache operations
    VME            * Supports Virtual-8086 mode
    RDWRFSGSBASE   - Supports direct GS/FS base access
    FPU            * Implements i387 floating point instructions
    MMX            * Supports MMX instruction set
    MMXEXT         - Implements AMD MMX extensions
    3DNOW          - Supports 3DNow! instructions
    3DNOWEXT       - Supports 3DNow! extension instructions
    SSE            * Supports Streaming SIMD Extensions
    SSE2           * Supports Streaming SIMD Extensions 2
    SSE3           * Supports Streaming SIMD Extensions 3
    SSSE3          * Supports Supplemental SIMD Extensions 3
    SSE4a          - Supports Sreaming SIMDR Extensions 4a
    SSE4.1         * Supports Streaming SIMD Extensions 4.1
    SSE4.2         * Supports Streaming SIMD Extensions 4.2
    AES            - Supports AES extensions
    AVX            - Supports AVX intruction extensions
    FMA            - Supports FMA extensions using YMM state
    MSR            * Implements RDMSR/WRMSR instructions
    MTRR           * Supports Memory Type Range Registers
    XSAVE          - Supports XSAVE/XRSTOR instructions
    OSXSAVE        - Supports XSETBV/XGETBV instructions
    RDRAND         - Supports RDRAND instruction
    RDSEED         - Supports RDSEED instruction
    CMOV           * Supports CMOVcc instruction
    CLFSH          * Supports CLFLUSH instruction
    CX8            * Supports compare and exchange 8-byte instructions
    CX16           * Supports CMPXCHG16B instruction
    BMI1           - Supports bit manipulation extensions 1
    BMI2           - Supports bit manipulation extensions 2
    ADX            - Supports ADCX/ADOX instructions
    DCA            - Supports prefetch from memory-mapped device
    F16C           - Supports half-precision instruction
    FXSR           * Supports FXSAVE/FXSTOR instructions
    FFXSR          - Supports optimized FXSAVE/FSRSTOR instruction
    MONITOR        * Supports MONITOR and MWAIT instructions
    MOVBE          - Supports MOVBE instruction
    ERMSB          - Supports Enhanced REP MOVSB/STOSB
    PCLULDQ        - Supports PCLMULDQ instruction
    POPCNT         * Supports POPCNT instruction
    LZCNT          - Supports LZCNT instruction
    SEP            * Supports fast system call instructions
    LAHF-SAHF      * Supports LAHF/SAHF instructions in 64-bit mode
    HLE            - Supports Hardware Lock Elision instructions
    RTM            - Supports Restricted Transactional Memory instructions
    DE             * Supports I/O breakpoints including CR4.DE
    DTES64         * Can write history of 64-bit branch addresses
    DS             * Implements memory-resident debug buffer
    DS-CPL         * Supports Debug Store feature with CPL
    PCID           * Supports PCIDs and settable CR4.PCIDE
    INVPCID        - Supports INVPCID instruction
    PDCM           * Supports Performance Capabilities MSR
    RDTSCP         * Supports RDTSCP instruction
    TSC            * Supports RDTSC instruction
    TSC-DEADLINE   - Local APIC supports one-shot deadline timer
    TSC-INVARIANT  * TSC runs at constant rate
    xTPR           * Supports disabling task priority messages
    EIST           * Supports Enhanced Intel Speedstep
    ACPI           * Implements MSR for power management
    TM             * Implements thermal monitor circuitry
    TM2            * Implements Thermal Monitor 2 control
    APIC           * Implements software-accessible local APIC
    x2APIC         - Supports x2APIC
    CNXT-ID        - L1 data cache mode adaptive or BIOS
    MCE            * Supports Machine Check, INT18 and CR4.MCE
    MCA            * Implements Machine Check Architecture
    PBE            * Supports use of FERR#/PBE# pin
    PSN            - Implements 96-bit processor serial number
    PREFETCHW      * Supports PREFETCHW instruction

    Maximum implemented CPUID leaves: 0000000B (Basic), 80000008 (Extended).

    Logical to Physical Processor Map:
    *-*-  Physical Processor 0 (Hyperthreaded)
    -*-*  Physical Processor 1 (Hyperthreaded)

    Logical Processor to Socket Map:
    ****  Socket 0

    Logical Processor to NUMA Node Map:
    ****  NUMA Node 0

    Logical Processor to Cache Map:
    *-*-  Data Cache          0, Level 1,   32 KB, Assoc   8, LineSize  64
    *-*-  Instruction Cache   0, Level 1,   32 KB, Assoc   4, LineSize  64
    *-*-  Unified Cache       0, Level 2,  256 KB, Assoc   8, LineSize  64
    -*-*  Data Cache          1, Level 1,   32 KB, Assoc   8, LineSize  64
    -*-*  Instruction Cache   1, Level 1,   32 KB, Assoc   4, LineSize  64
    -*-*  Unified Cache       1, Level 2,  256 KB, Assoc   8, LineSize  64
    ****  Unified Cache       2, Level 3,    3 MB, Assoc  12, LineSize  64
You cannot compare these two timing sources; their implementations on a PC are very different.
GetTickCount()
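This is driven by the clock tick interrupt, a signal generated by the real-time clock. Traditionally that was a dedicated chip, originally the Motorola MC146818, and nowadays it is integrated into the south bridge. It has an oscillator of the kind used in watches, crystal-stabilized and typically running at 32768 Hz. The oscillator keeps running while the machine is powered off, fed by a lithium battery or a supercapacitor.
So the resolution is fairly poor, but the long-term stability is very good and the clock is very accurate, because it is periodically re-synchronized with the time supplied by a time server; most Windows machines use time.windows.com. See GetSystemTimeAdjustment() for details.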
QueryPerformanceCounter()
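This uses a frequency source available in the chipset. Traditionally that was the 8253 counter running at 1193182 Hz; nowadays it is usually the HPET timer. The HAL (Hardware Abstraction Layer) lets the system integrator pick whatever frequency source is available. Using the CPU clock is not uncommon on cheap designs.
So the resolution is very high, but it is not accurate: there is no mechanism to calibrate this timer. Being off by 800 ppm from the frequency reported by QueryPerformanceFrequency() is not unusual. This timer should only be used to measure short intervals, the kind a profiler would use.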
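To put a number on the drift in the question's output, here is a minimal sketch that computes the relative rate difference between two counters from their first and last readings. `DriftPpm` is a hypothetical helper name, not a Windows API, and it assumes both counters report milliseconds.

```cpp
#include <cstdint>

// Hypothetical helper (not a Windows API): relative drift of counter B
// against reference counter A, in parts per million, computed from the
// first and last readings of each counter.
double DriftPpm(std::uint64_t a_first, std::uint64_t a_last,
                std::uint64_t b_first, std::uint64_t b_last)
{
    const double a_elapsed = static_cast<double>(a_last - a_first);
    const double b_elapsed = static_cast<double>(b_last - b_first);
    // Negative result: B advances more slowly than A.
    return (b_elapsed - a_elapsed) / a_elapsed * 1e6;
}
```

Fed with the first and last rows of the posted output (GetTickCount() 503258484 to 503948500 as A, the QueryPerformanceCounter()-based value 503355416 to 504044362 as B), this gives roughly -1550 ppm: the performance-counter value lost 1070 ms over about 690 seconds. That is the same order of magnitude as the 800 ppm figure mentioned above.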
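So no, using QueryPerformanceCounter() as a substitute for GetTickCount64() is not a good idea, unless you can live with the inaccuracy. Technically you can synthesize your own 64-bit counter, as long as you keep track of GetTickCount() overflowing. For example, when the previous value was negative and the new value is positive, increment an overflow count. The only requirement is that you sample GetTickCount() often enough to see that transition, at least once every 24 days.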
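The overflow-tracking idea can be sketched roughly as follows. `TickExtender` is a hypothetical class, not part of any Windows API; it takes the 32-bit tick value as a parameter so that it stays portable, and it is not thread-safe.

```cpp
#include <cstdint>

// Hypothetical helper (not a Windows API): extends a wrapping 32-bit
// millisecond tick count to 64 bits by counting overflows.
class TickExtender {
public:
    // Feed the latest 32-bit tick value; returns the extended 64-bit value.
    // Must be called at least once every 24 days so that the
    // negative-to-positive transition is never missed.
    std::uint64_t Update(std::uint32_t now)
    {
        const bool was_negative = static_cast<std::int32_t>(last_) < 0;
        const bool is_positive = static_cast<std::int32_t>(now) >= 0;
        if (was_negative && is_positive) {
            ++overflows_;  // the 32-bit counter wrapped around
        }
        last_ = now;
        return (static_cast<std::uint64_t>(overflows_) << 32) | now;
    }

private:
    std::uint32_t last_ = 0;       // previous 32-bit sample
    std::uint32_t overflows_ = 0;  // number of wraparounds seen so far
};
```

On Windows XP one would keep a single `TickExtender` instance and call `Update(GetTickCount())` from one thread, for example from a periodic timer. The result counts from the first sample's epoch and cannot account for wraps that happened before the process started.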
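In the same thread as the post you linked to, 陈百强's reply says clearly that you should not attach any meaning to either value on its own. Only time differences (intervals) are the relevant quantity. So what you should test is intervals, for example by recording a start value from each source before the loop and subtracting it from the value read on each iteration.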
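A minimal sketch of that interval-based comparison, assuming the start values are captured once before the loop. Both function names are hypothetical, and the raw readings are passed in as parameters so the arithmetic stays portable.

```cpp
#include <cstdint>

// Hypothetical helper: elapsed milliseconds between two GetTickCount()-style
// readings. Unsigned subtraction is modulo 2^32, so the result stays correct
// across a single wraparound of the 32-bit counter.
std::uint32_t TickElapsedMs(std::uint32_t start, std::uint32_t now)
{
    return now - start;
}

// Hypothetical helper: elapsed milliseconds between two
// QueryPerformanceCounter()-style readings, given the frequency in Hz.
// Subtracting first keeps the intermediate product small.
std::int64_t QpcElapsedMs(std::int64_t start, std::int64_t now,
                          std::int64_t frequency)
{
    return (now - start) * 1000 / frequency;
}
```

In the test loop one would print `TickElapsedMs(xp0, GetTickCount())` next to `QpcElapsedMs(qpc0, Counter.QuadPart, Frequency.QuadPart)`, where `xp0` and `qpc0` are the assumed start readings. Both columns then start at zero, and any divergence between them is pure drift rather than a difference in epoch.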