LoadLibrary error 998 on Windows XP

While recently upgrading to the Chromium 63 core, I ran into a strange problem: the compiled DLL failed to load via LoadLibrary on XP, and GetLastError returned 998 (Invalid access to memory location).

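When loading a DLL fails with error 998, it is usually because code inside the DLL performed an invalid memory access and raised a C0000005 exception. I attached the Windbg debugger to take a look, and sure enough the following error showed up: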

ModLoad: 762d0000 762e0000   C:\WINDOWS\system32\WINSTA.dll
ModLoad: 76680000 76726000   C:\WINDOWS\system32\WININET.dll
ModLoad: 4ae90000 4b036000   C:\WINDOWS\WinSxS\x86_Microsoft.Windows.GdiPlus_6595b64144ccf1df_1.0.2600.5512_x-ww_dfb54e0c\gdiplus.dll
(e74.ce0): Access violation - code c0000005 (first chance)
First chance exceptions are reported before any exception handling.
This exception may be expected and handled.
eax=00000000 ebx=00000051 ecx=056ddb0c edx=00000000 esi=04d6de98 edi=056ddb0c
eip=0261fb90 esp=0012f728 ebp=0012f74c iopl=0         nv up ei pl nz na pe nc
cs=001b  ss=0023  ds=0023  es=0023  fs=003b  gs=0000             efl=00010206
xxx!base::internal::LockImpl::LockImpl+0x10:
0261fb90 8b1490          mov     edx,dword ptr [eax+edx*4] ds:0023:00000000=????????

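But looking at the corresponding source code, I could not see anything wrong with it. I spent a long time investigating and ruling out various possibilities without getting anywhere.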

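On my way home that evening, I happened across this question online: https://stackoverflow.com/questions/32517234/access-violation-on-static-initialization. The assembly described there looked very similar to my case. The answer explains that the code Visual Studio 2015 generates for initializing local static variables uses TLS, that this feature has problems on XP, and that adding the /Zc:threadSafeInit- compiler switch fixes it. I realized that the code where my exception occurred also initializes a local variable declared static, so I was getting closer and closer.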
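The kind of code involved can be sketched as follows. This is a minimal, portable illustration and not the actual Chromium source: the names ComputeAnswer, GetAnswer, and InitCalls are hypothetical. It shows a function-local static with a dynamic initializer, which is the construct that a C++11 compiler such as Visual Studio 2015 guards with thread-safe initialization code by default.

```cpp
// Hypothetical names throughout; this is an illustration, not Chromium code.

// Counts how many times the initializer actually runs.
static int g_init_calls = 0;

// Stand-in for an initializer that cannot be evaluated at compile time.
static int ComputeAnswer() {
  ++g_init_calls;
  return 42;
}

int GetAnswer() {
  // Local static with a dynamic initializer. The initializer runs on the
  // first call only. By default the compiler emits guard code around it so
  // that concurrent first calls are safe; per the Stack Overflow answer,
  // Visual Studio 2015 implements that guard with TLS.
  static int answer = ComputeAnswer();
  return answer;
}

// Reports how often the initializer ran.
int InitCalls() { return g_init_calls; }
```

Calling GetAnswer() repeatedly runs ComputeAnswer() once. The guard code the compiler wraps around that first call is the part that /Zc:threadSafeInit- turns off.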

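I then checked the Chromium change history. Earlier versions did indeed add the /Zc:threadSafeInit- compiler switch, but after Chromium dropped XP support the switch was removed. That is most likely the cause of the problem.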

Later I also found an official MSDN document, https://docs.microsoft.com/en-us/cpp/build/reference/zc-threadsafeinit-thread-safe-local-static-initialization, which explains:

Thread-safe static local variables use thread-local storage (TLS) internally to provide efficient execution when the static has already been initialized. The implementation of this feature relies on Windows operating system support functions in Windows Vista and later operating systems. Windows XP, Windows Server 2003, and older operating systems do not have this support, so they do not get the efficiency advantage. These operating systems also have a lower limit on the number of TLS sections that can be loaded. Exceeding the TLS section limit can cause a crash. If this is a problem in your code, especially in code that must run on older operating systems, use /Zc:threadSafeInit- to disable the thread-safe initialization code.
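One consequence worth keeping in mind: with /Zc:threadSafeInit- the compiler no longer guards local static initialization, so a local static that several threads may hit for the first time at once needs its own protection. Below is a minimal sketch of one way to do that with an explicit mutex at namespace scope. The names (GetConfigGuarded, ComputeConfig, and so on) are hypothetical, and whether std::mutex is suitable on XP with a given toolset is an assumption that should be verified.

```cpp
#include <mutex>

// Hypothetical names throughout; a sketch, not Chromium's approach.
namespace {

// Namespace-scope objects are set up when the module loads,
// not lazily inside a function, so no local-static guard is involved.
std::mutex g_config_mutex;
bool g_config_ready = false;
int g_config_value = 0;
int g_config_init_calls = 0;

// Stand-in for an expensive one-time initializer.
int ComputeConfig() {
  ++g_config_init_calls;
  return 7;
}

}  // namespace

int GetConfigGuarded() {
  // Explicit lock replaces the guard the compiler would otherwise generate.
  std::lock_guard<std::mutex> lock(g_config_mutex);
  if (!g_config_ready) {
    g_config_value = ComputeConfig();
    g_config_ready = true;
  }
  return g_config_value;
}

// Reports how often the initializer ran.
int GuardedInitCalls() {
  std::lock_guard<std::mutex> lock(g_config_mutex);
  return g_config_init_calls;
}
```

This takes the lock on every call, which is simpler but slower than the compiler-generated fast path. It is a trade-off, not a recommendation.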

References: