TONT 34623 Why is the Windows error reporting program nicknamed “Dr. Watson”?

Original link: https://devblogs.microsoft.com/oldnewthing/20050810-13/?p=34623

The nickname for the feature known as Windows Error Reporting is “Dr. Watson”. Where did that name come from?

As you have probably guessed, the name Dr. Watson was inspired by the character of Dr. Watson, the assistant to Sherlock Holmes in the stories by Arthur Conan Doyle.

It is my understanding that the doctor was originally developed as part of Windows 3.0 beta testing. His job was to record data about application crashes to a file, so that the file could be uploaded and included with bug reports. The icon was (and continues to be) a friendly doctor using his stethoscope to investigate a problem.

The Doctor has remained true to the “capture information about an error” aspect of his job. In the meantime, the word “Watson” has expanded its meaning to encompass anonymous end-user feedback mechanisms in general, such as “Content Watson”. (But if you hear “Watson” by itself, the speaker is almost certainly talking about error reporting.)

TONT 34783 What is this thing you call a “web site”?

Original link: https://devblogs.microsoft.com/oldnewthing/20050728-16/?p=34783

One reaction I’ve seen when people learn about all the compatibility work done in the Windows 95 kernel is to say,

Why not add code to the installer wizard [alas, page is now 404] which checks to see if you’re installing SimCity and, if so, informs you of a known design flaw, then asks you to visit Electronic Arts’ webpage for a patch?

(Translator’s note: the dead link is not reproduced here.)

Let’s ignore the issue of the “installer wizard”; most people do not go through the Add and Remove Programs control panel to install programs, so any changes to that control panel wouldn’t have helped anyway.

But what about detecting that you’re running SimCity and telling you to get a patch from Electronic Arts’ web site?

Remember, this was 1993. Almost nobody had web sites. The big thing was the “Information Superhighway”. (Remember that? I don’t think it ever got built; the Internet sort of stole its thunder.) If you told somebody, “Go to Electronic Arts’ web site and download a patch”, you’d get a blank stare. What’s a “web site”? How do I access that from Prodigy? I don’t have a modem. Can you mail me their web site?

In Windows XP, when Windows detects that you’re running a program with which it is fundamentally incompatible, you do get a pop-up window directing you to the company’s web site. But that’s because it’s now 2005 and even hermits living in caves have email addresses.

In 1993, things were a little different.

(Heck, even by 1995 most people did not have Internet access, and those few that did used modems. Requiring users to obtain Internet access in order to set the computer clock via NTP would have been rather presumptuous.)

TONT 34883 Why does FindFirstFile also find short file names?

Original link: https://devblogs.microsoft.com/oldnewthing/20050720-16/?p=34883

The FindFirstFile function matches both the short and long names. This can produce somewhat surprising results. For example, if you ask for “*.htm”, this also gives you the file “x.html” since its short name is “X~1.HTM”.

(Translator’s note: the extension here is .html, which is truncated to .HTM in the 8.3 short name; thanks to 石樱灯笼 for pointing this out in the comments.)
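
The matching behavior can be modeled with a small Python sketch. It is only an illustration: the directory listing, the short names, and the helper names `find_files` and `find_files_long_only` are hypothetical, and `fnmatch` stands in for the real Windows wildcard rules, which differ in some details.

```python
from fnmatch import fnmatchcase

# Hypothetical directory listing: (long name, 8.3 short name) pairs.
ENTRIES = [
    ("x.html", "X~1.HTM"),
    ("index.htm", "INDEX.HTM"),
    ("notes.txt", "NOTES.TXT"),
]


def find_files(pattern, entries=ENTRIES):
    """Return long names whose long OR short name matches the pattern."""
    pat = pattern.upper()  # file name matching is case-insensitive
    return [
        long_name
        for long_name, short_name in entries
        if fnmatchcase(long_name.upper(), pat)
        or fnmatchcase(short_name.upper(), pat)
    ]


def find_files_long_only(pattern, entries=ENTRIES):
    """Return long names that match the pattern, ignoring short names."""
    pat = pattern.upper()
    return [long_name for long_name, _ in entries
            if fnmatchcase(long_name.upper(), pat)]


both = find_files("*.htm")
long_only = find_files_long_only("*.htm")
```

In this model, `both` contains `x.html` because its short name ends in `.HTM`, while `long_only` does not.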

Why does it bother matching short names? Shouldn’t it match only long names? After all, only old 16-bit programs use short names.

But that’s the problem: 16-bit programs use short names.

Through a process known as generic thunks, a 16-bit program can load a 32-bit DLL and call into it. Windows 95 and the Windows 16-bit emulation layer in Windows NT rely heavily on generic thunks so that they don’t have to write two versions of everything. Instead, the 16-bit version just thunks up to the 32-bit version.

Note, however, that this would mean that 32-bit DLLs would see two different views of the file system, depending on whether they are hosted from a 16-bit process or a 32-bit process.

“Then make the FindFirstFile function check to see who its caller is and change its behavior accordingly,” doesn’t fly because you can’t trust the return address.

Even if this problem were solved, you would still have the problem of 16/32 interop across the process boundary.

For example, suppose a 16-bit program calls WinExec(“notepad X~1.HTM”). The 32-bit Notepad program had better open the file X~1.HTM even though it’s a short name. What’s more, a common way to get properties of a file such as its last access time is to call FindFirstFile with the file name, since the WIN32_FIND_DATA structure returns that information as part of the find data. (Note: GetFileAttributesEx is a better choice, but that function is comparatively new.) If the FindFirstFile function did not work for short file names, then the above trick would fail for short names passed across the 16/32 boundary.

As another example, suppose the DLL saves the file name in a location external to the process, say a configuration file, the registry, or a shared memory block. If a 16-bit program calls into this DLL, it would pass short names, whereas if a 32-bit program calls into the DLL, it would pass long names. If the file system functions returned only long names for 32-bit programs, then the copy of the DLL running in a 32-bit program would not be able to read the data written by the DLL running in a 16-bit program.
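
The shared-storage scenario can be sketched as a toy model in Python. Everything here is an assumption made for illustration: the `FILES` table, the `resolve` helper, and the `shared_store` dictionary (standing in for a registry key or configuration file) are hypothetical and do not correspond to any Windows API.

```python
# Hypothetical file system: long name -> 8.3 short name.
FILES = {"My Report.html": "MYREPO~1.HTM"}


def resolve(name, match_short_names=True):
    """Return the long name of the file that `name` refers to, or None."""
    wanted = name.upper()
    for long_name, short_name in FILES.items():
        if long_name.upper() == wanted:
            return long_name
        if match_short_names and short_name.upper() == wanted:
            return long_name
    return None


# Stand-in for a registry key or configuration file shared across processes.
shared_store = {}

# The DLL, hosted in a 16-bit program, saves the short name it was given.
shared_store["last_file"] = "MYREPO~1.HTM"

# Later the same DLL, hosted in a 32-bit program, reads the name back.
saved = shared_store["last_file"]
with_short = resolve(saved)                              # file is found
without_short = resolve(saved, match_short_names=False)  # lookup fails
```

With short-name matching disabled, the 32-bit side cannot locate the file whose name the 16-bit side stored.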

TONT 34893 What is ES_OEMCONVERT for?

Original link: https://devblogs.microsoft.com/oldnewthing/20050719-12/?p=34893

The ES_OEMCONVERT edit control style is a holdover from 16-bit Windows. This ancient MSDN article from the Windows 3.1 SDK describes the flag thus:

ES_OEMCONVERT causes text entered into the edit control to be converted from ANSI to OEM and then back to ANSI. This ensures proper character conversion when the application calls the AnsiToOem function to convert a Windows string in the edit control to OEM characters. ES_OEMCONVERT is most useful for edit controls that contain filenames.

Set the wayback machine to, well, January 31, 1992, the date of the article.

At this time, the predominant Windows platform was Windows 3.0. Windows 3.1 was still a few months away from release, and Windows NT 3.1 was over a year away. The predominant file system was 16-bit FAT, and the relevant feature of FAT of this era for the purpose of this discussion is that file names were stored on disk in the OEM character set. (We discussed the history behind the schism between the OEM and ANSI code pages in an earlier article.)

(Translator’s note: the link to the earlier article is dead and could not be verified.)

Since GUI programs used the ANSI character set, but file names were stored in the OEM character set, the only characters that could be used in file names from GUI programs were those that exist in both character sets. If a character existed in the ANSI character set but not the OEM character set, then there would be no way of using it as a file name; and if a character existed in the OEM character set but not the ANSI character set, the GUI program couldn’t manipulate it.

The ES_OEMCONVERT flag on an edit control ensures that only characters that exist in both the ANSI and OEM character sets are used, hence the remark “ES_OEMCONVERT is most useful for edit controls that contain filenames”.
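
The ANSI-to-OEM-and-back round trip can be approximated in Python using the standard codecs. This is a rough sketch under stated assumptions: code page 1252 is assumed for ANSI and code page 437 for OEM, the helper names are hypothetical, and characters missing from the OEM set are replaced with `?`, whereas the real conversion may pick a best-fit character instead.

```python
ANSI = "cp1252"  # assumed ANSI code page (Western European)
OEM = "cp437"    # assumed OEM code page (US)


def oem_roundtrip(text):
    """Convert ANSI text to OEM and back, replacing what OEM cannot hold."""
    text.encode(ANSI)  # raises UnicodeEncodeError if this is not ANSI text
    return text.encode(OEM, errors="replace").decode(OEM)


def is_safe_for_filename(text):
    """True if every character exists in both character sets."""
    return oem_roundtrip(text) == text


kept = oem_roundtrip("café.txt")    # 'é' exists in both code pages
lost = oem_roundtrip("©2005.txt")   # '©' is in cp1252 but not in cp437
```

A name is safe under this model only when the round trip leaves it unchanged, which is the property the flag was meant to enforce.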

Fast-forward to today.

All the popular Windows file systems support Unicode file names and have for ten years. There is no longer a data loss converting from the ANSI character set to the character set used by the file system. Therefore, there is no need to filter out any characters to forestall the user typing a character that will be lost during the conversion to a file name. In other words, the ES_OEMCONVERT flag is pointless today. It’s a leftover from the days before Unicode.

(Translator’s note: the original article was published on July 19, 2005; Windows 3.0 was released on May 22, 1990.)

Indeed, if you use this flag, you make your program worse, not better, because it unnecessarily restricts the set of characters that the user will be allowed to use in file names. A user running the US-English version of Windows would not be allowed to enter Chinese characters as a file name, for example, even though the file system is perfectly capable of creating files whose names contain those characters.
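
The Chinese-characters case can be checked with a short Python sketch. The code pages are assumptions (1252 and 437 for a US-English system, UTF-16 for the file system's Unicode names), and `representable` is a hypothetical helper.

```python
def representable(name, encoding):
    """True if every character of `name` can be encoded in `encoding`."""
    try:
        name.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


name = "报告.txt"  # a file name containing Chinese characters

in_ansi = representable(name, "cp1252")        # assumed US-English ANSI
in_oem = representable(name, "cp437")          # assumed US-English OEM
in_unicode = representable(name, "utf-16-le")  # Unicode file names
```

Only the Unicode encoding can hold the name, so a filter based on the ANSI and OEM sets rejects a name the file system would accept.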

TONT 34913 If InitCommonControls doesn’t do anything, why do you have to call it?

Original link: https://devblogs.microsoft.com/oldnewthing/20050718-16/?p=34913

One of the problems beginners run into when they start using shell common controls is that they forget to call the InitCommonControls function. But if you were to disassemble the InitCommonControls function itself, you’d see that it, like the FlushInstructionCache function, doesn’t actually do anything.

Then why do you need to call it?

As with FlushInstructionCache, what’s important is not what it performs, but just the fact that you called it.

Recall that merely listing a lib file in your dependencies doesn’t actually cause your program to be bound to the corresponding DLL. You have to call a function in that DLL in order for there to be an import entry for that DLL. And InitCommonControls is that function.

Without the InitCommonControls function, a program that wants to use the shell common controls library would otherwise have no reference to COMCTL32.DLL in its import table. This means that when the program loads, COMCTL32.DLL is not loaded and therefore is not initialized. Which means that it doesn’t register its window classes. Which means that your call to the CreateWindow function fails because the window class has not been registered.
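
The chain of consequences can be traced with a toy Python model. This is not Windows code: `link`, `load_program`, and `create_window` are hypothetical stand-ins for the linker, the loader, and CreateWindow, and the two window class names are merely examples of classes the DLL registers when it loads.

```python
# Hypothetical model: window classes each DLL registers when it is loaded.
DLL_INIT = {"COMCTL32.DLL": {"SysListView32", "SysTreeView32"}}

# Which DLL exports which function.
EXPORTS = {"InitCommonControls": "COMCTL32.DLL"}


def link(called_functions):
    """Build the import table: only DLLs with a called function appear."""
    return {EXPORTS[f] for f in called_functions if f in EXPORTS}


def load_program(import_table):
    """Load and initialize every DLL named in the import table."""
    registered = set()
    for dll in import_table:
        registered |= DLL_INIT.get(dll, set())
    return registered


def create_window(class_name, registered):
    """Succeeds only if the window class has been registered."""
    return class_name in registered


with_call = load_program(link(["InitCommonControls"]))
without_call = load_program(link([]))

ok = create_window("SysListView32", with_call)
failed = create_window("SysListView32", without_call)
```

In the model, the call itself does nothing at run time; its only effect is to put the DLL into the import table so that loading the program initializes it.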

That’s why you have to call a function that does nothing. It’s for your own good.

(Of course, there’s the new InitCommonControlsEx function that lets you specify which classes you would like to be registered. Only the classic Windows 95 classes are registered when COMCTL32.DLL loads. For everything else you have to ask for it explicitly.)