TONT 34883 Why does FindFirstFile also find short file names?

Original post: https://devblogs.microsoft.com/oldnewthing/20050720-16/?p=34883

The FindFirstFile function matches both the short and long names. This can produce somewhat surprising results. For example, if you ask for "*.htm", this also gives you the file "x.html" since its short name is "X~1.HTM". (Translator's note: the extension here is HTML, which becomes HTM in 8.3 form. Thanks to 石樱灯笼 for pointing this out in the comments.)
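To make the surprise concrete, here is a minimal sketch that simulates the matching rule in plain Python. It does not call the real FindFirstFile: the directory listing, the short names in it, and the `find_matches` helper are all hypothetical, and `fnmatch` only approximates Windows wildcard semantics.

```python
import fnmatch

# Hypothetical directory listing: (long name, 8.3 short name) pairs.
ENTRIES = [
    ("x.html", "X~1.HTM"),
    ("index.htm", "INDEX.HTM"),
    ("notes.txt", "NOTES.TXT"),
]

def find_matches(pattern, entries, match_short=True):
    """Return the long names of entries that fit the pattern.

    An entry fits if its long name matches, or (when match_short is True)
    if its short name matches. Comparison is case-insensitive.
    """
    pat = pattern.upper()
    hits = []
    for long_name, short_name in entries:
        names = [long_name, short_name] if match_short else [long_name]
        if any(fnmatch.fnmatchcase(name.upper(), pat) for name in names):
            hits.append(long_name)
    return hits

# Matching both names: x.html comes back because X~1.HTM fits *.htm.
both = find_matches("*.htm", ENTRIES)
# Matching long names only: x.html is no longer reported.
long_only = find_matches("*.htm", ENTRIES, match_short=False)
print(both)
print(long_only)
```

With short names included, `x.html` is reported only because of its short name; with long names alone it drops out of the result.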

Why does it bother matching short names? Shouldn't it match only long names? After all, only old 16-bit programs use short names.

But that's exactly the problem: 16-bit programs use short names.

Through a process known as generic thunks, a 16-bit program can load a 32-bit DLL and call into it. Windows 95 and the Windows 16-bit emulation layer in Windows NT rely heavily on generic thunks so that they don't have to write two versions of everything. Instead, the 16-bit version just thunks up to the 32-bit version.

Note, however, that if only long names were matched for 32-bit callers, 32-bit DLLs would see two different views of the file system, depending on whether they are hosted in a 16-bit process or a 32-bit process.

"Then make the FindFirstFile function check to see who its caller is and change its behavior accordingly" doesn't fly, because you can't trust the return address.

Even if this problem were solved, you would still have the problem of 16/32 interop across the process boundary.

For example, suppose a 16-bit program calls WinExec("notepad X~1.HTM"). The 32-bit Notepad program had better open the file X~1.HTM even though it's a short name. What's more, a common way to get properties of a file such as its last access time is to call FindFirstFile with the file name, since the WIN32_FIND_DATA structure returns that information as part of the find data. (Note: GetFileAttributesEx is a better choice, but that function is comparatively new.) If the FindFirstFile function did not work for short file names when called from a 32-bit program, then the above trick would fail for short names passed across the 16/32 boundary.
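The failure described above can be sketched with a small simulation. This is not the Win32 API: `FindData`, `DIRECTORY`, and `find_first` are hypothetical stand-ins for WIN32_FIND_DATA, a directory, and an exact-name FindFirstFile lookup, and the timestamp is a made-up placeholder.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class FindData:
    """Hypothetical stand-in for the fields of interest in WIN32_FIND_DATA."""
    long_name: str
    short_name: str
    last_access: str  # placeholder timestamp, invented for this example

# Hypothetical directory containing a single file.
DIRECTORY = [FindData("x.html", "X~1.HTM", "2005-07-20T09:00:00")]

def find_first(name: str, match_short: bool = True) -> Optional[FindData]:
    """Look up one file by exact name, optionally accepting its short name."""
    wanted = name.upper()
    for entry in DIRECTORY:
        if entry.long_name.upper() == wanted:
            return entry
        if match_short and entry.short_name.upper() == wanted:
            return entry
    return None

# A 16-bit caller hands the short name to a 32-bit program.
name_from_16bit_caller = "X~1.HTM"

# Short names honored: the 32-bit side can read the file's properties.
found = find_first(name_from_16bit_caller)
# Short names ignored: the same lookup comes back empty.
missed = find_first(name_from_16bit_caller, match_short=False)

print(found.last_access if found else "not found")
print(missed.last_access if missed else "not found")
```

The second lookup is the scenario the article warns about: the file exists, but the name that crossed the 16/32 boundary no longer resolves to it.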

As another example, suppose the DLL saves the file name in a location external to the process, say a configuration file, the registry, or a shared memory block. If a 16-bit program calls into this DLL, it would pass short names, whereas if a 32-bit program calls into the DLL, it would pass long names. If the file system functions returned only long names for 32-bit programs, then the copy of the DLL running in a 32-bit program would not be able to read the data written by the DLL running in a 16-bit program.

Comments

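  1. "The original seems to be wrong. What's surprising about *.htm finding x.htm?"
    You're focusing on the wrong part: the intent was to match only `x*.htm`, yet `x.html` was matched as well. The short file name issue also affects extensions longer than three characters.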
