TONT 41373 Programs that grovel into undocumented structures

Look what I found!

Original post: https://blogs.msdn.microsoft.com/oldnewthing/20031223-00/?p=41373

Here are three examples, off the top of my head, of the consequences of grovelling into and relying on undocumented structures.

Defragmenting things that can’t be defragmented

In Windows 2000, there are several categories of things that cannot be defragmented: directories, exclusively-opened files, the MFT (the Master File Table of an NTFS volume), the pagefile… That didn’t stop a certain software company from doing it anyway in their defragmenting software. They went into kernel mode, reverse-engineered NTFS’s data structures, and modified them on the fly. Yee-haw cowboy! Then, when the NTFS folks added support for defragmenting the MFT in Windows XP, these programs went in, modified NTFS’s data structures (which had changed in the meantime), and corrupted your disk. Of course there was no mention of this illicit behavior in the documentation. So when the background defragmenter corrupted users’ disks, Microsoft got the blame.

Parsing the Explorer view data structures

A certain software company decided that they wanted to alter the behavior of the Explorer window from a shell extension. Since there is no supported way to do this (a shell extension is not supposed to mess with the view; the view belongs to the user), they decided to do it themselves anyway.

From the shell extension, they used an undocumented window message to get a pointer to one of the internal Explorer structures. Then they walked the structure until they found something they recognized. Then they knew, “The thing immediately after the thing that I recognize is the thing that I want.”

Well, the thing that they recognized and the thing that they wanted happened to be base classes of a multiply-derived class. If you have a class with multiple base classes, the compiler makes no guarantee about the order in which the base classes are laid out. It so happened that they appeared in the order X,Y,Z in all the versions of Windows this software company tested against.

Except Windows 2000.

In Windows 2000, the compiler decided that the order should be X,Z,Y. So now they grovelled in, saw the “X” and said “Aha, the next thing must be a Y” but instead they got a Z. And then they crashed your system some time later.
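To make the hazard concrete, here is a minimal C++ sketch. The classes `X`, `Y`, `Z`, `ViewOld` and `ViewNew` and the helpers `find_y` and `offset_of_y` are hypothetical stand-ins invented for illustration; they are not Explorer’s real classes. The sketch models the layout change by declaring the bases in a different order, on the assumption that this is enough to move the `Y` subobject on common compilers. The C++ standard leaves the allocation order of base class subobjects unspecified, so the only dependable way to reach a base is to let the compiler do the conversion.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-ins for the base classes X, Y and Z in the story.
struct X { virtual ~X() = default; int tag = 'X'; };
struct Y { virtual ~Y() = default; int tag = 'Y'; };
struct Z { virtual ~Z() = default; int tag = 'Z'; };

// The "same" class as two releases might build it; only the base order differs.
struct ViewOld : X, Y, Z {};
struct ViewNew : X, Z, Y {};

// Supported route: the compiler applies whatever offset this build uses.
template <class View>
const Y* find_y(const View& view) {
    return static_cast<const Y*>(&view);
}

// Byte offset of the Y subobject inside a View, as laid out by this compiler.
// Code that hard-codes a value measured on one build is making the same bet
// as the shell extension in the story.
template <class View>
std::ptrdiff_t offset_of_y(const View& view) {
    const auto* whole = reinterpret_cast<const unsigned char*>(&view);
    const auto* part = reinterpret_cast<const unsigned char*>(find_y(view));
    return part - whole;
}
```

Calling `find_y` on either class yields a pointer whose `tag` is `'Y'`. By contrast, a caller that measures `offset_of_y` on a `ViewOld` and reuses that number on a `ViewNew` has no guarantee of landing on a `Y` at all.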

So I had to create a “fake X,Y” so that when the program went looking for X (so it could grab Y), it found the fake one first.

This took the better part of a week to figure out.

Reaching up the stack

A certain software company decided that it was too hard to take the coordinates of the NM_DBLCLK notification and hit-test them against the treeview to see what was double-clicked. So instead, they took the address of the NMHDR structure passed to the notification, added 60 to it, and dereferenced a DWORD at that address. If it was zero, they did one thing, and if it was nonzero they did some other thing.

It so happens that the NMHDR is allocated on the stack, so this program is reaching up into the stack and grabbing the value of some local variable (which happens to be two frames up the stack!) and using it to control its logic.

For Windows 2000, we upgraded the compiler to a version which did a better job of reordering and re-using local variables, and now the program couldn’t find the local variable it wanted and stopped working.

I got tagged to investigate and fix this. I had to create a special NMHDR structure that “looked like” the stack the program wanted to see and pass that special “fake stack” to the program.

I think this one took me two days to figure out.

I hope you understand why I tend to go ballistic when people recommend relying on undocumented behavior. These weren’t hobbyists in their garage seeing what they could do. These were major companies writing commercial software.

When you upgrade to the next version of Windows and you experience (a) disk corruption, (b) sporadic Explorer crashes, or (c) sporadic loss of functionality in your favorite program, do you blame the program or do you blame Windows?

If you say, “I blame the program,” the first problem is of course figuring out which program. In cases (a) and (b), the offending program isn’t obvious.

Comments

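  1. What strikes me most is that, back then, so many people and companies were not only reverse-engineering Windows, but doing it in order to inject into it rather than to learn from it or imitate it. Injection into Microsoft’s code based on reverse engineering only seems to have quieted down after Windows 7 came out.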
