A site devoted to discussing techniques that promote quality and ethical practices in software development.

Thursday, November 30, 2006

The Problem with Programming

This is the title of the latest interview with the father of C++, the legendary Bjarne Stroustrup. Below are snippets from this article.

When he was asked why most software is so bad, here is his response:
The structure is appalling, and the programmers clearly didn't think deeply about correctness, algorithms, data structures, or maintainability. Most people don't actually read code....

that "we" (that is, we software developers) are in a permanent state of emergency, grasping at straws to get our work done. We perform many minor miracles through trial and error, excessive use of brute force, and lots and lots of testing, but--so often--it's not enough.

A very true observation. When asked how we can fix this, here are his comments:
In theory, the answer is simple: educate our software developers better, use more-appropriate design methods, and design for flexibility and for the long haul.

Reward correct, solid, and safe systems. Punish sloppiness. In reality, that's impossible. People reward developers who deliver software that is cheap, buggy, and first. ....

On the other hand, just muddling along is expensive, dangerous, and depressing. Significant improvements are needed, and they can only come gradually. They must come on a broad front; no single change is sufficient.


Yes, perfect observations and comments. I see this kind of practice on a daily basis.

Wednesday, November 29, 2006

Unsafe to use TApplication.OnMessage in VCL.Net

Recently, I was called in to investigate a strange problem that was causing a program to misbehave.

To cut a long story short, it turned out that the application depends on the dispatching of the TApplication.OnMessage event. If this is disturbed, the program misbehaves.

TApplication.OnMessage is part of the Borland VCL framework. So how can a framework be so fragile that it is this easily disturbed?

My investigation led me into the bowels of the message pump of Borland's VCL, which looks like this in pseudo code:

TApplication.ProcessMessage()
  if PeekMessage( ...., PM_REMOVE ) then
  begin
    // The OnMessage handler, if any, runs before the message is dispatched
    if Assigned(FOnMessageHandler) then FOnMessageHandler( ..... );
    // Do some other processing
    TranslateMessage( ... );
    DispatchMessage( .... );
  end;
This is the only place that the OnMessage event is fired and this is the weakness of the VCL framework. Why?

If you do not call TApplication.ProcessMessage() but instead use the Windows unit's GetMessage(), TranslateMessage() and DispatchMessage() (which are mapped down to the Win32 API via P/Invoke), OnMessage is never fired and yet your Windows messages are still correctly dispatched.

For example, suppose that either your Delphi code or some other managed or unmanaged code (for example a COM component) introduces a modal loop like this:

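// Loop while the flag is set and GetMessage() succeeds.
// GetMessage() returns 0 on WM_QUIT and -1 on error.
while ( someFlag && GetMessage( .... ) > 0 )
{
    if ( msg.message == ..... )
        break;
    else
    {
        // Dispatched directly: TApplication.OnMessage is never consulted
        TranslateMessage( &msg );
        DispatchMessage( &msg );
    }
}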
Using standard Win32 API constructs like this, which are commonly used by non-Delphi developers, immediately defeats the VCL framework with respect to OnMessage dispatching. To avoid this you must call TApplication.ProcessMessage(), but even then you still cannot prevent other components from interfering with the VCL.
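
The effect is easy to reproduce in a toy model. The following is only a sketch in Python, not VCL or Win32 code: the class App, its on_message hook and foreign_modal_loop are invented stand-ins for TApplication, OnMessage and a plain GetMessage/DispatchMessage loop, and the real handler's ability to mark a message as handled is left out.

```python
from collections import deque


class App:
    """Toy stand-in for TApplication: a message queue plus an optional hook."""

    def __init__(self):
        self.queue = deque()
        self.on_message = None   # hypothetical analogue of TApplication.OnMessage
        self.dispatched = []     # messages that reached their rightful owner

    def dispatch(self, msg):
        # Analogue of DispatchMessage(): deliver the message, no hook involved.
        self.dispatched.append(msg)

    def process_message(self):
        # Analogue of TApplication.ProcessMessage(): the only place the hook fires.
        if not self.queue:
            return False
        msg = self.queue.popleft()
        if self.on_message is not None:
            self.on_message(msg)
        self.dispatch(msg)
        return True


def foreign_modal_loop(app):
    # Analogue of a modal loop written with no knowledge of the framework:
    # it drains and dispatches the queue but never looks at app.on_message.
    while app.queue:
        app.dispatch(app.queue.popleft())


def run_demo():
    seen = []
    app = App()
    app.on_message = seen.append
    app.queue.extend(["WM_A", "WM_B"])
    while app.process_message():     # framework pump: the hook sees both messages
        pass
    app.queue.extend(["WM_C", "WM_D"])
    foreign_modal_loop(app)          # foreign pump: the hook sees nothing
    return seen, app.dispatched
```

In this model all four messages are dispatched, but the hook only ever observes the two that went through process_message(). Any application logic living in the hook silently misses the rest.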

As said, this could be introduced by calling a Win32 API that runs its own modal loop, such as a message box, or by someone's code that has been designed with no knowledge of the VCL. Such harmless usage immediately renders that component incompatible with the VCL, or renders your application's dependency on this feature useless.

This highlights the fragility of the VCL framework.

So the advice is: do not rely on TApplication.OnMessage. Furthermore, this event allows you to steal messages because it is called before the message is dispatched to its rightful owner.

A Borland feature or bastardisation of the CLR?

A recent chance development of a .Net prototype solution brought me face to face with an ugly finding: Borland fiddling with .Net/CLR. Fiddling with others' structures or frameworks is not new to Borland, as reported elsewhere, but this surpasses all I have seen.

Let me describe the structure of my prototype that led to the discovery of this damaging technique.

I have a Delphi.Net VCL Form application, MyApp.exe, which uses a VS/C# assembly containing a factory object. Because of a Delphi 2006 (D2006) internal compiler bug, this .Net assembly cannot reference any Borland assembly directly. If it references any Borland VCL assembly, you cannot directly reference it in a Delphi VCL package (Delphi's term for DLL assemblies). Hence this factory object, FormFactory, uses Activator.CreateInstance() to dynamically create Delphi VCL.Net forms from a number of Delphi.Net packages, as directed by some policy. The factory returns System.Object to the caller.

These Delphi packages contain VCL.Net forms and declare references to Borland.Vcl.dll, which provides the type Borland.Vcl.TCustomForm, the base type of the forms in these packages.

MyApp.exe has a reference to my factory assembly but it does not have a reference to Borland.Vcl.dll, and yet I can compile code like this:

obj := FormFactory.CreateInstance( .... );
Debug.Assert( obj is TCustomForm );

I built all the assemblies and there were no compilation errors. That was a surprise, as I expected the compiler to complain just like their C# Builder would. But when it runs, the above assert fails!

This is totally crazy. In the debugger, I can see that the base type of obj is TCustomForm, so what is happening? I even did this:

Debug.WriteLine( obj.GetType().BaseType.BaseType.Fullname );

This shows Borland.Vcl.TCustomForm. So what is happening?

Initially I thought Borland must have translated the is operator to something else, because their as operator and Pascal cast operator behave exactly opposite to their C# counterparts. So I fired up Lutz's Reflector and found that the is operator is compiled to the isinst IL instruction. That rules out the idea that the Pascal is operator is being translated to some Borland-specific operator.

The explanation only surfaced when I printed out Type.AssemblyQualifiedName. The outputs for obj.GetType().BaseType.BaseType and for TCustomForm as seen from MyApp explain why isinst is failing.

The Delphi compiler took action when it saw that I had not defined a reference to Borland.Vcl.dll, and rammed the IL code into MyApp.exe to produce this:

Borland.Vcl.TCustomForm, MyApp, Version=1.0.27277.2828, Culture=neutral, PublicKeyToken=None

The base class from obj is
Borland.Vcl.TCustomForm, Borland.Vcl, Version=10.xxxxxx, .....

In other words, as far as the CLR is concerned, these are two distinct types. Borland must mistakenly believe that the CLR simply uses namespace-qualified type names. How wrong such a view is!

Let's see what the standard says about the above types: are they two distinct types or one and the same?

With reference to Partition I, section 8.5.2 Assemblies and Scoping:
To fully identify a type, the type name shall be qualified by the scope that includes the type name. A type name is scoped by the assembly that contains the implementation of the type. An assembly is a configured set of loadable code modules and other resources that together implement a unit of functionality. The type name is said to be in the assembly scope of the assembly that implements the type. Assemblies themselves have names that form the basis of the CTS naming hierarchy.

Here you have the unambiguous definition from the standard, which says that the above encounter resulted in two distinct types.
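
The same situation can be imitated in a few lines of Python, where an isinstance check compares class objects rather than names. This is only an analogy to the CLR rule, not the CLR itself: make_type, assembly_qualified_name and LoadedForm are invented for illustration, and the "assembly-qualified name" here omits version, culture and public key token.

```python
def make_type(assembly, full_name):
    """Create a fresh class tagged with the 'assembly' that defines it."""
    cls = type(full_name.rsplit(".", 1)[-1], (), {})
    cls.assembly = assembly
    cls.full_name = full_name
    return cls


def assembly_qualified_name(cls):
    # Simplified: the real name also carries version, culture and key token.
    return f"{cls.full_name}, {cls.assembly}"


# One copy lives in the real library, the other was compiled into the executable.
form_in_vcl = make_type("Borland.Vcl", "Borland.Vcl.TCustomForm")
form_in_app = make_type("MyApp", "Borland.Vcl.TCustomForm")


class LoadedForm(form_in_vcl):
    """Stand-in for a form the factory created from a Delphi package."""


obj = LoadedForm()

same_full_name = form_in_vcl.full_name == form_in_app.full_name
passes_vcl_check = isinstance(obj, form_in_vcl)   # the library's TCustomForm
passes_app_check = isinstance(obj, form_in_app)   # the copy inside MyApp
```

Both classes report the same namespace-qualified name, yet the object only passes the check against the class it actually derives from. That is the behaviour the failing assert exposed.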

In fact, Borland is willing to throw away the strong name of their Borland.Vcl.dll and turn it into a non-strongly-named one. Gee, I wonder if I can use that to con users into believing that mine is a genuine Borland assembly. I will leave that for another day.

After discovering this, a report was dispatched to Borland. As usual, it is not a bug but a 'feature' and 'an advantage'. Oh yeah! Probably the COM bug in their code is also a feature. This one is a special feature called static binding. Indeed it is static binding! If you open the assembly with Lutz's Reflector, you can see all the IL code rammed into your assembly as if it were in Borland.Vcl.dll. The trouble is that they have just allowed you to give their type your version and assembly scope. Even though it is in an executable, you can still load it as if it were an ordinary DLL assembly.

Incidentally, their compiler does not even have a switch to turn off this 'feature', nor does it generate any warning. So beware!

Now let's consider a number of normal .Net usages in which Borland's 'feature' can wreak havoc on a CLR solution.

Scenario 1
Let's say a gung-ho Delphi developer comes up with a special component that he wants others to use. Seeing that it is cooler not having to ship Borland.Vcl.dll, he uses this static linkage to produce his assembly.

Another developer using the factory pattern to load this will have lots of problems.

In fact, if this developer wants to achieve this, he/she would be better off using ILMerge.

Scenario 2
Let's say one developer creates MyAssem1.dll using static binding, another one produces MyAssem2.dll, and so on. Soon you have as many distinct TCustomForm types as you have assemblies. (Incidentally, don't be fooled into believing that two separate applications, MyAssem1.exe and MyAssem2.exe, will not encounter this madness.)

Scenario 3
When Borland finally catches up to .Net 2 and supports generics, whose syntax is still a mystery, you cannot use List<TCustomForm> because you have a number of distinct TCustomForm types as far as the CLR is concerned. I pray Borland will not create its own flavour of System.Collections.Generic.List<>. The only way out is to resort back to using ArrayList, which supports heterogeneous collections.

Clearly Borland must naively believe that type comparison is performed based only on the namespace-qualified name rather than on what is defined by the CLI. Perhaps in their VCL32 world this is the case. Well, that's fine, as that is their own creation and they can be as free as they like because there is no one they have to be compatible with except themselves. Sorry Borland. Someone must have forgotten to tell you that the world has changed!

I would like to agree with Borland that this is a 'feature' or 'advantage' but if this creates more problems, violates CLI specification, misleads developers and has no real benefit at all, I am afraid that my logical thinking does not allow me to agree. So what can I say other than to label this as a bug and an attempt to bastardize CLR or to create Borland's CLR.

Having features to distinguish oneself from others is great but the features should not damage and confuse a framework that promotes interchangeability, cross-language programming and component technology.

What Borland has done is to transplant their VCL32 features to .Net while blindly, or rather arrogantly, ignoring the need to meet the ECMA CLI standard.

The VCL namespace is already incompatible with the System namespace, and this is another Borland feature that takes their product further away from being compatible with .Net. So users be warned and stay clear.

Wednesday, November 22, 2006

HTTP: The Definitive Guide

"HTTP: The Definitive Guide"
by David Gourley and Brian Totty.
ISBN: 1-56592-509-2

This is an excellent book providing very detailed coverage of not only the protocol but also the way it weaves through the TCP/IP stack.

Chapter 4 is a must read to understand how the HTTP protocol works and what can influence the performance of a web server.

Saturday, November 18, 2006

Lesson on buying PC games

Not only have I fallen for this many times before, but many of my friends and their friends have experienced this problem when buying moderately recent releases of PC games.

Buying the game is the cheapest part of the exercise and only fools would believe that's the end of the deal.

Unless your machine has the latest video card, inevitably after paying for the game you are forced to begin the journey of upgrading. First the video card, then memory, and in the extreme case literally a brand new box because the CPU is not fast enough.

I have never heard of a single case in which a games console game forced the owner of the console to upgrade. If the game is labeled to run on PS2, Xbox or Xbox 360, you pop the game in and it runs flawlessly. Why can't PC game developers do the same thing, writing for the most commonly found hardware configurations rather than for a very narrow set of the latest and greatest hardware? Could it be some symbiotic relationship with the hardware manufacturers?

Just this afternoon, I was called by my cousin to rescue his system.

They bought a game for their son. The game wouldn't run on their GeForce card, so they upgraded the card as required by the game. They started the game and, after the intro and the setup to begin playing, the machine threw up a message that looked as if someone had pressed the power-down button.

Taking a look at the card, I suspected that the old 400W power supply could not meet the surge; its regulator may have been faulty. I did a quick transplant of my spare 400W power supply and off it went, and the game played, albeit a bit slowly. The next step was a search on the net for memory, and that looked like a good time to bid my cousin goodbye.

Hopefully this will serve as a lesson to anyone contemplating buying a PC game.

After adding up all the cost (of course my service call was free) and hassle, he could have bought two or more console games. Never will I consider a PC game.

Wednesday, November 15, 2006

Fixing up Borland Delphi's COM Server Registration Problem

The story of how Borland does not even understand their own requirement for COM Server registration, and how that was then translated into code, has been described in the most kiddish format in my other blog message.

I doubt Borland will have the courage to admit the mistake and fix the problem, as it has existed in their product unchanged from Delphi 3 to the latest, including those products not yet released. Hence it is a dead loss trying to get Borland to do something. I am going to show you how to fix this problem. It is as easy as learning the "Mary had a little lamb" rhyme.

Before I show you the fix, let me partially reproduce their documentation here:
Start mode      Switch        Meaning
smAutomation    embedding     The application was started by Windows in response to a request from an automation controller.
smRegServer     regserver     The application was started only to add the server to the system registry.
smStandalone    (none)        The user started the application as a stand-alone, interactive application.
smUnregServer   unregserver   The application was started only to remove the server from the system registry.
Table 1 - Permissible COM Server switches

To run the automation server as a stand-alone application, you do not include any switch.

From the above requirement, it seems pretty obvious that you only have to run through the COM registration code when you encounter the switches /regserver or /unregserver. Incidentally, this is standard stuff if you build the COM local server using VB6, MFC or ATL.

Now it is time to reveal how badly Borland handles this situation. To see this, locate the code fragment in ComServ.pas at around line 373, if you have access to Borland's Delphi tool, or simply search for TComServer.Initialize in unit ComServ. It is reproduced here in Listing 1:
// Listing 1
procedure TComServer.Initialize;
begin
  try
    UpdateRegistry(FStartMode <> smUnregServer);
  except
    on E: EOleRegistrationError do
      // User may not have write access to the registry.
      // Squelch the exception unless we were explicitly told to register.
      if FStartMode = smRegServer then raise;
  end;
  if FStartMode in [smRegServer, smUnregServer] then Halt;

  ComClassManager.ForEachFactory(Self, FactoryRegisterClassObject);
end;
In merely 9 lines (ignoring comments), two lines are erroneous (they were highlighted in bold red in the original post): the UpdateRegistry(...) call and the on E: EOleRegistrationError do handler. I am going to show you why.

Let's take bug number 1, the COM Server registration code (the UpdateRegistry call).

Assign the values listed in column 1 of Table 1 to the variable FStartMode, one value at a time, and ask yourself for which value the procedure UpdateRegistry() is not called.

If your answer is that UpdateRegistry() is called regardless of the value of FStartMode, then you are correct and you begin to see how silly that piece of code is.

UpdateRegistry() is a procedure that executes COM registration steps when the parameter is true. Otherwise it performs COM unregistration steps.

If you have a basic understanding of COM, you will realise that by the time COM successfully launches your local server via a client's call to a COM function like CoCreateInstance(), your server must already have been registered properly. In that situation a Delphi automation server will have FStartMode = smAutomation.
If you have followed me so far, UpdateRegistry() is called regardless of the switch, and this also includes the situation when you, the COM server, are launched by COM. So why do you then have to update the registry again? Doing so is like having just flown into an airport and still insisting on buying an inbound ticket. It is silly.

To fix that, all you have to do is check whether FStartMode is smRegServer or smUnregServer, and only then call UpdateRegistry().
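
The exercise of walking through each start mode can be written down as a small table. The sketch below is a Python model of the decision only, not Borland's code: StartMode, original_action and fixed_action are invented names that mirror the condition in Listing 1 and the proposed guard.

```python
from enum import Enum


class StartMode(Enum):
    AUTOMATION = "smAutomation"
    REG_SERVER = "smRegServer"
    STANDALONE = "smStandalone"
    UNREG_SERVER = "smUnregServer"


def original_action(mode):
    # Listing 1: UpdateRegistry(FStartMode <> smUnregServer) runs unconditionally,
    # so every mode except smUnregServer ends up registering.
    return "unregister" if mode is StartMode.UNREG_SERVER else "register"


def fixed_action(mode):
    # Proposed fix: touch the registry only for the two registration switches.
    if mode in (StartMode.REG_SERVER, StartMode.UNREG_SERVER):
        return "unregister" if mode is StartMode.UNREG_SERVER else "register"
    return None  # registry left alone


# Maps each start mode to (what Listing 1 does, what the fixed code does).
table = {m.value: (original_action(m), fixed_action(m)) for m in StartMode}
```

Reading the table row by row, the two versions differ only for smAutomation and smStandalone. Those are the two modes in which a limited user account is most likely to be running the server.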

The second bug requires a code review that traces the function calls. In short, no one in the possible call chain throws EOleRegistrationError, the exception that Borland has programmed to catch (the on E: EOleRegistrationError do handler in Listing 1). Instead the code throws EOleSysError when there is a COM registration error.
If you look up the Delphi documentation, you can see that these are two distinct exceptions, both derived from EOleError. They are siblings of each other, so a handler for one never catches the other.
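
Why a handler for one sibling misses the other can be shown with a minimal Python model. The hierarchy below simply follows the description above (both classes derived directly from EOleError); update_registry and handled_by are hypothetical helpers, and the "access denied" message is made up.

```python
class EOleError(Exception):
    """Common base class, as described above."""


class EOleSysError(EOleError):
    """What the registration code is said to raise."""


class EOleRegistrationError(EOleError):
    """What Listing 1 tries to catch."""


def update_registry():
    # Hypothetical stand-in for a registration step failing under a limited account.
    raise EOleSysError("access denied")


def handled_by(handler_type):
    """Return True if an 'except handler_type' clause would catch the failure."""
    try:
        try:
            update_registry()
        except handler_type:
            return True
    except EOleError:
        # The inner handler did not match, so the exception escaped it.
        return False
```

In this model handled_by(EOleRegistrationError) is False while handled_by(EOleError) is True, which is why catching the common base class is enough to make the handler effective.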
So the corrected code is as follows, Listing 2:
// Listing 2
procedure TComServer.Initialize;
begin
  try
    if FStartMode in [smRegServer, smUnregServer] then
      UpdateRegistry(FStartMode <> smUnregServer);
  except
    // on E: EOleRegistrationError do
    on E: EOleError do
      // User may not have write access to the registry.
      // Squelch the exception unless we were explicitly told to register.
      if FStartMode = smRegServer then raise;
  end;
  if FStartMode in [smRegServer, smUnregServer] then Halt;

  ComClassManager.ForEachFactory(Self, FactoryRegisterClassObject);
end;
I have left Borland's incorrect catch statement there, commented out, for comparison. Comparing this to Listing 1, it is pretty obvious that someone must have accidentally deleted the test for /regserver or /unregserver (the if FStartMode in [smRegServer, smUnregServer] line, shown in green in the original post), or someone must have been high on an illegal substance and thought that what is left in Borland's code is a pretty cool optimisation.

As mentioned, many people have reported this bug before and Borland has shown a complete lack of concern for its customers in fixing it. Incidentally, once you have registered, subsequent calls to UpdateRegistry( true ) do not cause problems in LUA because COM's RegisterTypeLib() performs a read check to see if the COM information is the same. If so, it does not update it; otherwise it does, and when that happens you need Power Users or higher privilege.

If you want to replace Borland's buggy code, all you have to do is:
  • Take a copy of ComServ.pas to somewhere safe.
  • Edit the TComServer.Initialize to that shown in Listing 2.
  • In the Delphi COM Server project, include a copy of this file.
  • Delete all the dcu files and close the entire project/group. The Delphi IDE frequently has memory lapses in which it fails to pick up that your project now has a new file. Pretty dumb stuff.
  • Reopen the project and rebuild it.

This automation server should now work properly. The best way to test this is to unregister the COM local server (just the executable) in an Admin account, then start the automation server as a stand-alone application in LUA. In this situation, the program should start up properly. If it is built using Borland's buggy code, it will throw an exception, and not the one Borland programmed to catch.

It is this simple to fix Borland's bug!

More LUA Articles

Several articles on this very topic have been published by Microsoft. Here is a collection of them:

This one is a very detailed discussion on applying the Least-Privileged User Account (LUA) principle in XP.

This one provides a set of very good guidelines on resolving LUA problems and on the common mistakes developers make.

This article is on the principle of applying LUA in Vista.
