The error stems from a coding error in Borland's VCL.Net library code. It surfaces as data corruption because Microsoft tightened its compliance rules to conform to Unicode 5 in Framework 2 SP1.
This article pinpoints the exact cause in Borland's code. It is a very common coding error that was not picked up in code review, and earlier versions of the .Net Framework chose to ignore the mistake, thus masking it.
The VCL bug causes TRegistry.ReadString() to return a string with an additional Unicode character of value 0xFFFD appended to the end. This is Unicode's standard replacement character, substituted whenever the decoder detects an invalid Unicode character. Using this character is the default behavior in the .Net Framework.
It is worth noting that Microsoft.Win32.RegistryKey.GetValue() for REG_SZ data does not produce this error and is not affected by the installation of Framework 2 SP1.
Let's begin the code review from TRegistry.ReadString(), which can be found in Borland.Vcl.Registry.Pas line 546.
function TRegistry.ReadString(const Name: string): string;
var
  Len: Integer;
  RegData: TRegDataType;
  Buffer: TBytes;
begin
  Len := GetDataSize(Name);
  if Len > 0 then
  begin
    SetLength(Buffer, Len);
    GetData(Name, Buffer, Len, RegData);
    if (RegData = rdString) or (RegData = rdExpandString) then
    begin
      SetLength(Buffer, Len - 1); // <<--- Line(A) - The mistake.
      // ....
end;

The coding error is located in the SetLength() call marked Line(A). To understand why this is a mistake, we need to refer to the PInvoke declaration for the registry access function RegQueryValueEx(), which is the cornerstone of GetDataSize() and GetData().
The declaration can be found in Borland.Vcl.Windows.Pas, line 21,265 and is reproduced in part here:

[SuppressUnmanagedCodeSecurity, DllImport(advapi32, CharSet = CharSet.Auto, SetLastError = True, EntryPoint = 'RegQueryValueEx')]
function RegQueryValueEx(hKey: HKEY; lpValueName: string;
  lpReserved: IntPtr; ..... ): Longint; external;

According to the MSDN documentation for CharSet.Auto, on NT-based versions of Windows this declaration causes all strings to be marshaled as 2-byte Unicode strings, and the call resolves to the RegQueryValueExW variant of the RegQueryValueEx function.
According to the documentation for RegQueryValueEx(), the data returned by RegQueryValueExW() for type REG_SZ is a 2-byte Unicode string, and the 6th parameter receives the size of that data:

"If the data has the REG_SZ, REG_MULTI_SZ or REG_EXPAND_SZ type, this size includes any terminating null character or characters unless the data was stored without them."

It is also worth noting that this parameter is measured in bytes, not in characters. Therefore, for a 2-byte Unicode string, this value is always even.
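To make the byte-versus-character distinction concrete, here is a minimal Python sketch. It does not call the registry; the helper `reg_sz_size_in_bytes` is a hypothetical name that only models the size a UTF-16 REG_SZ value would occupy, assuming it was stored with its null terminator.

```python
# Illustrative model only: this does not touch the Windows registry.
# reg_sz_size_in_bytes is a hypothetical helper, not part of any API.
def reg_sz_size_in_bytes(value: str) -> int:
    """Size in bytes of `value` as UTF-16LE, including a 2-byte null terminator."""
    return len((value + "\0").encode("utf-16-le"))


path = "C:\\Program Files"
size = reg_sz_size_in_bytes(path)

# 16 characters plus one terminator is 17 UTF-16 code units, or 34 bytes.
print(len(path), size)
```

Because every UTF-16 code unit is two bytes wide, the size in this model can never be odd.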
Now let us return to Line(A) above. Buffer is of type TBytes, which is an array of bytes, so subtracting 1 from an even length leaves an odd number of bytes. The result is a nonsensical UTF-16 string, since a valid one must consist of an even number of bytes. Instead of ending with two zero bytes, the UTF-16 null terminator, the buffer now ends with a single stray zero byte, which is clearly not a valid UTF-16 character.
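The effect of trimming one byte instead of one character can be reproduced outside the VCL. The sketch below is a Python model, not Borland's code: `raw` stands in for the buffer filled by GetData(), and `try_strict_decode` is a hypothetical helper. It assumes the value was stored with its terminator.

```python
# Model of the buffer returned for a REG_SZ value (terminator included).
raw = ("C:\\Program Files" + "\0").encode("utf-16-le")  # 34 bytes


def try_strict_decode(data: bytes):
    """Decode as UTF-16LE, returning None if the bytes are not valid UTF-16."""
    try:
        return data.decode("utf-16-le")
    except UnicodeDecodeError:
        return None


buggy = raw[:-1]    # mirrors SetLength(Buffer, Len - 1): 33 bytes, odd
trimmed = raw[:-2]  # drops one whole UTF-16 code unit: 32 bytes, even

print(len(buggy), try_strict_decode(buggy))      # odd length, not decodable
print(len(trimmed), try_strict_decode(trimmed))  # even length, clean string
```

In this model, removing two bytes drops the terminator cleanly, whereas removing one leaves half a code unit behind. This illustrates the arithmetic only and is not a claim about how the VCL should be patched.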
According to the knowledge base article:

"the trailing NULL byte was removed. However, now the NULL byte is converted to the Unicode replacement character."

As a result, a string returned from TRegistry.ReadString() that used to be, for example, "C:\Program Files" now becomes "C:\Program Files\xFFFD", which appears as "C:\Program Files�".
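The two behaviors described in the knowledge base article can be imitated with Python's decoder error handlers. This is only an analogy, assumed for illustration: `errors="ignore"` stands in for the older framework silently dropping the stray byte, and `errors="replace"` stands in for the Framework 2 SP1 behavior. It is not the .Net Framework's actual implementation.

```python
# Same odd-length buffer that Line(A) produces, modeled in Python.
raw = ("C:\\Program Files" + "\0").encode("utf-16-le")
buggy = raw[:-1]  # one stray zero byte left at the end

# Analogy for the pre-SP1 behavior: the invalid trailing byte is dropped.
dropped = buggy.decode("utf-16-le", errors="ignore")

# Analogy for the SP1 behavior: the invalid byte becomes U+FFFD.
replaced = buggy.decode("utf-16-le", errors="replace")

print(repr(dropped))
print(repr(replaced))
```

The input bytes are identical in both cases; only the decoder's policy for invalid data differs, which is why the bug stayed hidden until that policy changed.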
In conclusion, the extra character tacked onto the end is the result of Framework 2 SP1 exposing a programming error in the VCL library. As mentioned, the Microsoft.Win32.RegistryKey class does not mishandle the data in this way in any version of the framework. It is not a bug in Framework 2 SP1.
If you have a Delphi 2006.Net program, it is therefore recommended that you include an application configuration file containing the