A site devoted to discussing techniques that promote quality and ethical practices in software development.

Sunday, October 5, 2008

Is it wise to break convention - the wildcard convention

I have been a long-term user of 7Za.exe, the command-line version of 7-Zip, and recently have been using it to back up my Subversion repositories. Like many who use this kind of program, I was too trusting: I believed it would honor the same convention as other CMD commands and archive all files, except those in use, when given a command line like this:
7za a -tzip Test.zip MyWork\*.* -r
To many long-time users of the Windows command prompt, this is the standard way to instruct a program to process all files. Del, Attrib, Copy, XCopy, Cacls, Dir, RoboCopy, and Rar all conform to this convention, and hence one naturally assumes that 7Za.exe would honor it too.

Not so, and I learned this the painful way. According to the help file of 7-Zip:
7-Zip doesn't follow the archaic rule by which *.* means any file. 7-Zip treats *.* as matching the name of any file that has an extension. To process all files, you must use a * wildcard.
Well, while it is admirable for someone to take a stand against this 'archaic rule', it is dangerous to accept the same syntax and then silently produce a different result set. It is like switching the active and neutral wires in electrical wiring just because someone took exception to the color-coding convention of the wires.

The fact remains that before 7-Zip came along to Windows, that convention had already been established and entrenched, long before Win 95, which 7-Zip mistakenly credits with introducing it. Millions of users are accustomed to this convention, whether it is logical or illogical; it has become second nature to them. It is like arguing whether it is logical or illogical to drive on the left-hand side or the right-hand side of the road: a lone group of dissenting drivers taking a stand can not only play havoc on our roads but also cause fatalities.

A convention has been established, and people driving on public roads therefore have to conform to it, like it or not. In the US, vehicles in a mine drive on the opposite side to those on public roads. The change of convention is explicitly stated, for reasons that are sound in mining operations, and drivers are deliberately made to go through a changeover section.

7Za, however, did no such thing. It took *.* to mean 'all files that have extensions', in contrast to the Windows convention, under which it means 'all files, with or without extensions' and which is understood by millions of users.

This is not a dispute over whether 7Za is correct. It is raised here because of its dangerous and irresponsible practice of flouting a convention while using the same syntax.

For example,
7za a -tzip Test.zip MyWork\*.* -r
should pick up all files in a Subversion repository regardless of whether the files have extensions or not; many Subversion files do not have extensions. Rar, with this command, picks up all files, honoring the Windows convention:
Rar a -r Test.rar MyWork\*.*
This is not only wise but also responsible: it recognizes the consequences of failing to conform. Taking a stand brings a hollow victory at best and the wrath of users at worst; such is the case with 7za now.

According to the Windows API and its treatment of wildcard characters, there is no way to specify collecting only files with extensions. Using Dir as an example:
Dir *.* /s /b
and
Dir * /s /b
produce the same result: a list of files with or without extensions. The second form is 7Za's way of specifying any file with or without an extension, yet it also accepts the first form and produces a totally different result set.
Dir *. /s /b
produces a list of files without extensions. But there is no wildcard syntax to say 'only files with extensions'.
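The difference between the two readings can be sketched in a few lines of Python. This is only a simplified model of the behavior described above, not the Windows API and not 7-Zip's code; the helper names and the sample file names are assumptions made for illustration.

```python
from fnmatch import fnmatchcase


def matches_cmd_convention(name: str, pattern: str) -> bool:
    """Simplified model of the CMD convention (hypothetical helper)."""
    name, pattern = name.lower(), pattern.lower()
    if pattern == "*.*":
        # Convention: *.* means every file, with or without an extension.
        return True
    if pattern.endswith(".") and len(pattern) > 1:
        # Convention: a trailing dot selects names that have no extension.
        return "." not in name and fnmatchcase(name, pattern[:-1])
    return fnmatchcase(name, pattern)


def matches_literal(name: str, pattern: str) -> bool:
    """Literal reading: a dot in the pattern must match a dot in the name."""
    return fnmatchcase(name.lower(), pattern.lower())


def select(names, pattern, matcher):
    """Return the names that the given matcher accepts for the pattern."""
    return [n for n in names if matcher(n, pattern)]


# Illustrative file names; most have no extension.
names = ["README.txt", "format", "current", "uuid"]

print(select(names, "*.*", matches_cmd_convention))  # all four names
print(select(names, "*.*", matches_literal))         # only README.txt
print(select(names, "*", matches_literal))           # all four names
```

Under the literal reading, the same *.* pattern silently drops every file that has no extension, which is exactly the kind of gap that goes unnoticed in a backup.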

As a result of 7-Zip's hollow stand against a convention entrenched among far more Windows users than 7za has, it has successfully dislodged people's trust in this program. An archiver should behave much like the copy/xcopy commands; changing the operation while keeping the same syntax is extremely dangerous, and developers should not toy with this kind of ideological stand in an important tool.

My ill-placed trust in 7za has cost me my collection of Subversion repositories. It is an expensive loss. 7Za's ideological stand against an illogical convention is equally illogical and has resulted in real loss; what is 7-Zip hoping to achieve? It has hardly won any friends!

The result is a total distrust of this tool. While I can use Subversion's commands to ascertain the integrity of a restored repository, other archives produced by 7Za have no such detection, and hence there remains an unknown number of imperfect archives.
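For archives in zip format, one way to look for such gaps is to compare the files on disk against the archive's listing. The following is a minimal sketch using Python's standard zipfile module. It assumes the archive stores paths relative to the parent of the source folder (for example MyWork/format), which should be checked against an actual archive first, and the function name is hypothetical.

```python
import os
import zipfile


def missing_from_zip(source_dir: str, zip_path: str) -> list:
    """List files under source_dir that have no entry in the zip archive.

    Assumption: entries are stored relative to the parent of source_dir.
    """
    with zipfile.ZipFile(zip_path) as zf:
        # Normalize separators so the comparison works on any platform.
        archived = {n.replace("\\", "/").rstrip("/") for n in zf.namelist()}

    base = os.path.dirname(os.path.abspath(source_dir))
    missing = []
    for root, _dirs, files in os.walk(source_dir):
        for file_name in files:
            full_path = os.path.join(root, file_name)
            rel = os.path.relpath(full_path, base).replace(os.sep, "/")
            if rel not in archived:
                missing.append(rel)
    return sorted(missing)
```

Calling missing_from_zip("MyWork", "Test.zip") returns an empty list only when every file on disk has an entry in the archive. The comparison is case-sensitive and checks names only, not file contents.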

I don't discourage 7za from taking this kind of admirable stand, but it should be selected by the user. The default should always follow the convention of the OS on which it is deployed. 7Za already has overriding switches/options of this kind, and its admirable attempt to correct the convention should take effect only when the user makes that choice.

If that is a Unix convention, then either build 7Za as a POSIX-conforming program that runs in Windows' POSIX subsystem, in which case 7Za can even use case-sensitive file names, or provide a switch to turn on the Unix convention.

This should serve as a real-life example of the danger of developers failing to conform to an entrenched convention, no matter how 'archaic' or illogical it is. The first occupancy rule applies here!

For me, 7Za is now banished to the Recycle Bin, as it is too dangerous to use tools that do not conform to convention. It will certainly not earn my recommendation.
