I wrote a piece of code that used System.IO.Directory.GetFiles(string, string) and ran it on Windows XP. I was totally dumbfounded when I discovered this function returning more than I was expecting. The same thing happened on Vista and Windows 7.
I was trying to get a list of files matching '*.tab' from a nominated directory, but the result also contained files matching '*.tab*'. I was shocked. So I went to my favourite tool, the command prompt, ran dir *.tab in the same directory, and got the same result. PowerShell does not produce this unexpected result.
Knowing that System.IO.Directory.GetFiles() most likely P/Invokes down to the Win32 API, I searched and confirmed that FindFirstFile() is responsible for this unexpected behaviour.
The explanation of this 'oddity' is given by Raymond Chen and you can blame the 8.3 file names for this problem.
So if you use System.IO.Directory.GetFiles(string, string), make sure you filter your results properly to avoid any unexpected results.
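A minimal sketch of such a filter, assuming you want an exact, case-insensitive match on the extension. The class name ExactExtensionFilter and the helpers KeepExactExtension and GetFilesExact are hypothetical names made up for this illustration; only Directory.GetFiles, Path.GetExtension and the LINQ calls are real framework APIs.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static class ExactExtensionFilter
{
    // Hypothetical helper: keeps only the paths whose extension is exactly
    // `extension` (including the leading dot, e.g. ".tab"), ignoring case.
    public static IEnumerable<string> KeepExactExtension(
        IEnumerable<string> paths, string extension)
    {
        return paths.Where(p => string.Equals(
            Path.GetExtension(p), extension, StringComparison.OrdinalIgnoreCase));
    }

    // Hypothetical wrapper: lets GetFiles do the coarse wildcard search,
    // then drops the extra hits such as "*.table" or "*.tab~".
    public static string[] GetFilesExact(string directory, string extension)
    {
        string[] candidates = Directory.GetFiles(directory, "*" + extension);
        return KeepExactExtension(candidates, extension).ToArray();
    }
}
```

With this in place, ExactExtensionFilter.GetFilesExact(dir, ".tab") should return only the files whose extension is really ".tab", whatever extra matches the underlying search brings back.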
There is no way in CMD to list only the files that meet that kind of exact requirement.
While Mr Chen's blog is not online, I ran a test: with *.pd the result looked fine; with *.pdf the result looked odd, as you found. Apparently some boundary-handling code in the API function only fits 8.3 names.
Yes, your observation is correct for *.pd and *.pdf. Try *.htm and *.html. You can see the effect if you do this:
dir *.* /x
This will show you the short (8.3) file names, which FindFirstFile() also scans, leading to the oddity.
This is further substantiated by the following remark:
A searchPattern with a file extension of exactly three characters returns files having an extension of three or more characters, where the first three characters match the file extension specified in the searchPattern.
Try *.pdf in a directory that also contains *.pdfx files and you can see the oddity.