A.M. Hoornweg

ExtractFileDrive bug


I've just discovered that ExtractFileDrive misses a corner case on Windows and posted a QC about it. 

https://quality.embarcadero.com/browse/RSP-31109

I was experimenting with very long file and path names (see https://docs.microsoft.com/en-us/windows/win32/fileio/maximum-file-path-limitation) and stumbled upon a file name syntax which I hadn't seen before. It turned out that Delphi didn't know about it either.

The syntax is \\?\UNC\server\share, which is just another way of writing \\server\share.

Delphi's ExtractFileDrive returns "\\?\UNC" on such a path which is meaningless.
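To make the equivalence of the two spellings concrete, here is a minimal Python sketch (an illustration only, not Delphi RTL code) that rewrites an extended-length path into its conventional form. The name to_conventional_path is hypothetical, and matching the "UNC" marker case-insensitively is an assumption.

```python
EXT_PREFIX = "\\\\?\\"           # the four characters \\?\
EXT_UNC_PREFIX = "\\\\?\\UNC\\"  # the eight characters \\?\UNC\


def to_conventional_path(path: str) -> str:
    """Rewrite an extended-length path in its conventional spelling.

    \\\\?\\UNC\\server\\share\\x  becomes  \\\\server\\share\\x
    \\\\?\\C:\\x                  becomes  C:\\x
    Anything else is returned unchanged.
    """
    # Assumption: the "UNC" marker is compared case-insensitively.
    if path[:len(EXT_UNC_PREFIX)].upper() == EXT_UNC_PREFIX:
        # Replace \\?\UNC\ with the plain UNC lead-in \\
        return "\\\\" + path[len(EXT_UNC_PREFIX):]
    if path.startswith(EXT_PREFIX):
        # Drop \\?\ and keep the drive-letter path that follows
        return path[len(EXT_PREFIX):]
    return path
```

This is a purely lexical rewrite. It does not check that the server, share, or drive exists.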

2 hours ago, A.M. Hoornweg said:

Delphi's ExtractFileDrive returns "\\?\UNC" on such a path which is meaningless.

One could argue that it's equally meaningless to use ExtractFileDrive on that path since whatever comes after the "\\?\" is handled directly by the file system and not by the file namespace parser.

I guess the correct thing to do would be to return an empty string as documented:

Quote

For file names with drive letters, the result is in the form '<drive>'.

For file names with a UNC path (Universal Naming Convention), the result is in the form '\\<servername>\<sharedname>'.

If the given path contains neither style of path prefix, the result is an empty string.

I do not agree with your suggestion that it should attempt to extract anything if the string starts with "\\?\" (or \\.\ for that matter). The result would be meaningless in the context.


The \\?\UNC prefix is for Unicode paths up to 32K characters in length, so you can work around the MAX_PATH limit. I wouldn't expect ExtractFileDrive to work with these, as you would add the \\?\UNC prefix before calling CopyFileW() etc.

There is a similar syntax for drive letters.

Have a look at the Microsoft documentation (it looks like they've changed something in Windows 10 recently as well): https://docs.microsoft.com/en-us/windows/win32/api/winbase/nf-winbase-copyfile

2 minutes ago, Anders Melander said:

One could argue that it's equally meaningless to use ExtractFileDrive on that path since whatever comes after the "\\?\" is handled directly by the file system and not by the file namespace parser.

I guess the correct thing to do would be to return an empty string as documented:

I do not agree with your suggestion that it should attempt to extract anything if the string starts with "\\?\" (or \\.\ for that matter). The result would be meaningless in the context.

I respectfully disagree.

The prefix "\\?\" is a well-known method to tell Windows that a program is able to parse file names longer than MAX_PATH. It seems that the prefix \\?\UNC\  extends this ability to network paths. Otherwise I wouldn't bother with them. 

But in this specific case I needed to test if two paths referred to the same volume so I used ExtractFileDrive, which failed on this syntax.

19 minutes ago, David Hoyle said:

The \\?\UNC prefix is for Unicode paths up to 32K characters in length, so you can work around the MAX_PATH limit. I wouldn't expect ExtractFileDrive to work with these, as you would add the \\?\UNC prefix before calling CopyFileW() etc. There is a similar syntax for drive letters.

The word "drive" is a bit unfortunate. What ExtractFileDrive() really does is return the root of a volume, which can be either a drive letter or a file share.

ExtractFileDrive('\\server\share\folder\filename.ext') returns '\\server\share', which is perfectly OK. I can use the result as the root path for a folder structure without problems.

ExtractFileDrive('\\?\C:\folder\filename.ext') returns '\\?\C:', which is also perfectly OK. This result, too, can be used as the root path for a folder structure without problems.

The only corner case that isn't handled correctly is "\\?\UNC\" which is needed for long network paths. I think it wouldn't hurt anyone to support that syntax as well.
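The cases listed in this post can be sketched as a single lookup. This is a Python illustration of the behaviour being requested, not Delphi's actual ExtractFileDrive. The name extract_volume_root is hypothetical, and returning '\\?\UNC\server\share' for the long UNC form is an assumption about what a fix might return. Device paths (\\.\) and volume GUID paths get no special treatment here.

```python
import string

SEP = "\\"
EXT_PREFIX = "\\\\?\\"           # \\?\
EXT_UNC_PREFIX = "\\\\?\\UNC\\"  # \\?\UNC\


def _is_drive(text: str) -> bool:
    """True if text starts with an ASCII drive letter followed by a colon."""
    return len(text) >= 2 and text[0] in string.ascii_letters and text[1] == ":"


def _server_share(rest: str, prefix: str) -> str:
    """Keep the first two components of 'server\\share\\...' behind prefix."""
    parts = rest.split(SEP)
    if len(parts) >= 2 and parts[0] and parts[1]:
        return prefix + parts[0] + SEP + parts[1]
    return ""


def extract_volume_root(path: str) -> str:
    """Return the volume root of path, or '' if no known prefix matches."""
    # Long UNC form (assumed result: prefix + server + share).
    if path[:len(EXT_UNC_PREFIX)].upper() == EXT_UNC_PREFIX:
        return _server_share(path[len(EXT_UNC_PREFIX):], path[:len(EXT_UNC_PREFIX)])
    # Long drive-letter form: \\?\C:
    if path.startswith(EXT_PREFIX):
        rest = path[len(EXT_PREFIX):]
        return EXT_PREFIX + rest[:2] if _is_drive(rest) else ""
    # Plain UNC form: \\server\share
    if path.startswith(SEP + SEP):
        return _server_share(path[2:], SEP + SEP)
    # Plain drive-letter form: C:
    if _is_drive(path):
        return path[:2]
    return ""
```

Comparing two such roots is still only a lexical check. It says nothing about mapped drives or mount points that lead to the same volume by a different name.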

10 minutes ago, A.M. Hoornweg said:

The prefix "\\?\" is a well-known method to tell Windows that a program is able to parse file names longer than MAX_PATH. It seems that the prefix \\?\UNC\  extends this ability to network paths. Otherwise I wouldn't bother with them.

\\?\UNC\ isn't an extension of \\?\

https://docs.microsoft.com/en-us/windows/win32/fileio/naming-a-file#win32-file-namespaces

Quote

For file I/O, the "\\?\" prefix to a path string tells the Windows APIs to disable all string parsing and to send the string that follows it straight to the file system.

[...]

Because it turns off automatic expansion of the path string, the "\\?\" prefix also allows the use of ".." and "." in the path names, which can be useful if you are attempting to perform operations on a file with these otherwise reserved relative path specifiers as part of the fully qualified path.

Many but not all file I/O APIs support "\\?\"; you should look at the reference topic for each API to be sure.

This means that if the path starts with \\?\ then whatever follows is beyond what ExtractFileDrive was meant to handle. It could attempt to just strip the \\?\ part and retry, but I think it would be reasonable to give up and just return an empty string.

19 minutes ago, A.M. Hoornweg said:

I needed to test if two paths referred to the same volume

Good luck with that. It isn't even possible to determine if two fully qualified paths refer to the same file.

3 hours ago, Anders Melander said:

It isn't even possible to determine if two fully qualified paths refer to the same file.

Yes, it is.  In fact, there are several ways to do exactly that.  For example, parse the 2 paths into absolute PIDLs, and then see if they compare equal.  Or, open the files, retrieve their volume serial numbers and file identifiers with GetFileInformationByHandle/Ex(), and see if they compare equal.
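The second approach (volume identifier plus file identifier) can be sketched without calling the Win32 API directly. Python's os.stat returns an st_dev and st_ino pair; on Windows these are, as far as I know, filled from the volume serial number and file index, but treat that mapping as an assumption. same_file is a hypothetical helper name. The standard library's os.path.samefile performs the same kind of comparison.

```python
import os


def same_file(path_a: str, path_b: str) -> bool:
    """Return True if both paths resolve to the same file on the same volume.

    Both files must exist and be accessible: os.stat raises OSError otherwise.
    The comparison uses the (volume/device id, file id) pair, not the path text.
    """
    a = os.stat(path_a)
    b = os.stat(path_b)
    return (a.st_dev, a.st_ino) == (b.st_dev, b.st_ino)
```

Because the check compares identifiers rather than path strings, two hard links to one file compare equal even though their names differ.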

3 hours ago, Remy Lebeau said:

Yes, it is.  In fact, there are several ways to do exactly that.  For example, parse the 2 paths into absolute PIDLs, and then see if they compare equal.  Or, open the files, retrieve their volume serial numbers and file identifiers with GetFileInformationByHandle/Ex(), and see if they compare equal.

Well, I meant without opening the file. Of course Windows itself is able to determine if two local files are the same. If opening the file is OK then yes, GetFileInformationByHandle will get the job done. There are also other API functions that can be used if opening the file is okay. For example, the undocumented NtQueryObject API can be used to get the logical filename of just about anything (e.g. C:\Windows -> \Device\HarddiskVolume1\Windows).

The PIDL solution definitely won't work. PIDLs live in the shell namespace and you need to go much lower than that to determine identity.

56 minutes ago, Anders Melander said:

Well I meant without opening the file.

In that case, yes, it isn't possible without some degree of parsing the path components and mapping them to something real to see if they map to the same thing.

56 minutes ago, Anders Melander said:

The PIDL solution definitely won't work.

Yes, it does.  I've been using it for years.  I have an app that displays files, tracking their PIDLs.  If the user tells the app to open a new file, and its PIDL is already open, I jump to the existing display instead.  It works just fine.  It may not be the BEST approach, but it works.  When I first wrote the code, the only approach I knew about was to convert both paths to their short 8.3 form and then do a simple string comparison.  I later found the PIDL approach to be much more reliable, so that is what the app uses now.  Later, I learned about the GetFileInformationByHandle() approach, but I never got a chance to update the code.

56 minutes ago, Anders Melander said:

PIDLs live in the shell namespace

So?  The filesystem is part of that.  Filesystem paths can be converted to PIDLs; the filesystem shell provider will map the folder and file components accordingly.  Get the IShellFolder interface for the root Desktop namespace, pass both filesystem paths to its ParseDisplayName() method (short paths, long paths, it doesn't matter), and it will ask the filesystem to parse them and provide absolute PIDLs, which will compare equal if they refer to the same file.  Trust me, it works, I use it.

37 minutes ago, Remy Lebeau said:

Get the IShellFolder interface for the root Desktop namespace, pass both filesystem paths to its ParseDisplayName() method (short paths, long paths, it doesn't matter), and it will ask the filesystem to parse them and provide absolute PIDLs, which will compare equal if they refer to the same file.  Trust me, it works, I use it.

So how do you deal with mapped drives, mounted volumes, symbolic links, junctions and hard links?

24 minutes ago, Anders Melander said:

So how do you deal with mapped drives, mounted volumes

My app only deals with local files, not remote files.  It accesses the file data using memory-mapped views, which are not coherent over remote connections.

24 minutes ago, Anders Melander said:

symbolic links, junctions and hard links?

I hear what you are saying, but honestly that hasn't been an issue for my app yet.  But I can see how PIDLs would not work for those.
