Convert file path to a file URI?

The System.Uri constructor has the ability to parse full file paths and turn them into URI style paths. So you can just do the following:

var uri = new System.Uri("c:\\foo");
var converted = uri.AbsoluteUri;

The solutions above do not work on Linux.

Using .NET Core, attempting to execute new Uri("/home/foo/README.md") results in an exception:

Unhandled Exception: System.UriFormatException: Invalid URI: The format of the URI could not be determined.
   at System.Uri.CreateThis(String uri, Boolean dontEscape, UriKind uriKind)
   at System.Uri..ctor(String uriString)
   ...

You need to give the CLR some hints about what sort of URL you have.

This works:

Uri fileUri = new Uri(new Uri("file://"), "home/foo/README.md");

...and the string returned by fileUri.ToString() is "file:///home/foo/README.md"

This works on Windows, too.

new Uri(new Uri("file://"), @"C:\Users\foo\README.md").ToString()

...emits "file:///C:/Users/foo/README.md"


What no-one seems to realize is that none of the System.Uri constructors correctly handles certain paths with percent signs in them.

new Uri(@"C:\%51.txt").AbsoluteUri;

This gives you "file:///C:/Q.txt" instead of "file:///C:/%2551.txt".

Neither values of the deprecated dontEscape argument makes any difference, and specifying the UriKind gives the same result too. Trying with the UriBuilder doesn't help either:

new UriBuilder() { Scheme = Uri.UriSchemeFile, Host = "", Path = @"C:\%51.txt" }.Uri.AbsoluteUri

This returns "file:///C:/Q.txt" as well.

As far as I can tell the framework is actually lacking any way of doing this correctly.

We can try to it by replacing the backslashes with forward slashes and feed the path to Uri.EscapeUriString - i.e.

new Uri(Uri.EscapeUriString(filePath.Replace(Path.DirectorySeparatorChar, '/'))).AbsoluteUri

This seems to work at first, but if you give it the path C:\a b.txt then you end up with file:///C:/a%2520b.txt instead of file:///C:/a%20b.txt - somehow it decides that some sequences should be decoded but not others. Now we could just prefix with "file:///" ourselves, however this fails to take UNC paths like \\remote\share\foo.txt into account - what seems to be generally accepted on Windows is to turn them into pseudo-urls of the form file://remote/share/foo.txt, so we should take that into account as well.

EscapeUriString also has the problem that it does not escape the '#' character. It would seem at this point that we have no other choice but making our own method from scratch. So this is what I suggest:

public static string FilePathToFileUrl(string filePath)
{
  StringBuilder uri = new StringBuilder();
  foreach (char v in filePath)
  {
    if ((v >= 'a' && v <= 'z') || (v >= 'A' && v <= 'Z') || (v >= '0' && v <= '9') ||
      v == '+' || v == '/' || v == ':' || v == '.' || v == '-' || v == '_' || v == '~' ||
      v > '\xFF')
    {
      uri.Append(v);
    }
    else if (v == Path.DirectorySeparatorChar || v == Path.AltDirectorySeparatorChar)
    {
      uri.Append('/');
    }
    else
    {
      uri.Append(String.Format("%{0:X2}", (int)v));
    }
  }
  if (uri.Length >= 2 && uri[0] == '/' && uri[1] == '/') // UNC path
    uri.Insert(0, "file:");
  else
    uri.Insert(0, "file:///");
  return uri.ToString();
}

This intentionally leaves + and : unencoded as that seems to be how it's usually done on Windows. It also only encodes latin1 as Internet Explorer can't understand unicode characters in file urls if they are encoded.

Tags:

C#

.Net

Uri

Path