How do I check for illegal characters in a path?
Be careful when relying on Path.GetInvalidFileNameChars, which may not be as reliable as you'd think. Notice the following remark in the MSDN documentation on Path.GetInvalidFileNameChars:
The array returned from this method is not guaranteed to contain the complete set of characters that are invalid in file and directory names. The full set of invalid characters can vary by file system. For example, on Windows-based desktop platforms, invalid path characters might include ASCII/Unicode characters 1 through 31, as well as quote ("), less than (<), greater than (>), pipe (|), backspace (\b), null (\0) and tab (\t).
The Path.GetInvalidPathChars method is no better: its documentation contains the exact same remark.
As of .NET Framework 4.7.2, Path.GetInvalidFileNameChars() reports the following 41 'bad' characters:
0x0000   0 '\0'     | 0x000d  13 '\r'     | 0x001b  27 '\u001b'
0x0001   1 '\u0001' | 0x000e  14 '\u000e' | 0x001c  28 '\u001c'
0x0002   2 '\u0002' | 0x000f  15 '\u000f' | 0x001d  29 '\u001d'
0x0003   3 '\u0003' | 0x0010  16 '\u0010' | 0x001e  30 '\u001e'
0x0004   4 '\u0004' | 0x0011  17 '\u0011' | 0x001f  31 '\u001f'
0x0005   5 '\u0005' | 0x0012  18 '\u0012' | 0x0022  34 '"'
0x0006   6 '\u0006' | 0x0013  19 '\u0013' | 0x002a  42 '*'
0x0007   7 '\a'     | 0x0014  20 '\u0014' | 0x002f  47 '/'
0x0008   8 '\b'     | 0x0015  21 '\u0015' | 0x003a  58 ':'
0x0009   9 '\t'     | 0x0016  22 '\u0016' | 0x003c  60 '<'
0x000a  10 '\n'     | 0x0017  23 '\u0017' | 0x003e  62 '>'
0x000b  11 '\v'     | 0x0018  24 '\u0018' | 0x003f  63 '?'
0x000c  12 '\f'     | 0x0019  25 '\u0019' | 0x005c  92 '\\'
                    | 0x001a  26 '\u001a' | 0x007c 124 '|'
As noted by another poster, this is a proper superset of the set of characters returned by Path.GetInvalidPathChars().
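The superset claim can be checked on whatever runtime you are targeting. The following is a minimal sketch; the class and method names (InvalidCharSets, PathSetIsSubsetOfFileNameSet, OnlyInvalidInFileNames) are made up for illustration, and the characters it prints depend on the platform and framework version, so treat the 41-character table above as one data point rather than a guarantee.

```csharp
using System;
using System.IO;
using System.Linq;

public static class InvalidCharSets
{
    // True when every character reported as invalid in a path is also
    // reported as invalid in a file name on the current runtime.
    public static bool PathSetIsSubsetOfFileNameSet()
    {
        char[] fileNameChars = Path.GetInvalidFileNameChars();
        return Path.GetInvalidPathChars()
                   .All(c => Array.IndexOf(fileNameChars, c) >= 0);
    }

    // Characters that are invalid in a file name but not in a path,
    // such as directory separators.
    public static char[] OnlyInvalidInFileNames() =>
        Path.GetInvalidFileNameChars()
            .Except(Path.GetInvalidPathChars())
            .OrderBy(c => c)
            .ToArray();

    public static void Main()
    {
        Console.WriteLine($"Path set is a subset: {PathSetIsSubsetOfFileNameSet()}");
        foreach (char c in OnlyInvalidInFileNames())
            Console.WriteLine($"0x{(int)c:x4} {(int)c}");
    }
}
```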
The following function detects the exact set of 41 characters shown above:
    public static bool IsInvalidFileNameChar(Char c) => c < 64U ?
        (1UL << c & 0xD4008404FFFFFFFFUL) != 0 :
        c == '\\' || c == '|';
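For anyone wondering where the constant comes from: bit n of the mask is set when code point n (below 64) is in the table, and the two characters above 63, backslash and pipe, are compared directly. Here is a small sketch that rebuilds the mask from the character list; the class name InvalidFileNameMask and its members are illustrative only, not part of any framework API.

```csharp
using System;

public static class InvalidFileNameMask
{
    // Printable invalid characters with code points below 64:
    // '"' (34), '*' (42), '/' (47), ':' (58), '<' (60), '>' (62), '?' (63).
    private const string PrintableBelow64 = "\"*/:<>?";

    // Sets one bit per invalid code point in the range 0..63.
    public static ulong Build()
    {
        ulong mask = 0;
        for (int c = 0; c < 32; c++)          // control characters 0..31
            mask |= 1UL << c;
        foreach (char c in PrintableBelow64)  // the seven printable ones
            mask |= 1UL << c;
        return mask;
    }

    public static readonly ulong Mask = Build();

    // Same test as the one-liner above, using the rebuilt mask.
    public static bool IsInvalidFileNameChar(char c) =>
        c < 64 ? (1UL << c & Mask) != 0 : c == '\\' || c == '|';

    public static void Main()
    {
        Console.WriteLine($"0x{Mask:X16}"); // prints 0xD4008404FFFFFFFF
    }
}
```

Keeping the character list next to the mask makes it easier to adjust if your target platform reports a different set.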
I ended up borrowing and combining a few internal .NET implementations to come up with a performant method:
    /// <summary>Determines if the path contains invalid characters.</summary>
    /// <remarks>This method is intended to prevent an ArgumentException from being thrown when creating a new FileInfo on a file path with invalid characters.</remarks>
    /// <param name="filePath">File path.</param>
    /// <returns>True if file path contains invalid characters.</returns>
    private static bool ContainsInvalidPathCharacters(string filePath)
    {
        for (var i = 0; i < filePath.Length; i++)
        {
            int c = filePath[i];
            if (c == '\"' || c == '<' || c == '>' || c == '|' || c == '*' || c == '?' || c < 32)
                return true;
        }
        return false;
    }
I then used it like so, also wrapping the FileInfo construction in a try/catch block for safety:
    if (!string.IsNullOrWhiteSpace(path) && !ContainsInvalidPathCharacters(path))
    {
        FileInfo fileInfo = null;
        try
        {
            fileInfo = new FileInfo(path);
        }
        catch (ArgumentException)
        {
            // The path was rejected anyway; fileInfo stays null.
        }
        ...
    }
InvalidPathChars is deprecated. Use GetInvalidPathChars() instead:
    public static bool FilePathHasInvalidChars(string path)
    {
        return !string.IsNullOrEmpty(path) && path.IndexOfAny(System.IO.Path.GetInvalidPathChars()) >= 0;
    }
Edit: Slightly longer, but handles path vs file invalid chars in one function:
    // WARNING: Not tested
    public static bool FilePathHasInvalidChars(string path)
    {
        bool ret = false;
        if (!string.IsNullOrEmpty(path))
        {
            try
            {
                // Careful!
                // Path.GetDirectoryName("C:\Directory\SubDirectory")
                // returns "C:\Directory", which may not be what you want in
                // this case. You may need to explicitly add a trailing \
                // if path is a directory and not a file path. As written,
                // this function just assumes path is a file path.
                string fileName = System.IO.Path.GetFileName(path);
                string fileDirectory = System.IO.Path.GetDirectoryName(path);
                // We don't need to do anything else:
                // if we got here without throwing an
                // exception, then the path does not
                // contain invalid characters.
            }
            catch (ArgumentException)
            {
                // On .NET Framework (and .NET Core before 2.1) the Path
                // functions throw this if path contains invalid chars.
                // Newer runtimes no longer validate characters in these
                // calls, so this branch will not be reached there.
                ret = true;
            }
        }
        return ret;
    }
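If you would rather not depend on exceptions (which newer runtimes do not throw from these Path methods), the two character sets can be checked explicitly. This is a rough sketch under a few assumptions: the class and method names (PathCharCheck, HasInvalidChars) are hypothetical, the path is treated as a file path, and the file name is taken to be everything after the last directory separator. Note that a Windows drive-relative path such as "C:file.txt" would be flagged here because of the colon, and the usual caveat about the sets being incomplete still applies.

```csharp
using System;
using System.IO;

public static class PathCharCheck
{
    public static bool HasInvalidChars(string path)
    {
        if (string.IsNullOrEmpty(path))
            return false;

        // Whole path: characters that are invalid anywhere in a path.
        if (path.IndexOfAny(Path.GetInvalidPathChars()) >= 0)
            return true;

        // Last component: apply the stricter file-name set.
        char[] separators =
        {
            Path.DirectorySeparatorChar,
            Path.AltDirectorySeparatorChar
        };
        int lastSeparator = path.LastIndexOfAny(separators);
        string fileName = path.Substring(lastSeparator + 1);
        return fileName.IndexOfAny(Path.GetInvalidFileNameChars()) >= 0;
    }

    public static void Main()
    {
        Console.WriteLine(HasInvalidChars("dir/file.txt"));
        Console.WriteLine(HasInvalidChars("dir/fi\0le.txt"));
    }
}
```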