How does the Windows RENAME command interpret wildcards?
These rules were discovered after extensive testing on a Vista machine. No tests were done with Unicode in file names.
RENAME requires 2 parameters - a sourceMask, followed by a targetMask. Both the sourceMask and targetMask can contain `*` and/or `?` wildcards. The behavior of the wildcards changes slightly between source and target masks.
Note - REN can be used to rename a folder, but wildcards are not allowed in either the sourceMask or targetMask when renaming a folder. If the sourceMask matches at least one file, then the file(s) will be renamed and folders will be ignored. If the sourceMask matches only folders and not files, then a syntax error is generated if wildcards appear in source or target. If the sourceMask does not match anything, then a "file not found" error results.
Also, when renaming files, wildcards are only allowed in the file name portion of the sourceMask. Wildcards are not allowed in the path leading up to the file name.
sourceMask
The sourceMask works as a filter to determine which files are renamed. The wildcards work here the same as with any other command that filters file names.
`?` - Matches any 0 or 1 character except `.` This wildcard is greedy - it always consumes the next character if it is not a `.` However it will match nothing without failure if at name end or if the next character is a `.`
`*` - Matches any 0 or more characters including `.` (with one exception below). This wildcard is not greedy. It will match as little or as much as is needed to enable subsequent characters to match.
All non-wildcard characters must match themselves, with a few special case exceptions.
`.` - Matches itself or it can match the end of name (nothing) if no more characters remain. (Note - a valid Windows name cannot end with `.`)
`{space}` - Matches itself or it can match the end of name (nothing) if no more characters remain. (Note - a valid Windows name cannot end with `{space}`)
`*.` at the end - Matches any 0 or more characters except `.` The terminating `.` can actually be any combination of `.` and `{space}` as long as the very last character in the mask is `.` This is the one and only exception where `*` does not simply match any set of characters.
The above rules are not that complex. But there is one more very important rule that makes the situation confusing: The sourceMask is compared against both the long name and the short 8.3 name (if it exists). This last rule can make interpretation of the results very tricky, because it is not always obvious when the mask is matching via the short name.
It is possible to use RegEdit to disable the generation of short 8.3 names on NTFS volumes, at which point interpretation of file mask results is much more straightforward. Any short names that were generated before disabling short names will remain.
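The long-name half of the sourceMask rules can be sketched in Python. This is only an illustrative model built from the rules listed above, not how Windows implements matching. The function name `source_mask_matches` is hypothetical, case-insensitive comparison is an assumption, and short 8.3 names are ignored entirely, so a file that Windows matches only via its short name will be reported as not matching here.

```python
def source_mask_matches(mask: str, name: str) -> bool:
    """Model of sourceMask matching against a long file name only."""
    # Assumption: matching is case-insensitive.
    mask, name = mask.lower(), name.lower()

    def is_star_dot_tail(m: str) -> bool:
        # "*" followed only by "." and spaces, with "." as the very last char.
        return (len(m) >= 2 and m[0] == "*" and m[-1] == "."
                and all(ch in ". " for ch in m[1:]))

    def match(m: str, n: str) -> bool:
        if not m:
            return not n
        if is_star_dot_tail(m):
            # The one exception: "*." at the end matches anything without a dot.
            return "." not in n
        head, rest = m[0], m[1:]
        if head == "*":
            # Not greedy: try every possible length for the match.
            return any(match(rest, n[i:]) for i in range(len(n) + 1))
        if head == "?":
            # Greedy: always consume the next character unless it is "."
            # or the name has ended, in which case match nothing.
            if n and n[0] != ".":
                return match(rest, n[1:])
            return match(rest, n)
        if head in ". ":
            # Matches itself, or the end of the name if nothing remains.
            if n:
                return n[0] == head and match(rest, n[1:])
            return match(rest, n)
        return bool(n) and n[0] == head and match(rest, n[1:])

    return match(mask, name)
```

For example, `source_mask_matches("*.", "abc")` is true while `source_mask_matches("*.", "abc.txt")` is false, which is the `*.` exception described above.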
targetMask
Note - I haven't done any rigorous testing, but it appears these same rules also work for the target name of the COPY command.
The targetMask specifies the new name. It is always applied to the full long name. The targetMask is never applied to the short 8.3 name, even if the sourceMask matched the short 8.3 name.
The presence or absence of wildcards in the sourceMask has no impact on how wildcards are processed in the targetMask.
In the following discussion - `c` represents any character that is not `*`, `?`, or `.`
The targetMask is processed against the source name strictly from left to right with no back-tracking.
`c` - Advances the position within the source name only if the source character is not `.`, and always appends `c` to the target name. (Replaces the character that was in source with `c`, but never replaces `.`)
`?` - Matches the next character from the source long name and appends it to the target name as long as the source character is not `.` If the next character is `.` or if at the end of the source name then no character is added to the result and the current position within the source name is unchanged.
`*` at end of targetMask - Appends all remaining characters from source to the target. If already at the end of source, then does nothing.
`*c` - Matches all source characters from current position through the last occurrence of `c` (case sensitive greedy match) and appends the matched set of characters to the target name. If `c` is not found, then all remaining characters from source are appended, followed by `c`. This is the only situation I am aware of where Windows file pattern matching is case sensitive.
`*.` - Matches all source characters from current position through the last occurrence of `.` (greedy match) and appends the matched set of characters to the target name. If `.` is not found, then all remaining characters from source are appended, followed by `.`
`*?` - Appends all remaining characters from source to the target. If already at end of source then does nothing.
`.` without `*` in front - Advances the position in source through the first occurrence of `.` without copying any characters, and appends `.` to the target name. If `.` is not found in the source, then advances to the end of source and appends `.` to the target name.
After the targetMask has been exhausted, any trailing `.` and `{space}` are trimmed off the end of the resulting target name because Windows file names cannot end with `.` or `{space}`.
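The targetMask rules above are mechanical enough to sketch in Python. This is a minimal model written directly from those rules, not a description of what REN does internally. The function name `apply_target_mask` is hypothetical, and masks with consecutive `*` are not covered by the rules, so the sketch gives them no special handling.

```python
def apply_target_mask(source: str, mask: str) -> str:
    """Build the target name from a source long name and a targetMask."""
    out = []
    pos = 0  # current position within the source name
    i = 0    # current position within the mask
    while i < len(mask):
        ch = mask[i]
        if ch == "*":
            nxt = mask[i + 1] if i + 1 < len(mask) else None
            rest = source[pos:]
            if nxt is None or nxt == "?":
                # "*" at end, or "*?": append everything that remains.
                out.append(rest)
                pos = len(source)
                i += 1 if nxt is None else 2
            else:
                # "*c" or "*.": greedy, case sensitive, through the last match.
                cut = rest.rfind(nxt)
                if cut == -1:
                    out.append(rest + nxt)
                    pos = len(source)
                else:
                    out.append(rest[:cut + 1])
                    pos += cut + 1
                i += 2
        elif ch == "?":
            # Copy one source character unless it is "." or the source ended.
            if pos < len(source) and source[pos] != ".":
                out.append(source[pos])
                pos += 1
            i += 1
        elif ch == ".":
            # Skip through the first "." in the source (or to its end).
            dot = source.find(".", pos)
            pos = len(source) if dot == -1 else dot + 1
            out.append(".")
            i += 1
        else:
            # Literal: replaces one source character, but never a ".".
            if pos < len(source) and source[pos] != ".":
                pos += 1
            out.append(ch)
            i += 1
    # Trailing dots and spaces are trimmed from the result.
    return "".join(out).rstrip(". ")
```

Calling `apply_target_mask("1234.txt", "A?Z*")` walks the mask left to right: `A` replaces `1`, `?` copies `2`, `Z` replaces `3`, and the final `*` appends `4.txt`.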
Some practical examples
Substitute a character in the 1st and 3rd positions prior to any extension (adds a 2nd or 3rd character if it doesn't exist yet)
ren * A?Z*
1 -> AZ
12 -> A2Z
1.txt -> AZ.txt
12.txt -> A2Z.txt
123 -> A2Z
123.txt -> A2Z.txt
1234 -> A2Z4
1234.txt -> A2Z4.txt
Change the (final) extension of every file
ren * *.txt
a -> a.txt
b.dat -> b.txt
c.x.y -> c.x.txt
Append an extension to every file
ren * *?.bak
a -> a.bak
b.dat -> b.dat.bak
c.x.y -> c.x.y.bak
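The difference between `*.txt` and `*?.bak` comes down to where the `*` stops. As a rough illustration only, the two masks behave like the following Python expressions when the rules above are applied to a valid file name. Both function names are hypothetical.

```python
def change_final_extension(name: str) -> str:
    # Models "*.txt": keep everything before the last ".", or the whole
    # name if there is no ".", then add ".txt".
    return name.rsplit(".", 1)[0] + ".txt"


def append_extension(name: str) -> str:
    # Models "*?.bak": "*?" copies the entire name, then ".bak" is appended.
    return name + ".bak"
```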
Remove any extra extension after the initial extension. Note that adequate `?` must be used to preserve the full existing name and initial extension.
ren * ?????.?????
a -> a
a.b -> a.b
a.b.c -> a.b
part1.part2.part3 -> part1.part2
123456.123456.123456 -> 12345.12345 (note truncated name and extension because not enough `?` were used)
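A targetMask made only of `?` and a single `.` can be read as "keep the first two dot-delimited components, each cut to the number of `?` given". The following Python sketch models that reading for masks of the form `?????.?????`. The function name `keep_two_components` and the `width` parameter are hypothetical, and the model assumes the same number of `?` on both sides of the `.`.

```python
def keep_two_components(name: str, width: int = 5) -> str:
    # Keep at most the first two components of the name.
    parts = name.split(".")[:2]
    # Each "?" copies one character, so a component longer than `width`
    # is truncated. Trailing dots and spaces are trimmed as REN would.
    return ".".join(part[:width] for part in parts).rstrip(". ")
```

Raising `width` to 6 corresponds to adding one more `?` on each side of the mask.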
Same as above, but filter out files with initial name and/or extension longer than 5 chars so that they are not truncated. (Obviously could add an additional `?` on either end of targetMask to preserve names and extensions up to 6 chars long)
ren ?????.?????.* ?????.?????
a -> a
a.b -> a.b
a.b.c -> a.b
part1.part2.part3 -> part1.part2
123456.123456.123456 (Not renamed because doesn't match sourceMask)
Change characters after last `_` in name and attempt to preserve extension. (Doesn't work properly if `_` appears in extension)
ren *_* *_NEW.*
abcd_12345.txt -> abcd_NEW.txt
abc_newt_1.dat -> abc_newt_NEW.dat
abcdef.jpg (Not renamed because doesn't match sourceMask)
abcd_123.a_b -> abcd_123.a_NEW (not desired, but no simple RENAME form will work in this case)
Any name can be broken up into components that are delimited by `.` Characters may only be appended to or deleted from the end of each component. Characters cannot be deleted from or added to the beginning or middle of a component while preserving the remainder with wildcards. Substitutions are allowed anywhere.
ren ??????.??????.?????? ?x.????999.*rForTheCourse
part1.part2 -> px.part999.rForTheCourse
part1.part2.part3 -> px.part999.parForTheCourse
part1.part2.part3.part4 (Not renamed because doesn't match sourceMask)
a.b.c -> ax.b999.crForTheCourse
a.b.CarPart3BEER -> ax.b999.CarParForTheCourse
If short names are enabled, then a sourceMask with at least 8 `?` for the name and at least 3 `?` for the extension will match all files because it will always match the short 8.3 name.
ren ????????.??? ?x.????999.*rForTheCourse
part1.part2.part3.part4 -> px.part999.part3.parForTheCourse
Useful quirk/bug? for deleting name prefixes
This SuperUser post describes how a set of forward slashes (`/`) can be used to delete leading characters (except `.`) from a file name. One slash is required for each character to be deleted. I've confirmed the behavior on a Windows 10 machine.
ren "abc-*.txt" "////*.txt"
abc-123.txt --> 123.txt
abc-HelloWorld.txt --> HelloWorld.txt
Unfortunately leading `/` cannot remove `.` in a name. So the technique cannot be used to remove a prefix that contains `.`. For example:
ren "abc.xyz.*.txt" "////////*.txt"
abc.xyz.123.txt --> .xyz.123.txt
abc.xyz.HelloWorld.txt --> .xyz.HelloWorld.txt
This technique only works if both the source and target masks are enclosed in double quotes. All of the following forms without the requisite quotes fail with this error: The syntax of the command is incorrect
REM - All of these forms fail with a syntax error.
ren abc-*.txt "////*.txt"
ren "abc-*.txt" ////*.txt
ren abc-*.txt ////*.txt
The `/` cannot be used to remove any characters in the middle or end of a file name. It can only remove leading (prefix) characters. Also note this technique does not work with folder names.
Technically the `/` is not functioning as a wildcard. Rather it is doing a simple character substitution following the `c` target mask rule. But then after the substitution, the REN command recognizes that `/` is not valid in a file name, and strips the leading `/` slashes from the name. REN gives a syntax error if it detects `/` in the middle of a target name.
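The explanation above (substitute per the `c` rule, then strip the leading slashes) can be modeled in a few lines of Python. This is a simplified sketch, not REN itself: the function name `strip_prefix_model` is hypothetical, and it assumes the rest of the mask (such as `*.txt` on a `.txt` file) reproduces the remainder of the name unchanged.

```python
def strip_prefix_model(name: str, slash_count: int) -> str:
    pos = 0
    for _ in range(slash_count):
        # Each "/" replaces one source character, but the "c" rule never
        # advances past a ".", so extra slashes stall at the first dot.
        if pos < len(name) and name[pos] != ".":
            pos += 1
    # The substituted slashes are stripped, leaving only what was not consumed.
    return name[pos:]
```

With 8 slashes, `abc.xyz.123.txt` loses only `abc` because the remaining 5 slashes cannot consume the `.` that follows.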
Possible RENAME bug - a single command may rename the same file twice!
Starting in an empty test folder:
C:\test>copy nul 123456789.123
1 file(s) copied.
C:\test>dir /x
Volume in drive C is OS
Volume Serial Number is EE2C-5A11
Directory of C:\test
09/15/2012 07:42 PM <DIR> .
09/15/2012 07:42 PM <DIR> ..
09/15/2012 07:42 PM 0 123456~1.123 123456789.123
1 File(s) 0 bytes
2 Dir(s) 327,237,562,368 bytes free
C:\test>ren *1* 2*3.?x
C:\test>dir /x
Volume in drive C is OS
Volume Serial Number is EE2C-5A11
Directory of C:\test
09/15/2012 07:42 PM <DIR> .
09/15/2012 07:42 PM <DIR> ..
09/15/2012 07:42 PM 0 223456~1.XX 223456789.123.xx
1 File(s) 0 bytes
2 Dir(s) 327,237,562,368 bytes free
REM Expected result = 223456789.123.x
I believe the sourceMask `*1*` first matches the long file name, and the file is renamed to the expected result of `223456789.123.x`. RENAME then continues to look for more files to process and finds the newly named file via the new short name of `223456~1.X`. The file is then renamed again giving the final result of `223456789.123.xx`.
If I disable 8.3 name generation then the RENAME gives the expected result.
I haven't fully worked out all of the trigger conditions that must exist to induce this odd behavior. I was concerned that it might be possible to create a never ending recursive RENAME, but I was never able to induce one.
I believe all of the following must be true to induce the bug. Every bugged case I saw had the following conditions, but not all cases that met the following conditions were bugged.
- Short 8.3 names must be enabled
- The sourceMask must match the original long name.
- The initial rename must generate a short name that also matches the sourceMask
- The initial renamed short name must sort later than the original short name (if it existed?)
Similar to exebook, here's a C# implementation to get the target filename from a sourcefile.
I found 1 small error in dbenham's examples:
ren *_* *_NEW.*
abc_newt_1.dat -> abc_newt_NEW.txt (should be: abc_newt_NEW.dat)
Here's the code:
using System;
using System.IO;
using System.Linq;   // string.Contains(char) resolves via LINQ on older frameworks
using System.Text;

public static class FileUtil
{
    /// <summary>
    /// Returns a filename based on the sourcefile and the targetMask, as used in the second argument in rename/copy operations.
    /// targetMask may contain wildcards (* and ?).
    ///
    /// This follows the rules of: http://superuser.com/questions/475874/how-does-the-windows-rename-command-interpret-wildcards
    /// </summary>
    /// <param name="sourcefile">filename to change to target without wildcards</param>
    /// <param name="targetMask">mask with wildcards</param>
    /// <returns>a valid target filename given sourcefile and targetMask</returns>
    public static string GetTargetFileName(string sourcefile, string targetMask)
    {
        if (string.IsNullOrEmpty(sourcefile))
            throw new ArgumentNullException("sourcefile");
        if (string.IsNullOrEmpty(targetMask))
            throw new ArgumentNullException("targetMask");
        if (sourcefile.Contains('*') || sourcefile.Contains('?'))
            throw new ArgumentException("sourcefile cannot contain wildcards");

        // no wildcards: return complete mask as file
        if (!targetMask.Contains('*') && !targetMask.Contains('?'))
            return targetMask;

        var maskReader = new StringReader(targetMask);
        var sourceReader = new StringReader(sourcefile);
        var targetBuilder = new StringBuilder();

        // Process the mask strictly left to right, with no back-tracking
        while (maskReader.Peek() != -1)
        {
            int current = maskReader.Read();
            int sourcePeek = sourceReader.Peek();
            switch (current)
            {
                case '*':
                    int next = maskReader.Read();
                    switch (next)
                    {
                        case -1:
                        case '?':
                            // Append all remaining characters from sourcefile
                            targetBuilder.Append(sourceReader.ReadToEnd());
                            break;
                        default:
                            // Read source until the last occurrence of 'next'.
                            // We cannot seek in the StringReader, so we will create a new StringReader if needed
                            string sourceTail = sourceReader.ReadToEnd();
                            int lastIndexOf = sourceTail.LastIndexOf((char) next);
                            // If not found, append everything and the 'next' char
                            if (lastIndexOf == -1)
                            {
                                targetBuilder.Append(sourceTail);
                                targetBuilder.Append((char) next);
                            }
                            else
                            {
                                string toAppend = sourceTail.Substring(0, lastIndexOf + 1);
                                string rest = sourceTail.Substring(lastIndexOf + 1);
                                sourceReader.Dispose();
                                // go on with the rest...
                                sourceReader = new StringReader(rest);
                                targetBuilder.Append(toAppend);
                            }
                            break;
                    }
                    break;
                case '?':
                    // copy one source character, unless it is a dot or the source has ended
                    if (sourcePeek != -1 && sourcePeek != '.')
                    {
                        targetBuilder.Append((char)sourceReader.Read());
                    }
                    break;
                case '.':
                    // eat all characters until the dot is found
                    while (sourcePeek != -1 && sourcePeek != '.')
                    {
                        sourceReader.Read();
                        sourcePeek = sourceReader.Peek();
                    }
                    targetBuilder.Append('.');
                    // need to eat the . when we peeked it
                    if (sourcePeek == '.')
                        sourceReader.Read();
                    break;
                default:
                    if (sourcePeek != '.') sourceReader.Read(); // also consume the source's char if not .
                    targetBuilder.Append((char)current);
                    break;
            }
        }

        sourceReader.Dispose();
        maskReader.Dispose();
        // Windows file names cannot end with a dot or a space
        return targetBuilder.ToString().TrimEnd('.', ' ');
    }
}
And here's an NUnit test method to test the examples:
[Test]
public void TestGetTargetFileName()
{
    string targetMask = "?????.?????";
    Assert.AreEqual("a", FileUtil.GetTargetFileName("a", targetMask));
    Assert.AreEqual("a.b", FileUtil.GetTargetFileName("a.b", targetMask));
    Assert.AreEqual("a.b", FileUtil.GetTargetFileName("a.b.c", targetMask));
    Assert.AreEqual("part1.part2", FileUtil.GetTargetFileName("part1.part2.part3", targetMask));
    Assert.AreEqual("12345.12345", FileUtil.GetTargetFileName("123456.123456.123456", targetMask));

    targetMask = "A?Z*";
    Assert.AreEqual("AZ", FileUtil.GetTargetFileName("1", targetMask));
    Assert.AreEqual("A2Z", FileUtil.GetTargetFileName("12", targetMask));
    Assert.AreEqual("AZ.txt", FileUtil.GetTargetFileName("1.txt", targetMask));
    Assert.AreEqual("A2Z.txt", FileUtil.GetTargetFileName("12.txt", targetMask));
    Assert.AreEqual("A2Z", FileUtil.GetTargetFileName("123", targetMask));
    Assert.AreEqual("A2Z.txt", FileUtil.GetTargetFileName("123.txt", targetMask));
    Assert.AreEqual("A2Z4", FileUtil.GetTargetFileName("1234", targetMask));
    Assert.AreEqual("A2Z4.txt", FileUtil.GetTargetFileName("1234.txt", targetMask));

    targetMask = "*.txt";
    Assert.AreEqual("a.txt", FileUtil.GetTargetFileName("a", targetMask));
    Assert.AreEqual("b.txt", FileUtil.GetTargetFileName("b.dat", targetMask));
    Assert.AreEqual("c.x.txt", FileUtil.GetTargetFileName("c.x.y", targetMask));

    targetMask = "*?.bak";
    Assert.AreEqual("a.bak", FileUtil.GetTargetFileName("a", targetMask));
    Assert.AreEqual("b.dat.bak", FileUtil.GetTargetFileName("b.dat", targetMask));
    Assert.AreEqual("c.x.y.bak", FileUtil.GetTargetFileName("c.x.y", targetMask));

    targetMask = "*_NEW.*";
    Assert.AreEqual("abcd_NEW.txt", FileUtil.GetTargetFileName("abcd_12345.txt", targetMask));
    Assert.AreEqual("abc_newt_NEW.dat", FileUtil.GetTargetFileName("abc_newt_1.dat", targetMask));
    Assert.AreEqual("abcd_123.a_NEW", FileUtil.GetTargetFileName("abcd_123.a_b", targetMask));

    targetMask = "?x.????999.*rForTheCourse";
    Assert.AreEqual("px.part999.rForTheCourse", FileUtil.GetTargetFileName("part1.part2", targetMask));
    Assert.AreEqual("px.part999.parForTheCourse", FileUtil.GetTargetFileName("part1.part2.part3", targetMask));
    Assert.AreEqual("ax.b999.crForTheCourse", FileUtil.GetTargetFileName("a.b.c", targetMask));
    Assert.AreEqual("ax.b999.CarParForTheCourse", FileUtil.GetTargetFileName("a.b.CarPart3BEER", targetMask));
}