Batch script to merge files without Hex char 1A at the end
The placement of the /a and /b switches is critical. They behave differently depending on whether they are placed after the source filename(s) or the target filename.
When used with a target filename, /a causes the end-of-file marker (ASCII 26) to be added. You are actually specifying this!
When used with the source filename:
/a specifies that the file is ASCII, and it is copied up to but not including the first ASCII 26 end-of-file mark. That character and anything after it are ignored.
/b causes the entire file to be copied, including any end-of-file markers and anything after them.
When used with the destination filename:
/a causes ASCII 26 to be added as the last character.
/b does not add ASCII 26 as the last character.
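As a rough illustration of the rules above, the following sketch shows how the switch positions combine in a plain single-file copy. The names in.txt and out.txt are hypothetical placeholders, and the comments restate the behaviour described above rather than anything measured here.

```batch
rem in.txt and out.txt are hypothetical file names used only for illustration.

rem Source /a, destination /b: stop reading at the first ASCII 26, append nothing.
copy in.txt /a out.txt /b /y

rem Source /b, destination /b: copy every byte, append nothing.
copy in.txt /b out.txt /b /y

rem Source /b, destination /a: copy every byte, then append ASCII 26.
copy in.txt /b out.txt /a /y
```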
Your solution, although I haven't tested it, is probably to use:
COPY a.txt+b.txt /a c.txt /b /y
https://docs.microsoft.com/en-us/windows-server/administration/windows-commands/copy
If /a precedes or follows a list of files on the command line, it applies to all files listed until copy encounters /b. In this case, /b applies to the file preceding /b.
The effect of /a depends on its position in the command-line string:
- If /a follows source, the copy command treats the file as an ASCII file and copies data that precedes the first end-of-file character (CTRL+Z).
- If /a follows destination, the copy command adds an end-of-file character (CTRL+Z) as the last character of the file.
/b directs the command interpreter to read the number of bytes specified by the file size in the directory. /b is the default value for copy, unless copy combines files.
If /b precedes or follows a list of files on the command line, it applies to all listed files until copy encounters /a. In this case, /a applies to the file preceding /a.
This is a very long-winded way of saying the following: the default when combining files is /a. This means that the /a option is redundant in your code snippet, and would have applied regardless of where it was placed.
The solution is to use /b, which instructs copy to ignore the special meaning of the #1A [DOS end of file] character when reading, and not to output it when writing.
Unlike /a, the position of /b is important if a source file includes the #1A character. If the /b is at the end of the command, the file will be truncated at the #1A (and the output will not include the #1A).
Any of the following will correct this behaviour:
COPY a.txt+b.txt c.txt /y /b
COPY a.txt+b.txt /b c.txt /y
COPY /b a.txt+b.txt c.txt /y
But only the following will work when a source file contains a #1A character that does not mark the end of its data:
COPY a.txt /b + b.txt c.txt /y
COPY /b a.txt + b.txt c.txt /y
Note: To confuse things further, adding /b after a source file will apply /b to every source file after it until there is a /a.
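One way to check that a binary merge neither added nor dropped anything is to compare file sizes. The following is a minimal sketch, assuming a.txt and b.txt exist in the current directory and that their combined size stays under 2 GB (the limit of set /a arithmetic); the variable names sizeA, sizeB, sizeC and expected are made up for this example.

```batch
@echo off
rem Merge in binary mode; /b placed before the sources applies to all of them.
copy /b a.txt + b.txt c.txt /y >nul

rem %%~zF expands to the size in bytes of the file named by %%F.
for %%F in (a.txt) do set "sizeA=%%~zF"
for %%F in (b.txt) do set "sizeB=%%~zF"
for %%F in (c.txt) do set "sizeC=%%~zF"

rem The merged file should be exactly as large as the two sources together.
set /a expected=sizeA+sizeB
if %sizeC% EQU %expected% (
    echo OK: c.txt is %sizeC% bytes and nothing was added or dropped.
) else (
    echo Mismatch: expected %expected% bytes but c.txt is %sizeC% bytes.
)
```

A result one byte larger than expected would suggest a trailing #1A was appended; a smaller result would suggest a source was cut short at an embedded #1A.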
In normal operation, this behaviour may seem at best bizarre. As DOS file systems have always recorded the file size, an End of File character should be redundant.
https://en.wikipedia.org/wiki/End-of-file
This was done for two reasons:
Backward compatibility with CP/M. The CP/M file system only recorded the lengths of files in multiples of 128-byte "records", so by convention a Control-Z character was used to mark the end of meaningful data if it ended in the middle of a record. The MS-DOS filesystem has always recorded the exact byte-length of files, so this was never necessary on MS-DOS.
It allows programs to use the same code to read input from both a terminal and a text file.
The upshot is that one can take input from a device (e.g. a COM port), or send output to a device, while still being able to distinguish different files.
https://docs.microsoft.com/en-us/windows-server/administration/windows-commands/copy
You can substitute a device name for one or more occurrences of source or destination.
You could change the switch /a (ASCII text) to /b (binary). Look also at copy /?
So the resulting command is:
COPY a.txt+b.txt c.txt /y /b
Change from copy to type:
type a.txt>c.txt
type b.txt>>c.txt
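The same idea might be extended to an arbitrary number of files with a for loop. This is only a sketch: the parts folder and the merged.txt output name are hypothetical, and the order in which the wildcard returns files is whatever the file system provides.

```batch
@echo off
rem Start with an empty output file, kept outside the (hypothetical) parts
rem folder so the loop never reads its own output back in.
type nul > merged.txt

rem Append each source file in turn.
for %%F in (parts\*.txt) do type "%%F" >> merged.txt
```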