How to join text files?
Use cat with output redirection. Syntax: cat file [file] [[file] ...] > joined-file
Example with just two files (you can have many more):
$ echo "some text in a file" > file1
$ echo "another file with some text" > file2
$ cat file1 file2 > mergedfiles
$ cat mergedfiles
some text in a file
another file with some text
In case you have "many documents", make use of shell globbing (patterns):
cat input-files-dir/* > joined-file
This joins all files in that directory into joined-file in the current directory (which prevents the pattern from matching the output file itself). Globbing is entirely independent of cat and output redirection: it's just Bash passing all the matching files as arguments to cat.
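To see that the expansion is done by the shell and not by cat, here is a minimal sketch. The scratch directory and the file names a, b and c are made up for illustration. It passes the same pattern to printf, which only prints the arguments it receives:

```shell
# Work in a throwaway directory so nothing real is touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

mkdir input-files-dir
echo "first"  > input-files-dir/a
echo "second" > input-files-dir/b
echo "third"  > input-files-dir/c

# The shell replaces the pattern with the matching names before the
# command starts; printf simply prints each argument it was given.
printf '%s\n' input-files-dir/*

# cat receives exactly the same argument list.
cat input-files-dir/* > joined-file
cat joined-file
```

Swapping cat for printf (or echo) like this is a cheap way to preview which files a pattern will pick up before joining them.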
File types
It will just glue (join) the files together, as you would with paper and tape. It does not care whether the file format can handle this. It works for text files, but not for PDFs, ODTs, etc. Well, it will glue them together, but the result is not a valid PDF/ODT anymore.
Order of joining
As phoibos pointed out, shell globbing yields the file names in alphabetical order (sorted as text according to the current locale). This is how Bash globbing works.
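One consequence of text-based sorting is that numbered files may not come out in numeric order. A small sketch, assuming hypothetical files named part1, part2 and part10 in a scratch directory:

```shell
# Work in a throwaway directory so nothing real is touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

echo "one" > part1
echo "two" > part2
echo "ten" > part10

# The glob sorts names as text, so part10 lands before part2.
cat part* > by-glob
cat by-glob

# Listing the files explicitly gives full control over the order.
cat part1 part2 part10 > by-hand
cat by-hand
```

Zero-padding the numbers (part01, part02, part10) is another way to make the alphabetical and numeric orders agree.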
Addendum about the "input file is output file" error
When the pattern for the input files also matches the file being written, cat reports an error. It's a safety feature. Example: running cat *.txt > out.txt a second time will cause this, because out.txt now exists and matches the pattern.
What you can do about it:
- Choose a more specific pattern to match the actual input files, not matching the output name. Example: the input file pattern *.txt with the output file output.out will not collide.
- Work in different directories. In the example above I've used a separate input-files-dir directory to place all files in, and output to the current working directory. This makes it impossible to get this error.
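Both workarounds can be sketched as follows. The file names a.txt and b.txt and the scratch directory are assumptions for the sake of the example. Each command is run twice to show that repeating it is harmless:

```shell
# Work in a throwaway directory so nothing real is touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

echo "alpha" > a.txt
echo "beta"  > b.txt

# Option 1: the output name does not match the *.txt pattern,
# so the command can be repeated without picking up its own output.
cat *.txt > output.out
cat *.txt > output.out

# Option 2: inputs live in their own directory, output goes elsewhere.
mkdir input-files-dir
mv a.txt b.txt input-files-dir/
cat input-files-dir/* > joined-file
cat input-files-dir/* > joined-file
```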
A simple way to do that is by using cat:
cat file1 file2 > joined_file
If you just issue cat file1 file2, you'll see both files on the standard output. By using >, you're just redirecting the standard output to a file. This also works with other commands.
Do it with a simple loop:
for i in *.txt; do cat "$i" >> complete.txt; done
>> appends to the file.
Note: If for some reason you have to run the command again, you have to remove complete.txt first; otherwise the *.txt pattern would match it and you'd write the file to itself, which doesn't work.
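If the loop needs to be repeatable, one possible sketch is to delete the old result before the pattern is expanded. The function name join_all and the sample files are hypothetical, chosen only for this illustration:

```shell
# Work in a throwaway directory so nothing real is touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

echo "alpha" > a.txt
echo "beta"  > b.txt

join_all() {
    # Remove any earlier result first, so the *.txt pattern below
    # cannot match it and >> starts from an empty file.
    rm -f complete.txt
    for i in *.txt; do
        cat "$i" >> complete.txt
    done
}

join_all
join_all   # safe to repeat: the old complete.txt is deleted first
cat complete.txt
```

This works because the shell expands *.txt only when the for loop starts, which is after rm has already removed the previous output.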