Combining a large number of files
If you have root permissions on that machine you can temporarily increase the "maximum number of open file descriptors" limit:
ulimit -Hn 10240 # The hard limit
ulimit -Sn 10240 # The soft limit
And then
paste res.* >final.res
After that you can set it back to the original values.
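If you only need to raise the soft limit up to the existing hard limit, root should not be required, and doing it inside a subshell means there is nothing to set back afterwards. A minimal sketch of that idea, using a scratch directory and two tiny made-up res.* files (neither is from the original question):

```shell
# Hypothetical demo data in a scratch directory (not from the original post).
demo=$(mktemp -d) && cd "$demo" || exit 1
printf 'a1\na2\n' > res.1
printf 'b1\nb2\n' > res.2

soft_before=$(ulimit -Sn)
(
    # Raise the soft limit to the current hard limit; this needs no root.
    # "|| true": carry on with the old limit if the shell refuses the value.
    ulimit -Sn "$(ulimit -Hn)" 2>/dev/null || true
    paste res.* > final.res
)
# The change was confined to the subshell, so this matches soft_before.
soft_after=$(ulimit -Sn)
echo "soft limit before: $soft_before, after: $soft_after"
```

This only helps when the hard limit is already higher than the number of files; raising the hard limit itself still needs root.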
A second solution, if you cannot change the limit:
for f in res.*; do cat final.res | paste - $f >temp; cp temp final.res; done; rm temp
It calls paste once for each file, and at the end you have one huge file with all the columns (it takes a while).
Edit: Useless use of cat... Not!
As mentioned in the comments, the use of cat here (cat final.res | paste - $f >temp) is not useless. The first time the loop runs, the file final.res doesn't exist yet. If paste read final.res directly, it would fail, and the file would never be created or filled. With my solution, only cat fails the first time, with No such file or directory; paste just reads empty input from stdin and continues. The error can be ignored.
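A variant of the same loop, offered only as a sketch and not as the answer's own method: it seeds final.res from the first file, so there is no error message on the first pass, and it should also avoid the empty leading column that pasting against empty stdin produces. The three sample files are made up for the demonstration.

```shell
# Hypothetical demo data: three two-line files in a scratch directory.
demo=$(mktemp -d) && cd "$demo" || exit 1
for i in 1 2 3; do printf '%s\n' "r1c$i" "r2c$i" > "res.$i"; done

rm -f final.res
for f in res.*; do
    if [ -e final.res ]; then
        # Later iterations: append the next file as a new column.
        paste final.res "$f" > temp && mv temp final.res
    else
        # First iteration: start the result from the first file.
        cp "$f" final.res
    fi
done
cat final.res
```

Like the original loop, this opens only two input files at a time, at the cost of rewriting the growing result once per file.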
If chaos' answer isn't applicable (because you don't have the required permissions), you can batch up the paste calls as follows:
ls -1 res.* | split -l 1000 -d - lists
for list in lists*; do paste $(cat $list) > merge${list##lists}; done
paste merge* > final.res
This lists the files 1000 at a time in files named lists00, lists01 etc., then pastes the corresponding res. files into files named merge00, merge01 etc., and finally merges all the resulting partially merged files.
As mentioned by chaos, you can increase the number of files used at once; the limit is the value given by ulimit -n minus however many files you already have open, so you'd say
ls -1 res.* | split -l $(($(ulimit -n)-10)) -d - lists
to use the limit minus ten.
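If you want to compute the batch size separately, a small sketch could look like the following. The fallback of 1000 for an "unlimited" limit and the lower bound of 2 are assumptions for illustration, not part of the answer.

```shell
limit=$(ulimit -n)
case $limit in
    unlimited) batch=1000 ;;               # assumed fallback, pick your own
    *)         batch=$((limit - 10)) ;;    # the limit minus ten, as above
esac
# A batch must hold at least two files for the merge to make progress.
if [ "$batch" -lt 2 ]; then batch=2; fi
echo "files per batch: $batch"
```

The result can then be passed to split as -l "$batch".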
If your version of split doesn't support -d, you can remove it: all it does is tell split to use numeric suffixes. By default the suffixes will be aa, ab etc. instead of 00, 01 etc.
If there are so many files that ls -1 res.* fails ("argument list too long"), you can replace it with find, which will avoid that error:
find . -maxdepth 1 -type f -name res.\* | split -l 1000 -d - lists
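One difference worth keeping in mind: unlike a shell glob, find does not promise any particular output order, so the columns could end up shuffled. A sketch that pipes the names through sort first (lexical order, the same as the glob, so res.10 still comes before res.2); the three sample files and the batch size of 2 are made up:

```shell
# Hypothetical demo data, created out of order on purpose.
demo=$(mktemp -d) && cd "$demo" || exit 1
for i in 3 1 2; do printf '%s\n' "c$i" > "res.$i"; done

# sort restores a predictable (lexical) column order before batching.
find . -maxdepth 1 -type f -name 'res.*' | sort | split -l 2 - lists

for list in lists*; do
    # Unquoted on purpose: one argument per listed file.
    paste $(cat "$list") > "merge${list##lists}"
done
paste merge* > final.res
cat final.res
```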
(As pointed out by don_crissti, -1 shouldn't be necessary when piping ls's output; but I'm leaving it in to handle cases where ls is aliased with -C.)
Try running it this way:
ls res.*|xargs paste >final.res
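One thing to check with this approach: when the argument list is too long for a single command, xargs runs paste several times, and the outputs of those runs are written one after another instead of side by side. The sketch below forces that situation with -n 2 on four made-up one-line files, purely to show the effect:

```shell
# Hypothetical demo data: four one-line files in a scratch directory.
demo=$(mktemp -d) && cd "$demo" || exit 1
for i in 1 2 3 4; do printf '%s\n' "c$i" > "res.$i"; done

# -n 2 forces xargs to run paste twice, two files per run.
ls res.* | xargs -n 2 paste > stacked.res

# One run with all four files, for comparison.
paste res.* > wide.res

echo "stacked.res (two runs):"; cat stacked.res
echo "wide.res (one run):";     cat wide.res
```

Here stacked.res ends up with two rows of two columns, while wide.res has one row of four columns, so it is worth checking the shape of final.res after an xargs run.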
You can also split the batch into parts and try something like:
paste `echo res.{1..100}` >final.100
paste `echo res.{101..200}` >final.200
...
and at the end combine the partial results:
paste final.* >final.res
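The manual batches can also be generated in a loop. This is a rough sketch that assumes the files are named res.1 to res.N with no gaps; total, batch and the sample data are made-up values. The partial results are zero-padded (final.002, final.004, ...) so that the final.* glob lists them in numeric order.

```shell
# Hypothetical demo data: res.1 .. res.6, two lines each.
demo=$(mktemp -d) && cd "$demo" || exit 1
total=6   # assumed number of res.N files
batch=2   # assumed batch size
i=1
while [ "$i" -le "$total" ]; do
    printf '%s\n' "r1c$i" "r2c$i" > "res.$i"
    i=$((i + 1))
done

rm -f final.*   # stale results would otherwise match final.* below
start=1
while [ "$start" -le "$total" ]; do
    end=$((start + batch - 1))
    if [ "$end" -gt "$total" ]; then end=$total; fi
    # Build the list of names for this batch: res.START .. res.END.
    files=
    i=$start
    while [ "$i" -le "$end" ]; do
        files="$files res.$i"
        i=$((i + 1))
    done
    # $files is unquoted on purpose so it splits into one argument per file.
    paste $files > "$(printf 'final.%03d' "$end")"
    start=$((end + 1))
done

# final.* is expanded before final.res is created, so it is not its own input.
paste final.* > final.res
cat final.res
```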