[personal profile] giza
Ever get the error "mv: argument list too long" or "cp: argument list too long" in Linux? Yeah, I know I shouldn't work with directories that have thousands of files in them, but sometimes that's not an option. Anyway, I did a little research and found this interesting workaround:

(cd $SOURCE && tar cfv - .) | (cd $TARGET && tar xfv -)

This command will change to your source directory, start tarring the contents, and write the tarball to stdout. The command on the right side of the pipe will change to your target directory and extract the tarball from stdin.

The reason for changing into each directory first is that tar preserves the directory structure of whatever it archives, so you don't want it to include any parent directories of $SOURCE. The && operator means that the command on its right is only executed if the cd on its left succeeded. Note that the above command copies the files rather than moving them, so you need enough disk space for a second copy of everything.
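
If your tar has a -C flag (GNU and BSD tar both do), here's a minimal sketch of the same idea without the subshells, assuming the same $SOURCE and $TARGET:

# -C changes directory before archiving/extracting, so no cd or && is needed
tar -C "$SOURCE" -cf - . | tar -C "$TARGET" -xf -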

I love UNIX shell scripting. :-)

(no subject)

Date: 2004-10-18 09:57 pm (UTC)
From: [identity profile] lurene.livejournal.com
you can use xargs too

(no subject)

Date: 2004-10-18 09:58 pm (UTC)
From: [identity profile] doco.livejournal.com
We actually had that little trick in our Advanced Unix class at the community college a few years ago... but I had forgotten about it until just now.

Thanks for reminding me. :)

(no subject)

Date: 2004-10-18 10:02 pm (UTC)
From: [identity profile] giza.livejournal.com
But that would call mv or cp for each file, right? I think the overhead for setting up and tearing down processes would be a bit of a performance killer...

(no subject)

Date: 2004-10-18 10:22 pm (UTC)
From: [identity profile] lurene.livejournal.com
That depends on how large your tar archives are: if they're larger than your RAM, you'll be much slower that way than with xargs.

(no subject)

Date: 2004-10-18 10:26 pm (UTC)
From: [identity profile] wolfspawn3.livejournal.com
Find wants to be your friend as well!

find . -type f -exec mv {} $TARGET \;
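
A variant that batches the moves instead of calling mv once per file (a sketch assuming GNU find and GNU coreutils mv, since the + terminator and the -t option are GNU extensions):

# {} + passes many filenames per mv invocation; -t names the destination directory
find . -type f -exec mv -t "$TARGET" {} +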

(no subject)

Date: 2004-10-18 10:28 pm (UTC)
From: [identity profile] wolfspawn3.livejournal.com
Actually, depending on the situation, tar might not be what's best at all; sometimes dump (if we are talking whole filesystems) is what's called for.

(no subject)

Date: 2004-10-18 11:15 pm (UTC)
From: [identity profile] points.livejournal.com
You can set up xargs to batch your requests as well, i.e. send out the commands 100 files at a time. You can also tell xargs to run the commands serially or to spawn multiple processes at a time, so if you had some 5000 files to copy and wanted to do it in batches of 100, you could keep, say, five separate copy processes running at once.

Anyhow... xargs, arcane, ludicrously powerful. Like so many other UNIX tools. ;)
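
A rough sketch of that, assuming GNU findutils (for -print0, -0, and -P) and GNU cp (for -t), with $SOURCE and $TARGET standing in for real paths:

# -n 100: at most 100 files per cp invocation; -P 5: up to five cp processes at once
find "$SOURCE" -type f -print0 | xargs -0 -n 100 -P 5 cp -t "$TARGET"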

(no subject)

Date: 2004-10-18 11:17 pm (UTC)
From: [identity profile] pandaguy.livejournal.com
also look at "cpio -p" -- this uses cpio in "pipe" mode, which will accomplish the same without having to make a tarball in the middle of the process.

Something like "find $SOURCE -print | cpio -pdvum $TARGET" is the format, I believe, but it's been a few years since I've used cpio that way.
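
A sketch of a relative-path version, so files land directly under the target instead of under a recreated $SOURCE path (this assumes a cpio with pass-through mode, and that $TARGET is an absolute path, since the subshell cds into $SOURCE first):

# -p pass-through, -d make directories, -u overwrite, -m keep mtimes, -v verbose
(cd "$SOURCE" && find . -depth -print | cpio -pdvum "$TARGET")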

(no subject)

Date: 2004-10-18 11:37 pm (UTC)
From: [identity profile] crywolf.livejournal.com
Don't forget the p flag for tar, to keep permissions.

Also, rsync, while it adds a bit more CPU overhead, especially for large file lists, is a nifty way to copy a bunch of files from one directory to another. It's especially slick if for some reason you have to interrupt the copy and resume later. "rsync -av $SOURCE $TARGET" (add other flags if you want to delete files in $TARGET that are not in $SOURCE).
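
A sketch of the delete-extraneous variant, assuming a reasonably recent rsync:

# the trailing slash on $SOURCE/ copies its contents rather than the directory itself;
# --delete removes files in $TARGET that aren't present in $SOURCE
rsync -av --delete "$SOURCE"/ "$TARGET"/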

find/mv, as has been said, is also an option. I think that caching essentially eliminates the overhead of reloading mv each time.

(no subject)

Date: 2004-10-19 01:01 am (UTC)
From: [identity profile] furahi.livejournal.com
Linus Torvalds said, when kernel 2.4 (or 2.2) was released, that because of how filesystems work now you should never use dump... I wish I had a link :X

(no subject)

Date: 2004-10-19 01:03 am (UTC)
From: [identity profile] furahi.livejournal.com
I have indeed gotten the argument list too long error, a lot.
However, not with cp or mv; both can handle entire directories without needing an argument for each file.
I usually deal with that using a shell for loop, which seems odd since the loop still takes the same expanded list of parameters, but it handles far more of them, probably because the expansion happens inside the shell rather than being passed to an external command.

Like, instead of mv * somewhere:
for i in * ; do mv "$i" somewhere ; done

Yes, there is overhead for running mv with each file, and it /is/ noticeable

(no subject)

Date: 2004-10-19 01:08 am (UTC)
From: [identity profile] giza.livejournal.com
I was previously doing that when working on directories of tens of thousands of files. I was looking for a solution that was more efficient. :-)

(no subject)

Date: 2004-10-19 01:11 am (UTC)
From: [identity profile] smoke-au.livejournal.com
Reading this comment thread I swear there is more I have forgotten than I ever knew.. or something like that. Oh for the days when I worked with computers...

(no subject)

Date: 2004-10-19 04:13 am (UTC)
From: [identity profile] stormydragon.livejournal.com
I usually use: find $SOURCE -name "$PATTERN" -exec cp {} $TARGET \; (quoting $PATTERN so the shell doesn't expand it before find sees it)

(no subject)

Date: 2004-10-19 05:34 am (UTC)
From: [identity profile] taral.livejournal.com
Yay cpio!

(no subject)

Date: 2004-10-19 05:35 am (UTC)
From: [identity profile] taral.livejournal.com
I often use cp -a $SOURCE $TARGET

(no subject)

Date: 2004-10-19 11:53 am (UTC)
From: [identity profile] wolfspawn3.livejournal.com
Ahh yes, you are quite right; dump is made to be used on UFS filesystems (BSD systems). There is an ext2 dump around too, but how good and stable it is I don't know.

(no subject)

Date: 2004-10-19 02:18 pm (UTC)
From: [identity profile] bigtig.livejournal.com
I was kinda wondering why folks are putting thousands of files on the command line via shell substitutions when you can just move the dang directory.

Of course, if you wanted to be sneaky, you could rsync and then exclude the various files you don't want to copy. The nice thing about using rsync for a large local copy is that it's resumable...

(no subject)

Date: 2004-10-19 02:26 pm (UTC)
From: [identity profile] giza.livejournal.com
I should have clarified... I had two directories, each with tens of thousands of files, and was moving their contents all into one directory. I also didn't want to just move the directories themselves, because I wanted to preserve the original directory structure.

(no subject)

Date: 2004-10-19 02:40 pm (UTC)
From: [identity profile] bigtig.livejournal.com
Uhhh -- so why not this?

cd $SOURCE; cp -r ./ $TARGET

(no subject)

Date: 2004-10-19 07:26 pm (UTC)
From: [identity profile] darthgeek.livejournal.com
I ran into rm: argument list too long once

I was working on a SCO Unix box (which might still be running) and somehow a directory was created that contained a copy of everything one directory above it, and then proceeded to, very likely, infinitely recurse. I was never able to find the "bottom" of it and it didn't seem to have any sort of performance impact so I just left it alone.
