How to copy-merge two directories?
... and 4000 other folders. Each of these folders contains images, and the directory names under images and images2 are exactly the same, but their contents differ. I want to know how I can copy-merge the images of /images2/ad into images/ad, the images of /images2/foo into images/foo, and so on for all 4000 folders.
@AmirAliAkbari, I don’t think that it is a duplicate — the other question basically is ‘Does mv do merging?’ (answer: no). This question is about how to merge 2 directory hierarchies.
This is a job for rsync. There’s no benefit to doing this manually with a shell loop unless you want to move the files rather than copy them.
rsync -a /path/to/source/ /path/to/destination
(Note the trailing slash on the source, which is /images2/ in the question; without it, rsync would copy to /images/images2.)
If images with the same name exist in both directories, the command above will overwrite /images/SOMEPATH/SOMEFILE with /images2/SOMEPATH/SOMEFILE. If you want to replace only older files, add the option -u. If you want to always keep the version in /images, add the option --ignore-existing.
If you want to move the files from /images2 with rsync, you can pass the option --remove-source-files. Then rsync copies all the files in turn, and removes each file when it’s done. This is a lot slower than moving if the source and destination directories are on the same filesystem.
@Wildcard, well, that’s not quite the same as moving. As Gilles points out, it’s a lot slower than moving if they’re on the same fs; and moreover it requires a lot more temporary space.
I’d also like to point out that it’s important to include the trailing slashes for each directory. For example, if you simply ran rsync -a images images2, it would just copy images into images2 (creating images2/images) instead of merging them.
The best choice, as already posted, is of course rsync. Nevertheless, unison would also be a great piece of software for this job, though it typically requires a package install. Both can be used on several operating systems.
Rsync
rsync synchronizes in one direction from source to destination. Therefore the following statement
rsync -avh --progress Source Destination
syncs everything from Source to Destination. The merged folder resides in Destination.
-a means «archive» and copies everything recursively from source to destination preserving nearly everything.
-v gives more output («verbose»).
-h for human readable.
--progress to show how much work is done.
If you only want to update the destination folder with newer files from the source folder:
rsync -avhu --progress source destination
Unison
unison synchronizes in both directions. Therefore the following statement
unison Source Destination
syncs both directories in both directions and finally source equals destination. It’s like doing rsync twice from source to dest and vice versa.
For more advanced usage, look at the man pages.
I want to mention that the correct path to the folder should have a trailing slash at the end, as in rsync -avh --progress source/ destination/ ; otherwise the source folder will be created inside the destination folder. At least that was the case for me.
There are faster and much more space-efficient ways of merging two directories using the --link option to cp if the directories are on the same file system, described in the multiple varied answers in a related article here: (The title of the article doesn’t exactly match the user’s question, and the answers address the title topic, merging, more than they address the user’s actual question.)
The --link option to cp means no file data is copied. An example of this, where everything in /images2 replaces any older items in /images, is:
cp --force --archive --update --link /images2/. /images
After the merge into /images , you can then rm -rf /images2
This solution will fail if, anywhere in the file tree, the merge tries to put a directory onto an existing file or symlink with the same name. For example, it won’t merge a directory named /images2/x onto an existing file or symlink /images/x. If you get such an error, you can manually delete the file or symlink and re-run the command.
The nice thing about --link is that no data is moved to merge the directories.
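To check that nothing is copied, the inode numbers can be compared after the merge. This is a sketch only, assuming GNU cp and GNU stat (for -c) and a scratch directory on a single filesystem; all names are illustrative.

```shell
#!/bin/sh
# Scratch tree for illustration only.
work=$(mktemp -d)
mkdir -p "$work/images/ad" "$work/images2/ad"
echo data > "$work/images2/ad/a.jpg"

# Create hard links in images instead of copying file data.
cp --force --archive --update --link "$work/images2/." "$work/images"

# Both names refer to the same inode, so no data was duplicated.
src_inode=$(stat -c %i "$work/images2/ad/a.jpg")
dst_inode=$(stat -c %i "$work/images/ad/a.jpg")
echo "inodes: $src_inode $dst_inode"

# Removing the source tree only drops the extra names.
rm -rf "$work/images2"
cat "$work/images/ad/a.jpg"
```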
How Can I Merge Multiple Directories into One
I have multiple files in multiple folders under one directory that need to be in one folder. Is there a command line that can help me accomplish this?
Give us some more hints. Do you want to move all files in all subdirectories inside a directory, or are the directories randomly located (how do we find them)? Do you want to move specific files or all of them inside the directories? Also, do you want to move them to an existing directory or a new directory?
I’m not sure how to make the question simpler, but I’ll give it a shot. There are over 50 folders (all containing files) that I need merged into one.
Are these folders distributed randomly or under the same directory? If distributed randomly, do their names follow a pattern (how do we find them)? If under the same directory, does the directory contain any other file or directory that needs to be excluded?
find . -type f -print0 | xargs -0 -I file mv --backup=numbered file .
This will move all the files in the current working directory and its subdirectories (recursively) into the current working directory. When a file name is already taken, the existing file is kept as a numbered backup (name.~1~, name.~2~ and so on), so nothing is overwritten.
Sample result on a tmp folder with subfolders 1, 2 and 3, each containing a 1.ext, 2.ext and 3.ext file:
ubuntu@ubuntu:~/tmp$ tree
.
├── 1
│   ├── 1.ext
│   ├── 2.ext
│   └── 3.ext
├── 2
│   ├── 1.ext
│   ├── 2.ext
│   └── 3.ext
└── 3
    ├── 1.ext
    ├── 2.ext
    └── 3.ext

3 directories, 9 files
ubuntu@ubuntu:~/tmp$ find . -type f -print0 | xargs -0 -I file mv --backup=numbered file .
ubuntu@ubuntu:~/tmp$ tree
.
├── 1
├── 1.ext
├── 1.ext.~1~
├── 1.ext.~2~
├── 2
├── 2.ext
├── 2.ext.~1~
├── 2.ext.~2~
├── 3
├── 3.ext
├── 3.ext.~1~
└── 3.ext.~2~

3 directories, 9 files
How do I merge one directory into another using Bash?
All files and directories in source will end up in destination . For example, source/file1 will be copied to destination/file1 .
The -T flag stops source/file1 from being copied to destination/source/file1 instead. (Unfortunately, cp on macOS does not support the -T flag.)
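The command this answer refers to is not shown above. As a rough sketch of the behaviour it describes, assuming GNU cp (for -T) and made-up scratch directories:

```shell
#!/bin/sh
# Scratch tree for illustration only.
work=$(mktemp -d)
mkdir -p "$work/source/sub" "$work/destination/sub"
echo 1 > "$work/source/file1"
echo 2 > "$work/source/sub/file2"
echo 3 > "$work/destination/sub/file3"

# -R copies recursively; -T (--no-target-directory) treats destination as
# the target itself, so nothing ends up under destination/source.
cp -RT "$work/source" "$work/destination"

find "$work/destination" -type f | sort
```

After the copy, destination holds file1, sub/file2 and the pre-existing sub/file3.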
On Mac OS X / darwin you’ll want to use cp -r source/ destination. It’s tricky: you have to use the / exactly as indicated.
You probably just want cp -R "$1"/* "$2"/ , which is a recursive copy (the quotes keep paths containing spaces intact).
(If there might be hidden files (those whose names begin with a dot), you should prefix that command with shopt -s dotglob; to be sure they get matched.)
rsync --recursive html/ html_new/
Notice that the trailing slash / matters in this case. If you omit it from the source argument, rsync will write the files to html_new/html/ instead of html_new/ .
Rsync has a lot of flags, so look at the rsync manpage for details.
Just use rsync — it’s a great tool for local file copy and merging in addition to remote copying.
rsync -av /path/to/source_folder/ /path/to/destination_folder/
Note that the trailing slash on the source folder is necessary to copy only the contents of source_folder to the destination. If you leave it off, it will copy the source_folder and its contents, which is probably not what you are looking for since you want to merge folders.
There is a small typo, an extra hyphen, so it should be rsync -av /path/to/source_folder/ /path/to/destination_folder/
Even though this question and its accepted answer are ancient, I am adding my answer because the existing ones using cp either don’t handle some edge cases or require working interactively. Often edge cases, scriptability, portability and multiple sources don’t matter, in which case simplicity wins and it is better to use cp directly with fewer flags (as in other answers) to reduce cognitive load. For the other times (or for a robustly reusable function) this invocation/function is useful, and incidentally isn’t bash-specific (I realise this question was about bash, so that’s just a bonus in this case). Some flags can be abbreviated (e.g. with -a), but I have included all of them explicitly in long form (except for -R, see below) for the sake of explanation. Obviously, just remove any flag if there is some feature you specifically don’t want (or you are on a non-POSIX OS, or your version of cp doesn’t process that flag; I tested this on GNU coreutils 8.25’s cp):
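mergedirs() {
    _dest="$1"
    shift
    # The loop is the right-hand side of a pipeline, so shells such as bash
    # and dash run it in a subshell, where a variable assignment would be
    # lost. The explicit subshell reports failure via its exit status.
    yes | (
        for _src do
            cp -R --no-dereference --preserve=all --force --one-file-system \
                  --no-target-directory "${_src}/" "$_dest" || exit 1
        done
    ) 2>/dev/null
}

mergedirs destination source-1 [source-2 source-3 ...]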
- -R : has subtly different semantics from -r / --recursive on some systems (particularly with respect to special files in source dirs) as explained in this answer
- --no-dereference : never follow symbolic links in SOURCE
- --preserve=all : preserve the specified attributes (default: mode,ownership,timestamps), if possible additional attributes: context, links, xattr, all
- --force : if an existing destination file cannot be opened, remove it and try again
- --one-file-system : stay on this file system
- --no-target-directory : treat DEST as a normal file (explained in this answer, namely: If you do a recursive copy and the source is a directory, then cp -T copies the content of the source into the destination, rather than copying the source itself.)
- [piped input from yes ]: even with --force , in this particular recursive mode cp still asks before clobbering each file, so we achieve non-interactiveness by piping output from yes to it
- [piped output to /dev/null ]: this is to silence the messy string of questions along the lines of cp: overwrite ‘xx’?
- [return-val & early exit]: this ensures the loop exits as soon as there is a failed copy, and returns 1 if there was an error
- A funky new flag which I also use with this on my system is --reflink=auto for doing so-called «light copies» (copy-on-write, with the same speed benefits as hard-linking, and the same size benefits until and in inverse proportion to how much the files diverge in the future). This flag is accepted in recent GNU cp , and does more than a no-op with compatible filesystems on recent Linux kernels. YMWV-a-lot on other systems.
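As a small illustration of the last point, --reflink=auto can simply be added to the flag list. This is a sketch assuming GNU cp; the scratch directories are made up, and whether a copy-on-write clone or an ordinary copy is made depends on the filesystem.

```shell
#!/bin/sh
# Scratch tree for illustration only.
work=$(mktemp -d)
mkdir -p "$work/source-1/sub" "$work/destination"
echo data > "$work/source-1/sub/f"

# --reflink=auto asks for a copy-on-write clone where the filesystem
# supports it and falls back to an ordinary copy elsewhere.
cp -R --preserve=all --reflink=auto --no-target-directory \
    "$work/source-1/" "$work/destination"

cat "$work/destination/sub/f"
```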
What’s the best way to merge two directories on the same filesystem in linux?
I have two directories that need to be merged together. The files in these two directories are all large (>= 500MB). What I want to achieve: for each file in the source directory, if it doesn’t exist in the destination directory, mv it to the destination directory (which is fast, since we are basically creating a new hard link and unlinking the source file); if it exists in the destination directory, copy the source file there and remove the source file. The most common way to merge directories on a Linux system is to use rsync with the --remove-source-files option. But this is slow because it does a copy operation even if the destination file doesn’t exist. Any better ideas? Thank you.
Basically, what you described is: move the files and overwrite the destination if it exists. So just move them.
I was too obsessed with the idea of merging directories and didn’t realize mv can do the job. Thank you and @Michael Hampton for pointing it out.
There’s a case where mv fails. Here’s some example data:
mkdir -p src/d dest/d
touch src/d/f1 dest/d/f2
$ mv src/* dest/
mv: cannot move 'src/d' to 'dest/d': Directory not empty
$ mv -f src/* dest/
mv: cannot move 'src/d' to 'dest/d': Directory not empty
$ mv -fv src/* dest/
mv: cannot move 'src/d' to 'dest/d': Directory not empty
$ mv -fvi src/* dest/
mv: overwrite 'dest/d'? y
mv: cannot move 'src/d' to 'dest/d': Directory not empty
$ mv -fvi -t dest/ src/*
mv: overwrite 'dest/d'? y
mv: cannot move 'src/d' to 'dest/d': Directory not empty
This example does no error checking (DISCLAIMER: works for me, but please test that it works for you, maybe with echo before mv), and will overwrite files with the same path. It uses find with \; which is terribly inefficient, but + doesn’t work right with "$dest" prepended. Older versions of find will make some dirs without the path prepended, and newer versions will say:
find: In '-exec ... {} +' the '{}' must appear by itself, but you specified 'dest/{}'
You could probably find a way to fix that with xargs though. (It took a few minutes on the 64k files, 8TB in total, that I was moving.) Add this content:
#!/bin/bash
src=$1
dest=$2

src=$(readlink -f "$src")
dest=$(readlink -f "$dest")

# stop here if src is not accessible, otherwise the wrong files would be moved
cd "$src" || exit 1

# also copy hidden files
shopt -s dotglob

# make dirs (missing old permission,acl,xattr data), and then mv the files
time find * -type d -exec mkdir -p "$dest"/{} \;
time find * -type f -exec mv {} "$dest"/{} \;

# also copy permissions, acls, xattrs
rsync -aAX "$src"/ "$dest"/
Before:
$ find src dest
src/
src/d
src/d/f1
dest/
dest/d
dest/d/f2
After:
$ find src dest
src
src/d
dest
dest/d
dest/d/f1
dest/d/f2
Now src/ should be just empty dirs. If so, you can rm -r src to clean up.
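As for the inefficiency of \; mentioned above, one possible workaround (not from the original answer, and using an inline sh rather than xargs) is to pass {} by itself to a small shell snippet that prepends the destination. A sketch on the same src/dest example data, placed in a scratch directory:

```shell
#!/bin/sh
# Scratch copy of the example data; paths are illustrative only.
work=$(mktemp -d)
mkdir -p "$work/src/d" "$work/dest/d"
touch "$work/src/d/f1" "$work/dest/d/f2"
dest=$work/dest
cd "$work/src" || exit 1

# {} appears by itself, so '+' is accepted and find passes names in
# batches; the inline sh receives "$dest" as $1 and prepends it.
find . -type d -exec sh -c 'd=$1; shift; for p; do mkdir -p "$d/$p"; done' sh "$dest" {} +
find . -type f -exec sh -c 'd=$1; shift; for p; do mv "$p" "$d/$p"; done' sh "$dest" {} +

find "$work/dest" | sort
```

Like the script above, this overwrites files with the same path and leaves the empty directory skeleton behind in src.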