Tuesday 26 March 2019

backup - How can I find duplicate photos in a very large pool of data (tens to hundreds of gigs)?


Can anyone suggest a good photo duplicate detection utility that works well when I am dealing with about 100 GB of data (collected over the years)?


I would prefer something that works on Ubuntu.


Thanks in advance!
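
For context, most duplicate finders on Ubuntu (fdupes, rdfind, rmlint) work by grouping files by size and then comparing content hashes, which scales fine to 100 GB. A minimal Python sketch of that idea, assuming "duplicate" means byte-for-byte identical files rather than visually similar photos, might look like this:

```python
#!/usr/bin/env python3
"""Minimal byte-identical duplicate finder: group by size, then by SHA-256.

A sketch of the approach tools like fdupes use, not a replacement for them;
it treats only byte-for-byte identical files as duplicates.
"""
import hashlib
import os
import sys
from collections import defaultdict

def sha256(path, chunk=1 << 20):
    """Hash a file in 1 MiB chunks so large photos are not loaded into RAM."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        while True:
            block = f.read(chunk)
            if not block:
                break
            h.update(block)
    return h.hexdigest()

def find_duplicates(root):
    # Group by size first: a file with a unique size cannot have a duplicate,
    # so most of a large collection is never hashed at all.
    by_size = defaultdict(list)
    for dirpath, _, names in os.walk(root):
        for name in names:
            path = os.path.join(dirpath, name)
            try:
                by_size[os.path.getsize(path)].append(path)
            except OSError:
                pass  # unreadable file; skip it
    by_hash = defaultdict(list)
    for paths in by_size.values():
        if len(paths) < 2:
            continue
        for path in paths:
            by_hash[sha256(path)].append(path)
    return [group for group in by_hash.values() if len(group) > 1]

if __name__ == "__main__":
    for group in find_duplicates(sys.argv[1]):
        print("\n".join(group))
        print()  # blank line between duplicate groups, like fdupes output
```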



Edit: Is there a tool that will help me reorganize my collection and remove duplicates, once they have been detected?


Edit2: The hard part is figuring out what to do once I have the output consisting of thousands of duplicate files (such as the output of fdupes).


It's not obvious whether I can safely delete a directory (i.e. whether it might still contain unique files), which directories are subsets of other directories, and so on. An ideal tool for this problem should be able to detect file duplication and then provide a powerful means of restructuring your files and folders. Merging by hardlinking (as fslint does) does free up disk space, but it does not solve the underlying problem that gave rise to the duplication in the first place -- i.e. bad file/directory organization.
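
One way to make that output actionable is to aggregate it per directory: a directory whose files all appear in some duplicate group is a candidate for deletion, while a directory holding even one unique file is not. A rough sketch, assuming the list was produced with `fdupes -r <root> > dupes.txt` (groups of identical paths separated by blank lines) and that the whole tree was scanned; the file names here are hypothetical:

```python
#!/usr/bin/env python3
"""Classify directories from fdupes output: which ones hold only duplicates?

Assumes dupes.txt came from `fdupes -r <root> > dupes.txt` over the same tree
that is walked below. This is a heuristic: it does not check whether the other
copies of a file live outside the directory being considered for deletion.
"""
import os
import sys

def load_duplicate_files(dupes_path):
    """Return the set of every file that appears in any duplicate group."""
    dup = set()
    with open(dupes_path) as f:
        for line in f:
            line = line.rstrip("\n")
            if line:
                dup.add(os.path.abspath(line))
    return dup

def classify_dirs(root, dup_files):
    """Split directories into 'all duplicates' vs 'contains unique files'."""
    redundant, has_unique = [], []
    for dirpath, _, names in os.walk(root):
        files = [os.path.abspath(os.path.join(dirpath, n)) for n in names]
        if not files:
            continue
        if all(f in dup_files for f in files):
            redundant.append(dirpath)   # every file here has a copy somewhere
        else:
            has_unique.append(dirpath)  # deleting this would lose data
    return redundant, has_unique

if __name__ == "__main__":
    root, dupes_txt = sys.argv[1], sys.argv[2]
    redundant, has_unique = classify_dirs(root, load_duplicate_files(dupes_txt))
    print("Directories whose files all appear in duplicate groups:")
    for d in sorted(redundant):
        print("  ", d)
    print("Directories containing at least one unique file:")
    for d in sorted(has_unique):
        print("  ", d)
```

Even with a report like this, it is worth double-checking that the surviving copies sit outside any directory you plan to remove before actually deleting anything.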



