look for duplicates using #mongodb #mapreduce
I recently had to look for dups inside our mongo instance... I wrote a simple script, which failed because our data is too big, so then I wrote a mapreduce version, which ran beautifully...
here is a sample of the data (you can re-create on your own)
results
and here are the simple and complex (mapreduce) versions of my script