Easily Move Documents Between Collections or Databases


Does this scenario sound familiar?

The good news: your new super amazing mashup is becoming super popular, and it’s getting featured on tech blogs and podcasts. w00t!

You’ve used MongoDB for your database backend, so things have been scaling pretty well, and the analytics you have been tracking are really adding up, like millions and millions of documents.

The bad news is, well, pretty much the same as the good: you’re starting to fill up your MongoDB collection pretty quickly and you want to break out your analytics by month. Since it’s analytics data, it’s pretty much read-only, so you can move it around … but how?

Now, of course you could do some fancy sharding or something like that, but let’s keep things simple, why don’t we?

Copying Particular Documents Between Collections

So, say you want to take all the documents (records) that were created in May and move them to a stats_2012_05 collection.

Turns out this is pretty simple with MongoDB. Much like a SELECT INTO statement in SQL, you can make a copy of the May documents, insert them into your new collection, and then remove them from your source collection.

To do this, we need to remember that the MongoDB shell uses JavaScript, so instead of a long query like one would use in SQL, we will use the power of JavaScript and write a small function.

Grab Just the Docs You Want …

First off, we’ll gather the documents we want and store them in a JavaScript variable, switch databases (or don’t, if you just want to move between collections), then loop over the documents saved in that variable and insert them into the new collection.

> use source_database;
> var docs = db.source_collection.find({ accessed: {
    '$gte': new Date(2012, 4, 1), '$lt': new Date(2012, 5, 1)
} });
> use new_database;
switched to db new_database
> docs.forEach(function(doc) { db.new_collection.insert(doc) });

Let’s Break Down What We Did There …

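First, we queried for all the documents that have an accessed date in May, i.e. where the date is greater than or equal to ($gte) May 1, 2012 and less than ($lt) June 1, 2012, and stored the result in a variable called docs. Strictly speaking, find() returns a cursor, so the documents are only fetched from the source collection when we start looping over it.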

JavaScript has super weird 0-based months, so May is 4, not 5. I know … weird.
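
If the 0-based months keep tripping you up, here is a minimal sketch of a helper that takes the human month number and builds the date range for you. The name monthRange is my own invention, not something built into the shell, and the dates are in the shell’s local time zone, just like in the query above.

```javascript
// Hypothetical helper (not part of MongoDB): build a date-range filter
// for one calendar month. `month` is the human month number (1 = January).
function monthRange(year, month) {
  return {
    // First instant of the requested month; Date wants a 0-based month index.
    $gte: new Date(year, month - 1, 1),
    // First instant of the following month. A month index of 12 rolls over
    // to January of the next year, so December works too.
    $lt: new Date(year, month, 1)
  };
}

// Same range as the query above: May 1, 2012 up to (not including) June 1, 2012.
var may = monthRange(2012, 5);
```

In the shell you could then write db.source_collection.find({ accessed: monthRange(2012, 5) }) instead of spelling out both dates.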

Then we switched databases, and looped over each document in our docs variable, loaded it into a variable called doc and inserted it into our new collection. And you’re done!

Optionally, you can remove() the documents from your source collection when you are done.
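
Here is a rough sketch of how the copy and the remove() could be combined into one small function. The name moveDocs is hypothetical, and it assumes source and target are shell collection objects (for example db.source_collection and db.getSiblingDB("new_database").new_collection). It does not check whether each insert succeeded, so try it on a test collection first.

```javascript
// Hypothetical helper: copy every document matching `query` from `source`
// to `target`, deleting each one from `source` right after it is copied.
// Assumption: `source` and `target` are MongoDB shell collection objects.
function moveDocs(source, target, query) {
  var moved = 0;
  source.find(query).forEach(function (doc) {
    target.insert(doc);               // copy first ...
    source.remove({ _id: doc._id });  // ... then remove only that document
    moved += 1;
  });
  return moved; // number of documents moved
}

// Example call in the shell (collection and database names from this post):
// moveDocs(db.source_collection,
//          db.getSiblingDB("new_database").new_collection,
//          { accessed: { $gte: new Date(2012, 4, 1), $lt: new Date(2012, 5, 1) } });
```

Removing by _id means only the documents that were actually copied get deleted, even if new ones matching the query show up in the meantime. Newer shells deprecate insert() and remove() in favor of insertOne() and deleteOne(), so swap those in if your shell complains.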

Other Uses

You could use this for all sorts of purposes, of course: pretty much any query you can think of can be used to break out your data into separate collections.
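
For example, if you wanted to split a whole year into monthly collections in one go, a small helper could work out the target collection name from each document’s date. This is only a sketch: monthlyCollectionName is a made-up name, and it assumes you follow the stats_YYYY_MM naming pattern used earlier (stats_2012_05).

```javascript
// Hypothetical helper: name the monthly collection a date belongs in,
// following the stats_YYYY_MM pattern (e.g. stats_2012_05).
function monthlyCollectionName(date) {
  var month = date.getMonth() + 1;               // getMonth() is 0-based
  var padded = (month < 10 ? "0" : "") + month;  // zero-pad to two digits
  return "stats_" + date.getFullYear() + "_" + padded;
}

// Possible use in the shell, assuming every document has an `accessed` date:
// db.source_collection.find().forEach(function (doc) {
//   db.getCollection(monthlyCollectionName(doc.accessed)).insert(doc);
// });
```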

 

 

 
