Quick Tip: $size things up to $type

in Querying, Quick Tip

There are a number of quirky little MongoDB commands and query operators hidden in the docs. Two of them are $size and $type.

Does it really matter? $size that is …

The $size operator matches any array with the specified number of elements.

The basic use would be something like …

> db.ideas.find( { votes : { $size: 2 } } );

That would get you all the documents in the ideas collection that have exactly 2 items in the votes array (if, say, you kept an array of votes).

This, however, is of somewhat dubious use because:

You cannot use $size to find a range of sizes.

Meaning you can’t query for ideas that have 2 or more votes, or between 2 and 25 votes.
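
One possible workaround, sketched here as plain JavaScript that only builds the filter objects, is to skip $size and test whether a given array position exists using dot notation and $exists. The helper names atLeast and between are made up for this example, and the sketch assumes votes is always an array and that min is 1 or more.

```javascript
// Sketch: range-style array-length filters without $size.
// "votes.1" addresses the element at index 1, so { $exists: true } on it
// matches arrays that hold at least two elements.

// Filter for "field has at least `min` elements" (assumes min >= 1).
function atLeast(field, min) {
  const filter = {};
  filter[field + '.' + (min - 1)] = { $exists: true };
  return filter;
}

// Filter for "field has between `min` and `max` elements, inclusive".
function between(field, min, max) {
  const filter = atLeast(field, min);
  // No element at index `max` means the array has at most `max` elements.
  filter[field + '.' + max] = { $exists: false };
  return filter;
}

const twoOrMore = atLeast('votes', 2);
const twoTo25 = between('votes', 2, 25);
console.log(JSON.stringify(twoOrMore));
console.log(JSON.stringify(twoTo25));
```

In the shell, either object could then be handed to find(), for example db.ideas.find(twoOrMore).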

Are you her $type?

Another interesting query operator is $type.

The $type operator matches values based on their BSON type.

This might be useful for seeking out documents that have a property that is an int and not a string. The types correspond to numeric IDs (see the list below).

To get documents where a property holds a 32-bit int value, the query might look something like this …

> db.mycollection.find( { foo : { $type : 16 } } );
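
If the bare number 16 feels too magic, a small lookup can make the query read better. This is only a sketch: BSON_TYPE and typeFilter are hypothetical helpers (not part of MongoDB), and the map holds just a few entries from the type list.

```javascript
// Sketch: name the BSON type numbers instead of hard-coding them.
// BSON_TYPE and typeFilter are hypothetical helpers for illustration.
const BSON_TYPE = { double: 1, string: 2, int32: 16, int64: 18 };

// Build a filter such as { foo: { $type: 16 } } from a readable type name.
function typeFilter(field, typeName) {
  if (!Object.prototype.hasOwnProperty.call(BSON_TYPE, typeName)) {
    throw new Error('Unknown BSON type: ' + typeName);
  }
  const filter = {};
  filter[field] = { $type: BSON_TYPE[typeName] };
  return filter;
}

const intFoo = typeFilter('foo', 'int32');
const stringFoo = typeFilter('foo', 'string');
console.log(JSON.stringify(intFoo));
console.log(JSON.stringify(stringFoo));
```

The resulting object can be passed straight to find(), e.g. db.mycollection.find(intFoo).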

Again, the practical usefulness of this might be rather thin, but who knows … it might come in handy someday!

MongoDB BSON Types

Type Name                     Type Number
Double                        1
String                        2
Object                        3
Array                         4
Binary data                   5
Object id                     7
Boolean                       8
Date                          9
Null                          10
Regular expression            11
JavaScript code               13
Symbol                        14
JavaScript code with scope    15
32-bit integer                16
Timestamp                     17
64-bit integer                18
Min key                       255
Max key                       127


Easily Move Documents Between Collections or Databases

in Administration, Querying

Does this scenario sound familiar?

The good news: your new super amazing mashup is becoming super popular, and it’s getting featured on tech blogs and podcasts. w00t!

You’ve used MongoDB for your database backend, so things have been scaling pretty well, and the analytics you have been tracking are really adding up, like millions and millions of documents.

The bad news is, well, pretty much the same as the good: you’re starting to fill up your MongoDB collection pretty quickly and you want to break out your analytics by month. Since it’s analytics data, it’s pretty much read-only, so you can move it around … but how?

Now, of course you could do some fancy sharding or something like that, but let’s keep things simple, why don’t we?

Copying Particular Documents Between Collections

So, say you want to take all the documents (records) that were created in May and move them to a stats_2012_05 collection.

Turns out this is pretty simple with MongoDB. Much like a SELECT INTO statement in SQL, you can make a copy of the May documents, insert them into your new collection, and then remove them from your source collection.

To do this, we need to remember that the MongoDB shell uses JavaScript, so instead of a long query like one would use in SQL, we will use the power of JavaScript and write a small function.

Grab Just the Docs You Want …

First off, we’ll gather the documents we want and store them in a JavaScript variable, switch databases (or don’t, if you just want to move between collections), then loop over the documents saved in the variable and insert them into the new collection.

> use source_database;
> var docs = db.source_collection.find({ accessed: {
     '$gte': new Date(2012, 4, 1), '$lt': new Date(2012, 5, 1)
} }); 
> use new_database;
switched to db new_database
> docs.forEach(function(doc) { db.new_collection.insert(doc) });

Let’s Breakdown What We Did There …

First, we got all the documents that have an accessed date in May, i.e. where the date is greater than or equal to ($gte) 5/1/2012 and less than ($lt) 6/1/2012, and loaded them into a variable called docs.

JavaScript has super weird 0-based months, so May = 4, not 5. I know … weird.
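
To avoid tripping over the month offset every time, the date bounds can be wrapped in a tiny helper. This is a sketch only: monthRange is a made-up name, it takes a human-friendly 1-based month, and like the query above it builds the dates in the local time zone.

```javascript
// Build the { $gte, $lt } bounds for one calendar month.
// `month` is 1-based here (5 = May); the Date constructor wants 0-based.
function monthRange(year, month) {
  return {
    $gte: new Date(year, month - 1, 1),
    // Date rolls over by itself, so month 12 gives January 1 of the next year.
    $lt: new Date(year, month, 1)
  };
}

const may = monthRange(2012, 5);
const dec = monthRange(2012, 12);
console.log(may.$gte.getMonth(), may.$lt.getMonth());
console.log(dec.$lt.getFullYear(), dec.$lt.getMonth());
```

In the shell, the May query could then be written as db.source_collection.find({ accessed: monthRange(2012, 5) }).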

Then we switched databases, and looped over each document in our docs variable, loaded it into a variable called doc and inserted it into our new collection. And you’re done!

Optionally, you can remove() the documents from your source collection when you are done.
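
If you do want the copy and the remove in one go, the two steps can be folded into a single function. Treat this as a rough sketch: moveDocuments is a hypothetical helper, it assumes collection objects with the shell's find(), insert() and remove() methods, and it does not check whether each insert actually succeeded before removing.

```javascript
// Sketch: copy matching documents to a target collection, then delete
// each one from the source right after it has been inserted.
// `source` and `target` are collection objects exposing the shell's
// find(), insert() and remove() methods. No error checking is done here.
function moveDocuments(source, target, query) {
  let moved = 0;
  source.find(query).forEach(function (doc) {
    target.insert(doc);
    // Remove by _id so only the document just copied is deleted.
    source.remove({ _id: doc._id });
    moved += 1;
  });
  return moved;
}
```

From the shell it might be called as moveDocuments(db.source_collection, db.getSiblingDB('new_database').new_collection, { ... }) with the same date query as before. Try it on a test collection first, since the remove is not undoable.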

Other Uses

You could use this for all sorts of purposes, of course: pretty much any query you can think of can be used to break out your data into separate collections.


Mongo Seattle 2011

in Announcements, MongoSeattle

I’m happy to (sorta) announce I (Justin) will be giving a talk at Mongo Seattle 2011 on Dec 1st in downtown Seattle.

According to my contacts at 10Gen, registration is going gangbusters and there are going to be some really great sessions to attend, including one from our buddy Damon Cortesi at Simply Measured, as well as a bunch of 10Gen engineers (the people that make MongoDB), plus MongoLab, Geek.net, VMWare and WordSquared (a super fun game!).

So, what are you waiting for?

Go ahead and register here … and if you send me a DM or @Reply on Twitter (@learnmongo) I might just give you a pretty decent discount code!


Compacting MongoDB Collections

in Administration

A while back we wrote a post explaining how to compact MongoDB data files. That example shows how to use some server-side JavaScript and a cron job to automatically compact the data files on a schedule … however, this isn’t always ideal, as it will compact all the collections in a database at one time.

If you had a very large collection you didn’t want to run the operation on … or a number of collections in your database you didn’t wish to compact, you were a bit out of luck.

Also, using repairDatabase() requires enough disk space for the current database plus the repaired copy (i.e. double the disk space).

Enter MongoDB 2.0 … in 2.0+ you can now use the simple compact command to target a single collection.

Running compact has three major performance benefits:

  • Compacts collection (less disk space.)
  • Defragments a collection (data pages are aligned better.)
  • Rebuilds and compacts the collection’s indexes (less RAM needed, and better perf.)

If you have a lot of read/write/delete operations going on in your collection, running compact could have a fairly noticeable positive effect on performance.

There are two ways to run the compact command:

> db.yourCollection.runCommand("compact");
> db.runCommand({ compact : 'yourCollection' });
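
Because compact works on one collection at a time, you can loop over just the collections you care about and leave the rest alone. A minimal sketch, assuming a database object with a runCommand() method (such as the shell’s db); compactCollections and the collection names in the usage note are made up for illustration.

```javascript
// Sketch: run compact on a hand-picked list of collections only.
// `database` is anything exposing runCommand(), such as the shell's `db`.
function compactCollections(database, names) {
  return names.map(function (name) {
    // Same command document as the second form shown above.
    const result = database.runCommand({ compact: name });
    return { collection: name, result: result };
  });
}
```

In the shell this might look like compactCollections(db, ['stats_2012_05', 'ideas']), which returns one entry per collection holding whatever the server replied.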

Now, that said there are two big downsides to using compact …

  • The compact command blocks operations on the collection until it’s done compacting (so it’s best to run this off hours during scheduled maintenance.)
  • It’s typically slower than repairDatabase in its actual operational time.

Those aside, as part of routine maintenance compact is a really helpful new feature and might justify an upgrade to 2.0 all on its own!


Quick Tip: How to $size up a MongoDB Array

in Querying, Quick Tip

Since MongoDB allows you to store more than just string and int values, including things like arrays … from time to time you might need to know how many items are in an array in your document.

For example, say you have a hosting business and to make a good profit you need to sell at least three upgrades to each hosting client.

To make sure you market to the correct people you want to find all the clients with two upgrades (and get them to sign up for just one more.)

Your general document structure looks something like this:

{
 customer_name : "Sam Taylor",
 email : "sam@widgetxyz.com",
 upgrades: ["ssl", "diskspace 3", "RAM 5"]
}
{
 customer_name : "Kyle Lopez",
 email : "klopez@buymybikes.com",
 upgrades: ["diskspace 3", "RAM 5"]
}

To get all the clients that have two upgrades we’ll use the $size operator:

> db.clients.find( { upgrades : { $size: 2 } } );

Now you’ll get back all the documents that have two upgrades in the “upgrades” array (or in this example, the document for “Kyle Lopez”.)

What about a range?

Unfortunately the $size operator doesn’t support ranges (like getting all documents with 2 to 4 array values.) To do that you’d need to create your own separate field to keep track of the count and query on that field.
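
Here is roughly what that counter field could look like, written as plain JavaScript that just builds the update and filter documents. It is a sketch under assumptions: upgrade_count is a made-up field name, and every code path that adds an upgrade has to use the same update so the count stays in step with the array.

```javascript
// Sketch: keep a count next to the array so range queries become possible.
// `upgrade_count` is a hypothetical field name.

// Update document that appends an upgrade and bumps the counter together.
function addUpgradeUpdate(upgrade) {
  return { $push: { upgrades: upgrade }, $inc: { upgrade_count: 1 } };
}

// Filter for clients with between `min` and `max` upgrades, inclusive.
function upgradeCountBetween(min, max) {
  return { upgrade_count: { $gte: min, $lte: max } };
}

const addSsl = addUpgradeUpdate('ssl');
const twoToFour = upgradeCountBetween(2, 4);
console.log(JSON.stringify(addSsl));
console.log(JSON.stringify(twoToFour));
```

In the shell these would be used as db.clients.update({ email: "klopez@buymybikes.com" }, addSsl) and db.clients.find(twoToFour).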
