MongoDB Aggregations: Identifying Popular Ingredients with $unwind and $group

In this series, we’re exploring different MongoDB aggregation operators by applying them to a collection of recipes.

MongoDB Aggregations Series

As part of our exploration we are imagining you’re building a recipe website. Today we want to analyze ingredient usage across all our recipes to find the most common ingredients.

By using MongoDB’s $unwind and $group operators together, we can calculate ingredient frequencies and identify popular ingredients. This analysis can be invaluable for creating ingredient-based filters, improving recipe recommendations, and even assisting with grocery list planning.

In this post, we’ll walk through how to use $unwind and $group to count ingredient occurrences across multiple recipes.

Why Use $unwind and $group for Ingredient Analysis?

Here are some examples of how this type of aggregation could be useful:

“As a User, I want to filter recipes by common ingredients I already have.”

Users can more easily find recipes they can make with ingredients they have on hand.

As a Grocery Planner, I want to generate a list of essential ingredients to stock up on.”

Identifying common ingredients can help users know which items to prioritize when grocery shopping.

“As a Content Creator, I want to analyze ingredient popularity to inspire new recipes.”

For those creating new recipes, ingredient frequency data can reveal trends and guide future recipe development.

MongoDB’s $unwind and $group operators together make this kind of analysis straightforward and efficient.

Setting Up the Data

Here’s a sample document from our recipe collection:

{
  "_id": {
    "$oid": "636aa94c7dd21c28fda493a1"
  },
  "title": "Zucchini Fudge Cake",
  "type": "Dessert",
  "ingredients": [
    {
      "name": "butter",
      "quantity": { "amount": 0.33, "unit": "cup" },
      "vegetarian": true
    },
    {
      "name": "sugar",
      "quantity": { "amount": 1, "unit": "cup" },
      "vegetarian": true
    },
    {
      "name": "flour",
      "quantity": { "amount": 2, "unit": "cup" },
      "vegetarian": true
    },
    {
      "name": "egg yolks",
      "quantity": { "amount": 1 },
      "vegetarian": true
    },
  
    {
      "name": "chocolate",
      "quantity": { "amount": 2, "unit": "oz" },
      "vegetarian": true
    },
    ...
    {
      "name": "baking powder",
      "quantity": { "amount": 1, "unit": "teaspoon" },
      "vegetarian": true
    },
    {
      "name": "salt",
      "quantity": { "amount": 0.25, "unit": "teaspoon" },
      "vegetarian": true
    },
    {
      "name": "vanilla",
      "quantity": { "amount": 1, "unit": "teaspoon" },
      "vegetarian": true
    },
  ]
}

This “Zucchini Fudge Cake“ recipe includes a rich array of ingredients (get it?), making it a good candidate for ingredient analysis.

Our goal is to count how many times each ingredient (e.g., “salt,“ “butter“) appears across all recipes.

Using $unwind and $group to Analyze Ingredient Frequency

To count ingredient occurrences, we’ll start by breaking down the ingredients array with $unwind so that each ingredient can be treated as a separate document. We’ll then use $group to count each unique ingredient’s frequency.

Step 1: Using $unwind to Expand Ingredients

Since ingredients is an array, we need to use $unwind to “explode” each ingredient into its own document. This lets us analyze each ingredient individually.

> db.recipes.aggregate([
  { $unwind: "$ingredients" },
  { $group: { _id: "$ingredients.name", count: { $sum: 1 } } },
]);

$unwind: "$ingredients"

Splits each document with an ingredients array into multiple documents, one for each ingredient. This allows us to treat each ingredient as an individual entry.

$group: { _id: "$ingredients.name", count: { $sum: 1 } }

Groups the documents by the ingredient name, counting each occurrence.

Output Example:

[
  { _id: "salt", count: 50 },
  { _id: "butter", count: 30 },
  { _id: "flour", count: 40 },
  { _id: "chicken", count: 22 },
];

This output shows us how many times each ingredient appears across all recipes. Now we know that “salt” is used in 50 recipes, making it a commonly used ingredient.

However more interesting is something like “chicken” at 22 recipes. However we could do a cool feature with this data when a user hovered over an ingredient and show a tooltip like “see 12 more recipes featuring Sriracha sauce”, well if you like it hot maybe!

Adding Additional Information to Each Ingredient Group

To make the analysis more insightful, we can modify our pipeline to include the total quantity used for each ingredient across recipes. For instance, we might want to sum the quantity of “butter” used in all recipes.

Step 2: Calculating Total Quantity for Each Ingredient

We’ll extend our pipeline to sum the quantities, assuming all quantities are in the same unit (e.g., cups or tablespoons).

> db.recipes.aggregate([
  { $unwind: "$ingredients" },
  {
    $group: {
      _id: "$ingredients.name",
      count: { $sum: 1 },
      totalQuantity: { $sum: "$ingredients.quantity.amount" },
    },
  },
]);

totalQuantity: { $sum: "$ingredients.quantity.amount" }

Adds up the quantity amount for each ingredient across all recipes.

Output Example:

[
  { _id: "salt", count: 50, totalQuantity: 30 },
  { _id: "butter", count: 30, totalQuantity: 15 },
  { _id: "flour", count: 40, totalQuantity: 80 },
];

This output now tells us not only how frequently each ingredient is used, but also the total quantity used across recipes. This information could be used to recommend how much of each ingredient to keep in stock.

Advanced Tip: Normalizing Measurements

In the example above we assumed all the quantities are in the same unit. That’s very unlikely so what are some things we could do to deal with different quantities without bringing the data down to the client side and processing it? Luckly there are a lot of options with the Aggregation Framework!

Most likely the best option is to pre-process our documents and add fields for different measurements but what if we want to do this on the fly?

Here is one way we could handle this using the $switch operator, along with a few other friends $sum, $eq and $divide:

> db.recipes.aggregate([
  { $unwind: "$ingredients" },
  {
    $group: {
      _id: "$ingredients.name",
      count: { $sum: 1 },
      totalQuantity: {
        $sum: {
          $switch: {
            branches: [
              { 
                case: { $eq: ["$ingredients.quantity.unit", "tablespoon"] },
                then: { $divide: ["$ingredients.quantity.amount", 16] }
              },
              { 
                case: { $eq: ["$ingredients.quantity.unit", "ounce"] },
                then: { $divide: ["$ingredients.quantity.amount", 8] }
              }
            ],
            // Assume base unit is cups
            default: "$ingredients.quantity.amount"
          }
        }
      }
    }
  }
])

$switch

Allows you to specify multiple conversions for different units in a single pipeline stage via “branches“.

The default option sets a base assumption (like cups), simplifying the calculation.

Advanced Tip: But Wait, There’s More (Rest of the World)!

I know what a lot of you are thinking, we don’t use old fashioned measurements like Imperial measures, we use Metric! Well no need to worry my friends we can deal with this as well.

Again, probably the best way to deal with this is via some pre-processing and adding both Imperial and Metric fields to your documents … let’s explore another possible way to do this (warning, includes some maths):

  • 1 cup = 240 milliliters (ml)
  • 1 tablespoon = 15 milliliters (ml)
  • 1 ounce = 28.35 grams (g)
> db.recipes.aggregate([
  { $unwind: "$ingredients" },
  {
    $addFields: {
      "ingredients.quantity": {
        imperial: {
          amount: "$ingredients.quantity.amount",
          unit: "$ingredients.quantity.unit"
        },
        metric: {
          amount: {
            $switch: {
              branches: [
                // Convert cups to ml
                {
                  case: { $eq: ["$ingredients.quantity.unit", "cup"] },
                  then: { $multiply: ["$ingredients.quantity.amount", 240] } },
                // Convert tablespoons to ml 
                {
                  case: { $eq: ["$ingredients.quantity.unit", "tablespoon"] },
                  then: { $multiply: ["$ingredients.quantity.amount", 15] } },
                // Convert ounces to grams
                {
                  case: { $eq: ["$ingredients.quantity.unit", "ounce"] },
                  then: { $multiply: ["$ingredients.quantity.amount", 28.35] } }, 
                // Convert teaspoons to ml
                {
                  case: { $eq: ["$ingredients.quantity.unit", "teaspoon"] },
                  then: { $multiply: ["$ingredients.quantity.amount", 5] }
                } 
              ],
              // Leave as-is if no conversion needed
              default: "$ingredients.quantity.amount"
            }
          },
          unit: {
            $switch: {
              branches: [
                { case: { $eq: ["$ingredients.quantity.unit", "cup"] }, then: "ml" },
                { case: { $eq: ["$ingredients.quantity.unit", "tablespoon"] }, then: "ml" },
                { case: { $eq: ["$ingredients.quantity.unit", "ounce"] }, then: "g" },
                { case: { $eq: ["$ingredients.quantity.unit", "teaspoon"] }, then: "ml" }
              ],
              default: "$ingredients.quantity.unit"
            }
          }
        }
      }
    }
  },
  {
    $group: {
      _id: "$_id",
      title: { $first: "$title" },
      type: { $first: "$type" },
      ingredients: { $push: "$ingredients" }
    }
  }
])

While its very unlikely we would want to use a query like this on our production website, hopefully it gives you an idea of how you can use these different pipeline steps like $switch and $addFields … and perhaps how you might run a query to pre-process your documents when you create / edit them to add this extra information!

Output Example:

[
  {
    "_id": ObjectId("636821387dd21c28fda4939f"),
    "title": "Zucchini Fudge Cake",
    "type": "Dessert",
    "ingredients": [
      {
        "name": "butter",
        "quantity": {
          "imperial": { "amount": 0.33, "unit": "cup" },
          "metric": { "amount": 79.2, "unit": "ml" }
        }
      },
      {
        "name": "flour",
        "quantity": {
          "imperial": { "amount": 2, "unit": "cup" },
          "metric": { "amount": 480, "unit": "ml" }
        }
      },
      {
        "name": "sugar",
        "quantity": {
          "imperial": { "amount": 1, "unit": "cup" },
          "metric": { "amount": 240, "unit": "ml" }
        }
      },
      {
        "name": "salt",
        "quantity": {
          "imperial": { "amount": 0.25, "unit": "teaspoon" },
          "metric": { "amount": 1.25, "unit": "ml" }
        }
      }
      // other ingredients...
    ]
  }
]

Now we have both Imperial and Metric measures all in one document!

Practical Applications for Ingredient Frequency Analysis

By analyzing ingredient usage frequency, you can add value to your application in several ways:

  • Ingredient-Based Recipe Suggestions
    Recommend recipes based on commonly used ingredients, allowing users to find recipes they can make with what they already have.
  • Grocery Planning
    Help users create grocery lists by identifying the ingredients most frequently used in recipes.
  • Recipe Content Insights
    For content creators, ingredient frequency can reveal popular ingredients and highlight potential areas for new recipes.

Advanced Tip: Filtering Ingredients by Dietary Preference

If you want to analyze only vegetarian ingredients, add a $match stage after $unwind to filter for ingredients marked as vegetarian:

> db.recipes.aggregate([
  { $unwind: "$ingredients" },
  { $match: { "ingredients.vegetarian": true } },
  {
    $group: {
      _id: "$ingredients.name",
      count: { $sum: 1 },
      totalQuantity: { $sum: "$ingredients.quantity.amount" },
    },
  },
])

In this example, only vegetarian ingredients are counted and summed, giving insight into which vegetarian items are most commonly used.

Conclusion

MongoDB’s $unwind and $group operators make it easy to analyze ingredient usage across recipes. By identifying frequently used ingredients, you can enhance recipe recommendations, improve grocery planning, and gain insights into popular ingredients. Experiment with $unwind and $group in your own MongoDB collections, and discover how these aggregations can enrich your recipe app’s functionality and user experience.

,

MongoDB for Jobseekers Book

If you’re fascinated by the intricacies of MongoDB and yearn to explore its limitless possibilities further, I invite you to delve into my comprehensive book, “MongoDB for Jobseekers.”

This book is your passport to unlocking MongoDB’s full potential, whether you’re a beginner or an experienced enthusiast. Seamlessly navigate the intricacies of data management, uncover advanced techniques, and gain insights that will elevate your proficiency.

Available on Amazon and other leading platforms, “MongoDB for Jobseekers” offers a comprehensive roadmap for honing your MongoDB skills. As you embark on your journey through the world of databases, this book will be your trusted companion, guiding you towards mastery and helping you stand out in the competitive landscape.