---
title: Pivot Data with Map reduce
created_at: 2011-05-05 18:0:024.036546 -04:00
recipe: true
author: Gaetan Voyer-Perrault
description: How to use map-reduce to pivot table data.
filter:
  - erb
  - markdown
---
### Problem

You have a collection of Actors, each with an array of the Movies they have appeared in.
You want to generate a collection of Movies, each with an array of its Actors.

Some sample data:
<% code 'javascript' do %>
db.actors.insert( { actor: "Richard Gere", movies: ['Pretty Woman', 'Runaway Bride', 'Chicago'] });
db.actors.insert( { actor: "Julia Roberts", movies: ['Pretty Woman', 'Runaway Bride', 'Erin Brockovich'] });
<% end %>
### Solution

The map function loops over each movie in an Actor document and emits one key/value pair per Movie.

The catch is in the reduce phase. The reduce function cannot return a bare array, so we build the Actors array inside the "value" document that it returns.
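
To make the emit step concrete, here is a minimal sketch in plain JavaScript that runs outside MongoDB. The local `emit` function and the `grouped` object are hypothetical stand-ins for what the server does internally; they only illustrate which values end up grouped under each movie key before reduce runs.

```javascript
// Sample documents copied from the recipe's sample data.
var actors = [
  { actor: "Richard Gere", movies: ["Pretty Woman", "Runaway Bride", "Chicago"] },
  { actor: "Julia Roberts", movies: ["Pretty Woman", "Runaway Bride", "Erin Brockovich"] }
];

// Hypothetical local replacement for MongoDB's emit(): collect values per movie title.
var grouped = {};
function emit(key, value) {
  var title = key.movie;
  if (!Object.prototype.hasOwnProperty.call(grouped, title)) {
    grouped[title] = [];
  }
  grouped[title].push(value);
}

// Map step: one emit per movie, with the actor wrapped in an { actors: [...] } object.
actors.forEach(function (doc) {
  doc.movies.forEach(function (movie) {
    emit({ movie: movie }, { actors: [doc.actor] });
  });
});

console.log(JSON.stringify(grouped["Pretty Woman"]));
// prints [{"actors":["Richard Gere"]},{"actors":["Julia Roberts"]}]
console.log(Object.keys(grouped).length);
// prints 4
```

Each movie key collects one small `{ actors: [...] }` object per actor, and the reduce step then merges those objects into a single one.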
#### The code

<% code 'javascript' do %>
// Emit one { movie } key per title, with the actor wrapped in an array.
var map = function() {
    for (var i = 0; i < this.movies.length; i++) {
        var key = { movie: this.movies[i] };
        var value = { actors: [ this.actor ] };
        emit(key, value);
    }
};

// Merge the partial actor lists; the result has the same shape as the emitted value.
var reduce = function(key, values) {
    var actor_list = { actors: [] };
    for (var i = 0; i < values.length; i++) {
        // Append rather than prepend so earlier actors stay first.
        actor_list.actors = actor_list.actors.concat(values[i].actors);
    }
    return actor_list;
};
<% end %>
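
Notice that actor_list is a JavaScript object that contains an array, and that map emits the same structure. The shapes must match because MongoDB may call reduce more than once for the same key, passing earlier reduce results back in as values.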
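
The shape requirement can be checked outside MongoDB. The sketch below uses a simplified stand-in for the recipe's reduce function and a hypothetical third value ("Third Actor" is not in the sample data) to compare reducing all values at once against reducing in two passes.

```javascript
// Simplified stand-in for the recipe's reduce: merge partial actor lists
// into one object with the same { actors: [...] } shape as the emitted values.
function reduce(key, values) {
  var actorList = { actors: [] };
  for (var i = 0; i < values.length; i++) {
    actorList.actors = actorList.actors.concat(values[i].actors);
  }
  return actorList;
}

var key = { movie: "Pretty Woman" };
var emitted = [
  { actors: ["Richard Gere"] },
  { actors: ["Julia Roberts"] },
  { actors: ["Third Actor"] } // hypothetical extra value, for illustration only
];

// Reduce everything in a single call.
var allAtOnce = reduce(key, emitted);

// Reduce the first two values, then feed that result back in with the third.
var partial = reduce(key, emitted.slice(0, 2));
var inTwoSteps = reduce(key, [partial, emitted[2]]);

console.log(JSON.stringify(allAtOnce));
// prints {"actors":["Richard Gere","Julia Roberts","Third Actor"]}
console.log(JSON.stringify(inTwoSteps));
// prints {"actors":["Richard Gere","Julia Roberts","Third Actor"]}
```

Because the return value has the same structure as each input value, the second pass can treat a previous result like any other emitted value.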

Run the following to execute the map-reduce, write the output to the "pivot" collection, and print the result:
<% code 'javascript' do %>
printjson(db.actors.mapReduce(map, reduce, "pivot"));
db.pivot.find().forEach(printjson);
<% end %>

Here is the sample output. Note that "Pretty Woman" and "Runaway Bride" each list both "Richard Gere" and "Julia Roberts".

    { "_id" : { "movie" : "Chicago" }, "value" : { "actors" : [ "Richard Gere" ] } }
    { "_id" : { "movie" : "Erin Brockovich" }, "value" : { "actors" : [ "Julia Roberts" ] } }
    { "_id" : { "movie" : "Pretty Woman" }, "value" : { "actors" : [ "Richard Gere", "Julia Roberts" ] } }
    { "_id" : { "movie" : "Runaway Bride" }, "value" : { "actors" : [ "Richard Gere", "Julia Roberts" ] } }
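
The output documents nest the title under `_id` and the actors under `value`. If a flat Movie document is preferred, one possible approach is a small transformation like the sketch below. It runs in plain JavaScript on two of the sample output documents; `flattenPivotDoc` is a hypothetical helper, not part of MongoDB.

```javascript
// Two of the sample output documents from the "pivot" collection.
var pivotDocs = [
  { _id: { movie: "Chicago" }, value: { actors: ["Richard Gere"] } },
  { _id: { movie: "Pretty Woman" }, value: { actors: ["Richard Gere", "Julia Roberts"] } }
];

// Hypothetical helper: lift the movie title and the actors array to the top level.
function flattenPivotDoc(doc) {
  return { movie: doc._id.movie, actors: doc.value.actors };
}

var movies = pivotDocs.map(flattenPivotDoc);

console.log(JSON.stringify(movies[1]));
// prints {"movie":"Pretty Woman","actors":["Richard Gere","Julia Roberts"]}
```

In the shell, the same helper could be applied inside `db.pivot.find().forEach(...)` and the results inserted into a collection of your choosing.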