The Tale of a Naïve Mongoose User
Once there was a naïve Mongoose user who tried to run a Mongoose .find()
over a whole collection and operate on every document it returned.
OK, maybe it was me, this morning.
First approach
It was something like this:
SomeModel
  .find({})
  .populate('someReferences')
  .exec(function (err, docs) {
    if (err) throw err;
    // docs is an array holding every matching document
    docs.forEach(doStuff);
  });
Can you guess what happened after that script churned on the collection for a while?
Crash!
This is what happened: Node.js ran out of memory:
<--- Last few GCs --->
88998 ms: Scavenge 1405.4 (1457.1) -> 1405.4 (1457.1) MB, 15.6 / 0 ms (+ 19.2 ms in 1 steps since last GC) [allocation failure] [incremental marking delaying mark-sweep].
90156 ms: Mark-sweep 1405.4 (1457.1) -> 1404.8 (1457.1) MB, 1158.4 / 0 ms (+ 286.9 ms in 1053 steps since start of marking, biggest step 22.6 ms) [last resort gc].
91284 ms: Mark-sweep 1404.8 (1457.1) -> 1404.6 (1457.1) MB, 1127.7 / 0 ms [last resort gc].
<--- JS stacktrace --->
==== JS stack trace =========================================
Security context: 0x1a6945644a49 <JS Object>
1: new constructor(aka MongooseDocumentArray) [/myproject/node_modules/mongoose/lib/types/documentarray.js:~23] [pc=0x2422a4edf2a5] (this=0x2ccc06275191 <a MongooseDocumentArray with map 0x271710fce9e1>,values=0x2ccc06275171 <JS Array[0]>,path=0x101fed47b049 <String[12]: achievements>,doc=0x2ccc06275119 <a model with map 0x271710fcf381>)
3: /* ano...
It turns out Mongoose was trying to load the whole result set into memory at once instead of iterating over the documents with a cursor.
Second approach
Mongoose streams to the rescue! I refactored the code as follows, and the results made me smile:
var stream = SomeModel
  .find({})
  .populate('someReferences')
  .stream();

// Documents arrive one at a time, so memory use stays flat
stream.on('data', function (doc) {
  doStuff(doc);
});

stream.on('error', function (err) {
  console.error(err);
});

stream.on('close', function () {
  mongoose.disconnect();
  console.log("All done.");
});
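A side note: in more recent Mongoose releases, Query#stream() has been deprecated in favour of Query#cursor(), which solves the same problem. Here is a minimal sketch of the equivalent cursor-based version, assuming the same SomeModel and doStuff as above:

var cursor = SomeModel
  .find({})
  .populate('someReferences')
  .cursor();

// eachAsync pulls documents from the cursor one at a time
cursor.eachAsync(function (doc) {
  doStuff(doc);
}).then(function () {
  console.log("All done.");
  return mongoose.disconnect();
});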
And the naïve Mongoose user has learnt his lesson!