Sources Contact Advanced Search Tutorials

An Interest In:

Web News this Week

Search Archive

Some of Our Sources

View All Sources

Help Webnuz

Referal links:

January 25, 2021 08:57 am GMT

The right way to make advanced and efficient MongoDB pagination

One upon a time, we had a complex project enough (ride-sharing and taxi application) with stack Node.js and MongoDB. We have chosen this stack because it was preferable by the customer, good known by our team, and at the same time looks like good a suite for project tasks.

Everything was great, the number of users became more than twelve thousand, the number of active drivers was close to three hundred drivers. In one year, the number of rides becomes more than two million.

But once we need to create an admin panel to control and monitor all processes (from the business point of view) in the main application. The huge percent of requirements was to have advanced lists of a variety of entities, with bind statistics over them.

Because we use mongoose, as ODM, first of all, we take a look at its plugins. The most popular of them, that related to pagination are
mongoose-paginate
*Pagination plugin for Mongoose Note: This plugin will only work with Node.js >= 4.0 and Mongoose >= 4.0. Add plugin to*www.npmjs.com
mongoose-paginate-v2
*A cursor based custom pagination library for Mongoose with customizable labels.*www.npmjs.com
mongoose-aggregate-paginate
*mongoose-aggregate-paginate is a Mongoose plugin easy to add pagination for aggregates. This plugin can be used in*www.npmjs.com
mongoose-aggregate-paginate-v2
*A cursor based custom aggregate pagination library for Mongoose with customizable labels. If you are looking for basic*www.npmjs.com

Another requirement was to have a possibility to choose a specific page on-demand, so the approach to use previous-next-like pagination, that based on cursors was prohibited immediately its mongoose-paginate-v2 and *mongoose-aggregate-paginate-v2 *libraries.

The oldest, and probably the simplest in usage is mongoose-paginate it uses simple search queries, limit, sort, and the skip operations. I guess its a good variant for simple pagination just install a plugin, add few lines of code to your endpoint, and thats all work is done. It even can use populate of mongoose, something that emulates joins from SQL world. Technically it just makes additional queries to the database, that probably not the way you want. Even more, when you just have a little bit more complicated query, with any data transformation, it will be totally unusable. I know just one way to normally use it in such cases first create MongoDB View technically its pre-saved queries, that MongoDB represents as read-only collections. And just then run pagination using mongoose-paginate over this view. Not bad you will hide complicated queries under view, but we have a better idea of how to solve this problem.

MongoDB Aggregation Framework is here!

You know, I guess, it was really day for the MongoDB community when Aggregation Framework was released. Probably it allows for most of the queries that you can imagine. So, we think about taking mongoose-aggregate-paginate into use*.*

But the next two things that disappointed us:

**What does this plugin need? **I mean what task does it help to solve, that cannot be solved without this plugin, with the same effort. Looks like it just one more additional dependency in your project, because it doesnt bring any profit, even dont save your time

Internal codebase, and general approach to making queries. This library makes TWO calls into a database and waits for responses via Promise.all. First to fetch query result and second to calculate the count of total records that query returns, without $filter and $limit stages. It needs this to calculate the number of total pages.

How we can avoid additional queries into the database? The worst thing here that we need to run all aggregation pipeline twice, that can be costly enough in terms of memory and CPU usage. Even more, if collection huge, and documents tend to be few megabytes, it can impact Disc I/O usage, that also a big problem.

The good news Aggregation Framework has a specific stage in its arsenal, that can solve this problem. Its $facet:

Processes multiple aggregation pipelines within a single stage on the same set of input documents. Each sub-pipeline has its field in the output document where its results are stored as an array of documents.

MongoDB documentation about $facet stage .

Aggregation Pipeline for pagination will have the next shape:

{   $facet: {      <outputField1>: [ <stage1>, <stage2>, ... ],      <outputField2>: [ <stage1>, <stage2>, ... ],      ...   }}

Enter fullscreen mode Exit fullscreen mode

Also, pipelines for pagination can be improved by customization for specific cases. Some tips are listed below:

Run all operations, that dont directly affect on final pagination result, after all possible filters ($match stages). There are stages like $project or $lookup that dont change the number or order of result documents. Try to cut off as many documents as you can at once.
Try to make your models as more self-sufficient, to avoid additional $lookups. Its normal to duplicate some data or make pre-computing fields.
If you have a really huge pipeline, that processes many data, your query will probably use more than 100MB. In this case, you need to use allowDiskUse flag.
Follow Aggregation Pipelines performance optimization guide. This advice helps you to make your queries more efficient.
And technically you can make dynamic queries on your application code side depends on conditions you can add, remove or modify specific stages. It can speed up your queries, and moreover, make your code more eloquent.

Coz NDA, I cannot show you real database schema and real queries. But let me show you a small example of such pagination.

Imagine that you have two collections Statistic and Drivers. Drivers collection is static enough in thinking of types and amount of fields in different documents. But Statistic is polymorphic, can be changed during time, as a result of business requirements updates. Also, some drivers could have big statistic documents and history in general. So you cannot make Statistic as subdocument of Driver.

So code and MongoDB query will have the next shape:

const ridesInfoPaginationPipeline = (filter = {}, skip = 0, limit = 10, sort = {}) => [{    $match: {      ...filter,      active: true,    }  },  {    $sort: {      ...sort,      createdAt: -1,    }  },  {    $lookup: {      from: 'statistic',      localField: '_id',      foreignField: 'driverId',      as: 'driver',    },  },  {    $unwind: {      path: '$driver',      preserveNullAndEmptyArrays: true,    },  },  {    $project: {      driver: {        $ifNull: [{          $concat: ['$driver.firstName', ' ', '$driver.lastName']        }, 'Technical']      },      entityId: 1,      message: 1,      meta: 1,      createdAt: 1,    },  },  {    $facet: {      total: [{        $count: 'createdAt'      }],      data: [{        $addFields: {          _id: '$_id'        }      }],    },  },  {    $unwind: '$total'  },  {    $project: {      data: {        $slice: ['$data', skip, {          $ifNull: [limit, '$total.createdAt']        }]      },      meta: {        total: '$total.createdAt',        limit: {          $literal: limit        },        page: {          $literal: ((skip / limit) + 1)        },        pages: {          $ceil: {            $divide: ['$total.createdAt', limit]          }        },      },    },  },];const executePagination = async () => {    return Statistic.aggregate(ridesInfoPaginationPipeline());}

Enter fullscreen mode Exit fullscreen mode

As you see, using Aggregation Framework and $facet stage we can:

make data transformation and complex queries;
fetch data from multiple collections;
get metadata of pagination (total, page, pages)in the one query without additional query execution.

Regarding the main drawbacks of such an approach, I guess just one is major higher complicity of development and debug process, along with higher entry threshold. It includes performance troubleshooting, knowledge of a variety of stages, and data modeling approaches.

So, pagination, that based on MongoDB Aggregation Framework, is not pretending to be a silver bullet. But after many attempts and pitfalls we found that this solution is covered all our cases, with no effects and no high coupling to a specific library.

Original Link: https://dev.to/max_vynohradov/the-right-way-to-make-advanced-and-efficient-mongodb-pagination-16oa

Share this article:

View Full Article

Dev To

An online community for sharing and discovering great ideas, having debates, and making friends

More About this Source Visit Dev To