Your Web News in One Place

Help Webnuz

Referal links:

Sign up for GreenGeeks web hosting
October 4, 2021 08:52 pm GMT

PostgreSQL query optimization for Gophers: It's much easier than it sounds!

Introduction

Hello, amazing DEV people!

Today I will show you a wonderful query optimization technique for Postgres that I often use myself. This approach to optimization can save you from a long and tedious transfer of your project to another technology stack, such as GraphQL.

Intrigued? Here we go!

Table of contents

Problem statement

We'll take query optimization as an example of a simple task for any developer. Let's imagine that we have the task of creating a new endpoint for the REST API of our project, which should return:

  1. Data on the requested project by its alias;
  2. Array of all tasks that relate to the requested project in descending order by creation date;
  3. Number of tasks (as a separate response attribute);

Here you can immediately see one quick solution make several queries for each of the models in the database (for the project and for related tasks for that project).

Well, let's look at it in more detail.

Note: I will give all the code samples in Go with Fiber web framework, since this is my main language & framework for backend development at the moment.

Table of contents

A quick solution to the problem

Okay, here is our controller for the endpoint:

// ./app/controllers/project_controller.go// ...// GetProjectByAlias func for getting one project by given alias.func GetProjectByAlias(c *fiber.Ctx) error {    // Catch project alias from URL.    alias := c.Params("alias")    // Create database connection.    db, err := database.OpenDBConnection()    if err != nil {        return err    }    // Get project by ID.    project, err := db.GetProjectByAlias(alias)    if err != nil {        return err    }    // Get all tasks by project ID.    tasks, err := db.GetTasksByProjectID(project.ID)    if err != nil {        return err    }    // Return status 200 OK.    return c.JSON(fiber.Map{        "status":  fiber.StatusOK,        "project": project,        // <-- 1        "tasks_count": len(tasks), // <-- 2        "tasks": tasks,            // <-- 3    })}

As you can see, this controller fully meets the conditions of our task (all three points of the original problem).

It will work?
Yes, of course!
Would such code be optimal?
Probably not

We call alternately the functions GetProjectByAlias and GetTasksByProjectID which creates additional latency and wastes additional resources of both the server API and the PostgreSQL database itself.

It's all because queries in DB most likely look like this:

-- For Project model:SELECT *FROM    projectsWHERE    alias = $1::varcharLIMIT 1-- For Task model:SELECT *FROM    tasksWHERE    project_id = $1::uuidORDER BY    created_at DESC

Since the Go language created for speed and efficient use of server resources, such a waste of resources is simply unacceptable for any self-respecting Go developer.

Let's fix that in the next section.

Table of contents

Optimize this

So, how do we optimize this? Of course, by reducing the number of queries to the database. But then how do we get all the necessary tasks for the project and their number?

This is helped by the wonderful built-in aggregate function jsonb_agg that have appeared in PostgreSQL v9.6 and are constantly being improved from version to version.

Furthermore, we will be using COALESCE function with FILTER condition to correctly handle an empty value when the project may have no tasks. And immediately count the number of tasks through the COUNT function.

Note: See more info about COALESCE here.

SELECT    p.*,    COALESCE(jsonb_agg(t.*) FILTER (WHERE t.project_id IS NOT NULL), '[]') AS tasks,    COUNT(t.id) AS tasks_countFROM    projects AS p    LEFT JOIN tasks AS t ON t.project_id = p.idWHERE    p.alias = $1::varcharGROUP BY     p.idLIMIT 1

It's a little difficult to understand the first time, isn't it? Don't worry, you'll figure it out! Here's an explanation of what's going on here:

  • Output all the data about the found project;
  • We got only one project, which has a unique alias we are looking for;
  • Using the LEFT JOIN function, we only joined the sample of tasks that have a connection to the project by ID;
  • We grouped all the data by project ID;
  • We did an aggregation of all obtained tasks using the aggregation function jsonb_agg, filtering it all by project ID;
  • For projects that have no tasks, we provided a display in the form of an empty list;
  • We used the COUNT function to calculate the number of tasks in the project;

Next, we just need to prepare the output of all the data obtained from the database. Let's add the appropriate structures to the Project and Task models.

A simplified structure with a description of each project task:

// ./app/models/task_model.go// ...// GetProjectTasks struct to describe getting tasks list for given project.type GetProjectTasks struct {    ID          uuid.UUID `db:"id" json:"id"`    Alias       string    `db:"alias" json:"alias"`    Description string    `db:"description" json:"description"`}

And additional structures for the Project model:

// ./app/models/project_model.go// ...// ProjectTasks struct to describe getting list of tasks for a project.type ProjectTasks []*GetProjectTasks // struct from Task model// GetProject struct to describe getting one project.type GetProject struct {    ID            uuid.UUID    `db:"id" json:"id"`    CreatedAt     time.Time    `db:"created_at" json:"created_at"`    UpdatedAt     time.Time    `db:"updated_at" json:"updated_at"`    UserID        uuid.UUID    `db:"user_id" json:"user_id"`    Alias         string       `db:"alias" json:"alias"`    ProjectStatus int          `db:"project_status" json:"project_status"`    ProjectAttrs  ProjectAttrs `db:"project_attrs" json:"project_attrs"`    // Fields for JOIN tables:    TasksCount int          `db:"tasks_count" json:"tasks_count"`    Tasks      ProjectTasks `db:"tasks" json:"tasks"`}

Note: The ProjectTasks type needed to correctly output a list of all the tasks in the project.

Let's fix controller:

// ./app/controllers/project_controller.go// ...// GetProjectByAlias func for getting one project by given alias.func GetProjectByAlias(c *fiber.Ctx) error {    // Catch project alias from URL.    alias := c.Params("alias")    // Create database connection.    db, err := database.OpenDBConnection()    if err != nil {        return err    }    // Get project by ID with tasks.    project, err := db.GetProjectByAlias(alias)    if err != nil {        return err    }    // Return status 200 OK.    return c.JSON(fiber.Map{        "status":  fiber.StatusOK,        "project": project,        // <-- 1, 2, 3    })}

The final optimized query result for our new endpoint should look like this:

{ "status": 200, "project": {  "id": "a5326b7d-eb6c-4d5e-b264-44ee15fb4375",  "created_at": "2021-09-21T19:58:30.939495Z",  "updated_at": "0001-01-01T00:00:00Z",  "user_id": "9b8734f9-05c8-43ac-9cd8-d8bd15230624",  "alias": "dvc08xyufws3uwmn",  "project_status": 1,  "project_attrs": {   "title": "Test title",   "description": "Test description",   "category": "test"  },  "tasks_count": 5,  "tasks": [   {    "id": "26035934-1ea4-42e7-9364-ef47a5b57126",    "alias": "dc3b9d2b6296",    "description": "Task one"   },   // ...  ] }}

That's how gracefully and easily we used all the power of built-in Postgres function and pure SQL to solve a database query optimization problem.

Wow, how great is that?

Table of contents

Photos and videos by

P.S.

If you want more articles like this on this blog, then post a comment below and subscribe to me. Thanks!


Original Link: https://dev.to/koddr/postgresql-query-optimization-for-gophers-it-s-much-easier-than-it-sounds-24nf

Share this article:    Share on Facebook
View Full Article

Dev To

An online community for sharing and discovering great ideas, having debates, and making friends

More About this Source Visit Dev To