An Interest In:
Web News this Week
- April 19, 2024
- April 18, 2024
- April 17, 2024
- April 16, 2024
- April 15, 2024
- April 14, 2024
- April 13, 2024
PostgreSQL query optimization for Gophers: It's much easier than it sounds!
Introduction
Hello, amazing DEV people!
Today I will show you a wonderful query optimization technique for Postgres that I often use myself. This approach to optimization can save you from a long and tedious transfer of your project to another technology stack, such as GraphQL.
Intrigued? Here we go!
Table of contents
Problem statement
We'll take query optimization as an example of a simple task for any developer. Let's imagine that we have the task of creating a new endpoint for the REST API of our project, which should return:
- Data on the requested project by its alias;
- Array of all tasks that relate to the requested project in descending order by creation date;
- Number of tasks (as a separate response attribute);
Here you can immediately see one quick solution make several queries for each of the models in the database (for the project and for related tasks for that project).
Well, let's look at it in more detail.
Note: I will give all the code samples in Go with Fiber web framework, since this is my main language & framework for backend development at the moment.
A quick solution to the problem
Okay, here is our controller for the endpoint:
// ./app/controllers/project_controller.go// ...// GetProjectByAlias func for getting one project by given alias.func GetProjectByAlias(c *fiber.Ctx) error { // Catch project alias from URL. alias := c.Params("alias") // Create database connection. db, err := database.OpenDBConnection() if err != nil { return err } // Get project by ID. project, err := db.GetProjectByAlias(alias) if err != nil { return err } // Get all tasks by project ID. tasks, err := db.GetTasksByProjectID(project.ID) if err != nil { return err } // Return status 200 OK. return c.JSON(fiber.Map{ "status": fiber.StatusOK, "project": project, // <-- 1 "tasks_count": len(tasks), // <-- 2 "tasks": tasks, // <-- 3 })}
As you can see, this controller fully meets the conditions of our task (all three points of the original problem).
It will work?
Yes, of course!
Would such code be optimal?
Probably not
We call alternately the functions GetProjectByAlias
and GetTasksByProjectID
which creates additional latency and wastes additional resources of both the server API and the PostgreSQL database itself.
It's all because queries in DB most likely look like this:
-- For Project model:SELECT *FROM projectsWHERE alias = $1::varcharLIMIT 1-- For Task model:SELECT *FROM tasksWHERE project_id = $1::uuidORDER BY created_at DESC
Since the Go language created for speed and efficient use of server resources, such a waste of resources is simply unacceptable for any self-respecting Go developer.
Let's fix that in the next section.
Optimize this
So, how do we optimize this? Of course, by reducing the number of queries to the database. But then how do we get all the necessary tasks for the project and their number?
This is helped by the wonderful built-in aggregate function jsonb_agg that have appeared in PostgreSQL v9.6
and are constantly being improved from version to version.
Furthermore, we will be using COALESCE
function with FILTER
condition to correctly handle an empty value when the project may have no tasks. And immediately count the number of tasks through the COUNT
function.
Note: See more info about
COALESCE
here.
SELECT p.*, COALESCE(jsonb_agg(t.*) FILTER (WHERE t.project_id IS NOT NULL), '[]') AS tasks, COUNT(t.id) AS tasks_countFROM projects AS p LEFT JOIN tasks AS t ON t.project_id = p.idWHERE p.alias = $1::varcharGROUP BY p.idLIMIT 1
It's a little difficult to understand the first time, isn't it? Don't worry, you'll figure it out! Here's an explanation of what's going on here:
- Output all the data about the found project;
- We got only one project, which has a unique alias we are looking for;
- Using the
LEFT JOIN
function, we only joined the sample of tasks that have a connection to the project by ID; - We grouped all the data by project ID;
- We did an aggregation of all obtained tasks using the aggregation function
jsonb_agg
, filtering it all by project ID; - For projects that have no tasks, we provided a display in the form of an empty list;
- We used the
COUNT
function to calculate the number of tasks in the project;
Next, we just need to prepare the output of all the data obtained from the database. Let's add the appropriate structures to the Project
and Task
models.
A simplified structure with a description of each project task:
// ./app/models/task_model.go// ...// GetProjectTasks struct to describe getting tasks list for given project.type GetProjectTasks struct { ID uuid.UUID `db:"id" json:"id"` Alias string `db:"alias" json:"alias"` Description string `db:"description" json:"description"`}
And additional structures for the Project
model:
// ./app/models/project_model.go// ...// ProjectTasks struct to describe getting list of tasks for a project.type ProjectTasks []*GetProjectTasks // struct from Task model// GetProject struct to describe getting one project.type GetProject struct { ID uuid.UUID `db:"id" json:"id"` CreatedAt time.Time `db:"created_at" json:"created_at"` UpdatedAt time.Time `db:"updated_at" json:"updated_at"` UserID uuid.UUID `db:"user_id" json:"user_id"` Alias string `db:"alias" json:"alias"` ProjectStatus int `db:"project_status" json:"project_status"` ProjectAttrs ProjectAttrs `db:"project_attrs" json:"project_attrs"` // Fields for JOIN tables: TasksCount int `db:"tasks_count" json:"tasks_count"` Tasks ProjectTasks `db:"tasks" json:"tasks"`}
Note: The
ProjectTasks
type needed to correctly output a list of all the tasks in the project.
Let's fix controller:
// ./app/controllers/project_controller.go// ...// GetProjectByAlias func for getting one project by given alias.func GetProjectByAlias(c *fiber.Ctx) error { // Catch project alias from URL. alias := c.Params("alias") // Create database connection. db, err := database.OpenDBConnection() if err != nil { return err } // Get project by ID with tasks. project, err := db.GetProjectByAlias(alias) if err != nil { return err } // Return status 200 OK. return c.JSON(fiber.Map{ "status": fiber.StatusOK, "project": project, // <-- 1, 2, 3 })}
The final optimized query result for our new endpoint should look like this:
{ "status": 200, "project": { "id": "a5326b7d-eb6c-4d5e-b264-44ee15fb4375", "created_at": "2021-09-21T19:58:30.939495Z", "updated_at": "0001-01-01T00:00:00Z", "user_id": "9b8734f9-05c8-43ac-9cd8-d8bd15230624", "alias": "dvc08xyufws3uwmn", "project_status": 1, "project_attrs": { "title": "Test title", "description": "Test description", "category": "test" }, "tasks_count": 5, "tasks": [ { "id": "26035934-1ea4-42e7-9364-ef47a5b57126", "alias": "dc3b9d2b6296", "description": "Task one" }, // ... ] }}
That's how gracefully and easily we used all the power of built-in Postgres function and pure SQL to solve a database query optimization problem.
Wow, how great is that?
Photos and videos by
- Hannah Busing https://unsplash.com/photos/Zyx1bK9mqmA
P.S.
If you want more articles like this on this blog, then post a comment below and subscribe to me. Thanks!
Original Link: https://dev.to/koddr/postgresql-query-optimization-for-gophers-it-s-much-easier-than-it-sounds-24nf
Dev To
An online community for sharing and discovering great ideas, having debates, and making friendsMore About this Source Visit Dev To