Your Web News in One Place

Help Webnuz

Referal links:

Sign up for GreenGeeks web hosting
June 15, 2022 11:38 am GMT

A Second Walk Through of Composing a SQL Query

Showing my Work as I Query a Forem Instance

This builds on work from my Walk Through of Using Postgresql and Blazer to Generate a Cohort Report.

The query Ill be building helps answer what percentage of active users commented on at least one welcome article. For this query, an active user is someone whos been on the site 4 of the last 7 days.

Establishing the Entity Relationship Model

Again, when writing SQL, I like to start with a relationship diagram. In this data foray, we have four relevant tables.

class User  has_many :articles  has_many :commentsendclass Article  belongs_to :user  has_many :comments, as: :commentableendclass Comment  belongs_to :commentable, polymorphic: true  belongs_to :userendclass PageView  belongs_to :article  belongs_to :user, optional: trueend

Below is a diagram for those of you who prefer an Entity Relationship Model (ERM) of the four data models.

An ERM of the tables used in these queries.

An Entity Relationship Model of Users, Articles, Comments, and Page Views

Querying All Active Members

I want all users who have page_views on at least 4 of the last seven days; well consider these active users.

First I want to build a very narrow query; one that lets me make sure I know Im on the right path. Ill limit the page views to my user_id:

SELECT user_id, extract(isodow from created_at) AS day_of_weekFROM page_viewsWHERE page_views.user_id = 702612      AND page_views.created_at::date > NOW()::date - INTERVAL '7 day'GROUP BY page_views.user_id, day_of_week

The following query times out; Its trying to query all users.

SELECT dow.user_id,       count(dow.day_of_week) AS number_of_daysFROM (  SELECT user_id,         extract(isodow from created_at) AS day_of_week  FROM page_views  WHERE page_views.created_at::date        > NOW()::date - INTERVAL '7 day'    AND user_id IS NOT NULL  GROUP BY page_views.user_id, day_of_week) AS dowGROUP BY dow.user_idHAVING count(dow.day_of_week) >= 4

Because of the enormity of the page views we need to limit to only recently updated users. The following is the query to get recent users.

SELECT idFROM usersWHERE updated_at::date      > NOW()::date - INTERVAL '7 day'

The following query is the foundational Who are the current active users of Forem.

SELECT DISTINCT dow.user_id FROM (  SELECT users.id AS user_id,    extract(isodow from page_views.created_at) AS day_of_week  FROM users  INNER JOIN page_views    ON page_views.user_id = users.id    AND page_views.created_at::date        > NOW()::date - INTERVAL '7 day'    AND user_id IS NOT NULL  -- Extend the window for users just a bit to account --  -- for timing variance --  WHERE users.updated_at::date        >  NOW()::date - INTERVAL '8 day'  GROUP BY users.id, day_of_week) AS dowGROUP BY dow.user_idHAVING count(dow.day_of_week) >= 4

I saved the above query to https://dev.to/admin/blazer/queries/717-regular-and-active-recent-users-of-dev. We now have our whos the currently active users of DEV.to query.

Querying Active Users Who Have Commented on a Welcome Post

The next part is to work out who all commented on a welcome post. In Walk Through of Using Postgresql and Blazer to Generate a Cohort Report, I wrote about finding the users who had commented on the welcome article.

However, I need to adjust the cohort query; I only want users who commented on the welcome post. The cohort query has users who did and did not comment on the welcome post.

As a quick reminder, the result of the following query is all user_id that commented on a welcome post; but with a limitation on the users updated_at

SELECT DISTINCT comments.user_id AS user_idFROM commentsINNER JOIN users  ON comments.user_id = users.id    -- Extend the window for users just a bit to --    -- account for timing variance --    AND users.updated_at::date        > NOW()::date - INTERVAL '8 day'INNER JOIN articles  ON comments.commentable_id = articles.id    AND comments.commentable_type = 'Article'    AND articles.title LIKE 'Welcome Thread - v%'    AND articles.published = true    AND articles.user_id = 3GROUP BY comments.user_id

Now to meld the two queries. Im using the Postgresql WITH statement to create two queries that I can reference later on. I find the WITH statement to help encapsulate queries and hopefully make them more conceptually understandable.

WITH cow AS (  -- User IDs of recent folks who have commented on the  -- welcome threads --  SELECT DISTINCT comments.user_id AS user_id  FROM comments  INNER JOIN users ON comments.user_id = users.id    -- Extend the window for users just a bit to account --    -- for timing variance --    AND users.updated_at::date > NOW()::date - INTERVAL '8 day'  INNER JOIN articles    ON comments.commentable_id = articles.id    AND comments.commentable_type = 'Article'    AND articles.title LIKE 'Welcome Thread - v%'    AND articles.published = true    AND articles.user_id = 3  GROUP BY comments.user_id), dow AS (  -- User IDs of folks who have interacted at least 4 different  -- days of this week --  SELECT user_id FROM (    SELECT users.id AS user_id,      extract(isodow from page_views.created_at) AS day_of_week    FROM users    INNER JOIN page_views      ON page_views.user_id = users.id      AND page_views.created_at::date >    NOW()::date - INTERVAL '7 day'      AND user_id IS NOT NULL    -- Extend the window for users just a bit to account for    -- timing variance --    WHERE users.updated_at::date >      NOW()::date - INTERVAL '8 day'    GROUP BY users.id, day_of_week  ) AS dows  GROUP BY user_id  HAVING COUNT(day_of_week) >= 4)SELECT COUNT(dow.user_id) AS count_of_users,  (    SELECT COUNT(*)    FROM dow    INNER JOIN cow      ON cow.user_id = dow.user_id  ) AS count_of_users_that_said_helloFROM dow

Conclusion

When I was writing the last query, I kept getting a result that said every active user on the site had commented on a welcome article. I didnt trust that result; it seemed highly improbable. I revisited my queries and logic, found my error, reworked the query and got a more reasonable answer.

What was wrong? I had copied and pasted a query from Walk Through of Using Postgresql and Blazer to Generate a Cohort Report. But that query wasnt the right thing to ask. It does highlight one challenge of SQL; it can be hard to test the correctness of your query.


Original Link: https://dev.to/devteam/a-second-walk-through-of-composing-a-sql-query-561b

Share this article:    Share on Facebook
View Full Article

Dev To

An online community for sharing and discovering great ideas, having debates, and making friends

More About this Source Visit Dev To