Skip to content

SQL queries

In my work, I often need to perform certain data-engineering tasks frequently. One such frequent task is to create one-off database queries( aka sql queries) for data migration purposes. The process is often like this.

  1. Oh, I need to figure out how to write this query as part of my data migration script due to this new feature I need to add.
  2. Google and find the right StackOverflow answer.
  3. Copy, paste, and modify till it works.
  4. Forget how it’s done. Then when I need the query again, repeat step 1-3.

I want to break the cycle of Google-StackOverflow-Copy-Paste-Forget. Looking ahead, I also expect to talk in greater depth about my work involving data engineering work. Hence, I will kill two birds with one stone by writing all these one-off queries in this page.

For the time being, I will simply append the queries to this page as and when I google for them in my daily work.

When the list gets too long, I will further reorganise this page. For now, my cycle is now Google-StackOverflow-Copy-Paste-Append-To-This-Page.

Insert Into on Duplicate Update with Select

Situation: You want to insert a bunch of records into a table using data from other tables. At the same time, when there’s duplication in terms identity values, you want to use update instead of insert.

MySQL

INSERT INTO table1 (val1, val2) SELECT val_a, val_b FROM table2 ON DUPLICATE KEY UPDATE val1=table2.val_a, val2=table2.val_b;

Postgres

Will add this later.

Insert Into on Duplicate Update with Group By Clause

Situation: Similar situation as the case above. You want to insert a bunch of records into a table using data from other tables. At the same time, when there’s duplication in terms identity values, you want to use update instead of insert. The tricky part is that the inserted is an aggregation of data from other tables. This typically requires Group By clause in the select subquery.

MySQL

INSERT INTO table1(val1, val2) SELECT table2sum.val_a, table2sum.val_b FROM (SELECT SUM(val_a) as val_a, val_b from table2 group by table2.val_b) as table2sum ON DUPLICATE KEY UPDATE val1=table2sum.val_a, val2=table2sum.val_b;

Postgres

Will add this later.

Update with Select SUM from another table

Situation: You have a table that is the parent of another child table. You want to sum a field in the child table and update the value to the corresponding parent record in the parent table.

MySQL

UPDATE parent_table pt 
INNER JOIN
(
   SELECT pt_id, SUM(field1) as field1_sum
   FROM child_table 
   GROUP BY pt_id
) i ON pt.id = i.pt_id
SET pt.total_field1 = i.field1_sum
[WHERE predicate];

Postgres

Will add this later.