Skip to content

SQL queries

In my work, I often need to perform certain data-engineering tasks frequently. One such frequent task is to create one-off database queries( aka sql queries) for data migration purposes. The process is often like this.

  1. Oh, I need to figure out how to write this query as part of my data migration script due to this new feature I need to add.
  2. Google and find the right StackOverflow answer.
  3. Copy, paste, and modify till it works.
  4. Promptly forget how it’s done.
  5. When I need the query again, repeat step 1-3.

I want to break the cycle of Google-StackOverflow-Copy-Paste-Forget. Looking ahead, I also expect to talk in greater depth about my work involving data engineering work. Hence, I want to fulfil both functions with this page.

For the time being, I will simply append the queries to this page as and when I encounter the need for one-off sql queries during my work.

When the list gets too long, I may reorganise this page. For now, my cycle is now Google-StackOverflow-Copy-Paste-Append-To-This-Page.

 

Reset the Auto-increment to Max ID + 1

Situation: The table has auto-increment for the id field which uses int. You want to reset the auto-increment counter so that it carries on from the maximum id field that’s already being used.

MySQL

The issue here is that you need to use stored procedures. So there are basically two steps.

First, create the stored procedure. Second, run the stored procedure.

Step 1: Create Stored Procedure

    DELIMITER //

    CREATE PROCEDURE reset_autoincrement(IN tablename varchar(200))

    BEGIN

     SET @get_next_inc = CONCAT('SELECT @next_inc := max(id) + 1 FROM ',tablename,';');

     PREPARE stmt FROM @get_next_inc; 

     EXECUTE stmt; 

     SELECT @next_inc AS result;

     DEALLOCATE PREPARE stmt; 

         
     SET @alter_statement = CONCAT('ALTER TABLE ', tablename, ' AUTO_INCREMENT = ', @next_inc, ';');

      PREPARE stmt FROM @alter_statement;

      EXECUTE stmt;

      DEALLOCATE PREPARE stmt;

    END //

    DELIMITER ;
Before you run the stored procedure, you shouldn’t be able to find it.
No such stored procedure when search for `reset`
When I run, I simply select the stored procedure and press Run Selection.
Running the Stored Procedure
Note: the delimiters part are crucial.
After I run, I should be able to find the stored procedure.
Now the stored procedure exists
Note: If you want to change the stored procedure, you need to delete the stored procedure, then change it and run again.

Step 2: Call the stored procedure

This time you can simply use normal MySQL queries to call the stored procedure.
call reset_autoincrement('products');

Postgres

-- To be determined

Insert Into on Duplicate Update with Select

Situation: You want to insert a bunch of records into a table using data from other tables. At the same time, when there’s duplication in terms identity values, you want to use update instead of insert.

MySQL

INSERT INTO table1 (val1, val2) SELECT val_a, val_b FROM table2 ON DUPLICATE KEY UPDATE val1=table2.val_a, val2=table2.val_b;

Postgres

Will add this later.

Insert Into on Duplicate Update with Group By Clause

Situation: Similar situation as the case above. You want to insert a bunch of records into a table using data from other tables. At the same time, when there’s duplication in terms identity values, you want to use update instead of insert. The tricky part is that the inserted is an aggregation of data from other tables. This typically requires Group By clause in the select subquery.

MySQL

INSERT INTO table1(val1, val2) SELECT table2sum.val_a, table2sum.val_b FROM (SELECT SUM(val_a) as val_a, val_b from table2 group by table2.val_b) as table2sum ON DUPLICATE KEY UPDATE val1=table2sum.val_a, val2=table2sum.val_b;

Postgres

Will add this later.

Update with Select SUM from another table

Situation: You have a table that is the parent of another child table. You want to sum a field in the child table and update the value to the corresponding parent record in the parent table.

MySQL

UPDATE parent_table pt 
INNER JOIN
(
   SELECT pt_id, COALESCE(SUM(field1),0) as field1_sum
   FROM child_table 
   GROUP BY pt_id
) i ON pt.id = i.pt_id
SET pt.total_field1 = i.field1_sum
[WHERE predicate];

Postgres

UPDATE parent_table pt
SET pt.total_field1 = i.field1_sum
FROM 
(
  SELECT pt_id, COALESCE(SUM(field1),0) as field1_sum
  FROM child_table
  GROUP BY pt_id
) i
WHERE i.pt_id = pt.id [and more WHERE predicate];

Update One Table with Data From Another Related Table

Situation: You want to update the fields for a particular row in the child table with data from the parent table.

MySQL

UPDATE parent_table p 
INNER JOIN 
    child_table c 
ON 
    p.id = c.parent_id 
SET 
    c.name = concat(p.name, ' > Child' 
WHERE c.parent_id = 123;

Postgres

UPDATE child_table AS c
SET 
    name = concat(p.name, ' > Child')
FROM parent AS p
WHERE c.parent_id= p.id and c.parent_id = 123;

Update Two Tables That Are Related

Situation: You want to update the fields for a particular row in both the parent and child table.

MySQL

UPDATE parent_table p
INNER JOIN
    child_table c
ON
    p.id = c.parent_id
SET
    p.parent_name = 'parent',
    c.child_name = 'child'
WHERE p.serial_number = '1234ST';

Postgres

Postgres does not have a similar update-inner-join statement like MySQL.

You can do this in two ways. Do it in 2 separate queries. Or do it in 1 query. The 2 queries way is easier to understand.

2 queries

UPDATE parent_table AS p
SET parent_name = 'parent'
WHERE p.serial_number = '1234ST';


UPDATE child_table AS c
SET child_name = 'child'
FROM parent_table p
WHERE c.parent_id = p.id and p.serial_number = '1234ST';

1 query

The technique below is also known as the writeable common table query but honestly, I find it not as intuitive.

The advantage of this over the 2 queries method is that you only need to state the WHERE clause condition (WHERE serial_number = '1234ST') once.

WITH parent_update as (
  UPDATE parent_table
   SET parent_name = 'parent'
  WHERE serial_number = '1234ST'
  returning id, serial_number
)
UPDATE child_table 
  SET child_name = 'child'
WHERE (parent_id, serial_number) IN (select id, serial_number from parent_update);

Order by Length of Field

Situation: You want to see the records of a table (table1) ordered by the string length of a field (field1) which is a text or char field. You want to do it by descending order too.

Postgres

SELECT * from table1 ORDER BY CHAR_LENGTH(field1) desc;

MySQL

SELECT * from table1 ORDER BY LENGTH(field1) desc;

TRUNCATE and CASCADE

Situation: You want to truncate parent table but it has child table using foreign key reference on it. You want to cascade the truncate.

Postgres

TRUNCATE TABLE parent CASCADE;

MySQL

TBD