How do I join two temporary tables using MySQL?

  • Hi all,

    I'm still pretty new to SQL and could use some help getting this query to work. The ultimate goal is to join two temporary tables.

    In the first temporary table, I pulled all of the distinct rows from a table called 'Table A' and used a JOIN clause to combine it with a table called 'Table_B' so that I could apply a condition - in this case, I only wanted the most recently submitted records.

    This is the query I used to accomplish this:

    CREATE TEMPORARY TABLE temp_A as 

    SELECT DISTINCT a.ID,
    a.Year,
    a.Variable_1,
    a.Variable_2,
    a.Variable_3,
    a.Variable_4
    FROM db_prod.Table_A a
    INNER JOIN
    (SELECT * FROM db_prod.Table_B
    WHERE isMostRecentSubmission = True) AS b
    ON a.ID = b.ID;

    This is what the output table looks like:

    ID       Year      Var_1      Var_2      Var_3      Var_4         
    1001 2019 X 100 1 2
    1002 2019 Y 101 3 1
    1001 2020 Z 102 1 4
    1003 2020 Y 103 2 4
    1002 2021 Z 101 4 3

    For the second temp table, if unique combos for 'ID', 'Var_1', and 'Var_2' appear in multiple years in the temporary table 'temp_A', I only kept the row with the most recent year and dropped the previous years.

    Here is the query:

    CREATE TEMPORARY TABLE temp_B as

    SELECT ID, Var_1, Var_2
    max(Year) as 'Max_Year'
    FROM temp_B
    GROUP BY ID, Var_1, Var_2;

    This is what the output table looks like:

    ID      Var_1      Var_2      Max_Year
    1002 Y 101 2019
    1003 Y 103 2020
    1002 Z 101 2021

    The ultimate goal is to use the 'temp_A' table to bring in the missing rows: ('Var_3' & 'Var_4') and join them back onto the 'temp_B' table.

    So I want the final output table to look like this:

    Max_Year     ID      Var_1      Var_2     Var_3     Var_4
    2019 1002 Y 101 3 1
    2020 1003 Y 103 2 4
    2021 1002 Z 101 4 3

    I'm pretty sure it's simple, I just can't wrap my head around how to do it. Let me know if I didn't explain something properly or you need additional context. Thanks!

    NOTE: The more efficient the query the better, as the 'temp_A' table has over 280K records and the 'temp_B' table has over 175K observations and I've been having issues with the server losing connection.

  • Thanks for posting your issue and hopefully someone will answer soon.

    This is an automated bump to increase visibility of your question.

  • First, I think you would likely get better support on a MySQL forum than a SQL Server forum.

    Second, my understanding of MySQL is that it keeps the temporary tables in the same way that SQL server does.  A quick google confirmed this  - "MySQL removes the temporary table automatically when the session ends or the connection is terminated".  So you could just join temp_A on temp_B on the ID column.

    Now, for efficiency, I have a feeling that is more likely related to indexes as you don't have a lot of rows.  I have tables with hundreds of millions of rows and even that isn't really that large of data and I haven't had timeout issues.

    I expect your timeout issues (server losing connection) is a configurable option in MySQL either in the connection string or on the server itself.

    But to JOIN them, I expect it would just be a simple "SELECT <columns> FROM temp_A JOIN temp_B ON temp_A.ID = temp_B.ID", but you may need to specify the join type.  I do not know MySQL well (or really at all), so I would recommend looking up the syntax if you are not certain.

    The above is all just my opinion on what you should do. 
    As with all advice you find on a random internet forum - you shouldn't blindly follow it.  Always test on a test server to see if there is negative side effects before making changes to live!

Viewing 3 posts - 1 through 3 (of 3 total)

You must be logged in to reply to this topic. Login to reply