How do I join two temporary tables using MySQL?

  • Hi all,

    I'm still pretty new to SQL and could use some help getting this query to work. The ultimate goal is to join two temporary tables.

    In the first temporary table, I pulled all of the distinct rows from a table called 'Table A' and used a JOIN clause to combine it with a table called 'Table_B' so that I could apply a condition - in this case, I only wanted the most recently submitted records.

    This is the query I used to accomplish this:

    CREATE TEMPORARY TABLE temp_A as 

    SELECT DISTINCT a.ID,
    a.Year,
    a.Variable_1,
    a.Variable_2,
    a.Variable_3,
    a.Variable_4
    FROM db_prod.Table_A a
    INNER JOIN
    (SELECT * FROM db_prod.Table_B
    WHERE isMostRecentSubmission = True) AS b
    ON a.ID = b.ID;

    This is what the output table looks like:

    ID       Year      Var_1      Var_2      Var_3      Var_4         
    1001 2019 X 100 1 2
    1002 2019 Y 101 3 1
    1001 2020 Z 102 1 4
    1003 2020 Y 103 2 4
    1002 2021 Z 101 4 3

    For the second temp table, if unique combos for 'ID', 'Var_1', and 'Var_2' appear in multiple years in the temporary table 'temp_A', I only kept the row with the most recent year and dropped the previous years.

    Here is the query:

    CREATE TEMPORARY TABLE temp_B as

    SELECT ID, Var_1, Var_2
    max(Year) as 'Max_Year'
    FROM temp_B
    GROUP BY ID, Var_1, Var_2;

    This is what the output table looks like:

    ID      Var_1      Var_2      Max_Year
    1002 Y 101 2019
    1003 Y 103 2020
    1002 Z 101 2021

    The ultimate goal is to use the 'temp_A' table to bring in the missing rows: ('Var_3' & 'Var_4') and join them back onto the 'temp_B' table.

    So I want the final output table to look like this:

    Max_Year     ID      Var_1      Var_2     Var_3     Var_4
    2019 1002 Y 101 3 1
    2020 1003 Y 103 2 4
    2021 1002 Z 101 4 3

    I'm pretty sure it's simple, I just can't wrap my head around how to do it. Let me know if I didn't explain something properly or you need additional context. Thanks!

    NOTE: The more efficient the query the better, as the 'temp_A' table has over 280K records and the 'temp_B' table has over 175K observations and I've been having issues with the server losing connection.

  • Thanks for posting your issue and hopefully someone will answer soon.

    This is an automated bump to increase visibility of your question.

  • First, I think you would likely get better support on a MySQL forum than a SQL Server forum.

    Second, my understanding of MySQL is that it keeps the temporary tables in the same way that SQL server does.  A quick google confirmed this  - "MySQL removes the temporary table automatically when the session ends or the connection is terminated".  So you could just join temp_A on temp_B on the ID column.

    Now, for efficiency, I have a feeling that is more likely related to indexes as you don't have a lot of rows.  I have tables with hundreds of millions of rows and even that isn't really that large of data and I haven't had timeout issues.

    I expect your timeout issues (server losing connection) is a configurable option in MySQL either in the connection string or on the server itself.

    But to JOIN them, I expect it would just be a simple "SELECT <columns> FROM temp_A JOIN temp_B ON temp_A.ID = temp_B.ID", but you may need to specify the join type.  I do not know MySQL well (or really at all), so I would recommend looking up the syntax if you are not certain.

    The above is all just my opinion on what you should do. 
    As with all advice you find on a random internet forum - you shouldn't blindly follow it.  Always test on a test server to see if there is negative side effects before making changes to live!
    I recommend you NEVER run "random code" you found online on any system you care about UNLESS you understand and can verify the code OR you don't care if the code trashes your system.

Viewing 3 posts - 1 through 2 (of 2 total)

You must be logged in to reply to this topic. Login to reply