Top 10000 from every user table in a database.

  • Hello,

    I'm looking to export the top 10K rows of every table in a database to an equivalent table in another database. I could use SSIS with a separate data flow for each table, but there are a lot of tables! Is there a way I can employ the sp_MSForeachtable stored proc (I'm not sure whether that works for Azure SQL DB), or is there a better way?
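
    For what it's worth, something like this rough sketch is what I was imagining; the TargetDb name is just a placeholder, and it assumes both databases sit on the same server so a cross-database INSERT...SELECT works (which, as far as I know, plain Azure SQL DB doesn't allow without elastic query), and that the target tables already exist with matching structures:

    DECLARE @sql nvarchar(max) = N'';

    -- Build one INSERT ... SELECT TOP (10000) per user table
    SELECT @sql += N'INSERT INTO TargetDb.' + QUOTENAME(s.name) + N'.' + QUOTENAME(t.name)
                 + N' SELECT TOP (10000) * FROM ' + QUOTENAME(s.name) + N'.' + QUOTENAME(t.name)
                 + N';' + CHAR(13) + CHAR(10)
    FROM sys.tables AS t
    JOIN sys.schemas AS s ON s.schema_id = t.schema_id;

    PRINT @sql;                       -- review first; PRINT truncates long output
    -- EXEC sys.sp_executesql @sql;   -- run once the statements look right
    -- (identity columns, computed columns, etc. would need extra handling)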

    Thank you.

    Regards,
    D.

  • Actually, you can't really do what you want. You can take 10 rows from this table and 10 rows from that table, but the 10 rows from the first table have a relationship to 400 rows in the second table, so you have to move all 400 or lose data integrity. In short, what you're asking for is extremely difficult. There is no easy way to get it done.
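
    To illustrate with a made-up Orders/OrderLines pair (hypothetical names, not your schema): the top few parents can easily point at hundreds of children, so taking TOP from each table independently breaks the foreign keys:

    -- Hypothetical parent/child tables, purely to show the problem
    SELECT TOP (10) o.OrderID
    INTO #TopOrders
    FROM dbo.Orders AS o
    ORDER BY o.OrderID;

    -- TOP (10) from OrderLines on its own would orphan rows; what you
    -- actually need is every child row for the chosen parents:
    SELECT ol.*
    FROM dbo.OrderLines AS ol
    JOIN #TopOrders AS t
        ON t.OrderID = ol.OrderID;   -- could easily be 400+ rows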

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • OK, thank you, Grant, that is a good point, and thank you for replying. But if I wanted all the data from all the tables (I should mention that there are three different schemas in the database, so a database backup/copy is not really helpful), what should my approach be? I was planning on creating an SSIS package with a DFT for each table; time-consuming, but reusable. Would you have a better method for me to try?

    Regards,
    D.

  • Not really. Getting just a subset of data is quite difficult. There's not really a simple shortcut. I'm assuming you want this for test & development. People usually take a few approaches (none of which are without flaws).

    First (and worst), they back up production and use that (look up GDPR to get an idea of why this is a silly choice).

    Next, back up production, restore it somewhere, clean it up, back that up, and use it in dev & test. It works, but you're moving around a lot of data. A company I used to work for did this and had 5 times the storage in dev & test that we had in prod. A tool like Redgate SQL Clone can help with this (yeah, shameless plug for my company). This approach involves a bunch of work creating & maintaining the clean-up scripts.
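
    The clean-up scripts themselves are usually just UPDATEs over whatever columns hold personal data; a hypothetical example (invented table and column names), run on the restored copy before backing it up again:

    -- Mask the obvious personal data on the restored copy
    UPDATE dbo.Customers
    SET    Email       = CONCAT('customer', CustomerID, '@example.com'),
           PhoneNumber = '555-0100',
           LastName    = CONCAT('Surname', CustomerID);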

    Next, do what you're doing: build a method for extracting some of the production data, hopefully cleaning it along the way. It can be done. It's just a lot of work, and then you have to maintain it as the structures in production change. Oh, and take into account that dev & test structures are also changing, but on a different schedule than production.

    Finally, create test data that's completely fake and use it to load your test & dev environments. This too is work, but it's not as much work as the extraction method. It's also radically safer than any of the other methods. Here too, Redgate can help with Data Generator (zero shame). The work here is multi-fold: creating the data, creating the scripts, and then maintaining it all.
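
    If you roll your own instead of using a tool, the usual trick is to generate rows from a numbers set and randomise the values; a rough sketch with invented table and column names:

    -- Hypothetical generator: 10,000 fake customers from a numbers set
    WITH Numbers AS
    (
        SELECT TOP (10000) ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS n
        FROM sys.all_objects AS a
        CROSS JOIN sys.all_objects AS b
    )
    INSERT INTO dbo.Customers (FirstName, LastName, Email, CreatedDate)
    SELECT CONCAT('First', n),
           CONCAT('Last', n),
           CONCAT('user', n, '@example.com'),
           DATEADD(DAY, -(ABS(CHECKSUM(NEWID())) % 3650), GETDATE())
    FROM Numbers;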

    Best: generate the data, but then you won't be seeing the same distributions as production, testing could miss things, etc. Easiest: back up & clean, but then you might miss important stuff and expose data illegally, and you have to deal with space & time. I've mostly used the easy approach. BTW, SQL Clone will shortly have a data masking element that will help to automate this (shame, what's that?).

    Hope that helps a little.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning
