Creating ASDW tables

  • Comments posted to this topic are about the item Creating ASDW tables

  • Learned something, thanks Steve, great question.

    ...

  • Nice question, thanks Steve

    ____________________________________________
    Space, the final frontier? not any more...
    All limits henceforth are self-imposed.
    β€œlibera tute vulgaris ex”

  • Easy to google, found the answer in the first article I checked πŸ˜›

  • Unfortunately the question is out of date. REPLICATE is also an option now.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • GilaMonster - Friday, August 4, 2017 2:17 AM

    Unfortunately the question is out of date. REPLICATE is also an option now.

    That was my first reaction too.  But it could be argued that although REPLICATED is an option for the DISTRIBUTION parameter, the resulting table is a replicated table, not a distributed table, and perhaps that was why Steve didn't include that option.

    Tom

  • GilaMonster - Friday, August 4, 2017 2:17 AM

    Unfortunately the question is out of date. REPLICATE is also an option now.

    Agreed, per https://docs.microsoft.com/en-us/sql/t-sql/statements/create-table-azure-sql-data-warehouse which shows the following:
            DISTRIBUTION = HASH ( distribution_column_name )
          | DISTRIBUTION = ROUND_ROBIN -- default for SQL Data Warehouse
          | DISTRIBUTION = REPLICATE -- default for Parallel Data Warehouse

  • TomThomson - Friday, August 4, 2017 5:04 AM

    GilaMonster - Friday, August 4, 2017 2:17 AM

    Unfortunately the question is out of date. REPLICATE is also an option now.

    That was my first reaction too.  But it could be argued that although REPLICATED is an option for the DISTRIBUTION parameter, the resulting table is a replicated table, not a distributed table, and perhaps that was why Steve didn't include that option.

    It's a distributed table. All tables in SQLDW are distributed tables, the DISTRIBUTION parameter defines which of the options for distribution are to be used for that table.

    The option was only added a couple of weeks ago, the question likely didn't include it because it was written before REPLICATED became an option on Azure (it's been an option on the PDW since the first implementation, PDW doesn't have ROUND_ROBIN)

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Having never used SQL Azure, I had to research it and learned something new.  Thanks.

  • Ed Wagner - Friday, August 4, 2017 6:29 AM

    Having never used SQL Azure, I had to research it and learned something new.  Thanks.

    This isn't SQL Azure. This is specifically for the DW product, the MPP-architecture data warehouse.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Terje Hermanseter - Friday, August 4, 2017 1:06 AM

    Easy to google, found the answer in the first article I checked πŸ˜›

    Question related to ASDW, not to the PDW, so it was according to the documentation easy. πŸ˜‰ But in the answers I saw some fantastic parameters, such as eg. INTERLEAVED,
    which led to the fact that I've googled this interesting MSDN blog: Introducing Interleaved Execution for Multi-Statement Table Valued Functions

  • Easy for someone who creates several ASDW tables every week... πŸ˜‰

    Thanks, Steve!

  • Been working a lot with Azure Data Warehouse, so knew this one pretty easy.

    I surely choose ROUND_ROBIN a lot for a number of tables where I just don't have a good candidate for a distribution key. This is where I can't find a field of data that can evenly be distributed across all 60 databases behind the scenes where I can minimize data movement and reduce the skew.

    There are some interesting ways to get around default values though. Having a lot of fun exploring the methodologies here.

    P.S

    Statistics are your friend if you op for ROUND_ROBIN.

Viewing 13 posts - 1 through 12 (of 12 total)

You must be logged in to reply to this topic. Login to reply