
Balloons and Data
Author
Message
Posted Wednesday, August 7, 2013 8:26 PM


SSC-Dedicated


Group: Administrators
Last Login: Today @ 5:38 PM
Points: 31,368, Visits: 15,834
Comments posted to this topic are about the item Balloons and Data






Follow me on Twitter: @way0utwest

Forum Etiquette: How to post data/code on a forum to get the best help
Post #1482114
Posted Thursday, August 8, 2013 8:40 AM


Old Hand


Group: General Forum Members
Last Login: Yesterday @ 7:20 AM
Points: 313, Visits: 1,151
My comment centers on the use of tempdb versus 'work' tables. In the very beginning of our use of SQL Server we wrote almost everything to 'work' tables, like work_db.dbo.policy_list. As we developed more SQL we eventually ran into an issue with the same table name being used in more than one job. Of course this happened in the middle of the night, and trying to figure out why the code said the table didn't exist, when the code had just created it, was frustrating in the early AM hours. Another job had dropped it, since it used the exact same table name and happened to run at the same time that night.

From that point forward we have coded more and more tables as temp tables: #policy_list instead of work_db.dbo.policy_list. So now when we run our monthly load of our warehouse, the tempdb space grows; the rest of the month it isn't used as heavily. Our DBA has asked that we look at writing to more 'work' tables instead of doing so much in tempdb. My question is: how do we avoid the situation that happened in the past? Some 'work' tables are dropped at the end of the SQL that created them, so I can't easily check whether anything else is using the same name.

I'm not a DBA, but from my point of view I would think it would be easier to deal with the growth of one database each month (tempdb) than with multiple databases.
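For anyone unfamiliar with the distinction being discussed, a minimal T-SQL sketch (table and column names here are illustrative, not from the actual system):

```sql
-- A session-scoped temp table: each session gets its own copy,
-- so two jobs using the same name cannot collide.
CREATE TABLE #policy_list (
    policy_id  INT  NOT NULL,
    renewal_dt DATE NOT NULL
);

-- A permanent "work" table: shared by every session, so two jobs
-- that both create and drop work_db.dbo.policy_list can step on
-- each other if their schedules ever overlap.
CREATE TABLE work_db.dbo.policy_list (
    policy_id  INT  NOT NULL,
    renewal_dt DATE NOT NULL
);
```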
Post #1482370
Posted Thursday, August 8, 2013 8:49 AM


SSC-Dedicated


Group: Administrators
Last Login: Today @ 5:38 PM
Points: 31,368, Visits: 15,834
First, you need to change terminology. Worktables are intermediate tables used internally by SQL Server. If you refer to tables you create as worktables, it becomes hard to discuss performance-related issues.

Second, if you need a place to manipulate data, like staging tables for ETL, why not just create real tables? Don't drop them, and don't clear them out, except in the process that uses them. If you can't keep naming straight, then use a GUID in each process to create/drop the table. If you have process 1 using the same table as process 2 and you aren't sure when these processes will run, you are making a big architectural mistake.
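One way the GUID suggestion could be implemented (a sketch only; the table and column names are illustrative):

```sql
-- Build a per-run table name from a GUID so concurrent processes
-- can never collide on the same staging table.
DECLARE @suffix NVARCHAR(36) = REPLACE(CAST(NEWID() AS NVARCHAR(36)), '-', '');
DECLARE @table  NVARCHAR(200) = N'work_db.dbo.policy_list_' + @suffix;

DECLARE @sql NVARCHAR(MAX) =
    N'CREATE TABLE ' + @table +
    N' (policy_id INT NOT NULL, renewal_dt DATE NOT NULL);';
EXEC sp_executesql @sql;

-- ... run the process against @table via dynamic SQL ...

-- Drop the uniquely named table when the run is finished.
SET @sql = N'DROP TABLE ' + @table + N';';
EXEC sp_executesql @sql;
```

The trade-off is that every reference to the table has to go through dynamic SQL, which is why the create-once/truncate pattern discussed later in the thread is often simpler.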







Post #1482379
Posted Thursday, August 8, 2013 9:32 AM


Old Hand


Group: General Forum Members
Last Login: Yesterday @ 7:20 AM
Points: 313, Visits: 1,151
Sorry for saying 'work' table when I mean an intermediate table. Say process A normally runs at 10 PM and creates an intermediate table, work_db.dbo.policy_list; this is a list of policies to renew in the next 90 days, let's say. Process B normally runs at midnight and also creates a table called work_db.dbo.policy_list, but this one holds only current policies. Now suppose process A is delayed, say because it has a file dependency on another job that is running late, so process A starts at 11:58 PM. As its SQL runs it creates its policy_list table at 11:59 PM; then process B starts at midnight, and the first thing it does is drop the policy_list table so it can create it for its own processing. This will cause process A either to fail or to get the wrong results. Neither process uses the policy_list table beyond the SQL that created it. That's why I prefer to just use the temp table #policy_list in both of these processes.

As you are writing the SQL for a process, which may be a thousand lines long, how do you make sure that your intermediate table name does not already exist in another process? Since most people have coded a drop at the end of their SQL for any intermediate tables they used, the table name may not exist in the database when you look. Without knowing every line of code you can't be sure your intermediate table name isn't used somewhere else. At least not yet; I'm trying to come up with something in my spare time to list all tables used by our processes.

My concern is that we are being asked to use intermediate tables instead of temp tables, and I'm afraid we will run into the same situation somewhere down the road. So without some kind of naming convention for intermediate table names, I say use temp tables. Any thoughts on how you could even come up with a naming convention for intermediate table names?
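A rough starting point for the "list all tables used by our processes" idea, assuming the processes live in stored procedures or other stored modules (a sketch; a plain LIKE search like this will miss dynamic SQL and code that lives only in job steps or external scripts):

```sql
-- Find every stored module whose definition mentions a given
-- intermediate table name, even when the table itself is dropped
-- at the end of each run and so doesn't currently exist.
SELECT
    OBJECT_SCHEMA_NAME(m.object_id) AS schema_name,
    OBJECT_NAME(m.object_id)        AS module_name
FROM sys.sql_modules AS m
WHERE m.definition LIKE '%policy_list%';
```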
Post #1482413
Posted Thursday, August 8, 2013 12:34 PM


SSC-Dedicated


Group: Administrators
Last Login: Today @ 5:38 PM
Points: 31,368, Visits: 15,834
You shouldn't have that many processes working in the same area without knowing what is being done with each. Even on large development teams, 20 people, I haven't seen naming collisions get beyond test. We learn that the same name is being used and someone changes theirs.

This is team communication. Either you have a failing there, or an architectural failing that allows the same table to be used for two things.

If you need a permanent table that you use for temporary storage, then you create a name. The name doesn't matter, but if you create it once and only truncate or clear it, you easily know it is there. If you are dropping tables, then there's no reason not to use a tempdb temp table.
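The create-once, truncate-per-run pattern described above might look like this (a sketch; all names and the renewal query are illustrative):

```sql
-- Created once, owned by exactly one process; never dropped.
IF OBJECT_ID('work_db.dbo.stage_policy_renewals') IS NULL
    CREATE TABLE work_db.dbo.stage_policy_renewals (
        policy_id  INT  NOT NULL,
        renewal_dt DATE NOT NULL
    );

-- Each run clears the table instead of dropping it, so the name
-- is always visible in the database and any second process trying
-- to claim it is caught at design time, not at 2 AM.
TRUNCATE TABLE work_db.dbo.stage_policy_renewals;

INSERT INTO work_db.dbo.stage_policy_renewals (policy_id, renewal_dt)
SELECT policy_id, renewal_dt
FROM dbo.policies
WHERE renewal_dt <= DATEADD(DAY, 90, GETDATE());
```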







Post #1482518
Posted Thursday, August 8, 2013 1:00 PM


Old Hand


Group: General Forum Members
Last Login: Friday, December 12, 2014 1:37 PM
Points: 351, Visits: 316
Steve Jones - SSC Editor (8/8/2013)

if you need a place to manipulate data, like staging tables for ETL, why not just create real tables? Don't drop them, and don't clear them out, except in the process that uses them. If you can't keep naming straight, then use a GUID in each process to create/drop the table. If you have process 1 using the same table as process 2 and you aren't sure when these processes will run, you are making a big architectural mistake.

Steve - This is so true. Use of a permanent staging table has always improved the reliability of any well-written ETL process for me. I used to insert into #temp tables all the time; the difference in performance is amazing. I have heard that you should make these staging tables heaps, but that has not been the better way for me. What is your opinion on that?
Post #1482535
Posted Thursday, August 8, 2013 1:02 PM


Old Hand


Group: General Forum Members
Last Login: Friday, December 12, 2014 1:37 PM
Points: 351, Visits: 316
What is your opinion on how many tempdb files you should have?
I know that it is best to have tempdb on its own LUN, separate from all others.
I have also read that you should create at least one tempdb data file per CPU core.
What is your opinion on that?
Post #1482537
Posted Thursday, August 8, 2013 1:07 PM


SSC-Dedicated


Group: Administrators
Last Login: Today @ 5:38 PM
Points: 31,368, Visits: 15,834
Read this: http://www.sqlskills.com/blogs/paul/a-sql-server-dba-myth-a-day-1230-tempdb-should-always-have-one-data-file-per-processor-core/

In general, here's what I'd say: a separate LUN if possible, but it's not critical. Go with one data file per core for <= 8 cores. After that, stick with 8 files unless you can test.
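Adding tempdb data files to follow this guidance can be done with ALTER DATABASE (a sketch; the file name, path, and sizes are illustrative, and keeping all files the same size and autogrowth is the usual recommendation so allocations spread evenly):

```sql
-- Add a data file so tempdb approaches one file per core, up to 8.
ALTER DATABASE tempdb
ADD FILE (
    NAME = tempdev2,
    FILENAME = 'T:\tempdb\tempdev2.ndf',
    SIZE = 4096MB,
    FILEGROWTH = 512MB
);
```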







Post #1482543
Posted Thursday, August 8, 2013 1:17 PM


Old Hand


Group: General Forum Members
Last Login: Friday, December 12, 2014 1:37 PM
Points: 351, Visits: 316
Steve Jones - SSC Editor (8/8/2013)
Read this: http://www.sqlskills.com/blogs/paul/a-sql-server-dba-myth-a-day-1230-tempdb-should-always-have-one-data-file-per-processor-core/

In general, here's what I'd say: a separate LUN if possible, but it's not critical. Go with one data file per core for <= 8 cores. After that, stick with 8 files unless you can test.

LOL, Steve - that was the article I was thinking about when I said "I have read"...
It is also the one I pointed to when I started at my current place of work and they asked why.

Doing this has also allowed us to rebuild indexes on Standard Edition servers without taking our applications offline.

They have stopped asking why.

Paul is the go-to master for SQL deep dives - and water ones too.
Post #1482548
Posted Thursday, August 8, 2013 3:53 PM
SSCrazy


Group: General Forum Members
Last Login: Today @ 12:50 PM
Points: 2,500, Visits: 1,586
Hi Steve, years back I instituted the use of an Operational Data Store or, as some say, a staging area. The initial thought was to use it as a clearing house to build data flows from an XML-based web service network, but it grew. Once the data folks knew there was a place where ETL could be done, with the expectation that you could keep the intermediate tables if you chose, it grew further. We have had to institute some controls on who, what, and when so as not to overload the process, but some in the organization are now using it as a main part of their data strategy.

Why not keep what is reasonable to keep? Much of it could be done in temp tables, but this works nicely.



Not all gray hairs are Dinosaurs!
Post #1482594