Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 
        
Home       Members    Calendar    Who's On


Add to briefcase ««12

Multiple backup at the same time Expand / Collapse
Author
Message
Posted Sunday, March 16, 2014 1:40 PM


SSC-Dedicated

SSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-Dedicated

Group: General Forum Members
Last Login: Today @ 5:58 AM
Points: 36,755, Visits: 31,211
obtllc (3/15/2014)
If I have 30 players, it will simply be 30/5 = 6. Simple.

The teams are grouped as follows:
Best player (largest db) - group 1
Second best (second largest db) - group 2
Third best (third largest db) – group 3
Fourth best (fourth largest db) – group 4
Fifth best (fifth largest db) – group 5
Sixth best (sixth largest db) group 6
Seventh best (seventh largest db) group 1
Etc, etc.


That will mean that Group 1 will always take the longest and you lose out on the benefit of having multiple backups running at the same time because Groups 2 thru 6 will finish before Group 1, Groups 3 thru 6 will finish before Group 2, etc, etc.

As a simple example, lets say that you had just 6 databases and they all varied in size by just 1. According to your algorithm and with 3 simultaneous backups running, you would have the following...

Size Group
6 1
5 2
4 3
3 1
2 2
1 3

If we aggregate the sizes for each group, we get...

Size Group
6+3 = 9 1
5+2 = 7 2
4+1 = 5 3

If we said that each unit of value were 15 minutes worth of backup time, that means that Group 1 would finish in 2:15 hours. Group 2 would finish in 1:45 and sit idle for 0:30 compared to Group 1. Group 3 would finish in 1:15 and sit idle for 1:00 compared to group 1.

What you want to happen is the following where all of the groups have a balance load and finish as quickly as possible (1:45 for all groups instead of 2:15 for the longest group):

Size Group
6+1 = 7 1
5+2 = 7 2
4+3 = 7 3

That's also known as a "Load Balancing" or "Bin Stacking" problem and it requires a loop of some sort to do it's job.

Here's a simple bit of code for the example above.

DROP TABLE #MyHead,#Accumulator

DECLARE @DBCount INT
,@Backups INT
;
SELECT @Backups = 3 --Number of simultaneous backups
;
--===== Get the database names and sizes and remember them in descending size order.
-- Obviously, this is just test data. You'd have to change this to look at
-- something like sys.master_files.
SELECT N = IDENTITY(INT,1,1), GB = Number, DBName = 'Database'+RIGHT(Number+1000,3)
INTO #MyHead
FROM master.dbo.spt_Values t
WHERE t.Number BETWEEN 1 AND 6
AND t.Type = 'P'
ORDER BY GB DESC --Sorting by descending size is critical for this to work.
;
--===== Remember the number of databases to backup
SELECT @DBCount = @@ROWCOUNT
;
--===== Create our backup control bins
SELECT TOP (@Backups)
Bin = IDENTITY(INT,1,1)
,GB = 0
,DBNames = CAST(NULL AS VARCHAR(MAX))
INTO #Accumulator
FROM master.sys.all_columns ac1
;
--===== Presets for the loop
DECLARE @Counter INT;
SELECT @Counter = 1;

--===== Stack the bins so "weigh" as the same as closely as possible by
-- always assigning the current database, which has been sorted by
-- size in descending order, to the bin with the least amount of
-- work to do.
WHILE @Counter <= @DBCount
BEGIN
--===== This takes whatever the current database size is, finds the bin
-- with the least amount of work to do (bin number breaks ties),
-- and assigns the database name to that bin in a CSV.
UPDATE a
SET a.GB = a.GB + mh.GB
,a.DBNames = ISNULL(a.DBNames +',','') + mh.DBName
FROM #MyHead mh
CROSS JOIN #Accumulator a
WHERE mh.N = @Counter
AND a.Bin IN (SELECT TOP 1 a1.Bin FROM #Accumulator a1 ORDER BY a1.GB ASC, a1.Bin ASC)
;
--===== I left this here just so you can see the bins fill in order.
-- Take this out for production
SELECT * FROM #Accumulator ORDER BY Bin
;
--===== Bump the counter
SET @Counter = @Counter+1
;
END
;
--===== This displays the worklist by bin and process order (ItemNumber) and could be converted to
-- dynamic SQL backup commands to power just the number of jobs that you want to have running.
SELECT a.Bin, TotalBinSize = a.GB, ca.ItemNumber, ca.Item
FROM #Accumulator a
CROSS APPLY dbo.DelimitedSplit8K(a.DBNames,',')ca
;

Here's the output for above. I manually added a space between backup jobs just for clarity sake.

Bin TotalBinSize ItemNumber Item
--- ------------ ---------- -----------
1 7 1 Database006
1 7 2 Database001
2 7 1 Database005
2 7 2 Database002
3 7 1 Database004
3 7 2 Database003

(6 row(s) affected)


This system works well for when you have a database or two that might be quite a bit larger than the rest. For example, if Database001 had a size of 10, we'd get the following balancing act because the "system" wouldn't give it any more to do because of it's size and would try to balance the load as evenly as possible.

Bin TotalBinSize ItemNumber Item
--- ------------ ---------- -----------
1 10 1 Database001
2 11 1 Database006
2 11 2 Database003
2 11 3 Database002
3 9 1 Database005
3 9 2 Database004

(6 row(s) affected)



In this case, it does make a bit of a bad guess because the combination of 6+4 and 5+3+2 would be better combinations to make them all come out to 10 and would require another "pass" to resolve that, but it's a whole lot better than 10+4, 6+3, 5+2 or 14, 9, 7.


--Jeff Moden
"RBAR is pronounced "ree-bar" and is a "Modenism" for "Row-By-Agonizing-Row".

First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column."

(play on words) "Just because you CAN do something in T-SQL, doesn't mean you SHOULDN'T." --22 Aug 2013

Helpful Links:
How to post code problems
How to post performance problems
Post #1551592
Posted Tuesday, March 18, 2014 11:34 AM
Ten Centuries

Ten CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen Centuries

Group: General Forum Members
Last Login: Friday, July 25, 2014 2:06 PM
Points: 1,194, Visits: 2,210
obtllc (3/15/2014)

The teams are grouped as follows:
Best player (largest db) - group 1
Second best (second largest db) - group 2
Third best (third largest db) – group 3
Fourth best (fourth largest db) – group 4
Fifth best (fifth largest db) – group 5
Sixth best (sixth largest db) group 6
Seventh best (seventh largest db) group 1
Etc, etc.

Two of the groups will end up with seven players (seven databases). The seventh players will be low ranking players (small databases) and may not have a huge impact on the team.



Did you think of a scenario when these Groups overlap with one another ? In such case you will have more than the required number of parallel backups. How do you overcome this ?

--
SQLBuddy


Post #1552337
« Prev Topic | Next Topic »

Add to briefcase ««12

Permissions Expand / Collapse