July 26, 2010 at 4:26 pm
Hi gurus !
First, download the attached ZIP file.
Save the data.txt file to your C:\ drive (or wherever).
Review the SQLChallenge.xls example.
The setup code:
CREATE TABLE [dbo].[DIM_SQLChallengeData](
[Date] [int] NOT NULL,
[Symbol] [varchar](5) NOT NULL,
[Value] [int] NOT NULL
) ON [PRIMARY]
GO
BULK
INSERT dbo.DIM_SQLChallengeData
FROM 'C:\data.txt'
WITH
(
FIRSTROW = 2,
FIELDTERMINATOR = ',',
ROWTERMINATOR = '\n'
)
GO
I need to perform the calculations server-side and send the results to the client.
GOAL: To have a stored procedure (or two separate procs) that produces the data series highlighted in the YELLOW boxes in the attached spreadsheet:
McC A-D
A-D SUMMATION
OSC INDEX
The output for each of the calculations would look like this:
Date, Value
NOTE: I am not after two series as output; it's either one or the other, so one series output at a time.
This allows me to chart each series.
Issues
1) The data is stored the same way as in the DIM_SQLChallengeData table, and for some of the calculations I must match the current row with the previous row by date.
2) Each data row must be matched up by date.
Any kind soul out there to guide me on this beast?
Also, it must be fast...
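To make the "previous row by date" requirement concrete, here is a minimal Python sketch (illustration only, with invented sample values; the row layout of Date, Symbol, Value matches the loader above): pair the Adv and Dec rows by date, then walk the dates in order so each row can see the previous row's value.

```python
from collections import defaultdict

# Invented sample rows in the (Date, Symbol, Value) shape the loader imports.
rows = [
    (20100701, "Adv", 2000), (20100701, "Dec", 1000),
    (20100702, "Adv", 1500), (20100702, "Dec", 1800),
]

# Pair Adv and Dec by date (the SQL equivalent is a self-join on Date).
by_date = defaultdict(dict)
for date, symbol, value in rows:
    by_date[date][symbol] = value

paired = [(d, v["Adv"], v["Dec"]) for d, v in sorted(by_date.items())]

# Walk dates in order; each row can reference the previous row's net value.
nets = []
for date, adv, dec in paired:
    prev_net = nets[-1][1] if nets else None   # previous row's A-D, by date order
    nets.append((date, adv - dec, prev_net))
```

The point is that "previous row" is only meaningful once the rows are in date order, which is exactly what the solution below enforces with a clustered index.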
July 26, 2010 at 8:00 pm
First of all, very well done on the data and the loader code you provided. You just can't ask for much better than that.
My only questions would be...
1. The .xls file appears to have only the "ADV" data. What do you want done for the calculation of the "DEC" data? Or is that what you meant by "separate series"?
2. Does the "DEC" data also have starting values? If so, are they the same as the "ADV" or different? If different, what are they?
--Jeff Moden
Change is inevitable... Change for the better is not.
July 26, 2010 at 8:09 pm
Actually, I do have another question...
You seem to have very neatly solved the problem in the .xls file. Your goal is (in your words) to chart the output. Why do any of this in SQL Server? Why not just do it all in Excel?
--Jeff Moden
Change is inevitable... Change for the better is not.
July 26, 2010 at 8:36 pm
Thanks for a response..
.." The .xls file appears to have only the "ADV" data. "
Sorry, more info for you to clear things up...in the XLS
Field B is ADV: the number of securities advancing on the New York Stock Exchange that day (that is, stocks whose value went up). In Excel I had 'Adc' by mistake.
Field C is DEC: the number of securities declining on the New York Stock Exchange that day (that is, stocks whose value went down).
.."What do you want done for calculation for the "DEC" data? Or is that what you meant by "separate series"..
The DEC data in the XLS is defined above. The data in B and C is used in the calculation.
2. Does the "DEC" data also have starting values? If so, are they the same as the "ADV" or different?
The only data that has a starting value is the data in field E. YES, the data in B and C is different; see above for more.
You seem to have very neatly solved the problem in the .xls file. Your goal is (in your words) to chart the output. Why do any of this in SQL Server? Why not just do it all in Excel?
Yes, the xls calculation is correct. I need the T-SQL calculation because I have an MSSQL application that I would like this formula in, for a charting tool I have. I need to calculate it server-side rather than client-side. There is a method to my madness.
So the output format:
Say the stored proc has an input variable called '@SelectOption AS INT'.
If @SelectOption = 0, the output would be field G from the xls sheet as a table of Date, Value.
If @SelectOption = 1, the output would be field H from the xls sheet as a table of Date, Value.
Please ask if you need more info..:-)
July 26, 2010 at 9:04 pm
Understood on ALL that, thanks.
Do you want the "setup values" to be hard coded or should they be stored in a "setup" table somewhere?
--Jeff Moden
Change is inevitable... Change for the better is not.
July 26, 2010 at 9:12 pm
.."Do you want the "setup values" to be hard coded or should they be stored in a "setup" table somewhere? '..
The result set should be
SELECT Date,Value FROM @temp
NO need to store output...I will cache it at client end for heavy use.
July 26, 2010 at 10:20 pm
My recommendation is to save the data somewhere so you don't need to recalculate it over and over. But... I'll let you figure that out (it's simple). Here's the hard part and it runs in the proverbial "blink of an eye"...
--===== Declare a possible input parameter
DECLARE @pSelectOption TINYINT
;
SELECT @pSelectOption = 1 --0 for Osc, 1 for Summation
;
--DROP TABLE #TheBeast;
--===== Copy the data from the immutable source table to a place where we can work on the problem.
SELECT ISNULL(adv.Date,0) AS Date, --IsNull makes a NOT NULL column here
adv.Value AS Advance,
[dec].Value AS Decline, --brackets: DEC can collide with the DECIMAL synonym
CAST(NULL AS DECIMAL(28,15)) AS Osc,
CAST(NULL AS DECIMAL(28,15)) AS Summation
INTO #TheBeast
FROM dbo.DIM_SQLChallengeData adv
INNER JOIN dbo.DIM_SQLChallengeData [dec]
ON adv.Date = [dec].Date
AND adv.Symbol = 'Adv'
AND [dec].Symbol = 'Dec'
;
--===== Add the quintessential Clustered index to drive the order of the quirky update.
ALTER TABLE #TheBeast
ADD PRIMARY KEY CLUSTERED (Date) WITH FILLFACTOR = 100
;
--===== Declare the variables necessary to do the calculations in the quirky update.
DECLARE @AD DECIMAL(28,15),
@AD10 DECIMAL(28,15),
@AD05 DECIMAL(28,15),
@PrevSum DECIMAL(28,15),
@Anchor INT
;
--===== Preset those variables to the required starting conditions.
SELECT @AD10 = 121.85,
@AD05 = 0.00,
@PrevSum = 0.00
;
--===== Do the quirky update to include previous row values in order by the clustered index
-- whilst following all the rules to make this a success.
UPDATE tb
SET @AD = tb.Advance - tb.Decline,
@AD10 = (0.10*@AD)+(0.90*@AD10), --Prev @AD10 here
@AD05 = (0.05*@AD)+(0.95*@AD05), --Prev @AD05 here
Osc = @AD10 - @AD05,
@PrevSum = tb.Summation = @PrevSum + (@AD10 - @AD05), -- New Osc Used
@Anchor = tb.Date
FROM #TheBeast tb WITH (TABLOCKX)
OPTION (MAXDOP 1)
;
--==== Now, select the correct column to display and format the output
-- for just two decimal places. Couldn't do it before because it
-- would have caused rounding errors.
SELECT Date,
CASE @pSelectOption
WHEN 0 THEN CAST(Osc AS DECIMAL(9,2))
WHEN 1 THEN CAST(Summation AS DECIMAL(9,2))
ELSE NULL
END AS [Value]
FROM #TheBeast
ORDER BY Date
;
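For anyone who wants to sanity-check the quirky update's arithmetic outside SQL Server, here is a rough Python equivalent of the same recurrences. The constants (0.10/0.90, 0.05/0.95, and the 121.85 seed) come straight from the T-SQL above; the sample advance/decline rows are invented purely for illustration.

```python
# Invented sample rows: (Date, Advance, Decline), already in date order.
rows = [(20100701, 2000, 1000),
        (20100702, 1500, 1800),
        (20100705, 1900, 1100)]

ad10, ad05, summation = 121.85, 0.0, 0.0   # same starting values as the T-SQL
results = []
for date, advance, decline in rows:   # date order, like the clustered index
    ad = advance - decline            # @AD   = Advance - Decline
    ad10 = 0.10 * ad + 0.90 * ad10    # @AD10 = 10% exponential smoothing
    ad05 = 0.05 * ad + 0.95 * ad05    # @AD05 = 5% exponential smoothing
    osc = ad10 - ad05                 # Osc
    summation += osc                  # Summation = running total of Osc
    results.append((date, osc, summation))
```

Each loop iteration plays the role of one row in the UPDATE: the variables carry the previous row's values forward, which is exactly what the quirky update's variable assignments do.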
--Jeff Moden
Change is inevitable... Change for the better is not.
July 26, 2010 at 10:59 pm
The test data you use has DATE formatted like YYYYMMDD and stored as an INT.
My real data has DATE as a normal MSSQL DATETIME field (like 23-07-2010 00:00:00, i.e. no hrs or mins etc.).
I guess it is no big deal for me to change over from the INT date format to the DATETIME date.
If not, please advise.
Thanks for your efforts...:-):-):-)
July 26, 2010 at 11:14 pm
Thanks got it working...
Cheers dude !
July 27, 2010 at 6:22 am
Because the test data used the ISO date format of YYYYMMDD, the dates were sortable. Since the only thing we're doing with the dates is joining and sorting, that made it OK for them to be INTs, but I'm glad your real data is in the DATETIME format. It's just the right thing to do for future requirements if any arise.
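The sortability point can be demonstrated in a couple of lines (sample dates invented): because a YYYYMMDD integer puts the year in the most significant digits, then the month, then the day, integer order equals chronological order.

```python
from datetime import datetime

int_dates = [20100723, 20091231, 20100101]   # invented sample dates

# Sorting the raw integers...
by_int = sorted(int_dates)

# ...gives the same order as sorting the actual parsed dates.
by_parsed = sorted(int_dates, key=lambda d: datetime.strptime(str(d), "%Y%m%d"))

assert by_int == by_parsed == [20091231, 20100101, 20100723]
```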
Glad to help... my next question would be, [font="Arial Black"]do you understand how it works?[/font]
The other thing is, let no one convince you to remove any of the "hints" nor make any changes to the code including the clustered index on the temp table because... it will eventually fail if you do.
--Jeff Moden
Change is inevitable... Change for the better is not.
July 27, 2010 at 1:39 pm
Thanks..
I did make changes.
I changed the date from a YYYYMMDD INT field to a DATETIME field.
All seems to work OK, and YES, I can read and understand T-SQL.
So can I keep my changes, or should I change the code back to the INT date of YYYYMMDD?
Please advise?
July 27, 2010 at 2:26 pm
Digs (7/27/2010)
Thanks.. I did make changes
I changed the date from a YYYYMMDD INT field to a DATETIME field.
All seems to work OK, and YES, I can read and understand T-SQL.
So can I keep my changes, or should I change the code back to the INT date of YYYYMMDD?
Please advise?
Sorry... didn't mean to imply that you couldn't keep the datatype change for the dates. THOSE changes are good to go.
And, yeah... I know you know how to read SQL. This technique is a bit different though and I want to make sure you understand how it works so you can do the next one. 😉
--Jeff Moden
Change is inevitable... Change for the better is not.
July 27, 2010 at 4:38 pm
Thanks, I am all good. Thanks for help !:-D:-):-D
July 27, 2010 at 5:01 pm
Question
This code
ALTER TABLE #tempSumIndex
ADD PRIMARY KEY CLUSTERED (Date) WITH FILLFACTOR = 100
The FILLFACTOR = 100
My data could get into the 1000s of records; does this affect this setting?
July 27, 2010 at 10:01 pm
Digs (7/27/2010)
QuestionThis code
ALTER TABLE #tempSumIndex
ADD PRIMARY KEY CLUSTERED (Date) WITH FILLFACTOR = 100
The FILLFACTOR = 100
My data could get into the 1000s of records; does this affect this setting?
Not even if it gets into the millions of rows... the command/index is absolutely essential to both the proper operation of the code and the incredibly high speed. The FILLFACTOR = 100 just guarantees the speed (little or no useless blank page space to read) if your server default is otherwise.
--Jeff Moden
Change is inevitable... Change for the better is not.