Dealing with a large number of columns

Question

Post reply

Dealing with a large number of columns

rav1080

SSC Enthusiast

Points: 120
More actions
February 13, 2010 at 3:13 pm

#386309

Hello,
I'm struggling with a very large table in my application. There's one column with unique values (around 24 million rows) but there are around 120 other columns in the same database. Depending upon the client's input 2 or 3 or multiple values are compared (its a zipcode profile column) and it outputs the highest value.
When a client gives around a million of these, we are facing performance issues. At first I thought it might be large number of rows in the db, so I created views (each range has around only 500,000 records), so around 48 views. But the problem, I think, is also with the large number of columns in the table.
Is there an efficient way to handle large # of columns in a db like there's the views concept of large # of rows?
--------------
this is my SP:
ALTER Procedure [dbo].[STR_SP_GetCultureByZcode]
(
@Zipcode INT,
@Querystr Varchar(200),
@TblName Varchar(100),
@existsZip INT OUTPUT
)
AS
Begin
SET NOCOUNT ON;
DECLARE @Result VARCHAR(8000),
@CondStmt1 nvarchar(200),
@Zip5 varchar(50),
@Zcode Varchar(50)
Set @Zcode = Convert(varchar(30),@Zipcode)
Set @Zip5 = Convert(varchar(30),Left(@Zipcode,5))
--check Whether record exist in table with this zipcode
Set @CondStmt1 = 'SELECT @ex = COUNT(1) from ' + @TblName + ' WHERE ZipData = ' + @Zcode
EXEC sp_executesql @CondStmt1, N'@ex INT OUTPUT', @existsZip OUTPUT
If(@existsZip > 0) -- If Exists get the data
Begin
Set @Result = 'SELECT TOP 1 theCol FROM (SELECT ' + @Querystr + ' FROM ' + @TblName + ' WHERE ZipData =' + @Zcode + ') AS t1 UNPIVOT (theValue FOR theCol IN (' + @Querystr + ')) AS u1 ORDER BY theValue DESC'
End
else -- If not exist take only 5 digits from zipcode and pass as input to query
Begin
Set @Result = 'SELECT TOP 1 theCol FROM (SELECT ' + @Querystr + ' FROM CountryOrigin_View0 WHERE ZipData =' + @Zip5 + ') AS t1 UNPIVOT (theValue FOR theCol IN (' + @Querystr + ')) AS u1 ORDER BY theValue DESC'
End
Exec(@Result)
End
GO

Viewing 7 posts - 1 through 6 (of 6 total)

You must be logged in to reply to this topic. Login to reply

Ron McCullough SSC Guru Points: 63877 More actions · Answer 1

Is there an efficient way to handle large # of columns in a db like there's the views concept of large # of rows?

Remember a view only contains the columns that are SELECTED in the select statement which creates the view.

Do the clients need to have data returned for ALL 120 columns in the basic table.?

Are your views indexed? If not read:

Improving Performance with SQL Server 2008 Indexed Views

http://technet.microsoft.com/en-us/library/dd171921.aspx

And please post the table definition for the table(s) in question and some sample data as per the link in my signature block, so those who want to help you can do so with proven code.

If everything seems to be going well, you have obviously overlooked something.

Ron

Please help us, help you -before posting a question please read[/url]
Before posting a performance problem please read[/url]

rav1080 SSC Enthusiast Points: 120 More actions · Answer 2

The views are not indexed. The data table looks something like this:

Column1Column2Column3Column4Column120

852604029231234

852601111242311112

Basically, depending on which values the applications wants to compare, any of the 120 columns could be in play, so that’s why all columns are present in the table unless there’s a better way to index it. The main goal is to compare the column values for a given zipcode and return the column which has the higher value.

The views are divided by range (like 500-9999, 90001111 to 900009999….each one has about 500,000 zipcodes, but performance is surprisingly bad, so I am missing something!).

Gail Shaw SSC Guru Points: 1004504 More actions · Answer 3

Please post table definitions, index definitions and execution plan, as per http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

TheSQLGuru SSC Guru Points: 134017 More actions · Answer 4

How about stop doing count and simply do exists.

Not sure why you would need the views to make that efficient...

Best,
Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru on googles mail service

Jeff Moden SSC Guru Points: 1003857 More actions · Answer 5

I wouldn't check just one zip code at a time and I wouldn't split the 24 million row table up. Correctly written queries and joins in the presence of the correct indexing will absolutely fly through this table.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Satish-463129 Valued Member Points: 71 More actions · Answer 6

Hi, I am working with Rav1080. I am giving definations of my application.

he table Tbl_CountryOrigin contains around 160 columns and 200 million rows of data.

Below is the stored procedure defination

CREATE Procedure [dbo].[SP_GetDataByZcode]

(

@Zipcode INT,

@Querystr Varchar(200),

@ViewName Varchar(100),

@existsZip INT OUTPUT

)

AS

Begin

SET NOCOUNT ON;

DECLARE @Result VARCHAR(8000),

@CondStmt1 nvarchar(200),

@Zip5 varchar(50),

@Zcode Varchar(50)

Set @Zcode = Convert(varchar(30),@Zipcode)

Set @Zip5 = Convert(varchar(30),Left(@Zipcode,5))

--check Whether record exist in table with this zipcode

Set @CondStmt1 = 'SELECT @ex = COUNT(1) from ' + @ViewName + ' WHERE ZipData = ' + @Zcode

EXEC sp_executesql @CondStmt1, N'@ex INT OUTPUT', @existsZip OUTPUT

If(@existsZip > 0) -- If Exists get the data

Begin

Set @Result = 'SELECT TOP 1 theCol FROM (SELECT ' + @Querystr + ' FROM ' + @ViewName + ' WHERE ZipData =' + @Zcode + ') AS t1 UNPIVOT (theValue FOR theCol IN (' + @Querystr + ')) AS u1 ORDER BY theValue DESC'

End

else -- If not exist take only 5 digits from zipcode and pass as input to query

Begin

Set @Result = 'SELECT TOP 1 theCol FROM (SELECT ' + @Querystr + ' FROM CountryOrigin_View0 WHERE ZipData =' + @Zip5 + ') AS t1 UNPIVOT (theValue FOR theCol IN (' + @Querystr + ')) AS u1 ORDER BY theValue DESC'

End

Exec(@Result)

End

Below is the table defination. My table contains 160 columns.

CREATE TABLE [dbo].[Tbl_CountryOrigin](

[Column1] [int] NULL,

[Column2] [int] NULL,

[Column3] [int] NULL,

[Column4] [int] NULL,

[Column5] [int] NULL,

-

[Column160] [int] NULL,

[ZipData] [int] NOT NULL,

CONSTRAINT [PK_ZipData] PRIMARY KEY CLUSTERED

(

[ZipData] ASC

)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, FILLFACTOR = 80) ON [PRIMARY]

) ON [PRIMARY]

Below is the code for View(we had divided above table to 40 views)

CountryOrigin_View0,CountryOrigin_View1,......CountryOrigin_View160.

CREATE VIEW [dbo].[CountryOrigin_View0] WITH SCHEMABINDING AS

SELECT

Column1,Column2,......Column160,ZipData

FROM dbo.TBL_CountryOrigin WHERE Zipdata > 100000 and Zipdata < 27000000

Below is the index defination of the table

ALTER TABLE [dbo].[Tbl_CountryOrigin] ADD CONSTRAINT [PK_ZipData] PRIMARY KEY CLUSTERED

(

[ZipData] ASC

)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, FILLFACTOR = 80) ON [PRIMARY]