Dealing with a large number of columns

  • Hello,

    I'm struggling with a very large table in my application. There's one column with unique values (around 24 million rows) but there are around 120 other columns in the same database. Depending upon the client's input 2 or 3 or multiple values are compared (its a zipcode profile column) and it outputs the highest value.

    When a client gives around a million of these, we are facing performance issues. At first I thought it might be large number of rows in the db, so I created views (each range has around only 500,000 records), so around 48 views. But the problem, I think, is also with the large number of columns in the table.

    Is there an efficient way to handle large # of columns in a db like there's the views concept of large # of rows?

    --------------

    this is my SP:

    ALTER Procedure [dbo].[STR_SP_GetCultureByZcode]

    (

    @Zipcode INT,

    @Querystr Varchar(200),

    @TblName Varchar(100),

    @existsZip INT OUTPUT

    )

    AS

    Begin

    SET NOCOUNT ON;

    DECLARE @Result VARCHAR(8000),

    @CondStmt1 nvarchar(200),

    @Zip5 varchar(50),

    @Zcode Varchar(50)

    Set @Zcode = Convert(varchar(30),@Zipcode)

    Set @Zip5 = Convert(varchar(30),Left(@Zipcode,5))

    --check Whether record exist in table with this zipcode

    Set @CondStmt1 = 'SELECT @ex = COUNT(1) from ' + @TblName + ' WHERE ZipData = ' + @Zcode

    EXEC sp_executesql @CondStmt1, N'@ex INT OUTPUT', @existsZip OUTPUT

    If(@existsZip > 0) -- If Exists get the data

    Begin

    Set @Result = 'SELECT TOP 1 theCol FROM (SELECT ' + @Querystr + ' FROM ' + @TblName + ' WHERE ZipData =' + @Zcode + ') AS t1 UNPIVOT (theValue FOR theCol IN (' + @Querystr + ')) AS u1 ORDER BY theValue DESC'

    End

    else -- If not exist take only 5 digits from zipcode and pass as input to query

    Begin

    Set @Result = 'SELECT TOP 1 theCol FROM (SELECT ' + @Querystr + ' FROM CountryOrigin_View0 WHERE ZipData =' + @Zip5 + ') AS t1 UNPIVOT (theValue FOR theCol IN (' + @Querystr + ')) AS u1 ORDER BY theValue DESC'

    End

    Exec(@Result)

    End

    GO

  • Is there an efficient way to handle large # of columns in a db like there's the views concept of large # of rows?

    Remember a view only contains the columns that are SELECTED in the select statement which creates the view.

    Do the clients need to have data returned for ALL 120 columns in the basic table.?

    Are your views indexed? If not read:

    Improving Performance with SQL Server 2008 Indexed Views

    http://technet.microsoft.com/en-us/library/dd171921.aspx

    And please post the table definition for the table(s) in question and some sample data as per the link in my signature block, so those who want to help you can do so with proven code.

    If everything seems to be going well, you have obviously overlooked something.

    Ron

    Please help us, help you -before posting a question please read[/url]
    Before posting a performance problem please read[/url]

  • The views are not indexed. The data table looks something like this:

    Column1Column2Column3Column4Column120

    852604029231234

    852601111242311112

    Basically, depending on which values the applications wants to compare, any of the 120 columns could be in play, so that’s why all columns are present in the table unless there’s a better way to index it. The main goal is to compare the column values for a given zipcode and return the column which has the higher value.

    The views are divided by range (like 500-9999, 90001111 to 900009999….each one has about 500,000 zipcodes, but performance is surprisingly bad, so I am missing something!).

  • Please post table definitions, index definitions and execution plan, as per http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • How about stop doing count and simply do exists.

    Not sure why you would need the views to make that efficient...

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • I wouldn't check just one zip code at a time and I wouldn't split the 24 million row table up. Correctly written queries and joins in the presence of the correct indexing will absolutely fly through this table.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Hi, I am working with Rav1080. I am giving definations of my application.

    he table Tbl_CountryOrigin contains around 160 columns and 200 million rows of data.

    Below is the stored procedure defination

    CREATE Procedure [dbo].[SP_GetDataByZcode]

    (

    @Zipcode INT,

    @Querystr Varchar(200),

    @ViewName Varchar(100),

    @existsZip INT OUTPUT

    )

    AS

    Begin

    SET NOCOUNT ON;

    DECLARE @Result VARCHAR(8000),

    @CondStmt1 nvarchar(200),

    @Zip5 varchar(50),

    @Zcode Varchar(50)

    Set @Zcode = Convert(varchar(30),@Zipcode)

    Set @Zip5 = Convert(varchar(30),Left(@Zipcode,5))

    --check Whether record exist in table with this zipcode

    Set @CondStmt1 = 'SELECT @ex = COUNT(1) from ' + @ViewName + ' WHERE ZipData = ' + @Zcode

    EXEC sp_executesql @CondStmt1, N'@ex INT OUTPUT', @existsZip OUTPUT

    If(@existsZip > 0) -- If Exists get the data

    Begin

    Set @Result = 'SELECT TOP 1 theCol FROM (SELECT ' + @Querystr + ' FROM ' + @ViewName + ' WHERE ZipData =' + @Zcode + ') AS t1 UNPIVOT (theValue FOR theCol IN (' + @Querystr + ')) AS u1 ORDER BY theValue DESC'

    End

    else -- If not exist take only 5 digits from zipcode and pass as input to query

    Begin

    Set @Result = 'SELECT TOP 1 theCol FROM (SELECT ' + @Querystr + ' FROM CountryOrigin_View0 WHERE ZipData =' + @Zip5 + ') AS t1 UNPIVOT (theValue FOR theCol IN (' + @Querystr + ')) AS u1 ORDER BY theValue DESC'

    End

    Exec(@Result)

    End

    Below is the table defination. My table contains 160 columns.

    CREATE TABLE [dbo].[Tbl_CountryOrigin](

    [Column1] [int] NULL,

    [Column2] [int] NULL,

    [Column3] [int] NULL,

    [Column4] [int] NULL,

    [Column5] [int] NULL,

    -

    -

    -

    -

    -

    [Column160] [int] NULL,

    [ZipData] [int] NOT NULL,

    CONSTRAINT [PK_ZipData] PRIMARY KEY CLUSTERED

    (

    [ZipData] ASC

    )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, FILLFACTOR = 80) ON [PRIMARY]

    ) ON [PRIMARY]

    Below is the code for View(we had divided above table to 40 views)

    CountryOrigin_View0,CountryOrigin_View1,......CountryOrigin_View160.

    CREATE VIEW [dbo].[CountryOrigin_View0] WITH SCHEMABINDING AS

    SELECT

    Column1,Column2,......Column160,ZipData

    FROM dbo.TBL_CountryOrigin WHERE Zipdata > 100000 and Zipdata < 27000000

    Below is the index defination of the table

    ALTER TABLE [dbo].[Tbl_CountryOrigin] ADD CONSTRAINT [PK_ZipData] PRIMARY KEY CLUSTERED

    (

    [ZipData] ASC

    )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, FILLFACTOR = 80) ON [PRIMARY]

Viewing 7 posts - 1 through 6 (of 6 total)

You must be logged in to reply to this topic. Login to reply