Concatenation/Placement of data

  • My current dillema.

    The flat files I am working with consist of records with multiple row text (generally parsed in 70 character chunks). Now the vast majority of the total concatenated text on each record is <=1,200 characters. However 3% are over 8,000 characters total (with some reaching 25-30K).

    The database expects 500K unique records to start and consistent monthly growth of 30K.

    Currently using SQL Server 2005 with asmall possibility of moving to 2008.

    So my two questions:

    1. Is it really intelligent for me to format the base field for all this data as text, or should I stick with varchar(8000) and somehow treat the exceptions separately?

    2. Based on the answer to #1, which I think will be the latter, what is the best methodology of separation/concatenation?

  • I'd use varchar(max) or nvarchar(max) not text/ntext. If it fits on the page SQL Server will keep it in the table, if not it will store it off the table sp the majority of your data will still be on the pages.

  • Thanks, Jack

    I understood that the page could store more just that query outputs and displays were always going to be cut to 8,000 character chunks (and I could simply code to show those breaks, say as separate columns in a query)

    But it was also my understanding that the concatenation function using "+" would also die when I tried to stitch them together from the flat file. The recomendations I saw on these boards seemed to imply moving to a text/ntext field at that point.

    The end result is going to be working with a text-mining tool (most likely Clarabridge) and needs to be workable in it's entirety; at least until we define more logical ways to break it up. This also means that the larger sized records represent the most prized data.

    Apologies on any failure on my part. My background is much more in financial and operational metrics, so I am struggling a bit by the size of this customer service textual data.

Viewing 3 posts - 1 through 3 (of 3 total)

You must be logged in to reply to this topic. Login to reply