OK, attached is an SSIS Package that uses only native SSIS Tasks and Components and T-SQL, i.e. no C# Script is utilized in the SSIS. Files included in the attached zip file:
1. ActiveDirectoryLog.txt - sample input file (from OP)
2. ActiveDirectoryLog2_TABLE.sql - DDL to create destination table in tempdb
3. ImportActiveDirectoryLog2.dtsx - SSIS Package that loads the sample file into the table in tempdb
4. ActiveDirectoryLog2_SELECT.sql - query to deliver results in required shape
This turned out to be remarkably simple once I realized the row-delimiter I should have been using all along was \r\n\r\n, not \r\n 😎
This is the table definition the file is loaded into (also in the attached zip):
CREATE TABLE dbo.active_directory_profile(
profile_id int IDENTITY(1,1) NOT NULL,
profile varchar(8000) NULL
);
And this is the query that brings back the desired output (also in the attached zip):
WITH cte AS (SELECT adp.profile_id,
LEFT(x.item,CHARINDEX(': ',x.item)-1) AS [name],
-- requires dbo.GROUP_CONCAT from http://groupconcat.codeplex.com
dba.dbo.GROUP_CONCAT(SUBSTRING(x.item,CHARINDEX(': ',x.item)+2,8000)) AS [value]
FROM dbo.active_directory_profile adp
-- requires dbo.StringSplit_Multi from http://sqlblog.com/blogs/adam_machanic/archive/2009/04/28/sqlclr-string-splitting-part-2-even-faster-even-more-scalable.aspx
CROSS APPLY dba.dbo.SplitString_Multi(adp.[profile], CHAR(13)+CHAR(10)) x
GROUP BY adp.profile_id,
LEFT(x.item,CHARINDEX(': ',x.item)-1))
SELECT profile_id,
MAX(CASE WHEN name = 'dn' THEN [value] END) AS dn,
MAX(CASE WHEN name = 'objectClass' THEN [value] END) AS objectClass,
MAX(CASE WHEN name = 'userid' THEN [value] END) AS userid,
MAX(CASE WHEN name = 'companyVal' THEN [value] END) AS companyVal,
MAX(CASE WHEN name = 'company' THEN [value] END) AS company,
MAX(CASE WHEN name = 'job-title' THEN [value] END) AS [job-title],
MAX(CASE WHEN name = 'mail' THEN [value] END) AS mail,
MAX(CASE WHEN name = 'createdby' THEN [value] END) AS createdby,
MAX(CASE WHEN name = 'modifiedby' THEN [value] END) AS modifiedby,
MAX(CASE WHEN name = 'createdTime' THEN [value] END) AS createdTime,
MAX(CASE WHEN name = 'modifiedTime' THEN [value] END) AS modifiedTime
FROM cte
GROUP BY profile_id;
Results:
Note that there are two functions you will need to install for my query above to work, a grouped-string-concatenation-function and a string-splitter-function. One is implemented as a SQLCLR Aggregate and the other as a SQLCLR Table-Valued function. The comments show where you can get them. If SQLCLR makes you twist in your seat you can replace the grouped-string-concatenation-function with the technique that leverages FOR XML and the string-splitter with Mr. Moden's T-SQL splitter.
Edit: fix tabs in code
There are no special teachers of virtue, because virtue is taught by the whole community.
--Plato