~
~  Step 1
~

write function to read text file
write function to write text file
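
A minimal sketch of the two helpers, assuming UTF-8 files; the names `read_text_file` and `write_text_file` are placeholders, not names fixed by these notes.

```python
def read_text_file(path, encoding="utf-8"):
    """Return the whole file as one string."""
    with open(path, "r", encoding=encoding) as handle:
        return handle.read()


def write_text_file(path, text, encoding="utf-8"):
    """Write a string to a file, replacing any existing contents."""
    with open(path, "w", encoding=encoding) as handle:
        handle.write(text)
```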


~
~  Step 2
~

write parse_tsql_function
input unprocessed string
output processed string
break into statements on keyword GO
keep just CREATE TABLE statements
combine statements with divider word GO
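
One possible shape for this step, as a sketch. It assumes GO counts as a separator only when it stands alone on a line, and that a statement qualifies only if it begins with CREATE TABLE. The names `parse_tsql` and `split_on_go` are illustrative.

```python
import re

# GO is treated as a batch separator only when it is alone on its line.
GO_LINE = re.compile(r"^[ \t]*GO[ \t]*\r?$", re.IGNORECASE | re.MULTILINE)


def split_on_go(text):
    """Break a script into statements, dropping empty pieces."""
    return [part.strip() for part in GO_LINE.split(text) if part.strip()]


def parse_tsql(raw):
    """Keep only CREATE TABLE statements and rejoin them with GO lines."""
    tables = [
        stmt for stmt in split_on_go(raw)
        if re.match(r"CREATE\s+TABLE\b", stmt, re.IGNORECASE)
    ]
    if not tables:
        return ""
    return "\nGO\n".join(tables) + "\nGO\n"
```

A batch that starts with a comment or a SET statement before its CREATE TABLE would be dropped by this check; that case would need a search instead of a match at the start.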


~
~  Step 3
~

python change working directory
python replace file extension in string
python read text file that is encoded
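
The three lookups above map onto standard-library calls. A short sketch follows; the function names are placeholders, and the UTF-16 default is an assumption about how the script file was saved, so pass a different encoding if the file uses one.

```python
import os


def change_working_directory(path):
    """Switch the current working directory and return the new one."""
    os.chdir(path)
    return os.getcwd()


def replace_extension(filename, new_ext):
    """Swap the extension in a file name string, e.g. '.sql' for '.ipynb'."""
    root, _old_ext = os.path.splitext(filename)
    return root + new_ext


def read_encoded_text(path, encoding="utf-16"):
    """Read a text file that is not in the platform default encoding."""
    with open(path, "r", encoding=encoding) as handle:
        return handle.read()
```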


-- 
-- Part 4
--

create function with raw tsql string as input.
return processed tsql string as output.
break string on GO keyword into statements.

use regex to do the following using case insensitive matches:

remove "on [primary]" clause from statement.
remove "textimage_on [primary]" clause from statement.
remove "with" clause from statement.
remove "identity" clause from statement.
remove "ROWGUIDCOL" option from statement.
remove "NOT FOR REPLICATION" option from statement.
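
The removals above could be sketched with one pattern per rule. This assumes the "with" clause is the index option list `WITH (...)` without nested parentheses and that "identity" may carry a `(seed, increment)` pair. `REMOVALS` and `remove_clauses` are made-up names.

```python
import re

# One case-insensitive pattern per removal rule.
REMOVALS = [
    r"\bTEXTIMAGE_ON\s+\[PRIMARY\]",
    r"\bON\s+\[PRIMARY\]",
    r"\bWITH\s*\([^()]*\)",
    r"\bIDENTITY\s*(?:\(\s*-?\d+\s*,\s*-?\d+\s*\))?",
    r"\bROWGUIDCOL\b",
    r"\bNOT\s+FOR\s+REPLICATION\b",
]


def remove_clauses(statement):
    """Strip storage, index option, identity and replication clauses."""
    for pattern in REMOVALS:
        statement = re.sub(pattern, "", statement, flags=re.IGNORECASE)
    # Tidy up: drop trailing blanks, then collapse blank runs left inside a line.
    statement = re.sub(r"[ \t]+$", "", statement, flags=re.MULTILINE)
    return re.sub(r"(?<=\S)[ \t]{2,}", " ", statement)
```

Leading indentation is left alone because the collapse step only fires after a non-blank character.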

replace "[nvarchar]" with "[varchar]" in statement.
replace "[nchar]" with "[char]" in statement.
replace "(max)" with "(4000)" in statement.
replace "[datetime]" with "[datetime2](6)" in statement.
replace "[datetimeoffset]" with "[datetime2]" in statement.
replace "[money]" with "[decimal](18, 4)" in statement.
replace computed (calculated) column definitions "as (...)" with "[varchar](4000)".
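
A sketch of the type replacements, under the assumption that each computed column sits on a single line in the form `[Col]  AS (expression),`. `TYPE_MAP`, `COMPUTED` and `replace_types` are illustrative names. The bracketed literals keep `[datetime]` from matching inside `[datetimeoffset]`.

```python
import re

# Literal, case-insensitive replacements; brackets are part of the match.
TYPE_MAP = [
    ("[nvarchar]", "[varchar]"),
    ("[nchar]", "[char]"),
    ("(max)", "(4000)"),
    ("[datetimeoffset]", "[datetime2]"),
    ("[datetime]", "[datetime2](6)"),
    ("[money]", "[decimal](18, 4)"),
]

# Assumption: a computed column is written on one line.
COMPUTED = re.compile(
    r"^([ \t]*\[[^\]]+\])[ \t]+AS[ \t]*\(.*\)([ \t]+PERSISTED)?(,?)[ \t]*$",
    re.IGNORECASE | re.MULTILINE,
)


def replace_types(statement):
    """Swap computed columns and unsupported types for simpler ones."""
    # Computed columns first, so their expressions are not rewritten below.
    statement = COMPUTED.sub(r"\1 [varchar](4000)\3", statement)
    for old, new in TYPE_MAP:
        statement = re.sub(
            re.escape(old), lambda m, new=new: new, statement, flags=re.IGNORECASE
        )
    return statement
```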

replace "[dbo].[AccountNumber]" with "[varchar](15) NULL".
replace "[dbo].[Flag]" with "[bit] NOT NULL".
replace "[dbo].[Name]" with "[varchar](50) NULL".
replace "[dbo].[NameStyle]" with "[bit] NOT NULL".
replace "[dbo].[OrderNumber]" with "[varchar](25) NULL".
replace "[dbo].[Phone]" with "[varchar](25) NULL".
replace "NOT NULL" with "NULL".
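
A sketch of the alias type replacements. One interpretation is assumed here: since each replacement already carries its own nullability, the pattern also consumes any NULL or NOT NULL that follows the alias, which avoids output such as "NULL NOT NULL". The final NOT NULL to NULL rule then runs over the whole statement. `ALIAS_MAP` and `replace_alias_types` are made-up names.

```python
import re

ALIAS_MAP = {
    "[dbo].[AccountNumber]": "[varchar](15) NULL",
    "[dbo].[Flag]": "[bit] NOT NULL",
    "[dbo].[Name]": "[varchar](50) NULL",
    "[dbo].[NameStyle]": "[bit] NOT NULL",
    "[dbo].[OrderNumber]": "[varchar](25) NULL",
    "[dbo].[Phone]": "[varchar](25) NULL",
}

# Optional nullability that may follow the alias in the column definition.
NULLABILITY = r"(?:\s+(?:NOT\s+)?NULL\b)?"


def replace_alias_types(statement):
    """Replace user-defined alias types, then relax NOT NULL to NULL."""
    for alias, base in ALIAS_MAP.items():
        pattern = re.escape(alias) + NULLABILITY
        statement = re.sub(
            pattern, lambda m, base=base: base, statement, flags=re.IGNORECASE
        )
    return re.sub(r"\bNOT\s+NULL\b", "NULL", statement, flags=re.IGNORECASE)
```

With the last rule applied, the "[bit] NOT NULL" replacements also end up as "[bit] NULL"; skip that rule if the alias nullability should survive.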

constraints can span multiple lines
remove "default" constraints defined in statement.
remove "unique" constraints defined in statement.
remove "primary key" constraints defined in statement.
remove "foreign key" constraints defined in statement.
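
A sketch for the constraint removal, with stated limits: it targets named table-level constraints in the layout shown in the sample input below (a comma, then `CONSTRAINT [name] ...`), plus inline DEFAULT expressions wrapped in parentheses up to three levels deep. Python's `re` has no recursive patterns, so the nesting depth is fixed. All names are illustrative.

```python
import re

NAME = r"\[[^\]]+\]"
COLS = r"\([^()]*\)"
# Balanced parentheses up to three levels, e.g. ((0)) or (getdate()).
EXPR = r"\((?:[^()]|\((?:[^()]|\([^()]*\))*\))*\)"

# \s matches newlines, so each pattern can span several lines.
KEY_CONSTRAINT = re.compile(
    r"\s*,\s*CONSTRAINT\s+" + NAME
    + r"\s+(?:PRIMARY\s+KEY|UNIQUE)(?:\s+(?:NON)?CLUSTERED)?\s*" + COLS
    + r"(?:\s*WITH\s*" + COLS + r")?(?:\s*ON\s+\[PRIMARY\])?",
    re.IGNORECASE,
)
FOREIGN_KEY = re.compile(
    r"\s*,\s*CONSTRAINT\s+" + NAME + r"\s+FOREIGN\s+KEY\s*" + COLS
    + r"\s*REFERENCES\s+[^\s(]+\s*" + COLS
    + r"(?:\s+ON\s+(?:DELETE|UPDATE)\s+"
    + r"(?:CASCADE|NO\s+ACTION|SET\s+NULL|SET\s+DEFAULT))*",
    re.IGNORECASE,
)
DEFAULT = re.compile(
    r"(?:\s+CONSTRAINT\s+" + NAME + r")?\s+DEFAULT\s*" + EXPR,
    re.IGNORECASE,
)


def remove_constraints(statement):
    """Drop key, unique, foreign key and default constraints."""
    for pattern in (KEY_CONSTRAINT, FOREIGN_KEY, DEFAULT):
        statement = pattern.sub("", statement)
    return statement
```

Each table-level pattern consumes the comma in front of the constraint, so the last column is not left with a dangling comma.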

sample input:

```
CREATE TABLE [SalesLT].[Address](
	[AddressID] [int] IDENTITY(1,1) NOT FOR REPLICATION NOT NULL,
	[AddressLine1] [nvarchar](60) NOT NULL,
	[AddressLine2] [nvarchar](60) NULL,
	[City] [nvarchar](30) NOT NULL,
	[StateProvince] [dbo].[Name] NOT NULL,
	[CountryRegion] [dbo].[Name] NOT NULL,
	[PostalCode] [nvarchar](15) NOT NULL,
	[rowguid] [uniqueidentifier] ROWGUIDCOL  NOT NULL,
	[ModifiedDate] [datetime] NOT NULL,
 CONSTRAINT [PK_Address_AddressID] PRIMARY KEY CLUSTERED 
(
	[AddressID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, OPTIMIZE_FOR_SEQUENTIAL_KEY = OFF) ON [PRIMARY],
 CONSTRAINT [AK_Address_rowguid] UNIQUE NONCLUSTERED 
(
	[rowguid] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, OPTIMIZE_FOR_SEQUENTIAL_KEY = OFF) ON [PRIMARY]
) ON [PRIMARY]
GO
```

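sample output (shows the clause and constraint removals and the "[nvarchar]" replacement; the "[datetime]", "[dbo].[Name]" and "NOT NULL" replacements are not applied here):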

```
CREATE TABLE [SalesLT].[Address](
	[AddressID] [int] NOT NULL,
	[AddressLine1] [varchar](60) NOT NULL,
	[AddressLine2] [varchar](60) NULL,
	[City] [varchar](30) NOT NULL,
	[StateProvince] [dbo].[Name] NOT NULL,
	[CountryRegion] [dbo].[Name] NOT NULL,
	[PostalCode] [varchar](15) NOT NULL,
	[rowguid] [uniqueidentifier] NOT NULL,
	[ModifiedDate] [datetime] NOT NULL
);
GO
```
 

-- 
-- Part 5
--

create function to read a tsql file as a string.
break string on GO keyword into statements.
create a new jupyter notebook.
create a new cell for each statement.
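
A sketch of the notebook step that builds the notebook JSON directly with the standard library rather than a notebook package. It assumes notebook format 4.4 and code cells (for example for a SQL kernel); use "markdown" or "raw" cells if that suits better. `statements_to_notebook` and `write_notebook` are placeholder names.

```python
import json
import re

# GO is treated as a batch separator only when it is alone on its line.
GO_LINE = re.compile(r"^[ \t]*GO[ \t]*\r?$", re.IGNORECASE | re.MULTILINE)


def statements_to_notebook(tsql_text):
    """Build a notebook dict with one code cell per statement."""
    statements = [s.strip() for s in GO_LINE.split(tsql_text) if s.strip()]
    cells = [
        {
            "cell_type": "code",
            "execution_count": None,
            "metadata": {},
            "outputs": [],
            "source": stmt,
        }
        for stmt in statements
    ]
    return {"cells": cells, "metadata": {}, "nbformat": 4, "nbformat_minor": 4}


def write_notebook(notebook, path):
    """Save the notebook dict as an .ipynb file."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(notebook, handle, indent=1)
```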