Weird rounding for decimal while doing a bulk insert from a CSV

  • Good morning

    I am currently importing a CSV file using a bulk insert.

    I discovered that my decimal temperature values were being rounded in a strange way.

    For example, 27.15 in the CSV becomes 27.14 in SQL Server, 26.7 becomes 26.69.

    On the other hand, 16.05 is imported correctly, as is 16.5, for example.

    I cannot really predict when it will happen.

    In my schema.ini for the bulk insert, I declared these columns as DOUBLE, and in SQL Server I declared them as decimal(5,2). I tried using numeric, and also decimal(5,3), and it doesn't work (26.7 becomes 26.699).

    Could any of you help with this? I don't know how to make it work.

    Thanks in advance for your help

    Sylvain

  • Is the cell formatted? If so, remove any formatting and try again.


    Madhivanan

    Failing to plan is Planning to fail

  • Thanks for your answer.

    What do you mean by the cell format?

    In SQL Server, the field is defined as decimal(5,2); I tried numeric(5,2) and I still get the same result.

    In my CSV file, opened with Notepad for example, I do have 26.7 and not the 26.69 that ends up in SQL Server.

    For my bulk insert, I am using a schema.ini where I have defined this field as DOUBLE with NumberDigits=2.

    I really don't see what is going wrong...

    Thanks for your help

    Sylvain

    DOUBLE is a floating-point datatype, which is approximate.

    http://msdn.microsoft.com/en-us/library/ayazw934.aspx

    If you declare it as numeric or decimal, it may work more precisely.
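    A quick Python sketch (an illustration, not from the thread) of why a double produces exactly the truncations reported above: the double nearest to 27.15 is slightly below it, so any conversion that truncates at two decimal places lands on 27.14.

```python
from decimal import Decimal, ROUND_DOWN

# Decimal(float) shows the exact value the double actually stores.
d = Decimal(27.15)
print(d < Decimal("27.15"))  # True: the stored double is slightly below 27.15

# Truncating that exact value at two decimal places reproduces the bug:
print(d.quantize(Decimal("0.01"), ROUND_DOWN))              # 27.14
print(Decimal(26.7).quantize(Decimal("0.01"), ROUND_DOWN))  # 26.69

# The double nearest to 16.05 is slightly ABOVE it, so 16.05 survives:
print(Decimal(16.05).quantize(Decimal("0.01"), ROUND_DOWN))  # 16.05
```

    Whether a given value survives depends only on which side the nearest double falls on, which is why the corruption looks unpredictable.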


    And then again, I might be wrong ...
    David Webb

  • Thanks David,

    I just tried changing the data type for the temperature in my schema.ini to Float:

    Col4=TEMP_INT Float

    and I get the same erratic rounding.

    It seems that I cannot use any other type for the decimal values:

    http://msdn.microsoft.com/en-us/library/ms709353%28VS.85%29.aspx

    Could the problem come from somewhere else?

    Thanks again

    Sylvain

  • Can you provide the DDL (CREATE TABLE statement) for the target table, a sample of the data you are attempting to import, and the expected results once the import is complete?

  • Open your CSV file in a text editor and look at the rows you are importing. Do they have two decimal places or more? The first rows in the CSV affect how the numbers are converted on import. Try manipulating the numbers in Excel, exporting to a new CSV, and reimporting the file to see if the numbers still come in wrong.

  • Thanks all for your replies

    This is the DDL of my table in SQL Server:

    CREATE TABLE [ARCHIVAL].[ARC_DATA](
        [arc_data_id] [int] IDENTITY(1,1) NOT NULL,
        [arc_tag_id] [int] NOT NULL,
        [date_UTC] [datetime] NULL,
        [date_local] [datetime] NULL,
        [depth] [decimal](6, 2) NULL,
        [temp_int] [numeric](5, 2) NULL,
        [temp_ext] [numeric](5, 2) NULL,
        [light_surface] [decimal](7, 2) NULL,
        [light_depth] [decimal](7, 2) NULL,
        CONSTRAINT [PK_ARC_DATA] PRIMARY KEY CLUSTERED
        (
            [arc_data_id] ASC
        ) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
    ) ON [PRIMARY]

    This is a snapshot of my CSV opened with Notepad++:

    Date UTC,Local date/time,Depth,Recorder Temperature,Stalk Temperature,Light Level
    10/05/2008 17:48:00,10/05/2008 7:48:00,44,27.15,26.7,144
    10/05/2008 17:48:30,10/05/2008 7:48:30,69.5,27.15,26.65,134
    10/05/2008 17:49:00,10/05/2008 7:49:00,99.5,27.15,26.3,123
    10/05/2008 17:49:30,10/05/2008 7:49:30,115.5,26.9,21.15,112

    This is now what I obtain in my SQL Server table:

    arc_data_id  arc_tag_id  date_UTC                 date_local               depth   temp_int  temp_ext  light_surface  light_depth
    25571479     202         2008-05-10 17:48:00.000  2008-05-10 07:48:00.000  44.00   27.14     26.69     NULL           144.00
    25571480     202         2008-05-10 17:48:30.000  2008-05-10 07:48:30.000  69.50   27.14     26.64     NULL           134.00
    25571481     202         2008-05-10 17:49:00.000  2008-05-10 07:49:00.000  99.50   27.14     26.30     NULL           123.00
    25571482     202         2008-05-10 17:49:30.000  2008-05-10 07:49:30.000  115.50  26.89     21.14     NULL           112.00

    As you can see 27.15 becomes 27.14...

    My schema.ini for the bulk insert is as follows:

    ColNameHeader=True
    Format=CSVDelimited
    DecimalSymbol=.
    NumberDigits=2
    DateTimeFormat=dd/mm/yyyy hh:nn:ss
    Col1=DATE_UTC DateTime
    Col2=DATE_LOCAL DateTime
    Col3=DEPTH Float
    Col4=TEMP_INT Float
    Col5=TEMP_EXT Float
    Col6=LIGHT_DEPTH Float

    To do the bulk insert, I wrote a small application in VB.Net, nothing fancy...

    I already tried switching the datatype in the schema.ini between DOUBLE and Float, and using the decimal datatype in my SQL Server table, but no change...

    If anyone has an idea... I'm starting to get desperate.

    Thanks in advance

    Sylvain

  • I have come up with a SQL version using OPENROWSET BULK which seems to work OK. You could also use BULK INSERT via a staging table if you don't want to use a format file. Not sure how you're populating arc_tag_id, so I've hard-coded it as 202 for the time being.

    INSERT ARCHIVAL.ARC_DATA
    (
        arc_tag_id,
        date_UTC,
        date_local,
        depth,
        temp_int,
        temp_ext,
        light_depth
    )
    SELECT
        202 AS arc_tag_id,
        CONVERT(DATETIME, date_UTC, 103),
        CONVERT(DATETIME, date_local, 103),
        CONVERT(DECIMAL(6,2), depth),
        CONVERT(NUMERIC(5,2), temp_int),
        CONVERT(NUMERIC(5,2), temp_ext),
        CONVERT(DECIMAL(7,2), light_depth)
    FROM OPENROWSET (BULK 'C:\Arc.txt', FORMATFILE = 'C:\Arc.fmt', FIRSTROW = 2) AS Z

    Arc.fmt

    7.0
    6
    1  SQLCHAR  0  0  ","     1  date_UTC
    2  SQLCHAR  0  0  ","     2  date_local
    3  SQLCHAR  0  0  ","     3  depth
    4  SQLCHAR  0  0  ","     4  temp_int
    5  SQLCHAR  0  0  ","     5  temp_ext
    6  SQLCHAR  0  0  "\r\n"  6  light_depth
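    A note on why this approach avoids the rounding problem (my reading, not stated above): the format file declares every field as SQLCHAR, i.e. plain text, so the CONVERTs parse the decimal values straight from the CSV text without ever passing through an approximate double. The difference, sketched in Python:

```python
from decimal import Decimal

field = "27.15"        # the CSV field, read as text (like SQLCHAR)
exact = Decimal(field)  # parsed directly into an exact decimal
lossy = Decimal(27.15)  # the same value after a trip through a double

print(exact == Decimal("27.15"))  # True: text-to-decimal is exact
print(lossy == Decimal("27.15"))  # False: the double detour loses precision
```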

  • Thanks Steve,

    Just tried your code in my VB.Net application:

    'Assumes Imports System.Data.SqlClient; declarations added here for completeness
    Dim cnSQL As New SqlConnection()
    Dim cmdSQL As New SqlCommand()

    cnSQL.ConnectionString = "Data Source=TETAUTAI;Initial Catalog=TUNA_DBS;UID=user;PWD=pwd;"
    cnSQL.Open()
    cmdSQL.Connection = cnSQL
    cmdSQL.CommandText = "INSERT INTO archival.ARC_DATA(arc_tag_id,date_UTC,date_local,depth,temp_int,temp_ext,light_depth)" _
        & " SELECT 204 AS arc_tag_id,CONVERT(DATETIME, date_UTC, 103),CONVERT(DATETIME, date_local, 103)" _
        & " ,CONVERT(DECIMAL(6,2), depth),CONVERT(NUMERIC(5,2), temp_int),CONVERT(NUMERIC(5,2), temp_ext)," _
        & " CONVERT(DECIMAL(7,2), light_depth)" _
        & " FROM OPENROWSET (BULK 'F:\import\490597_local_time.csv', FORMATFILE = 'F:\import\Arc.fmt', FIRSTROW = 2) AS Z;"
    cmdSQL.ExecuteNonQuery()
    cnSQL.Close()

    And it works! All the decimal values are transferred correctly.

    But I remembered why I didn't go for this BULK solution in the first place: the BULK statement seems to be able to read only files stored on the DB server itself.

    In my situation, I access CSV files on my office network from my local desktop and load them into a remote SQL Server DB.

    Can I overcome this limitation for BULK?

    I have already granted "user" the bulkadmin role, but I still have issues reading files that are not located on the DB server. I get errors like "Cannot bulk load because the file could not be opened".

    Thanks again

    Sylvain

  • You might try using the UNC path of the file's location. It does mean you need to share the folder, or use the hidden admin share if there is one.

  • springrider (9/9/2010)


    And it works! All the decimal values are transferred correctly. [...] Can I overcome this limitation for BULK?

    Lynn Pettis (9/9/2010)


    You might try using the UNC path of the file's location. It does mean you need to share the folder, or use the hidden admin share if there is one.

    Thanks for the feedback, folks. Were you able to set up a share on your local machine to accommodate the OPENROWSET BULK, or was that not an acceptable solution for you?

  • Hello all,

    Thanks for your feedback.

    OPENROWSET will not work for me due to my limited permissions on the SQL Server.

    I am pretty sure the issue is linked to the fact that the DataColumns for the numeric values in my DataTable are defined as DOUBLE, while the target fields in SQL Server are DECIMAL.

    My idea now would be to convert the DOUBLE DataColumns in my DataTable to DECIMAL.

    These DOUBLE DataColumns are defined in my schema.ini.

    Does any of you see a way of keeping the schema.ini (important for me because it defines the data formats) while changing the data types of my numeric columns to DECIMAL?

    From what I have seen, you cannot change the datatype after a column has been populated. Is it possible to apply the schema.ini to the DataTable before filling it with data? That would allow me to change the DataColumn types to DECIMAL first.

    Thanks for your help

    Sylvain

  • Have you tried converting to string first, then to decimal?
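    For what it's worth, this suggestion is plausible: formatting the double as text first snaps it back to the intended short decimal, so the subsequent decimal conversion is exact. A Python sketch of the idea (the same principle should apply to .NET's default Double-to-String formatting, though I have not verified the VB.Net behaviour):

```python
from decimal import Decimal

d = 27.15                # stored as a double, actually slightly below 27.15
s = str(d)               # shortest round-trip formatting gives "27.15"
via_string = Decimal(s)  # now an exact decimal 27.15

print(s)                               # 27.15
print(via_string == Decimal("27.15"))  # True
```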

  • OK, I finally found a solution and wanted to share it in case it is useful to someone else.

    The problem with the bulk import is the type mismatch between DOUBLE in the schema.ini and DECIMAL in the target table. Schema.ini offers no choice other than DOUBLE for defining my numeric fields.

    So I fill my DataTable using the schema.ini and then create new DataColumns with a Decimal datatype that are copies of the Double columns.

    Then I use these new columns as the ones to be imported into SQL Server. The types of my DataTable columns and my SQL Server table columns now match perfectly, and there is no loss of precision. :-)

    Dim myCSVAdapter As New OleDb.OleDbDataAdapter
    myCSVAdapter.Fill(dt, myRSCSV)
    'The DataTable is now filled via the schema.ini, so its numeric columns are Double

    'Add a Decimal column that is a computed copy of the Double DEPTH column
    Dim dcDepth As New DataColumn()
    dcDepth.DataType = System.Type.GetType("System.Decimal")
    dcDepth.ColumnName = "DEPTH_DEC"
    dcDepth.Expression = "DEPTH"
    dt.Columns.Add(dcDepth)

    This new Decimal column is a clone of the original Double column, and it is the one I use for the bulk insert.

    Thanks a lot for your help, it was really valuable.

    Cheers

    Sylvain
