Homework...Enron data set

  • yeah...tis homework for a proof of concept idea I am thinking about.

    looking for a BIG dataset of email, chat etc.....and seems the Enron data would be good.

    however all I can readily find this evening is MySQL scripts....was wondering if anyone knew of an MS SQL equivalent.

    Appreciate I am being (possibly) lazy, but what I have so far from the MySQL scripts is throwing errors on the message body.

    so either I need a way to eliminate the errors or find a nice and easy peasy MS SQL equivalent?? 🙂

    see below for the MySQL insert example

    very happy to be pointed to a solution that tidies up the syntax errors, if someone was prepared to help.

    (the overall file is 1 GB+...this is just a very very small subset of the first table that describes my first issue, expect further problems with larger dataset)

    CREATE TABLE bodies (

    messageid int,

    body varchar(max)

    )

    --

    -- Dumping data for table `bodies`

    --

    INSERT INTO bodies VALUES (1,'System Notification: At 0115 PST, WACM terminated request for coordinatedoperation controllable devices for Path 30 USF mitigation.');

    INSERT INTO bodies VALUES (2,'Internal path flows are now below limits. BEEP has been returned to normalmode (unsplit operation) as of 0000 hours. BEEP will dispatch as one zone.Sent by Market Operations, inquiries please call the Real Time Desk.The system conditions described in this communication are dynamic andsubject to change. While the ISO has attempted to reflect the most current,accurate information available in preparing this notice, system conditionsmay change suddenly with little or no notice. ');

    INSERT INTO bodies VALUES (3,'Path 15 S-N flows are near the limit. BEEP has been put into split regionmode as of 2331 hours. BEEP will dispatch in two zones and will produce twoEx-Post prices. Sent by Market Operations, inquiries please call the RealTime Desk.The system conditions described in this communication are dynamic andsubject to change. While the ISO has attempted to reflect the most current,accurate information available in preparing this notice, system conditionsmay change suddenly with little or no notice. ');

    INSERT INTO bodies VALUES (4,'Market Message: At 2141 PST, WACM requested coordinated operation ofcontrollable devices for USF mitigation on Path 30. Path schedules are 188MW, actual flows are 570 MW. Path transfer capability is 581 MW.');

    INSERT INTO bodies VALUES (5,'Beginning HE20, the 10 minute and expost pricing on the OASIS web site arecorrect. Sent by the Real time Market Operations. The system conditions described in this communication are dynamic andsubject to change. While the ISO has attempted to reflect the most current,accurate information available in preparing this notice, system conditionsmay change suddenly with little or no notice. ');

    INSERT INTO bodies VALUES (6,'Starting HE19 the ISO is posting incorrect 10 minute pricing and hourlyexpost pricing due to software problems. As soon as the problem has beenrepaired the correct pricing will be posted. Sent by the Real time desk. The system conditions described in this communication are dynamic andsubject to change. While the ISO has attempted to reflect the most current,accurate information available in preparing this notice, system conditionsmay change suddenly with little or no notice. ');

    INSERT INTO bodies VALUES (7,'To Market Participants and Scheduling Coordinators: The ISO received a data request yesterday from the Office of RatepayerAdvocates in CPUC docket 01-03-036. The request requires a response no laterthan November 16, 2001. An electronic version of the request is attached.The ISO intends to respond in a timely manner. Please notify Jeanne M. Sol?no later than November 10 if you consider any of the information requestedto be confidential under the ISO Tariff and the basis for you view. JeanneSol? can be reached at (916) 608-7144, email jsole@caiso.com<mailto:jsole@caiso.com>. <<Data Request to CAISO.doc>> Client Relations CommunicationsCRCommunications@caiso.com');

    INSERT INTO bodies VALUES (8,'The price is still 91.87.Keoni AlmeidaCalifornia Independent System Operatorphone: 916/608-7053pager: 916/814-7352alpha page: 9169812000.1151268@pagenet.nete-mail: <mailto:kalmeida@caiso.com>> -----Original Message-----> From:CRCommunications> Sent:Friday, June 22, 2001 11:34 AM> To:ISO Market Participants> Subject:CAISO Notice: Update to June 20 Market Notice>> <<MARKET NOTICE 010622_.doc>>>> Market Participants:> Please read the attached explanation of Footnote 14 in the California ISO> June 20, 2001, Market Notice.>> CR Communications> Client Relations Communications - MARKET NOTICE 010622_.doc ');

    INSERT INTO bodies VALUES (9,' So . . . you were looking for a one night stand afterall . . . ?? DC');

    INSERT INTO bodies VALUES (10,'Hey there Bill!I thought I\'d drop a quick line just to do a quick introduction of myself.Your buddy, Brendan, has offered your tour guide ability to me when I visitPortland in July. I do have a college friend out there (who I hope tolocate) so you might not have to be stuck with me. :)Anyhow I\'m Nikki. I\'d go more into depth but everytime I write it out itsounds more like a personal add then an introduction!Brendan tells me you are about to close on a condo. Congratulations! Ican\'t wait to have a place of my own! Apartment life is horrible. Myneighbor and I knock on the bathroom wall at each other in the mornings justto say \"hi\". I\'m really serious. I actually have really great neighbors inmy building and I have to admit, apartment life is a good way to get to knowpeople.I don\'t know what Brendan has told you about me but here\'s the story. I wasborn and raised in New Mexico and hate it here! I need trees and green!Someone once compared landing in New Mexico to landing in a litter box.Where there\'s not dirt there\'s piles of crap. Now everytime a plane fliesme back home I look around and agree.So, to say the least I\'m a little stir crazy for a new environment. Thefirst part of August I will be moving to Newport Beach, California. Or oneof four other beach communities in that area. I want to experience thatwhole lifestyle for at least a year. Then I\'m moving to Oregon. I\'ve nevervisited or anything I\'m just drawn there. To be perfectly honest it\'sbecause of the two movies \"Goonies\" and \"Kindergarten Cop\". I loved thosemovies as a kid and have wanted to move to Astoria ever since. Now\'s mychance. I just finished college and have a period of resting time before Igo after my master\'s in Public Health Administration or my MBA. We\'ll seewhich fits me best later. 
I don\'t feel that at 24 I can really decide the\"best\" avenue for the next 50 years of my career so I\'m waiting for the lifeexperience to kick in and lead the way later.As of yet I have no set plans as to dates, iteneraries, etc and so forth.I\'m a very spontaneous type of person I don\'t really \"do\" planning andscheduling. I just kind of show up and try to make my own fun. It\'s betterthat way. Life can be too organized!Well now that I\'ve given you a \"short\" introduction...haha. I\'d love tohear back from you about any conflicting dates that you might have in July.Don\'t worry if they change I won\'t hold ya to them. The only time that I amforecasting for myself is the 12th through the 16th (of July) IF I don\'t goto Calgary, Canada. Currently I\'m planning as many trips as I possibly can.My youth is fading fast and I need to enjoy as much of it as I possiblewhile I don\'t have a husband and kids to take care of. :)Hope to hear from you soon!Nikki_______________________________________________________Send a cool gift with your E-Cardhttp://www.bluemountain.com/giftcenter/');

    INSERT INTO bodies VALUES (11,'Group,EES and I have not been receiving emails detailing our marketing of power for them. Please send this information to them and cc me on it. If your email is not working...find someone to make it work. We have plenty of IT help in the office, and should call on it.Thanks, Bill');

    errors

    Msg 102, Level 15, State 1, Line 19

    Incorrect syntax near 'd'.

    Msg 343, Level 15, State 1, Line 19

    Unknown object type 'a' used in a CREATE, DROP, or ALTER statement.

    Msg 319, Level 15, State 1, Line 19

    Incorrect syntax near the keyword 'with'. If this statement is a common table expression, an xmlnamespaces clause or a change tracking context clause, the previous statement must be terminated with a semicolon.

    Msg 4145, Level 15, State 1, Line 19

    An expression of non-boolean type specified in a context where a condition is expected, near 'change'.

    Msg 319, Level 15, State 1, Line 19

    Incorrect syntax near the keyword 'with'. If this statement is a common table expression, an xmlnamespaces clause or a change tracking context clause, the previous statement must be terminated with a semicolon.

    Msg 4145, Level 15, State 1, Line 20

    An expression of non-boolean type specified in a context where a condition is expected, near 'email'.

    Msg 105, Level 15, State 1, Line 20

    Unclosed quotation mark after the character string ');

    ________________________________________________________________
    you can lead a user to data....but you cannot make them think
    and remember....every day is a school day

  • Looks like the embedded single quotes are what is killing you. Check the QUOTED_IDENTIFIER setting and use double quotes around the string.

  • There are embedded single quotes and embedded double quotes. Switching the QUOTED_IDENTIFIER setting will fix the single quote issue, but the embedded double quotes will then be just as much of a problem.

    The single and double quotes are currently "escaped" by a backslash. That is MySQL syntax; it is not a valid way to escape quotes in T-SQL. Inside a single-quoted string, the double quotes should not have a backslash, and the single quotes should be doubled, without a backslash.

    CREATE TABLE #t (a VARCHAR(100));

    -- default setting

    SET QUOTED_IDENTIFIER ON;

    INSERT #t VALUES

    ('Here''s an escaped single quote.'),

    ('Double quotes "like this" cannot be escaped, but don''t require it.'),

    -- the backslashes in this row are stored as literal characters

    ('Single quotes can\''t be escaped with a backslash, neither can \"doubles\".');

    SELECT *

    FROM #t;
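
    For a 1 GB+ dump, hand-editing is not realistic, so the rule above (drop the backslash before double quotes, double the single quotes) is probably best applied with a small script before the file goes anywhere near SQL Server. Below is a minimal Python sketch of that idea, not a tested converter for the full Enron dump. The names convert_escapes and convert_file are made up for illustration, the latin-1 encoding is a guess about the file, and only the common MySQL escapes are handled; any other backslash sequence is left untouched.

```python
import re

# Common MySQL backslash escapes and what they become in a T-SQL literal.
# Assumption: NUL bytes (\0) are simply dropped.
_ESCAPES = {
    "'": "''",    # \'  -> doubled single quote
    '"': '"',     # \"  -> plain double quote
    "\\": "\\",   # \\  -> single backslash
    "n": "\n",    # \n  -> real newline
    "r": "\r",    # \r  -> real carriage return
    "t": "\t",    # \t  -> real tab
    "0": "",      # \0  -> dropped
}


def convert_escapes(sql: str) -> str:
    """Rewrite MySQL backslash escapes in a chunk of dump text."""
    def fix(match):
        # Unknown escapes are returned unchanged (backslash included).
        return _ESCAPES.get(match.group(1), match.group(0))

    # Scanning left to right means \\' is read as an escaped backslash
    # followed by a normal closing quote, not as an escaped quote.
    return re.sub(r"\\(.)", fix, sql, flags=re.DOTALL)


def convert_file(src_path: str, dst_path: str, encoding: str = "latin-1") -> None:
    """Stream the dump line by line so the whole file never sits in memory."""
    with open(src_path, encoding=encoding, newline="") as src, \
         open(dst_path, "w", encoding=encoding, newline="") as dst:
        for line in src:
            dst.write(convert_escapes(line))
```

    Something like convert_file("bodies_mysql.sql", "bodies_mssql.sql") (both file names hypothetical) would then produce a script in which the statement for message 10 contains I''d and "hi" instead of I\'d and \"hi\". The sketch rewrites every backslash sequence it sees, including any outside string literals, and it does not touch other MySQL-only syntax such as backtick-quoted identifiers.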

  • Stephanie Giovannini (10/3/2016)

    I didn't catch the double quotes in the strings. The embedded single quotes were obvious. Thanks for catching that.
