Social Hacks

  • Comments posted to this topic are about the item Social Hacks

  • I think this article is in danger of confusing data with information.

    The data on the social networking sites is, so far as we have the ability to judge, accurate in so far as it remains exactly as entered via the approved methods. Agreed, the information the data portrays may be well off the mark, but that's a different thing.

    Which is where experience comes in. If you give a wodge of cash to a teenager and tell them to go and buy a second-hand car, they're likely to spend far too much and/or buy a nail. That's because they've yet to recognise when people are telling the truth or not. Some people get better at this, many do not, but there's no fundamental difference between human motives online and human motives face to face. If you take what someone says on their Facebook site at face value (pardon the pun), you're a fool. However, you're also a fool if you completely ignore the car salesman just because he might be trying to con you.

    Make a judgement on the data in its proper context, cross-checking for comparison where possible, and then you'll have valid information. And here I do agree with the article in that many feeds based on this kind of data are, as Steve rightly points out, not validated in any way, simply being taken as read. That's sloppy data management, IMHO.

    Semper in excretia, suus solum profundum variat

  • Hello Steve,

    You make a good point...free 'social' data requires a huge amount of filtering and cleaning.

    Two answers:

    * we have software tools to qualify and clean data (raw sales leads, for example)

    * correspondents can be limited to those already accepted in your group. Behind your firewall on your corporate intranet or whatever boundary you define your group by. Naturally, shared information from group members will qualify for greater credibility.

    Hope that helps!

    Jay Santry

    Knowledge Manager/DBA

  • I agree more with Steve. It's far too easy to post information or create a site posing as someone else, and far harder for the real person to get it removed. My daughter has had this happen - probably at the hands of a 'friend' who thought it would be funny.

    Greg E

  • I hope I'm not being boring, but almost every time I log in here I wind up thanking Steve for pointing us to yet another interesting article. If you didn't click on the link to the "Dark Reading" article, go back and do it.

    I've always been a little suspicious of the new wave of social sites like Facebook and 'Linked In', but now I've got one more reason to exercise caution and to recommend that caution to friends. Not only might prospective employers be looking at what we post, but so might the cyber thieves. The tireless devotion that these guys have to ripping us off is truly scary. They seem to find new ways every day.

    The recommendations in the DR article look to be really helpful. Thanks, Steve.

    ___________________________________________________
    “Politicians are like diapers. They both need changing regularly and for the same reason.”

  • Hmmmm... I'm not sure I care. If anyone's trying to use my personal data, then it's not with my explicit permission, and it isn't why I put it up there. I'm up there for my benefit. If I choose to, I can put accurate information up there, but if it is of no consequence then I won't. For example, on Facebook I list my hobbies as Barn Dancing, Owls and cheese. I quite like cheese, and owls are pretty cool, but they're not hobbies. No cheese- or owl-targeted ad campaign is going to make any margins out of me... Similarly, although I'm listed as a Pastafarian, I don't actually believe in the FSM (but I guess that's the point).

    If you want to use social networking to collect commercial quality data then it can only be useful in a controlled environment. Full stop.

    So I'm stumped Steve, I don't see any data quality issues to concern me here. I see corporations with an interest in data thievery, and you appear (on the face of it) to be on their side.....

  • Steve,

    Thanks for the link to a very interesting article. I have long been suspicious of social networking sites. But the internet, in general, has the potential to have a LOT of inaccurate "information" posted on it. Anyone can pose as an expert on any subject. When my children had reports to write, I continued to encourage them to use printed material for their references, rather than trusting material found by searching on the web. That said, I have been very impressed with sites like Wikipedia: given the huge opportunity for misinformation, the articles I have read have been very good. I realize they monitor what information has been added, but verifying the articles must be time-consuming.

    Thank goodness for sites like SQLServerCentral, where the discussion on articles quickly verifies the quality!

  • Not sure I'd call it data thievery, Richard. I don't think one can reasonably object to companies collecting data freely put into the public domain, any more than someone choosing to wander into a nudist camp can object to others seeing the location of that tattoo they got when they were drunk a few Christmases ago.

    Of course, if the "others" choose to act upon that new tattoo knowledge without verifying it won't wash off in the next bath, that's their lookout.

    And I hope everyone's noticed how, with an analogy like that, I still managed to stay nominally out of the gutter. I admit it was a great effort. 😉

    Semper in excretia, suus solum profundum variat

  • Thanks Steve,

    The information here is that those who hate and are filed with vengeance are bad. When they take that hate to the internet there are few restraints to keep the malice in line.

    You cannot trust one source. Old timers used to say that out of the mouths of two or three witnesses will a thing be confirmed. Taking this to heart, you cannot accept what you read on the net without other confirmation.
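    The "two or three witnesses" rule can be sketched in a few lines of Python. This is only an illustration: the function name `corroborated_claims`, the `min_sources` threshold and the sample reports are all made up for the example, and a real matching rule would need to be far more careful than lower-casing and trimming spaces.

```python
from collections import defaultdict

def corroborated_claims(reports, min_sources=2):
    """Return the claims asserted by at least `min_sources` distinct sources."""
    sources_by_claim = defaultdict(set)
    for source, claim in reports:
        # Normalise case and spacing so trivial differences do not split a claim.
        normalised = " ".join(claim.lower().split())
        sources_by_claim[normalised].add(source)
    return {
        claim
        for claim, sources in sources_by_claim.items()
        if len(sources) >= min_sources
    }

# Hypothetical (source, claim) pairs for illustration only.
reports = [
    ("facebook", "Works at Acme"),
    ("linkedin", "works at  acme"),
    ("facebook", "Hobby: barn dancing"),
    ("facebook", "hobby: barn dancing"),
]
confirmed = corroborated_claims(reports)
print(confirmed)
```

    Repeating a claim on the same site does not count twice, so only the claim seen on two different sources survives. Agreement between sources still proves nothing if both copied from the same place.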

    Miles...

    Not all gray hairs are Dinosaurs!

  • Yeah, I'm filed with vengeance and I'm super bad...

    And the Venga Boys, they must get filed with vengeance all the time...

  • I strongly disagree here (maybe because I am taking this quote slightly out of context, but I still like a healthy debate):

    Those of us that work with data tend to take the integrity of it very seriously. We work hard to ensure that we clean data properly, store and manipulate it in ways that keep its meaning valid, and take pride in delivering results to queries that are useful to our clients.

    I work with data all day long, but I don't manipulate it. I don't try to find a meaning in it. I do, however, ensure that the data complies with the minimum, original system or application specifications. I will also, to an extent, triple-check a front-end UI passing me data just to make sure there is nothing offensive or even harmful in it (talk about hacks and sanitization).

    On the other hand, I would agree if this had been written as:

    Those of us that work with information tend to take the integrity of it very seriously. We work hard to ensure that we clean data properly, store and manipulate it in ways that keep its meaning valid, and take pride in delivering results to queries that are useful to our clients.

    In terms of garbage prevention, the data will only be as good as the 'bouncers' that pick and choose which 'data' makes it in, and which needs to tuck in its shirt before being admitted. In a wide system that allows any kind of free-form data entry, all you can do from a data perspective is ensure that baseline specifications are met, and otherwise reject the entry. Making sense of the data and producing information from it is a task of a completely different nature.
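    A minimal sketch of such a 'bouncer', in Python, might look like the following. Everything here is an assumption made for illustration: the field names, the `MAX_NAME_LENGTH` limit, the age range and the `check_baseline` function are hypothetical and do not come from any real system.

```python
from dataclasses import dataclass

MAX_NAME_LENGTH = 50  # hypothetical column width, assumed for this example

@dataclass
class Verdict:
    accepted: bool
    reason: str = ""

def check_baseline(record: dict) -> Verdict:
    """Accept a record only if it meets the baseline specification."""
    name = record.get("name")
    if not isinstance(name, str) or not name.strip():
        return Verdict(False, "name is required")
    if len(name) > MAX_NAME_LENGTH:
        return Verdict(False, "name is too long")
    age = record.get("age")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if not isinstance(age, int) or isinstance(age, bool) or not 0 <= age <= 150:
        return Verdict(False, "age must be an integer between 0 and 150")
    return Verdict(True)

good = check_baseline({"name": "Richard", "age": 40, "hobbies": "Owls and cheese"})
bad = check_baseline({"name": "", "age": 40})
print(good, bad)
```

    The first record gets in because it is well formed, even though nothing here can tell whether the hobbies are true. That is the limit of what a check at the door can do.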

    Then again, social hacks and social engineering have existed since the dawn of time, as the article points out. It doesn't matter whether it's a car salesman or an online store trying to sucker you into buying the 'latest' detox patches on the market. The difference is that a social hack makes use of 'well-intentioned' fellas, whereas social engineering is conceived from the get-go as a means of taking advantage of a situation or of someone.

    He's out to get you:

    car salesman = social engineer

    And this is an open system, prone to hacks:

    facebook = risk for social hacks

    Again, ultimately, as a data trustee, you should have no bearing on how the data is turned into information. We are, however, responsible for ensuring that whatever makes it in contains as little 'garbage' as possible. But we can't clean it up completely, as we run the risk of skewing its original content and rendering any derived information useless.

    Sorry for the babble...I'll go back to my corner now.

  • Seeing as you wanted a debate I'll disagree with you.

    When building a data architecture you have to consider the front end, the bouncers, as you call them.

    You need a procedure for getting it in and (preferably, although not often enough in my experience) a procedure for rolling it back. If you can't give the users the ability to clean up their own mess then you have to do it for them.
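    One way to picture a load procedure paired with a rollback procedure is to tag every load with a batch number. The sketch below uses Python's built-in sqlite3 module purely for illustration; the `leads` table, the `batch_id` column and both function names are assumptions, not a description of any particular system.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE leads (batch_id INTEGER NOT NULL, name TEXT NOT NULL)")

def load_batch(conn, batch_id, names):
    """Insert a whole batch, or nothing at all if any row is rejected."""
    # "with conn" commits on success and rolls back if an insert raises.
    with conn:
        conn.executemany(
            "INSERT INTO leads (batch_id, name) VALUES (?, ?)",
            [(batch_id, name) for name in names],
        )

def roll_back_batch(conn, batch_id):
    """Remove a batch that was loaded earlier and later found to be rubbish."""
    with conn:
        conn.execute("DELETE FROM leads WHERE batch_id = ?", (batch_id,))

load_batch(conn, 1, ["Alice", "Bob"])
load_batch(conn, 2, ["Mallory"])
roll_back_batch(conn, 2)
remaining = [row[0] for row in conn.execute("SELECT name FROM leads ORDER BY name")]
print(remaining)
```

    Tagging rows by batch only undoes whole loads. Once later changes have been built on top of a bad batch, cleaning up becomes a manual job.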

    Agreed there is a (pedantic) distinction between data and information but rubbish information is based on rubbish data, to my mind the only difference is one of display.

    If you think the data is your concern but the information isn't then I don't think you're considering the full scope of why you're doing what you're doing. You don't work for EDS do you?

  • I don't agree with companies scraping this information, though I'm not sure they're wrong, either in the moral or legal sense. You put it out there and allowing someone to see it means giving permission for them to use it. Facebook means you allow your high school friends to potentially contact you (or you them), and I'm not sure that's different than a company using it. Perhaps Jostens trying to see if you want a duplicate or resized class ring from that year.

    As I said, I don't like it, but I'm not sure of what control we have, or even should have, over information.

    That being said, I know someone will start mining/using this. It's a matter of time. So when they do, I'm concerned they might start trusting this data as valid. In some cases that might be annoying (perhaps you'll get cheese offers, Richard) and in some it might be troublesome.

    What's very disconcerting is someone might post mostly true things about you and then a few false ones. And you might not be aware of it until something happens.

  • Steve Jones - Editor (10/6/2008)


    I don't agree with companies scraping this information, though I'm not sure they're wrong, either in the moral or legal sense. You put it out there and allowing someone to see it means giving permission for them to use it. ... So when they do, I'm concerned they might start trusting this data as valid. In some cases that might be annoying (perhaps you'll get cheese offers, Richard) and in some it might be troublesome.

    What's very disconcerting is someone might post mostly true things about you and then a few false ones. And you might not be aware of it until something happens.

    This already happens. In fact, bad data entry has been going on since before the internet (yes, there was such a time). I've had several credit cards issued to me over my career with my name misspelled, and that has resulted in my credit report showing those misspellings as other names I've used. This led at least one lender to ask me why I was using different names.

    Also, I have used Amazon.com to look for Christmas presents for friends, which has led them to create "suggestions for you" that I am not personally interested in.

    And [don't ask me how] I have received catalogs at my home address for a co-worker. [No, he never lived there and I never ordered anything for him.] I never could figure that one out.

  • Richard Gardner (10/6/2008)


    Seeing as you wanted a debate I'll disagree with you.

    When building a data architecture you have to consider the front end, the bouncers, as you call them.

    You need a procedure for getting it in and (preferably, although not often enough in my experience) a procedure for rolling it back. If you can't give the users the ability to clean up their own mess then you have to do it for them.

    Agreed there is a (pedantic) distinction between data and information but rubbish information is based on rubbish data, to my mind the only difference is one of display.

    If you think the data is your concern but the information isn't then I don't think you're considering the full scope of why you're doing what you're doing. You don't work for EDS do you?

    Well then, when is too much scraping too much? You can certainly "roll back" and get back to a prior state, but there is a limit on how many iterations of your original data set you are going to be able to rescue. I look at this from a purely DBA-related scope. To an extent, I have to detach myself from the application. There are 'experts' who dedicate countless years to perfecting their methodology for crafting entire applications and platforms that manipulate data and construe information from it (sometimes I think they actually deconstruct it...).

    If by EDS you refer to employee dental services the answer is no! 😀

    Now seriously, the 'pedantic' distinction between data and information is very important. Think of it in terms of evolution if you wish: your data will surely evolve into other entities, far more sophisticated and refined, as it travels through the hands of countless 'data analysts'. But when will your 'information' have evolved so much that it has lost all relation to its ancestor (data)? Is your information still valid if you can't determine where it came from?

    Someone in the company decided it was worth the investment in infrastructure and resources to capture data. Once you've captured it, you can do whatever you please with it: massage it, cleanse it, create statistics, derive marketing trends, etc. What do you get? The 4-1-1 (information). Information which is relevant within the context of an original set of requirements. The minute they throw a new requirement at you, you are SOL. You've 'cleansed' your data so much that it's no longer useful, and it may not even be reliable anymore.

    But guess what? When I wear my DBA hat... that's all I do. I don't attempt to analyse sales trends or demographic data. I will, however, try to spot anomalies that may indicate a flaw in the UI or feed. And I will surely have a word or two to say about the mechanisms used to capture data. If what you give me is garbage, well then, I will spit back out 'perfumed' garbage.

