Data with Provenance

  • Comments posted to this topic are about the item Data with Provenance

    Best wishes,
    Phil Factor

  • I asked when JSON in SQL Server could be constrained by a collection of JSON schemas, in the same way that Microsoft implemented typed XML data in SQL Server. No answer so far 🙁

  • This is a significant issue where I work. Thanks for raising your concerns in an intelligent way.

    412-977-3526 call/text

  • Well, it's stupid to publish a dataset without knowing the meaning of all the columns or attributes it includes. I'd expect better forethought from a healthcare organization, knowing that their data is potentially very patient-centric. When in doubt, leave it out.

    "Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

  • Thanks a lot for some very interesting replies. The whole business of data transfer makes me pretty damned scared. The quicker and tighter we can nail down these issues the better, and I find it disappointing that Microsoft are one of the few database vendors who aren't moving behind the JSON Schema standard.

    Best wishes,
    Phil Factor

  • We can also question whether schema-less XML and JSON documents (or BLOBs, or images, or free text) belong in SQL Server in the first place. The user community has already asked for MongoDB-style schema validation in Azure Cosmos DB, and it seems the product team has indicated it's on the roadmap for 2019.
    https://feedback.azure.com/forums/263030-azure-cosmos-db/suggestions/33464752-support-for-document-validation-using-json-schema

    "Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

  • Eric M Russell - Monday, December 3, 2018 9:05 AM

    Well, it's stupid to publish a dataset without knowing the meaning of all the columns or attributes it includes. I'd expect better forethought from a healthcare organization, knowing that their data is potentially very patient-centric. When in doubt, leave it out.

    These are my thoughts exactly. I would say it's silly to think that anyone using a document database is suddenly unaware of what's going into the system, or that document databases and the like are inherently flawed or less secure because of this, as if the problem lay with the technology rather than with the humans managing it. I mean, I get it: you are not defining a schema. But if you give people the power to do bad things, they will do them regardless of the system. I feel there is some extreme bias in this article, and it sucks that it exists here.

  • Phil Factor - Monday, December 3, 2018 9:20 AM

    Thanks a lot for some very interesting replies. The whole business of data transfer makes me pretty damned scared. The quicker and tighter we can nail down these issues the better, and I find it disappointing that Microsoft are one of the few database vendors who aren't moving behind the JSON Schema standard.

    How is this standard being promoted? Is there a blog post about the path to adoption and the details somewhere?

    412-977-3526 call/text

  • The article should have ended at the phrase "Unfortunately, nobody had spotted an insignificant-looking XML column in an innocent-looking table," and concluded by recommending the age-old advice that is independent of technology: when it comes to permissions, make sure that the DEFAULT is to DENY access. That way, if you overlooked an insignificant-looking column OF ANY KIND, it would have remained secure, because denial would have been the default.
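
    A minimal sketch of the default-deny idea, transposed from database permissions to an export step in Python. It is an analogy only, not the article's method and not SQL Server's permission model. The column names, `EXPORT_ALLOWLIST`, and `redact_row` are hypothetical, invented for illustration.

```python
from typing import Any, Dict, FrozenSet

# Hypothetical allowlist: only columns that someone has explicitly reviewed
# and approved for release. Anything not named here is denied by default.
EXPORT_ALLOWLIST: FrozenSet[str] = frozenset({"region", "visit_year", "department"})


def redact_row(row: Dict[str, Any],
               allowed: FrozenSet[str] = EXPORT_ALLOWLIST) -> Dict[str, Any]:
    """Return a copy of the row that keeps only explicitly allowed columns."""
    return {name: value for name, value in row.items() if name in allowed}


# Hypothetical source row, including an overlooked XML column.
row = {
    "region": "North",
    "visit_year": 2018,
    "department": "Cardiology",
    "notes_xml": "<patient><name>...</name></patient>",
}

exported = redact_row(row)
print(exported)
```

    Because the filter names what may leave rather than what must stay, a column that nobody noticed is dropped without anyone having to think of it.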

  • @robert See json-schema.org. The LEARN section is a good place to start. There is also a rather simpler treatment at Newtonsoft: see their introduction and "Validating JSON with JSON Schema".

    Best wishes,
    Phil Factor
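
    To give a flavour of what constraining a JSON document buys you, here is a toy check written with the Python standard library only. It is not a JSON Schema implementation and not the Newtonsoft API: it handles just the `required`, `properties`/`type`, and `additionalProperties` keywords, and the field names and the `check` function are hypothetical. A real project would use a full validator.

```python
import json
from typing import Any, Dict, List

# Toy schema using a small subset of JSON Schema keywords.
# The property names are hypothetical.
SCHEMA: Dict[str, Any] = {
    "required": ["id", "region"],
    "properties": {"id": {"type": "integer"}, "region": {"type": "string"}},
    "additionalProperties": False,
}

# JSON Schema type names mapped to Python types (subset only).
_TYPES = {"integer": int, "string": str}


def check(document: Dict[str, Any], schema: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means the document passed."""
    problems: List[str] = []
    props = schema.get("properties", {})
    for name in schema.get("required", []):
        if name not in document:
            problems.append(f"missing required property: {name}")
    for name, value in document.items():
        if name not in props:
            # Undeclared properties are rejected when the schema forbids them.
            if schema.get("additionalProperties") is False:
                problems.append(f"unexpected property: {name}")
            continue
        expected = _TYPES.get(props[name].get("type"))
        # bool is a subclass of int in Python, so reject it explicitly.
        if expected is not None and (
            isinstance(value, bool) or not isinstance(value, expected)
        ):
            problems.append(f"wrong type for property: {name}")
    return problems


good = json.loads('{"id": 1, "region": "North"}')
bad = json.loads('{"id": "1", "region": "North", "ssn": "000-00-0000"}')

print(check(good, SCHEMA))
print(check(bad, SCHEMA))
```

    The point of `additionalProperties` being false is that a field nobody declared, such as the hypothetical `ssn` above, is reported instead of passing through unnoticed.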

  • Phil Factor - Thursday, December 6, 2018 3:32 AM

    Thank you!

    I posted the links to my wiki: https://sqlserver.miraheze.org/wiki/Json_schema

    412-977-3526 call/text

  • This type of data breach is common: protected data elements are accidentally released in a dataset intended for third-party or public subscribers.

    A few years ago, the Georgia state government released a voter registration dataset that inadvertently contained SSN, DOB, and other fields. An IT contractor was initially blamed. Here is an article where the contractor speaks out and describes from his perspective what happened.

    https://politics.myajc.com/news/state--regional-govt--politics/exclusive-fired-kemp-worker-says-scapegoat-data-breach/m1yBjy5dQVqNs4hAiQay1J/

    "Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

  • Eric M Russell - Thursday, December 6, 2018 7:51 AM

    This type of data breach is common: protected data elements are accidentally released in a dataset intended for third-party or public subscribers.

    A few years ago, the Georgia state government released a voter registration dataset that inadvertently contained SSN, DOB, and other fields. An IT contractor was initially blamed. Here is an article where the contractor speaks out and describes from his perspective what happened.

    https://politics.myajc.com/news/state--regional-govt--politics/exclusive-fired-kemp-worker-says-scapegoat-data-breach/m1yBjy5dQVqNs4hAiQay1J/

    Thanks for the link.

    412-977-3526 call/text
