Custom Schemas

  • Since the database itself is a "schema", I'm using the term "subschema" to refer to the database subdivisions including "dbo".

  • In early versions of SQL Server one could think of the database as a container for tables and other objects. That changed with SQL Server 2005, when every object had to belong to a schema. Microsoft changed the meaning of "schema" from "ER design" to "folder": everything had to sit in a folder, with nothing at the "root" level, and a "dbo" schema was created for all the existing dbo-owned objects as a sort of backward-compatible design. Of course, the schema word and concept came from the ANSI SQL Standard.

  • I've used schemas a fair bit to separate logical sets of tables, but the best use I've found is in ETL: having landing, staging and error tables with the same name in different schemas makes debugging a 1-1 table check.

    They are super simple to set up and use and are, in my opinion, underused.
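
    Something like this, as a minimal sketch (schema, table and column names invented):

      -- The same table name in each ETL layer.
      CREATE SCHEMA landing;
      GO
      CREATE SCHEMA staging;
      GO
      CREATE SCHEMA error;
      GO

      CREATE TABLE landing.CustomerFeed (CustomerID int, LoadDate datetime2);
      CREATE TABLE staging.CustomerFeed (CustomerID int, LoadDate datetime2);
      CREATE TABLE error.CustomerFeed   (CustomerID int, LoadDate datetime2,
                                         ErrorMessage nvarchar(4000));
      GO

      -- Debugging becomes a 1-1 check: row counts per layer for one table name.
      SELECT s.name AS schema_name, t.name AS table_name, SUM(p.rows) AS row_count
      FROM sys.tables t
      JOIN sys.schemas s ON s.schema_id = t.schema_id
      JOIN sys.partitions p ON p.object_id = t.object_id AND p.index_id IN (0, 1)
      WHERE t.name = N'CustomerFeed'
      GROUP BY s.name, t.name;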

    _____________________________________________________________________________
    MCITP: Business Intelligence Developer (2005)

  • I use schemas frequently.

    For example, in a current ETL database, I have core functionality for all files that will be imported into the database, and then there are peripheral objects to support different data-consumers and types of files. (All data is file-based in this system. That's what the database is for.)

    Core objects are in a Files schema. This is locked down pretty heavily, since any change made there, even in a dev environment, will affect everything in the datastreams. Changes to objects in the Files schema are, by policy, subject to a level of QA, review, regression testing, etc., that is extremely detailed. Code in there has to be approved unanimously by a fairly large committee of stakeholders.

    Then there are a number of datastream-specific schemas that feed off of the Files schema. Changes in those affect only that specific datastream. Isolating these objects, even having the same objects in multiple schemas, means that there isn't cross-contamination. Projects to enhance, improve, refactor, etc., in those spaces are subject to a smaller level of scrutiny and a smaller number of stakeholders.

    That isolation has kept a fair number of minor issues contained instead of spreading business-wide. This makes troubleshooting easier, too. If you know where a problem is in terms of what services/applications are showing issues, then you know what datastream it belongs to, and you can more easily narrow down the issue for resolution.

    It also makes it really easy to find objects for a specific project. They will be in the schema for that particular datastream.

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • I have always been a strong advocate of schema use throughout our production system.

    We have a large database structure under an enterprise workforce management system, and traditionally the database had naming conventions for tables, which were haphazard at best.

    The first schema changes we made were to move all our auditing tables (prefixed a_) to an auditing schema; from there, the introduction of code-first development with .NET Entity Framework led to using schemas for our domain modelling.
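
    Moving an existing table into a schema is a one-line transfer per object; a sketch with made-up names:

      -- Move a prefixed auditing table out of dbo into a dedicated schema.
      CREATE SCHEMA auditing;
      GO
      ALTER SCHEMA auditing TRANSFER dbo.a_OrderHistory;
      -- Note: views and procedures that reference the old name are not
      -- rewritten automatically; dependent code must be updated.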

    We also enforce custom schemas for any client customisation, to clearly differentiate it from the product.

  • I frequently use schemas to logically group objects by functional area, department, etc. Sometimes, that is extended to implement schema-level security.
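
    Schema-level security is just a GRANT or DENY at SCHEMA scope; a minimal sketch, with hypothetical schema and role names:

      -- One statement covers every object in the schema,
      -- including objects added later.
      CREATE ROLE FinanceReaders;
      GRANT SELECT ON SCHEMA::Finance TO FinanceReaders;
      DENY SELECT ON SCHEMA::Payroll TO FinanceReaders;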

    I did have one application that was quite different. I was building a new version of an existing database because the needs had expanded far beyond the original design. We were receiving millions of records of text data daily from multiple sources. Each source needed to be kept separate for security and calculation reasons, but the same set of rules and calculations applied to every source.

    I gave each source its own schema, created partition schemes for each, and put the partitioned tables (and other source-specific objects) in the source schemas. Any custom procedures were added to the source schemas with a common name across the schemas. Common objects like reference data and procedures were stored outside the source schemas. When the main process started, it could call custom procedures in the source schemas if they existed. This gave us the flexibility and control for each source while maintaining common base code and rules.
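
    The "call the custom procedure if it exists" part can be done by probing OBJECT_ID for the common name inside the source schema; a sketch, with hypothetical schema and procedure names:

      -- Each source schema may provide usp_PostLoad; the main process calls
      -- it only when the source defines one, else falls back to common code.
      DECLARE @SourceSchema sysname = N'SourceA';
      DECLARE @ProcName nvarchar(300) =
          QUOTENAME(@SourceSchema) + N'.usp_PostLoad';

      IF OBJECT_ID(@ProcName, N'P') IS NOT NULL
          EXEC @ProcName;          -- source-specific rules
      ELSE
          EXEC dbo.usp_PostLoad;   -- common base code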

  • We have a bunch of old views that are used for cross-database queries and they were originally written using hard-coded database pointers. Of course after moving/renaming the databases a few times, and occasionally forgetting to update the database pointers in the views (and dealing with the problems this caused), we fixed the problem using synonyms.

    We create these synonyms in the schema "syn" with the original table name.

    Of course, we still need to rebuild the synonyms every time we move/rename the databases, but that's a single call to a stored procedure instead of having to update every view (and hoping that there aren't any other hard-coded references that were missed).
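
    A sketch of the pattern, with hypothetical database and table names; repointing after a move is just drop-and-recreate, which is what the rebuild procedure automates:

      CREATE SCHEMA syn;
      GO
      -- Local alias for a table in another database.
      CREATE SYNONYM syn.Orders FOR OtherDatabase.dbo.Orders;
      GO

      -- Views reference syn.Orders instead of a hard-coded database name.
      SELECT TOP (10) * FROM syn.Orders;

      -- After a database move/rename, repoint the synonym:
      DROP SYNONYM syn.Orders;
      CREATE SYNONYM syn.Orders FOR RenamedDatabase.dbo.Orders;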

    As a bonus, you get to hear variations on this conversation:

    Developer #1: Why won't this query work?

    Developer #2: That table is in the other database, you have to sin it first!

  • We use schemas for various reasons. One is to separate a small set of tables used to track specific changes to the production database by a programmer. If production data needs to be changed, a ticket is written up for the task and then a specially written program is used to execute the actual SQL that makes the updates. The SQL is written into one of these special tables so that we can track who changed the data, when, for which ticket, and with what SQL.
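
    A minimal sketch of such a tracking table (schema and column names invented):

      CREATE SCHEMA audit;
      GO
      CREATE TABLE audit.DataChangeLog
      (
          ChangeID     int IDENTITY(1, 1) PRIMARY KEY,
          TicketNumber varchar(20)   NOT NULL,
          ChangedBy    sysname       NOT NULL DEFAULT SUSER_SNAME(),
          ChangedAt    datetime2     NOT NULL DEFAULT SYSUTCDATETIME(),
          SqlExecuted  nvarchar(max) NOT NULL  -- the statement actually run
      );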

  • We haven't used schemas much, but we're starting to use them more. Until recently I was under the wrong impression that if someone dropped a table without specifying the schema, it would drop both tables: say you had db_work.stage.tbl_name and db_work.dbo.tbl_name, that a 'DROP TABLE tbl_name' would drop both. I tried it recently and of course it didn't work. So I'm not sure where I got that idea. Would some other drop statement delete both tables? What about older versions of SQL?

    I'll just chalk it up to one of those things I thought I knew but didn't.

    -------------------------------------------------------------
    we travel not to escape life but for life not to escape us
    Don't fear failure, fear regret.

  • I've never experienced dropping two objects when trying to drop one.

    My guess is that using something like the sp_MSforeachtable stored procedure would drop all matching objects if you don't specify a schema (owner in older versions). It depends on current and default schemas and security rights at the time of execution.
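
    The resolution rule is easy to demonstrate: an unqualified name binds to the caller's default schema (falling back to dbo), never to every schema at once. A sketch:

      CREATE SCHEMA stage;
      GO
      CREATE TABLE stage.tbl_name (id int);
      CREATE TABLE dbo.tbl_name   (id int);
      GO

      DROP TABLE tbl_name;  -- resolves via the caller's default schema
                            -- (dbo here), so only dbo.tbl_name is dropped

      SELECT OBJECT_ID(N'stage.tbl_name');  -- still non-NULL: it survived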

    We currently use schemas to separate the different areas of our enterprise data warehouse (landing, staging and warehouse layers). We also have customer-specific schemas for the landing and staging areas, which then come together in the warehouse schema, along with an SSAS schema which exposes the data to a tabular model using views.

    • LandingXXX
    • LandingYYY
    • StagingXXX
    • StagingYYY
    • DWH

    When we bring on a new customer we create new landing and staging schemas and create the objects using a template script. Any customer-specific changes are then made to the schema for that customer only.
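
    Onboarding is then mostly mechanical; a sketch of the template step, with a made-up customer code:

      -- Per-customer landing and staging schemas.
      CREATE SCHEMA LandingZZZ;
      GO
      CREATE SCHEMA StagingZZZ;
      GO

      -- Template objects are created with the same shape in each customer
      -- schema; only customer-specific changes diverge afterwards.
      CREATE TABLE LandingZZZ.SalesExtract
          (SaleID int, Amount decimal(18, 2), LoadDate datetime2);
      CREATE TABLE StagingZZZ.SalesExtract
          (SaleID int, Amount decimal(18, 2), LoadDate datetime2);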
