Avoiding redundant data storage

  • Hello, I have already searched the web on this but did not find a satisfactory answer. Since I am not very experienced in this topic, I would like to start a little discussion on database design.

    Starting point is the following scenario:

    There are two applications. Each application uses its own database.

    Application A is an internal (Windows) application which is used to handle the whole business of a company, e.g. entering orders, administering customer addresses, creating invoices, etc.

    Application B is an external (web) customer information system. This system mainly shows data which was created in application A. Additionally, the customers can change some of the data, which will then be used again by application A.

    At the moment the two systems have completely separate databases, but most of the data in both systems is the same. There are several transfers (SSIS jobs) which synchronise the databases in several cycles.
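    A minimal sketch of what one such cyclic transfer typically does — copy only the rows changed since the last cycle, then upsert them into the target. The `orders` schema and timestamps are hypothetical, and Python's built-in sqlite3 module stands in for the two SQL Server databases:

```python
import sqlite3

# Hypothetical incremental transfer, as an SSIS package might perform it:
# copy rows modified since the last sync cycle from app A's DB to app B's.
src = sqlite3.connect(":memory:")   # stands in for application A's database
dst = sqlite3.connect(":memory:")   # stands in for application B's database

src.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, total REAL, modified_at INTEGER)")
dst.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, total REAL, modified_at INTEGER)")
src.executemany("INSERT INTO orders VALUES (?, ?, ?)",
                [(1, 10.0, 100), (2, 20.0, 200), (3, 30.0, 300)])

last_sync = 150  # high-water mark persisted between cycles

# Pull only rows modified after the last cycle ...
changed = src.execute(
    "SELECT id, total, modified_at FROM orders WHERE modified_at > ?",
    (last_sync,)).fetchall()

# ... and upsert them into the target copy.
dst.executemany(
    "INSERT INTO orders VALUES (?, ?, ?) "
    "ON CONFLICT(id) DO UPDATE SET total = excluded.total, "
    "modified_at = excluded.modified_at",
    changed)
dst.commit()

print(sorted(r[0] for r in dst.execute("SELECT id FROM orders")))  # → [2, 3]
```

    Only orders 2 and 3 cross over, since order 1 predates the high-water mark; the mark would then be advanced and persisted for the next cycle.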

    This solution was chosen because of the following advantages:

    The applications are independent. If one of the databases is down, e.g. for maintenance, the other database / application can still run.

    Security reasons. If someone gains access to the database of application B, which is available over the internet, he only has access to the transferred data. This is only a small part of the data stored in the database of application A.

    The licensing seems to be quite easy. For all users of the internal application A we need CALs, and for the external application we buy a server licence.

    The problems with this solution are:

    Many SSIS packages to maintain.

    In some cases the transfer is relatively complex because data can be changed in both systems.

    Therefore there is a high risk of inconsistencies in the redundant data.

    Some of the transfers take a lot of time and resources and can therefore only be executed once or twice a day. In most cases this time lag is not satisfactory.
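    The inconsistency risk can at least be measured. A hypothetical reconciliation check (invented `customers` schema, sqlite3 standing in for the two databases) that reports ids whose values have diverged between the two copies:

```python
import sqlite3

# Hypothetical consistency check between the two redundant copies:
# fetch the same table from both databases and report ids that disagree.
a = sqlite3.connect(":memory:")  # application A's copy
b = sqlite3.connect(":memory:")  # application B's copy
for db in (a, b):
    db.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, address TEXT)")
a.executemany("INSERT INTO customers VALUES (?, ?)", [(1, "Old St 1"), (2, "Main St 5")])
b.executemany("INSERT INTO customers VALUES (?, ?)", [(1, "Old St 1"), (2, "New St 7")])

rows_a = dict(a.execute("SELECT id, address FROM customers"))
rows_b = dict(b.execute("SELECT id, address FROM customers"))

# ids present in both systems whose data has diverged
conflicts = sorted(i for i in rows_a.keys() & rows_b.keys()
                   if rows_a[i] != rows_b[i])
print(conflicts)  # → [2]: customer 2 was changed in one system only
```

    A check like this only detects the drift; with changes allowed in both systems, deciding which side wins still needs a conflict-resolution rule (e.g. last-writer-wins per column), which is exactly the complexity described above.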

    In my opinion the drawbacks are very big, but I am not really sure if I am right. Maybe this is a common approach to separate external access from a company's internal data.

    I would feel better with a solution which avoids the redundant storage of data and allows both applications to use the same data. What is your opinion on that? There are several technologies which sound interesting for this scenario, e.g. replication, linked servers, Master Data Services... But at the moment I am not sure in which direction to go. I would be very happy if someone could give me a hint as to which technology I should have a closer look at.

    Thanks in advance.

  • You stated all of the negatives about this design... and they get worse as the size and transaction rate increase. The design is not scalable.

    The probability of survival is inversely proportional to the angle of arrival.

  • Thanks for the answer. But what would be a reasonable design?

    One possibility would be to fully integrate the two databases into one big database. That would solve the problem of the redundant data, but then the two applications would be tightly coupled. This means that if we had to shut down the internal database for maintenance, we would also have to shut down the external application for the customers.

    Another idea is to use a third database which contains all the data used by both applications. Then the two applications would not be coupled too tightly. At the moment this seems to be the solution I prefer. But are linked servers the best way to implement it?
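    The shared-database idea can be sketched without committing to linked servers yet. In this hypothetical sketch (invented schema and view names, sqlite3 standing in for the shared SQL Server database), both applications hit the same store, but the external web application is only given a view exposing the subset it needs, which also addresses the security argument above:

```python
import sqlite3

# Sketch of the "shared third database" idea: one store for both apps,
# with the external app restricted to a view (names are hypothetical).
shared = sqlite3.connect(":memory:")
shared.execute(
    "CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, credit_limit REAL)")
shared.execute("INSERT INTO customers VALUES (1, 'Acme', 50000.0)")

# The internal application works on the base table; the external one is
# granted access only to this view, so sensitive columns stay internal.
shared.execute("CREATE VIEW customers_external AS SELECT id, name FROM customers")

rows = shared.execute("SELECT * FROM customers_external").fetchall()
print(rows)  # → [(1, 'Acme')] — credit_limit is not exposed
```

    On SQL Server the same shape would be a schema of views plus GRANTs for the web application's login; there is then one copy of the data and nothing to synchronise.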

  • You would need to clarify what "shutting down the database for maintenance" actually means. What sort of maintenance requires you to shut down the database completely?

    Periodic index maintenance would not require this, nor would archiving of various tables. Yes, you would want to schedule this during hours of minimal activity and use batches where required to maximize concurrency, but beyond that, what other maintenance issues do you have?

    It sounds like an architecture (design) issue to me, and you should consider hiring an experienced database/software architect to help you evaluate your requirements and recommend solutions.


  • At the moment our administration team sometimes needs to restart the server, e.g. to install updates. But you are right, this does not happen very often. Still, one argument for the current design is that the two systems work independently. If, for example, the server for the internal application were down because of a hardware defect, this would not be critical as long as it did not last for several days; we would still be able to do our internal business. But the external application is something that we sold to our customers and is therefore critical. Downtimes of more than a few minutes are not acceptable.

    Hiring an SQL Server expert is a very good suggestion. That would also be my favourite solution, but my boss won't spend the money on such an expert, especially because he did not give me the task of redesigning the databases. I am just trying to find a concept which would work better than the current one in terms of easier maintenance and less redundant data. I think if I am able to create a good concept, I can make sure that he will give me the time to implement it.

    So I am looking for ideas on how to change the current design.

  • TDM13 (7/14/2011)


    At the moment our administration team sometimes needs to restart the server, e.g. to install updates. But you are right, this does not happen very often. Still, one argument for the current design is that the two systems work independently. If, for example, the server for the internal application were down because of a hardware defect, this would not be critical as long as it did not last for several days; we would still be able to do our internal business. But the external application is something that we sold to our customers and is therefore critical. Downtimes of more than a few minutes are not acceptable.

    If I had a 24/7 uptime requirement I'd implement clustering. If there is just a requirement for high availability but you can tolerate being down for up to a few minutes, I'd use a hot-standby scenario. I'm somewhat surprised you don't have such a system in place already, since it is very cost-effective.

    Anyway, I see no reason the two databases could not be merged into one database to simplify the synchronisation logic as well as eliminate the duplicated data (and the risk of data inconsistencies) inherent in the system you are describing.


  • This was already a big help for me. Thank you. I think in this case I will try to create a concept which integrates the two databases.
