Conceptual Question Replicating Data from standalone SQL Server to AWS Cloud

  • This at the moment is a conceptual question as we are just looking at options.

    So the problem we want to solve is replicating data from our SQL server in our own data centre to a RDS SQL Instance within Amazon Web Services.

    We have less than 10 tables of data but they are relatively large combined size of approx 10GB data. The tables reside across 3 databases, each database is TB's in size and the rest of the data not relevant to our requirements. For that reason full backup \ restore and log shipping are out as we dont want to transfer large amounts of un-required data each day to the cloud. Never mind how we technically go about moviong the files.

    So basically we would want to start with a baseline set of data, upload it into our database on the cloud SQL instance and then keep it in sync between the cloud and our local db server.

    A few options we have discussed but I have yet to fully investigate is:

    1. Transactional replication, gives us close to real time updates, but not sure whether this is even possible within AWS. I.e. can it be a subscriber?

    2. CDC, we can connect between the Cloud and our SQL Servers with the use of ODBC drivers, so moving data this way would be an option although requires a bit more coding effort.

    3. Usual ETL type process using SSIS or similar.

    Does anyone have any first hand experience of this sort of design issue that could share their knowledge, doesn't have to be a indepth detailed description but pointers on what tech solution to consider.

    MCITP SQL 2005, MCSA SQL 2012

  • Appears that SQL server replication is not supported on AWS either and restore from file is also not available:

    http://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/SQLServer.Procedural.Importing.html

    Options are looking very limited.

    MCITP SQL 2005, MCSA SQL 2012

  • I had the same issue, transferring our database on an AWS instance to a local server. The database was significantly larger, 120GB.

    We implemented transactional replication.

    The basic steps I followed was outlined in any number of articles found on how to perform replication without a domain.

    The was four years ago, not sure what may have changed with AWS since then, but it is possible and it did work flawlessly.

    Michael L John
    If you assassinate a DBA, would you pull a trigger?
    To properly post on a forum:
    http://www.sqlservercentral.com/articles/61537/

  • Thank you Michael, it appears they must have removed that functionality as per there documentation it states:

    "Amazon RDS for SQL Server does not currently support importing data into the msdb database. SQL Server features that use msdb, such as Database Mail and Replication, are not currently supported in Amazon RDS."

    MCITP SQL 2005, MCSA SQL 2012

  • If you install your own SQL instance then you can use transactional replication.

    As part of our migration from co-lo datacentres to AWS, we had SQL P2P replication running between the data centres and it worked fine. We had some initial issues due to a known low-bandwidth outbound pipe from one of the co-lo DCs and had to upgrade this, but after that it was fine.

    Using SQL P2P and AWS Route 53 allowed us to do a phased migration to AWS. Route 53 allows you to redirect inbound traffic to another location, so we could start by sending 10% of traffic to AWS and 90% to co-lo, and change the ratio when needed until we were happy that AWS could take 100% of our traffic.

    We currently run across 2 Availability Zones, and continue to use SQL P2P as part of our HA and DR process.

    Original author: https://github.com/SQL-FineBuild/Common/wiki/ 1-click install and best practice configuration of SQL Server 2019, 2017 2016, 2014, 2012, 2008 R2, 2008 and 2005.

    When I give food to the poor they call me a saint. When I ask why they are poor they call me a communist - Archbishop Hélder Câmara

  • I assume you mean creating a SQL server instance using EC2 within AWS opposed to creating a SQL server instance via RDS?

    If so that does sound more plausible as I was reading up on it this very morning.

    At the moment I am currently working on this as one option.

    It's all a bit information overload at the moment as on too of this particular conundrum, I am trying to learn pipeline, HDFS, S3, dynamoDB and redshift. I like a challenge but this is proving to be quite a workload.

    MCITP SQL 2005, MCSA SQL 2012

  • There are LOTS of components in AWS. If you try to learn everything before you use anything, it is a very steep learning curve. My advice is to look at what you currently do and find the AWS components that allow you to continue with this. When you have more AWS experience, look at what else may be useful.

    We decided that our initial migration in 2012 would be fairly simple, a lift and shift of what we currently had and minimising other change. We did have to find a new load balancing solution to replace co-lo appliances (Route53 and ELBs), new mail solution (PowerMTA), new backup solution (Zmanda). We also exploited AWS best practice by using a Bastion server to gain access, and use security groups to provide additional firewall capability within our environment.

    Now that we have over a year of AWS experience, we are starting to break out of this shell. We standardised on Windows 2008R2 and SQL2012 for the move, but are now part-way through migrating to W2012R2 and SQL2014. However, we plan to use Redshift as our main future BI database mainly due to cost compared to SQL 2014.

    Original author: https://github.com/SQL-FineBuild/Common/wiki/ 1-click install and best practice configuration of SQL Server 2019, 2017 2016, 2014, 2012, 2008 R2, 2008 and 2005.

    When I give food to the poor they call me a saint. When I ask why they are poor they call me a communist - Archbishop Hélder Câmara

  • EdVassie (2/10/2014)


    There are LOTS of components in AWS. If you try to learn everything before you use anything, it is a very steep learning curve. My advice is to look at what you currently do and find the AWS components that allow you to continue with this. When you have more AWS experience, look at what else may be useful.

    We decided that our initial migration in 2012 would be fairly simple, a lift and shift of what we currently had and minimising other change. We did have to find a new load balancing solution to replace co-lo appliances (Route53 and ELBs), new mail solution (PowerMTA), new backup solution (Zmanda). We also exploited AWS best practice by using a Bastion server to gain access, and use security groups to provide additional firewall capability within our environment.

    Now that we have over a year of AWS experience, we are starting to break out of this shell. We standardised on Windows 2008R2 and SQL2012 for the move, but are now part-way through migrating to W2012R2 and SQL2014. However, we plan to use Redshift as our main future BI database mainly due to cost compared to SQL 2014.

    I dont want to sound negative, but unfortunatly I have no choice on learning the new tech. Its happening and we will be supporting it very very soon. Whether I agree or not its outwith my control. All other other tech I mentioned is for the processing of web logs. I can't really say more than that

    Anyway back on topic, I can at least confirm that its definetly not possible to use replication on RDS SQL Server with either push or pull replication. Tried that this morning. Currently setting up a SQL Server in EC2 and have made a request for a new security group for the testing so we can sort out any connectivity issues with IP's \ ports etc.

    At the moment we are only investigating the possibility of replicating a few lookup tables from our local SQL Server to the AWS SQL Server, this will be lookup data and its not required to sync back the other way. We may never even implement it as we are primarily using MySQL in RDS for a relational DB.

    MCITP SQL 2005, MCSA SQL 2012

  • Success, I have finally managed to get replication working. As mentioned above, running a EC2 AMI with SQL Server installed was the way to go. Its just a POC so needs further work on the security side but essentially I have a DEV DB in our domain with a transactional publication pushing data to the SQL server hosted in the AMI. It takes a little effort with setting up an alias via SQL Server configuration manager, and a SQL login with the correct permissions on the subscriber but it works 🙂

    Now to hand it off to the dev team to see if it gives them what they want and back to the reseach of the other tech.

    Thank you all for your help.

    MCITP SQL 2005, MCSA SQL 2012

  • Hey RTaylor2208, its good to read this post as I have the exact same requirement and am using an EC2 SQL Server database. I'm trying to setup Transactional Replication from the EC2 SQL instance to a local instance (on-premise). I'm hoping the same process you discovered will apply for me, but I am interested regardless.

    I'm particularly interested to know the specific settings you found to work with regard 'setting up an alias via SQL Server configuration manager, and a SQL login with the correct permissions on the subscriber'. Is this something you would be willing to share?

    I've tried to setup an alias but keep getting the same error:

    Cannot connect to [the name of the EC2 SQL server instance].

    ------------------------------

    ADDITIONAL INFORMATION:

    SQL Server replication requires the actual server name to make a connection to the server. Specify the actual server name, '[The server name registered in the instance]'. (Replication.Utilities)

    Any guidance you can provide would be much appreciated. 🙂

  • In the end we ditched the idea as we now use MySQL RDS instances in AWS rather than SQL server. We have a custom solution I designed in house to sync the data from on premise to the RDS instances using python.

    Anyway to setup an alias this is what I did:

    Within SQL Server Configuration manager create the alias to the EC2 instance:

    Alias Name: I called my AWSSQLServer

    Server: The public DNS name of the EC2 Instance something like dbasqlserver.cdwfpn0euha3.us-west-2.rds.amazonaws.com

    Port: The port number of the SQL server i.e. 3306

    Protocol: TCP/IP

    Now that I have been working with AWS for a while a few things you need to be aware of:

    The inbound \ outbound rules for your EC2 instance within the EC2 security group will need to allow traffic from your hosts connecting to the EC2 instance over the SQL server port.

    Even with this enabled in the EC2 Security Group you might still be blocked by either the windows firewall or any ACL rules you have set up for the AWS subnets.

    Best way to start is to check connectivity between systems using telnet and the IP addresses over port 3306 (or the SQL port your using). Get connectivity working first then go from there with the alias stuff.

    MCITP SQL 2005, MCSA SQL 2012

  • Thankyou very much for your prompt reply. Its been extremely helpful and encouraging in enabling me to make progress. I've made some wins today as a result. 🙂

    As you mention the security between the EC2 SQL Server instance and the other remote SQL Server is a bit of a headache, but I've got a few ideas to work around it, which I hope will work.

  • Check out the Cloudbasic RDS SQL AlwaysOn tool from the AWS Marketplace (https://aws.amazon.com/marketplace/pp/B00OU0PE5M/ref=_ptnr_blg_’blg5’ ). It is designed to do exactly what you need - continuously replicate data from your on-premise SQL Server to AWS RDS DB Instance within Amazon Web Services. You can also continuously geo-replicate data from one AWS region to another region.

    I've used this tool for continuous replication of on-premise databases to RDS (aggregating databases form various offices of my client into AWS), and then used Jaspersoft for reporting (https://aws.amazon.com/marketplace/pp/B00B527JQ0). I've also used it for AWS cross-region replication for DR and reporting. And in one occasion I had to deliver a solution to continuously stream data from an RDS SQL Standard DB instance in AWS Virginia, to Frankfurt and Tokyo based RDS SQL Web edition instances for data locality.

  • zheliazkoff (3/10/2016)


    Check out the Cloudbasic RDS SQL AlwaysOn tool from the AWS Marketplace (https://aws.amazon.com/marketplace/pp/B00OU0PE5M/ref=_ptnr_blg_’blg5’ ). It is designed to do exactly what you need - continuously replicate data from your on-premise SQL Server to AWS RDS DB Instance within Amazon Web Services. You can also continuously geo-replicate data from one AWS region to another region.

    I've used this tool for continuous replication of on-premise databases to RDS (aggregating databases form various offices of my client into AWS), and then used Jaspersoft for reporting (https://aws.amazon.com/marketplace/pp/B00B527JQ0). I've also used it for AWS cross-region replication for DR and reporting. And in one occasion I had to deliver a solution to continuously stream data from an RDS SQL Standard DB instance in AWS Virginia, to Frankfurt and Tokyo based RDS SQL Web edition instances for data locality.

    Ouch £600 per month per EC2 instance you spin up with this software, the solution we developed in house costs nothing. Downside being I can't share the exact code due to IP rights.

    MCITP SQL 2005, MCSA SQL 2012

  • Support and ongoing upgrades are included, and it works with RDS SQL Server Web Edition. Supporting your custom solution won't cost less. I am an independent consultant. It works for me. I need to do the job fast. Going the WSFC and native SQL AlwaysOn route does not work for me on AWS.

    Also, can your custom solution do this: http://cloudbasic.net/aws/rds/alwayson/benchmark/

Viewing 15 posts - 1 through 15 (of 15 total)

You must be logged in to reply to this topic. Login to reply