Companies use different flavors of databases today. Companies receive data from business applications and data can first reside in an Online Transaction Processing (OLTP) database. Data is then moved to analytical systems where data is stored in an Online analytical processing (OLAP) database. OLTP and OLAP are two different types of databases and Database Administrators (DBAs) maintain and support these databases. They make sure databases are healthy and available based on the Service Level Agreements (SLAs).
The increased exponential amount of data being received today is resulting in newer challenges that companies are facing today. The cost of managing data, database, and storage is increasing, and companies would like to use their most important assets, their people, judiciously.
The challenges those enterprises are facing are multi-faceted, and we will look at four main aspects:
- Managing Data/ DB proliferation: Managing data not only needs infrastructure to store data but resources, which can be difficult to acquire in the first place. How can companies offload some of the load from DBAs due to this ever-increasing data growth? Companies have a decision to make, either hire more resources to manage data or outsource data and database management to reduce risk.
- Repetitive Database Tasks: DBAs are involved in mundane activities needed to maintain databases, eating up important. How can companies utilize DBA’s time more efficiently?
- Security, Audit, and Compliance: Companies need to make sure data and databases are secured and in compliance with required regulations. How can an enterprise not only secure, but also audit, their databases so that they are in compliance.
- Project Management, and Delivery: DBAs need to adapt to the changing needs of application development methodologies and their delivery schedule. How can we make DBA activities more agile in order to quickly to adapt to the changes in the business?
Today, we have a solution for each of these challenges. The sections below elaborate on the solution to each of the above challenges and the tools that can be helpful.
Managing Data/ DB proliferation with Database Activities in the Cloud
Although traditional DBA activities remain the same for on-premises databases, the database can be managed more efficiently and cost-effectively today with the advent of cloud computing. Cloud computing can be the solution to the data and database proliferation. Cloud computing vendors provide infrastructure to store a large amount of data in the cloud. Moreover, companies can also save costs by putting data in the cloud. There are different types of cloud offerings available to businesses to store applications and databases in either a public or a private cloud. Moreover, cloud infrastructure is more secure today than ever and fulfills most compliance requirements, making it more attractive.
There are a number of cloud providers who also provide database services, like AWS, Azure, Google, or Oracle. These offerings are popular due to their high availability, performance, and cost. As per the Gartner’s magic quadrant for cloud database management systems, AWS, MS Azure, Google, and Oracle all fall in the Leader’s quadrant (source: Gartner).
There are three major types of cloud offerings – Infrastructure as a Service (IaaS), Platform as a Service (PaaS), and Software as a Service (SaaS). This article aims not to discuss different types of cloud offerings but how DBAs need to adjust and conduct their database activities based on the type of cloud they are working with. The following table provides a glimpse of all the database activities and what action needs to be taken by DBAs on each type of infrastructure.
|Architecture & Design||√||√|
|Performance & Tuning||√||√|
|Backup & Recovery||√||√|
|Security & Audit||√||√|
|Manage DB VM Server||√|
|DB Service Configuration||√||√||√|
(Figure 1: Comparing DB activities in cloud)
The company’s DBAs will have more activities to perform if the database is on-premises or on an IaaS cloud solution where the VM, resources, and storage responsibility is with the IaaS provider. DBAs will have fewer activities to perform with PaaS, as the only activity is the configuration and integration of database service providers, but they do need to manage SLAs with PaaS/ DBaaS providers. If an application is a SaaS application, the database will also be maintained by the SaaS vendor.
No matter which cloud flavor the company is on, data ownership will still reside with the company, and that’s why companies need to be careful while engaging with any cloud provider. Some of the items, but not limited to, that we need to be careful about are:
- Reporting requirements
- Cost of database management
- Support provided by the cloud vendor
- Manage egress & ingress rules for the data management
- Performance & Recovery service level agreements (SLAs)
- Soc2 Type2 reports provided by cloud database vendor detailing controls on data and database.
Using Machine Learning (ML) and Automation for Repetitive Database Tasks
A lot of activities database administrators perform are repetitive in nature. These activities can be easily scheduled and triggered based on an event. Artificial Intelligence/ Machine Learning (AI/ML) helps identify patterns and can be used to automate those mundane database activities. AI/ML allows the system to learn from past experience and perform actions as humans. Machines learn the pattern, and as an event starts happening more and more, it becomes easy for the system to trigger actions. AI/ML helps data management, the ETL process, and Master data Management (MDM).
Its imperative for businesses to use automation these days so that machines can execute traditional functions conducted by the human workforce. SQL users can create, train, and deploy machine learning models. Automation can help DBAs eliminate day-to-day mundane DBA activities so that DBAs can focus on more high-value activities. Automated jobs can help in the capacity planning, migration of data, updating patches, and other maintenance jobs like shutdown/ startup, tablespace, and object management, to name a few.
Amazon RDS came up with the first automated database as a service in 2009. Database providers like Oracle & Microsoft SQL Server are moving towards providing self-administered databases or autonomous databases. A self-administered or autonomous database uses machine learning and automatically administers tasks like database tuning, security, backup, updated, and other management tasks without any human intervention. Database tools can also learn how the database is doing and can understand and suggest the corrective actions, as in the case of Automatic Tuning in SQL Server or Oracle.
Data Security, Audit, and Compliance
Data is critical for the organization and needs to be stored and protected such that it is only accessible by the authorized person. Data should be protected from an unauthorized person who can use it for the wrong reasons. In addition, internal or external users can hack data, so it is important to harden the security around data and log the activities of privileged accounts. Understanding the types of data is very important so that the data can be classified appropriately and proper controls can be put in place.
Today, there are many compliance laws around data so that citizens' data is secured and doesn’t get misused. In the case it is found that data is being misused, people have the right to sue the company storing their data. Some of the data compliance laws are GDPR, CCPA, CPRA, HIPAA, FISMA, SOX, and PCI DSS. Therefore, a business needs to abide by these rules and put controls in place to avoid getting into a liability situation. Thus, data compliance becomes the a core plan for any organization, and data protection steps revolve around it.
Data security starts with knowing one's data, what data is coming from what all sources. Once data is gathered then it needs to be classified according to its use and risk level. After classification, proper controls need to be put in place on data based on its criticality. Once all data is secured, it is important to have audits in place to see if the database remains in compliance on a periodical basis. If an audit finds a gap, then corrective actions need to take place so that those security gaps can be removed at the earliest. This cycle may sure data is secured throughout its lifecycle and remain complaint.
The data security lifecycle consists of:
Know your data: First, any organization should know what type of data is being received by the enterprise. We should also know about such data's entry, resting, and exit points. Access control should be managed, and only authenticated users should read or write to the database. DBAs need to make sure all data entry points are taken care of and analyze what type of data is entering.
Classify data: Data compliance plan will help identify and classify types of data. Restricted data like PCI or HIPAA needs to be handled safely. The risk of data is directly dependent on the classification of data. DBAs need to make sure data is being correctly classified as per the company and regulatory compliance.
Apply Control: Controls on data need to be in place. If possible, a centralized system should manage data access controls and authorization. Privileged users and system access should be logged and controlled. Depending on the data, we may need to encrypt data at rest or in transit. DBAs need to make sure role-based access be granted to users based on the principle of least privilege. DBAs need to make sure data is being correctly classified as per the company and regulatory compliance.
Audit & Reporting: Database and data need to be audited. ISACA has provided guidance on conducting an Audit/ Assurance program for SQL Server or Oracle databases. It is easy to follow and complete. The author has undertaken ISACA-based Audit assurance on Oracle databases. DBAs can execute Self-Assurance audits and Internal Control questionnaires to make sure the database is in compliance and doesn't have any gaps.
Corrective Actions: Enterprise needs to take corrective actions based on the audit/ assessment findings. If there is any gap found in the Self-Assurance audits, DBAs should put all the controls that are required so that database is in compliance.
(Figure 2: Data Security & Compliance lifecycle)
Project Management for Agile Database Administration
Readers can gain a lot of information on the Agile methodology by going over www.scaledagileframework.com. Enterprises today want to be agile to change with the changing market conditions. Application development is moving from building a monolithic application to microservices architecture. For the same reason, project management has embraced agile methodology where a monolithic project is delivered in smaller shippable products for greater value.
With changing needs, database development also needs to align with Application development for agile development and release management. DBAs and Data teams need to plan, collaborate, and deliver per sprint cycles and sprint methodology.
DBAs need to:
- Align with the assigned Agile Release Train (ART) or Solution Train. ART is a team having all stakeholders who helps in incrementally developing and delivering solutions.
- Be engaged in PI (Program Increment) planning (usually conducted quarterly). PI Planning is a cadence based planning sessions to align all ART teams are working towards a common goal.
- Coordinate with other Agile teams to know planned release items and important dates. Agile Team is a cross-functional group of individuals who define, build, test, and deliver shippable solution. DBAs needs to make sure they are also engaged with other agile teams
- Be involved in daily scrums to track and deliver shippable database activities.
- Participate in system demo which will help business stakeholders.
- Evaluate, inspect, and adapt after the database activities have been deployed and pivot as per the requirements.
Blockchain and Metaverse are a couple of fields generating a lot of traction and data these days. Companies are already getting into blockchain, and some are planning to get into metaverse space too. This will generate a lot of data that will be needed to be managed throughout its lifecycle. A blockchain is a database which is distributed and shared among network of computer nodes; stores information electronically in digital format; digitally signed and irreversible. This is the basis for all cryptocurrencies. It is like a legal contract stored in the distributed systems.
Metaverse is a network of the virtual world focused on social connections. 3D and Augmented Reality are the main components to bring metaverse experience to the customer. A lot of data and content will be captured in Metaverse. There is a also convergence between blockchain and metaverse where users can buy a digital asset using fungible tokens. Fungible tokens store value (like Bitcoin), and non-fungible tokens store date of art or academic work.
Blockchain & Metaverse Data may be residing in RDMS or non-SQL databases. Since a lot of data is being generated, Data custodians need to make sure all the security and non-functional requirements are well taken care of, apart from functional requirements. Regulations and compliance are of paramount importance for Cryptocurrency and Metaverse, especially when people are using it either as store of value or as a digital real estate.
Enterprise Database Administrator's responsibilities can change based on whether a database is configured on-premises or any flavor of cloud. An enterprise needs to be extra diligent with the cloud databases before engaging with and sharing data & databases with the vendor. Before engaging with cloud database services, reporting, security, auditing, and SLAs must be agreed upon. Machine Learning & Automation can help remove day-to-day mundane DBA activities so that DBAs can focus on other high-value activities. Database providers are also moving towards AI/ML-enabled self-administered databases. Data security is critical, mainly when restricted data like PCI or HIPAA is stored. Enterprise not only needs to secure data and database but also perform self-assessment/ audit of their database to make sure it is in compliance with the rules and regulations. DBAs also must adopt Agile methodologies to align with the application development to deliver value quickly.
With Cloud, Automation, Compliance, and Agile methodologies, DBAs roles and responsibilities have changed a lot today. Now DBAs need to provide value quickly and ensure automation can be used for the DB activities that machines can learn and perform without human intelligence or intervention. This helps an organization to use their resource judiciously and pivot when required. In addition, the advent of new technologies like blockchain, NFTs, and Metaverse data/ content will only grow exponentially going forward. So, we need to make sure there are regulations on managing this data and having a compliance plan around it. So, buckle up; we are going to have a databulous ride.