HDFS is one of the core components of the Hadoop framework and is responsible for storage. Unlike the local storage on a typical computer, HDFS is a distributed file system: a single large file is split into parts that can be stored on different nodes across the cluster. Here are some of the key concepts related to HDFS.
Ask database administrators how they implement disaster recovery in their big data environments and you'll get two typical responses: DR plans are not necessary and backups take up a lot of space. Despite this reasoning, a disaster recovery plan for your big data implementation may be essential for your company's future.
Big data is now a standard part of information technology architecture for most large organizations. As a database administrator, with the holiday season upon us, I have the following items and notions on my holiday wish list. Here's hoping that I am gifted one or more of these; each one gives me something that I want or need.
Big data implementations bring their own problems and will require database administrators and support staff to redesign the data warehouse architecture. Here's how.
Distributed applications are just that: distributed over one or more hardware platforms throughout the enterprise. The database administrator (DBA) has the unenviable task of monitoring these environments and configuring and tuning the database server to meet multiple needs. Now that multiple distributed applications require access to a very large data store, what tuning options are available to help?
Big data is the latest craze. Hardware and software vendors have overwhelmed IT departments with high-speed analytical software, proprietary high-performance hardware, and columnar-based data stores promising quick access and lightning-fast answers to ad hoc analytical queries. Forgotten in this blast of technology are the database administrators' most important responsibilities: backup and recovery.