As big data application success stories (and failures) have appeared in the news and technical publications, several myths have emerged about big data. This article explores a few of the more significant myths, and how they may negatively affect your own big data implementation.
Most large organizations have implemented one or more big data applications. As more data accumulates, internal users and analysts execute more reports and forecasts, which leads to additional queries and analysis, and more reporting. The cycle continues: data growth leads to better analysis, which generates more reporting. Eventually the big data application swells with so much data and querying that performance suffers.
Big data applications are now fairly commonplace in large organizations. It is, however, difficult to simply ‘drop’ these applications into an existing IT infrastructure and expect them to run smoothly. In addition to energy and cooling requirements for new hardware to support the new big data application, other IT areas need to prepare.
Big data implementations are more than just lots of data. Of equal importance is the analytics software used to query the data. Analyzing business data using advanced analytics is common, especially in companies that already have an enterprise data warehouse. It is therefore only natural that your big data application must be integrated with the existing warehouse.
As the enterprise embraces big data, management assumes that staff sizes will decrease. What should be done with those unneeded technologists? One answer: convert them into technology consultants who collaborate and coordinate with the lines of business. In other words, give them customer-facing roles.
Technical support teams usually support familiar hardware and software configurations. Specialization in particular combinations of operating systems and database management software is common, and this allows some team members to gain in-depth experience that is extremely valuable in an enterprise IT setting. How has big data changed this paradigm?
Many big data application implementations seem to begin with an existing data warehouse, one or more new high-volume data streams, and some specialized hardware and software. The data storage issue is often accommodated by installing a proprietary hardware appliance that can store huge amounts of data while providing extremely fast data access. In these cases, do we really need to worry about database design?