By: William Alexander in Trends Tutorials on 2014-01-19
The multitude of user-generated content on social websites and from other sources such as the Internet of Things gives rise to the accumulation and storage of massive amounts of data, aptly termed Big Data. Big Data technologies are still a work in progress and continue to mature as big players such as IBM, Microsoft, and Google work on various tools and technologies to handle Big Data.
Besides these big players, there are open source alternatives such as Apache Hadoop, a software framework that allows for the distributed processing of large data sets across clusters of computers using simple programming models. It is designed to scale up from single servers to thousands of machines, each offering local computation and storage. Rather than rely on hardware to deliver high availability, the library itself is designed to detect and handle failures at the application layer, thereby delivering a highly available service on top of a cluster of computers, each of which may be prone to failure.
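To make the idea of handling failures in software more concrete, here is a minimal single-process sketch in Python. It is not Hadoop code and does not use any Hadoop API: the names `make_worker`, `run_with_retry`, and `max_attempts` are hypothetical, and the "workers" are plain functions that stand in for machines in a cluster. It only illustrates the general principle of rescheduling a task on another machine when one fails.

```python
from typing import Callable, List, Tuple


class WorkerFailure(Exception):
    """Raised by a simulated worker that has crashed."""


def make_worker(name: str, healthy: bool) -> Callable[[List[int]], int]:
    """Build a hypothetical worker that sums a chunk, or fails if unhealthy."""
    def run(chunk: List[int]) -> int:
        if not healthy:
            raise WorkerFailure(name)
        return sum(chunk)
    return run


def run_with_retry(
    chunk: List[int],
    workers: List[Callable[[List[int]], int]],
    max_attempts: int = 4,  # illustrative limit, chosen for this sketch
) -> Tuple[int, int]:
    """Try the task on successive workers until one of them succeeds.

    Returns the result together with the number of attempts it took.
    """
    last_error = None
    for attempt, worker in enumerate(workers[:max_attempts], start=1):
        try:
            return worker(chunk), attempt
        except WorkerFailure as err:
            # The failure is absorbed here; the caller never sees it
            # as long as a later worker completes the task.
            last_error = err
    raise RuntimeError("task failed on every attempted worker") from last_error


if __name__ == "__main__":
    cluster = [make_worker("node-1", healthy=False),
               make_worker("node-2", healthy=True)]
    print(run_with_retry([1, 2, 3], cluster))
```

In this toy version the caller still gets its answer even though the first simulated node failed, which is the behaviour the paragraph above describes at a much larger scale.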
All the modules in Hadoop are designed with the fundamental assumption that hardware failures (of individual machines, or racks of machines) are common and thus should be handled automatically in software by the framework. Apache Hadoop's MapReduce and HDFS components were originally derived from Google's MapReduce and Google File System (GFS) papers, respectively.
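The "simple programming model" behind MapReduce can be illustrated with the classic word count example. The sketch below is plain Python that runs in a single process, not Hadoop code; the function names `map_phase`, `shuffle`, `reduce_phase`, and `word_count` are chosen here for illustration. In a real cluster, the map and reduce steps would run in parallel on many machines and the framework would perform the grouping step between them.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(line: str) -> Iterator[Tuple[str, int]]:
    """Map: emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Shuffle: group all emitted values by their key."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key: str, values: List[int]) -> Tuple[str, int]:
    """Reduce: combine all values for one key into a single count."""
    return key, sum(values)


def word_count(lines: Iterable[str]) -> Dict[str, int]:
    """Run map, shuffle, and reduce in sequence over the input lines."""
    pairs = (pair for line in lines for pair in map_phase(line))
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


if __name__ == "__main__":
    print(word_count(["big data", "Big Data tools"]))
```

The programmer only writes the map and reduce functions. Splitting the input, moving data between machines, and recovering from failures are left to the framework, which is what makes the model simple to use.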
Even Yahoo and Facebook have implemented Hadoop to manage their petabytes of storage, which keep growing every day. This suggests that Hadoop is a technology that has been tested at production scale.
Hadoop implementations of this scale undoubtedly require skilled programmers and engineers. However, innovative services are now available from companies such as Xplenty that offer Hadoop as a Service. With Hadoop as a Service, there is no waiting for a system to be built, and resources can be diverted to areas directly related to the business plan instead of building and maintaining a data center. This can save money and deliver results sooner.