In today’s world volume of data has been so large. This problem can be tackled through Hadoop-based Data Lake. A data lake is a storage repository that holds a vast amount of raw data in its native format until it is needed. There are some strengths, weakness, opportunity and treats.
Strengths:
A Hadoop-based data Lake is a low cost operation, as it is open source software and can be processed on low cost system. Hadoop has a capability of storing and processing all types of data, whether structured or unstructured, which incurs only a small proportion of cost of our currently traditional systems.
Weaknesses:
There can be a lot of confusion due to large volume, variety and velocity of big data. Although Hadoop-based Data Lake is open source and helped to meet the increasing demands of vendor’s products, it had not received complete success yet. There is a lot of security problem with this system. Open source community and vendors tried to eradicate such problem by supporting security and privacy requirement of the organization.