HDFS (US) - Hadoop Distributed File System (HDFS) is a file system that spans all the nodes in a Hadoop cluster for data storage. It links together the file systems on many local nodes to make them into one big file system.
Hadoop is supplemented by an ecosystem of Apache projects, such as Pig (US), Hive (US) and Zookeeper (US), that extend the value of Hadoop and improves its usability.
So what’s the big deal?
Hadoop enables a computing solution that is:
Scalable – New nodes can be added as needed, and added without needing to change data formats, how data is loaded, how jobs are written, or the applications on top.
Cost effective – Hadoop brings massively parallel computing to commodity servers. The result is a sizeable decrease in the cost per terabyte of storage, which in turn makes it affordable to model all your data.
Flexible – Hadoop is schema-less, and can absorb any type of data, structured or not, from any number of sources. Data from multiple sources can be joined and aggregated in arbitrary ways enabling deeper analyses than any one system can provide.
Fault tolerant – When you lose a node, the system redirects work to another location of the data and continues processing without missing a fright beat.
Oracle Big Data
Big Data Architecture:Deliver your new competitive strategy on a unified architecture that provides end-to-end data liquidity, tailored scalability, and security and privacy from the inside out. Innovate on a foundation of big data management, integration, analytics, and applications.
Discover Fast and Predict Fast :Analyze a variety of data by exploring and experimenting for more precise personalized interactions and quicker, more accurate anticipation of changing behavior and situations. Speed discoveries and next best action into operational activities.
Access Hadoop and NoSQL with SQL:Break down data silos and gain the insights you need with the skills you already have by using a single SQL query to seamlessly access data stored in Hadoop, relational databases, and NoSQL stores and scale on any platform
Extend Governance and Security:Reduce risk by extending information protection policies to Hadoop and NoSQL—including seamless authentication, authorization and auditing controls similar to security for your relational database