Posts

First Hadoop job

Dataset

You can generate something yourself, take any text file, or simply google for data sets, which range from random dumps to collections specially prepared for experimenting with machine learning and AI.

Code

I started with the immortal word count. By and large I googled the bigger parts of this code and tried to glue something together from those fragments myself. The biggest problem was that the API has been subject to change. The code below works on Hadoop 2.6, so I am not sure whether it matches the most recent API version. By and large the code skeleton looks as follows:

    public class WordCount

        // a convenient constant used as the value emitted for every word
        private final static IntWritable one = new IntWritable(1);

        // the code trigger
        public static void main (String [] args) throws Exception {
            // in version 1.x it was done differently
            Configuration c = new Configuration();
            [..]
            Job j = Job.getInstance(c, "wordcount");
            // main class declaration
            j.setJarByClass(WordCount....
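The excerpt breaks off mid-line, so here is a minimal sketch of how the whole job could look, assuming the standard Hadoop 2.x org.apache.hadoop.mapreduce API; the mapper/reducer classes and the tokenizing logic are the usual textbook fill-in, not code quoted from the post:

    import java.io.IOException;
    import java.util.StringTokenizer;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.Reducer;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    public class WordCount {

        // convenient constant: the value emitted for every word occurrence
        private final static IntWritable one = new IntWritable(1);

        public static class TokenizerMapper
                extends Mapper<Object, Text, Text, IntWritable> {
            private final Text word = new Text();

            @Override
            public void map(Object key, Text value, Context context)
                    throws IOException, InterruptedException {
                // split the input line into words and emit (word, 1) pairs
                StringTokenizer itr = new StringTokenizer(value.toString());
                while (itr.hasMoreTokens()) {
                    word.set(itr.nextToken());
                    context.write(word, one);
                }
            }
        }

        public static class IntSumReducer
                extends Reducer<Text, IntWritable, Text, IntWritable> {
            private final IntWritable result = new IntWritable();

            @Override
            public void reduce(Text key, Iterable<IntWritable> values,
                               Context context)
                    throws IOException, InterruptedException {
                // sum all counts emitted for a given word
                int sum = 0;
                for (IntWritable val : values) {
                    sum += val.get();
                }
                result.set(sum);
                context.write(key, result);
            }
        }

        // the code trigger
        public static void main(String[] args) throws Exception {
            // in version 1.x this was done differently (JobConf/JobClient)
            Configuration c = new Configuration();
            Job j = Job.getInstance(c, "wordcount");
            j.setJarByClass(WordCount.class);
            j.setMapperClass(TokenizerMapper.class);
            j.setCombinerClass(IntSumReducer.class);
            j.setReducerClass(IntSumReducer.class);
            j.setOutputKeyClass(Text.class);
            j.setOutputValueClass(IntWritable.class);
            FileInputFormat.addInputPath(j, new Path(args[0]));
            FileOutputFormat.setOutputPath(j, new Path(args[1]));
            System.exit(j.waitForCompletion(true) ? 0 : 1);
        }
    }

Packaged into a jar, it would be launched with something like hadoop jar wordcount.jar WordCount <input dir> <output dir>.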

For a start

Initial setup

For an initial setup I got a ready-to-play virtual machine from Oracle: https://www.oracle.com/technetwork/database/bigdata-appliance/oracle-bigdatalite-2104726.html . Of course it is beneficial to start with your own Hadoop installation, but for looking around this is a much faster start.

Hadoop versions

The first thing I hit was Hadoop versions. There are three.

HDFS

All of them are built on HDFS, which can be thought of as a virtual file system, where the user finds files, directories, ACLs (not fully supported), and other goodies known from other file systems. Data schema/constraints/etc. are enforced here not on write (as e.g. in RDBMSes) but on read, so all those constraints are up to the user code, which by default works with raw input. By and large HDFS consists of a NameNode and DataNodes: the NameNode manages the Hadoop metadata, which covers how data is distributed among the DataNodes, while the DataNodes keep the data itself in the form of standardized blocks - this...
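To make the NameNode/DataNode split concrete from the client side, here is a minimal sketch using the Java FileSystem API (org.apache.hadoop.fs); the paths are hypothetical and the configuration is assumed to be picked up from the VM's core-site.xml:

    import java.io.BufferedReader;
    import java.io.InputStreamReader;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileStatus;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class HdfsPeek {
        public static void main(String[] args) throws Exception {
            // picks up fs.defaultFS from core-site.xml on the classpath;
            // on the BigDataLite VM this should point at the local NameNode
            Configuration conf = new Configuration();
            FileSystem fs = FileSystem.get(conf);

            // listing a directory is answered by the NameNode from metadata
            for (FileStatus status : fs.listStatus(new Path("/user"))) {
                System.out.println(status.getPath() + "  "
                        + status.getLen() + " bytes");
            }

            // reading a file streams blocks from the DataNodes holding them
            Path file = new Path("/user/oracle/input.txt"); // hypothetical path
            try (BufferedReader reader = new BufferedReader(
                    new InputStreamReader(fs.open(file)))) {
                System.out.println(reader.readLine());
            }
        }
    }

Note the division of labour described above: the directory listing only talks to the NameNode (pure metadata), while opening the file pulls the actual blocks from whichever DataNodes hold them.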