We have set up a Hadoop cluster with 2 machines and are trying to implement a cluster for our real-time projects. We need information about uploading data in a multi-node cluster. Suppose I have 9 data nodes: which slave node do we need to upload the data to? Can I choose to upload the data to only 2 of the slave nodes? If I upload data into HDFS, is it replicated to the other slave nodes? We also observed that HDFS is currently using the /tmp location; if /tmp is full, which location will HDFS use?
The purpose of adding more nodes to the cluster is to enlarge the data storage. Are you looking to secure the cluster so that only certain users have the privilege to upload data into HDFS? If so, you can set up Kerberos principals or authorize specific users to upload the data.
Data replication: Yes, once the data is uploaded to HDFS, it is replicated across the data nodes. When a data node is decommissioned, HDFS takes care of the data: the blocks on the decommissioned node are moved to other nodes.
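
To make the replication and decommission behaviour above concrete, here is a small toy model in Python. It is only a sketch and not how HDFS is implemented: the real NameNode uses a rack-aware placement policy, while this model picks nodes at random. The node names, the block count, and the helper names `place_blocks` and `decommission` are made up for illustration; the replication factor of 3 matches the HDFS default for `dfs.replication`.

```python
import random


def place_blocks(num_blocks, nodes, replication=3, seed=0):
    """Toy placement: give each block `replication` distinct nodes at random.

    Assumption: random choice stands in for HDFS's real rack-aware policy.
    """
    rng = random.Random(seed)
    return {block: set(rng.sample(nodes, replication)) for block in range(num_blocks)}


def decommission(placement, node, nodes, seed=1):
    """Move every replica held by `node` to a live node that lacks that block."""
    rng = random.Random(seed)
    live = [n for n in nodes if n != node]
    for block, holders in placement.items():
        if node in holders:
            holders.remove(node)
            # Only nodes that do not already hold this block are valid targets.
            candidates = [n for n in live if n not in holders]
            holders.add(rng.choice(candidates))
    return placement


# 9 data nodes, as in the question; the names are hypothetical.
nodes = [f"datanode{i}" for i in range(1, 10)]
placement = place_blocks(num_blocks=20, nodes=nodes)
placement = decommission(placement, "datanode4", nodes)

for block, holders in sorted(placement.items()):
    print(block, sorted(holders))
```

In this model every block still has 3 replicas after `datanode4` is removed, and none of them live on the removed node. This mirrors the point above: the client does not pick the slave nodes, and the data is kept replicated when a node leaves the cluster.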