I'm considering data replication between clusters for 2 use cases:
For the first one, I tend to think Falcon is the right option. But for the second one, I want to replicate data as soon as it is available (that is, at the end of the put for HDFS, and at the end of table creation for Hive). What would be your view on this?
Just discovered ReAir https://github.com/airbnb/reair
Seems like a good tool to look at. :)