
How to transfer data from one system to another system's HDFS (connected through LAN) using Flume?


I have a computer on a LAN. I need to transfer data from this system to an HDFS location on another system using Flume.

I tried using the IP address of the sink system, but it didn't work. Please help.

Regards,

Athiram


Solution

  • This can be achieved with Flume's Avro sink and Avro source.

    Flume has to be installed on both machines. Create a config file with the following contents and run it on the source system, where the logs are generated.

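    # Agent a1 runs on the source system
    a1.sources = tail-file
    a1.channels = c1
    a1.sinks = avro-sink
    
    # In-memory channel connecting the source to the sink
    a1.channels.c1.type = memory
    a1.channels.c1.capacity = 1000
    
    # Spooling-directory source: ingests files dropped into spoolDir
    a1.sources.tail-file.type = spooldir
    a1.sources.tail-file.spoolDir = <location of spool directory>
    a1.sources.tail-file.channels = c1
    
    # Avro sink: forwards events to the Avro source on the destination system
    a1.sinks.avro-sink.type = avro
    a1.sinks.avro-sink.channel = c1
    a1.sinks.avro-sink.hostname = <IP address of destination system where the data has to be written>
    # Must match the port of the Avro source on the destination agent
    a1.sinks.avro-sink.port = 44444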
    

    Create a second config file with the following contents and run it on the destination system, where the data is written to HDFS.

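    # Agent a2 runs on the destination system
    a2.sources = avro-collection-source
    a2.sinks = hdfs-sink
    a2.channels = mem-channel
    
    a2.channels.mem-channel.type = memory
    a2.channels.mem-channel.capacity = 1000
    
    # Avro source: bind to all interfaces so the source system can connect
    # over the LAN (binding to localhost accepts only local connections)
    a2.sources.avro-collection-source.type = avro
    a2.sources.avro-collection-source.bind = 0.0.0.0
    # Port that the Avro sink on the source agent must connect to
    a2.sources.avro-collection-source.port = 44444
    a2.sources.avro-collection-source.channels = mem-channel
    
    # HDFS sink: writes the received events to the local HDFS
    a2.sinks.hdfs-sink.type = hdfs
    a2.sinks.hdfs-sink.channel = mem-channel
    a2.sinks.hdfs-sink.hdfs.writeFormat = Text
    a2.sinks.hdfs-sink.hdfs.filePrefix = testing
    a2.sinks.hdfs-sink.hdfs.path = hdfs://localhost:54310/user/hduser/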
    

    Once both agents are running, files placed in the spool directory on the source system will be written to HDFS on the destination system.
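
    The two agents could be started roughly as follows. This is only a sketch: the file names `destination-agent.conf` and `source-agent.conf` and the `conf` directory are assumptions for illustration, while the agent names `a1` and `a2` come from the configs above.

    ```shell
    # On the destination system: start agent a2 so that its Avro source
    # is listening before the sender tries to connect.
    # (destination-agent.conf is a hypothetical file name)
    flume-ng agent --conf conf --conf-file conf/destination-agent.conf --name a2 -Dflume.root.logger=INFO,console

    # On the source system: start agent a1.
    # (source-agent.conf is a hypothetical file name)
    flume-ng agent --conf conf --conf-file conf/source-agent.conf --name a1 -Dflume.root.logger=INFO,console
    ```

    The value passed to `--name` has to match the prefix used in the config file (`a1` or `a2`), otherwise the agent starts with no components. Logging to the console makes connection errors between the Avro sink and the Avro source easier to spot while testing.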

    Regards,

    Athiram