json, flume, flume-ng

Failed loading positionFile: error when using the TAILDIR source in Flume


I am using Flume to append data from a local directory to HDFS with the TAILDIR source.

My use case is a delta load: when a new line is added to a source file in the local directory, it should be appended to HDFS.

This is my Flume configuration file:

#configure the agent
agent.sources=r1
agent.channels=k1
agent.sinks=c1

agent.sources.r1.type=TAILDIR
agent.sources.r1.positionFile = /home/flume/Documents/taildir_position.json
agent.sources.r1.filegroups=f1
agent.sources.r1.filegroups.f1=/home/flume/Documents/spooldir/
agent.sources.r1.batchSize = 20
agent.sources.r1.writePosInterval=2000
agent.sources.r1.maxBackoffSleep=5000
agent.sources.r1.fileHeader = true

agent.sources.r1.channels=k1
agent.channels.k1.type=memory
agent.channels.k1.capacity=10000
agent.channels.k1.transactionCapacity=1000

agent.sinks.c1.type=hdfs
agent.sinks.c1.channel=k1
agent.sinks.c1.hdfs.path=hdfs://localhost:8020/flume_sink
agent.sinks.c1.hdfs.batchSize = 1000
agent.sinks.c1.hdfs.rollSize = 268435456
agent.sinks.c1.hdfs.writeFormat=Text

I run Flume with this command: flume-ng agent -n agent -c conf -f /home/swechchha/Documents/flumereal.conf

I get an error while the JSON file is being loaded: "Unable to Load Json file".


Solution

  • The problem is in the Flume configuration shown in the question.

    TAILDIR source: it watches the specified files and tails them in near real time as soon as new lines are appended to each file. If new lines are still being written, the source retries reading them until the write completes.

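    The filegroups property must point to files, not to a bare directory. When the directory contains multiple files, give the directory path followed by a regular expression that matches the file names (the regular expression applies to the file name only, not to the directory part), for example: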

    agent.sources.r1.filegroups.f1=/home/flume/Documents/spooldir/.*txt.*
    

    Then restart the agent with the updated configuration; new lines appended to the matching files should now be picked up and delivered to HDFS.
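
    For reference, here is a minimal sketch of how the source section could look with the corrected file group. The first group reuses the paths from the question. The second group, f2, and its directory /home/flume/Documents/otherlogs/ are hypothetical and are included only to show how more than one group can be declared as a space-separated list. Treat this as an illustration under those assumptions, not as a tested configuration.

    ```properties
    # TAILDIR source: each file group is an absolute directory path followed by
    # a regular expression that is matched against the file name only.
    agent.sources.r1.type = TAILDIR
    agent.sources.r1.positionFile = /home/flume/Documents/taildir_position.json

    # Space-separated list of group names (f2 is a hypothetical second group).
    agent.sources.r1.filegroups = f1 f2

    # f1: any file in spooldir whose name contains "txt" (from the answer above).
    agent.sources.r1.filegroups.f1 = /home/flume/Documents/spooldir/.*txt.*

    # f2: hypothetical directory, shown only to illustrate a second pattern.
    agent.sources.r1.filegroups.f2 = /home/flume/Documents/otherlogs/.*log.*

    agent.sources.r1.channels = k1
    ```

    Note that the pattern is a regular expression, not a shell glob, so .*txt.* matches names such as data.txt and data.txt.1, whereas *.txt would not be a valid way to write it.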