Fetching JMS Header in Apache Flume


I am trying to consume JMS messages (IBM WebSphere MQ) using Apache Flume and store the data in HDFS. While reading a message, I can see only its body, not its header content.

Is it possible to read the JMS message together with its header properties using Apache Flume?

My configuration:

# Source definition
u.sources.s1.type=jms
u.sources.s1.initialContextFactory=ABC
u.sources.s1.connectionFactory=<my connection factory>
u.sources.s1.providerURL=ABC
u.sources.s1.destinationName=r1
u.sources.s1.destinationType=QUEUE
# Channel definition
u.channels.c1.type=file
u.channels.c1.capacity=10000000
u.channels.c1.checkpointDir=/checkpointdir
u.channels.c1.transactionCapacity=10000
u.channels.c1.dataDirs=/datadir
# Sink definition
u.sinks.r1.type=hdfs
u.sinks.r1.channel=c1
u.sinks.r1.hdfs.path=/message/%Y%m%d
u.sinks.r1.hdfs.filePrefix=event_
u.sinks.r1.hdfs.fileSuffix=.xml
u.sinks.r1.hdfs.fileType=DataStream
u.sinks.r1.hdfs.writeFormat=Text
u.sinks.r1.hdfs.useLocalTimeStamp=TRUE

Solution

  • JMS defines several message types; see "Table 30–2 JMS Message Types" here.

    Flume's DefaultJMSMessageConverter handles a TextMessage as shown here; the relevant branch is reproduced below for reference:

    ...
    else if (message instanceof TextMessage) {
      TextMessage textMessage = (TextMessage) message;
      event.setBody(textMessage.getText().getBytes(charset));
    }
    ...
    

    As this branch shows, the converter puts only the text of the message into the event body; the JMS header fields are not part of it.

    IMHO, you have two options:

    1. If at all possible, have the sender include the header name/value pairs in the message body itself and use the "DefaultJMSMessageConverter" as is.
    2. Build your own "flume-jms-source.jar" with a custom JMSMessageConverter: treat the "message" as a javax.jms.Message, read the JMS headers from it, and set them on the SimpleEvent.
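
For option 2, the header-collecting part of such a converter might look roughly like the sketch below. It is only an illustration under assumptions: MessageView is a hypothetical stand-in that mirrors a few getters of javax.jms.Message so the snippet compiles without the JMS or Flume jars, and the class name, helper names, and sample values are made up. In a real converter you would call the same getters on the javax.jms.Message itself (they throw JMSException, and getPropertyNames() returns a raw Enumeration) and pass the resulting map to the event's setHeaders(...).

```java
import java.util.Collections;
import java.util.Enumeration;
import java.util.LinkedHashMap;
import java.util.Map;

class JmsHeaderSketch {

    // Hypothetical stand-in mirroring a few getters of javax.jms.Message,
    // so this sketch compiles without the JMS API on the classpath.
    interface MessageView {
        String getJMSMessageID();
        String getJMSCorrelationID();
        long getJMSTimestamp();
        Enumeration<String> getPropertyNames();
        String getStringProperty(String name);
    }

    // Collects JMS header fields and application properties into one map.
    // A custom converter could hand this map to the Flume event's setHeaders(...).
    static Map<String, String> extractHeaders(MessageView message) {
        Map<String, String> headers = new LinkedHashMap<>();
        putIfPresent(headers, "JMSMessageID", message.getJMSMessageID());
        putIfPresent(headers, "JMSCorrelationID", message.getJMSCorrelationID());
        headers.put("JMSTimestamp", Long.toString(message.getJMSTimestamp()));

        // Application-defined properties are copied under their own names.
        Enumeration<String> names = message.getPropertyNames();
        while (names.hasMoreElements()) {
            String name = names.nextElement();
            putIfPresent(headers, name, message.getStringProperty(name));
        }
        return headers;
    }

    // Skips null values so absent headers do not appear in the map.
    private static void putIfPresent(Map<String, String> map, String key, String value) {
        if (value != null) {
            map.put(key, value);
        }
    }

    // Made-up sample message: no correlation ID, one application property.
    static MessageView sampleMessage() {
        final Map<String, String> props = new LinkedHashMap<>();
        props.put("orderType", "retail");
        return new MessageView() {
            public String getJMSMessageID() { return "ID:sample-1"; }
            public String getJMSCorrelationID() { return null; }
            public long getJMSTimestamp() { return 1700000000000L; }
            public Enumeration<String> getPropertyNames() {
                return Collections.enumeration(props.keySet());
            }
            public String getStringProperty(String name) { return props.get(name); }
        };
    }

    public static void main(String[] args) {
        // Prints the collected header map for the sample message.
        System.out.println(extractHeaders(sampleMessage()));
    }
}
```

As far as I know, the JMS source also has a converter.type property that accepts the fully qualified class name of a JMSMessageConverter implementation, so it may be enough to put the custom class on Flume's classpath instead of rebuilding the whole jar; please check this against the documentation for your Flume version.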

    Hope this gives some direction.