java hadoop hbase

Handling images, video, and audio in HBase


Does anybody know how to handle unstructured data such as audio, video, and images using HBase? I have tried a lot but could not work out how to do it. Any help is appreciated.


Solution

    • Option 1: Convert the image to a byte array, build a Put request with it, and insert it into the table. Audio and video files can be handled in a similar way.

    See https://docs.oracle.com/javase/7/docs/api/javax/imageio/package-summary.html


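    import java.awt.image.BufferedImage;
    import java.io.ByteArrayOutputStream;
    import java.io.File;
    import java.io.IOException;
    import javax.imageio.ImageIO;

    /*
     * Convert an image to a byte array.
     * Note: the image is decoded and then re-encoded as JPEG.
     */
    private byte[] convertImageToByteArray(String imageName) throws IOException {
        // Decode the image file; ImageIO.read returns null if no reader supports the format
        BufferedImage originalImage = ImageIO.read(new File(imageName));
        if (originalImage == null) {
            throw new IOException("Unsupported image format: " + imageName);
        }

        // Convert the BufferedImage to a byte array
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        ImageIO.write(originalImage, "jpg", baos);
        return baos.toByteArray();
    }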
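
    ImageIO only reads image formats, and writing with "jpg" re-encodes the picture. For audio and video, or to keep the original bytes unchanged, one possible approach is to read the file's raw bytes. This is a minimal sketch, assuming the file fits in memory; the names `MediaFileBytes` and `readMediaBytes` and the sample file are illustrative and not part of the original answer.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

final class MediaFileBytes {

    // Reads the whole file into memory exactly as stored on disk,
    // without decoding or re-encoding it.
    static byte[] readMediaBytes(String fileName) throws IOException {
        Path path = Paths.get(fileName);
        return Files.readAllBytes(path);
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical sample file standing in for a real audio/video file
        Path sample = Files.createTempFile("sample", ".mp3");
        Files.write(sample, new byte[] {0x49, 0x44, 0x33});

        byte[] bytes = readMediaBytes(sample.toString());
        System.out.println("Read " + bytes.length + " bytes");

        Files.delete(sample);
    }
}
```

    The returned byte[] can then be used as the cell value of a Put request, just like the array produced by convertImageToByteArray.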
    
    • Option 2: Use the Apache Commons Lang API, as shown below. This is probably a better option than the one above, because it works for any Serializable object, including image, audio, and video data. It is not limited to HBase; you can use it to save data in HDFS as well.

    See my answer for more details.

    For example: byte[] mediaInBytes = org.apache.commons.lang.SerializationUtils.serialize(Serializable obj)

    To deserialize, use static Object deserialize(byte[] objectData).

    See the documentation at the link above.

    Example usage of SerializationUtils:

    import java.io.FileInputStream;
    import java.io.FileOutputStream;
    
    import org.apache.commons.lang.SerializationUtils;
    
    public class SerializationUtilsTest {
      public static void main(String[] args) {
        try {
          // File to serialize the object to (the object could equally be your image or other media data)
          String fileName = "testSerialization.ser";
    
          // New file output stream for the file
          FileOutputStream fos = new FileOutputStream(fileName);
    
          // Serialize String
          SerializationUtils.serialize("SERIALIZE THIS", fos);
          fos.close();
    
          // Open FileInputStream to the file
          FileInputStream fis = new FileInputStream(fileName);
    
          // Deserialize and cast into String
          String ser = (String) SerializationUtils.deserialize(fis);
          System.out.println(ser);
          fis.close();
        } catch (Exception e) {
          e.printStackTrace();
        }
      }
    }
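
    The example above serializes a String to a file, whereas an HBase cell value is a byte[]. SerializationUtils is built on standard Java object serialization, so the in-memory round trip can be sketched with the JDK alone. This is only an illustration under that assumption, not the Commons Lang implementation; the class name `ByteArrayRoundTrip` and the sample data are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.Arrays;

final class ByteArrayRoundTrip {

    // Serializes any Serializable object into an in-memory byte array.
    static byte[] serialize(Serializable obj) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(obj);
        }
        return buffer.toByteArray();
    }

    // Rebuilds the object from bytes produced by serialize().
    static Object deserialize(byte[] data) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        byte[] media = {10, 20, 30};          // stand-in for real media content
        byte[] stored = serialize(media);      // arrays of primitives are Serializable
        byte[] restored = (byte[]) deserialize(stored);
        System.out.println(Arrays.equals(media, restored));
    }
}
```

    Java serialization adds a header, so the stored array is larger than the raw content. It must be read back with the matching deserialize call rather than treated as the original file bytes.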
    

    Note: the Apache Commons Lang jar is normally already available on a Hadoop cluster, so it does not need to be added as an external dependency.