I'm trying to write a simple RTF document pretty much from scratch in Java, and I'm trying to embed JPEGs in the document. Here's an example of a JPEG (a 2x2-pixel JPEG consisting of three white pixels and a black pixel in the upper left, if you're curious) embedded in an RTF document (generated by WordPad, which converted the JPEG to WMF):
I've been reading the RTF specification, and it looks like you can specify that the image is a JPEG, but since WordPad always converts images to WMF, I can't see an example of an embedded JPEG. So I may also end up needing to transcode from JPEG to WMF or something....
But basically, I'm looking for how to generate the binary or hexadecimal (Spec, p.148: "These pictures can be in hexadecimal (the default) or binary format.") form of a JPEG given a file URL.
EDIT: I have the stream stuff working all right, I think, but still don't understand exactly how to encode it, because whatever I'm doing, it's not RTF-readable. E.g., the above picture instead comes out as:
This PHP library would do the trick, so I'm trying to port the relevant portion to Java. Here it is:
$imageData = file_get_contents($this->_file);
$size = filesize($this->_file);
$hexString = '';
for ($i = 0; $i < $size; $i++) {
    $hex = dechex(ord($imageData{$i}));
    if (strlen($hex) == 1) {
        $hex = '0' . $hex;
    }
    $hexString .= $hex;
}
return $hexString;
But I don't know what the Java analogue to dechex(ord($imageData{$i})) is. :( I got only as far as the Integer.toHexString() function, which takes care of the dechex part.
Thanks all. :)
Given a file URL for any file you can get the corresponding bytes by doing (exception handling omitted for brevity)...
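// Needs java.io.ByteArrayOutputStream, java.io.InputStream and java.net.URL
int BUF_SIZE = 512;
URL fileURL = new URL("http://www.somewhere.com/someurl.jpg");
InputStream inputStream = fileURL.openStream();
byte[] smallBuffer = new byte[BUF_SIZE];
ByteArrayOutputStream largeBuffer = new ByteArrayOutputStream();
int numRead;
// read() can return fewer than BUF_SIZE bytes before the stream ends,
// so keep going until it returns -1 instead of stopping on a short read.
while ((numRead = inputStream.read(smallBuffer, 0, BUF_SIZE)) != -1) {
    largeBuffer.write(smallBuffer, 0, numRead);
}
inputStream.close();
byte[] bytes = largeBuffer.toByteArray();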
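If it helps, the read loop can be pulled into a small self-contained helper. This is only a sketch: the class name `StreamBytes` and the method `readAll` are made up for illustration, and it accepts any `InputStream`, such as the one returned by `fileURL.openStream()`.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

class StreamBytes {
    // Hypothetical helper: drains the whole stream into a byte array.
    // The caller is responsible for closing the stream.
    static byte[] readAll(InputStream in) throws IOException {
        byte[] smallBuffer = new byte[512];
        ByteArrayOutputStream largeBuffer = new ByteArrayOutputStream();
        int numRead;
        // read() returns -1 at end of stream; a short read alone does not mean EOF.
        while ((numRead = in.read(smallBuffer, 0, smallBuffer.length)) != -1) {
            largeBuffer.write(smallBuffer, 0, numRead);
        }
        return largeBuffer.toByteArray();
    }
}
```

Usage would look like `byte[] bytes = StreamBytes.readAll(fileURL.openStream());`. On Java 9 and later, `InputStream.readAllBytes()` does the same job without a helper.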
I'm looking at your PHP snippet now and realizing that RTF is a bizarre specification! It looks like each byte of the image is encoded as 2 hex digits (which doubles the size of the image for no apparent reason). Then the entire thing is stored as raw ASCII text. So, you'll want to do...
StringBuilder hexStringBuilder = new StringBuilder(bytes.length * 2);
for (byte imageByte : bytes) {
    // Mask with 0xFF so negative bytes are not sign-extended to 8 hex digits
    String hexByteString = Integer.toHexString(0x000000FF & (int) imageByte);
    if (hexByteString.length() == 1) {
        hexByteString = "0" + hexByteString;
    }
    hexStringBuilder.append(hexByteString);
}
String hexString = hexStringBuilder.toString();
byte[] hexBytes = hexString.getBytes("UTF-8"); // Could also use US-ASCII
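To tie this back to the RTF side, here is a rough sketch of how the hex string could be dropped into a picture group. In the RTF spec a picture is a `{\pict ...}` group and `\jpegblip` marks the data as JPEG. The class and method names (`RtfJpeg`, `toHex`, `pictGroup`) are hypothetical, and the bare group shown here is an assumption: I have not checked which readers render it, or whether size control words are needed alongside it.

```java
class RtfJpeg {
    private static final char[] HEX = "0123456789abcdef".toCharArray();

    // Two lowercase hex digits per byte, equivalent to the loop above.
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            // The masks discard the sign-extended upper bits of negative bytes.
            sb.append(HEX[(b >> 4) & 0x0F]).append(HEX[b & 0x0F]);
        }
        return sb.toString();
    }

    // Assumption: a minimal {\pict\jpegblip <hex>} group with no size keywords.
    static String pictGroup(byte[] jpegBytes) {
        return "{\\pict\\jpegblip " + toHex(jpegBytes) + "}";
    }
}
```

The returned string is plain ASCII, so it can be written straight into the document body with whatever writer produces the rest of the RTF.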
EDIT: Updated code sample to pad 0's on the hex bytes
EDIT: negative bytes were getting sign-extended when converted to ints >_<
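To see why the `0xFF` mask matters: a Java `byte` is signed, so widening a value like 0xFF to `int` gives -1, and `Integer.toHexString` then produces eight digits instead of two. A small sketch (the class name `MaskDemo` is just for illustration):

```java
class MaskDemo {
    // Without the mask, the byte is widened to int with its sign preserved.
    static String unmasked(byte b) {
        return Integer.toHexString(b);
    }

    // With the mask, only the low 8 bits survive.
    static String masked(byte b) {
        return Integer.toHexString(b & 0xFF);
    }

    public static void main(String[] args) {
        byte b = (byte) 0xFF;
        System.out.println(unmasked(b)); // ffffffff
        System.out.println(masked(b));   // ff
    }
}
```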