I'm working with a Flickr API call that returns one result per photo, like:
<photo id="7503362468" owner="59044395@N02" secret="66b94027db" server="8423" farm="9" title="Potluck" ispublic="1" isfriend="0" isfamily="0" />
Now, according to Flickr's URL/API documentation, their URLs are structured like this, where [mstzb] stands for a one-letter indicator of the size of the photo:
http://farm{farm-id}.staticflickr.com/{server-id}/{id}_{secret}_[mstzb].jpg
So, my question has to do with a mass search and replace that can take each line, prepend the http://farm and then basically just "fill in the blanks" for the rest. The goal would be to use the API to fetch a restful XML that I can then throw the replacer at and have a list of URLs get generated.
I have a brief familiarity with sed - admittedly no wizard at it - but I'm just unsure of how to do a search and replace per line, that prepends, then replaces in the proper order. Of course, the farm-id is the first to go into the URL, and is the fifth field in the XML - what I mean is the search and replace pattern follows the same locations for each line. Admittedly, again, I'm just getting started with regex-type stuff and any help would be appreciated.
I also see that this sort of question has been asked before, but they seemed to be focused on how to create URL syntax rather than a sed-style replace. Like I said, my sed knowledge is more based around simple s/unnecessary/necessary - I am just unsure of how to pick out certain quoted fields and move them into a preformed line.
edit: A little more info - I'm using Flickr's API Explorer to generate these XML files, and typically work with bash for editing. I think what I am after here is more along the lines of a bash script or possibly even a piece of (hopefully) executable programming language. I will hasten to add that although I do have a 'little' familiarity working with languages like python, I have zero to no experience with writing code aside from bash scripts. You can check out the API Explorer here: http://www.flickr.com/services/api/explore/?method=flickr.photos.search
Thanks y'all!
Three solutions using awk:
Solution 1. Assumes that every xml record looks like the sample given, with all the fields in exactly the sample's sequence:
The double quote is set as the field delimiter, so each attribute value lands in an even-numbered field ($2 is id, $6 is secret, $8 is server, $10 is farm) and can be accessed as a positional variable within the input line.
A file could have many input records and all will be converted in one execution.
#!/usr/bin/awk -f
#<photo id="7503362468" owner="59044395@N02" secret="66b94027db" server="8423" farm="9" title="Potluck" ispublic="1" isfriend="0" isfamily="0" />
#fields with " as separator: $2=id $4=owner $6=secret $8=server $10=farm $12=title $14=ispublic $16=isfriend $18=isfamily
#http://farm{farm-id}.staticflickr.com/{server-id}/{id}_{secret}_[mstzb].jpg
#usage ./xml2url.awk <file_of_xml_text
BEGIN {FS="\""}
{print "http://farm"$10".staticflickr.com/"$8"/"$2"_"$6"_[mstzb].jpg"}
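If the attribute order cannot be relied on, a rough variation is to look each attribute up by name instead of by position. This is only a sketch: the function names photo_urls and attr are made up for illustration, and it assumes each photo element sits on a single line with double-quoted attribute values.

```shell
# Hypothetical helper: build URLs by attribute name rather than field position.
photo_urls() {
  awk '
    # Return the value of the named attribute in line, or "" if it is absent.
    function attr(line, name,    pat, s) {
      pat = " " name "=\"[^\"]*\""
      if (!match(line, pat)) return ""
      s = substr(line, RSTART, RLENGTH)   # for example:  farm="9"
      sub(/^[^"]*"/, "", s)               # drop everything up to the opening quote
      sub(/"$/, "", s)                    # drop the closing quote
      return s
    }
    # Only lines that contain a photo element produce output.
    /<photo / {
      print "http://farm" attr($0, "farm") ".staticflickr.com/" attr($0, "server") "/" attr($0, "id") "_" attr($0, "secret") "_[mstzb].jpg"
    }
  ' "$@"
}
```

It reads standard input or any files passed as arguments, for example photo_urls <file_of_xml_text.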
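Solution 2. This solution assumes you can edit the XML, replacing
<photo
with
echo x|./xml2urlv2.awk
and replacing
/>
with nothing. Each record then becomes a command line, and awk treats its name="value" arguments as variable assignments.
Then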
#!/usr/bin/awk -f
# usage echo x|./xml2urlv2.awk id="7503362468" owner="59044395@N02" secret="66b94027db" server="8423" farm="9" title="Potluck" ispublic="1" isfriend="0" isfamily="0"
#<photo id="7503362468" owner="59044395@N02" secret="66b94027db" server="8423" farm="9" title="Potluck" ispublic="1" isfriend="0" isfamily="0" />
#http://farm{farm-id}.staticflickr.com/{server-id}/{id}_{secret}_[mstzb].jpg
#
{print "http://farm"farm".staticflickr.com/"server"/"id"_"secret"_[mstzb].jpg"}
does the trick.
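To see the mechanism in isolation, here is a small sketch with the awk program written inline instead of living in xml2urlv2.awk. The wrapper name url_from_args is made up for illustration; the only awk behaviour it leans on is that operands of the form name=value are treated as variable assignments rather than file names.

```shell
# Hypothetical wrapper: build one URL from name=value arguments.
url_from_args() {
  # awk applies name=value operands as variable assignments before it
  # reads the input that follows them (here: the dummy line on stdin).
  echo x | awk '{ print "http://farm" farm ".staticflickr.com/" server "/" id "_" secret "_[mstzb].jpg" }' "$@"
}

url_from_args id="7503362468" owner="59044395@N02" secret="66b94027db" server="8423" farm="9"
```

The dummy echo x matters: the main rule only runs when there is at least one input line, and assignments given this way are not yet visible inside a BEGIN block.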
Solution 3. This solution eliminates the need to echo anything into the script, but requires more editing. You have to put -v before each field that you care about.
#!/usr/bin/awk -f
#<photo id="7503362468" owner="59044395@N02" secret="66b94027db" server="8423" farm="9" title="Potluck" ispublic="1" isfriend="0" isfamily="0" />
#http://farm{farm-id}.staticflickr.com/{server-id}/{id}_{secret}_[mstzb].jpg
#usage: ./xml2urlv.awk -v id="7503362468" -v owner="59044395@N02" -v secret="66b94027db" -v server="8423" -v farm="9" -v title="Potluck" -v ispublic="1" -v isfriend="0" -v isfamily="0"
BEGIN{print "http://farm"farm".staticflickr.com/"server"/"id"_"secret"_[mstzb].jpg"}
### end of script
If you are new to awk, remember that the entire print statement must go on one line. Also, the opening { must go on the same line as the word BEGIN.
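Since the question asked about sed, here is a rough sed counterpart of Solution 1, using capture groups to pick out the quoted values. It makes the same assumption as Solution 1 (attributes in exactly the sample's order, one photo per line) and additionally assumes a sed that accepts -E for extended regular expressions, as GNU and BSD sed do. The function name photo_urls_sed is made up for illustration.

```shell
# Hypothetical helper: sed counterpart of Solution 1 (same fixed attribute order).
photo_urls_sed() {
  # -n together with the p flag prints only lines where the substitution matched.
  # Capture groups: \1=id  \2=secret  \3=server  \4=farm
  # The | delimiter avoids having to escape the slashes in the URL.
  sed -nE 's|^.*<photo id="([^"]*)" owner="[^"]*" secret="([^"]*)" server="([^"]*)" farm="([^"]*)".*$|http://farm\4.staticflickr.com/\3/\1_\2_[mstzb].jpg|p' "$@"
}
```

Like the awk scripts, it can be fed the XML on standard input, for example photo_urls_sed <file_of_xml_text.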