I came across this JS regex that retrieves the video ID from the YouTube URLs listed below.
/(youtu(?:\.be|be\.com)\/(?:.*v(?:\/|=)|(?:.*\/)?)([\w'-]+))/i
YouTube URLs tested on:
http://www.youtube.com/user/Scobleizer#p/u/1/1p3vcRhsYGo
http://www.youtube.com/watch?v=cKZDdG9FTKY&feature=channel
http://www.youtube.com/watch?v=yZ-K7nCVnBI&playnext_from=TL&videos=osPknwzXEas&feature=sub
http://www.youtube.com/ytscreeningroom?v=NRHVzbJVx8I
http://www.youtube.com/user/SilkRoadTheatre#p/a/u/2/6dwqZw0j_jY
http://www.youtube.com/watch?v=6dwqZw0j_jY&feature=youtu.be
http://www.youtube.com/user/Scobleizer#p/u/1/1p3vcRhsYGo?rel=0
http://www.youtube.com/watch?v=cKZDdG9FTKY&feature=channel
http://www.youtube.com/watch?v=yZ-K7nCVnBI&playnext_from=TL&videos=osPknwzXEas&feature=sub
http://www.youtube.com/ytscreeningroom?v=NRHVzbJVx8I
http://www.youtube.com/embed/nas1rJpm7wY?rel=0
http://www.youtube.com/watch?v=peFZbP64dsU
How do I modify the regex to work in Java? Also, can it be altered to pick up IDs from gdata URLs too? For example: https://gdata.youtube.com/feeds/api/users/Test/?alt=json&v=2
Update: This is the function where I intend to use the regex.
public static String getIDFromYoutubeURL(String ytURL) {
    if (ytURL.startsWith("https://gdata")) { // This is my obviously silly hack,
        ytURL = ytURL.replace("v=\\d", "");  // I believe the regex should handle this.
    }
    String pattern = "(?i)(https://gdata\\.)?(youtu(?:\\.be|be\\.com)/(?:.*v(?:/|=)|(?:.*/)?)([\\w'-]+))";
    Pattern compiledPattern = Pattern.compile(pattern);
    Matcher matcher = compiledPattern.matcher(ytURL);
    if (matcher.find()) {
        return matcher.group(3);
    }
    return null;
}
Currently, it works fine for the URLs listed above and for https://gdata.youtube.com/feeds/api/users/Test/?id=c. However, it does not work if the gdata URL has the version parameter, e.g. v=2 (https://gdata.youtube.com/feeds/api/users/Test/?id=c&v=2). In this case, it returns 2 as the ID. How can it be improved to return Test and not 2 as the ID for the gdata URL?
Thanks.
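I fixed it!
Use replaceAll instead. String.replace treats its first argument as literal text, so the call in your code never removed anything; replaceAll interprets the first argument as a regex: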
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Test2 {
    public static void main(String[] args) {
        String toTest = getIDFromYoutubeURL(
                "https://gdata.youtube.com/feeds/api/users/Test/?id=c&v=2");
        System.out.println(toTest);
    }

    public static String getIDFromYoutubeURL(String ytURL) {
        if (ytURL.startsWith("https://gdata")) {   // This is my obviously silly hack,
            ytURL = ytURL.replaceAll("v=\\d", ""); // I believe the regex should handle this.
        }
        String pattern = "(?i)(https://gdata\\.)?(youtu(?:\\.be|be\\.com)/(?:.*v(?:/|=)|(?:.*/)?)([\\w'-]+))";
        Pattern compiledPattern = Pattern.compile(pattern);
        Matcher matcher = compiledPattern.matcher(ytURL);
        if (matcher.find()) {
            return matcher.group(3);
        }
        return null;
    }
}
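If you would rather let the regex handle the gdata case without editing the string first, one possible approach is to try a separate gdata pattern before falling back to the original one. This is only a sketch: the class name YoutubeIdSketch and the GDATA_USER pattern are my own, and it assumes gdata URLs always have the form /feeds/api/users/<name>/ as in the examples from the question.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class YoutubeIdSketch {
    // Hypothetical pattern (not from the question): captures the path segment
    // that follows /feeds/api/users/ in a gdata URL.
    private static final Pattern GDATA_USER =
            Pattern.compile("(?i)gdata\\.youtube\\.com/feeds/api/users/([\\w'-]+)");

    // The pattern from the question, without the outer wrapping groups,
    // so the ID is capture group 1.
    private static final Pattern VIDEO_ID =
            Pattern.compile("(?i)youtu(?:\\.be|be\\.com)/(?:.*v(?:/|=)|(?:.*/)?)([\\w'-]+)");

    public static String getIDFromYoutubeURL(String ytURL) {
        // Check the gdata form first so a trailing v=2 is never seen by VIDEO_ID.
        Matcher gdata = GDATA_USER.matcher(ytURL);
        if (gdata.find()) {
            return gdata.group(1);
        }
        Matcher video = VIDEO_ID.matcher(ytURL);
        return video.find() ? video.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println(getIDFromYoutubeURL(
                "https://gdata.youtube.com/feeds/api/users/Test/?id=c&v=2"));
        System.out.println(getIDFromYoutubeURL(
                "http://www.youtube.com/watch?v=peFZbP64dsU"));
    }
}
```

Because the gdata check runs first, the version parameter no longer needs to be stripped, and values such as v=10 are not a problem either (replaceAll("v=\\d", "") removes only a single digit).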