Note: all the quotation marks in this question are actually part of the code.
I'm learning regex, and I'm trying to scrape a site with music on it. I put the source of the site into a text file called 'ytcmusic.txt'. Here's a sample of the HTML:
<li><a href="angelpool%20-%20know.mp3"> angelpool - know.mp3</a></li>
<li><a href="angelpool%20-%20sellout.mp3"> angelpool - sellout.mp3</a></li>
<li><a href="angelpool%20-%20time.mp3"> angelpool - time.mp3</a></li>
<li><a href="bella%20-%20gibsons.mp3"> bella - gibsons.mp3</a></li>
I'll use the first line as an example. I'm trying to scrape only the "angelpool%20-%20know.mp3" part, and to do that, here's the regex I used: ".*.mp3". When I put it into C#, I have to surround it in quotation marks, which ruins the quotation marks in the regex. Here's the code (it doesn't compile; if you remove one set of quotation marks around the regex it does, but then it obviously doesn't return the correct part of the source):
var sr = new StreamReader("ytcmusic.txt");
string str = sr.ReadToEnd();
var match = Regex.Match(str, @".*.mp3");
Thanks in advance!
This will do:
"[^"]*"
Note that I'm going by your sample input and assuming the file names are the only quoted text. If that's not the case, you have to put more context into the regex.
If you want to capture the text without the quotes, you can introduce parentheses like so:
"([^"]*)"
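In C# this becomes:
// Requires: using System.Collections.Specialized; using System.Text.RegularExpressions;
// subjectString holds the text to search (str in the question).
StringCollection resultList = new StringCollection();
// In a regular string literal, each quote in the pattern is escaped as \".
Regex regexObj = new Regex("\"([^\"]*)\"");
Match matchResult = regexObj.Match(subjectString);
while (matchResult.Success) {
    // Groups[1] is the text between the quotes.
    resultList.Add(matchResult.Groups[1].Value);
    matchResult = matchResult.NextMatch();
}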
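If you prefer to keep the verbatim string (@"...") from the question, a quote inside it is written as two quotes (""). The following is only a minimal sketch of that, not the one right way to do it. It assumes C# 9 or later (top-level statements), hard-codes two lines of the sample HTML instead of reading 'ytcmusic.txt', and adds href= plus an escaped \.mp3 as extra context. The names html, hrefPattern and links are made up for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Sample input copied from the question; in practice this could come from
// File.ReadAllText("ytcmusic.txt").
string html =
    "<li><a href=\"angelpool%20-%20know.mp3\"> angelpool - know.mp3</a></li>\n" +
    "<li><a href=\"bella%20-%20gibsons.mp3\"> bella - gibsons.mp3</a></li>\n";

// In a verbatim string, a literal quote is written as "".
// The pattern the regex engine sees is: href="([^"]*\.mp3)"
var hrefPattern = new Regex(@"href=""([^""]*\.mp3)""");

var links = new List<string>();
foreach (Match m in hrefPattern.Matches(html))
{
    // Groups[1] is the href value without the surrounding quotes.
    links.Add(m.Groups[1].Value);
}

foreach (string link in links)
{
    Console.WriteLine(link);
}
```

Escaping the dot (\.) makes it match a literal "." rather than any character, which the original ".*.mp3" did not do.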