Tags: php, replace, substring, preg-match

How can I echo only the values found in a specific part of a string (the href attribute of a tag)?


[PHP] I have a variable that stores a string (the source code of a very large page). I want to echo only the parts I am interested in (dozens of them, which I need to extract for a project). They sit inside the quotation marks of the tag <a href="HERE"></a>,

but I only want to capture the values that start with the letter n (news):

[<a href="/news7044449/exclusive_news_sunday_"]
<a href="/n[ews7044449/exclusive_news_sunday_]"
That is, I think the match will have to start from: [a href="/n]

How can I make the echo drop all the other text in the variable and show only those values?

Note that there are other href attributes whose values start with other letters, such as 'p' (href="/profiles...). Those do not interest me.

$string = '</div><span class="news-hd-mark">HD</span></div><p><a href="/news7044449/exclusive_news_sunday_" title="exclusive_news_sunday_">exclusive_news_sunday_</a></p><p class="metadata"><span class="bg"><a href="/profiles/czechav">Czech AV</a><span class="mobile-hide"> - 5.4M Views</span>
- <span class="duration">7 min</span></span></p></div><script>xv.thumbs.preparenews(7044449);</script>                                      
<div id="news_31720715" class="thumb-block "><div class="thumb-inside"><div class="thumb"><a href="/news31720715/my_sister_running_every_single_morning"><img src="https://static-hw.xnewss.com/img/lightbox/lightbox-blank.gif"';

I imagine something like this:

$removes_everything_except_values_from_the_href_tag_starting_with_the_letter_n = ('/something regex expresion I think /' or preg_match, substring?);
    echo $string = str_replace($removes_everything_except_values_from_the_href_tag_starting_with_the_letter_n,'',$string);

Expected output: /news7044449/exclusive_news_sunday_

NOTE: the source does not have to be a variable. The text to extract from could just as well come from a .txt file.

thanks.


Solution

  • I believe this will help.

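    <?php
    
    // Load the page source; a string variable holding the HTML works the same way.
    $source = file_get_contents("code.html");
    
    // Capture group 1 collects every href value that starts with "/n".
    preg_match_all("/<a href=\"(\/n(?:.+?))\"[^>]*>/", $source, $results);
    
    // Group 1 is the last element of $results, so end() returns the captured values.
    var_export( end($results) );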
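
To tie the pattern back to the question, here is a minimal sketch that applies the same regular expression to a string variable and echoes one match per line. The sample markup is a shortened, hypothetical version of the $string from the question (the image file name is a placeholder), so treat it as an illustration only.

```php
<?php

// Shortened sample of the markup from the question (an assumption for illustration).
$string = '<p><a href="/news7044449/exclusive_news_sunday_" title="exclusive_news_sunday_">exclusive_news_sunday_</a></p>'
    . '<p class="metadata"><a href="/profiles/czechav">Czech AV</a></p>'
    . '<div class="thumb"><a href="/news31720715/my_sister_running_every_single_morning"><img src="lightbox-blank.gif"></a></div>';

// Same pattern as in the answer, written in single quotes so the
// double quotes inside it do not need escaping.
preg_match_all('/<a href="(\/n(?:.+?))"[^>]*>/', $string, $results);

// $results[1] holds the text captured by group 1, one entry per match.
foreach ($results[1] as $href) {
    echo $href, PHP_EOL;
}
```

With this sample, the loop prints /news7044449/exclusive_news_sunday_ and /news31720715/my_sister_running_every_single_morning, and skips the /profiles link.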
    

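    Step by Step Regex:

    <a href=\"     matches the literal text <a href=" (the quote is escaped only because the PHP string is double-quoted)
    (\/n(?:.+?))   captures the href value: a literal /n followed by one or more characters, matched lazily so the capture normally ends at the next quote
    \"             matches the closing quote of the attribute
    [^>]*>         skips any remaining attributes up to the end of the opening tag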
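
The pattern above only matches when href is the first attribute directly after "<a ". As an alternative, and not the method used in the answer, the same filtering can be sketched with PHP's DOMDocument, which parses the markup and so does not depend on attribute order. The sample markup and the variable names ($html, $newsLinks) are assumptions made for illustration.

```php
<?php

// Hypothetical sample markup; in practice this would be the page source.
$html = '<p><a href="/news7044449/exclusive_news_sunday_" title="x">exclusive_news_sunday_</a></p>'
      . '<p><a href="/profiles/czechav">Czech AV</a></p>';

$dom = new DOMDocument();
// Real-world pages are rarely valid HTML, so collect parser warnings
// internally instead of printing them.
libxml_use_internal_errors(true);
$dom->loadHTML($html);
libxml_clear_errors();

$newsLinks = [];
foreach ($dom->getElementsByTagName('a') as $anchor) {
    $href = $anchor->getAttribute('href');
    // Keep only values that begin with "/n".
    if (strncmp($href, '/n', 2) === 0) {
        $newsLinks[] = $href;
    }
}

echo implode(PHP_EOL, $newsLinks), PHP_EOL;
```

For this sample the script prints /news7044449/exclusive_news_sunday_. The regex version is shorter and may be enough when the markup is as regular as in the question; the parser version is likely the safer choice if the attribute order or quoting varies.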