php xml rss xml-parsing simplepie

Is it possible to pull information out of an article link, for example from The Verge?


I am trying to build something where you enter a link to an article and it outputs the article's details: the title, description, and so on. Is that possible? I think there is a way, because some websites do it, and I think it only works with websites that an RSS reader can read. Can somebody help me, or give me some pointers on how to do it with SimplePie, for example?


Solution

  • This is the reason RSS exists: it is a way to read information in a standardized format.
    In the pre-HTML5 era, every web developer more or less invented their own format, because XHTML and HTML4 did not have the right semantic elements for this.

    A few examples:

    <div class="article">
       <h1>Article</h1>
       <p>Content Content Content Content Content</p>
    </div>
    
    <div class="article">
       <span class="articleHeader">Article</span>
       <div class="left">Content Content Content Content Content</div>
    </div>
    

    With HTML5 it is different: there are many more semantic elements, such as <article>, <header>, and <footer>.

    But you still cannot know how the content of any given page is actually structured.
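
To illustrate why scraping the HTML directly is brittle, here is a minimal sketch using PHP's built-in DOMDocument and DOMXPath. The two page fragments, the `firstMatch()` helper, and the XPath query are all made up for this example; they only mirror the two markup styles shown above.

```php
<?php
// Two hypothetical pages that mark up the same article differently.
$siteA = '<div class="article"><h1>Article</h1><p>Content</p></div>';
$siteB = '<div class="article"><span class="articleHeader">Article</span>'
       . '<div class="left">Content</div></div>';

/**
 * Return the trimmed text of the first node matching $query,
 * or null if nothing matches. (Illustrative helper, not a library function.)
 */
function firstMatch(string $html, string $query): ?string
{
    $doc = new DOMDocument();
    // Suppress the warnings libxml emits for imperfect real-world HTML.
    @$doc->loadHTML($html);
    $nodes = (new DOMXPath($doc))->query($query);

    return ($nodes !== false && $nodes->length > 0)
        ? trim($nodes->item(0)->textContent)
        : null;
}

// A selector written with site A's markup in mind.
$titleQuery = '//div[@class="article"]/h1';

var_dump(firstMatch($siteA, $titleQuery)); // finds the <h1> text
var_dump(firstMatch($siteB, $titleQuery)); // null: site B has no <h1>
```

A selector tuned for one site finds nothing on the other, so a scraper like this needs per-site rules.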

    You should use a well-defined format for this, such as RSS, Atom, or RDF.
    Wikipedia describes this well:

    A standardized XML file format allows the information to be published once and viewed by many different programs.
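
As a rough sketch of the feed-based approach, the snippet below looks up an article by its link in an RSS 2.0 document and returns its title and description, using PHP's SimpleXML extension. The inline feed, the example.com URLs, and the `findItemByLink()` helper are assumptions made for illustration; a real feed would be fetched from the site's own feed URL.

```php
<?php
// A tiny, made-up RSS 2.0 feed standing in for a site's real feed.
$rss = <<<'XML'
<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0">
  <channel>
    <title>Example Site</title>
    <link>https://example.com/</link>
    <description>Hypothetical feed</description>
    <item>
      <title>First article</title>
      <link>https://example.com/first</link>
      <description>Summary of the first article.</description>
    </item>
    <item>
      <title>Second article</title>
      <link>https://example.com/second</link>
      <description>Summary of the second article.</description>
    </item>
  </channel>
</rss>
XML;

/**
 * Find the feed item whose <link> equals $articleUrl.
 * Returns its title and description, or null if there is no such item.
 * (Illustrative helper, not part of any library.)
 */
function findItemByLink(string $rssXml, string $articleUrl): ?array
{
    $feed = simplexml_load_string($rssXml);
    if ($feed === false) {
        return null; // the input was not well-formed XML
    }

    foreach ($feed->channel->item as $item) {
        if ((string) $item->link === $articleUrl) {
            return [
                'title'       => (string) $item->title,
                'description' => (string) $item->description,
            ];
        }
    }

    return null;
}

print_r(findItemByLink($rss, 'https://example.com/second'));
```

SimplePie wraps the same idea behind a higher-level API: after `set_feed_url()` and `init()`, you can loop over `get_items()` and read each item's `get_permalink()`, `get_title()`, and `get_description()`. Either way, this only works for articles that are still listed in the site's feed.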