All the pages are connected via some href elements. The very first page is named mainpage.html. Now I want to remove the <image> tags from all the webpages and show elements within <div id = "pB">.
Instead of removing the image tags manually, one page after another, I'd like a generic method for this purpose. Any suggestions are welcome, and feel free to ask me questions. Thanks in advance.
The structure of the tree is:
<html> -> <body> -> <div id="pB">
As the structure and aim of your project are not entirely clear to me, I will try to give you some hints on the various aspects I can identify. I am assuming a solution in PHP.
To find all pages linked from within your mainpage.html, see: Regexp for extracting all links and anchor texts from HTML
or, even more elegantly:
Regexp for extracting all links and anchor texts from HTML
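As a rough illustration of the link-extraction step, here is a minimal sketch that uses PHP's DOMDocument instead of a regular expression. This is a swapped-in technique, not necessarily what the linked posts describe. The helper name extractLinks and the sample markup are hypothetical.

```php
<?php
// Hypothetical helper: collect the href values of all <a> elements in an HTML string.
function extractLinks(string $html): array
{
    $doc = new DOMDocument();
    // Collect parser warnings internally instead of printing them;
    // real-world HTML is rarely perfectly valid.
    libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();

    $links = [];
    foreach ($doc->getElementsByTagName('a') as $anchor) {
        $href = $anchor->getAttribute('href');
        if ($href !== '') {
            $links[] = $href;
        }
    }
    // Drop duplicate targets and reindex the array.
    return array_values(array_unique($links));
}

// Sample input (an assumption); in practice this could be
// file_get_contents('mainpage.html').
$html = '<html><body><a href="page1.html">One</a> '
      . '<a href="page2.html">Two</a> <a href="page1.html">Again</a></body></html>';
$links = extractLinks($html);
```

To follow links transitively, the same function could be applied again to each page it finds, while keeping track of pages already visited.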
Alternatively, you mentioned a "local web directory", so you could also get all files via glob():
https://www.php.net/manual/en/function.glob.php
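A minimal sketch of the glob() route, assuming all pages sit directly in one directory and end in .html. The helper name listHtmlFiles is hypothetical, and subdirectories are not searched.

```php
<?php
// Hypothetical helper: list all .html files directly inside a directory.
function listHtmlFiles(string $dir): array
{
    $files = glob(rtrim($dir, '/') . '/*.html');
    // glob() returns false on error; normalise that to an empty array.
    return $files === false ? [] : $files;
}

// Example: all pages next to this script (the location is an assumption).
$pages = listHtmlFiles(__DIR__);
```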
Assuming you have the filenames of all the files you want to parse in an $array, you could iterate over that array, open each file, and apply the strip_tags modification described here:
http://www.php.net/manual/en/function.strip-tags.php#86964
You can then either save the modified pages or display them in your div.
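The iterate-and-strip step could look roughly like the sketch below. It uses DOMDocument and DOMXPath rather than the strip_tags variant linked above, so treat it as one possible alternative. It assumes the pages use standard <img> elements and that the content to show lives in <div id="pB">. The function names extractDivWithoutImages and processPages are hypothetical.

```php
<?php
// Hypothetical helper: return the markup of <div id="pB"> with all <img> elements removed,
// or null if the page has no such div.
function extractDivWithoutImages(string $html, string $divId = 'pB'): ?string
{
    if (trim($html) === '') {
        return null; // loadHTML() rejects empty input
    }
    $doc = new DOMDocument();
    libxml_use_internal_errors(true); // keep parser warnings out of the output
    $doc->loadHTML($html);
    libxml_clear_errors();

    $xpath = new DOMXPath($doc);
    // Copy the matches into an array so that removing nodes cannot disturb the iteration.
    foreach (iterator_to_array($xpath->query('//img')) as $img) {
        $img->parentNode->removeChild($img);
    }

    // $divId is assumed to be a plain identifier without quote characters.
    $div = $xpath->query('//div[@id="' . $divId . '"]')->item(0);
    return $div === null ? null : $doc->saveHTML($div);
}

// Hypothetical helper: map each readable file to its cleaned-up div markup.
function processPages(array $files): array
{
    $result = [];
    foreach ($files as $file) {
        $html = file_get_contents($file);
        if ($html === false) {
            continue; // unreadable file: skip it
        }
        $fragment = extractDivWithoutImages($html);
        if ($fragment !== null) {
            $result[$file] = $fragment;
        }
    }
    return $result;
}
```

Each returned fragment can then be echoed into the page that displays it, or written back to disk with file_put_contents().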
Hope this helps a bit.