Does JSSoup support extracting text similar to Beautiful Soup's soup.findAll(text=True)?
The documentation does not provide any information about this use case, but it seems to me that there should be a way.
To clarify, what I want is to grab all visible text from the page.
In Beautiful Soup you can extract text in different ways: with find_all(text=True), but also with .get_text() or .text.
JSSoup works similarly to Beautiful Soup. To extract all visible text, just call .get_text(), .text or string on your soup.
var soup = new JSSoup('<html><head><body>text<p>ptext</p></body></head></html>');
soup.get_text('|')
// 'text|ptext'
soup.get_text('|').split('|')
// ['text','ptext']
from bs4 import BeautifulSoup
html = '''<html><head><body>text<p>ptext</p></body></head></html>'''
soup = BeautifulSoup(html, "html.parser")
print(soup.get_text('|').split('|'))
# ['text', 'ptext']
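For comparison, here is a rough, dependency-free sketch of the same idea (collect the text nodes and join them with a separator) using only Python's standard html.parser. It is not how JSSoup or Beautiful Soup work internally. The names VisibleTextParser, visible_text and HIDDEN_TAGS are made up for this illustration, and treating only script and style contents as non-visible is an assumption you may want to adjust.

```python
from html.parser import HTMLParser

# Assumption for this sketch: only text inside these tags counts as "not visible".
HIDDEN_TAGS = {"script", "style"}


class VisibleTextParser(HTMLParser):
    """Hypothetical helper: collects text nodes that are not inside a hidden tag."""

    def __init__(self):
        super().__init__()
        self.chunks = []          # text nodes kept so far
        self._hidden_depth = 0    # > 0 while inside a script/style element

    def handle_starttag(self, tag, attrs):
        if tag in HIDDEN_TAGS:
            self._hidden_depth += 1

    def handle_endtag(self, tag):
        if tag in HIDDEN_TAGS and self._hidden_depth > 0:
            self._hidden_depth -= 1

    def handle_data(self, data):
        # Keep non-empty text that is outside hidden tags.
        # Comments never reach this method; HTMLParser routes them to handle_comment.
        text = data.strip()
        if text and self._hidden_depth == 0:
            self.chunks.append(text)


def visible_text(html, separator="|"):
    """Return the kept text nodes joined by `separator`."""
    parser = VisibleTextParser()
    parser.feed(html)
    parser.close()  # flush any buffered trailing text
    return separator.join(parser.chunks)


html = '<html><head><body>text<p>ptext</p></body></head></html>'
print(visible_text(html))             # text|ptext
print(visible_text(html).split('|'))  # ['text', 'ptext']
```

Splitting on the separator only works cleanly if the separator character does not occur in the page text itself; reading parser.chunks directly avoids that problem.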