console.log(document.getElementsByTagName('html')[0].textContent);
console.log(document.getElementsByTagName('html')[0].innerText);
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<meta http-equiv="X-UA-Compatible" content="ie=edge">
<title>Document</title>
</head>
<body>
<p>innnerHtml of paragraph</p>
</body>
</html>
The textContent property prints all the text inside the html element, excluding the tags, but it also includes all the whitespace and newlines. To get the text without the whitespace and newlines, I used the innerText property, but it printed only the text inside the p element and not the text inside the title element. Why didn't innerText work as I expected?
Your code is working as intended; I think you are confusing the two properties. Have a look at MDN.
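A couple of the differences:
While textContent gets the content of all elements, including <script> and <style> elements, innerText does not, only showing human-readable elements.
innerText is aware of styling and won’t return the text of hidden elements, whereas textContent does.
That is why the title text is missing from your output: <title> sits inside <head>, which browsers do not render (it is display: none by default), so innerText skips it.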
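If the goal is the title plus the visible page text, one option is to read the title separately through document.title and take innerText from the body. The sketch below assumes that is what you want; the helper name visibleTextWithTitle is hypothetical, and it takes the document as a parameter only so it can be tried outside a browser.

```javascript
// Hypothetical helper: combine the page title with the rendered body text.
// `doc` is expected to be a Document (pass `document` in a browser).
function visibleTextWithTitle(doc) {
  // document.title returns the <title> text even though <head> is not rendered.
  const title = doc.title;
  // innerText only returns text that is actually rendered on the page.
  const body = doc.body ? doc.body.innerText : '';
  // Drop empty parts and put the title on its own line.
  return [title, body].filter(Boolean).join('\n');
}

// In a browser: console.log(visibleTextWithTitle(document));
```

With the page from the question, this should give the title on the first line and the paragraph text on the second.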
To remove the extra whitespace and newlines, you can use a regex replace.
// collapse every run of whitespace (including newlines) into a single space, then trim the ends
console.log(document.getElementsByTagName('html')[0].textContent.replace(/\s+/g, ' ').trim());
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<meta http-equiv="X-UA-Compatible" content="ie=edge">
<title>Document</title>
</head>
<body>
<p>innnerHtml of paragraph</p>
</body>
</html>
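To see what the whitespace cleanup does without a page, here is a minimal sketch on a plain string. The sample string is an assumption that only approximates what textContent returns for markup like the above, and collapseWhitespace is a hypothetical helper name. It also shows why matching /\s+/ and trimming is a bit safer than a narrower pattern such as /[\n\r]+|\s{2,}/g, which can replace a newline run and the indentation after it separately and leave doubled spaces.

```javascript
// Hypothetical helper: collapse each run of whitespace to one space and trim the ends.
function collapseWhitespace(text) {
  return text.replace(/\s+/g, ' ').trim();
}

// Assumed sample, roughly what textContent yields for an indented document.
const sample = '\n\n  Document\n\n\n  innnerHtml of paragraph\n\n';

// A narrower pattern treats the newlines and the following indentation as two matches.
const narrow = sample.replace(/[\n\r]+|\s{2,}/g, ' ');

console.log(JSON.stringify(collapseWhitespace(sample))); // "Document innnerHtml of paragraph"
console.log(JSON.stringify(narrow)); // still contains doubled and leading/trailing spaces
```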