java jsoup

Problem extracting data from span class using Jsoup


Hello, I need help with the code below. I am trying to extract the text from a span class, but all of the text is extracted at once. Is it possible to extract the text one item at a time?

JAVA CODE

import java.io.IOException;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

public class Exractor {
    public static void main(String[] args) throws IOException {
        Document d = Jsoup.connect("https://www.brainyquote.com/topics").get();
        Elements e = d.select("div.col-md-4");
        for (Element el : e) {
            Elements name = el.getElementsByTag("a");
            String text = name.text();
            System.out.println(text);
        }
    }
}

HTML OUTPUT

<div class="col-sm-6 col-md-4"> 
 <div class="bq_fl content indexContent topicContent"> 
  <div class="row"> 
   <div class="col-sm-6 col-xs-6"> 
    <div class="bqLn"> 
     <div class="bqLn"> 
      <a href="/topics/age" class="topicIndexChicklet" onclick="topicCl('/topics/age',1,'Index')">
	  <span class="topicContentName">Age</span> <span class="topicIndexArrow">
		<i class="fa fa-chevron-right" aria-hidden="true"></i>
	  </span> 
       <div style="clear:both"></div></a> 
     </div> 
    </div> 
    <div class="bqLn"> 
     <div class="bqLn"> 
      <a href="/topics/alone" class="topicIndexChicklet" onclick="topicCl('/topics/alone',2,'Index')">
	  <span class="topicContentName">Alone</span> <span class="topicIndexArrow">
	  <i class="fa fa-chevron-right" aria-hidden="true"></i>
	  </span> 
       <div style="clear:both"></div></a> 
     </div> 
    </div> 
    </div> 
   </div> 
  </div> 
 </div> 
</div>

JAVA OUTPUT

Age Alone Amazing Anger Anniversary Architecture Art Attitude Beauty Best Birthday Brainy Business Car Chance Change Christmas Communication Computers Cool Courage Dad Dating Death Design Diet Dreams Easter Education Environmental Equality Experience Failure Faith Family Famous Father's Day Fear Finance Fitness Food Forgiveness Freedom Friendship Funny Future Gardening God Good Government Graduation Great Happiness Health History Home Hope Humor Imagination Independence Inspirational Intelligence Jealousy Knowledge Leadership Learning Legal Life Love Marriage Medical Memorial Day Men Mom Money Morning Mother's Day Motivational Movies Moving On Music Nature New Year's Parenting Patience Patriotism Peace Pet Poetry Politics Positive Power Relationship Religion Respect Romantic Sad Saint Patrick's Day Science Smile Society Space Sports Strength Success Sympathy Teacher Technology Teen Thankful Thanksgiving Time Travel Trust Truth Valentine's Day Veterans Day War Wedding Wisdom Women Work

EXPECTED OUTPUT

  • Age
  • Alone
  • Amazing
  • Anger

I am doing something wrong, but I can't figure out what. Please help.


Solution

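import java.io.IOException;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

public class Exractor {
    public static void main(String[] args) throws IOException {
        Document d = Jsoup.connect("https://www.brainyquote.com/topics").get();
        Elements e = d.select("div.col-md-4");
        for (Element el : e) {
            // getElementsByTag returns an Elements collection, not a single Element
            Elements names = el.getElementsByTag("a");
            // Print the text of each link separately
            for (Element name : names) {
                String text = name.text();
                System.out.println(text);
            }
        }
    }
}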
    

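In your code, el.getElementsByTag("a") returns an Elements collection, not a single Element. Calling text() on an Elements collection returns the combined text of all the matched elements, which is why every topic came out at once. Iterating over the collection and calling text() on each Element returns only the text of that element and its children, so each topic is printed on its own line.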
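The difference between text() on the whole collection and text per element can be reproduced offline. The sketch below is only an illustration: the class name SpanTextDemo, the SAMPLE_HTML string (a trimmed-down copy of the markup from the question), and the two helper methods are made up for this example. It also assumes a jsoup version that provides Elements.eachText() and selects the span.topicContentName elements directly instead of the surrounding links.

```java
import java.util.List;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

public class SpanTextDemo {
    // Hypothetical, shortened copy of the markup shown in the question.
    static final String SAMPLE_HTML =
          "<div class=\"col-sm-6 col-md-4\">"
        + "<div class=\"bqLn\"><a href=\"/topics/age\" class=\"topicIndexChicklet\">"
        + "<span class=\"topicContentName\">Age</span> "
        + "<span class=\"topicIndexArrow\"></span></a></div>"
        + "<div class=\"bqLn\"><a href=\"/topics/alone\" class=\"topicIndexChicklet\">"
        + "<span class=\"topicContentName\">Alone</span> "
        + "<span class=\"topicIndexArrow\"></span></a></div>"
        + "</div>";

    // text() on an Elements collection joins the text of every match into one string.
    static String combinedText(String html) {
        Document doc = Jsoup.parse(html);
        return doc.select("div.col-md-4 a").text();
    }

    // eachText() returns one string per matched element that has text.
    static List<String> topicNames(String html) {
        Document doc = Jsoup.parse(html);
        return doc.select("span.topicContentName").eachText();
    }

    public static void main(String[] args) {
        // One line containing all topics, as in the question
        System.out.println(combinedText(SAMPLE_HTML));
        // One topic per line
        for (String name : topicNames(SAMPLE_HTML)) {
            System.out.println(name);
        }
    }
}
```

Selecting the span by its class keeps the arrow icon markup out of the result. Against the live page the same selector could be applied to the Document returned by Jsoup.connect(...).get(), provided the site's markup still matches the excerpt in the question.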