java html jsoup href extract

How to get one "a href" out of many in one HTML class with jsoup


I need to extract every text element of an HTML page into Java Strings, each one in a separate String.

I have the following HTML:

<div class="sb-spieldaten">
    <p class="sb-datum hide-for-small">
        <a href="/jumplist/spieltag/wettbewerb/C1/saison_id/2014/spieltag/2">2. Spieltag</a>
        &nbsp;&nbsp;|&nbsp;&nbsp;
        <a href="/aktuell/waspassiertheute/aktuell/new/datum/2014-07-26">Sa., 26.07.2014</a>
        &nbsp;&nbsp;|&nbsp;&nbsp;17:45 Uhr
    </p>
    <p class="sb-datum show-for-small">
        <a href="/jumplist/spieltag/wettbewerb/C1/saison_id/2014/spieltag/2">2. Spieltag</a>
        <br />
        <a href="/aktuell/waspassiertheute/aktuell/new/datum/2014-07-26">26.07.2014</a>
        <br>
        17:45 Uhr
    </p>
    <div class="ergebnis-wrap">
        <div class="sb-ergebnis">
            <div class="sb-endstand">2:3
                <div class="sb-halbzeit">(<span>2:</span>2)
                </div>
            </div>
        </div>
    </div>
    <p class="sb-zusatzinfos">
        <span class="hide-for-small">
            <a href="/stadion/stadion/verein/504/saison_id/2014">Letzigrund</a>
            &nbsp;&nbsp;|&nbsp;&nbsp;
            <strong>4.200 Zuschauer</strong>
            <br />
        </span>
        <strong>Schiedsrichter:</strong>
        <br class="show-for-small" />
        <a title="Fedayi San" href="/fedayi-san/profil/schiedsrichter/4791">Fedayi San</a>
    </p>
</div>

I use:

Elements myText = doc.getElementsByClass("sb-spieldaten");
String myString = myText.select("a.sb-datum.hide-for-small").text();

But this extracts every string in the class "hide-for-small" at once, so the result I get is: 2. Spieltag | Sa., 26.07.2014 | 17:45 Uhr 2. Spieltag 26.07.2014 17:45 Uhr Letzigrund | 4200 Zuschauer Schiedsrichter: Fedayi San

How do I get only one of these Strings? Understandably, I can't find it with .getElementsByClass("..."). Is there a way to extract a specific "a href" element, or do I have to use the .split() method?


Solution

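  • Example code snippet:

    // timeout(0) disables the timeout, so the request waits indefinitely
    Document abc = Jsoup.connect("http://www.abc.in/").timeout(0).get();
    // all <a> elements whose href contains the substring "xyz"
    Elements ee = abc.select("a[href*=xyz]");
    // absolute URL of the first match; first() returns null if nothing matched
    String xyz = ee.first().attr("abs:href");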
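
Applied to the markup from the question, the same attribute-selector idea could look roughly like the sketch below. The class name MatchInfo, the helper firstText, and the shortened HTML string are illustrative only and not part of the original answer. The selectors assume the href patterns (/spieltag/, /datum/, /stadion/, /schiedsrichter/) stay as they appear in the question.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

class MatchInfo {
    // Shortened copy of the question's markup (illustrative, not the full page).
    static final String HTML =
        "<div class=\"sb-spieldaten\">"
      + "<p class=\"sb-datum hide-for-small\">"
      + "<a href=\"/jumplist/spieltag/wettbewerb/C1/saison_id/2014/spieltag/2\">2. Spieltag</a>"
      + "&nbsp;&nbsp;|&nbsp;&nbsp;"
      + "<a href=\"/aktuell/waspassiertheute/aktuell/new/datum/2014-07-26\">Sa., 26.07.2014</a>"
      + "&nbsp;&nbsp;|&nbsp;&nbsp;17:45 Uhr</p>"
      + "<p class=\"sb-zusatzinfos\">"
      + "<span class=\"hide-for-small\">"
      + "<a href=\"/stadion/stadion/verein/504/saison_id/2014\">Letzigrund</a></span>"
      + "<a title=\"Fedayi San\" href=\"/fedayi-san/profil/schiedsrichter/4791\">Fedayi San</a>"
      + "</p></div>";

    // Hypothetical helper: text of the first element matching the CSS query,
    // or an empty string when nothing matches (selectFirst returns null then).
    static String firstText(Document doc, String cssQuery) {
        Element match = doc.selectFirst(cssQuery);
        return match == null ? "" : match.text();
    }

    public static void main(String[] args) {
        Document doc = Jsoup.parse(HTML);

        // Pick each link by a distinctive part of its href.
        String matchday = firstText(doc, "p.sb-datum.hide-for-small a[href*=/spieltag/]");
        String date     = firstText(doc, "p.sb-datum.hide-for-small a[href*=/datum/]");
        String stadium  = firstText(doc, "a[href^=/stadion/]");
        String referee  = firstText(doc, "a[href*=/schiedsrichter/]");

        System.out.println(matchday); // 2. Spieltag
        System.out.println(date);     // Sa., 26.07.2014
        System.out.println(stadium);  // Letzigrund
        System.out.println(referee);  // Fedayi San

        // Alternative: pick by position among the links inside one paragraph.
        Elements links = doc.select("p.sb-datum.hide-for-small a");
        System.out.println(links.get(1).text()); // Sa., 26.07.2014
    }
}
```

Selecting by href pattern tends to survive reordering of the links, while selecting by position (links.get(1)) is shorter but breaks if a link is added or removed. Neither approach needs .split().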