I have been working on an app that displays news headlines and content from the site http://www.myagdikali.com.
I am able to extract the data from 'myagdikali.com/category/news/national-news/', but that page holds only 10 posts; the rest are on numbered pages (1, 2, 3, ...) such as myagdikali.com/category/news/national-news/page/2.
How do I extract the news from every page under /national-news? Is that even possible using Jsoup?
So far, my code to extract data from a single page is:
public View onCreateView(LayoutInflater inflater, ViewGroup container,
        Bundle savedInstanceState) {
    View rootView = inflater.inflate(R.layout.fragment_all, container, false);
    int i = getArguments().getInt(NEWS);
    String topics = getResources().getStringArray(R.array.topics)[i];
    switch (i) {
        case 0:
            url = "http://myagdikali.com/category/news/national-news";
            new NewsExtractor().execute();
            break;
.....
[EDIT]
private class NewsExtractor extends AsyncTask<Void, Void, Void> {
    String title;

    @Override
    protected Void doInBackground(Void... params) {
        while (status == OK) {
            currentURL = url + String.valueOf(page);
            try {
                response = Jsoup.connect(currentURL).execute();
                status = response.statusCode();
                if (status == OK) {
                    Document doc = response.parse();
                    Elements urlLists = doc.select("a[rel=bookmark]");
                    for (org.jsoup.nodes.Element urlList : urlLists) {
                        String src = urlList.text();
                        myLinks.add(src);
                    }
                    title = doc.title();
                }
            } catch (IOException e) {
                e.printStackTrace();
            }
            page++;
        }
        return null;
    }
EDIT: When I extract data from a single page without the loop, it works. After adding the while loop, I get an error stating "No adapter attached".
I load the extracted data into a RecyclerView, and onPostExecute looks like this:
    @Override
    protected void onPostExecute(Void aVoid) {
        layoutManager = new LinearLayoutManager(getActivity());
        recyclerView.setLayoutManager(layoutManager);
        myRecyclerViewAdapter = new MyRecyclerViewAdapter(getActivity(), myLinks);
        recyclerView.setAdapter(myRecyclerViewAdapter);
    }
Since you know the URLs of the pages you need, http://myagdikali.com/category/news/national-news/page/X (where X is the page number, from 2 to 446), you can loop through them. You'll also need to check Jsoup's response to make sure that each page exists (the number 446 is not fixed; I believe it increases).
The code should be something like this:
final String URL = "http://myagdikali.com/category/news/national-news/page/";
final int OK = 200;
String currentURL;
int page = 2;
int status = OK;
Connection.Response response = null;
Document doc = null;
while (status == OK) {
    currentURL = URL + String.valueOf(page); // add the page number to the URL
    response = Jsoup.connect(currentURL)
            .userAgent("Mozilla/5.0")
            // without this, execute() throws on a 404 instead of returning its status code
            .ignoreHttpErrors(true)
            .execute(); // you may add a timeout etc. here
    status = response.statusCode();
    if (status == OK) {
        doc = response.parse();
        // extract the info you need
    }
    page++;
}
This is of course not fully working code: you'll have to add try-catch blocks, but the compiler will help you.
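To show where the try-catch could go, here is a minimal sketch of the same loop with the network call pulled out behind an interface, so it runs without Jsoup or Android. The names PagedCrawl, Page, PageFetcher and collectTitles are made up for this illustration, and the example.invalid URL is a placeholder. In the app, the fetcher would presumably wrap Jsoup.connect(url).userAgent("Mozilla/5.0").ignoreHttpErrors(true).execute() and the a[rel=bookmark] selector. Note that by default Jsoup's execute() throws an IOException on HTTP error statuses, so a catch block that only prints the stack trace leaves status at 200 and the while loop keeps going; that may be why onPostExecute never got to attach the adapter.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical helper, not part of the original code: walks numbered pages until one is missing.
class PagedCrawl {
    static final int OK = 200;

    /** What one fetch produced: the HTTP status and the headlines found on the page. */
    static final class Page {
        final int status;
        final List<String> titles;

        Page(int status, List<String> titles) {
            this.status = status;
            this.titles = titles;
        }
    }

    /** Hypothetical fetch step; in the app this would wrap the Jsoup request and selector. */
    interface PageFetcher {
        Page fetch(String url) throws IOException;
    }

    /** Collects titles from firstPage onward, visiting at most maxPages pages. */
    static List<String> collectTitles(String baseUrl, int firstPage, int maxPages, PageFetcher fetcher) {
        List<String> all = new ArrayList<>();
        for (int page = firstPage; page < firstPage + maxPages; page++) {
            try {
                Page result = fetcher.fetch(baseUrl + page);
                if (result.status != OK) {
                    break; // the first missing page ends the crawl
                }
                all.addAll(result.titles);
            } catch (IOException e) {
                break; // a failed request ends the crawl too, instead of looping on
            }
        }
        return all;
    }

    public static void main(String[] args) {
        // Stub fetcher standing in for the network: pages 2 and 3 exist, page 4 does not.
        PageFetcher stub = url -> {
            int n = Integer.parseInt(url.substring(url.lastIndexOf('/') + 1));
            return n < 4
                    ? new Page(OK, Arrays.asList("headline " + n))
                    : new Page(404, Collections.<String>emptyList());
        };
        System.out.println(collectTitles("http://example.invalid/page/", 2, 500, stub));
        // prints [headline 2, headline 3]
    }
}
```

The maxPages cap is a design choice for this sketch: it keeps the loop bounded even if the server answers 200 for every page number.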
Hope this helps you.
EDIT:
1. I've edited the code: I had to send a userAgent string in order to get a response from the server.
2. The code runs on my machine; it prints lots of ???? because I don't have the proper fonts installed.
3. The error you're getting is from the Android part, something to do with your views. You haven't posted that piece of code...
4. Try adding the userAgent; it might solve it.
5. Please add the error and the code you're running to the original question by editing it; that is much more readable.