I want to collect the detailed recommendation paragraphs that a person has received on their LinkedIn profile, such as those at this link:
https://www.linkedin.com/in/teddunning/details/recommendations/
(This link can be viewed after logging in to any LinkedIn account.)
Here is my best try:
for index, row in df2.iterrows():
    linkedin = row['LinkedIn Website']
    current_url = f'{linkedin}/details/recommendations/'
    driver.get(current_url)
    time.sleep(random.uniform(2, 3))
    descriptions = driver.find_elements_by_xpath("//*[@class='display-flex align-items-center t-14 t-normal t-black']")
    s = 0
    for description in descriptions:
        s += 1
        print(description.text)
        df2.loc[index, f'RecDescription_{str(s)}'] = description.text
The URLs I scraped into df2 are all similar to the example link above. The code finds nothing: the "descriptions" variable is empty.
My question is: which element should I use to find the detailed recommendation content under the "Received" tab? Thank you very much!
You would first get the direct parent of the paragraphs. You can locate it by XPath, class, or id, whichever fits best. After that, you can call Your_Parent.find_elements(by=By.XPATH, value='./child::*')
and then loop over the result to get all the paragraphs.
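To see the parent-then-children idea without a browser, here is a minimal sketch that swaps Selenium for the standard library's xml.etree.ElementTree. The markup, the class names ("card", "other") and the helper name child_texts are made up for illustration and are not LinkedIn's. Note that './*' is the abbreviated XPath form of './child::*'.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup standing in for a rendered page (not LinkedIn's real HTML).
SAMPLE = """
<root>
  <div class="card"><span>First recommendation</span></div>
  <div class="card"><span>Second recommendation</span></div>
  <div class="other"><span>Unrelated text</span></div>
</root>
""".strip()


def child_texts(markup, parent_class):
    """Return the text of every direct child of each matching parent div."""
    tree = ET.fromstring(markup)
    texts = []
    # Step 1: locate the parents by their class attribute.
    for parent in tree.findall(f".//div[@class='{parent_class}']"):
        # Step 2: take the direct children of each parent.
        for child in parent.findall("./*"):
            texts.append((child.text or "").strip())
    return texts


print(child_texts(SAMPLE, "card"))
```

With Selenium the two steps are the same, only the calls differ: find_elements for the parents, then find_elements with './child::*' on each parent.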
The code below selects all the paragraphs. I have not yet looked into separating them by post, but here is what I have so far:
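from selenium.webdriver.common.by import By

# driver is the logged-in WebDriver from the question
parents_of_paragraphs = driver.find_elements(By.CSS_SELECTOR, "div.display-flex.align-items-center.t-14.t-normal.t-black")
text_total = ""
for element in parents_of_paragraphs:
    # the first direct child of each parent holds the paragraph text
    paragraph = element.find_element(by=By.XPATH, value='./child::*')
    text_total += f"{paragraph.text}\n"
print(text_total)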
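To keep each recommendation separate instead of joining everything into one string, one option is to map the element texts to the RecDescription_N column names used in the question. This is only a sketch: it assumes each element exposes a .text attribute, as Selenium WebElements do. The helper name texts_to_columns is hypothetical, and the stub objects stand in for real elements so the example runs without a browser.

```python
from types import SimpleNamespace


def texts_to_columns(elements, prefix="RecDescription_"):
    """Map non-empty element texts to numbered column names."""
    columns = {}
    for element in elements:
        text = element.text.strip()
        if not text:
            continue  # skip nodes that carry no visible text
        columns[f"{prefix}{len(columns) + 1}"] = text
    return columns


# Stubs standing in for Selenium WebElements (assumption for illustration).
fake_elements = [
    SimpleNamespace(text="Great colleague."),
    SimpleNamespace(text="   "),
    SimpleNamespace(text="Sharp engineer."),
]
print(texts_to_columns(fake_elements))
```

Inside the question's loop, each name and text pair from the returned dict could then be written with df2.loc[index, name] = text.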