It would be a great help to know how to resolve these issues and get proper output in a JSON file.
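import requests
from bs4 import BeautifulSoup

r = requests.get('https://www.iomfsa.im/enforcement/disqualified-directors/')
soup = BeautifulSoup(r.content, 'html.parser')
paragraphs = []
length = soup.findAll("strong")
for leng in length:
    # collect whatever node directly follows each <strong> label
    paragraphs.append(leng.next_sibling)
paragraph = [i for i in paragraphs if i is not None]
print(paragraph)
# keys wanted in the JSON output (renamed from `list` to avoid shadowing the built-in)
keys = ['name', 'address', 'DOB', 'POD', 'DOD', 'Particulars of Disqualification Order or Undertaking']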
You can do something like this:
from bs4 import BeautifulSoup
import requests
import json
import re
import pandas as pd

# show every column and the full cell contents when printing the dataframe
pd.set_option('display.max_columns', None)
pd.set_option('display.max_colwidth', None)

headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/102.0.0.0 Safari/537.36'}
url = 'https://www.iomfsa.im/enforcement/disqualified-directors/'
df_list = []

r = requests.get(url, headers=headers)
# the page is HTML, so use an HTML parser rather than 'xml'
soup = BeautifulSoup(r.text, 'html.parser')
# print(soup)

# each disqualified director sits in its own accordion section
dismissed_dirs = soup.select('section.accordion-item')
for d in dismissed_dirs:
    # print(d)
    # locate each <strong> label by its text and take the node that follows it
    # (string= is the current name of the older text= argument)
    name = d.find('strong', string=re.compile('^Name:')).next_sibling
    address = d.find('strong', string=re.compile('^Address')).next_sibling
    dob = d.find('strong', string=re.compile('^Date of Birth:')).next_sibling
    pod = d.find('strong', string=re.compile('^Period of Disqualification:')).next_sibling
    dod = d.find('strong', string=re.compile('^Dates of Disqualification:')).parent.text
    particulars = d.find('strong', string=re.compile('^Particulars')).find_next('a').text
    df_list.append((name, address, dob, pod, dod, particulars))

df = pd.DataFrame(df_list, columns=['name', 'address', 'dob', 'pod', 'dod', 'particulars'])
print(df)
print('--------------')
print(df.to_dict(orient='records'))
This prints both a dataframe (which is easier to read) and a list of dictionaries, as you requested:
| | name | address | dob | pod | dod | particulars |
|---|---|---|---|---|---|---|
0 | John Trevor Roche Baines | c/o Isle of Man Prison, Jurby, Isle of Man | 19 Dec 1939 | 15 Years 0 Months 0 Days | Dates of Disqualification: From 15 Jul 2010 To 15 Jul 2025 | Section 2 Company Officers (Disqualification) Act 2009 |
1 | Ralph Stephen Brunswick | Valdfrieden, Ballaugh Glen, Ballaugh, Isle of Man IM7 5JB | None | 13 Years 6 Months 0 Days | Dates of Disqualification: From 4 Mar 2009 To 4 Sep 2022 | Section 2 Company Officers (Disqualification) Act 2009 |
2 | Fenella Jane Carter | 30 North Quay, Douglas, Isle of Man IM1 4LD | 6 August 1984 | 6 Years 0 Months 0 Days | Dates of Disqualification: From 5 July 2018 to 5 July 2024 | Section 2 Company Officers (Disqualification) Act 2009 |
3 | Richard Alan Costain | St Patrick's Close, Jurby, Isle of Man | 13 Sep 1951 | 15 Years 0 Months 0 Days | Dates of Disqualification: From 16 Feb 2017 To 16 Feb 2032 | Section 2 Company Officers (Disqualification) Act 2009 |
4 | Paul Deighton | Isle of Man Prison, St Patrick’s Close, Coast Road IM7 3JP | 23 December 1966 | 12 Years 0 Months 0 Days | Dates of Disqualification: From 8 August 2020 to 7 August 2032 | Section 4 Company Officers (Disqualification) Act 2009 |
5 | Jamie Alexander Irving | Meadowcourt, The Links, Douglas Road, Peel, Isle of Man, IM5 1LN | Not known | 7 Years 0 Months 0 Days | Dates of Disqualification: From 26 Feb 2018 To 26 Feb 2025 | Section 4 Company Officers (Disqualification) Act 2009 |
6 | Jonathan Frank Edward Irving | Meadowcourt, The Links, Douglas Road, Peel, Isle of Man, IM5 1LN | Not known | 8 Years 0 Months 0 Days | Dates of Disqualification: From 26 Feb 2018 To 26 Feb 2026 | Section 4 Company Officers (Disqualification) Act 2009 |
7 | Duncan Frank Ellis Jones | Harpers Glen, Hillberry Green, Douglas, Isle of Man IM2 6DE | 10 Aug 1959 | 13 Years 0 Months 0 Days | Dates of Disqualification: From 25 Apr 2011 To 25 Apr 2024 | Section 2 Company Officers (Disqualification) Act 2009 |
8 | Lynn Keig | Croit-e-Quill, Lonan, Isle of Man, IM4 7JG | 28 June 1956 | Years 0 Months 0 Days | Dates of Disqualification: From 29 Jun 2017 To 28 Jun 2023 | Section 2 Company Officers (Disqualification) Act 2009 |
9 | Richard Ian Kissack | 6 Falcon Cliff Court, Douglas, Isle of Man, IM2 4AQ (currently Mr Kissack is residing at HM IOM Prison, Jurby, Isle of Man) | 30 October 1968 | 5 Years 11 Months 13 Days | Dates of Disqualification: From 31 December 2021 to 13 December 2027 | Section 2 Company Officers (Disqualification) Act 2009 |
10 | Alan Louis | Not known | 23 November 1965 | 12 Years 0 Months 0 Days | Dates of Disqualification: From 29 April 2019 to 28 April 2031 | Section 2 Company Officers (Disqualification) Act 2009 |
11 | Phillip Sean McCarthy | Cedar Lodge, Main Road, Crosby IM4 4BH | 5 October 1979 | 8 Years 0 Months 0 Days | Dates of Disqualification: From 28 November 2019 to 27 November 2027 | Section 2 Company Officers (Disqualification) Act 2009 |
12 | John McCauley | Not known | 30 March 1955 | 5 Years 0 Months 0 Days | Dates of Disqualification: From 29 April 2019 to 28 April 2024 | Section 2 Company Officers (Disqualification) Act 2009 |
13 | Dirk Frederik Mudge | 92, Daan Bekker Street, Windhoek, Namibia | 23 December 1976 | 8 Years 0 Months 0 Days | Dates of Disqualification: From 17 November 2018 to 16 November 2026 | Section 2 Company Officers (Disqualification) Act 2009 |
14 | Lukas Nakos | Not known | 6 March 1976 | 6 Years 0 Months 0 Days | Dates of Disqualification: From 29 April 2019 to 28 April 2025 | Section 2 Company Officers (Disqualification) Act 2009 |
15 | Andrew Mark Rouse | 13 Reayrt Ny Chrink, Crosby, Isle of Man IM4 2EA | 24 Jan 1977 | 5 Years 0 Months 0 Days | Dates of Disqualification: From 28 Feb 2018 To 28 Feb 2023 | Section 2 Company Officers (Disqualification) Act 2009 |
--------------
[{'name': 'John Trevor Roche Baines', 'address': 'c/o Isle of Man Prison, Jurby, Isle of Man', 'dob': '19 Dec 1939', 'pod': '15 Years 0 Months 0 Days', 'dod': 'Dates of Disqualification: From 15 Jul 2010 To 15 Jul 2025', 'particulars': 'Section 2 Company Officers (Disqualification) Act 2009'}, {'name': 'Ralph Stephen Brunswick', 'address': 'Valdfrieden, Ballaugh Glen, Ballaugh, Isle of Man IM7 5JB', 'dob': None, 'pod': '13 Years 6 Months 0 Days', 'dod': 'Dates of Disqualification: From 4 Mar 2009 To 4 Sep 2022', 'particulars': 'Section 2 Company Officers (Disqualification) Act 2009'}, {'name': ' Fenella Jane Carter', 'address': '30 North Quay, Douglas, Isle of Man IM1 4LD', 'dob': '6 August 1984', 'pod': '6 Years 0 Months 0 Days', 'dod': 'Dates of Disqualification: From\xa05 July 2018 to 5 July 2024', 'particulars': 'Section 2 Company Officers (Disqualification) Act 2009'}, {'name': 'Richard Alan Costain', 'address': "St Patrick's Close, Jurby, Isle of Man", 'dob': '13 Sep 1951', 'pod': '15 Years 0 Months 0 Days', 'dod': 'Dates of Disqualification: From 16 Feb 2017 To 16 Feb 2032', 'particulars': 'Section 2 Company Officers (Disqualification) Act 2009'}, {'name': ' Paul Deighton', 'address': 'Isle of Man Prison, St Patrick’s Close, Coast Road IM7 3JP', 'dob': '23 December 1966', 'pod': '12 Years 0 Months 0 Days', 'dod': 'Dates of Disqualification: From\xa08 August 2020 to 7 August 2032', 'particulars': 'Section\xa04 Company Officers (Disqualification) Act 2009'}, {'name': 'Jamie Alexander Irving', 'address': 'Meadowcourt, The Links, Douglas Road, Peel, Isle of Man, IM5 1LN', 'dob': 'Not known', 'pod': '7 Years 0 Months 0 Days', 'dod': 'Dates of Disqualification: From\xa026 Feb 2018 To\xa026 Feb 2025', 'particulars': 'Section\xa04 Company Officers (Disqualification) Act 2009'}, {'name': 'Jonathan Frank Edward Irving', 'address': 'Meadowcourt, The Links, Douglas Road, Peel, Isle of Man, IM5 1LN', 'dob': 'Not known', 'pod': '8 Years 0 Months 0 Days', 'dod': 'Dates of 
Disqualification: From\xa026 Feb 2018 To\xa026 Feb 2026', 'particulars': 'Section\xa04 Company Officers (Disqualification) Act 2009'}, {'name': 'Duncan Frank Ellis Jones', 'address': 'Harpers Glen, Hillberry Green, Douglas, Isle of Man IM2 6DE', 'dob': '10 Aug 1959', 'pod': '13 Years 0 Months 0 Days', 'dod': 'Dates of Disqualification: From 25 Apr 2011 To 25 Apr 2024', 'particulars': 'Section 2 Company Officers (Disqualification) Act 2009'}, {'name': 'Lynn Keig', 'address': 'Croit-e-Quill, Lonan, Isle of Man, IM4 7JG', 'dob': '28 June 1956', 'pod': ' Years 0 Months 0 Days', 'dod': 'Dates of Disqualification: From\xa029 Jun 2017 To 28 Jun 2023', 'particulars': 'Section 2 Company Officers (Disqualification) Act 2009'}, {'name': ' Richard Ian Kissack', 'address': '6 Falcon Cliff Court, Douglas, Isle of Man, IM2 4AQ (currently Mr Kissack is residing at HM IOM Prison, Jurby, Isle of Man)', 'dob': <strong>30 October 1968</strong>, 'pod': '5 Years 11 Months 13 Days', 'dod': 'Dates of Disqualification: From\xa031 December 2021 to 13 December 2027', 'particulars': 'Section 2 Company Officers (Disqualification) Act 2009'}, {'name': ' Alan Louis', 'address': 'Not known', 'dob': '23 November 1965', 'pod': '12 Years 0 Months 0 Days', 'dod': 'Dates of Disqualification: From\xa029 April 2019 to 28 April 2031', 'particulars': 'Section 2 Company Officers (Disqualification) Act 2009'}, {'name': ' Phillip Sean McCarthy', 'address': 'Cedar Lodge, Main Road, Crosby IM4 4BH', 'dob': '5 October 1979', 'pod': '8 Years 0 Months 0 Days', 'dod': 'Dates of Disqualification: From\xa028 November 2019 to 27 November 2027', 'particulars': 'Section 2 Company Officers (Disqualification) Act 2009'}, {'name': ' John McCauley', 'address': 'Not known', 'dob': '30 March 1955', 'pod': '5 Years 0 Months 0 Days', 'dod': 'Dates of Disqualification: From\xa029 April 2019 to 28 April 2024', 'particulars': 'Section 2 Company Officers (Disqualification) Act 2009'}, {'name': ' Dirk Frederik Mudge', 'address': 
'92, Daan Bekker Street, Windhoek, Namibia', 'dob': '23 December 1976', 'pod': '8 Years 0 Months 0 Days', 'dod': 'Dates of Disqualification: From\xa017 November 2018 to\xa016 November 2026', 'particulars': 'Section 2 Company Officers (Disqualification) Act 2009'}, {'name': ' Lukas Nakos', 'address': 'Not known', 'dob': '6 March 1976', 'pod': '6 Years 0 Months 0 Days', 'dod': 'Dates of Disqualification: From\xa029 April 2019 to 28 April 2025', 'particulars': 'Section 2 Company Officers (Disqualification) Act 2009'}, {'name': 'Andrew Mark Rouse', 'address': '13 Reayrt Ny Chrink, Crosby, Isle of Man IM4 2EA', 'dob': '24 Jan 1977', 'pod': '5 Years 0 Months 0 Days', 'dod': 'Dates of Disqualification: From\xa028 Feb 2018 To\xa028 Feb 2023', 'particulars': 'Section 2 Company Officers (Disqualification) Act 2009'}]
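Since the question asked for a JSON file, here is a minimal sketch of one way to get there from the list of dictionaries printed above. It is not part of the original answer: the helper names (`clean_value`, `clean_records`, `save_json`) and the output file name are my own assumptions. It also tidies the values first, because the raw output contains leading spaces, non-breaking spaces (`\xa0`), and in one record a BeautifulSoup tag instead of a string, none of which `json.dump` handles well. The two sample records are copied from the output above and stand in for `df.to_dict(orient='records')`.

```python
import json
import tempfile
from pathlib import Path


def clean_value(value):
    """Normalise one scraped value to a plain string, or None if it is missing."""
    if value is None:
        return None
    # BeautifulSoup elements expose get_text(); plain Python strings do not.
    if hasattr(value, "get_text"):
        value = value.get_text()
    # str.split() with no arguments splits on any whitespace, including
    # non-breaking spaces (\xa0), so this collapses runs and trims the ends.
    return " ".join(str(value).split())


def clean_records(records):
    """Apply clean_value to every field of every record."""
    return [{key: clean_value(val) for key, val in rec.items()} for rec in records]


def save_json(records, path):
    """Write the records to a UTF-8 JSON file, keeping non-ASCII characters readable."""
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(records, fh, indent=2, ensure_ascii=False)


# Sample input: two records taken from the printed output above.
sample = [
    {'name': ' Fenella Jane Carter',
     'address': '30 North Quay, Douglas, Isle of Man IM1 4LD',
     'dob': '6 August 1984',
     'pod': '6 Years 0 Months 0 Days',
     'dod': 'Dates of Disqualification: From\xa05 July 2018 to 5 July 2024',
     'particulars': 'Section 2 Company Officers (Disqualification) Act 2009'},
    {'name': 'Ralph Stephen Brunswick',
     'address': 'Valdfrieden, Ballaugh Glen, Ballaugh, Isle of Man IM7 5JB',
     'dob': None,
     'pod': '13 Years 6 Months 0 Days',
     'dod': 'Dates of Disqualification: From 4 Mar 2009 To 4 Sep 2022',
     'particulars': 'Section 2 Company Officers (Disqualification) Act 2009'},
]

cleaned = clean_records(sample)

# Hypothetical output location; any writable path would do.
out_path = Path(tempfile.gettempdir()) / "disqualified_directors.json"
save_json(cleaned, out_path)
print(out_path.read_text(encoding="utf-8"))
```

In the scraper itself you would pass `df.to_dict(orient='records')` to `clean_records` instead of `sample`. Missing values such as the absent date of birth stay as `None` and are written as `null` in the file.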