python python-3.x loops beautifulsoup urllib3

For Loop doesn't spit out needed results

I wrote this piece of code to print the unique "area number" from each URL. However, the loop doesn't work: it prints the same number every time. Please see below:

import urllib3
from bs4 import BeautifulSoup

http = urllib3.PoolManager()

url = open('MS Type 1 URL.txt',encoding='utf-8-sig')

links = []
for link in url:
    y = link.strip()
    links.append(y)

url.close()

print('Amount of Links: ', len(links))

for x in links:
    j = (x.find("=") + 1)
    g = (x.find('&housing'))
    print(link[j:g])

Results are:

http://millersamuel.com/aggy-data/home/query_report?area=38&housing_type=3&measure=4&query_type=quarterly&region=1&year_end=2020&year_start=1980 23

http://millersamuel.com/aggy-data/home/query_report?area=23&housing_type=1&measure=4&query_type=annual&region=1&year_end=2020&year_start=1980 23

As you can see, it prints the area number '23', which appears in only one of these URLs, and never the '38' from the other URL.

Solution

The bug is a wrong variable name. You iterate over the links list and bind each element to x, but you print a slice of link. After your first for loop finishes, link is still bound to the last line read from the file, so every iteration slices that same string and prints the same value. Changing print(link[j:g]) to print(x[j:g]) fixes it, but more descriptive variable names make this kind of mistake easier to spot, so here's the fixed version of your loop:

for link in links:
    j = link.find('=') + 1
    g = link.find('&housing')
    print(link[j:g])
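
To see why the original loop repeats one value, here is a minimal sketch that reproduces it without the input file. The two URLs are shortened copies of the ones in the question, and the names buggy and fixed are only for illustration:

```python
# Shortened versions of the two URLs from the question (illustration only).
links = [
    "http://millersamuel.com/aggy-data/home/query_report?area=38&housing_type=3",
    "http://millersamuel.com/aggy-data/home/query_report?area=23&housing_type=1",
]

# Stand-in for the file-reading loop: once it ends, `link` stays bound
# to the last element it saw.
for link in links:
    pass

buggy = []
fixed = []
for x in links:
    j = x.find("=") + 1
    g = x.find("&housing")
    buggy.append(link[j:g])  # slices the leftover `link`, not `x`
    fixed.append(x[j:g])     # slices the string the indices came from

print(buggy)  # ['23', '23']
print(fixed)  # ['38', '23']
```

Python does not unbind a loop variable when the loop ends, so the stale name raises no error and the mistake goes unnoticed.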

There is also a more robust way to extract the area value from a URL, using the standard library's query-string parser:

from urllib.parse import urlparse, parse_qs
url = 'http://millersamuel.com/aggy-data/home/query_report?area=38&housing_type=3&measure=4&query_type=quarterly&region=1&year_end=2020&year_start=1980'
area = parse_qs(urlparse(url).query)['area'][0]

So instead of using the str.find method, you can write this (links is the list built in your code):

for url in links:
    parsed_qs = parse_qs(urlparse(url).query)
    # skip URLs that have no "area" parameter
    if 'area' in parsed_qs:
        area = parsed_qs['area'][0]
        print(area)
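
If you want to reuse this, one option is to wrap it in a small helper. The following is just a sketch: the function name extract_area is made up for illustration, and the list is hard-coded with the two URLs from the question instead of being read from your file:

```python
from urllib.parse import urlparse, parse_qs


def extract_area(url):
    """Return the first `area` value in the URL's query string, or None."""
    # parse_qs maps each parameter name to a list of its values.
    values = parse_qs(urlparse(url).query).get("area")
    return values[0] if values else None


links = [
    "http://millersamuel.com/aggy-data/home/query_report?area=38&housing_type=3&measure=4&query_type=quarterly&region=1&year_end=2020&year_start=1980",
    "http://millersamuel.com/aggy-data/home/query_report?area=23&housing_type=1&measure=4&query_type=annual&region=1&year_end=2020&year_start=1980",
]

areas = [extract_area(link) for link in links]
print(areas)  # ['38', '23']
```

Unlike the str.find approach, this does not depend on area being the first parameter or on housing_type coming right after it, and a URL without area gives None instead of a wrong slice.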

Used functions: