Question:
I'm working on a Django project that uses Scrapy to scrape member profiles from a website. The scraped data is processed by a method called match_maker. However, I'm encountering an issue where match_maker only returns 4 members, despite having 150 members in the database (excluding 3 staff members).
Details:
Database: Contains 153 members; 3 are staff members, leaving 150 regular members.
Profile Types: Each member has a profile_type of either 'Man', 'Woman', 'Trans', or 'Couple'.
Issue:
In the match_maker method, there's a loop that processes rooms and assigns them to members. A set named used_rooms is used to track assigned rooms to ensure each room is only assigned once. The relevant code snippet is:
if room["username"] in used_rooms:
    continue
When this condition is active, only 4 members are returned. If I comment out this check, the method processes all 150 members, but the number of available rooms exceeds one million, which is incorrect.
Objective:
I need each room to be assigned to only one member, ensuring no more than one member owns a particular room. I'm looking for guidance on how to resolve this issue so that match_maker correctly processes all 150 members without assigning multiple members to the same room.
What I've Tried:
Ensured Uniqueness: Verified that room["username"] is unique for each room.
Debugged used_rooms: Printed the contents of used_rooms before and after the check to ensure it's being populated correctly.
Checked Room Data Structure: Confirmed that room["username"] is unique across all rooms.
Despite these efforts, the issue persists. Any insights or suggestions would be greatly appreciated.
Code snippet:
def match_maker(self, members, room_data: list):
    matched_items = []
    used_rooms = set()  # Keep track of assigned rooms
    room_data_copy = deepcopy(room_data)
    # Group rooms by profile_type for quick lookup
    rooms_by_type = defaultdict(list)
    for room in room_data_copy:
        if len(room["body"]) >= 5:  # Only store rooms with a valid description
            rooms_by_type[room["profile_type"]].append(room)
    users = set()
    for member in members:
        if member.is_staff:
            continue
        # Get matching rooms for this profile_type
        available_rooms = rooms_by_type.get(member.profile_type, [])
        for room in available_rooms:
            if room["username"] in used_rooms:
                continue
            if room["profile_type"] != member.profile_type:
                continue
            users.add(member.username)
            matched_items.append((member, room))
            used_rooms.add(room["username"])
    print(f"The users are: {users}")
    print(f"The number of users are: {len(users)}")
    random.shuffle(matched_items)
    return matched_items
Solution by @Serhii Fomenko:
def match_maker(self, members: QuerySet[Member], room_data: list):
    matched_items = []
    room_data_copy = deepcopy(room_data)
    # Group rooms by profile_type for quick lookup
    rooms_by_type: dict = defaultdict(list)
    for room in room_data_copy:
        if len(room["body"]) >= 5:  # Only store rooms with a valid description
            rooms_by_type[room["profile_type"]].append(room)
    members_by_type = defaultdict(list)
    for member in members:
        if not member.is_staff:
            members_by_type[member.profile_type].append(member)
    processed_usernames = set()
    for profile_type, members_list in members_by_type.items():
        members_it = cycle(members_list)
        for available_room in rooms_by_type[profile_type]:
            member = next(members_it)
            matched_items.append((member, available_room))
            processed_usernames.add(member.username)
    random.shuffle(matched_items)
    return matched_items
From the code you provided: in the nested loop, when the if room["username"] in used_rooms: condition is active, the first user with a given profile type occupies all available rooms of that type, because there is no break statement after a room has been assigned to a user. Every later user of the same type then finds all rooms already in used_rooms, which is why you end up with only one member per profile type. The loop should look something like this:
for room in available_rooms:
    if room["username"] in used_rooms:
        continue
    if room["profile_type"] != member.profile_type:
        continue
    users.add(member.username)
    matched_items.append((member, room))
    used_rooms.add(room["username"])
    break
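To see why the missing break matters, here is a minimal, self-contained sketch with made-up data. The Member tuple, the assign helper and the sample names are assumptions for illustration only, not the author's code. Without break, the first member of a type takes every room of that type; with break, each member takes at most one room.

```python
from collections import namedtuple

# Hypothetical stand-in for the Django model, for illustration only
Member = namedtuple("Member", ["username", "profile_type"])


def assign(members, rooms, stop_after_first):
    """Pair members with unused rooms of the same profile_type."""
    matched, used_rooms = [], set()
    for member in members:
        for room in rooms:
            if room["username"] in used_rooms:
                continue
            if room["profile_type"] != member.profile_type:
                continue
            matched.append((member.username, room["username"]))
            used_rooms.add(room["username"])
            if stop_after_first:
                break  # one room per member, then move on to the next member
    return matched


members = [Member("anna", "Woman"), Member("bea", "Woman")]
rooms = [
    {"username": "room1", "profile_type": "Woman"},
    {"username": "room2", "profile_type": "Woman"},
]

without_break = assign(members, rooms, stop_after_first=False)
with_break = assign(members, rooms, stop_after_first=True)
print(without_break)  # [('anna', 'room1'), ('anna', 'room2')]
print(with_break)     # [('anna', 'room1'), ('bea', 'room2')]
```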
This should now work if everything you wrote is true, namely that room["username"] is unique, and if the following check prints 'There are enough rooms' for every profile type:
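import itertools
import operator

# groupby only merges consecutive items, so sort by profile_type first;
# staff members are skipped because match_maker ignores them as well
by_type = operator.attrgetter('profile_type')
regular_members = sorted((m for m in members if not m.is_staff), key=by_type)
for profile_type, group in itertools.groupby(regular_members, key=by_type):
    if len(rooms_by_type[profile_type]) >= len(list(group)):
        print('There are enough rooms')
    else:
        print("There aren't enough rooms to assign to all users")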
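A note on itertools.groupby as used in this check: it only merges consecutive items that share a key. The members therefore need to be sorted by profile_type (or come from a queryset ordered by it) for each type to be counted once. A minimal sketch with made-up profile types:

```python
from itertools import groupby

profile_types = ["Man", "Woman", "Man"]  # made-up values, deliberately unsorted

# Without sorting, 'Man' shows up as two separate groups
unsorted_groups = [(key, len(list(group))) for key, group in groupby(profile_types)]
# After sorting, each profile type forms exactly one group
sorted_groups = [(key, len(list(group))) for key, group in groupby(sorted(profile_types))]

print(unsorted_groups)  # [('Man', 1), ('Woman', 1), ('Man', 1)]
print(sorted_groups)    # [('Man', 2), ('Woman', 1)]
```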
UPDATED
After a chat in which the author of the question provided test data, the function match_maker was reworked:
def match_maker(members, room_data: list):
    import random
    from collections import defaultdict
    from copy import deepcopy
    from itertools import cycle

    matched_items = []
    room_data_copy = deepcopy(room_data)
    rooms_by_type = defaultdict(list)
    for room in room_data_copy:
        if len(room["body"]) >= 5:
            rooms_by_type[room["profile_type"]].append(room)
    members_by_type = defaultdict(list)
    for member in members:
        if not member.is_staff:
            members_by_type[member.profile_type].append(member)
    processed_usernames = set()
    for profile_type, members_list in members_by_type.items():
        members_it = cycle(members_list)
        for available_room in rooms_by_type[profile_type]:
            member = next(members_it)
            matched_items.append((member, available_room))
            processed_usernames.add(member.username)
    random.shuffle(matched_items)
    print(processed_usernames)
    # {'par', 'solbad56', 'the2005', 'undergivenbs', 'x_Alexander__x', 'zobberzobber', 'Älskar52'}
    return matched_items
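The core of the reworked function is the round-robin pairing with itertools.cycle: every room is consumed exactly once, while members are reused when there are more rooms than members. A stripped-down sketch with made-up names (not the author's data):

```python
from itertools import cycle

# Made-up members and rooms that share one profile_type
member_names = ["anna", "bea"]
room_names = ["room1", "room2", "room3", "room4", "room5"]

# cycle() restarts the member list whenever it is exhausted,
# so the loop is driven by the rooms, not by the members
members_it = cycle(member_names)
pairs = [(next(members_it), room) for room in room_names]

print(pairs)
# [('anna', 'room1'), ('bea', 'room2'), ('anna', 'room3'), ('bea', 'room4'), ('anna', 'room5')]
```

Because the loop runs over the rooms, a profile type with fewer rooms than members leaves the remaining members without a room.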
Below is the data provided in the chat and how the verification was done; it returns the correct results as per the author's requirements. Data was provided for 10 members, 3 of which have the is_staff=True flag and are therefore skipped. For the remaining 7, rooms were allocated by profile_type. All members who are not staff have been processed.
from collections import Counter
from dataclasses import dataclass


@dataclass
class Member:
    username: str
    is_staff: bool
    profile_type: str
init_members = [
    Member(username='par', is_staff=False, profile_type='Man'),
    Member(username='Älskar52', is_staff=False, profile_type='Man'),
    Member(username='zobberzobber', is_staff=False, profile_type='Man'),
    Member(username='x_Alexander__x', is_staff=False, profile_type='Man'),
    Member(username='undergivenbs', is_staff=False, profile_type='Man'),
    Member(username='the2005', is_staff=False, profile_type='Man'),
    Member(username='testuser3', is_staff=True, profile_type='Förening'),
    Member(username='testuser2', is_staff=True, profile_type='Förening'),
    Member(username='testuser1', is_staff=True, profile_type='Trans'),
    Member(username='solbad56', is_staff=False, profile_type='Man'),
]
rooms_list = [
    {'body': 'Anyone but never especially lose nice. Think owner some whole more. Stop character expert space movement show once close.', 'profile_type': 'Man', 'username': 'igilbert'},
    {'body': 'Best report play nation yourself. Live show history fund prepare agent.', 'profile_type': 'Man', 'username': 'ujones'},
    {'body': 'Everybody machine imagine middle always. Support party Congress use heart guy class.\nCut test economy act talk. Film become social in his response. Together cost onto still appear let son.', 'profile_type': 'Man', 'username': 'jenningspatrick'},
    {'body': 'Side suddenly many present bit these.\nAgain right return job so. Natural why exist south decide system adult.', 'profile_type': 'Man', 'username': 'johnsonjorge'},
    {'body': 'Receive trouble leader every then. Wide detail fall. Other ahead street all detail.', 'profile_type': 'Man', 'username': 'chaneyamanda'},
    {'body': 'Religious such sport tend admit current rest. Identify think physical boy world. Box expert cultural prove toward last.', 'profile_type': 'Kvinna', 'username': 'rortega'},
    {'body': 'Contain each energy. Argue toward rise consumer to point provide. Say girl involve identify task control.\nInside nothing television crime. Quality institution mission system Congress bit.', 'profile_type': 'Man', 'username': 'qholmes'},
    {'body': 'Born behavior meeting often others enough history. Keep know from rule it ground. Bed prepare game never teacher meeting.\nSome away usually also. Three push few program whose.', 'profile_type': 'Heteropar', 'username': 'gunderwood'},
    {'body': 'Likely medical get even increase fish. Though those time really candidate. Agency skill according account ask trouble writer close.', 'profile_type': 'Heteropar', 'username': 'hallemily'},
    {'body': 'Any loss green wrong glass cell success. Claim level listen in easy.\nTake rise central claim often official itself. Age century true director. Issue race western school.', 'profile_type': 'Man', 'username': 'thatfield'},
    {'body': 'Act their but system bag stop. Single open nature old dream station over president.\nWithin such participant show. Your public their during face. Available middle bar church deal time reach.', 'profile_type': 'Man', 'username': 'monicachambers'},
    {'body': 'Easy itself mission language plant. Walk state job thank.\nMachine reason oil easy. Three personal much body however newspaper once.', 'profile_type': 'Heteropar', 'username': 'renee12'},
    {'body': 'Just there should hope course. Gas security bill apply why off. Effort strong media six.\nTown she important popular. Bag deep vote.', 'profile_type': 'Man', 'username': 'cooleymichelle'},
    {'body': 'Total note perhaps assume include kind. Bring area data ready. Walk want positive be. Relationship ready forget seem adult air.\nPicture player activity rule. Born stuff wall its.', 'profile_type': 'Man', 'username': 'ekelly'},
    {'body': 'Choose page carry simple coach final. Goal do necessary defense drive. Wish culture figure happen later how.', 'profile_type': 'Man', 'username': 'wilsonwilliam'},
    {'body': 'Attention alone class task. Agree American choose. Fact may civil ago water. Score build door job defense free staff.\nAmerican ago miss there subject. Ok never find heavy natural color eye.', 'profile_type': 'Man', 'username': 'imartinez'},
    {'body': 'Range week agreement about attention Mr. Rule family development.\nGrow company short fund these. Goal executive theory various success north. Realize exactly together cup.', 'profile_type': 'Man', 'username': 'lrivas'},
    {'body': 'Study record coach town. Stop character military half score.\nPass fish election writer evidence sit easy. Fast size experience.\nBehind word age professional. Yeah by tree century.', 'profile_type': 'Man', 'username': 'gperry'},
    {'body': 'Air what easy once.\nStock cause make western will. Significant west again option expect short across family. Big base follow big these effect through.', 'profile_type': 'Man', 'username': 'wendy37'},
    {'body': 'Situation establish because type sense. Sense wind perform.\nIn prove remain commercial explain. Able culture great international consider although piece.', 'profile_type': 'Man', 'username': 'stonepeggy'},
    {'body': 'Traditional color whom make position animal quite. Break major sure remember wide choice process. Star serve street crime city five. Military describe everybody including word pressure.', 'profile_type': 'Man', 'username': 'bbrown'},
    {'body': 'Scene federal per shake go all. Dinner year shake memory. Suggest happy address discussion affect late.\nReduce do economy job room allow. Benefit son cut my themselves ten. Whether world bar reality.', 'profile_type': 'Man', 'username': 'yford'},
    {'body': 'Talk purpose possible yard statement.\nCase doctor by. Remember his I throw I seek.\nSpring service radio phone. Way same far. Cause total age court.', 'profile_type': 'Man', 'username': 'emccarthy'},
    {'body': 'Door reality turn key. Woman responsibility significant name friend.', 'profile_type': 'Heteropar', 'username': 'alexanderalvarez'},
]
result = match_maker(members=init_members, room_data=rooms_list)
print(result)
# [
#     (Member(username='undergivenbs', is_staff=False, profile_type='Man'), {'body': 'Receive trouble leader every then. Wide detail fall. Other ahead street all detail.', 'profile_type': 'Man', 'username': 'chaneyamanda'}),
#     (Member(username='Älskar52', is_staff=False, profile_type='Man'), {'body': 'Situation establish because type sense. Sense wind perform.\nIn prove remain commercial explain. Able culture great international consider although piece.', 'profile_type': 'Man', 'username': 'stonepeggy'}),
#     (Member(username='zobberzobber', is_staff=False, profile_type='Man'), {'body': 'Traditional color whom make position animal quite. Break major sure remember wide choice process. Star serve street crime city five. Military describe everybody including word pressure.', 'profile_type': 'Man', 'username': 'bbrown'}),
#     (Member(username='solbad56', is_staff=False, profile_type='Man'), {'body': 'Any loss green wrong glass cell success. Claim level listen in easy.\nTake rise central claim often official itself. Age century true director. Issue race western school.', 'profile_type': 'Man', 'username': 'thatfield'}),
#     (Member(username='par', is_staff=False, profile_type='Man'), {'body': 'Anyone but never especially lose nice. Think owner some whole more. Stop character expert space movement show once close.', 'profile_type': 'Man', 'username': 'igilbert'}),
#     (Member(username='x_Alexander__x', is_staff=False, profile_type='Man'), {'body': 'Scene federal per shake go all. Dinner year shake memory. Suggest happy address discussion affect late.\nReduce do economy job room allow. Benefit son cut my themselves ten. Whether world bar reality.', 'profile_type': 'Man', 'username': 'yford'}),
#     (Member(username='undergivenbs', is_staff=False, profile_type='Man'), {'body': 'Talk purpose possible yard statement.\nCase doctor by. Remember his I throw I seek.\nSpring service radio phone. Way same far. Cause total age court.', 'profile_type': 'Man', 'username': 'emccarthy'}),
#     (Member(username='par', is_staff=False, profile_type='Man'), {'body': 'Act their but system bag stop. Single open nature old dream station over president.\nWithin such participant show. Your public their during face. Available middle bar church deal time reach.', 'profile_type': 'Man', 'username': 'monicachambers'}),
#     (Member(username='undergivenbs', is_staff=False, profile_type='Man'), {'body': 'Attention alone class task. Agree American choose. Fact may civil ago water. Score build door job defense free staff.\nAmerican ago miss there subject. Ok never find heavy natural color eye.', 'profile_type': 'Man', 'username': 'imartinez'}),
#     (Member(username='x_Alexander__x', is_staff=False, profile_type='Man'), {'body': 'Side suddenly many present bit these.\nAgain right return job so. Natural why exist south decide system adult.', 'profile_type': 'Man', 'username': 'johnsonjorge'}),
#     (Member(username='Älskar52', is_staff=False, profile_type='Man'), {'body': 'Just there should hope course. Gas security bill apply why off. Effort strong media six.\nTown she important popular. Bag deep vote.', 'profile_type': 'Man', 'username': 'cooleymichelle'}),
#     (Member(username='x_Alexander__x', is_staff=False, profile_type='Man'), {'body': 'Choose page carry simple coach final. Goal do necessary defense drive. Wish culture figure happen later how.', 'profile_type': 'Man', 'username': 'wilsonwilliam'}),
#     (Member(username='Älskar52', is_staff=False, profile_type='Man'), {'body': 'Best report play nation yourself. Live show history fund prepare agent.', 'profile_type': 'Man', 'username': 'ujones'}),
#     (Member(username='zobberzobber', is_staff=False, profile_type='Man'), {'body': 'Total note perhaps assume include kind. Bring area data ready. Walk want positive be. Relationship ready forget seem adult air.\nPicture player activity rule. Born stuff wall its.', 'profile_type': 'Man', 'username': 'ekelly'}),
#     (Member(username='the2005', is_staff=False, profile_type='Man'), {'body': 'Range week agreement about attention Mr. Rule family development.\nGrow company short fund these. Goal executive theory various success north. Realize exactly together cup.', 'profile_type': 'Man', 'username': 'lrivas'}),
#     (Member(username='the2005', is_staff=False, profile_type='Man'), {'body': 'Contain each energy. Argue toward rise consumer to point provide. Say girl involve identify task control.\nInside nothing television crime. Quality institution mission system Congress bit.', 'profile_type': 'Man', 'username': 'qholmes'}),
#     (Member(username='solbad56', is_staff=False, profile_type='Man'), {'body': 'Study record coach town. Stop character military half score.\nPass fish election writer evidence sit easy. Fast size experience.\nBehind word age professional. Yeah by tree century.', 'profile_type': 'Man', 'username': 'gperry'}),
#     (Member(username='par', is_staff=False, profile_type='Man'), {'body': 'Air what easy once.\nStock cause make western will. Significant west again option expect short across family. Big base follow big these effect through.', 'profile_type': 'Man', 'username': 'wendy37'}),
#     (Member(username='zobberzobber', is_staff=False, profile_type='Man'), {'body': 'Everybody machine imagine middle always. Support party Congress use heart guy class.\nCut test economy act talk. Film become social in his response. Together cost onto still appear let son.', 'profile_type': 'Man', 'username': 'jenningspatrick'}),
# ]
counter = Counter()
for m, r in result:
    counter[m.username] += 1
print(counter)
# Counter({'zobberzobber': 3, 'Älskar52': 3, 'par': 3, 'x_Alexander__x': 3, 'undergivenbs': 3, 'solbad56': 2, 'the2005': 2})
member_profile_types = {v.profile_type for v in init_members if not v.is_staff}
total_matched_rooms_count = sum(v['profile_type'] in member_profile_types for v in rooms_list)
print(total_matched_rooms_count) # 19
print(sum(counter.values()) == total_matched_rooms_count) # True
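The check above confirms that every eligible room was handed out. Since the original requirement was that no room is owned by more than one member, a complementary check is to count how often each room username appears in the result. A minimal sketch; the helper name rooms_assigned_once and the sample pairs are assumptions for illustration:

```python
from collections import Counter


def rooms_assigned_once(matched_items):
    """Return True if no room username occurs in more than one (member, room) pair."""
    room_counts = Counter(room["username"] for _member, room in matched_items)
    return all(count == 1 for count in room_counts.values())


# Made-up pairs: plain usernames stand in for Member objects
ok_pairs = [("anna", {"username": "room1"}), ("anna", {"username": "room2"})]
bad_pairs = [("anna", {"username": "room1"}), ("bea", {"username": "room1"})]

print(rooms_assigned_once(ok_pairs))   # True
print(rooms_assigned_once(bad_pairs))  # False
```

Calling rooms_assigned_once(result) on the output of match_maker should return True as long as the scraped room usernames are unique, because the reworked loop visits each room exactly once.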