I connected the tables 'projects' and 'project_freelancer' through a many-to-many relationship, because one project can have multiple freelancer and one freelancer can do multiple projects. In 'projects' I have only one primary key 'url'. In 'projects_freelancer' I have a composite primary key out of 'url' and 'freel_url'
Therefore I definded my association table like this:
association_table = Table('association', Base.metadata,
Column('projects_url', ForeignKey('projects.url', ondelete="CASCADE"), primary_key=True),
Column('project_freelancer_url'),
Column('project_freelancer_freel_url'),
ForeignKeyConstraint(
('project_freelancer_url', 'project_freelancer_freel_url'),
('project_freelancer.url', 'project_freelancer.freel_url')),
)
And here are my class-definitions (short version):
class Project(Base):
__tablename__ = "projects"
url = Column(String, primary_key=True)
datetime = Column(DateTime)
children2 = relationship("ProjectFree", secondary=association_table, back_populates='parents2', cascade="all, delete") # Many-to-many relationship
class ProjectFree(Base):
__tablename__ = "project_freelancer"
url = Column(String, primary_key=True)
freel_url = Column(String, primary_key=True)
update = Column(DateTime)
parents2 = relationship("Project", secondary=association_table, back_populates="children2", cascade="all, delete")
When filling my db I get the following error message:
ERROR [21-11-18 09:58:47] scrapy.core.scraper _itemproc_finished :249 Error processing {'added': datetime.datetime(2021, 11, 18, 9, 58, 47, 654089),
'freel_bidtext': 'Hi,\n'
'\n'
'I hope all is well with you.\n'
'\n'
'I have read the job description and finds it a perfect '
'match for me as I have extensive hands-on experience in '
'React JS/Vue JS development.\n'
'\n'
'I have developed a lot of websites and sto',
'freel_country': 'Pakistan',
'freel_deadline': 'in 7 days',
'freel_earnings': '0.0',
'freel_name': 'abdulhaseebbasit',
'freel_price': '$140 USD',
'freel_ranking': 3,
'freel_rating': '0.0',
'freel_reviews': '0',
'freel_url': 'https://www.freelancer.com/u/abdulhaseebbasit',
'update': datetime.datetime(1900, 1, 1, 0, 0),
'url': 'https://www.freelancer.com/projects/javascript/looking-for-react-developer-32135012/?ngsw-bypass=&w=f'}
Traceback (most recent call last):
File "c:\users\myuser\pycharmprojects\crawlerdiss\venv\lib\site-packages\sqlalchemy\engine\base.py", line 1802, in _execute_context
self.dialect.do_execute(
File "c:\users\myuser\pycharmprojects\crawlerdiss\venv\lib\site-packages\sqlalchemy\engine\default.py", line 719, in do_execute
cursor.execute(statement, parameters)
sqlite3.IntegrityError: UNIQUE constraint failed: association.projects_url
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "c:\users\myuser\pycharmprojects\crawlerdiss\venv\lib\site-packages\twisted\internet\defer.py", line 858, in _runCallbacks
current.result = callback( # type: ignore[misc]
File "c:\users\myuser\pycharmprojects\crawlerdiss\venv\lib\site-packages\scrapy\utils\defer.py", line 150, in f
return deferred_from_coro(coro_f(*coro_args, **coro_kwargs))
File "C:\Users\myuser\PycharmProjects\crawlerDiss\diss\diss\pipelines.py", line 42, in process_item
return self.handle_ProjectFreelancer(item, spider)
File "C:\Users\myuser\PycharmProjects\crawlerDiss\diss\diss\pipelines.py", line 160, in handle_ProjectFreelancer
session.commit()
File "c:\users\myuser\pycharmprojects\crawlerdiss\venv\lib\site-packages\sqlalchemy\orm\session.py", line 1428, in commit
self._transaction.commit(_to_root=self.future)
File "c:\users\myuser\pycharmprojects\crawlerdiss\venv\lib\site-packages\sqlalchemy\orm\session.py", line 829, in commit
self._prepare_impl()
File "c:\users\myuser\pycharmprojects\crawlerdiss\venv\lib\site-packages\sqlalchemy\orm\session.py", line 808, in _prepare_impl
self.session.flush()
File "c:\users\myuser\pycharmprojects\crawlerdiss\venv\lib\site-packages\sqlalchemy\orm\session.py", line 3345, in flush
self._flush(objects)
File "c:\users\myuser\pycharmprojects\crawlerdiss\venv\lib\site-packages\sqlalchemy\orm\session.py", line 3485, in _flush
transaction.rollback(_capture_exception=True)
File "c:\users\myuser\pycharmprojects\crawlerdiss\venv\lib\site-packages\sqlalchemy\util\langhelpers.py", line 70, in __exit__
compat.raise_(
File "c:\users\myuser\pycharmprojects\crawlerdiss\venv\lib\site-packages\sqlalchemy\util\compat.py", line 207, in raise_
raise exception
File "c:\users\myuser\pycharmprojects\crawlerdiss\venv\lib\site-packages\sqlalchemy\orm\session.py", line 3445, in _flush
flush_context.execute()
File "c:\users\myuser\pycharmprojects\crawlerdiss\venv\lib\site-packages\sqlalchemy\orm\unitofwork.py", line 456, in execute
rec.execute(self)
File "c:\users\myuser\pycharmprojects\crawlerdiss\venv\lib\site-packages\sqlalchemy\orm\unitofwork.py", line 579, in execute
self.dependency_processor.process_saves(uow, states)
File "c:\users\myuser\pycharmprojects\crawlerdiss\venv\lib\site-packages\sqlalchemy\orm\dependency.py", line 1182, in process_saves
self._run_crud(
File "c:\users\myuser\pycharmprojects\crawlerdiss\venv\lib\site-packages\sqlalchemy\orm\dependency.py", line 1245, in _run_crud
connection.execute(statement, secondary_insert)
File "c:\users\myuser\pycharmprojects\crawlerdiss\venv\lib\site-packages\sqlalchemy\engine\base.py", line 1289, in execute
return meth(self, multiparams, params, _EMPTY_EXECUTION_OPTS)
File "c:\users\myuser\pycharmprojects\crawlerdiss\venv\lib\site-packages\sqlalchemy\sql\elements.py", line 325, in _execute_on_connection
return connection._execute_clauseelement(
File "c:\users\myuser\pycharmprojects\crawlerdiss\venv\lib\site-packages\sqlalchemy\engine\base.py", line 1481, in _execute_clauseelement
ret = self._execute_context(
File "c:\users\myuser\pycharmprojects\crawlerdiss\venv\lib\site-packages\sqlalchemy\engine\base.py", line 1845, in _execute_context
self._handle_dbapi_exception(
File "c:\users\myuser\pycharmprojects\crawlerdiss\venv\lib\site-packages\sqlalchemy\engine\base.py", line 2026, in _handle_dbapi_exception
util.raise_(
File "c:\users\myuser\pycharmprojects\crawlerdiss\venv\lib\site-packages\sqlalchemy\util\compat.py", line 207, in raise_
raise exception
File "c:\users\myuser\pycharmprojects\crawlerdiss\venv\lib\site-packages\sqlalchemy\engine\base.py", line 1802, in _execute_context
self.dialect.do_execute(
File "c:\users\myuser\pycharmprojects\crawlerdiss\venv\lib\site-packages\sqlalchemy\engine\default.py", line 719, in do_execute
cursor.execute(statement, parameters)
sqlalchemy.exc.IntegrityError: (sqlite3.IntegrityError) UNIQUE constraint failed: association.projects_url
[SQL: INSERT INTO association (projects_url, project_freelancer_url, project_freelancer_freel_url) VALUES (?, ?, ?)]
[parameters: ('https://www.freelancer.com/projects/javascript/looking-for-react-developer-32135012/?ngsw-bypass=&w=f', 'https://www.freelancer.com/projects/javascript/looking-for-react-developer-32135012/?ngsw-bypass=&w=f', 'https://www.freelancer.com/u/abdulhaseebbasit')]
(Background on this error at: https://sqlalche.me/e/14/gkpj)
In the db-browser I can see, that only unique combinations are filled into the association table. So for a project already in the association table, with a different freelancer the error occurs. Do you have an idea what I have to change? Should I change the unique behavior anywhere?
Finally I get the solution and it was quite easy (as is so often the case). In the association table, I had to add , primary_key = True
for each Foreign Key. Here is the final association_table
:
association_table = Table('association', Base.metadata,
Column('projects_url', ForeignKey('projects.url', ondelete="CASCADE"), primary_key=True),
Column('project_freelancer_url', primary_key = True),
Column('project_freelancer_freel_url', primary_key = True),
ForeignKeyConstraint(
('project_freelancer_url', 'project_freelancer_freel_url'),
('project_freelancer.url', 'project_freelancer.freel_url')),
)
The idea come to my mind by reading the documentations tip here.