I am moving data from Spreadsheets to MySQL. So we know that in Spreadsheets usually there is no ID, instead, just text.
City;Country;...
New York;USA;...
Berlim;Germany;...
Munich,Germany,...
With that in mind, let's consider two tables:
Country : [ID, name]
City : [ID , country (FK) , name]
I dont want to create several countries with the same name -- but I want to use the existing one. Perfect, so, let's add a FUNCTION in the INSERT state that searches, insert (if needed) and return the Country ID.
So I created a Function to FIRST assess whether the Country exists if not then create a country
getCountry (parameter IN strCountry varchar(100))
BEGIN
SELECT ID INTO @id from `country` WHERE country.country = strCountry ;
IF (@id is NULL OR @id= 0) THEN
INSERT INTO `country` (country) VALUES (strCountry);
if (ROW_COUNT()>0) THEN
SET @id = LAST_INSERT_ID();
else
SET @id = NULL;
END IF;
END IF ;
RETURN @id;
END
And then I have DOZENS OF THOUSANDS of INSERTS such as
INSERT INTO city (name, country) VALUES ('name of the city', getCountry('new or existing one'));
The Function works well when executed alone, such as
SELECT getCountry('Aruba');
However, when I execute that in that VERY LONG SQL (22K+ rows) then it does not work.... it uses basically the latest ID that was created BEFORE starting the execution. Maybe I should "wait" the function execute and return a proper result? But How?
What am I doing wrong?
Instead of function why not use a Stored Procedure, then the procedure will process the checking and insertion.
https://www.mysqltutorial.org/getting-started-with-mysql-stored-procedures.aspx
DELIMITER $$
CREATE PROCEDURE `sp_city_add`(in p_city varchar(100), in p_country varchar(100))
BEGIN
DECLARE country_id INT;
IF (SELECT COUNT(1) FROM country WHERE country.country = p_country) = 0 THEN
INSERT INTO country (country) VALUE (p_country);
SET country_id = LAST_INSERT_ID();
ELSE
SELECT ID INTO country_id FROM country WHERE country.country = p_country;
END IF;
INSERT INTO city (name, country) VALUES (p_city, country_id);
END$$
DELIMITER ;
And if you want to execute a procedure
CALL sp_city_add('Bogota', 'Colombia');
CALL sp_city_add('Phnom Penh', 'Cambodia');
CALL sp_city_add('Yaounde', 'Cameroon');
CALL sp_city_add('Ottawa', 'Canada');
CALL sp_city_add('Santiago', 'Chile');
CALL sp_city_add('Beijing', 'China');
CALL sp_city_add('Bogotá', 'Colombia');
CALL sp_city_add('Moroni', 'Comoros');
You can also add a condition to check if the city and country exists to prevent duplicate entry.