I know there's a high chance of my hitting the XY problem here, so the first chunk of this is about the more general situation.
I have a set of data points containing abstract geographic-feature information, but no actual locations (absolute or relative). For the sake of example, let's call it the following list of cities that describes the local terrain, but with no coordinates or relative positioning:
From this, I want to write a program that can provide relative positions for these locations. For the above list, it might derive that A and D are near each other, E might be near them but is less likely, and B and C might be near each other. You can think of it as a graph with all the edges rubbed out, and the program tries to write them back in based on the properties of the nodes. Given that, it could then come up with some arbitrary coordinates for each city, from which a map might be drawn.
I don't need a unique solution – my end-goal here is plausible maps of unmapped-but-described fictional places.
This seems to me to be the sort of problem that's a good fit for e.g. Prolog or similar logic-engines, since it's basically just constraint resolution. However, I can't quite work it out myself. The issues I'm hitting at the moment relate to the way that, for example, two cities could have similar local features without being near the same instance of that larger feature. It's the difference between "This city is near some unspecified mountain" and "This city is near Mt. Foobar." The latter provides a strong constraint (two cities both near Mt. Foobar are near each other), but the latter only provides a guideline (two cities both near mountains are more likely to be near each other than one city near a mountain and another city not near a mountain).
How does one define (and provide solutions based on) probabilities, rather than absolutes, in Prolog or other logic/rules engines?
Here's an approach using MiniZinc (a very nice constraint modeling language).
The assumptions of the model is that there is a map of fixed places, e.g. where the mountains, hills, rivers, etc are. (I'm not sure if that is in your assumptions. For a different model, see below.)
The objective is then to place a number of cities (in this model cities A..H) using these constraints - near(P1,P2): P1 and P2 must be within a distance (defined by "near_distance") - on(P1,P2): the city P1 must be on or very near a fixed place - not_near(P1,P2): the places P1 and P2 must not be near
I changed one of the original constraint and added some more cities and constraints.
This model is also here: http://hakank.org/minizinc/place_cities2.mzn . The solution is after the model
include "globals.mzn";
% The places
enum places = {island,hill,coast,river,mountain,plains,
city_a,city_b,city_c,city_d,city_e,city_f,city_g,city_h
};
int: empty = 0;
set of int: fixed_places = {island,hill,coast,river,mountain,plains};
set of int: to_place = places diff fixed_places; % {city_a,city_b,city_c,city_d,city_e,city_f,city_g,city_h};
int: num_places = length(places);
int: max_x;
int: max_y;
int: near_distance;
array[1..max_x, 1..max_y] of int: data;
array[0..num_places] of string: places_s = array1d(0..num_places,
["-","i","h","c","r","m","p",
"A","B","C","D","E","F","G","H",
]);
% decision variables
% position of a city
array[to_place] of var 1..max_x: x;
array[to_place] of var 1..max_y: y;
% the grid (0 is an empty spot)
array[1..max_x, 1..max_y] of var 0..num_places: grid;
% on: must be really near.
% Assumption: p2 is a fixed_place
predicate on(var 1..num_places: p1, var 1..num_places: p2) =
exists(I in 1..max_x, J in 1..max_y) (
data[I,J] = p2 /\
pow(abs(x[p1]-I),2) + pow(abs(y[p1]-J),2) <= 1
)
;
% define the concept of near: atmost d distance apart
predicate near(var 1..num_places: p1, var 1..num_places: p2) =
exists(I in 1..max_x, J in 1..max_y) (
grid[I,J] = p2 /\
pow(abs(x[p1]-I),2) + pow(abs(y[p1]-J),2) <= near_distance
)
;
% not near: > d distance apart
predicate not_near(var int: p1, var int: p2) =
exists(I in 1..max_x, J in 1..max_y) (
grid[I,J] = p2 /\
pow(abs(x[p1]-I),2) + pow(abs(y[p1]-J),2) > near_distance
)
;
solve satisfy;
% solve :: int_search(x ++ y ++ array1d(grid), input_order, indomain_split, complete) satisfy;
% general constraints
constraint
% Here we ensure that:
% - a fixed place can only be positioned by the fixed place or a city
% - if an empty spot (in data[I,J]) then it can only be positioned by a city
forall(I in 1..max_x, J in 1..max_y) (
if data[I,J] != empty then
(grid[I,J] in {data[I,J]} union to_place)
/\ grid[I,J] != empty
else
grid[I,J] in to_place union {empty}
endif
)
;
% city constraints
constraint
% City A is on an island and on a hill.
on(city_a,island) /\
on(city_a, hill) /\
% City B is on the coast and near a river.
on(city_b,coast) /\
near(city_b,river) /\
% City C is on a mountain and near a river
on(city_c,mountain) /\
near(city_c,river) /\
% City D is on an island and on a hill.
on(city_d,island) /\
on(city_d,hill) /\
%%%City E is on an island and on plains.
% % on(city_e,island) /\
% Changed it to:
% City E is near the mountains and on plains
near(city_e, mountain) /\
on(city_e,plains)
% ADDED:
% City F is on mountains and near a river
/\
on(city_f, mountain) /\
near(city_f,river)
/\
near(city_g, mountain) /\
near(city_g, hill)
/\
on(city_h,plains) /\
% near(city_h,hill) % /\
% not_near(city_h,city_c) /\
not_near(city_h,city_f)
;
constraint
% connect the x[p] and y[p] arrays with grid[I,J]
forall(p in to_place) (
exists(I in 1..max_x, J in 1..max_y) (
x[p] = I /\ y[p] = J /\ grid[I,J] = p
)
)
% unique place in grid
% all cities have unique positions
/\ all_different([(x[p]*num_places-1)+ y[p] | p in to_place])
/\ % each city has just one place in the grid
forall(p in to_place) (
sum([grid[I,J] = p | I in 1..max_x, J in 1..max_y]) <= 1
)
;
output [
"x: \(x)\ny: \(y)\n"
]
++
[
join("", [places_s[fix(grid[I,J])] | J in 1..max_y]) ++ "\n"
| I in 1..max_x % , J in 1..max_y
]
;
%
% data
%
max_x = 15;
max_y = 15;
near_distance = 4;
data = array2d(1..max_x,1..max_y,
[
empty,empty,empty,empty,empty,empty,river,empty,empty,coast,empty,island,hill,hill,empty,
empty,empty,empty,empty,empty,empty,river,empty,empty,coast,empty,empty,island,island,empty,
empty,empty,empty,empty,empty,empty,river,empty,empty,empty,coast,coast,coast,coast,coast,
empty,empty,empty,empty,empty,empty,river,empty,empty,empty,empty,empty,empty,empty,empty,
empty,empty,empty,empty,empty,empty,river,empty,empty,empty,empty,empty,empty,empty,empty,
empty,empty,mountain,mountain,mountain,mountain,empty,empty,empty,empty,empty,empty,empty,empty,empty,
empty,empty,mountain,mountain,mountain,mountain,mountain,empty,empty,empty,hill,hill,hill,empty,empty,
empty,empty,empty,empty,empty,empty,empty,empty,empty,empty,hill,hill,hill,empty,empty,
empty,empty,empty,empty,plains,plains,plains,plains,empty,empty,empty,empty,empty,empty,empty,
empty,empty,empty,empty,plains,plains,plains,empty,empty,empty,empty,empty,empty,empty,empty,
empty,empty,empty,empty,empty,empty,empty,empty,empty,empty,empty,empty,empty,empty,empty,
empty,empty,empty,empty,empty,empty,mountain,mountain,mountain,empty,empty,empty,empty,empty,empty,
empty,empty,empty,empty,empty,empty,mountain,mountain,mountain,empty,empty,empty,empty,empty,empty,
empty,empty,empty,empty,empty,empty,empty,empty,empty,empty,empty,empty,empty,empty,empty,
empty,empty,empty,empty,empty,empty,empty,empty,empty,empty,empty,empty,empty,empty,empty,
]);
The data is based on this (fictional) map with the abbreviations.
i: island
h: hill
c: coast
r: river
m: mountain
p: plains
......r..c.ihh.
......r..c..ii.
......r...ccccc
......r........
......r........
..mmmm.........
..mmmmm...hhh..
..........hhh..
....pppp.......
....ppp........
...............
......mmm......
Here is one solution where the capital letters are the cities to be placed:
x: [1, 1, 5, 1, 9, 5, 7, 10]
y: [13, 9, 6, 12, 4, 5, 9, 4]
------r-Bc-DAh-
------r--c--ii-
------r---ccccc
------r--------
----FCr--------
--mmmm---------
--mmmmm-G-hhh--
----------hhh--
---Epppp-------
---Hppp--------
---------------
------mmm------
------mmm------
---------------
---------------
It is solved in some second with the Gecode FlatZinc solver (most of the time is converting to the general FlatZinc format).
One advantage of a Constraint Programming solver is that it's very easy to generate many solutions , e.g. compared to most MIP solvers and most/many SAT solvers.
Two further comments: - I first interpreted the question that there is now known position at all. The model is here: http://hakank.org/minizinc/place_cities.mzn Note that it assumes that there is only one mountain, one river, etc.