I'm writing a program that clusters numeric data using Kohonen Self-Organizing Maps, and I'm trying to make it as generic as possible. How do I choose an appropriate initial neighborhood size relative to the number of items in the dataset (the number of output nodes)?
Suggestions would also be greatly appreciated. :)
You can define the neighborhood function however you wish.
If you define a circle around each best matching unit (BMU), you can choose its radius to be proportional to the size of your grid in the low-dimensional space (likely 2D). You can then either keep the radius constant or shrink it according to some criterion until the last iteration.
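As a rough sketch of the radius idea above: the starting radius is taken as a fraction of the larger grid side, and it then decays exponentially toward a final value. The 0.5 fraction, the final radius of 1.0, the exponential schedule, and all function names here are assumptions chosen for illustration, not the only reasonable choices.

```python
import math

def initial_radius(rows, cols, fraction=0.5):
    """Starting radius as a fraction of the larger grid side.
    The default fraction of 0.5 is an assumption, not a rule."""
    return fraction * max(rows, cols)

def radius_at(step, n_steps, start, end=1.0):
    """Shrink the radius exponentially from `start` (step 0) to `end` (last step).
    Exponential decay is one possible criterion; linear decay would also work."""
    if n_steps < 2:
        return start
    progress = step / (n_steps - 1)
    return start * (end / start) ** progress

def neighbors_within(bmu, radius, rows, cols):
    """Grid coordinates whose Euclidean distance to the BMU is at most `radius`."""
    bmu_row, bmu_col = bmu
    return [
        (r, c)
        for r in range(rows)
        for c in range(cols)
        if math.hypot(r - bmu_row, c - bmu_col) <= radius
    ]
```

For a 10x10 grid this would start with a radius of 5 and end with a radius of 1, at which point only the BMU and its four direct neighbors are updated. To keep the radius constant instead, simply skip the call to radius_at.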
Another option is to make the neighborhood fixed, that is, a set number of nodes surrounding the BMU.
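A minimal sketch of a fixed neighborhood, assuming a rectangular 2D grid and a square window around the BMU. The function name and the `reach` parameter are hypothetical; with reach=1 the window covers the BMU plus its 8 surrounding nodes, clipped at the grid edges.

```python
def fixed_neighborhood(bmu, rows, cols, reach=1):
    """Nodes in a square window of side 2*reach + 1 centered on the BMU.
    The window is clipped at the grid border, so edge BMUs get fewer neighbors."""
    bmu_row, bmu_col = bmu
    row_lo, row_hi = max(0, bmu_row - reach), min(rows - 1, bmu_row + reach)
    col_lo, col_hi = max(0, bmu_col - reach), min(cols - 1, bmu_col + reach)
    return [
        (r, c)
        for r in range(row_lo, row_hi + 1)
        for c in range(col_lo, col_hi + 1)
    ]
```

The window size here never changes during training, so the only thing left to tune is `reach`.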