Abstract:
It is shown in this report that unsupervised Self-Organizing Maps (SOMs), as well as supervised learning by Learning Vector Quantization (LVQ) can be defined for string variables, too. Their computing becomes possible when the SOM and the LVQ algorithms are expressed as batch versions, and when the average over a list of symbol strings is computed as the minimum sum of generalized distance functions of a string from all the other strings. Special considerations are necessary in order to initialize the SOM properly. If a special distance measure called the feature distance is used, the winner string can be located almost immediately by the method called the Redundant Hash Addressing, whereby the number of computations in winner search is almost independent of the SOM size.