Creating Self-Organizing Maps

From the visualization controls in the upper right of the Visual Analytics canvas, click New Visualization.

Select Self-organizing Map.

Enter a Title and Description.

The title appears on the visualization when it is added to the canvas.

From the Show field, select a maximum of 10 input parameters to display in the self-organizing map.

Select input parameters to exclude from the calculation of the self-organizing map.

Optional: Clear the Use default settings option to configure advanced settings.

Set the Grid size.

The algorithm that creates a self-organizing map starts by creating a grid of hexagonal bins arranged in a square. The dimensions of the square are $n \times n$, where $n = \sqrt{\text{Number of design points}}$. The maximum value of $n$ is 25. (If the data set contains more than 625 data points, Results Analytics positions the data in a 25 × 25 grid.)
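The grid-size rule above can be sketched in a few lines of Python. This is only an illustration: the function name grid_side is hypothetical, and rounding a non-integer square root up is an assumption, because the text does not say how Results Analytics rounds.

```python
import math

MAX_GRID_SIDE = 25  # upper limit on n stated in the text


def grid_side(num_design_points):
    """Return the side length n of the n x n grid for a data set."""
    # Assumption: a non-integer square root is rounded up.
    n = math.ceil(math.sqrt(num_design_points))
    # Cap at 25, so data sets with more than 625 points use a 25 x 25 grid.
    return max(1, min(n, MAX_GRID_SIDE))


for count in (100, 400, 625, 1000):
    print(count, grid_side(count))
```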

Select the SOM type.

The Preserve topology of sample space technique tries to minimize the topology error.

The Maximize resolution of fit technique tries to determine the quality of the fit.

For more information, see Choosing the self-organizing map technique.
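As a rough illustration of the two quantities these techniques refer to, the sketch below uses definitions that are common in the self-organizing map literature: the topographic error (the fraction of data points whose two closest nodes are not neighbors on the grid) and the quantization error (the average distance from each data point to its closest node). These definitions, the function names, and the square-grid adjacency test are assumptions; the exact measures used by Results Analytics are not given here.

```python
import math


def two_best_nodes(sample, weights):
    """Return the two grid nodes whose weight vectors are closest to sample.

    weights maps a (row, col) grid position to a weight vector.
    """
    ranked = sorted(weights, key=lambda node: math.dist(sample, weights[node]))
    return ranked[0], ranked[1]


def quantization_error(samples, weights):
    """Average distance from each sample to its closest node."""
    total = 0.0
    for sample in samples:
        best, _ = two_best_nodes(sample, weights)
        total += math.dist(sample, weights[best])
    return total / len(samples)


def topographic_error(samples, weights):
    """Fraction of samples whose two closest nodes are not grid neighbors."""
    errors = 0
    for sample in samples:
        (r1, c1), (r2, c2) = two_best_nodes(sample, weights)
        # Assumption: square-grid adjacency (hexagonal bins would differ).
        if max(abs(r1 - r2), abs(c1 - c2)) > 1:
            errors += 1
    return errors / len(samples)


# Hypothetical 1 x 3 map with one parameter per node.
weights = {(0, 0): [0.0], (0, 1): [1.0], (0, 2): [0.1]}
samples = [[0.04], [0.9]]
print(topographic_error(samples, weights))
print(quantization_error(samples, weights))
```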

Enter the Maximum alternatives sampled.

Enter the Maximum iterations.

The Maximum iterations value is the maximum number of times that Results Analytics runs the outer loop of the self-organizing map algorithm with a new set of random weight vectors. For more information, see The Self-Organizing Maps Algorithm.

Enter the Initial learning rate value.

The learning rate controls the amount of deformation experienced by each node in the mesh. For the self-organizing map to converge, Results Analytics decreases the learning rate exponentially with time.

The default is 0.8, which means that a node is moved by a maximum of 80% toward the training data point during the first iteration. Higher values for the learning rate result in arriving at the correct self-organizing map quickly, but they can sometimes produce enormous deformations (invalid self-organizing maps).
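The effect of the learning rate can be illustrated with a minimal sketch. The exponential form and the time_constant parameter are assumptions chosen for illustration; the text states only that the rate decreases exponentially with time, not the schedule that Results Analytics uses.

```python
import math


def learning_rate(initial_rate, iteration, time_constant):
    """Exponentially decaying learning rate (assumed schedule)."""
    return initial_rate * math.exp(-iteration / time_constant)


def move_node(node_weights, sample, rate):
    """Move a node toward a training data point by the fraction `rate`."""
    return [w + rate * (x - w) for w, x in zip(node_weights, sample)]


node = [0.0, 0.0]
sample = [1.0, 2.0]

# First iteration with the default rate of 0.8: the node covers 80% of
# the distance to the data point, ending near [0.8, 1.6].
moved = move_node(node, sample, learning_rate(0.8, 0, time_constant=10.0))
print(moved)

# Later iterations use a smaller rate, so nodes move less.
print(learning_rate(0.8, 10, time_constant=10.0))
```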

Enter the Initial neighborhood radius value.

The neighborhood radius identifies the nodes in the neighborhood that are affected by a training data point. Similar to the learning rate, this radius is also reduced exponentially with time.

The default is 0.8, which means that nodes that lie within 80% of the span of the self-organizing map are influenced by the training data point in the first iteration, but that range of influence decreases very quickly in subsequent iterations. Low values for neighborhood radius can produce faster convergence, which in some cases can be premature (invalid self-organizing maps).
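The shrinking range of influence can be sketched as follows. Several details are assumptions made for illustration: the span is taken as the grid side minus one, distances are measured on a square grid rather than between hexagonal bins, the decay uses a hypothetical time_constant, and influence is a hard cutoff at the radius.

```python
import math


def neighborhood_radius(initial_fraction, iteration, time_constant, span):
    """Radius in grid units, decaying exponentially (assumed schedule)."""
    return initial_fraction * span * math.exp(-iteration / time_constant)


def affected_nodes(grid_side, winner, radius):
    """Grid positions within `radius` of the winning node."""
    wi, wj = winner
    return [
        (i, j)
        for i in range(grid_side)
        for j in range(grid_side)
        if math.hypot(i - wi, j - wj) <= radius
    ]


side = 10
span = side - 1  # assumed definition of the span of the map

# First iteration: a radius of 0.8 * span reaches most or all of the map.
early = affected_nodes(side, (5, 5), neighborhood_radius(0.8, 0, 5.0, span))
# Much later: the radius has shrunk so far that only the winner is affected.
late = affected_nodes(side, (5, 5), neighborhood_radius(0.8, 20, 5.0, span))
print(len(early), len(late))
```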

Enter a Random seed value.

When a SOM is initialized with a random map, the seed value is used to generate the random data for the initial random map. The SOM can appear different for the same data set with different random initializations (controlled by the seed value), but the trends between parameters in the data set are preserved.
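The role of the seed can be shown with a small sketch that uses the Python standard library random module. The function random_initial_map is hypothetical and is not how Results Analytics generates its initial map; it shows only that the same seed reproduces the same initial map and a different seed produces a different one.

```python
import random


def random_initial_map(grid_side, num_parameters, seed):
    """Return a grid_side x grid_side map of random weight vectors."""
    rng = random.Random(seed)  # generator fully determined by the seed
    return [
        [[rng.random() for _ in range(num_parameters)] for _ in range(grid_side)]
        for _ in range(grid_side)
    ]


a = random_initial_map(4, 3, seed=42)
b = random_initial_map(4, 3, seed=42)
c = random_initial_map(4, 3, seed=7)

print(a == b)  # same seed, identical initial map
print(a == c)  # different seed, different initial map
```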

Select Initialize using Principal Component Analysis.

To create self-organizing maps, Results Analytics creates an initial map that is trained with the data set. The initial map can be generated with random data or with the linear correlations among the parameters in the training data set obtained through Principal Component Analysis (PCA). When the linear correlations are used to generate the map, a substantial aspect of the data trend is already captured before beginning the training iterations. The nonlinear trends are captured during the training iterations.

Using PCA, a good self-organizing map can be obtained with fewer iterations. However, for certain data sets the linear trends can be misleading and dominate the self-organizing map. In such cases, random initialization is preferred, together with a higher maximum number of iterations to ensure convergence.
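One common way to build a PCA-based initial map is to spread the nodes over the plane spanned by the first two principal components of the data. The sketch below assumes NumPy is available, uses a square grid rather than hexagonal bins, and scales the grid to one standard deviation along each component. All of these choices, and the function name pca_initial_weights, are assumptions for illustration, not the Results Analytics implementation.

```python
import numpy as np


def pca_initial_weights(data, n):
    """Return an (n, n, d) array of initial node weights.

    Nodes are placed on the plane through the data mean that is spanned
    by the first two principal components. Requires at least two
    parameters (columns) and at least two data points (rows).
    """
    data = np.asarray(data, dtype=float)
    mean = data.mean(axis=0)
    centered = data - mean
    # Rows of vt are the principal directions; the singular values s
    # give the spread of the data along each direction.
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    std = s[:2] / np.sqrt(max(len(data) - 1, 1))
    # Grid coordinates from -1 to 1 along each principal direction.
    coords = np.linspace(-1.0, 1.0, n)
    weights = np.empty((n, n, data.shape[1]))
    for i, a in enumerate(coords):
        for j, b in enumerate(coords):
            weights[i, j] = mean + a * std[0] * vt[0] + b * std[1] * vt[1]
    return weights


rng = np.random.default_rng(0)
data = rng.normal(size=(50, 3))  # hypothetical data set: 50 points, 3 parameters
initial_map = pca_initial_weights(data, 5)
print(initial_map.shape)
```

Because the initial nodes already follow the main linear trends, training only has to bend the map to fit the nonlinear structure.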

Do one of the following:

Click Create and Close to create the new visualization and to return to the canvas.
Click Create to create the new visualization and to keep the new visualization window open.