-
From the visualization controls in the upper right of the Visual Analytics canvas,
click New Visualization.
-
Select Self-organizing Map.
-
Enter a Title and Description.
The title appears on the visualization when it is added to the canvas.
-
From the Show field, select a maximum of 10 input parameters to
display in the self-organizing map.
-
Select input parameters to exclude from the calculation of the self-organizing
map.
- Optional:
Clear the Use default settings option to configure advanced
settings.
-
Set the Grid size.
The algorithm that creates a self-organizing map starts by creating a grid of
hexagonal bins arranged in a square. The dimensions of the square are n × n,
where n depends on the number of data points. The maximum value of n
is 25. (If the data set contains more than 625 data points, Results Analytics positions the data in a 25 × 25 grid.)
-
Select the SOM type.
The Preserve topology of sample space technique tries to
minimize the topology error.
The Maximize resolution of fit technique tries to determine the quality of the fit.
For more information, see Choosing the self-organizing map technique.
-
Enter the Maximum alternatives sampled.
-
Enter the Maximum iterations.
The Maximum iterations value is the maximum number of times Results Analytics runs the outer loop of the self-organizing map algorithm with a new set of random
weight vectors. For more information, see The Self-Organizing Maps Algorithm.
-
Enter the Initial learning rate value.
The learning rate controls the amount of deformation experienced by each node in the
mesh. For the self-organizing map to converge, Results Analytics decreases the learning rate exponentially with time.
The default is 0.8, which means that a node is moved by a maximum of 80% toward the
training data point during the first iteration. Higher values for the learning rate
arrive at the correct self-organizing map more quickly, but they can sometimes
produce enormous deformations (invalid self-organizing maps).
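The effect of the learning rate can be illustrated with a minimal Python sketch. It is not the Results Analytics implementation: the decay formula, the `time_constant` parameter, and the function names are assumptions chosen only to show an exponentially decreasing rate and the "80% toward the training data point" behavior in the first iteration.

```python
import math

def learning_rate(initial_rate, iteration, time_constant):
    """Exponentially decayed learning rate (assumed schedule, for illustration)."""
    return initial_rate * math.exp(-iteration / time_constant)

def move_toward(node, sample, rate):
    """Move a node's weight vector a fraction `rate` of the way toward a sample."""
    return [w + rate * (x - w) for w, x in zip(node, sample)]

node = [0.0, 0.0]      # hypothetical node weights
sample = [10.0, 5.0]   # hypothetical training data point

# Iteration 0: the full initial rate of 0.8 applies, so the node moves 80% of the way.
first = move_toward(node, sample, learning_rate(0.8, 0, time_constant=10.0))

# Iteration 30: the rate has decayed, so the same sample moves the node far less.
later = move_toward(node, sample, learning_rate(0.8, 30, time_constant=10.0))

print(first, later)
```

With an initial rate close to 1, the first few samples can pull nodes almost all the way to individual data points, which is one way to picture the large deformations mentioned above.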
-
Enter the Initial neighborhood radius value.
The neighborhood radius identifies the nodes in the neighborhood that are affected by a
training data point. Similar to the learning rate, this radius is also reduced
exponentially with time.
The default is 0.8, which means that nodes that lie within 80% of the span of the
self-organizing map are influenced by the training data point in the first iteration,
but that range of influence decreases very quickly in subsequent iterations. Low values
for the neighborhood radius can produce faster convergence, which in some cases is
premature and results in an invalid self-organizing map.
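As a rough illustration of how a shrinking neighborhood radius limits which nodes a training data point can influence, consider the following Python sketch. It makes several simplifying assumptions that do not come from the product: a square lattice instead of hexagonal bins, Euclidean distance between grid positions, an exponential decay with a hypothetical `time_constant`, and a map span taken as the grid size minus one.

```python
import math

def neighborhood_radius(initial_fraction, span, iteration, time_constant):
    """Radius in grid units, decayed exponentially (assumed schedule)."""
    return initial_fraction * span * math.exp(-iteration / time_constant)

def nodes_in_radius(grid_size, winner, radius):
    """Grid positions whose distance to the winning node is within the radius."""
    wr, wc = winner
    return [
        (r, c)
        for r in range(grid_size)
        for c in range(grid_size)
        if math.hypot(r - wr, c - wc) <= radius
    ]

grid_size = 10
span = grid_size - 1   # assumed definition of the map span
winner = (5, 5)        # hypothetical best-matching node

early = nodes_in_radius(grid_size, winner, neighborhood_radius(0.8, span, 0, 5.0))
late = nodes_in_radius(grid_size, winner, neighborhood_radius(0.8, span, 20, 5.0))

print(len(early), len(late))
```

In this sketch, the first iteration reaches every node in the grid, while after 20 iterations only the winning node itself is still affected.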
-
Enter a Random seed value.
When a SOM is initialized with a random map, the seed value is used to generate the
random data for the initial random map. The SOM can appear different for the same data
set with different random initializations (controlled by the seed value), but the trends
between parameters in the data set are preserved.
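The role of the seed can be sketched in a few lines of Python. This is only an illustration using the standard library random number generator; the function name and the map layout are hypothetical, and the generator that Results Analytics uses is not specified here.

```python
import random

def random_initial_map(grid_size, n_params, seed):
    """Build a grid_size x grid_size map of random weight vectors from a seed."""
    rng = random.Random(seed)  # dedicated generator, so the result depends only on the seed
    return [
        [[rng.random() for _ in range(n_params)] for _ in range(grid_size)]
        for _ in range(grid_size)
    ]

map_a = random_initial_map(5, 3, seed=42)
map_b = random_initial_map(5, 3, seed=42)
map_c = random_initial_map(5, 3, seed=7)

print(map_a == map_b)  # same seed, same initial map
print(map_a == map_c)  # different seed, different initial map
```

Reusing a seed value therefore makes a randomly initialized map reproducible, while changing it gives a different starting point for the training.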
-
Select Initialize using Principal Component Analysis.
To create self-organizing maps, Results Analytics creates an initial map that is trained with the data set. The initial map can be
generated with random data or with the linear correlations among the parameters in the
training data set obtained through Principal Component Analysis (PCA). When the linear
correlations are used to generate the map, a substantial aspect of the data trend is
already captured before beginning the training iterations. The nonlinear trends are
captured during the training iterations.
Using PCA, a good self-organizing map can be obtained with fewer
iterations. However, for certain data sets the linear trends can be misleading and
dominate the self-organizing map. In such cases, random initialization is preferred,
together with a higher maximum number of iterations to ensure convergence.
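One common way to build a PCA-based initial map is to spread the node weights over the plane spanned by the first two principal components of the data. The Python sketch below, which assumes NumPy is available, shows that idea only; the function name, the scaling of the grid to one standard deviation along each component, and the sample data are assumptions, not the Results Analytics method.

```python
import numpy as np

def pca_initial_map(data, grid_size):
    """Place initial node weights on the plane of the first two principal components."""
    mean = data.mean(axis=0)
    centered = data - mean
    # SVD of the centered data: rows of vt are the principal directions,
    # and the singular values s give the spread along each direction.
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    scale = s[:2] / np.sqrt(len(data) - 1)  # standard deviations along PC1 and PC2
    steps = np.linspace(-1.0, 1.0, grid_size)
    a, b = np.meshgrid(steps, steps, indexing="ij")
    # Each node is the data mean plus a linear combination of PC1 and PC2.
    return mean + a[..., None] * scale[0] * vt[0] + b[..., None] * scale[1] * vt[1]

# Hypothetical data set: the second parameter is almost linear in the first.
rng = np.random.default_rng(0)
x = rng.normal(size=200)
data = np.column_stack([x, 2.0 * x + 0.1 * rng.normal(size=200), rng.normal(size=200)])

weights = pca_initial_map(data, grid_size=5)
print(weights.shape)  # (grid_size, grid_size, number of parameters)
```

Because such a map already follows the dominant linear correlations, training mainly has to bend it to fit the nonlinear trends. The same property explains why a misleading linear trend can dominate the result.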
-
Do one of the following:
- Click Create and Close to create the new visualization and
return to the canvas.
- Click Create to create the new visualization and keep
the new visualization window open.