k-Nearest Neighbors Prediction
Using k-Nearest Neighbors Prediction in XLMiner™: In XLMiner™, select Prediction --> k-Nearest Neighbors. Clicking the k-Nearest Neighbors option opens the first dialog box, which has these options:
Data range: Select the data range manually.
Variables in input data: Select the input variables and the output variables from the list.
Click Next to open the second dialog box, which has the following options.
Normalize input Data: Normalizing the data ensures that the distance measure gives equal weight to each variable. Without normalization, the variable with the largest scale dominates the measure.
Number of Nearest Neighbors (k): There is no simple rule for selecting k. If k is too small, the prediction for a case (row) will be quite variable; in the extreme case of k = 1, it depends entirely on the output value of the single case to which it is closest. If k is too large, the prediction is averaged over distant cases and loses local detail. Typically, k is in the units or tens.
Scoring option: Select one of the two options. If you select Score on specified value of k as above, XLMiner™ uses the specified value of k for scoring. If you select Score on best k between 1 and specified value, XLMiner™ builds models in parallel for all values of k up to the specified maximum and scores using the best of these models.
Score training data: Select this option to show an assessment of the performance in predicting the training data. The report is displayed according to your specifications: Detailed, Summary, and Lift charts.
Score validation data: Select this option to show an assessment of the performance in predicting the validation data. The report is displayed according to your specifications: Detailed, Summary, and Lift charts.
Score Test Data: The options in this group let you apply the model to the test partition, if one was created earlier. The "Score Test Data" option is available only if the dataset contains a test partition. Select it to apply the model to the test data.
Score new Data: The options in this group let you apply the model to an entirely new dataset. Specify where the new data is located. See the Example of Discriminant Analysis for detailed instructions.
Score New data in database: See the Example of Discriminant Analysis for detailed instructions.
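
The Normalize input Data, Number of Nearest Neighbors (k), and Score on best k options described above can be illustrated with a small Python sketch. This is not XLMiner™'s implementation. The function names, the toy data, the use of z-score normalization, Euclidean distance, a plain average of the neighbors' outputs, and validation RMSE as the criterion for the "best" k are all assumptions made for illustration; the help text does not specify any of them.

```python
import math

def zscore_params(rows):
    """Per-column mean and standard deviation, computed on the training rows."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    # A constant column has std 0; fall back to 1.0 to avoid dividing by zero.
    stds = [math.sqrt(sum((x - m) ** 2 for x in c) / len(c)) or 1.0
            for c, m in zip(cols, means)]
    return means, stds

def normalize(rows, means, stds):
    """Rescale each column so that no variable dominates the distance (assumed z-score)."""
    return [[(x - m) / s for x, m, s in zip(r, means, stds)] for r in rows]

def knn_predict(train_x, train_y, query, k):
    """Average the outputs of the k training rows closest to `query` (assumed Euclidean)."""
    nearest = sorted(zip(train_x, train_y),
                     key=lambda pair: math.dist(pair[0], query))[:k]
    return sum(y for _, y in nearest) / len(nearest)

def rmse(train_x, train_y, xs, ys, k):
    """Root mean squared prediction error on the rows (xs, ys) for a given k."""
    sq = [(knn_predict(train_x, train_y, x, k) - y) ** 2 for x, y in zip(xs, ys)]
    return math.sqrt(sum(sq) / len(sq))

def best_k(train_x, train_y, valid_x, valid_y, max_k):
    """Try every k from 1 to max_k; keep the one with the lowest validation RMSE (assumed criterion)."""
    return min(range(1, max_k + 1),
               key=lambda k: rmse(train_x, train_y, valid_x, valid_y, k))

# Hypothetical toy data: the second input is on a 100x larger scale than the first.
train_x = [[1.0, 100.0], [2.0, 200.0], [3.0, 300.0], [4.0, 400.0]]
train_y = [10.0, 20.0, 30.0, 40.0]
valid_x, valid_y = [[2.0, 200.0]], [20.0]

means, stds = zscore_params(train_x)
train_z = normalize(train_x, means, stds)
valid_z = normalize(valid_x, means, stds)  # reuse the training statistics

pred_k1 = knn_predict(train_z, train_y, valid_z[0], k=1)   # 20.0
pred_k3 = knn_predict(train_z, train_y, valid_z[0], k=3)   # 20.0
chosen_k = best_k(train_z, train_y, valid_z, valid_y, max_k=3)  # 1
print(pred_k1, pred_k3, chosen_k)
```

In this sketch the validation rows are normalized with the means and standard deviations of the training rows, so that both partitions are measured on the same scale. When several values of k tie on validation error, `min` keeps the smallest one.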