|
Examples:
Data Size: Different versions of XLMiner™ have varying limits on the size of data. The size of the data depicted in the example below may not be supported by your version. Refer to Data Handling Specifications for details.
Let us apply this utility to Irisfacto.xls, a small dataset, to understand the features of Create Dummies and Create Category Scores. This dataset is derived from Iris.xls. Species_name is a string variable.
- Select XLMiner --> Data Utilities --> Transform Categorical Data --> Create Dummies.

Select Species_name and click OK. See the output.
Interpretation: As seen above, the variable Species_name is expressed as two dummy variables, Species_name_Verginica and Species_name_Versicolor, which act as switches. Species_name_Verginica takes the value 1 only when Species_name = "Verginica" in the dataset; otherwise, Species_name_Verginica = 0. The same is true for the other dummy variable, Species_name_Versicolor.
The variable Species_name takes one more value in the dataset, "Setosa", so you may wonder why there is no dummy variable Species_name_Setosa. Look at the values of the two dummy variables for Row Id = 3: both are 0, and the value of Species_name for the 3rd record is "Setosa". When both dummy variables are 0, the value is therefore known to be "Setosa", which is why no column for Species_name_Setosa is included. In this way, XLMiner™ converts a string variable into 0/1 dummy variables, and the dataset is now numeric.
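The dummy coding described above can be sketched in a few lines of Python. This is only an illustration, not XLMiner's implementation: the sample rows and the `create_dummies` helper are hypothetical, and the sketch assumes the omitted (baseline) category is the alphabetically first one, which happens to match "Setosa" here.

```python
# Hypothetical sample rows; the real Irisfacto.xls data is not reproduced here.
species = ["Versicolor", "Verginica", "Setosa", "Verginica"]


def create_dummies(values, name):
    """Build one 0/1 column per category, omitting a baseline category.

    Assumption: the baseline is the alphabetically first category.
    """
    categories = sorted(set(values))
    baseline, kept = categories[0], categories[1:]
    columns = {
        f"{name}_{cat}": [1 if v == cat else 0 for v in values]
        for cat in kept
    }
    return baseline, columns


baseline, dummies = create_dummies(species, "Species_name")
for column, flags in dummies.items():
    print(column, flags)
print("baseline (all dummies are 0):", baseline)
```

In this sample the 3rd record is "Setosa", so both dummy columns hold 0 in that position, mirroring Row Id = 3 in the example.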
- Select XLMiner --> Data Utilities --> Transform Categorical Data --> Create Category Scores.

Select Species_name and retain the default option of Assign numbers 1,2,3....

Interpretation: The output shows that XLMiner™ sorts the values of this variable alphabetically and assigns the numbers 1, 2, 3, ... to them (starting from 1 because we selected Assign numbers 1,2,3...). A variable, Species_name_ord, is created to store these assigned numbers. If we had selected Assign numbers 0,1,2..., Species_name_ord would instead hold the values 0, 1, 2, .... Thus the variable Species_name is categorized.
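The category scoring described above (sort the distinct values alphabetically, then number them) can be sketched as follows. The `create_category_scores` helper and the sample rows are hypothetical, written only to illustrate the idea; they are not part of XLMiner.

```python
def create_category_scores(values, start=1):
    """Map each distinct value to an integer, numbering the sorted values from `start`."""
    categories = sorted(set(values))
    score = {cat: i for i, cat in enumerate(categories, start=start)}
    return [score[v] for v in values]


# Hypothetical sample rows; the real Irisfacto.xls data is not reproduced here.
species = ["Versicolor", "Verginica", "Setosa", "Verginica"]

# Counterpart of "Assign numbers 1,2,3...": Setosa=1, Verginica=2, Versicolor=3
species_name_ord = create_category_scores(species, start=1)

# Counterpart of "Assign numbers 0,1,2...": Setosa=0, Verginica=1, Versicolor=2
species_name_ord_zero = create_category_scores(species, start=0)

print(species_name_ord)
print(species_name_ord_zero)
```

The `start` parameter plays the role of the two dialog options: `start=1` for the default and `start=0` for the alternative.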
|