|
Examples:
Data Size: Different
versions of XLMiner™ have varying limits on size of data. The size of
data depicted in the example below may not be supported by your version. Refer
to Data Handling
Specifications
for details. Open
the dataset Sampling.xls. It contains a variable ID for the record
identification and seven variables, v1, v2, v7, v8, v9, v10, v11.
Let us work out different sampling options on
this dataset and see the outputs.
For all the sampling examples, first activate
XLMiner -->Data Utilities --> Sample from worksheet. We get the dialog for sampling after that. Make the changes as mentioned in each example and see output.
- We will use the default options
first.

Select all the variables. Let us see the
output with these default settings.

- Let us try Sampling with replacement.
-
Select all the variables.
-
Check the box "Sample with
Replacement."
-
Choose the desired sample size
to be 300. Since we are choosing sampling with replacement, XLMiner™ can
very well generate a sample with a larger no. of records than the dataset.
-
Click OK.
See the output.

The output indicates "True" for
Sampling with replacement. The desired sample size is greater than
#records in input data.
The sampling with replacement is apparent
in the second and third entries which are the same record, record#3.
Try stratified random sampling as follows.
-
Select all the variables.
-
Let us select the Desired sample size to be 100 and the Set seed
45689.
-
Select Stratified random sampling. -
Click on the pull down menu
for the stratum variable. Choose v8. (Remember, the variable you choose
should not have more than 30 distinct values.) The #strata is
automatically displayed once you select v8. -
Select
"Proportionate to stratum size." -
Click "OK".

-
In order to maintain the proportions of
the strata intact XLMiner™ has increased the sample size a little on
the higher side. This is apparent in the entry for #records
actually sampled.
-
Under stratum wise details
XLMiner™ has
listed all the stratum values v8 assumes with #records in input data for
each stratum.
-
On this basis XLMiner™ calculated the percentage
representation of that value in the dataset and maintains it in the
sample.
-
You can see the #records in sampled data
having the same proportion as in the dataset.
-
XLMiner™
has attached Row Id to each
record before sampling. The output is displayed after sorting the sample
on these Row Ids.
- Select the options for stratified random
sampling with equal records from each stratum.

-
Select all the variables.
-
Select Stratified random sampling. Choose
v8 as the Stratum variable.
The #strata is displayed automatically.
-
Select "Equal from each stratum, please specify #records".
-
Enter the #records. Remember, this number should not be greater than
the smallest stratum size. In this case the smallest stratum size is 8.
(If you already noticed, the smallest stratum size appears automatically
in a box as you select the stratum variable). Enter 7 , which is less than
the limit of 8.
-
Click OK.
In the output,

- The #records in sampled data are 56, ie 7
records per stratum for 8 strata.
- If we want a sample with equal no. of records
of each stratum but of bigger size we can use the same options above for
sampling with replacement.

-
Check "Sample with
replacement".
-
Specify #records as 20. Though the smallest
stratum size is 8 in this dataset, we can get more records
as we are sampling with replacement.
-
Choose all the other options just as in ex.
4 above
The output is as follows

- The output sample has 20 records per
stratum. So the #records in sampling is 160.
- Try the following combination of options-

-
Select Stratified random sampling. Choose
v8 as the Stratum variable .
The #strata is displayed automatically.
-
Select "Equal from each stratum, #records = smallest stratum size".
-
The smallest stratum size is displayed automatically as you select the
stratum variable.
-
Click OK.
The output is quite self explanatory.
- #records in sampled data = 8
per stratum,
the smallest stratum size.
See
also:
|