
Data Range : Either type the address
directly in this box, or using the reference button,
mark the required data range from the worksheet. If the cell pointer (active
cell) is already somewhere in the data range, XLMiner™ automatically picks up the
contiguous data range surrounding the active cell. When the data range is
selected XLMiner™ displays the number of records in the selected range.
First row contains headers :
When this box is checked, XLMiner™ picks up the headers from first row of the
selected data range. When the box is unchecked, XLMiner™ follows the default
naming convention, i.e., the variable in the first column of the selected range
will be called "Var1", the second column "Var2," etc.
Variables : This list box contains the names of the variables in the selected data range. If
the first row of the range contains the variable names, then these names appear
in this list box. If the first row of the dataset does not contain the headers,
then XLMiner™ puts the variable names using its default naming convention. In
this case the first column is named Var1, the second column is named Var2
and so on. To select a variable for sampling, click on it, then
click on the ">" button. Use the ctrl key to select
multiple variables.
Sampling Options :
Sample With replacement : If checked
the data is sampled with replacement. The default is sampling without
replacement.
Set Seed : The user can enter the
sorting seed. The default seed is 12345.
Desired sample size : The user can
enter here the sample size as desired. (Note that the actual sample size may
vary a little, depending on the further instructions from the user. We will
know why as we go ahead.)
Simple random sampling : The data is
sorted using the simple random sampling techniques taking into account the
other parameters entered earlier.
Stratified random sampling : If this
sampling method is to be used XLMiner™ provides the following options.

Stratum Variable : From the list
of variables, the user can decide which to use for stratified random
sampling. Note that XLMiner™ allows only those variables which have less
than 30 distinct values. As the user selects the variable name XLMiner™ displays the no. of strata that variable contains in a box in front and
smallest stratum size in a box in front of the option "Equal from
each stratum, #records = smallest stratum size".
Pick records from strata :
Proportionate to stratum size :
XLMiner™
detects the proportion of each stratum in the dataset and maintains the
same in sampling. Due to this XLMiner™ sometimes has to change the
desired sample size (on the higher side) in order to maintain the proportionate stratum size.
XLMiner™ then displays the actual sample size in the output along with
the desired sample size as shown below.

Equal from each stratum : On
specifying the no. of records, XLMiner™ generates sample which has
that same number of records from each stratum. In this case the number
chosen automatically decides the desired sample size, so the option to
enter the desired sample size is not available.
Equal from each stratum, #records =
smallest stratum size : XLMiner™ detects the smallest stratum size
and generates a sample wherein every stratum has a representation of that
size. Again here option for the desired sample size is not
required.
XLMiner™
performs the stratified
random sampling with or without replacement. If you choose sample
without replacement, you can not have the desired sample size
greater than the number of records in the dataset, for obvious reasons.
While entering the #records in the option "equal from each
stratum" do not enter a number which will make the sample size greater than
the size of the dataset.
If you choose to sample with replacement,
then the no. of records in the dataset does not pose any limit on the no.
of records in the sample. XLMiner™ can generate a sample having upto
2000 records.
Before sampling, XLMiner™ generates a
Row ID for each record (irrespective of weather the dataset already has
one). XLMiner™ sorts the sampled output according to Row ID
before displaying it.