Data Handling Specifications

 

XLMiner™ imposes different maximum limits on data size, depending on its version, build number, and edition, as shown in the following table. To find the version and build number of the XLMiner™ you are using, open the "About XLMiner" box from the XLMiner™ menu.

Note: The capabilities of the Academic Research edition are similar to those of the Professional edition; exceptions are noted.

 

Limits are listed by edition, in this order: Professional edition and Academic Research edition (one column), Education edition, Demo edition.

 

Partitioning

| Limit | Professional edition / Academic Research edition | Education edition | Demo edition |
|---|---|---|---|
| # Rows | Original data: 500,000 (65,000 in Excel 2003); Output: 500,000 (65,000 in Excel 2003), subject to the training partition being not more than 10,000 | Original data: 500,000 (65,000 in Excel 2003); Output: 500,000 (65,000 in Excel 2003), subject to the training partition being not more than 10,000 | Original data: 600; Output: 600, subject to the training partition being not more than 200 |
| # Columns | Original data: No limit; Output: 200 | Original data: No limit; Output: 200 | Original data: No limit; Output: 200 |
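
The partitioning limits above can be turned into a quick pre-check before running a partition. The following is a minimal Python sketch, not part of XLMiner™: the names `PARTITION_LIMITS` and `check_partition` are hypothetical, the numbers are copied from the table, and the training-partition condition is read as an upper bound. The Academic Research edition is assumed to share the Professional limits, per the note above.

```python
# Hypothetical helper, not an XLMiner API. Limits are copied from the
# Partitioning rows of the table above.
PARTITION_LIMITS = {
    # edition: (max rows, max rows in Excel 2003, max training rows)
    "professional": (500_000, 65_000, 10_000),  # also Academic Research (assumed)
    "education": (500_000, 65_000, 10_000),
    "demo": (600, 600, 200),
}

MAX_OUTPUT_COLUMNS = 200  # the same for every edition


def check_partition(edition, total_rows, training_rows, output_columns,
                    excel_2003=False):
    """Return a list of limit violations; an empty list means the data fits."""
    max_rows, max_rows_2003, max_training = PARTITION_LIMITS[edition]
    row_cap = max_rows_2003 if excel_2003 else max_rows
    problems = []
    if total_rows > row_cap:
        problems.append(f"rows: {total_rows} exceeds {row_cap}")
    if training_rows > max_training:
        problems.append(f"training rows: {training_rows} exceeds {max_training}")
    if output_columns > MAX_OUTPUT_COLUMNS:
        problems.append(f"output columns: {output_columns} exceeds {MAX_OUTPUT_COLUMNS}")
    return problems
```

For example, `check_partition("demo", 600, 201, 10)` reports only the training-partition violation, because 600 rows and 10 output columns are within the Demo limits.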

Sample from worksheet

| Limit | Professional edition / Academic Research edition | Education edition | Demo edition |
|---|---|---|---|
| # Rows | Original data: Max. Excel limit; Sample output: 500,000 (65,000 in Excel 2003) | Original data: Max. Excel limit; Sample output: 500,000 (65,000 in Excel 2003) | Original data: Max. Excel limit; Sample output: 200 |
| # Columns | Original data: No limit; Output: 200 | Original data: No limit; Output: 200 | Original data: No limit; Output: 200 |
| # Categories for Stratum variable (in Stratified Sampling) | 30 (Stratum values are not case sensitive) | 30 (Stratum values are not case sensitive) | 30 (Stratum values are not case sensitive) |

Sample from database

| Limit | Professional edition / Academic Research edition | Education edition | Demo edition |
|---|---|---|---|
| # Fields | In the table: No limit; Sample output: 200 | Not applicable, as the feature is not supported | In the table: No limit; Sample output: 200 |
| # Records | In the table: 10,000,000; Sample output: 500,000 (65,000 in Excel 2003) | Not applicable, as the feature is not supported | In the table: 200; Sample output: 200 |
| # Categories for Stratum field (in Stratified Sampling) | 30 (Stratum values are not case sensitive) | Not applicable, as the feature is not supported | 30 (Stratum values are not case sensitive) |

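Handle Missing Values

| Limit | Professional edition / Academic Research edition | Education edition | Demo edition |
|---|---|---|---|
| # Rows | 500,000 (65,000 in Excel 2003) | 500,000 (65,000 in Excel 2003) | 200 |
| # Columns | 200 | 200 | 200 |
| # Missing values that can be treated at a time | 500,000 (65,000 in Excel 2003) | 500,000 (65,000 in Excel 2003) | 200 |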

Bin Continuous Data

| Limit | Professional edition / Academic Research edition | Education edition | Demo edition |
|---|---|---|---|
| Sum of # columns present in the data range and # columns selected for binning | 200 | 200 | 200 |
| # Rows | 500,000 (65,000 in Excel 2003) | 500,000 (65,000 in Excel 2003) | 200 |
| # Columns in the output | 200 (inclusive of all columns in the data range and binned columns) | 200 (inclusive of all columns in the data range and binned columns) | 200 (inclusive of all columns in the data range and binned columns) |

Transform Categorical Data

| Limit | Professional edition / Academic Research edition | Education edition | Demo edition |
|---|---|---|---|
| # Rows | 500,000 (65,000 in Excel 2003) | 500,000 (65,000 in Excel 2003) | 200 |
| # Columns | 200 (inclusive of all columns in the data range and the ones added in the output) | 200 (inclusive of all columns in the data range and the ones added in the output) | 200 (inclusive of all columns in the data range and the ones added in the output) |
| # Distinct classes | 30 | 30 | 30 |
| # Output variables | 30 | 30 | 30 |

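Time Series

| Limit | Professional edition / Academic Research edition | Education edition | Demo edition |
|---|---|---|---|
| # Rows | 10,000 | 10,000 | 200 |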

Classification and Prediction

| Limit | Professional edition / Academic Research edition | Education edition | Demo edition |
|---|---|---|---|
| # Rows | 10,000 for Training; 500,000 (65,000 in Excel 2003) for Training + Validation + Test (if partitioning is used); 500,000 (65,000 in Excel 2003) in new data used as Scoring target | 10,000 for Training; 500,000 (65,000 in Excel 2003) for Training + Validation + Test (if partitioning is used); 500,000 (65,000 in Excel 2003) in new data used as Scoring target | 200 for each partition (Training, Validation, Test) if partitioning is used; 200 if partitioning is not used; 200 in new data used as Scoring target |
| # Columns (input variables) | 100 (The data set can contain up to 200 columns, of which up to 100 can be selected for the model as input variables) | 100 (The data set can contain up to 200 columns, of which up to 100 can be selected for the model as input variables) | 30 (The data set can contain up to 200 columns, of which up to 30 can be selected for the model as input variables) |
| # Distinct classes for a categorical variable | 30 (Class values are not case sensitive) | 30 (Class values are not case sensitive) | 30 (Class values are not case sensitive) |
| # Distinct values for any input variable for Naive Bayes classification | 1,000 (Values are not case sensitive) | 1,000 (Values are not case sensitive) | 30 (Values are not case sensitive) |
| # Nearest neighbors for k-Nearest Neighbors | 20 (or # Training rows, whichever is smaller) | 20 (or # Training rows, whichever is smaller) | 20 (or # Training rows, whichever is smaller) |
| # Splits for Regression Tree | 5,000 (or [# Training rows - 1], whichever is smaller) | 5,000 (or [# Training rows - 1], whichever is smaller) | 5,000 (or [# Training rows - 1], whichever is smaller) |
| # Levels in Tree drawing for Regression and Classification trees | 7 (Actual tree may contain more levels) | 7 (Actual tree may contain more levels) | 7 (Actual tree may contain more levels) |
| # Epochs for Neural Networks | 500,000 (65,000 in Excel 2003) | 500,000 (65,000 in Excel 2003) | 200 |
| # Iterations for Logistic Regression | 100 | 100 | 100 |
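
Two of the limits above are "whichever is smaller" caps that depend on the number of training rows. The arithmetic can be sketched in Python as follows; the function names are hypothetical and are not part of XLMiner™, and the constants are taken from the table.

```python
# Hypothetical helpers, not an XLMiner API. Constants come from the
# Classification and Prediction rows of the table above.
MAX_NEIGHBORS = 20
MAX_TREE_SPLITS = 5000


def max_neighbors(training_rows):
    """Largest k allowed for k-Nearest Neighbors: 20 or # training rows."""
    return min(MAX_NEIGHBORS, training_rows)


def max_regression_tree_splits(training_rows):
    """Largest number of Regression Tree splits: 5000 or (# training rows - 1)."""
    return min(MAX_TREE_SPLITS, training_rows - 1)
```

With 12 training rows the caps are 12 neighbors and 11 splits; with 8,000 training rows the fixed limits of 20 and 5,000 apply.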

Affinity - Association Rules

| Limit | Professional edition / Academic Research edition | Education edition | Demo edition |
|---|---|---|---|
| # Transactions | 500,000 (65,000 in Excel 2003) | 500,000 (65,000 in Excel 2003) | 200 |
| # Distinct items in data set | 5,000 | 5,000 | 1,000 |
| # Items in a transaction | 30 | 30 | 30 |
| # Rules | 500,000 (65,000 in Excel 2003) (Additional rules may exist but are not displayed) | 500,000 (65,000 in Excel 2003) (Additional rules may exist but are not displayed) | 500,000 (65,000 in Excel 2003) (Additional rules may exist but are not displayed) |

Data Exploration & Reduction

| Limit | Professional edition / Academic Research edition | Education edition | Demo edition |
|---|---|---|---|
| # Rows | 20,000 (Exception: when using Hierarchical Clustering, the number of rows is limited to 4,000) | 20,000 (Exception: when using Hierarchical Clustering, the number of rows is limited to 4,000) | 200 |
| # Columns (variables) | 200 | 200 | 30 (The data set can contain up to 200 columns, of which up to 30 can be selected as variables for the model) |
| # Clusters displayed in a Dendrogram | 30 (The solution may involve a higher number of clusters, but the Dendrogram shows a maximum of 30 top-level clusters) | 30 (The solution may involve a higher number of clusters, but the Dendrogram shows a maximum of 30 top-level clusters) | 30 (The solution may involve a higher number of clusters, but the Dendrogram shows a maximum of 30 top-level clusters) |
| Size of Distance Matrix (if specified) for Hierarchical Clustering | 200 x 200 | 200 x 200 | 30 x 30 |
| # Clusters for k-Means clustering | 20 (or # Training rows, whichever is smaller) | 20 (or # Training rows, whichever is smaller) | 20 (or # Training rows, whichever is smaller) |
| # Iterations for k-Means clustering | 50 | 50 | 50 |

Charts

| Limit | Professional edition / Academic Research edition | Education edition | Demo edition |
|---|---|---|---|
| # Rows | 10,000 | 10,000 | 200 |
| # Columns | Original data: 200; For chart drawing: 5 (for Box & Matrix plots) | Original data: 200; For chart drawing: 5 (for Box & Matrix plots) | Original data: 200; For chart drawing: 5 (for Box & Matrix plots) |
| # Distinct values X-variable can take | 5 (for Box plot) | 5 (for Box plot) | 5 (for Box plot) |

General

| Limit | Professional edition / Academic Research edition | Education edition | Demo edition |
|---|---|---|---|
| # Worksheets in workbook (Excel file) | 245, before running any XLMiner™ procedure (the count includes any hidden or very hidden sheets present in the workbook) | 245, before running any XLMiner™ procedure (the count includes any hidden or very hidden sheets present in the workbook) | 245, before running any XLMiner™ procedure (the count includes any hidden or very hidden sheets present in the workbook) |
| Model Storage and Scoring | Included (via XLMcalc, included with Professional ed.) | via XLMcalc (purchased separately) | via XLMcalc (purchased separately) |