The View Models view lets you compare the accuracy of models generated from your feature selections. Depending on your target, Distil builds one of two model types: regression models for continuous targets or classification models for categorical targets.

Because acceleration is a continuous feature, Distil generates regression models for this example. Once you have inspected them, you can either:

  • Return to the previous step to refine your model definition, or
  • Export the one you think is most accurate

Overview

Regression models allow you to configure the acceptable error (distance from the actual value) in predictions. Distil then lists the correctly and incorrectly predicted values based on your selection.
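The idea of an acceptable error can be sketched in a few lines of Python. This is only an illustration, not Distil's implementation: the function name `label_predictions`, the symmetric `max_error` tolerance, and the sample values are all assumptions made up for the example.

```python
def label_predictions(actual, predicted, max_error):
    """Mark a prediction correct if it is within max_error of the actual value."""
    return [abs(p - a) <= max_error for a, p in zip(actual, predicted)]


# Made-up acceleration values, for illustration only.
actual = [12.0, 15.5, 18.0, 20.5]
predicted = [12.5, 14.0, 18.2, 23.0]

labels = label_predictions(actual, predicted, max_error=1.0)
print(labels)  # [True, False, True, False]
```

Widening `max_error` can only turn incorrect predictions into correct ones, never the reverse, which is why loosening the threshold increases the number of correct predictions.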

Regression results

If your target feature were categorical instead of continuous, Distil would generate classification models instead of regression models.

Classification model results are similar to regression model results, but they do not require you to specify an error threshold. They only report whether the predicted values matched the actual values.
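For comparison, a classification result needs no threshold, only an equality check. The sketch below is illustrative: `label_matches` and the category values are hypothetical and are not taken from Distil.

```python
def label_matches(actual, predicted):
    """A classification prediction is correct only if it equals the actual value."""
    return [a == p for a, p in zip(actual, predicted)]


# Made-up categorical values, for illustration only.
actual = ["usa", "japan", "europe", "usa"]
predicted = ["usa", "usa", "europe", "usa"]

matches = label_matches(actual, predicted)
accuracy = sum(matches) / len(matches)
print(matches)   # [True, False, True, True]
print(accuracy)  # 0.75
```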

Classification results

Review model results

In the View Models view, you can inspect the models generated by Distil to:

  • Determine which one is most accurate
  • Understand how specific features and values influence the results

The View Models view is made up of the following components:

Model Results pane

The Model Results pane on the right lists each model that Distil generated for your problem. Distil ranks the models in descending order of estimated accuracy.

To inspect the predictions generated by the model:
  1. Scroll through the models at the bottom of the pane.
  2. Compare the distribution of predictions in each model against the distribution of the actual values from the dataset. Which model looks most similar to the actual target feature?
    Compare model results to actual results
  3. Select a range in the Predicted distribution chart to split the Prediction tables based on the corresponding values.
    Highlight distribution of predicted values to view context of correct/incorrect predictions

    The top table lists records that match your selection, while the bottom table lists all other records.

  4. Click a different model to load its results in the Prediction Tables.
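
Step 2 above is a visual comparison, but the same idea can be approximated numerically. The following is a minimal sketch, assuming equal-width bins and the same number of actual and predicted values; the `histogram` and `overlap` helpers, the bin count, and the sample data are illustrative assumptions, not part of Distil.

```python
def histogram(values, low, high, bins):
    """Count how many values fall in each of `bins` equal-width buckets."""
    width = (high - low) / bins
    counts = [0] * bins
    for v in values:
        # Clamp so that a value equal to `high` lands in the last bucket.
        index = min(int((v - low) / width), bins - 1)
        counts[index] += 1
    return counts


def overlap(actual, predicted, bins=4):
    """Histogram intersection, from 0 (disjoint) to 1 (identical bucket counts)."""
    low = min(min(actual), min(predicted))
    high = max(max(actual), max(predicted))
    if high == low:
        return 1.0  # every value is identical, so the distributions coincide
    a = histogram(actual, low, high, bins)
    p = histogram(predicted, low, high, bins)
    return sum(min(x, y) for x, y in zip(a, p)) / len(actual)


# Made-up values, for illustration only.
actual = [10, 12, 14, 16, 18, 20, 22, 24]
model_a = [11, 12, 15, 16, 17, 21, 22, 23]
model_b = [15, 16, 16, 17, 17, 17, 18, 18]

print(overlap(actual, model_a))  # 0.875
print(overlap(actual, model_b))  # 0.5
```

Here the hypothetical `model_a` spreads its predictions across the same range as the actual values, while `model_b` bunches them in the middle, so `model_a` scores higher. A similar distribution does not guarantee that individual predictions are correct, so treat it as a first check only.
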

Each regression model also includes the prediction error, which is calculated as the difference between the predicted value and the actual value.

To adjust the acceptable error for the models:

  1. Review the distribution of error in the predicted results. Is the error relatively consistent across all the predictions?
    Compare distribution of error across models

    Error charts are centered on 0. Positive error values indicate that the prediction was higher than the actual value, while negative error values indicate the opposite.

  2. Drag the left or right error sliders to adjust the acceptable error in the predictions. By default, Distil sets the acceptable error to the 25th percentile. How does changing this value affect the number of correct/incorrect predictions?
    Adjust acceptable error in models
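
The sign convention and the effect of the two sliders can be sketched as follows. This is an illustration under stated assumptions rather than Distil's own code: `signed_errors`, `count_correct`, the `lower` and `upper` bounds standing in for the left and right sliders, and the sample values are all hypothetical.

```python
def signed_errors(actual, predicted):
    """Predicted minus actual: a positive error means the prediction was too high."""
    return [p - a for a, p in zip(actual, predicted)]


def count_correct(errors, lower, upper):
    """Count predictions whose error lies inside the acceptable range [lower, upper]."""
    return sum(lower <= e <= upper for e in errors)


# Made-up values, for illustration only.
actual = [12.0, 15.5, 18.0, 20.5, 16.0]
predicted = [12.5, 14.0, 18.0, 23.0, 15.0]

errors = signed_errors(actual, predicted)
print(errors)  # [0.5, -1.5, 0.0, 2.5, -1.0]

# Widening the acceptable range moves more predictions into the "correct" group.
for lower, upper in [(-0.5, 0.5), (-1.0, 1.0), (-2.0, 3.0)]:
    print(lower, upper, count_correct(errors, lower, upper))
# -0.5 0.5 2
# -1.0 1.0 3
# -2.0 3.0 5
```

Keeping separate lower and upper bounds allows an asymmetric range, for example when overestimates are more tolerable than underestimates.
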

To view a different model:
  1. Compare the ranges of predicted values and error in the model summaries.
  2. Click the model you want to review to refresh the Prediction Tables.

Prediction tables

The Prediction tables list records that the model predicted. When you first access the View Models view, a single Prediction table lists all the records. As you interact with feature summaries or model predictions to filter the view, the Prediction table is split into:

  • Matching samples that contain the selected values.
  • Other samples that do not.

You can compare the results in the two tables to understand how certain values affect the correctness of the predictions. In regression models, like the one you generated for acceleration, correctness is determined by the configurable error threshold.

To split the Prediction tables:
  • Select a category or a range of values in the target feature, feature summaries, or model predictions. What values do correct and incorrect predictions tend to have?
    View values for specific records in context of the whole dataset
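
The split into matching and other samples can be pictured as a simple partition of the records. The sketch below assumes each record is a dictionary with a numeric field and a `correct` flag; the helper names, field names, and values are hypothetical and do not reflect Distil's data model.

```python
def split_records(records, field, low, high):
    """Partition records into those inside the selected range and all others."""
    matching = [r for r in records if low <= r[field] <= high]
    other = [r for r in records if not (low <= r[field] <= high)]
    return matching, other


def share_correct(records):
    """Fraction of records flagged correct, or None for an empty table."""
    if not records:
        return None
    return sum(r["correct"] for r in records) / len(records)


# Made-up prediction records, for illustration only.
records = [
    {"predicted": 12.5, "correct": True},
    {"predicted": 14.0, "correct": False},
    {"predicted": 18.2, "correct": True},
    {"predicted": 23.0, "correct": False},
    {"predicted": 19.0, "correct": True},
]

matching, other = split_records(records, "predicted", 12.0, 19.0)
print(len(matching), share_correct(matching))  # 4 0.75
print(len(other), share_correct(other))        # 1 0.0
```

Comparing the share of correct predictions in the two groups is one way to see whether a selected range is where the model tends to succeed or fail.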

Feature summaries

The Feature summaries show the range of values in the features used to model the predictions. The distribution of values can help you understand how they influence correct/incorrect predictions. You can filter the Prediction tables on specific values to understand how results would change if you omitted them from the model.

To understand how the models would change if you omitted records with specific values:
  • For continuous features, drag the sliders on either end of the timeline to focus on records that contain values beyond the new placement. Does this improve the predictions?
    Drag range sliders to omit outliers
  • For categorical features, click to select individual values. Does this improve the predictions?
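
The two kinds of filter can be sketched as a range check for continuous features and a membership check for categorical ones. Everything in this example is an assumption made for illustration: the `keep` helper, the `horsepower` and `origin` feature names, and the records. It also only recomputes correctness on existing predictions and does not retrain a model.

```python
def keep(record, ranges, categories):
    """True if the record is inside every selected range and selected category set."""
    in_ranges = all(low <= record[f] <= high for f, (low, high) in ranges.items())
    in_categories = all(record[f] in allowed for f, allowed in categories.items())
    return in_ranges and in_categories


def share_correct(rows):
    """Fraction of rows flagged correct (rows must not be empty)."""
    return sum(r["correct"] for r in rows) / len(rows)


# Made-up records with hypothetical features, for illustration only.
records = [
    {"horsepower": 90, "origin": "usa", "correct": True},
    {"horsepower": 110, "origin": "japan", "correct": True},
    {"horsepower": 150, "origin": "usa", "correct": False},
    {"horsepower": 230, "origin": "usa", "correct": False},
    {"horsepower": 95, "origin": "europe", "correct": True},
]

# Continuous feature: narrow the range to drop the outlier at 230.
in_range = [r for r in records if keep(r, {"horsepower": (80, 160)}, {})]
# Categorical feature: select a single value.
usa_only = [r for r in records if keep(r, {}, {"origin": {"usa"}})]

print(share_correct(records))   # 0.6
print(share_correct(in_range))  # 0.75
print(len(usa_only))            # 3
```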

Refine a model

To refine the models that Distil produced, click Select Target or Create Models to go back to a previous step.

Return to Select to refine the models

When refining the model, you can change the target feature, add or remove features used to model the predictions, or adjust the values included for specific features.

Finish a problem

When you are satisfied with the results of a model, click Export Model to save it and move on to the next problem.

Export models