Finally, you must choose features that you think will inform the predictions.
Select the features to model
To choose the features that you think will predict the target feature:
- Click Add under the Horsepower and Cylinders features.
In this example, Distil adds the selected features to the list of Features to Model. You can filter or change the Type of any of the features to model.
To remove a feature, click Remove under the feature name in the Features to Model pane, or click Remove All above the list of features.
- Note the distribution of categories for the Cylinders feature. Four-, six-, and eight-cylinder engines appear often, while three- and five-cylinder engines appear much less frequently.
- Click the three-cylinder category to view samples that match the value in the Samples to Model From table.
- To exclude these samples from the model, click Exclude.
- Repeat the previous two steps for the five-cylinder category.
- Review the updated Samples to Model From table. Click the column headers to sort by feature and check for extreme values that may indicate problems with the data. Note that several rows have no value for Horsepower.
- To remove these records, click each row and then click Exclude.
- Click Excluded Samples to review the records you removed. The list contains the three- and five-cylinder vehicles, as well as some four-cylinder vehicles with no Horsepower value.
Your model definition is now complete, and you can use your selections to create models.
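For readers who want to see the same exclusion logic outside the Distil interface, the following is a minimal sketch in plain Python. It is not Distil's API or implementation. The row values are toy data invented for illustration (not taken from the tutorial's dataset), and the names `samples`, `RARE_CYLINDER_COUNTS`, and `split_samples` are hypothetical.

```python
from collections import Counter

# Toy rows for illustration only; these are NOT values from the tutorial's dataset.
samples = [
    {"cylinders": 4, "horsepower": 90.0},
    {"cylinders": 4, "horsepower": None},  # missing Horsepower value
    {"cylinders": 6, "horsepower": 110.0},
    {"cylinders": 8, "horsepower": 150.0},
    {"cylinders": 3, "horsepower": 97.0},
    {"cylinders": 5, "horsepower": 103.0},
    {"cylinders": 8, "horsepower": 165.0},
]

# Hypothetical name: the infrequent categories excluded in the steps above.
RARE_CYLINDER_COUNTS = {3, 5}


def split_samples(rows):
    """Return (kept, excluded) using the two exclusion rules from the walkthrough."""
    kept, excluded = [], []
    for row in rows:
        is_rare = row["cylinders"] in RARE_CYLINDER_COUNTS
        is_missing = row["horsepower"] is None
        (excluded if is_rare or is_missing else kept).append(row)
    return kept, excluded


# Distribution of Cylinders categories before any exclusion.
distribution = Counter(row["cylinders"] for row in samples)

kept, excluded = split_samples(samples)
print("distribution:", dict(distribution))
print("kept:", len(kept), "excluded:", len(excluded))
```

Keeping the excluded rows in their own list mirrors the Excluded Samples view: nothing is deleted, so the removed records can still be reviewed.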