This section describes the main characteristics of the Friedman data set and its attributes:
General information
Friedman Benchmark Function data set |
Type | Regression | Origin | Laboratory |
Features | 5 | (Real / Integer / Nominal) | (5 / 0 / 0) |
Instances | 1200 | Missing values? | No |
Attribute description
Attribute | Domain |
Input1 | [0.0, 1.0] |
Input2 | [0.0, 1.0] |
Input3 | [0.0, 1.0] |
Input4 | [0.0, 1.0] |
Input5 | [0.0, 1.0] |
Output | [0.664014955, 28.5903858] |
Additional information
This is a synthetic benchmark data set proposed by Friedman in 1991. The cases are generated as follows: the values of five attributes, $x_1, \ldots, x_5$, are drawn independently, each uniformly distributed over [0.0, 1.0]. The value of the target variable is then obtained from the equation $y = 10\sin(\pi x_1 x_2) + 20(x_3 - 0.5)^2 + 10x_4 + 5x_5 + e$, where $e$ is Gaussian random noise, $e \sim N(0, 1)$.
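The generation procedure described above can be sketched in a few lines of Python. This is only an illustrative sketch and not the code used to build the distributed files: the function names `friedman_target` and `generate_friedman`, the seed, and the use of Python's standard `random` module are assumptions, so the values it produces will not match the KEEL file.

```python
import math
import random


def friedman_target(x1, x2, x3, x4, x5, noise=0.0):
    """Friedman benchmark function; `noise` is the additive term e."""
    return (10.0 * math.sin(math.pi * x1 * x2)
            + 20.0 * (x3 - 0.5) ** 2
            + 10.0 * x4
            + 5.0 * x5
            + noise)


def generate_friedman(n_instances=1200, seed=0):
    """Return a list of (inputs, output) pairs.

    Hypothetical helper: the seed and generator are assumptions,
    not the ones used for the original data set.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n_instances):
        # Five independent inputs, each uniform over [0.0, 1.0].
        inputs = [rng.uniform(0.0, 1.0) for _ in range(5)]
        # Gaussian noise with mean 0 and standard deviation 1.
        noise = rng.gauss(0.0, 1.0)
        rows.append((inputs, friedman_target(*inputs, noise=noise)))
    return rows


rows = generate_friedman()
```

Without the noise term the function ranges from 0 to 30 on the unit hypercube, which is consistent with the output domain listed in the attribute table once noise is added.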
In this section you can download some files related to the Friedman data set:
- The complete data set, already formatted in KEEL format, can be downloaded from here.
- A copy of the data set, already partitioned by means of a 5-fold cross-validation procedure, can be downloaded from here.
- The header file associated with this data set can be downloaded from here.
- This is not a native data set from the KEEL project. It has been obtained from the LIACC repository. The original page where the data set can be found is: http://www.liaad.up.pt/~ltorgo/Regression/DataSets.html.