Hi Dylan,
KernelScale is a scaling parameter: the software divides every element of the predictor matrix X by the value of KernelScale and then applies the kernel function to compute the Gram matrix.
- If you set KernelScale to 'auto', the software selects an appropriate scale factor using a heuristic procedure. This procedure uses subsampling, so the estimate can vary from one call to another. To reproduce results, set a random number seed using rng before training.
- If you specify KernelScale together with a custom kernel function, for example, 'KernelFunction', 'kernel', the software throws an error. In that case, any scaling must be applied inside kernel itself.
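To illustrate the second point, here is a minimal sketch in plain Python (not MATLAB) of a custom kernel that does its own scaling. The name scaled_sigmoid_kernel, the scale, gamma and c arguments, and the choice of a tanh kernel are all hypothetical and only for illustration. A MATLAB custom kernel would similarly take two matrices U and V and return the Gram matrix.

```python
import math

def scaled_sigmoid_kernel(U, V, scale=2.0, gamma=1.0, c=-1.0):
    """Gram matrix of the rows of U against the rows of V.

    The predictors are divided by `scale` inside the kernel, because
    no external scale parameter is applied to a custom kernel.
    All parameter names and defaults here are illustrative assumptions.
    """
    Us = [[u / scale for u in row] for row in U]
    Vs = [[v / scale for v in row] for row in V]
    return [
        [math.tanh(gamma * sum(a * b for a, b in zip(u, v)) + c) for v in Vs]
        for u in Us
    ]

X = [[1.0, 2.0], [3.0, 4.0]]
G = scaled_sigmoid_kernel(X, X)  # 2-by-2 symmetric Gram matrix
```

Changing scale here plays the role that KernelScale plays for the built-in kernels.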
KernelFunction is the function used to compute the elements of the Gram matrix G after the predictors have been scaled by KernelScale.
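The linear kernel function can be defined as:
    G(x_j, x_k) = x_j' * x_k
Scaling the predictor variables by a factor of, say, s gives us the Gram matrix
    G(x_j, x_k) = (x_j / s)' * (x_k / s) = (x_j' * x_k) / s^2
This is not the same as a quadratic kernel function with scaling factor 1, which has the following form:
    G(x_j, x_k) = (1 + x_j' * x_k)^2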
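A small numeric check may make the difference concrete. The sketch below is plain Python (not MATLAB), with made-up data and hypothetical helper names linear_gram and quadratic_gram. It assumes the linear kernel is the inner product of the scaled predictors and the quadratic kernel is (1 + inner product)^2.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def linear_gram(X, s=1.0):
    """Linear-kernel Gram matrix after dividing every predictor by s."""
    scaled = [[v / s for v in row] for row in X]
    return [[dot(u, v) for v in scaled] for u in scaled]

def quadratic_gram(X):
    """Quadratic (order-2 polynomial) Gram matrix with scale factor 1."""
    return [[(1.0 + dot(u, v)) ** 2 for v in X] for u in X]

X = [[1.0, 2.0], [3.0, 4.0]]  # two observations, two predictors (made up)
s = 2.0

G_linear = linear_gram(X)      # [[5.0, 11.0], [11.0, 25.0]]
G_scaled = linear_gram(X, s)   # [[1.25, 2.75], [2.75, 6.25]], i.e. G_linear / s^2
G_quad = quadratic_gram(X)     # [[36.0, 144.0], [144.0, 676.0]]

print(G_linear, G_scaled, G_quad, sep="\n")
```

Scaling only multiplies the linear Gram matrix by the constant 1 / s^2, whereas the quadratic kernel changes each entry nonlinearly, so no choice of s turns one into the other.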
You can refer to this link for further information. Hope this helps!