Add an Example for CrossValidating NaiveBayes #807
Comments
Hi @ConductedClever, while the example has not been added to the official documentation yet, you can find an example of how to use NaiveBayes with cross-validation below:
// Ensure we have reproducible results
Accord.Math.Random.Generator.Seed = 0;
// Let's say we have the following data to be classified
// into three possible classes. Those are the samples:
//
int[][] inputs =
{
    //       input          output
    new int[] { 0, 1, 1, 0 }, // 0
    new int[] { 0, 1, 0, 0 }, // 0
    new int[] { 0, 0, 1, 0 }, // 0
    new int[] { 0, 1, 1, 0 }, // 0
    new int[] { 0, 1, 0, 0 }, // 0
    new int[] { 1, 0, 0, 0 }, // 1
    new int[] { 1, 0, 0, 0 }, // 1
    new int[] { 1, 0, 0, 1 }, // 1
    new int[] { 0, 0, 0, 1 }, // 1
    new int[] { 0, 0, 0, 1 }, // 1
    new int[] { 1, 1, 1, 1 }, // 2
    new int[] { 1, 0, 1, 1 }, // 2
    new int[] { 1, 1, 0, 1 }, // 2
    new int[] { 0, 1, 1, 1 }, // 2
    new int[] { 1, 1, 1, 1 }, // 2
};

int[] outputs = // those are the class labels
{
    0, 0, 0, 0, 0,
    1, 1, 1, 1, 1,
    2, 2, 2, 2, 2,
};
// Let's say we want to measure the cross-validation
// performance of Naive Bayes on the above data set:
var cv = CrossValidation.Create(
    k: 10, // We will be using 10-fold cross-validation

    // First we define the learning algorithm:
    learner: (p) => new NaiveBayesLearning(),

    // Now we have to specify how the Naive Bayes performance should be measured:
    loss: (actual, expected, p) => new ZeroOneLoss(expected).Loss(actual),

    // This function can be used to perform any special
    // operations before the actual learning is done, but
    // here we will just leave it as simple as it can be:
    fit: (teacher, x, y, w) => teacher.Learn(x, y, w),

    // Finally, we have to pass the input and output data
    // that will be used in cross-validation.
    x: inputs, y: outputs
);
// After the cross-validation object has been created,
// we can call its .Learn method with the input and
// output data that will be partitioned into the folds:
var result = cv.Learn(inputs, outputs);
// We can grab some information about the problem:
int numberOfSamples = result.NumberOfSamples; // should be 15
int numberOfInputs = result.NumberOfInputs; // should be 4
int numberOfOutputs = result.NumberOfOutputs; // should be 3
double trainingError = result.Training.Mean; // should be 0
double validationError = result.Validation.Mean; // should be 0.05
Hope it helps,
Thanks a lot for the quick response. I will try this code and report the result here or close the issue. Thanks again @cesarsouza.
Hi @cesarsouza, I have one more question about cross-validation. Is there any way to get a confusion matrix from CrossValidationResult or CrossValidationStatistics? For example, to get the mean accuracy after cross-validating.
Yes: instead of creating a ZeroOneLoss, you can create a GeneralConfusionMatrix and return its accuracy, for example:
// Let's say we want to measure the cross-validation
// performance of Naive Bayes on the above data set:
var cv = CrossValidation.Create(
    k: 10, // We will be using 10-fold cross-validation

    // First we define the learning algorithm:
    learner: (p) => new NaiveBayesLearning(),

    // Now we have to specify how the Naive Bayes performance should be measured:
    loss: (actual, expected, p) =>
    {
        var cm = new GeneralConfusionMatrix(expected, actual);
        p.Tag = cm; // (if you want to save it for some purpose)
        return cm.Accuracy;
    },

    // This function can be used to perform any special
    // operations before the actual learning is done, but
    // here we will just leave it as simple as it can be:
    fit: (teacher, x, y, w) => teacher.Learn(x, y, w),

    // Finally, we have to pass the input and output data
    // that will be used in cross-validation.
    x: inputs, y: outputs
);
Hope it helps,
Oops, sorry, I misunderstood your question. Do you mean presenting a combined, aggregated confusion matrix representing all cross-validation folds? Or simply returning the confusion matrix of the best model found so far?
Hi @cesarsouza. I mean the first one, "a combined, aggregated confusion matrix representing all cross-validation folds", because I think this could be a good metric to decide whether the learning works or not. Although the worst case also helps (I think).
In this case, you can use the example I have posted above to save a GeneralConfusionMatrix in each of the .Tag properties of the cross-validation results. Then, you can combine all the confusion matrices from the validation folds using the GeneralConfusionMatrix.Combine method. I might be able to post a better example soon.
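In the meantime, a minimal sketch of that approach could look like the code below. Note that the exact shape of result.Models, Validation.Tag, and GeneralConfusionMatrix.Combine is assumed here from the description above and is not verified against a specific framework version:

// Run cross-validation with the loss delegate shown above, which stores a
// GeneralConfusionMatrix in p.Tag for every fold that gets evaluated:
var result = cv.Learn(inputs, outputs);
// Collect the matrices saved in the validation folds
// (requires System.Linq for Select/ToArray):
GeneralConfusionMatrix[] perFold = result.Models
    .Select(fold => (GeneralConfusionMatrix)fold.Validation.Tag)
    .ToArray();
// Combine them into a single, aggregated confusion matrix
// and read the overall accuracy from it:
GeneralConfusionMatrix combined = GeneralConfusionMatrix.Combine(perFold);
double combinedAccuracy = combined.Accuracy;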
I will be adding a new extension method to help with this task. In the next framework release, you should be able to call it using:
// Let's say you have estimated a cross-validation result using:
var result = cv.Learn(inputs, outputs);
// You should be able to obtain a confusion matrix for the cross-validation using:
GeneralConfusionMatrix gcm = result.ToConfusionMatrix(inputs, outputs);
Regards,
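Once such an aggregated matrix is available, the usual metrics can be read from it just as in the per-fold example above, for instance:

// Overall accuracy across all validation folds:
double accuracy = gcm.Accuracy;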
…matrices from cross-validation results. - Updates GH-807: Add an Example for CrossValidating NaiveBayes
Hi @cesarsouza, thanks. Now the solution you gave is complete. And about your extension method, I think it will be useful to have both. Thanks.
Hi, is it possible to get the classification results (the predicted classes) for the validation data?
Issue description
Hi. Please add an example of how to cross-validate the Naive Bayes learning algorithm. There are some for SVM, Hidden Markov Model, and Decision Tree here. Thanks.