K-fold cross-validation

January 15, 2024

© 2024 borui. All rights reserved. This content may be freely reproduced, displayed, modified, or distributed with proper attribution to borui and a link to the article: borui(2024-01-15 22:43:26 +0000). kfold validation 交叉验证. https://borui/blog/2024-01-15-zh-kfold-validation.
@misc{
  borui2024,
  author = {borui},
  title = {kfold validation 交叉验证},
  year = {2024},
  publisher = {borui's blog},
  journal = {borui's blog},
  url={https://borui/blog/2024-01-15-zh-kfold-validation}
}

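The figure below shows a validation run with five folds: [Figure: kfold_cross_validation_diagram0]. Light blue marks the validation dataset and gray marks the training dataset. In practice, k-fold trains the model k times, each time holding out a different segment of the data as the validation set, and then reports the average of the k validation scores as the accuracy of the training result. The validation data still ends up being used for training: once every fold has been evaluated, the model is trained one more time on all of the data (validation set plus training set), and that model is the one you ship. You end up with both an accuracy estimate for the model and the model itself.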
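The procedure above can be sketched in plain Python. This is only a minimal illustration under some assumptions of mine: the "model" is a toy nearest-centroid classifier, and the names `kfold_indices`, `train_centroids`, `predict`, `accuracy`, and `kfold_cross_validate` are hypothetical helpers made up for this sketch, not part of any library. In real work you would plug in your own training and scoring code.

```python
import random


def kfold_indices(n_samples, k, seed=0):
    """Shuffle the indices 0..n_samples-1 and split them into k folds."""
    if k < 2 or k > n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Fold i takes every k-th shuffled index, so fold sizes differ by at most 1.
    return [indices[i::k] for i in range(k)]


def train_centroids(X, y):
    """Toy model (an assumption of this sketch): one mean vector per class."""
    sums, counts = {}, {}
    for features, label in zip(X, y):
        total = sums.setdefault(label, [0.0] * len(features))
        for j, value in enumerate(features):
            total[j] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in total]
            for label, total in sums.items()}


def predict(model, features):
    """Return the class whose centroid is nearest (squared Euclidean distance)."""
    def dist(centroid):
        return sum((a - b) ** 2 for a, b in zip(features, centroid))
    return min(model, key=lambda label: dist(model[label]))


def accuracy(model, X, y):
    """Fraction of samples whose predicted label matches the true label."""
    correct = sum(predict(model, f) == label for f, label in zip(X, y))
    return correct / len(y)


def kfold_cross_validate(X, y, k=5, seed=0):
    """Return (mean validation accuracy, per-fold accuracies, final model)."""
    folds = kfold_indices(len(X), k, seed)
    scores = []
    for i, val_idx in enumerate(folds):
        # Train on every fold except fold i, validate on fold i.
        train_idx = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        model = train_centroids([X[t] for t in train_idx],
                                [y[t] for t in train_idx])
        scores.append(accuracy(model,
                               [X[v] for v in val_idx],
                               [y[v] for v in val_idx]))
    mean_score = sum(scores) / k
    # After all folds are scored, retrain on ALL data to get the output model.
    final_model = train_centroids(X, y)
    return mean_score, scores, final_model


# Two well-separated clusters as made-up example data.
X = [[0.1 * i, 0.0] for i in range(10)] + [[5.0 + 0.1 * i, 5.0] for i in range(10)]
y = [0] * 10 + [1] * 10
mean_acc, fold_accs, final_model = kfold_cross_validate(X, y, k=5)
print(mean_acc, fold_accs)
```

Note that the k models trained inside the loop are thrown away. They exist only to produce the validation scores; the model you keep is the one trained on everything at the end.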

⚠️ Warning: the test dataset is not part of this data. It must be kept separate from the data that is split into folds.
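One way to respect this is to set the test data aside before any folds are created. The snippet below is a rough sketch under my own assumptions: `hold_out_test_set` is a hypothetical helper name, and the 20% test fraction is an arbitrary choice for illustration.

```python
import random


def hold_out_test_set(n_samples, test_fraction=0.2, seed=0):
    """Return (cv_indices, test_indices); the two lists never overlap."""
    if n_samples < 2:
        raise ValueError("need at least 2 samples to split")
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be strictly between 0 and 1")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Keep at least one test sample and at least one sample for cross-validation.
    n_test = min(n_samples - 1, max(1, round(n_samples * test_fraction)))
    return indices[n_test:], indices[:n_test]


cv_idx, test_idx = hold_out_test_set(100, test_fraction=0.2)

# Folds are built from cv_idx only, so no test sample is ever
# used for training or validation.
k = 5
folds = [cv_idx[i::k] for i in range(k)]
print(len(cv_idx), len(test_idx), [len(f) for f in folds])
```

The test indices are touched only once, at the very end, to score the final model that was trained on all of the cross-validation data.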

Here is another good diagram: [Figure: kfold_cross_validation_diagram1]

Additional reading

  1. 一枚宅小宋. (4 June 2019). 深度概念·K-Fold 交叉验证 (Cross-Validation)的理解与应用. 知乎专栏. [Blog post]. Retrieved from https://zhuanlan.zhihu.com/p/67986077
  2. 刘思聪. (23 October 2018). Kaggle求生:亚马逊热带雨林篇. 知乎专栏. [Blog post]. Retrieved from https://zhuanlan.zhihu.com/p/28084438