Automated machine learning


Automated machine learning (AutoML) is the process of automating the application of machine learning to real-world problems. AutoML covers the complete pipeline from the raw dataset to the deployable machine learning model. It was proposed as an artificial intelligence-based solution to the ever-growing challenge of applying machine learning.^[1]^[2] Its high degree of automation allows non-experts to use machine learning models and techniques without first becoming experts in the field.


Automating the end-to-end application of machine learning also offers the advantages of simpler solutions, faster creation of those solutions, and models that often outperform hand-designed ones.


Comparison to the standard machine learning approach

In a typical machine learning application, practitioners have a dataset consisting of input data points to train on. The raw data may not be in a form that every algorithm can use out of the box. An expert may have to apply appropriate data pre-processing, feature engineering, feature extraction, and feature selection methods to make the dataset amenable to machine learning. Following those preprocessing steps, practitioners must then perform algorithm selection and hyperparameter optimization to maximize the predictive performance of their machine learning model. Each of these steps brings its own challenges, which together form a significant hurdle to getting started with machine learning.
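The algorithm selection and hyperparameter optimization steps described above can be illustrated with a minimal sketch of a joint random search, which is one simple way an AutoML tool might automate them. Everything here is an assumption made for illustration: the toy one-dimensional dataset, the two candidate "algorithms" (`fit_threshold` and `fit_knn`), the `SEARCH_SPACE` layout, and the choice of random search. It is not the method of any particular AutoML system.

```python
import random

def make_data(n, rng):
    # Toy 1-D binary classification data: the label is 1 when x > 0.5 (assumption).
    return [(x, int(x > 0.5)) for x in (rng.random() for _ in range(n))]

def fit_threshold(train, threshold):
    # Candidate algorithm 1: predict 1 when x exceeds a fixed threshold.
    # The training data is unused; the threshold is the only hyperparameter.
    return lambda x: int(x > threshold)

def fit_knn(train, k):
    # Candidate algorithm 2: majority vote among the k nearest training points.
    def predict(x):
        nearest = sorted(train, key=lambda p: abs(p[0] - x))[:k]
        votes = sum(label for _, label in nearest)
        return int(2 * votes > len(nearest))
    return predict

# Each entry pairs a fitting function with a sampler for its hyperparameter.
SEARCH_SPACE = {
    "threshold": (fit_threshold, lambda rng: rng.uniform(0.0, 1.0)),
    "knn": (fit_knn, lambda rng: rng.choice([1, 3, 5, 7])),
}

def accuracy(model, data):
    return sum(model(x) == y for x, y in data) / len(data)

def random_search(train, valid, n_trials, seed=0):
    # Sample (algorithm, hyperparameter) pairs and keep the best validation score.
    rng = random.Random(seed)
    best = None
    for _ in range(n_trials):
        name = rng.choice(sorted(SEARCH_SPACE))
        fit, sample = SEARCH_SPACE[name]
        hyperparameter = sample(rng)
        score = accuracy(fit(train, hyperparameter), valid)
        if best is None or score > best[0]:
            best = (score, name, hyperparameter)
    return best

data_rng = random.Random(42)
train, valid = make_data(200, data_rng), make_data(100, data_rng)
score, name, hyperparameter = random_search(train, valid, n_trials=20)
print(name, hyperparameter, score)
```

The number of trials acts as a crude budget: a real tool would typically also bound wall-clock time and memory, and would evaluate on held-out data or with cross-validation rather than a single validation split.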


A downside is that AutoML tools have additional parameters of their own, which may require some expertise to set. Even with these hyperparameters, AutoML dramatically simplifies the application of machine learning for non-experts.


Targets of automation

Automated machine learning can target various stages of the machine learning process.^[2] Broadly, these targets fall into the following areas: data preparation, feature engineering, model selection, selection of evaluation metrics, and hyperparameter optimization.


Automated data preparation and ingestion (from raw data and miscellaneous formats)

- Automated column type detection; e.g., boolean, discrete numerical, continuous numerical, or text

- Automated column intent detection; e.g., target/label, stratification field, numerical feature, categorical text feature, or free text feature

- Automated task detection; e.g., binary classification, regression, clustering, or ranking
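As a rough illustration of automated column type detection, the sketch below guesses one of the types listed above from raw string cells. The function name `infer_column_type` and its rules (which tokens count as boolean, treating whole-number columns as discrete) are assumptions chosen for this example. Real tools use richer heuristics.

```python
def infer_column_type(values):
    """Guess a column type from raw string cells (heuristic rules are assumptions)."""
    # Ignore missing and empty cells before inspecting the rest.
    cells = [v.strip() for v in values if v is not None and v.strip() != ""]
    if not cells:
        return "unknown"
    # A column using only these tokens is treated as boolean.
    if {c.lower() for c in cells} <= {"true", "false", "yes", "no", "0", "1"}:
        return "boolean"
    try:
        numbers = [float(c) for c in cells]
    except ValueError:
        return "text"
    # Whole-number columns are treated as discrete, everything else as continuous.
    if all(n.is_integer() for n in numbers):
        return "discrete numerical"
    return "continuous numerical"

print(infer_column_type(["3", "7", "12"]))
print(infer_column_type(["1.5", "2", "3.25"]))
```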

Automated feature engineering

- Feature selection

- Feature extraction

- Meta learning and transfer learning

- Detection and handling of skewed data and/or missing values
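Detection and handling of skewed data and missing values could look like the following minimal sketch. The specific choices are assumptions for illustration: median imputation, the population skewness formula, a `log1p` transform, and a skewness threshold of 1.0. An AutoML system may select among many other strategies.

```python
import math
import statistics

def impute_median(values):
    """Replace None entries with the median of the observed values."""
    observed = [v for v in values if v is not None]
    median = statistics.median(observed)
    return [median if v is None else v for v in values]

def skewness(values):
    """Population skewness: mean cubed deviation divided by the cubed standard deviation."""
    mean = statistics.mean(values)
    std = statistics.pstdev(values)
    if std == 0:
        return 0.0
    return sum((v - mean) ** 3 for v in values) / (len(values) * std ** 3)

def reduce_right_skew(values, threshold=1.0):
    """Apply log1p to non-negative data whose skewness exceeds the threshold (an assumed cutoff)."""
    if min(values) >= 0 and skewness(values) > threshold:
        return [math.log1p(v) for v in values]
    return list(values)

column = impute_median([1, 2, None, 4, 100, 3])
transformed = reduce_right_skew(column)
print(skewness(column), skewness(transformed))
```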

Automated model selection

Hyperparameter optimization of the learning algorithm and featurization

Automated pipeline selection under time, memory, and complexity constraints

Automated selection of evaluation metrics / validation procedures

Automated problem checking

- Leakage detection

- Misconfiguration detection
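One simple heuristic for leakage detection is to flag any single column whose values alone almost perfectly determine the target. The sketch below assumes rows are dictionaries. The function name `leakage_suspects`, the 0.99 accuracy cutoff, the rule for skipping near-unique columns, and the example column names are all hypothetical. A flagged column is only a candidate for review, not proof of leakage.

```python
from collections import Counter, defaultdict

def leakage_suspects(rows, target, min_accuracy=0.99):
    """Return column names whose values alone almost determine the target."""
    suspects = []
    for col in (c for c in rows[0] if c != target):
        # Count target labels for each distinct value of the column.
        groups = defaultdict(Counter)
        for row in rows:
            groups[row[col]][row[target]] += 1
        # Skip near-unique columns (such as IDs): they trivially "determine" the target.
        if len(groups) > len(rows) / 2:
            continue
        # Accuracy of predicting the majority label within each group.
        correct = sum(counts.most_common(1)[0][1] for counts in groups.values())
        if correct / len(rows) >= min_accuracy:
            suspects.append(col)
    return suspects

rows = [
    {"id": 1, "age": 30, "refund_issued": "yes", "churned": 1},
    {"id": 2, "age": 30, "refund_issued": "no", "churned": 0},
    {"id": 3, "age": 45, "refund_issued": "yes", "churned": 1},
    {"id": 4, "age": 45, "refund_issued": "no", "churned": 0},
    {"id": 5, "age": 30, "refund_issued": "no", "churned": 0},
    {"id": 6, "age": 45, "refund_issued": "yes", "churned": 1},
]
print(leakage_suspects(rows, target="churned"))
```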

Automated analysis of results obtained

User interfaces and visualizations for automated machine learning

References

[1] Thornton C, Hutter F, Hoos HH, Leyton-Brown K (2013). Auto-WEKA: Combined Selection and Hyperparameter Optimization of Classification Algorithms. KDD '13 Proceedings of the 19th ACM SIGKDD international conference on Knowledge discovery and data mining. pp. 847–855.
[2] Hutter F, Caruana R, Bardenet R, Bilenko M, Guyon I, Kegl B, and Larochelle H. "AutoML 2014 @ ICML". AutoML 2014 Workshop @ ICML. Retrieved 2018-03-28.

Category:Machine learning


Category:Artificial intelligence


This page was moved from wikipedia:en:Automated machine learning. Its edit history can be viewed at 自动机器学习/edithistory
