Line 1: |
Line 1: |
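− | This entry was provisionally translated by 彩云小译 (Caiyun Xiaoyi), 388 characters in total. It has not been manually edited or proofread; we apologize for any inconvenience in reading.
| + | This entry was provisionally translated by 彩云小译 (Caiyun Xiaoyi), 388 characters in total. Manually edited by 林家驹. |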
| | | |
− | {{Multiple issues|
| |
− | {{citation style|date=December 2011}}
| |
− | {{technical|date=October 2012}}
| |
− | {{abbreviations|date=October 2012}}
| |
− | }}
| |
| '''Automatic content extraction''' ('''ACE''') is a research program for developing advanced [[information extraction]] [[technologies]], convened by [[National Institute of Standards and Technology|NIST]] from 1999 to 2008, succeeding the [[Message Understanding Conference|MUC]] and preceding the [https://www.nist.gov/tac/ Text Analysis Conference]. | | '''Automatic content extraction''' ('''ACE''') is a research program for developing advanced [[information extraction]] [[technologies]], convened by [[National Institute of Standards and Technology|NIST]] from 1999 to 2008, succeeding the [[Message Understanding Conference|MUC]] and preceding the [https://www.nist.gov/tac/ Text Analysis Conference]. |
− |
| |
− |
| |
− | Automatic content extraction (ACE) is a research program for developing advanced information extraction technologies, convened by NIST from 1999 to 2008, succeeding the MUC and preceding the Text Analysis Conference.
| |
| | | |
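| Automatic content extraction (ACE) is a research program convened by NIST from 1999 to 2008 to develop advanced information extraction technologies; it succeeded the MUC and preceded the Text Analysis Conference. | | Automatic content extraction (ACE) is a research program convened by NIST from 1999 to 2008 to develop advanced information extraction technologies; it succeeded the MUC and preceded the Text Analysis Conference. |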
| | | |
| ==Goals and efforts== | | ==Goals and efforts== |
− | In its general objective, the ACE program is motivated by and addresses the same issues as the MUC program that preceded it. The ACE program, however, defines the research objectives in terms of the target objects (i.e., the entities, the relations, and the events) rather than in terms of the words in the text. For example, the so-called "named entity" task, as defined in MUC, is to identify those words (on the page) that are names of entities. In ACE, on the other hand, the corresponding task is to identify the entity so named. This is a different task, one that is more abstract and that involves inference more explicitly in producing an answer. In a real sense, the task is to detect things that "aren't there".
| |
− |
| |
| In its general objective, the ACE program is motivated by and addresses the same issues as the MUC program that preceded it. The ACE program, however, defines the research objectives in terms of the target objects (i.e., the entities, the relations, and the events) rather than in terms of the words in the text. For example, the so-called "named entity" task, as defined in MUC, is to identify those words (on the page) that are names of entities. In ACE, on the other hand, the corresponding task is to identify the entity so named. This is a different task, one that is more abstract and that involves inference more explicitly in producing an answer. In a real sense, the task is to detect things that "aren't there". | | In its general objective, the ACE program is motivated by and addresses the same issues as the MUC program that preceded it. The ACE program, however, defines the research objectives in terms of the target objects (i.e., the entities, the relations, and the events) rather than in terms of the words in the text. For example, the so-called "named entity" task, as defined in MUC, is to identify those words (on the page) that are names of entities. In ACE, on the other hand, the corresponding task is to identify the entity so named. This is a different task, one that is more abstract and that involves inference more explicitly in producing an answer. In a real sense, the task is to detect things that "aren't there". |
| | | |
Line 21: |
Line 11: |
| | | |
| While the ACE program is directed toward extraction of information from [[Sound|audio]] and [[image]] sources in addition to pure text, the research effort is restricted to information extraction from text. The actual [[transduction (machine learning)|transduction]] of audio and image data into text is not part of the ACE research effort, although the processing of [[Speech recognition | ASR]] and [[Optical character recognition | OCR]] output from such transducers is. | | While the ACE program is directed toward extraction of information from [[Sound|audio]] and [[image]] sources in addition to pure text, the research effort is restricted to information extraction from text. The actual [[transduction (machine learning)|transduction]] of audio and image data into text is not part of the ACE research effort, although the processing of [[Speech recognition | ASR]] and [[Optical character recognition | OCR]] output from such transducers is. |
− |
| |
− | While the ACE program is directed toward extraction of information from audio and image sources in addition to pure text, the research effort is restricted to information extraction from text. The actual transduction of audio and image data into text is not part of the ACE research effort, although the processing of ASR and OCR output from such transducers is.
| |
| | | |
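| Although the ACE program is directed toward extracting information from audio and image sources in addition to pure text, the research effort is restricted to information extraction from text. Converting audio and image data into text is not part of the ACE research effort, although processing the ASR and OCR output of such converters is. | | Although the ACE program is directed toward extracting information from audio and image sources in addition to pure text, the research effort is restricted to information extraction from text. Converting audio and image data into text is not part of the ACE research effort, although processing the ASR and OCR output of such converters is. |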
Line 29: |
Line 17: |
| * defining the research tasks in detail, | | * defining the research tasks in detail, |
| * collecting and annotating data needed for training, development, and evaluation, | | * collecting and annotating data needed for training, development, and evaluation, |
− | * supporting the research with evaluation tools and [[research workshop]]s. | + | * supporting the research with evaluation tools and [[research workshop]]s |
− | | |
− | The effort involves:
| |
− | * defining the research tasks in detail,
| |
− | * collecting and annotating data needed for training, development, and evaluation,
| |
− | * supporting the research with evaluation tools and research workshops.
| |
| | | |
| The effort involves: | | The effort involves: |
Line 45: |
Line 28: |
| # '''entities''' mentioned in the text, such as: persons, organizations, locations, facilities, weapons, vehicles, and geo-political entities. | | # '''entities''' mentioned in the text, such as: persons, organizations, locations, facilities, weapons, vehicles, and geo-political entities. |
| # '''relations''' between entities, such as: person A is the manager of company B. Relation types include: role, part, located, near, and social. | | # '''relations''' between entities, such as: person A is the manager of company B. Relation types include: role, part, located, near, and social. |
− | # '''events''' mentioned in the text, such as: interaction, movement, transfer, creation and destruction. | + | # '''events''' mentioned in the text, such as: interaction, movement, transfer, creation and destruction |
− | | |
− | Given a text in natural language, the ACE challenge is to detect:
| |
− | # entities mentioned in the text, such as: persons, organizations, locations, facilities, weapons, vehicles, and geo-political entities.
| |
− | # relations between entities, such as: person A is the manager of company B. Relation types include: role, part, located, near, and social.
| |
− | # events mentioned in the text, such as: interaction, movement, transfer, creation and destruction.
| |
| | | |
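The three detection targets listed above can be sketched as a small data model. This is only an illustrative sketch: the class names, field names, and the sample sentence are hypothetical and are not the official ACE annotation format. It tries to show the distinction drawn earlier between finding the words that name an entity and identifying the entity itself, which groups every mention, including ones such as pronouns that contain no name at all.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Mention:
    """A span of text referring to an entity (character offsets, end exclusive)."""
    start: int
    end: int
    text: str


@dataclass
class Entity:
    """The thing referred to; it groups all of its mentions in the text."""
    entity_id: str
    entity_type: str  # e.g. "person", "organization" (hypothetical labels)
    mentions: List[Mention] = field(default_factory=list)


@dataclass
class Relation:
    """A typed link between two entities, e.g. person A manages company B."""
    relation_type: str  # e.g. "role"
    arg1: Entity
    arg2: Entity


@dataclass
class Event:
    """A typed occurrence with the entities that take part in it."""
    event_type: str  # e.g. "movement", "transfer"
    participants: List[Entity] = field(default_factory=list)


def mention_at(text: str, phrase: str) -> Mention:
    """Build a Mention for the first occurrence of `phrase` in `text`."""
    start = text.index(phrase)
    return Mention(start, start + len(phrase), phrase)


# Hypothetical example sentence, invented for illustration.
text = "Alice Smith manages Acme Corp. She joined the company in 2001."

# Word-level view: only the strings that are names.
name_strings = ["Alice Smith", "Acme Corp"]

# Entity-level view: each entity collects named and unnamed mentions alike.
alice = Entity("E1", "person",
               [mention_at(text, "Alice Smith"), mention_at(text, "She")])
acme = Entity("E2", "organization",
              [mention_at(text, "Acme Corp"), mention_at(text, "the company")])
manages = Relation("role", alice, acme)
```

In this sketch the word-level view yields two name strings, while the entity-level view records four mentions resolved to two entities plus one relation between them. Deciding that "She" and "the company" belong to those entities is the inference step that the entity-level task makes explicit.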
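| Given a text in natural language, the ACE challenge is to detect the following, as mentioned in the text: | | Given a text in natural language, the ACE challenge is to detect the following, as mentioned in the text: |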
Line 61: |
Line 39: |
| | | |
| The program relates to [[English language|English]], [[Arabic language|Arabic]] and [[Chinese language|Chinese]] texts. | | The program relates to [[English language|English]], [[Arabic language|Arabic]] and [[Chinese language|Chinese]] texts. |
− |
| |
− | The program relates to English, Arabic and Chinese texts.
| |
| | | |
| The program covers English, Arabic, and Chinese texts. | | The program covers English, Arabic, and Chinese texts. |
| | | |
| The ACE corpus is one of the standard benchmarks for testing new information extraction [[algorithm]]s. | | The ACE corpus is one of the standard benchmarks for testing new information extraction [[algorithm]]s. |
− |
| |
− | The ACE corpus is one of the standard benchmarks for testing new information extraction algorithms.
| |
| | | |
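| The ACE corpus is one of the standard benchmarks for testing new information extraction algorithms. | | The ACE corpus is one of the standard benchmarks for testing new information extraction algorithms. |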
Line 74: |
Line 48: |
| ==References== | | ==References== |
| * George Doddington (NIST), Alexis Mitchell (LDC), Mark Przybocki (NIST), Lance Ramshaw (BBN), Stephanie Strassel (LDC), Ralph Weischedel (BBN). [https://web.archive.org/web/20150126022215/http://www.citeulike.org/user/erelsegal-halevi/article/10003935 The automatic content extraction (ACE) program–tasks, data, and evaluation.] 2004 | | * George Doddington (NIST), Alexis Mitchell (LDC), Mark Przybocki (NIST), Lance Ramshaw (BBN), Stephanie Strassel (LDC), Ralph Weischedel (BBN). [https://web.archive.org/web/20150126022215/http://www.citeulike.org/user/erelsegal-halevi/article/10003935 The automatic content extraction (ACE) program–tasks, data, and evaluation.] 2004 |
− |
| |
− | * George Doddington (NIST), Alexis Mitchell (LDC), Mark Przybocki (NIST), Lance Ramshaw (BBN), Stephanie Strassel (LDC), Ralph Weischedel (BBN). The automatic content extraction (ACE) program–tasks, data, and evaluation. 2004
| |
− |
| |
− |
| |
− | * George Doddington (NIST), Alexis Mitchell (LDC), Mark Przybocki (NIST), Lance Ramshaw (BBN), Stephanie Strassel (LDC), Ralph Weischedel (BBN). The automatic content extraction (ACE) program–tasks, data, and evaluation. 2004
| |
| | | |
| ==External links== | | ==External links== |