Changes

20 bytes removed, 28 August 2021 (Sat) 19:32
Line 61:
* Semi-structured information extraction, a general term for any information extraction approach that tries to restore an information structure lost during publication, for example:
 
** Table extraction: finding and extracting tables from documents.<ref name=":9">{{cite journal | vauthors = Milosevic N, Gregson C, Hernandez R, Nenadic G | title = A framework for information extraction from tables in biomedical literature | journal = International Journal on Document Analysis and Recognition (IJDAR) | volume = 22 | issue = 1 | pages = 55–78 | date = February 2019 | doi = 10.1007/s10032-019-00317-0 | arxiv = 1902.10031 | bibcode = 2019arXiv190210031M }}</ref><ref name=":10">{{cite thesis |type=PhD |last=Milosevic |first=Nikola |date=2018 |title=A multi-layered approach to information extraction from tables in biomedical documents |publisher=University of Manchester | url=https://www.research.manchester.ac.uk/portal/files/70405100/FULL_TEXT.PDF}}</ref>
** Table information extraction: extracting information from tables in a structured manner. This is more complex than table extraction, because table extraction is only the first step; understanding the roles of cells, rows, and columns, linking the information inside the table, and understanding the information the table presents are additional tasks that table information extraction requires.<ref>{{cite journal | vauthors = Milosevic N, Gregson C, Hernandez R, Nenadic G | title = A framework for information extraction from tables in biomedical literature | journal = International Journal on Document Analysis and Recognition (IJDAR) | volume = 22 | issue = 1 | pages = 55–78 | date = February 2019 | doi = 10.1007/s10032-019-00317-0 | arxiv = 1902.10031 | bibcode = 2019arXiv190210031M | s2cid = 62880746 }}</ref><ref>{{cite journal | vauthors = Milosevic N, Gregson C, Hernandez R, Nenadic G | title = Disentangling the structure of tables in scientific literature | journal = 21st International Conference on Applications of Natural Language to Information Systems | series = Lecture Notes in Computer Science | volume = 21  | date = June 2016 | pages = 162–174 | doi = 10.1007/978-3-319-41754-7_14  | url = https://www.research.manchester.ac.uk/portal/en/publications/disentangling-the-structure-of-tables-in-scientific-literature(473111c2-52e9-493a-be8c-1a78c5b7ce36).html }}</ref><ref>{{cite thesis |type=PhD |last=Milosevic |first=Nikola |date=2018 |title=A multi-layered approach to information extraction from tables in biomedical documents |publisher=University of Manchester | url=https://www.research.manchester.ac.uk/portal/files/70405100/FULL_TEXT.PDF}}</ref>
+
** Table information extraction: extracting information from tables in a structured manner. This is more complex than table extraction, because table extraction is only the first step; understanding the roles of cells, rows, and columns, linking the information inside the table, and understanding the information the table presents are additional tasks that table information extraction requires.<ref>{{cite journal | vauthors = Milosevic N, Gregson C, Hernandez R, Nenadic G | title = A framework for information extraction from tables in biomedical literature | journal = International Journal on Document Analysis and Recognition (IJDAR) | volume = 22 | issue = 1 | pages = 55–78 | date = February 2019 | doi = 10.1007/s10032-019-00317-0 | arxiv = 1902.10031 | bibcode = 2019arXiv190210031M}}</ref><ref>{{cite journal | vauthors = Milosevic N, Gregson C, Hernandez R, Nenadic G | title = Disentangling the structure of tables in scientific literature | journal = 21st International Conference on Applications of Natural Language to Information Systems | series = Lecture Notes in Computer Science | volume = 21  | date = June 2016 | pages = 162–174 | doi = 10.1007/978-3-319-41754-7_14  | url = https://www.research.manchester.ac.uk/portal/en/publications/disentangling-the-structure-of-tables-in-scientific-literature(473111c2-52e9-493a-be8c-1a78c5b7ce36).html }}</ref><ref>{{cite thesis |type=PhD |last=Milosevic |first=Nikola |date=2018 |title=A multi-layered approach to information extraction from tables in biomedical documents |publisher=University of Manchester | url=https://www.research.manchester.ac.uk/portal/files/70405100/FULL_TEXT.PDF}}</ref>
 
** Comment extraction: extracting comments from the actual content of an article in order to restore the link between each sentence and its author
 
* Language and vocabulary analysis
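
The difference between table extraction and table information extraction described in the list above can be illustrated with a minimal Python sketch. It is not the method of the cited papers. It assumes a table has already been extracted as a grid of cell strings, and it makes the simplifying assumption that the first row holds column headers and the first column holds row labels. The names `CellFact` and `extract_cell_facts`, and the sample table, are hypothetical.

```python
from typing import NamedTuple


class CellFact(NamedTuple):
    """One data cell linked to the row and column it belongs to."""
    row_label: str
    column_label: str
    value: str


def extract_cell_facts(grid: list[list[str]]) -> list[CellFact]:
    """Link every non-empty data cell to its row label and column header.

    Assumption for illustration only: row 0 holds the column headers and
    column 0 holds the row labels. Real tables often break this layout.
    """
    if not grid:
        return []
    header, *body = grid
    facts = []
    for row in body:
        if not row:
            continue  # ignore completely empty rows
        row_label = row[0].strip()
        # zip stops at the shorter sequence, so ragged rows are tolerated.
        for column_label, value in zip(header[1:], row[1:]):
            if value.strip():  # skip empty cells
                facts.append(
                    CellFact(row_label, column_label.strip(), value.strip())
                )
    return facts


# Table extraction alone would stop at this grid of cell strings
# (made-up sample data).
grid = [
    ["Group", "Patients", "Mean age"],
    ["Treatment", "120", "54.2"],
    ["Placebo", "118", ""],
]

# Table information extraction goes further and assigns roles to cells.
facts = extract_cell_facts(grid)
for fact in facts:
    print(fact)
```

Each resulting record ties a value to both of its headers, which is the "linking the information inside the table" step. Tables with merged cells, nested headers, or multiple header rows would need a more careful role-detection step than this fixed layout assumption.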