Analyzing syntactic structure

Revision as of 21:48, 14 October 2020 (Wed) by Thingamabob

==Sentence structure==

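For texts made up of fairly short sentences, statistics such as the frequencies of n-gram tokens (sequences of n consecutive words) are enough to describe the features of the text. For longer sentences, however, we need to analyze the syntactic structure and the part of speech of each word to get a more precise analysis. For example, the table below lists the symbols commonly used for syntactic categories and parts of speech: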

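{| class="wikitable"
! Symbol !! Meaning !! Example
|-
| S || sentence || the man walked
|-
| NP || noun phrase || a dog
|-
| VP || verb phrase || saw a park
|-
| PP || prepositional phrase || with a telescope
|-
| Det || determiner || the
|-
| N || noun || dog
|-
| V || verb || walked
|-
| P || preposition || in
|}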
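
The n-gram frequency statistics mentioned at the start of this section can be sketched in a few lines of plain Python. This is only a minimal illustration: the helper name <code>ngrams</code> and the toy sentence are assumptions made for the example, and whitespace splitting stands in for a real tokenizer.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return every run of n consecutive tokens as a tuple."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


# Hypothetical toy text, tokenized by whitespace for simplicity.
tokens = "the dog saw the man and the dog ate".split()

# Count how often each bigram (n = 2) occurs.
bigram_counts = Counter(ngrams(tokens, 2))

# Show the most frequent bigrams with their counts.
for gram, count in bigram_counts.most_common(3):
    print(" ".join(gram), count)
```

Counts like these ignore how the words relate to one another grammatically, which is why longer sentences call for the structural analysis described next.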

==Parsing and visualization==

To break a sentence into its constituents and display its structure, we can use the nltk package as follows:
<syntaxhighlight lang="python">
import nltk

# Define a small context-free grammar.
# NLTK 3 uses nltk.CFG.fromstring; older releases called this nltk.parse_cfg.
grammar1 = nltk.CFG.fromstring("""
S -> NP VP
VP -> V NP | V NP PP
PP -> P NP
V -> "saw" | "ate" | "walked"
NP -> "John" | "Mary" | "Bob" | Det N | Det N PP
Det -> "a" | "an" | "the" | "my"
N -> "man" | "dog" | "cat" | "telescope" | "park"
P -> "in" | "on" | "by" | "with"
""")

sent = "the man saw Bob with the telescope".split()

rd_parser = nltk.RecursiveDescentParser(grammar1)

# Print every parse tree the grammar allows for this sentence.
# NLTK 3 uses parse(); older releases called this nbest_parse().
for tree in rd_parser.parse(sent):
    print(tree)

# Build the same tree by hand so that it can be drawn.
tree1 = nltk.Tree('S', [
    nltk.Tree('NP', [nltk.Tree('Det', ['the']), nltk.Tree('N', ['man'])]),
    nltk.Tree('VP', [
        nltk.Tree('V', ['saw']),
        nltk.Tree('NP', ['Bob']),
        nltk.Tree('PP', [
            nltk.Tree('P', ['with']),
            nltk.Tree('NP', [nltk.Tree('Det', ['the']), nltk.Tree('N', ['telescope'])]),
        ]),
    ]),
])

# Opens a GUI window showing the tree (requires Tkinter).
tree1.draw()
</syntaxhighlight>

This produces the following figure:

[[File:grammar_tree_1.png|500px]]
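
The same grammar can also show why structural analysis matters: a sentence may have more than one valid parse. The sketch below is an illustration rather than part of the original example. It assumes NLTK 3, swaps in <code>nltk.ChartParser</code> for the recursive descent parser, and uses a different sentence chosen for the example. In that sentence, "with the telescope" can attach either to the verb phrase or to "the dog", so the parser should return two trees. <code>pretty_print()</code> renders each tree as text, which is handy when no GUI is available for <code>draw()</code>.

```python
import nltk

# Same toy grammar as in the example above.
grammar = nltk.CFG.fromstring("""
S -> NP VP
VP -> V NP | V NP PP
PP -> P NP
V -> "saw" | "ate" | "walked"
NP -> "John" | "Mary" | "Bob" | Det N | Det N PP
Det -> "a" | "an" | "the" | "my"
N -> "man" | "dog" | "cat" | "telescope" | "park"
P -> "in" | "on" | "by" | "with"
""")

# Illustrative sentence (an assumption for this sketch): the PP can modify
# either the verb phrase or the noun phrase "the dog".
sent = "the man saw the dog with the telescope".split()

parser = nltk.ChartParser(grammar)
trees = list(parser.parse(sent))

print(len(trees), "parse(s) found")
for tree in trees:
    # Text rendering of the tree; no GUI window needed.
    tree.pretty_print()
```

Each tree covers the same words in the same order; only the attachment point of the prepositional phrase differs, and that difference is exactly what n-gram counts cannot capture.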
[[Category:旧词条迁移]]
