Line 191: |
Line 191: |
| |- | | |- |
| | align="center" | 4 | | | align="center" | 4 |
− | | '''Find Extremum''' | + | | '''Find Extremum''' |
− | | Find data cases possessing an extreme value of an attribute over its range within the data set.
| + | | Find data cases possessing an extreme value of an attribute over its range within the data set. |
− | | What are the top/bottom N data cases with respect to attribute A?
| |
− | | ''- What is the car with the highest MPG?''
| |
− | ''- What director/film has won the most awards?''
| |
− | | |
− | ''- What Marvel Studios film has the most recent release date?''
| |
− | |-
| |
− | | align="center" | 5
| |
− | | '''Sort'''
| |
− | | Given a set of data cases, rank them according to some ordinal metric.
| |
− | | What is the sorted order of a set S of data cases according to their value of attribute A?
| |
− | | ''- Order the cars by weight.''
| |
− | ''- Rank the cereals by calories.''
| |
− | |-
| |
− | | align="center" | 6
| |
− | | '''Determine Range'''
| |
− | | Given a set of data cases and an attribute of interest, find the span of values within the set.
| |
− | | What is the range of values of attribute A in a set S of data cases?
| |
− | | ''- What is the range of film lengths?''
| |
− | ''- What is the range of car horsepowers?''
| |
− | | |
− | ''- What actresses are in the data set?''
| |
− | |-
| |
− | | align="center" | 7
| |
− | | '''Characterize Distribution'''
| |
− | | Given a set of data cases and a quantitative attribute of interest, characterize the distribution of that attribute's values over the set.
| |
− | | What is the distribution of values of attribute A in a set S of data cases?
| |
− | | ''- What is the distribution of carbohydrates in cereals?''
| |
− | ''- What is the age distribution of shoppers?''
| |
− | |-
| |
− | | align="center" | 8
| |
− | | '''Find Anomalies'''
| |
− | | Identify any anomalies within a given set of data cases with respect to a given relationship or expectation, e.g. statistical outliers.
| |
− | | Which data cases in a set S of data cases have unexpected/exceptional values?
| |
− | | ''- Are there exceptions to the relationship between horsepower and acceleration?''
| |
− | ''- Are there any outliers in protein?''
| |
− | |-
| |
− | | align="center" | 9
| |
− | | '''Cluster'''
| |
− | | Given a set of data cases, find clusters of similar attribute values.
| |
− | | Which data cases in a set S of data cases are similar in value for attributes {X, Y, Z, ...}?
| |
− | | ''- Are there groups of cereals w/ similar fat/calories/sugar?''
| |
− | ''- Is there a cluster of typical film lengths?''
| |
− | |-
| |
− | | align="center" | 10
| |
− | | '''Correlate'''
| |
− | | Given a set of data cases and two attributes, determine useful relationships between the values of those attributes.
| |
− | | What is the correlation between attributes X and Y over a given set S of data cases?
| |
− | | ''- Is there a correlation between carbohydrates and fat?''
| |
− | ''- Is there a correlation between country of origin and MPG?''
| |
− | | |
− | ''- Do different genders have a preferred payment method?''
| |
− | | |
− | ''- Is there a trend of increasing film length over the years?''
| |
− | |-
| |
− | | align="center" | 11
| |
− | | ''' [[Contextualization (computer science)|Contextualization]]<ref name="ConTaaS"/>'''
| |
− | | Given a set of data cases, find contextual relevancy of the data to the users.
| |
− | | Which data cases in a set S of data cases are relevant to the current users' context?
| |
− | | ''- Are there groups of restaurants that have foods based on my current caloric intake?''
| |
− | |-
| |
− | |}
| |
− | | |
− | | |
− | | |
− | | |
− | | |
− | | |
− | | |
− | | |
− | ''- Which funds underperformed the SP-500?''
| |
− | | |
− | - Which funds underperformed the SP-500?
| |
− | | |
''- Which funds underperformed the SP-500?''
| |
− | | |
− | |-
| |
− | | |
− | |-
| |
− | | |
− | |-
| |
− | | |
− | | align="center" | 3
| |
− | | |
− | | align="center" | 3
| |
− | | |
− | | align="center" | 3
| |
− | | |
− | | '''Compute Derived Value'''
| |
− | | |
− | | Compute Derived Value
| |
− | | |
− | | '''<font color='#ff8000'>Compute Derived Value</font>'''
| |
− | | |
− | | Given a set of data cases, compute an aggregate numeric representation of those data cases.
| |
− | | |
− | | Given a set of data cases, compute an aggregate numeric representation of those data cases.
| |
− | | |
− | | Given a set of data cases, compute an aggregate numeric representation of those data cases.
| |
− | | |
− | | What is the value of aggregation function F over a given set S of data cases?
| |
− | | |
− | | What is the value of aggregation function F over a given set S of data cases?
| |
− | | |
− | | What is the value of '''<font color='#ff8000'>aggregation function</font>''' F over a given set S of data cases?
| |
− | | |
− | | ''- What is the average calorie content of Post cereals?''
| |
− | | |
− | | - What is the average calorie content of Post cereals?
| |
− | | |
− | | ''- What is the average calorie content of '''<font color='#ff8000'>Post cereals</font>'''?''
| |
− | | |
− | ''- What is the gross income of all stores combined?''
| |
− | | |
− | - What is the gross income of all stores combined?
| |
− | | |
''- What is the gross income of all stores combined?''
| |
− | | |
− | | |
− | | |
− | ''- How many manufacturers of cars are there?''
| |
− | | |
− | - How many manufacturers of cars are there?
| |
− | | |
''- How many manufacturers of cars are there?''
| |
− | | |
− | |-
| |
− | | |
− | |-
| |
− | | |
− | |-
| |
− | | |
− | | align="center" | 4
| |
− | | |
− | | align="center" | 4
| |
− | | |
− | | align="center" | 4
| |
− | | |
− | | '''Find Extremum'''
| |
− | | |
− | | Find Extremum
| |
− | | |
− | | '''<font color='#ff8000'>Find Extremum</font>'''
| |
− | | |
− | | Find data cases possessing an extreme value of an attribute over its range within the data set.
| |
− | | |
− | | Find data cases possessing an extreme value of an attribute over its range within the data set.
| |
− | | |
− | | Find data cases possessing an extreme value of an attribute over its range within the data set. | |
− | | |
− | | What are the top/bottom N data cases with respect to attribute A?
| |
− | | |
− | | What are the top/bottom N data cases with respect to attribute A?
| |
− | | |
| | What are the top/bottom N data cases with respect to attribute A? | | | What are the top/bottom N data cases with respect to attribute A? |
− | | + | | ''- What is the car with the highest MPG?'' |
− | | |
− | | ''- What is the car with the highest MPG?''
| |
− | | |
− | | - What is the car with the highest MPG?
| |
− | | |
− | | ''- What is the car with the highest MPG?'' | |
− | | |
− | ''- What director/film has won the most awards?''
| |
− | | |
− | - What director/film has won the most awards?
| |
− | | |
| ''- What director/film has won the most awards?'' | | ''- What director/film has won the most awards?'' |
− |
| |
− |
| |
− |
| |
− | ''- What Marvel Studios film has the most recent release date?''
| |
− |
| |
− | - What Marvel Studios film has the most recent release date?
| |
− |
| |
| ''- What Marvel Studios film has the most recent release date?'' | | ''- What Marvel Studios film has the most recent release date?'' |
− |
| |
− | |-
| |
− |
| |
− | |-
| |
− |
| |
| |- | | |- |
− |
| |
− | | align="center" | 5
| |
− |
| |
− | | align="center" | 5
| |
− |
| |
| | align="center" | 5 | | | align="center" | 5 |
− | | + | | '''Sort''' |
− | | '''Sort''' | |
− | | |
− | | Sort
| |
− | | |
− | | '''<font color='#ff8000'>Sort</font>'''
| |
− | | |
− | | Given a set of data cases, rank them according to some ordinal metric.
| |
− | | |
− | | Given a set of data cases, rank them according to some ordinal metric.
| |
− | | |
| | Given a set of data cases, rank them according to some ordinal metric. | | | Given a set of data cases, rank them according to some ordinal metric. |
− |
| |
− | | What is the sorted order of a set S of data cases according to their value of attribute A?
| |
− |
| |
− | | What is the sorted order of a set S of data cases according to their value of attribute A?
| |
− |
| |
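| | What is the sorted order of a set S of data cases according to their value of attribute A? | | | What is the sorted order of a set S of data cases according to their value of attribute A? |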
− |
| |
− | | ''- Order the cars by weight.''
| |
− |
| |
− | | - Order the cars by weight.
| |
− |
| |
| | ''- Order the cars by weight.'' | | | ''- Order the cars by weight.'' |
− |
| |
− | ''- Rank the cereals by calories.''
| |
− |
| |
− | - Rank the cereals by calories.
| |
− |
| |
| ''- Rank the cereals by calories.'' | | ''- Rank the cereals by calories.'' |
− |
| |
| |- | | |- |
− |
| |
− | |-
| |
− |
| |
− | |-
| |
− |
| |
− | | align="center" | 6
| |
− |
| |
− | | align="center" | 6
| |
− |
| |
| | align="center" | 6 | | | align="center" | 6 |
− | | + | | '''Determine Range''' |
− | | '''Determine Range''' | |
− | | |
− | | Determine Range
| |
− | | |
− | | '''<font color='#ff8000'>Determine Range</font>'''
| |
− | | |
− | | Given a set of data cases and an attribute of interest, find the span of values within the set.
| |
− | | |
− | | Given a set of data cases and an attribute of interest, find the span of values within the set.
| |
− | | |
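| | Given a set of data cases and an attribute of interest, find the span of values within the set. | | | Given a set of data cases and an attribute of interest, find the span of values within the set. |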
− |
| |
− | | What is the range of values of attribute A in a set S of data cases?
| |
− |
| |
− | | What is the range of values of attribute A in a set S of data cases?
| |
− |
| |
| | What is the range of values of attribute A in a set S of data cases? | | | What is the range of values of attribute A in a set S of data cases? |
− |
| |
− | | ''- What is the range of film lengths?''
| |
− |
| |
− | | - What is the range of film lengths?
| |
− |
| |
| | ''- What is the range of film lengths?'' | | | ''- What is the range of film lengths?'' |
− |
| |
− | ''- What is the range of car horsepowers?''
| |
− |
| |
− | - What is the range of car horsepowers?
| |
− |
| |
| ''- What is the range of car horsepowers?'' | | ''- What is the range of car horsepowers?'' |
− |
| |
− |
| |
− |
| |
− | ''- What actresses are in the data set?''
| |
− |
| |
− | - What actresses are in the data set?
| |
− |
| |
| ''- What actresses are in the data set?'' | | ''- What actresses are in the data set?'' |
− |
| |
− | |-
| |
− |
| |
− | |-
| |
− |
| |
| |- | | |- |
− |
| |
− | | align="center" | 7
| |
− |
| |
− | | align="center" | 7
| |
− |
| |
| | align="center" | 7 | | | align="center" | 7 |
− | | + | | '''Characterize Distribution''' |
− | | '''Characterize Distribution''' | |
− | | |
− | | Characterize Distribution
| |
− | | |
− | | '''<font color='#ff8000'>Characterize Distribution</font>'''
| |
− | | |
− | | Given a set of data cases and a quantitative attribute of interest, characterize the distribution of that attribute's values over the set.
| |
− | | |
− | | Given a set of data cases and a quantitative attribute of interest, characterize the distribution of that attribute's values over the set.
| |
− | | |
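| | Given a set of data cases and a quantitative attribute of interest, characterize the distribution of that attribute's values over the set. | | | Given a set of data cases and a quantitative attribute of interest, characterize the distribution of that attribute's values over the set. |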
− |
| |
− | | What is the distribution of values of attribute A in a set S of data cases?
| |
− |
| |
− | | What is the distribution of values of attribute A in a set S of data cases?
| |
− |
| |
| | What is the distribution of values of attribute A in a set S of data cases? | | | What is the distribution of values of attribute A in a set S of data cases? |
− |
| |
− | | ''- What is the distribution of carbohydrates in cereals?''
| |
− |
| |
− | | - What is the distribution of carbohydrates in cereals?
| |
− |
| |
| | ''- What is the distribution of carbohydrates in cereals?'' | | | ''- What is the distribution of carbohydrates in cereals?'' |
− |
| |
− | ''- What is the age distribution of shoppers?''
| |
− |
| |
− | - What is the age distribution of shoppers?
| |
− |
| |
| ''- What is the age distribution of shoppers?'' | | ''- What is the age distribution of shoppers?'' |
| | | |
| |- | | |- |
− |
| |
− | |-
| |
− |
| |
− | |-
| |
− |
| |
− | | align="center" | 8
| |
− |
| |
− | | align="center" | 8
| |
− |
| |
| | align="center" | 8 | | | align="center" | 8 |
− | | + | | '''Find Anomalies''' |
− | | '''Find Anomalies''' | |
− | | |
− | | Find Anomalies
| |
− | | |
− | | '''<font color='#ff8000'>Find Anomalies</font>'''
| |
− | | |
− | | Identify any anomalies within a given set of data cases with respect to a given relationship or expectation, e.g. statistical outliers.
| |
− | | |
− | | Identify any anomalies within a given set of data cases with respect to a given relationship or expectation, e.g. statistical outliers.
| |
− | | |
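| | Identify any anomalies within a given set of data cases with respect to a given relationship or expectation, e.g. statistical outliers. | | | Identify any anomalies within a given set of data cases with respect to a given relationship or expectation, e.g. statistical outliers. |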
− |
| |
− | | Which data cases in a set S of data cases have unexpected/exceptional values?
| |
− |
| |
− | | Which data cases in a set S of data cases have unexpected/exceptional values?
| |
− |
| |
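| | Which data cases in a set S of data cases have unexpected/exceptional values? | | | Which data cases in a set S of data cases have unexpected/exceptional values? |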
− |
| |
− | | ''- Are there exceptions to the relationship between horsepower and acceleration?''
| |
− |
| |
− | | - Are there exceptions to the relationship between horsepower and acceleration?
| |
− |
| |
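| | ''- Are there exceptions to the relationship between horsepower and acceleration?'' | | | ''- Are there exceptions to the relationship between horsepower and acceleration?'' |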
− |
| |
− | ''- Are there any outliers in protein?''
| |
− |
| |
− | - Are there any outliers in protein?
| |
− |
| |
| ''- Are there any outliers in protein?'' | | ''- Are there any outliers in protein?'' |
− |
| |
| |- | | |- |
− | | + | | align="center" | 9 |
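| + | | '''Cluster''' |
| + | | Given a set of data cases, find clusters of similar attribute values. |
| + | | Which data cases in a set S of data cases are similar in value for attributes {X, Y, Z, ...}? |
| + | | ''- Are there groups of cereals with similar fat/calories/sugar?'' |
| + | ''- Is there a cluster of typical film lengths?'' |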
| + | |- |
| + | | align="center" | 10 |
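| + | | '''Correlate''' |
| + | | Given a set of data cases and two attributes, determine useful relationships between the values of those attributes. |
| + | | What is the correlation between attributes X and Y over a given set S of data cases? |
| + | | ''- Is there a correlation between carbohydrates and fat?'' |
| + | ''- Is there a correlation between country of origin and MPG?'' |
| + | ''- Do different genders have a preferred payment method?'' |
| + | ''- Is there a trend of increasing film length over the years?'' |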
| |- | | |- |
− | | + | | align="center" | 11 |
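| + | | '''Contextualization<ref name="ConTaaS"/>''' |
| + | | Given a set of data cases, find contextual relevancy of the data to the users. |
| + | | Which data cases in a set S of data cases are relevant to the current users' context? |
| + | | ''- Are there groups of restaurants that have foods based on my current caloric intake?'' |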
| |- | | |- |
| + | |} |
| | | |
− | | align="center" | 9
| |
− |
| |
− | | align="center" | 9
| |
− |
| |
− | | align="center" | 9
| |
| | | |
− | | '''Cluster'''
| |
| | | |
− | | Cluster
| |
| | | |
− | | '''<font color='#ff8000'>Cluster</font>'''
| |
| | | |
− | | Given a set of data cases, find clusters of similar attribute values.
| |
| | | |
− | | Given a set of data cases, find clusters of similar attribute values.
| |
| | | |
− | | Given a set of data cases, find clusters of similar attribute values.
| |
| | | |
− | | Which data cases in a set S of data cases are similar in value for attributes {X, Y, Z, ...}? | + | | '''Cluster''' |
| | | |
− | | Which data cases in a set S of data cases are similar in value for attributes {X, Y, Z, ...}? | + | | Cluster |
| | | |
− | | Which data cases in a set S of data cases are similar in value for attributes {X, Y, Z, ...}? | + | | '''<font color='#ff8000'>Cluster</font>''' |
| | | |
− | | ''- Are there groups of cereals w/ similar fat/calories/sugar?''
| |
| | | |
− | | - Are there groups of cereals w/ similar fat/calories/sugar?
| |
| | | |
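| + | | Given a set of data cases, find clusters of similar attribute values. |
| + | | Which data cases in a set S of data cases are similar in value for attributes {X, Y, Z, ...}? |
| | ''- Are there groups of cereals with similar fat/calories/sugar?'' | | | ''- Are there groups of cereals with similar fat/calories/sugar?'' |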
− |
| |
− | ''- Is there a cluster of typical film lengths?''
| |
− |
| |
− | - Is there a cluster of typical film lengths?
| |
− |
| |
| ''- Is there a cluster of typical film lengths?'' | | ''- Is there a cluster of typical film lengths?'' |
| | | |
Line 605: |
Line 290: |
| | '''<font color='#ff8000'>Correlate</font>''' | | | '''<font color='#ff8000'>Correlate</font>''' |
| | | |
− | | Given a set of data cases and two attributes, determine useful relationships between the values of those attributes.
| |
| | | |
− | | Given a set of data cases and two attributes, determine useful relationships between the values of those attributes.
| |
| | | |
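| | Given a set of data cases and two attributes, determine useful relationships between the values of those attributes. | | | Given a set of data cases and two attributes, determine useful relationships between the values of those attributes. |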
− |
| |
− | | What is the correlation between attributes X and Y over a given set S of data cases?
| |
− |
| |
− | | What is the correlation between attributes X and Y over a given set S of data cases?
| |
− |
| |
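| | What is the correlation between attributes X and Y over a given set S of data cases? | | | What is the correlation between attributes X and Y over a given set S of data cases? |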
− |
| |
− | | ''- Is there a correlation between carbohydrates and fat?''
| |
− |
| |
− | | - Is there a correlation between carbohydrates and fat?
| |
− |
| |
| | ''- Is there a correlation between carbohydrates and fat?'' | | | ''- Is there a correlation between carbohydrates and fat?'' |
− |
| |
− | ''- Is there a correlation between country of origin and MPG?''
| |
− |
| |
− | - Is there a correlation between country of origin and MPG?
| |
− |
| |
| ''- Is there a correlation between country of origin and MPG?'' | | ''- Is there a correlation between country of origin and MPG?'' |
| | | |
− |
| |
− |
| |
− | ''- Do different genders have a preferred payment method?''
| |
− |
| |
− | - Do different genders have a preferred payment method?
| |
− |
| |
− | ''- Do different genders have a preferred payment method?''
| |
| | | |
| | | |
| | | |
− | ''- Is there a trend of increasing film length over the years?''
| |
| | | |
− | - Is there a trend of increasing film length over the years?
| |
− |
| |
− | ''- Is there a trend of increasing film length over the years?''
| |
| | | |
| |- | | |- |
Line 663: |
Line 320: |
| | '''<font color='#ff8000'>Contextualization</font>''' | | | '''<font color='#ff8000'>Contextualization</font>''' |
| | | |
− | | Given a set of data cases, find contextual relevancy of the data to the users.
| |
| | | |
− | | Given a set of data cases, find contextual relevancy of the data to the users.
| |
| | | |
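| | Given a set of data cases, find contextual relevancy of the data to the users. | | | Given a set of data cases, find contextual relevancy of the data to the users. |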
− |
| |
− | | Which data cases in a set S of data cases are relevant to the current users' context?
| |
− |
| |
− | | Which data cases in a set S of data cases are relevant to the current users' context?
| |
− |
| |
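| | Which data cases in a set S of data cases are relevant to the current users' context? | | | Which data cases in a set S of data cases are relevant to the current users' context? |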
− |
| |
− | | ''- Are there groups of restaurants that have foods based on my current caloric intake?''
| |
− |
| |
− | | - Are there groups of restaurants that have foods based on my current caloric intake?
| |
− |
| |
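| | ''- Are there groups of restaurants that have foods based on my current caloric intake?'' | | | ''- Are there groups of restaurants that have foods based on my current caloric intake?'' |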
| | | |