第313行: |
第313行: |
| 数据挖掘在任何有数字数据可用的地方都可以被使用。数据挖掘的著名例子可以在商业、医学、科学和监控领域找到。 | | 数据挖掘在任何有数字数据可用的地方都可以被使用。数据挖掘的著名例子可以在商业、医学、科学和监控领域找到。 |
| | | |
− | ==Privacy concerns and ethics== | + | ==隐私问题和道德规范 Privacy concerns and ethics== |
| | | |
| While the term "data mining" itself may have no ethical implications, it is often associated with the mining of information in relation to people's behavior (ethical and otherwise).<ref>{{cite journal |author=Seltzer, William |title=The Promise and Pitfalls of Data Mining: Ethical Issues |url=https://ww2.amstat.org/committees/ethics/linksdir/Jsm2005Seltzer.pdf|publisher = American Statistical Association|journal = ASA Section on Government Statistics|date = 2005 }}</ref> | | While the term "data mining" itself may have no ethical implications, it is often associated with the mining of information in relation to people's behavior (ethical and otherwise).<ref>{{cite journal |author=Seltzer, William |title=The Promise and Pitfalls of Data Mining: Ethical Issues |url=https://ww2.amstat.org/committees/ethics/linksdir/Jsm2005Seltzer.pdf|publisher = American Statistical Association|journal = ASA Section on Government Statistics|date = 2005 }}</ref> |
第319行: |
第319行: |
| While the term "data mining" itself may have no ethical implications, it is often associated with the mining of information in relation to people's behavior (ethical and otherwise). | | While the term "data mining" itself may have no ethical implications, it is often associated with the mining of information in relation to people's behavior (ethical and otherwise). |
| | | |
− | 虽然“数据挖掘”这个术语本身可能没有伦理含义,但它通常与人们行为(伦理和其他)相关的信息挖掘有关。
| + | 虽然“数据挖掘”这个术语本身可能没有伦理含义,但它通常与人们行为(伦理和其他)相关的信息挖掘有关。 |
| | | |
| | | |
第327行: |
第327行: |
| The ways in which data mining can be used can in some cases and contexts raise questions regarding privacy, legality, and ethics. In particular, data mining government or commercial data sets for national security or law enforcement purposes, such as in the Total Information Awareness Program or in ADVISE, has raised privacy concerns. | | The ways in which data mining can be used can in some cases and contexts raise questions regarding privacy, legality, and ethics. In particular, data mining government or commercial data sets for national security or law enforcement purposes, such as in the Total Information Awareness Program or in ADVISE, has raised privacy concerns. |
| | | |
− | 在某些情况下,数据挖掘的使用方式会引起关于隐私、合法性和道德的问题。特别是,出于国家安全或执法目的的数据挖掘政府或商业数据集,如在全面信息意识项目或在 ADVISE 中,引起了隐私问题。
| + | 在某些情况下,数据挖掘的使用方式可能会引发隐私、合法性和伦理问题。特别是,为国家安全或执法目的而进行的政府或商业数据集的数据挖掘,如在全面信息意识项目或在 ADVISE 中,引起了隐私问题。 |
− | | |
| | | |
| | | |
第335行: |
第334行: |
| Data mining requires data preparation which uncovers information or patterns which compromise confidentiality and privacy obligations. A common way for this to occur is through data aggregation. Data aggregation involves combining data together (possibly from various sources) in a way that facilitates analysis (but that also might make identification of private, individual-level data deducible or otherwise apparent). This is not data mining per se, but a result of the preparation of data before – and for the purposes of – the analysis. The threat to an individual's privacy comes into play when the data, once compiled, cause the data miner, or anyone who has access to the newly compiled data set, to be able to identify specific individuals, especially when the data were originally anonymous. | | Data mining requires data preparation which uncovers information or patterns which compromise confidentiality and privacy obligations. A common way for this to occur is through data aggregation. Data aggregation involves combining data together (possibly from various sources) in a way that facilitates analysis (but that also might make identification of private, individual-level data deducible or otherwise apparent). This is not data mining per se, but a result of the preparation of data before – and for the purposes of – the analysis. The threat to an individual's privacy comes into play when the data, once compiled, cause the data miner, or anyone who has access to the newly compiled data set, to be able to identify specific individuals, especially when the data were originally anonymous. |
| | | |
− | 数据挖掘需要进行数据准备,以发现损害机密性和隐私义务的信息或模式。发生这种情况的一种常见方式是通过数据聚合。数据聚合涉及以一种有利于分析的方式将数据组合在一起(可能来自不同的来源)(但这也可能使私有的、个人级别的数据的识别可以推断或以其他方式显而易见)。这本身并不是数据挖掘,而是在分析之前准备数据的结果,也是为了分析的目的。对个人隐私的威胁发挥作用时,数据,一旦编译,使数据矿工,或任何人谁有权访问新编译的数据集,能够识别具体的个人,特别是当数据最初是匿名的。
| + | 数据挖掘需要进行数据准备,而这一过程可能揭示出损害机密性和隐私义务的信息或模式。发生这种情况的一种常见方式是'''<font color="#ff8000">数据聚合 Data Aggregation</font>'''。数据聚合是指以便于分析的方式将数据(可能来自不同来源)组合在一起(但这也可能使私密的、个人层面的数据变得可以推断或以其他方式显现)。这本身并不是数据挖掘,而是在分析之前、为分析目的而准备数据的结果。当数据汇编之后,数据挖掘者或任何有权访问新汇编数据集的人能够识别出特定个人,尤其是在数据最初是匿名的情况下,对个人隐私的威胁就出现了。 |
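
The aggregation risk described in this paragraph can be illustrated with a small sketch. It is not taken from the article: the data, the field names (`zip`, `birth_year`, `sex`) and the helper `reidentify` are all hypothetical, chosen only to show how combining an "anonymous" data set with a second source may single out an individual.

```python
from collections import defaultdict

# Hypothetical toy data: every name, field, and value is made up for illustration.
anonymized_records = [
    {"zip": "02138", "birth_year": 1954, "sex": "F", "diagnosis": "hypertension"},
    {"zip": "02139", "birth_year": 1987, "sex": "M", "diagnosis": "asthma"},
    {"zip": "02139", "birth_year": 1987, "sex": "M", "diagnosis": "diabetes"},
]
public_directory = [
    {"name": "Alice Example", "zip": "02138", "birth_year": 1954, "sex": "F"},
    {"name": "Bob Example", "zip": "02139", "birth_year": 1987, "sex": "M"},
    {"name": "Carl Example", "zip": "02139", "birth_year": 1987, "sex": "M"},
]

# Assumed quasi-identifiers: fields that are harmless alone but identifying together.
QUASI_IDENTIFIERS = ("zip", "birth_year", "sex")


def quasi_key(row):
    """Build the combined quasi-identifier key for one row."""
    return tuple(row[field] for field in QUASI_IDENTIFIERS)


def reidentify(records, directory):
    """Return (name, record) pairs whose quasi-identifier key is unique in both sets."""
    names_by_key = defaultdict(list)
    for person in directory:
        names_by_key[quasi_key(person)].append(person["name"])

    records_by_key = defaultdict(list)
    for record in records:
        records_by_key[quasi_key(record)].append(record)

    matches = []
    for key, group in records_by_key.items():
        names = names_by_key.get(key, [])
        # Only an unambiguous one-to-one match counts as a re-identification.
        if len(group) == 1 and len(names) == 1:
            matches.append((names[0], group[0]))
    return matches


matches = reidentify(anonymized_records, public_directory)
```

In this toy case only the first record is linked to a name, because its key is unique in both sets; the two records that share a key stay ambiguous. Real linkage attacks are more involved, so this should be read as an illustration of the idea rather than a method.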
− | | |
| | | |
| | | |
第346行: |
第344行: |
| | | |
| * The purpose of the data collection and any (known) data mining projects; | | * The purpose of the data collection and any (known) data mining projects; |
| + | |
| + | 数据收集和任何(已知的)数据挖掘项目的目的; |
| | | |
| * How the data will be used; | | * How the data will be used; |
| + | |
| + | 数据将被如何使用; |
| | | |
| * Who will be able to mine the data and use the data and their derivatives; | | * Who will be able to mine the data and use the data and their derivatives; |
| + | |
| + | 谁将能够挖掘数据并使用这些数据及其衍生数据; |
| | | |
| * The status of security surrounding access to the data; | | * The status of security surrounding access to the data; |
| + | |
| + | 数据访问的安全状态; |
| | | |
| * How collected data can be updated. | | * How collected data can be updated. |
| | | |
− | | + | 如何更新收集的数据。 |
| | | |
| Data may also be modified so as to ''become'' anonymous, so that individuals may not readily be identified.<ref name="NASCIO" /> However, even "anonymized" data sets can potentially contain enough information to allow identification of individuals, as occurred when journalists were able to find several individuals based on a set of search histories that were inadvertently released by AOL.<ref>[http://www.securityfocus.com/brief/277 ''AOL search data identified individuals''], SecurityFocus, August 2006</ref> | | Data may also be modified so as to ''become'' anonymous, so that individuals may not readily be identified.<ref name="NASCIO" /> However, even "anonymized" data sets can potentially contain enough information to allow identification of individuals, as occurred when journalists were able to find several individuals based on a set of search histories that were inadvertently released by AOL.<ref>[http://www.securityfocus.com/brief/277 ''AOL search data identified individuals''], SecurityFocus, August 2006</ref> |
| | | |
− | Data may also be modified so as to become anonymous, so that individuals may not readily be identified. | + | Data may also be modified so as to become anonymous, so that individuals may not readily be identified. However, even "anonymized" data sets can potentially contain enough information to allow identification of individuals, as occurred when journalists were able to find several individuals based on a set of search histories that were inadvertently released by AOL. |
| | | |
− | 数据也可能被修改成匿名的,这样个人就不容易被识别。
| + | 数据也可以被修改成匿名的,这样个人就不容易被识别。但是,即使是“匿名化”的数据集也可能包含足够的信息来识别个人,例如记者曾根据美国在线(AOL)无意中发布的一组搜索历史记录找到了若干个人。 |
| | | |
| | | |
| + | The inadvertent revelation of [[personally identifiable information]] leading to the provider violates Fair Information Practices. This indiscretion can cause financial, emotional, or bodily harm to the indicated individual. In one instance of [[privacy violation]], the patrons of Walgreens filed a lawsuit against the company in 2011 for selling prescription information to data mining companies who in turn provided the data to pharmaceutical companies.<ref>{{Cite journal|title = Big data׳s impact on privacy, security and consumer welfare|journal = Telecommunications Policy|pages = 1134–1145|volume = 38|issue = 11|doi = 10.1016/j.telpol.2014.10.002|first = Nir|last = Kshetri|year = 2014|url = http://libres.uncg.edu/ir/uncg/f/N_Kshetri_Big_2014.pdf}}</ref> |
| | | |
− | The inadvertent revelation of [[personally identifiable information]] leading to the provider violates Fair Information Practices. This indiscretion can cause financial, | + | The inadvertent revelation of personally identifiable information leading to the provider violates Fair Information Practices. This indiscretion can cause financial, emotional, or bodily harm to the indicated individual. In one instance of privacy violation, the patrons of Walgreens filed a lawsuit against the company in 2011 for selling prescription information to data mining companies who in turn provided the data to pharmaceutical companies. |
| | | |
− | The inadvertent revelation of personally identifiable information leading to the provider violates Fair Information Practices. This indiscretion can cause financial,
| |
| | | |
− | 无意中泄露的个人身份信息信息导致供应商违反了公平信息惯例。这种轻率的行为会导致经济上的,
| + | 无意中泄露可追溯到提供者的个人身份信息,违反了公平信息惯例。这种疏忽可能对相关个人造成经济、情感或身体上的伤害。在一起侵犯隐私的案例中,沃尔格林(Walgreens)的顾客于2011年对该公司提起诉讼,指控其向数据挖掘公司出售处方信息,而这些公司又将数据提供给制药公司。 |
| | | |
− | emotional, or bodily harm to the indicated individual. In one instance of [[privacy violation]], the patrons of Walgreens filed a lawsuit against the company in 2011 for selling
| |
| | | |
− | emotional, or bodily harm to the indicated individual. In one instance of privacy violation, the patrons of Walgreens filed a lawsuit against the company in 2011 for selling
| |
| | | |
− | 对指定个人的情感或身体伤害。在一起侵犯隐私的案例中,沃尔格林的赞助人在2011年对沃尔格林公司提起诉讼,指控其销售
| + | ===欧洲的情况 Situation in Europe=== |
− | | |
− | prescription information to data mining companies who in turn provided the data
| |
− | | |
− | prescription information to data mining companies who in turn provided the data
| |
− | | |
− | 处方信息提供给数据挖掘公司,而这些公司反过来又提供了数据
| |
− | | |
− | to pharmaceutical companies.<ref>{{Cite journal|title = Big data׳s impact on privacy, security and consumer welfare|journal = Telecommunications Policy|pages = 1134–1145|volume = 38|issue = 11|doi = 10.1016/j.telpol.2014.10.002|first = Nir|last = Kshetri|year = 2014|url = http://libres.uncg.edu/ir/uncg/f/N_Kshetri_Big_2014.pdf}}</ref>
| |
− | | |
− | to pharmaceutical companies.
| |
− | | |
− | 给制药公司。
| |
− | | |
− | | |
− | | |
− | ===Situation in Europe=== | |
| | | |
| | | |
第399行: |
第387行: |
| Europe has rather strong privacy laws, and efforts are underway to further strengthen the rights of the consumers. However, the U.S.-E.U. Safe Harbor Principles, developed between 1998 and 2000, currently effectively expose European users to privacy exploitation by U.S. companies. As a consequence of Edward Snowden's global surveillance disclosure, there has been increased discussion to revoke this agreement, as in particular the data will be fully exposed to the National Security Agency, and attempts to reach an agreement with the United States have failed. | | Europe has rather strong privacy laws, and efforts are underway to further strengthen the rights of the consumers. However, the U.S.-E.U. Safe Harbor Principles, developed between 1998 and 2000, currently effectively expose European users to privacy exploitation by U.S. companies. As a consequence of Edward Snowden's global surveillance disclosure, there has been increased discussion to revoke this agreement, as in particular the data will be fully exposed to the National Security Agency, and attempts to reach an agreement with the United States have failed. |
| | | |
− | 欧洲有相当严格的隐私法,并且正在努力进一步加强消费者的权利。然而,美国和欧盟。1998年至2000年间开发的安全港原则,目前有效地将欧洲用户暴露在美国公司的隐私剥削之下。由于爱德华 · 斯诺登(Edward Snowden)披露了全球监控信息,撤销这项协议的讨论有所增加,特别是数据将完全暴露给美国国家安全局(National Security Agency) ,与美国达成协议的尝试也失败了。
| + | 欧洲有相当严格的隐私法,并且正在努力进一步加强消费者的权利。然而,1998年至2000年间制定的《美国-欧盟安全港原则》(U.S.-E.U. Safe Harbor Principles)目前实际上使欧洲用户面临美国公司的隐私利用。在爱德华·斯诺登(Edward Snowden)披露全球监控信息之后,关于撤销这一协议的讨论越来越多,尤其是因为这些数据将完全暴露给美国国家安全局(National Security Agency),而与美国达成协议的尝试也已失败。 |
− | | |
| | | |
− | | + | ===美国的情况 Situation in the United States=== |
− | ===Situation in the United States=== | |
| | | |
| | | |
第411行: |
第397行: |
| In the United States, privacy concerns have been addressed by the US Congress via the passage of regulatory controls such as the Health Insurance Portability and Accountability Act (HIPAA). The HIPAA requires individuals to give their "informed consent" regarding information they provide and its intended present and future uses. According to an article in Biotech Business Week, "'[i]n practice, HIPAA may not offer any greater protection than the longstanding regulations in the research arena,' says the AAHC. More importantly, the rule's goal of protection through informed consent is approach a level of incomprehensibility to average individuals." This underscores the necessity for data anonymity in data aggregation and mining practices. | | In the United States, privacy concerns have been addressed by the US Congress via the passage of regulatory controls such as the Health Insurance Portability and Accountability Act (HIPAA). The HIPAA requires individuals to give their "informed consent" regarding information they provide and its intended present and future uses. According to an article in Biotech Business Week, "'[i]n practice, HIPAA may not offer any greater protection than the longstanding regulations in the research arena,' says the AAHC. More importantly, the rule's goal of protection through informed consent is approach a level of incomprehensibility to average individuals." This underscores the necessity for data anonymity in data aggregation and mining practices. |
| | | |
− | 在美国,隐私问题已经通过美国国会通过的监管控制措施得到解决,比如美国健康保险便利和责任法案保护局(HIPAA)。该法要求个人就其提供的信息及其目前和未来的预期用途作出”知情同意”。根据《生物技术商业周刊》的一篇文章,“在实践中,HIPAA 可能不会提供任何比研究领域长期存在的规定更好的保护,” AAHC 说。更重要的是,该规则的目标是通过知情同意的保护是接近一般个人不可理解的水平。”这强调了数据聚合和挖掘实践中数据匿名的必要性。
| + | 在美国,国会通过了《健康保险便携性和责任法案》(HIPAA)等监管措施来应对隐私问题。HIPAA要求个人就其提供的信息及其当前和未来的预期用途给予“知情同意”。根据《生物技术商业周刊》的一篇文章,“AAHC表示,‘在实践中,HIPAA提供的保护可能并不比研究领域长期存在的法规更多。’更重要的是,该规则通过知情同意提供保护的目标,已接近普通人难以理解的程度。”这突显了在数据聚合和挖掘实践中数据匿名的必要性。 |
− | | |
| | | |
| | | |
第419行: |
第404行: |
| U.S. information privacy legislation such as HIPAA and the Family Educational Rights and Privacy Act (FERPA) applies only to the specific areas that each such law addresses. The use of data mining by the majority of businesses in the U.S. is not controlled by any legislation. | | U.S. information privacy legislation such as HIPAA and the Family Educational Rights and Privacy Act (FERPA) applies only to the specific areas that each such law addresses. The use of data mining by the majority of businesses in the U.S. is not controlled by any legislation. |
| | | |
− | 美国信息隐私立法,如 HIPAA 和《家庭教育权利和隐私法》(FERPA)仅适用于这些法律所涉及的具体领域。美国大多数企业对数据挖掘的使用并不受任何法律的控制。 | + | 美国信息隐私立法,如 HIPAA 和《家庭教育权利和隐私法》(FERPA)仅适用于每一个此类法律所涉及的特定领域。美国大多数企业对数据挖掘的使用并不受任何法律的控制。 |
− | | |
− | | |
| | | |
| == 数据挖掘与著作权法 Copyright law== | | == 数据挖掘与著作权法 Copyright law== |