Changes

47,873 bytes added, 17:35, 20 July 2021 (Tue)
Line 1: Line 1: −
This entry was temporarily machine-translated by 彩云小译 (1,358 translated words in total) and has not yet been manually edited or proofread; we apologize for any inconvenience in reading.
+
This entry was temporarily machine-translated by 彩云小译 (4,096 translated words in total) and has not yet been manually edited or proofread; we apologize for any inconvenience in reading.
   −
{{more citations needed|date=June 2013}}
+
'''Differential privacy''' is a system for publicly sharing information about a dataset by describing the patterns of groups within the dataset while withholding information about individuals in the dataset. The idea behind differential privacy is that if the effect of making an arbitrary single substitution in the database is small enough, the query result cannot be used to infer much about any single individual, and therefore provides privacy. Another way to describe differential privacy is as a constraint on the algorithms used to publish aggregate information about a [[statistical database]] which limits the disclosure of private information of records whose information is in the database. For example, differentially private algorithms are used by some government agencies to publish demographic information or other statistical aggregates while ensuring [[confidentiality]] of survey responses, and [[#Adoption of differential privacy in real-world applications|by companies]] to collect information about user behavior while controlling what is visible even to internal analysts.
   −
'''Urban computing''' is an [[Interdisciplinarity|interdisciplinary field]] which pertains to the study and application of [[Computing|computing technology]] in urban areas. This involves the application of [[wireless network]]s, [[sensor]]s, computational power, and data to improve the quality of densely populated areas:
+
Differential privacy is a system for publicly sharing information about a dataset by describing the patterns of groups within the dataset while withholding information about individuals in the dataset. The idea behind differential privacy is that if the effect of making an arbitrary single substitution in the database is small enough, the query result cannot be used to infer much about any single individual, and therefore provides privacy. Another way to describe differential privacy is as a constraint on the algorithms used to publish aggregate information about a statistical database which limits the disclosure of private information of records whose information is in the database. For example, differentially private algorithms are used by some government agencies to publish demographic information or other statistical aggregates while ensuring confidentiality of survey responses, and by companies to collect information about user behavior while controlling what is visible even to internal analysts.
   −
Urban computing is an interdisciplinary field which pertains to the study and application of computing technology in urban areas. This involves the application of wireless networks, sensors, computational power, and data to improve the quality of densely populated areas:
+
Differential privacy is a system for publicly sharing information about a dataset by describing the patterns of groups within the dataset while withholding information about individuals in the dataset. The idea behind differential privacy is that if the effect of making an arbitrary single substitution in the database is small enough, the query result cannot be used to infer much about any single individual, and therefore provides privacy. Another way to describe differential privacy is as a constraint on the algorithms used to publish aggregate information about a statistical database, which limits the disclosure of private information of records whose information is in the database. For example, differentially private algorithms are used by some government agencies to publish demographic information or other statistical aggregates while ensuring confidentiality of survey responses, and by companies to collect information about user behavior while controlling what is visible even to internal analysts.
   −
Urban computing is an interdisciplinary field concerned with the study and application of computing technology in urban areas. It involves the application of wireless networks, sensors, computational power, and data to improve the quality of densely populated areas:
+
Roughly, an algorithm is differentially private if an observer seeing its output cannot tell if a particular individual's information was used in the computation.
 +
Differential privacy is often discussed in the context of identifying individuals whose information may be in a database. Although it does not directly refer to identification and [[Data re-identification|reidentification]] attacks, differentially private algorithms probably resist such attacks.<ref name="DMNS06" />
    +
Roughly, an algorithm is differentially private if an observer seeing its output cannot tell if a particular individual's information was used in the computation.
 +
Differential privacy is often discussed in the context of identifying individuals whose information may be in a database. Although it does not directly refer to identification and reidentification attacks, differentially private algorithms probably resist such attacks.
    +
Roughly speaking, an algorithm is differentially private if an observer seeing its output cannot tell whether a particular individual's information was used in the computation. Differential privacy is often discussed in the context of identifying individuals whose information may be in a database. Although it does not directly refer to identification and reidentification attacks, differentially private algorithms probably resist such attacks.
   −
The term "urban computing" was first introduced by [[Eric Paulos]] at the 2004 UbiComp conference<ref>{{cite speech |first1=Eric |last1=Paulos |first2=Ken |last2=Anderson |first3=Anthony |last3=Townsend |url=http://www.ubicomp.org/ubicomp2004/prg.php?show=workshop#w9 |title=UbiComp in the Urban Frontier |type=workshop  |event=Sixth International Conference on Ubiquitous Computing |location=Nottingham, England |date=September 7, 2004}}</ref> and in his paper The Familiar Stranger<ref name="Paulos04">{{cite conference | last=Paulos | first=Eric | last2=Goodman | first2=Elizabeth | title=The familiar stranger: anxiety, comfort, and play in public places | publisher=ACM Press | location=New York, New York, USA | year=2004 | isbn=1-58113-702-8 | doi=10.1145/985692.985721 | pages=223–230 |conference=Proceedings of the SIGCHI Conference on Human Factors in Computing Systems}}</ref> co-authored with [[Elizabeth Goodman]].  Although closely tied to the field of [[urban informatics]], Marcus Foth differentiates the two in his preface to ''Handbook of Research on Urban Informatics'' by saying that urban computing, urban technology, and urban infrastructure focus more on technological dimensions whereas [[urban informatics]] focuses on the social and human implications of technology in cities.<ref name="foth2009">{{cite book|last1=Foth|first1=Marcus|title=Handbook of Research on Urban Informatics: The Practice and Promise of the Real-Time City|date=2009|publisher=Information Science Reference|location=Hershey, PA|isbn=978-1-60566-152-0|url=http://eprints.qut.edu.au/13308/ |oclc=227572898}}</ref>
+
Differential privacy was developed by [[Cryptography|cryptographers]] and thus is often associated with cryptography, and draws much of its language from cryptography.
   −
The term "urban computing" was first introduced by Eric Paulos at the 2004 UbiComp conference and in his paper The Familiar Stranger co-authored with Elizabeth Goodman.  Although closely tied to the field of urban informatics, Marcus Foth differentiates the two in his preface to Handbook of Research on Urban Informatics by saying that urban computing, urban technology, and urban infrastructure focus more on technological dimensions whereas urban informatics focuses on the social and human implications of technology in cities.
+
Differential privacy was developed by cryptographers and thus is often associated with cryptography, and draws much of its language from cryptography.
   −
The term "urban computing" was first introduced by Eric Paulos at the 2004 UbiComp conference and in his paper The Familiar Stranger, co-authored with Elizabeth Goodman. Although closely tied to the field of urban informatics, Marcus Foth differentiates the two in his preface to the Handbook of Research on Urban Informatics, saying that urban computing, urban technology, and urban infrastructure focus more on technological dimensions, whereas urban informatics focuses on the social and human implications of technology in cities.
+
Differential privacy was developed by cryptographers and is therefore often associated with cryptography, and it draws much of its language from cryptography.
    +
==History==
 +
Official statistics organizations are charged with collecting information from individuals or establishments, and publishing aggregate data to serve the public interest.  For example, the [[1790 United States Census]] collected information about individuals living in the United States and published tabulations based on sex, age, race, and condition of servitude. Statistical organizations have long collected information under a promise of [[confidentiality]] that the information provided will be used for statistical purposes, but that the publications will not produce information that can be traced back to a specific individual or establishment. To accomplish this goal, statistical organizations have long suppressed information in their publications. For example, in a table presenting the sales of each business in a town grouped by business category, a cell that has information from only one company might be suppressed, in order to maintain the confidentiality of that company's specific sales.
    +
Official statistics organizations are charged with collecting information from individuals or establishments, and publishing aggregate data to serve the public interest.  For example, the 1790 United States Census collected information about individuals living in the United States and published tabulations based on sex, age, race, and condition of servitude. Statistical organizations have long collected information under a promise of confidentiality that the information provided will be used for statistical purposes, but that the publications will not produce information that can be traced back to a specific individual or establishment. To accomplish this goal, statistical organizations have long suppressed information in their publications. For example, in a table presenting the sales of each business in a town grouped by business category, a cell that has information from only one company might be suppressed, in order to maintain the confidentiality of that company's specific sales.
   −
Within the domain of [[computer science]], urban computing draws from the domains of wireless and sensor networks, [[information science]], and [[Human–computer interaction|human-computer interaction]].  Urban computing uses many of the [[paradigm]]s introduced by [[ubiquitous computing]] in that collections of devices are used to gather data about the urban environment to help improve the [[quality of life]] for people affected by cities.  What further differentiates urban computing from traditional remote sensing networks is the variety of devices, inputs, and human interaction involved.  In traditional sensor networks, devices are often purposefully built and specifically deployed for monitoring certain [[phenomenon]] such as temperature, noise, and light.<ref name="Akyildiz02">{{cite journal | last1 = Akyildiz | first1 = I.F. | last2 = Su | first2 = W. | last3 = Sankarasubramaniam | first3 = Y. | last4 = Cayirci | first4 = E. | year = 2002 | title = Wireless sensor networks: a survey | journal = Computer Networks | volume = 38 | issue = 4| pages = 393–422 [395] | doi = 10.1016/S1389-1286(01)00302-4 | citeseerx = 10.1.1.320.5948 }}</ref> As an interdisciplinary field, urban computing also has practitioners and applications in fields including [[civil engineering]], [[anthropology]], [[public history]], [[health care]], [[urban planning]], and energy, among others.<ref name="Kukka14">{{cite conference | last=Kukka | first=Hannu | last2=Ylipulli | first2=Johanna | last3=Luusua | first3=Anna | last4=Dey | first4=Anind K. | title=Urban computing in theory and practice | publisher=ACM Press |conference=Proceedings of the 8th Nordic Conference on Human-Computer Interaction: Fun, Fast, Foundational (NordiCHI '14) | location=New York, New York, USA | year=2014 | isbn=978-1-4503-2542-4 | doi=10.1145/2639189.2639250 | pages=658–667}}</ref>
+
Official statistics organizations are charged with collecting information from individuals or establishments and publishing aggregate data to serve the public interest. For example, the 1790 United States Census collected information about individuals living in the United States and published tabulations based on sex, age, race, and condition of servitude. Statistical organizations have long collected information under a promise of confidentiality: the information provided will be used for statistical purposes, and the publications will not produce information that can be traced back to a specific individual or establishment. To accomplish this goal, statistical organizations have long suppressed information in their publications. For example, in a table presenting the sales of each business in a town grouped by business category, a cell containing information from only one company might be suppressed in order to keep that company's specific sales confidential.
   −
Within the domain of computer science, urban computing draws from the domains of wireless and sensor networks, information science, and human-computer interaction.  Urban computing uses many of the paradigms introduced by ubiquitous computing in that collections of devices are used to gather data about the urban environment to help improve the quality of life for people affected by cities.  What further differentiates urban computing from traditional remote sensing networks is the variety of devices, inputs, and human interaction involved. In traditional sensor networks, devices are often purposefully built and specifically deployed for monitoring certain phenomena such as temperature, noise, and light. As an interdisciplinary field, urban computing also has practitioners and applications in fields including civil engineering, anthropology, public history, health care, urban planning, and energy, among others.
+
The adoption of electronic information processing systems by statistical agencies in the 1950s and 1960s dramatically increased the number of tables that a statistical organization could produce and, in so doing, significantly increased the potential for an improper disclosure of confidential information. For example, if a business that had its sales numbers suppressed also had those numbers appear in the total sales of a region, then it might be possible to determine the suppressed value by subtracting the other sales from that total. But there might also be combinations of additions and subtractions that might cause the private information to be revealed. The number of combinations that needed to be checked increases exponentially with the number of publications, and it is potentially unbounded if data users are able to make queries of the statistical database using an interactive query system.
   −
Within the domain of computer science, urban computing draws on the domains of wireless and sensor networks, information science, and human-computer interaction. Urban computing uses many of the paradigms introduced by ubiquitous computing, in that collections of devices are used to gather data about the urban environment to help improve the quality of life for people affected by cities. What further differentiates urban computing from traditional remote sensing networks is the variety of devices, inputs, and human interaction involved. In traditional sensor networks, devices are often purposefully built and specifically deployed to monitor certain phenomena such as temperature, noise, and light. As an interdisciplinary field, urban computing also has practitioners and applications in fields including civil engineering, anthropology, public history, health care, urban planning, and energy, among others.
+
The adoption of electronic information processing systems by statistical agencies in the 1950s and 1960s dramatically increased the number of tables that a statistical organization could produce and, in so doing, significantly increased the potential for an improper disclosure of confidential information. For example, if a business that had its sales numbers suppressed also had those numbers appear in the total sales of a region, then it might be possible to determine the suppressed value by subtracting the other sales from that total. But there might also be combinations of additions and subtractions that might cause the private information to be revealed. The number of combinations that needed to be checked increases exponentially with the number of publications, and it is potentially unbounded if data users are able to make  queries of the statistical database using an interactive query system.
    +
The adoption of electronic information processing systems by statistical agencies in the 1950s and 1960s dramatically increased the number of tables that a statistical organization could produce and, in so doing, significantly increased the potential for improper disclosure of confidential information. For example, if a business that had its sales numbers suppressed also had those numbers appear in the total sales of a region, then it might be possible to determine the suppressed value by subtracting the other sales from that total. But there might also be combinations of additions and subtractions that could cause the private information to be revealed. The number of combinations that need to be checked increases exponentially with the number of publications, and it is potentially unbounded if data users are able to query the statistical database using an interactive query system.
    +
In 1977, Tore Dalenius formalized the mathematics of cell suppression.<ref>{{cite journal|author=Dalenius|first=Tore|year=1977|title=Towards a methodology for statistical disclosure control|url=https://archives.vrdc.cornell.edu/info7470/2011/Readings/dalenius-1977.pdf|journal=Statistik Tidskrift|volume=15}}</ref>
   −
== Applications and examples ==
+
In 1977, Tore Dalenius formalized the mathematics of cell suppression.
   −
{{Quotation|Urban computing is a process of acquisition, integration, and analysis of big and heterogeneous data generated by a diversity of sources in urban spaces, such as sensors, devices, vehicles, buildings, and human, to tackle the major issues that cities face. Urban computing connects unobtrusive and ubiquitous sensing technologies, advanced data management and analytics models, and novel visualization methods, to create win-win-win solutions that improve urban environment, human life quality, and city operation systems.|Yu Zheng|Urban Computing with Big Data<ref name="Yu14">{{cite journal | last=Zheng | first=Yu | last2=Capra | first2=Licia | last3=Wolfson | first3=Ouri | last4=Yang | first4=Hai | title=Urban Computing | journal=ACM Transactions on Intelligent Systems and Technology | publisher=Association for Computing Machinery (ACM) | volume=5 | issue=3 | date=2014-09-18 | issn=2157-6904 | doi=10.1145/2629592 | pages=1–55}}</ref>}}
+
In 1977, Tore Dalenius formalized the mathematics of cell suppression.
    +
In 1979, [[Dorothy Denning]], [[Peter J. Denning]] and Mayer D. Schwartz formalized the concept of a Tracker, an adversary that could learn the confidential contents of a statistical database by creating a series of targeted queries and remembering the results.<ref>{{cite journal|author=Dorothy E. Denning|author2=Peter J. Denning|author3=Mayer D. Schwartz|title=The Tracker: A Threat to Statistical Database Security|date=March 1978|url=http://www.dbis.informatik.hu-berlin.de/fileadmin/lectures/SS2011/VL_Privacy/Tracker1.pdf|volume=4|number=1|pages=76–96}}</ref> This and future research showed that privacy properties in a database could only be preserved by considering each new query in light of (possibly all) previous queries. This line of work is sometimes called ''query privacy,'' with the final result being that tracking the impact of a query on the privacy of individuals in the database was NP-hard.
    +
In 1979, Dorothy Denning, Peter J. Denning and Mayer D. Schwartz formalized the concept of a Tracker, an adversary that could learn the confidential contents of a statistical database by creating a series of targeted queries and remembering the results. This and future research showed that privacy properties in a database could only be preserved by considering each new query in light of (possibly all) previous queries. This line of work is sometimes called query privacy, with the final result being that tracking the impact of a query on the privacy of individuals in the database was NP-hard.
   −
=== Cultural archiving===
+
In 1979, Dorothy Denning, Peter J. Denning and Mayer D. Schwartz formalized the concept of a tracker, an adversary that could learn the confidential contents of a statistical database by creating a series of targeted queries and remembering the results. This and later research showed that privacy properties in a database can only be preserved by considering each new query in light of (possibly all) previous queries. This line of work is sometimes called query privacy, with the final result being that tracking the impact of a query on the privacy of individuals in the database is NP-hard.
   −
Cities are more than a collection of places and people - places are continually reinvented and re-imagined by the people occupying them.  As such, the prevalence of computing in urban spaces leads people to supplement their physical reality with what is virtually available.<ref name="Kukka14-2">{{cite journal | last1 = Kukka | first1 = Hannu | last2 = Luusua | first2 = Anna | last3 = Ylipulli | first3 = Johanna | last4 = Suopajärvi | first4 = Tiina | last5 = Kostakos | first5 = Vassilis | last6 = Ojala | first6 = Timo | year = 2014 | title = From cyberpunk to calm urban computing: Exploring the role of technology in the future cityscape | journal = Technological Forecasting and Social Change | volume = 84 | pages = 29–42 | doi = 10.1016/j.techfore.2013.07.015 }}</ref>  Toward this end, researchers engaged in ethnography, collective memory, and public history have leveraged urban computing strategies to introduce platforms that enable people to share their interpretation of the urban environment.  Examples of such projects include CLIO—an urban computing system that came out of the Collective City Memory of Oulu study—which "allows people to share personal memories, context annotate them and relate them with city landmarks, thus creating the collective city memory."<ref name="Christopoulou12">{{cite conference | last=Christopoulou | first=Eleni | last2=Ringas | first2=Dimitrios | last3=Stefanidakis | first3=Michail | title=Experiences from the Urban Computing Impact on Urban Culture |conference=16th Panhellenic Conference on Informatics (PCI)| publisher=IEEE | year=2012 | isbn=978-1-4673-2720-6 | doi=10.1109/pci.2012.53 | at=pp.56,61}}</ref> and the Cleveland Historical project which aims to create a shared history of the city by allowing people to contribute stories through their own digital devices.<ref name="clevelandhistorical">{{cite web |url=http://clevelandhistorical.org/about |title=About Cleveland Historical |publisher=Cleveland Historical |access-date=22 April 2015}}</ref>
+
In 2003, [[Kobbi Nissim]] and [[Irit Dinur]] demonstrated that it is impossible to publish arbitrary queries on a private statistical database without revealing some amount of private information, and that the entire information content of the database can be revealed by publishing the results of a surprisingly small number of random queries—far fewer than was implied by previous work.<ref>Irit Dinur and Kobbi Nissim. 2003. Revealing information while preserving privacy. In Proceedings of the twenty-second ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems (PODS '03). ACM, New York, NY, USA, 202–210. {{doi|10.1145/773153.773173}}</ref> The general phenomenon is known as the [[Reconstruction attack|Fundamental Law of Information Recovery]], and its key insight, namely that in the most general case, privacy cannot be protected without injecting some amount of noise, led to development of differential privacy.
   −
Cities are more than a collection of places and people - places are continually reinvented and re-imagined by the people occupying them. As such, the prevalence of computing in urban spaces leads people to supplement their physical reality with what is virtually available. Toward this end, researchers engaged in ethnography, collective memory, and public history have leveraged urban computing strategies to introduce platforms that enable people to share their interpretation of the urban environment. Examples of such projects include CLIO—an urban computing system that came out of the Collective City Memory of Oulu study—which "allows people to share personal memories, context annotate them and relate them with city landmarks, thus creating the collective city memory." and the Cleveland Historical project, which aims to create a shared history of the city by allowing people to contribute stories through their own digital devices.
+
In 2003, Kobbi Nissim and Irit Dinur demonstrated that it is impossible to publish arbitrary queries on a private statistical database without revealing some amount of private information, and that the entire information content of the database can be revealed by publishing the results of a surprisingly small number of random queries—far fewer than was implied by previous work. (Irit Dinur and Kobbi Nissim. 2003. Revealing information while preserving privacy. In Proceedings of the twenty-second ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems (PODS '03). ACM, New York, NY, USA, 202–210.) The general phenomenon is known as the Fundamental Law of Information Recovery, and its key insight, namely that in the most general case, privacy cannot be protected without injecting some amount of noise, led to development of differential privacy.
   −
Cities are more than a collection of places and people; places are continually reinvented and re-imagined by the people occupying them. As such, the prevalence of computing in urban spaces leads people to supplement their physical reality with what is virtually available. Toward this end, researchers engaged in ethnography, collective memory, and public history have leveraged urban computing strategies to introduce platforms that enable people to share their interpretation of the urban environment. Examples of such projects include CLIO, an urban computing system that came out of the Collective City Memory of Oulu study, which "allows people to share personal memories, context annotate them and relate them with city landmarks, thus creating the collective city memory," and the Cleveland Historical project, which aims to create a shared history of the city by allowing people to contribute stories through their own digital devices.
+
In 2003, Kobbi Nissim and Irit Dinur demonstrated that it is impossible to publish arbitrary queries on a private statistical database without revealing some amount of private information, and that the entire information content of the database can be revealed by publishing the results of a surprisingly small number of random queries, far fewer than was implied by previous work (Irit Dinur and Kobbi Nissim. 2003. Revealing information while preserving privacy. In Proceedings of the twenty-second ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems (PODS '03). ACM, New York, NY, USA, 202–210.). The general phenomenon is known as the Fundamental Law of Information Recovery, and its key insight, namely that in the most general case privacy cannot be protected without injecting some amount of noise, led to the development of differential privacy.
    +
In 2006, [[Cynthia Dwork]], [[Frank McSherry]], [[Kobbi Nissim]] and [[Adam D. Smith]] published an article formalizing the amount of noise that needed to be added and proposing a generalized mechanism for doing so.<ref name="DMNS06" /> Their work was a co-recipient of the 2016 TCC Test-of-Time Award<ref>{{cite web |title=TCC Test-of-Time Award |url=https://www.iacr.org/workshops/tcc/awards.html}}</ref> and the 2017 [[Gödel Prize]].<ref>{{cite web |title=2017 Gödel Prize |url=https://www.eatcs.org/index.php/component/content/article/1-news/2450-2017-godel-prize}}</ref>
    +
In 2006, Cynthia Dwork, Frank McSherry, Kobbi Nissim and Adam D. Smith published an article formalizing the amount of noise that needed to be added and proposing a generalized mechanism for doing so. Their work was a co-recipient of the 2016 TCC Test-of-Time Award and the 2017 Gödel Prize.
   −
===Energy consumption===
+
In 2006, Cynthia Dwork, Frank McSherry, Kobbi Nissim and Adam D. Smith published an article formalizing the amount of noise that needed to be added and proposing a generalized mechanism for doing so. Their work was a co-recipient of the 2016 TCC Test-of-Time Award and the 2017 Gödel Prize.
   −
Energy consumption and pollution throughout the world is heavily impacted by urban transportation.<ref name="Greenhouse">{{cite web | title=Greenhouse Gas Emissions: Transportation Sector Emissions - Climate Change - US EPA | website=epa.gov | date=2012-03-16 | url=http://www.epa.gov/climatechange/ghgemissions/sources/transportation.html | archive-url=https://web.archive.org/web/20140704052734/http://www.epa.gov/climatechange/ghgemissions/sources/transportation.html | archive-date=2014-07-04 | url-status=unfit}}</ref> In an effort to better utilize and update current infrastructures, researchers have used urban computing to better understand gas emissions by conducting field studies using GPS data from a sample of vehicles, refueling data from gas stations, and self-reporting online participants.<ref name="Zhang">{{cite conference | last=Zhang | first=Fuzheng | last2=Wilkie | first2=David | last3=Zheng | first3=Yu | last4=Xie | first4=Xing | title=Sensing the pulse of urban refueling behavior |conference=UbiComp '13: Proceedings of the 2013 ACM international joint conference on Pervasive and ubiquitous computing| publisher=ACM Press | location=New York, New York, USA | year=2013 | isbn=978-1-4503-1770-2 | doi=10.1145/2493432.2493448 | pages=13–22}}</ref> From this, knowledge of the density and speed of traffic traversing a city's road network can be used to suggest cost-efficient driving routes, and identify road segments where gas has been significantly wasted.<ref name="Inferring_Yu">{{cite conference | last=Shang | first=Jingbo | last2=Zheng | first2=Yu | last3=Tong | first3=Wenzhu | last4=Chang | first4=Eric | last5=Yu | first5=Yong | title=Inferring gas consumption and pollution emission of vehicles throughout a city |conference=KDD '14: Proceedings of the 20th ACM SIGKDD international conference on Knowledge discovery and data mining | publisher=ACM Press | location=New York, New York, USA | year=2014 | isbn=978-1-4503-2956-9 | doi=10.1145/2623330.2623653 | pages=1027–1036}}</ref> Information and predictions of pollution density gathered in this way could also be used to generate localized air quality alerts.<ref name="Inferring_Yu" /> Additionally, these data could produce estimates of gas stations’ wait times to suggest more efficient stops, as well as give a geographic view of the efficiency of gas station placement.<ref name="Zhang" />
+
Since then, subsequent research has shown that there are many ways to produce very accurate statistics from the database while still ensuring high levels of privacy.<ref>{{Cite journal|last=Hilton|first=Michael|s2cid=16861132|title=Differential Privacy: A Historical Survey}}</ref><ref>{{Cite book|title=Theory and Applications of Models of Computation|volume=4978|last=Dwork|first=Cynthia|date=2008-04-25|publisher=Springer Berlin Heidelberg|isbn=9783540792277|editor-last=Agrawal|editor-first=Manindra|series=Lecture Notes in Computer Science|pages=1–19|language=en|chapter=Differential Privacy: A Survey of Results|doi=10.1007/978-3-540-79228-4_1|editor-last2=Du|editor-first2=Dingzhu|editor-last3=Duan|editor-first3=Zhenhua|editor-last4=Li|editor-first4=Angsheng|chapter-url=https://www.microsoft.com/en-us/research/publication/differential-privacy-a-survey-of-results/}}</ref>
   −
Energy consumption and pollution throughout the world is heavily impacted by urban transportation. In an effort to better utilize and update current infrastructures, researchers have used urban computing to better understand gas emissions by conducting field studies using GPS data from a sample of vehicles, refueling data from gas stations, and self-reporting online participants. From this, knowledge of the density and speed of traffic traversing a city's road network can be used to suggest cost-efficient driving routes, and identify road segments where gas has been significantly wasted. Information and predictions of pollution density gathered in this way could also be used to generate localized air quality alerts. Additionally, these data could produce estimates of gas stations’ wait times to suggest more efficient stops, as well as give a geographic view of the efficiency of gas station placement.
+
Since then, subsequent research has shown that there are many ways to produce very accurate statistics from the database while still ensuring high levels of privacy.
   −
Energy consumption and pollution throughout the world are heavily impacted by urban transportation. In an effort to better utilize and update current infrastructure, researchers have used urban computing to better understand gas emissions by conducting field studies using GPS data from a sample of vehicles, refueling data from gas stations, and self-reporting online participants. From this, knowledge of the density and speed of traffic traversing a city's road network can be used to suggest cost-efficient driving routes and to identify road segments where gas has been significantly wasted. Information and predictions of pollution density gathered in this way could also be used to generate localized air quality alerts. Additionally, these data could produce estimates of gas stations' wait times to suggest more efficient stops, as well as give a geographic view of the efficiency of gas station placement.
+
Since then, subsequent research has shown that there are many ways to produce very accurate statistics from the database while still ensuring high levels of privacy.
    +
==ε-differential privacy==
 +
The 2006 Dwork, McSherry, Nissim and Smith article introduced the concept of ε-differential privacy, a mathematical definition for the privacy loss associated with any data release drawn from a statistical database. (Here, the term ''statistical database'' means a set of data that are collected under the pledge of confidentiality for the purpose of producing statistics that, by their production, do not compromise the privacy of those individuals who provided the data.)
    +
The 2006 Dwork, McSherry, Nissim and Smith article introduced the concept of ε-differential privacy, a mathematical definition for the privacy loss associated with any data release drawn from a statistical database. (Here, the term statistical database means a set of data that are collected under the pledge of confidentiality for the purpose of producing statistics that, by their production, do not compromise the privacy of those individuals who provided the data.)
   −
===Health===
+
The 2006 article by Dwork, McSherry, Nissim and Smith introduced the concept of ε-differential privacy, a mathematical definition for the privacy loss associated with any data release drawn from a statistical database. (Here, the term statistical database means a set of data that are collected under the pledge of confidentiality for the purpose of producing statistics that, by their production, do not compromise the privacy of those individuals who provided the data.)
   −
Urban computing can also be used to track and predict pollution in certain areas.  Research involving the use of artificial neural networks (ANN)  and conditional random fields (CRF) has shown that air pollution for a large area can be predicted based on the data from a small number of air pollution monitoring stations. These findings can be used to track air pollution and to prevent the adverse health effects in cities already struggling with high pollution. On days when air pollution is especially high, for example, there could be a system in place to alert residents to particularly dangerous areas.
+
The intuition for the 2006 definition of ε-differential privacy is that a person's privacy cannot be compromised by a statistical release if their data are not in the database. Therefore, with differential privacy, the goal is to give each individual roughly the same privacy that would result from having their data removed. That is, the statistical functions run on the database should not overly depend on the data of any one individual.
   −
Urban computing can also be used to track and predict pollution in certain areas. Research involving the use of artificial neural networks (ANN) and conditional random fields (CRF) has shown that air pollution over a large area can be predicted from the data of a small number of air pollution monitoring stations. These findings can be used to track air pollution and to prevent adverse health effects in cities already struggling with high pollution. On days when air pollution is especially high, for example, a system could be put in place to alert residents to particularly dangerous areas.
+
The intuition for the 2006 definition of ε-differential privacy is that a person's privacy cannot be compromised by a statistical release if their data are not in the database. Therefore, with differential privacy, the goal is to give each individual roughly the same privacy that would result from having their data removed. That is, the statistical functions run on the database should not overly depend on the data of any one individual.
   −
Smart phones, tablets, smart watches, and other mobile computing devices can provide information beyond simple communication and entertainment.  In regards to public and personal health, organizations like the [[Center for Disease Control and Prevention ]](CDC)  and [[World Health Organization]] (WHO) have taken to Twitter and other social media platforms, to provide rapid dissemination of disease outbreaks, medical discoveries, and other news.  Beyond simply tracking the spread of disease, urban computing can even help predict it.  A study by Jeremy Ginsberg et al. discovered that flu-related search queries serve as a reliable indicator of a future outbreak, thus allowing for the tracking of flu outbreaks based on the geographic location of such flu-related searches.<ref name="Ginsberg">{{cite journal | last1 = Ginsberg | first1 = J | display-authors = etal  | year = 2009 | title = Detecting influenza epidemics using search engine query data | journal = Nature | volume = 457 | issue = 7232| pages = 1012–1014 | doi = 10.1038/nature07634 | pmid=19020500| bibcode = 2009Natur.457.1012G }}</ref> This discovery spurred a collaboration between the CDC and Google to create a map of predicted flu outbreaks based on this data.<ref name="FluTrends">{{cite web |url=http://www.google.org/flutrends/about/how.html |title=Google Flu Trends |access-date=21 April 2015}}</ref>
+
The intuition behind the 2006 definition of ε-differential privacy is that a person's privacy cannot be compromised by a statistical release if their data are not in the database. Therefore, with differential privacy, the goal is to give each individual roughly the same privacy that would result from having their data removed. That is, the statistical functions run on the database should not depend too heavily on the data of any one individual.
    +
Of course, how much any individual contributes to the result of a database query depends in part on how many people's data are involved in the query. If the database contains data from a single person, that person's data contributes 100%. If the database contains data from a hundred people, each person's data contributes just 1%. The key insight of differential privacy is that as the query is made on the data of fewer and fewer people, more noise needs to be added to the query result to produce the same amount of privacy. Hence the name of the 2006 paper, "Calibrating noise to sensitivity in private data analysis."
    +
Of course, how much any individual contributes to the result of a database query depends in part on how many people's data are involved in the query. If the database contains data from a single person, that person's data contributes 100%. If the database contains data from a hundred people, each person's data contributes just 1%. The key insight of differential privacy is that as the query is made on the data of fewer and fewer people, more noise needs to be added to the query result to produce the same amount of privacy. Hence the name of the 2006 paper, "Calibrating noise to sensitivity in private data analysis."
   −
Urban computing can also be used to track and predict pollution in certain areas.  Research involving the use of [[artificial neural networks]] (ANN)  and [[conditional random fields]] (CRF) has shown that air pollution for a large area can be predicted based on the data from a small number of air pollution monitoring stations.<ref name ="Zheng01">{{cite conference | last=Zheng | first=Yu | last2=Liu | first2=Furui | last3=Hsieh | first3=Hsun-Ping | title=U-Air: when urban air quality inference meets big data |conference=KDD '13: Proceedings of the 19th ACM SIGKDD international conference on Knowledge discovery and data mining| publisher=ACM Press | location=New York, New York, USA | year=2013 | isbn=978-1-4503-2174-7 | doi=10.1145/2487575.2488188 | pages=1436–1444}}</ref><ref name ="Zheng02">{{cite journal |first=Yu |last=Zheng |first2=Xuxu |last2=Chen |first3=Qiwei |last3=Jin |first4=Yubiao |last4=Chen |first5=Xiangyun |last5=Qu |first6=Xin |last6=Liu |first7=Eric |last7=Chang |first8=Wei-Ying |last8=Ma |first9=Yong |last9=Rui |first10=Weiwei |last10=Sun |title=A Cloud-Based Knowledge Discovery System for Monitoring Fine-Grained Air Quality |journal=MSR-Tr-2014-40 |year=2014 |url=http://pdfs.semanticscholar.org/54a6/98ad7cf82c98cb3bcb6ce38b88c2e2754974.pdf |archive-url=https://web.archive.org/web/20190224072142/http://pdfs.semanticscholar.org/54a6/98ad7cf82c98cb3bcb6ce38b88c2e2754974.pdf |url-status=dead |archive-date=2019-02-24 }}</ref>  These findings can be used to track air pollution and to prevent the adverse health effects in cities already struggling with high pollution. On days when air pollution is especially high, for example, there could be a system in place to alert residents to particularly dangerous areas.
+
Of course, how much any individual contributes to the result of a database query depends in part on how many people's data are involved in the query. If the database contains data from a single person, that person's data contributes 100%. If the database contains data from a hundred people, each person's data contributes just 1%. The key insight of differential privacy is that as the query is made on the data of fewer and fewer people, more noise needs to be added to the query result to produce the same amount of privacy. Hence the name of the 2006 paper, "Calibrating noise to sensitivity in private data analysis."
   −
Mobile computing platforms can be used to facilitate social interaction.  In the context of urban computing, the ability to place proximity beacons in the environment, the density of population, and infrastructure available enables digitally facilitated interaction. Paulos and Goodman's paper The Familiar Stranger introduces several categories of interaction ranging from family to strangers and interactions ranging from personal to in passing. Examples of geographically aware applications include Yik Yak, an application that facilitates anonymous social interaction based on proximity of other users, Ingress which uses an augmented reality game to encourage users to interact with the area around them as well as each other, and Foursquare, which provides recommendations about services to users based on a specified location.
+
The 2006 paper presents both a mathematical definition of differential privacy and a mechanism based on the addition of Laplace noise (i.e. noise coming from the [[Laplace distribution]]) that satisfies the definition.
   −
Mobile computing platforms can be used to facilitate social interaction. In the context of urban computing, the ability to place proximity beacons in the environment, the density of population, and the available infrastructure enable digitally facilitated interaction. Paulos and Goodman's paper The Familiar Stranger introduces several categories of interaction, ranging from family to strangers, and interactions ranging from personal to in passing. Examples of geographically aware applications include Yik Yak, an application that facilitates anonymous social interaction based on proximity to other users; Ingress, which uses an augmented reality game to encourage users to interact with the area around them as well as with each other; and Foursquare, which provides recommendations about services to users based on a specified location.
+
The 2006 paper presents both a mathematical definition of differential privacy and a mechanism based on the addition of Laplace noise (i.e. noise coming from the Laplace distribution) that satisfies the definition.
    +
The 2006 paper presents both a mathematical definition of differential privacy and a mechanism based on the addition of Laplace noise (i.e., noise drawn from the Laplace distribution) that satisfies the definition.
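As an illustration of the Laplace mechanism mentioned above, the following is a minimal sketch in Python (not part of the original article) for a counting query, whose result changes by at most 1 when one record is added or removed; the function and variable names are illustrative only.

<syntaxhighlight lang="python">
import numpy as np

def laplace_count(data, predicate, epsilon):
    """Release a noisy count of the records matching `predicate`.

    A counting query changes by at most 1 when one record is added or
    removed, so Laplace noise with scale 1/epsilon yields
    epsilon-differential privacy for this query.
    """
    true_count = sum(1 for row in data if predicate(row))
    noise = np.random.laplace(loc=0.0, scale=1.0 / epsilon)
    return true_count + noise

# Example: a noisy answer to "how many records have age >= 40?" at epsilon = 0.5.
records = [{"age": 34}, {"age": 52}, {"age": 47}, {"age": 29}]
print(laplace_count(records, lambda r: r["age"] >= 40, epsilon=0.5))
</syntaxhighlight>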
    +
===Definition of ε-differential privacy===
 +
Let ε be a positive [[real number]] and <math>\mathcal{A}</math> be a [[randomized algorithm]] that takes a dataset as input (representing the actions of the trusted party holding the data).
 +
Let <math>\textrm{im}\ \mathcal{A}</math> denote the [[image (mathematics)|image]] of <math>\mathcal{A}</math>. The algorithm <math>\mathcal{A}</math> is said to provide <math>\epsilon</math>-differential privacy if, for all datasets <math>D_1</math> and <math>D_2</math> that differ on a single element (i.e., the data of one person), and all subsets <math>S</math> of <math>\textrm{im}\ \mathcal{A}</math>:
 +
<center>
 +
<math>\Pr[\mathcal{A}(D_1) \in S] \leq \exp\left(\epsilon\right) \cdot \Pr[\mathcal{A}(D_2) \in S],</math>
 +
</center>
 +
where the probability is taken over the [[randomness]] used by the algorithm.<ref name="DPBook"/>
   −
=== Social Interaction===
+
Let ε be a positive real number and <math>\mathcal{A}</math> be a randomized algorithm that takes a dataset as input (representing the actions of the trusted party holding the data).
 +
Let <math>\textrm{im}\ \mathcal{A}</math> denote the image of <math>\mathcal{A}</math>. The algorithm <math>\mathcal{A}</math> is said to provide <math>\epsilon</math>-differential privacy if, for all datasets <math>D_1</math> and <math>D_2</math> that differ on a single element (i.e., the data of one person), and all subsets <math>S</math> of <math>\textrm{im}\ \mathcal{A}</math>:
   −
Mobile computing platforms can be used to facilitate social interaction.  In the context of urban computing, the ability to place proximity beacons in the environment, the density of population, and infrastructure available enables digitally facilitated interaction.  Paulos and Goodman's paper The Familiar Stranger introduces several categories of interaction ranging from family to strangers and interactions ranging from personal to in passing.<ref name="Paulos04"/>  Social interactions can be facilitated by purpose-built devices, proximity aware applications, and “participatory” applications.  These applications can use a variety techniques for users to identify where they are ranging from “checking in” to proximity detection, to self-identification.<ref name="Jabeur13">{{cite journal | last=Jabeur | first=Nafaâ | last2=Zeadally | first2=Sherali | last3=Sayed | first3=Biju | title=Mobile social networking applications | journal=Communications of the ACM | publisher=Association for Computing Machinery (ACM) | volume=56 | issue=3 | date=2013-03-01 | issn=0001-0782 | doi=10.1145/2428556.2428573 | page=71}}</ref> Examples of geographically aware applications include [[Yik Yak]], an application that facilitates anonymous social interaction based on proximity of other users, [[Ingress (game)|Ingress]] which uses an [[augmented reality]] game to encourage users to interact with the area around them as well as each other, and [[Foursquare City Guide|Foursquare]], which provides recommendations about services to users based on a specified location.
+
<math>\Pr[\mathcal{A}(D_1) \in S] \leq \exp\left(\epsilon\right) \cdot \Pr[\mathcal{A}(D_2) \in S],</math>
   −
One of the major application areas of urban computing is to improve private and public transportation in a city. The primary sources of data are floating car data (data about where cars are at a given moment). This includes individual GPS’s, taxi GPS’s, WiFI signals, loop sensors, and (for some applications) user input.
+
where the probability is taken over the randomness used by the algorithm.
   −
One of the major application areas of urban computing is improving private and public transportation in a city. The primary source of data is floating car data (data about where cars are at a given moment). This includes individual GPS units, taxi GPS, WiFi signals, loop sensors, and (for some applications) user input.
+
Let ε be a positive real number and <math>\mathcal{A}</math> be a randomized algorithm that takes a dataset as input (representing the actions of the trusted party holding the data). Let <math>\textrm{im}\ \mathcal{A}</math> denote the image of <math>\mathcal{A}</math>. The algorithm <math>\mathcal{A}</math> is said to provide <math>\epsilon</math>-differential privacy if, for all datasets <math>D_1</math> and <math>D_2</math> that differ on a single element (i.e., the data of one person), and all subsets <math>S</math> of <math>\textrm{im}\ \mathcal{A}</math>: <math>\Pr[\mathcal{A}(D_1) \in S] \leq \exp\left(\epsilon\right) \cdot \Pr[\mathcal{A}(D_2) \in S],</math> where the probability is taken over the randomness used by the algorithm.
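The definition can be checked numerically for a concrete mechanism. The sketch below (an illustration, not from the article) estimates the two probabilities in the inequality for the Laplace counting mechanism on two neighboring databases whose true counts differ by 1; the event S and the parameter values are arbitrary choices.

<syntaxhighlight lang="python">
import numpy as np

rng = np.random.default_rng(0)
epsilon = 0.5
trials = 200_000

def mechanism(count, n):
    # Laplace mechanism for a counting query (sensitivity 1), run n times.
    return count + rng.laplace(0.0, 1.0 / epsilon, size=n)

# Neighbouring databases D1 and D2 differ in one person, so their true counts differ by 1.
out_d1 = mechanism(10, trials)
out_d2 = mechanism(11, trials)

# Pick any event S over the outputs, e.g. "the released value is at least 12",
# and compare the two probabilities against the exp(epsilon) bound.
p1 = np.mean(out_d1 >= 12.0)
p2 = np.mean(out_d2 >= 12.0)
print(p1, p2, np.exp(epsilon))   # expect p1 <= exp(epsilon) * p2 and p2 <= exp(epsilon) * p1
</syntaxhighlight>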
    +
Differential privacy offers strong and robust guarantees that facilitate modular design and analysis of differentially private mechanisms due to its [[#Composability|composability]], [[#Robustness to post-processing|robustness to post-processing]], and graceful degradation in the presence of [[#Group privacy|correlated data]].
    +
Differential privacy offers strong and robust guarantees that facilitate modular design and analysis of differentially private mechanisms due to its composability, robustness to post-processing, and graceful degradation in the presence of correlated data.
   −
Urban computing can help select better driving routes, which is important for applications like Waze, Google Maps, and trip planning. Wang et al. built a system to get real-time travel time estimates. They solve the problems: one, not all road segments will have data from GPS in the last 30 minutes or ever; two, some paths will be covered by several car records, and it’s necessary to combine those records to create the most accurate estimate of travel time; and three, a city can have tens of thousands of road segments and an infinite amount of paths to be queried, so providing an instantaneous real time estimate must be scalable. They used various techniques and tested it out on 32670 taxis over two months in Beijing, and accurately estimated travel time to within 25 seconds of error per kilometer.
+
Differential privacy offers strong and robust guarantees that facilitate modular design and analysis of differentially private mechanisms, due to its composability, robustness to post-processing, and graceful degradation in the presence of correlated data.
   −
Urban computing can help select better driving routes, which is important for applications like Waze, Google Maps, and trip planning. Wang et al. built a system to obtain real-time travel-time estimates. It addresses three problems: first, not all road segments will have GPS data from the last 30 minutes, or ever; second, some paths will be covered by several car records, and those records must be combined to create the most accurate estimate of travel time; and third, a city can have tens of thousands of road segments and an infinite number of paths to be queried, so providing an instantaneous real-time estimate must be scalable. They used various techniques and tested the system on 32,670 taxis over two months in Beijing, accurately estimating travel time to within 25 seconds of error per kilometer.
+
=== Composability ===
 +
(Self-)composability refers to the fact that the joint distribution of the outputs of (possibly adaptively chosen) differentially private mechanisms satisfies differential privacy.
   −
=== Transportation ===
+
(Self-)composability refers to the fact that the joint distribution of the outputs of (possibly adaptively chosen) differentially private mechanisms satisfies differential privacy.
   −
One of the major application areas of urban computing is to improve private and public transportation in a city. The primary sources of data are floating car data (data about where cars are at a given moment). This includes individual GPS’s, taxi GPS’s, WiFI signals, loop sensors, and (for some applications) user input.
+
(Self-)composability refers to the fact that the joint distribution of the outputs of (possibly adaptively chosen) differentially private mechanisms satisfies differential privacy.
   −
Uber is an on-demand taxi-like service where users can request rides with their smartphone. By using the data of the active riders and drivers, Uber can price discriminate based on the current rider/driver ratio. This lets them earn more money than they would without “surge pricing,” and helps get more drivers out on the street in unpopular working hours.
+
'''Sequential composition.''' If we query an ε-differential privacy mechanism <math>t</math> times, and the randomization of the mechanism is independent for each query, then the result would be <math>\epsilon t</math>-differentially private. In the more general case, if there are <math>n</math> independent mechanisms: <math>\mathcal{M}_1,\dots,\mathcal{M}_n</math>, whose privacy guarantees are <math>\epsilon_1,\dots,\epsilon_n</math> differential privacy, respectively, then any function <math>g</math> of them: <math>g(\mathcal{M}_1,\dots,\mathcal{M}_n)</math> is <math>\left(\sum\limits_{i=1}^{n} \epsilon_i\right)</math>-differentially private.<ref name="PINQ" />
   −
Uber is an on-demand, taxi-like service whose users can request rides with their smartphones. By using data on active riders and drivers, Uber can price-discriminate based on the current rider/driver ratio. This lets it earn more money than it would without "surge pricing," and helps get more drivers onto the street during unpopular working hours.
+
Sequential composition. If we query an ε-differential privacy mechanism <math>t</math> times, and the randomization of the mechanism is independent for each query, then the result would be <math>\epsilon t</math>-differentially private. In the more general case, if there are <math>n</math> independent mechanisms: <math>\mathcal{M}_1,\dots,\mathcal{M}_n</math>, whose privacy guarantees are <math>\epsilon_1,\dots,\epsilon_n</math> differential privacy, respectively, then any function <math>g</math> of them: <math>g(\mathcal{M}_1,\dots,\mathcal{M}_n)</math> is <math>\left(\sum\limits_{i=1}^{n} \epsilon_i\right)</math>-differentially private.
   −
Urban computing can help select better driving routes, which is important for applications like Waze, Google Maps, and trip planning. Wang et al. built a system to get real-time travel time estimates. They solve the problems: one, not all road segments will have data from GPS in the last 30 minutes or ever; two, some paths will be covered by several car records, and it’s necessary to combine those records to create the most accurate estimate of travel time; and three, a city can have tens of thousands of road segments and an infinite amount of paths to be queried, so providing an instantaneous real time estimate must be scalable. They used various techniques and tested it out on 32670 taxis over two months in Beijing, and accurately estimated travel time to within 25 seconds of error per kilometer.<ref name="Yu14"/>
+
Sequential composition. If we query an ε-differential privacy mechanism <math>t</math> times, and the randomization of the mechanism is independent for each query, then the result would be <math>\epsilon t</math>-differentially private. In the more general case, if there are <math>n</math> independent mechanisms <math>\mathcal{M}_1,\dots,\mathcal{M}_n</math>, whose privacy guarantees are <math>\epsilon_1,\dots,\epsilon_n</math>-differential privacy, respectively, then any function <math>g</math> of them, <math>g(\mathcal{M}_1,\dots,\mathcal{M}_n)</math>, is <math>\left(\sum\limits_{i=1}^{n} \epsilon_i\right)</math>-differentially private.
    +
'''Parallel composition.''' If the previous mechanisms are computed on ''disjoint'' subsets of the private database then the function <math>g</math> would be <math>(\max_i \epsilon_i)</math>-differentially private instead.<ref name="PINQ" />
    +
Parallel composition. If the previous mechanisms are computed on disjoint subsets of the private database then the function <math>g</math> would be <math>(\max_i \epsilon_i)</math>-differentially private instead.
   −
Urban computing can also improve public transportation cheaply. A University of Washington group developed OneBusAway, which uses public bus GPS data to provide real-time bus information to riders. Placing displays at bus stops to give information is expensive, but developing several interfaces (apps, website, phone response, SMS) to OneBusAway was comparatively cheap. Among surveyed OneBusAway users, 92% were more satisfied, 91% waited less, and 30% took more trips.
+
Parallel composition. If the previous mechanisms are computed on disjoint subsets of the private database, then the function <math>g</math> would be <math>(\max_i \epsilon_i)</math>-differentially private instead.
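A small sketch of how the two composition rules above translate into privacy-budget accounting; the function names and the example epsilons are illustrative, not taken from the article.

<syntaxhighlight lang="python">
def sequential_budget(epsilons):
    # Mechanisms applied to the same data: the privacy losses add up.
    return sum(epsilons)

def parallel_budget(epsilons):
    # Mechanisms applied to disjoint subsets of the data: the largest loss dominates.
    return max(epsilons)

# Three releases with epsilon = 0.1, 0.2 and 0.3:
print(sequential_budget([0.1, 0.2, 0.3]))  # 0.6 if all three touch every record
print(parallel_budget([0.1, 0.2, 0.3]))    # 0.3 if each touches its own partition
</syntaxhighlight>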
   −
Urban computing can also improve public transportation cheaply. A University of Washington group developed OneBusAway, which uses public bus GPS data to provide real-time bus information to riders. Placing displays at bus stops to provide information is expensive, but developing several interfaces (apps, a website, phone response, SMS) for OneBusAway was comparatively cheap. Among surveyed OneBusAway users, 92% were more satisfied, 91% waited less, and 30% took more trips.
+
=== Robustness to post-processing ===
 +
For any deterministic or randomized function <math>F</math> defined over the image of the mechanism <math>\mathcal{A}</math>, if <math>\mathcal{A}</math> satisfies ε-differential privacy, so does <math>F(\mathcal{A})</math>.
   −
[[Bicycle counter]]s are an example of [[computing technology]] to count the number of [[Cycling|cyclists]] at a certain spot in order to help [[urban planning]] with reliable data.<ref>{{Cite web|url=http://www.cycling-embassy.dk/2012/06/06/cycle-cities-awarded-bicycle-counters/|title=Cycle cities awarded bicycle counters|last=Magni|first=Marie|date=2012-06-06|website=Cycling Embassy of Denmark|language=en-US|access-date=2020-04-25}}</ref><ref>{{Cite web|url=https://hamburg.adfc.de/verkehr/themen-a-z/gute-beispiele/fahrradbarometer/|title=Fahrradbarometer|website=hamburg.adfc.de|language=de|access-date=2020-04-25}}</ref>
+
For any deterministic or randomized function <math>F</math> defined over the image of the mechanism <math>\mathcal{A}</math>, if <math>\mathcal{A}</math> satisfies ε-differential privacy, so does <math>F(\mathcal{A})</math>.
    +
For any deterministic or randomized function <math>F</math> defined over the image of the mechanism <math>\mathcal{A}</math>, if <math>\mathcal{A}</math> satisfies ε-differential privacy, so does <math>F(\mathcal{A})</math>.
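As a concrete illustration of robustness to post-processing (a sketch with assumed names, not from the article): any data-independent transformation of a differentially private release, such as rounding or clamping, consumes no additional privacy budget.

<syntaxhighlight lang="python">
import numpy as np

rng = np.random.default_rng(1)

def noisy_count(count, epsilon):
    # epsilon-differentially private release of a counting query (sensitivity 1).
    return count + rng.laplace(0.0, 1.0 / epsilon)

def post_process(release):
    # Rounding and clamping use only the released value, not the raw data,
    # so the result is still epsilon-differentially private.
    return max(0, round(release))

print(post_process(noisy_count(3, epsilon=0.5)))
</syntaxhighlight>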
    +
Together, [[#Composability|composability]] and [[#Robustness to post-processing|robustness to post-processing]] permit modular construction and analysis of differentially private mechanisms and motivate the concept of the ''privacy loss budget''. If all elements that access sensitive data of a complex mechanisms are separately differentially private, so will be their combination, followed by arbitrary post-processing.
   −
Making decisions on transportation policy can also be aided with urban computing. London’s Cycle Hire system is a heavily used bicycle sharing system run by their transit authority. Originally, it required users to have a membership. They changed it to not require a membership after a while, and analyzed data of when and where bikes were rented and returned, to see what areas were active and what trends changed. They found that removing membership was a good decision that increased weekday commutes somewhat and heavily increased weekend usage. Based on the patterns and characteristics of a bicycle sharing system, the implications for data-driven decision supports have been studied for transforming urban transportation to be more sustainable.
+
Together, composability and robustness to post-processing permit modular construction and analysis of differentially private mechanisms and motivate the concept of the privacy loss budget. If all elements that access sensitive data of a complex mechanisms are separately differentially private, so will be their combination, followed by arbitrary post-processing.
   −
Making decisions on transportation policy can also be aided by urban computing. London's Cycle Hire system is a heavily used bicycle sharing system run by their transit authority. Originally, it required users to have a membership. After a while they changed it so that no membership was required, and they analyzed data on when and where bikes were rented and returned to see which areas were active and how trends changed. They found that removing the membership requirement was a good decision that increased weekday commutes somewhat and heavily increased weekend usage. Based on the patterns and characteristics of a bicycle sharing system, the implications for data-driven decision supports have been studied for transforming urban transportation to be more sustainable.
+
Together, composability and robustness to post-processing permit modular construction and analysis of differentially private mechanisms and motivate the concept of the privacy loss budget. If all elements of a complex mechanism that access sensitive data are separately differentially private, then so is their combination, followed by arbitrary post-processing.
   −
Uber is an on-demand taxi-like service where users can request rides with their smartphone. By using the data of the active riders and drivers, Uber can price discriminate based on the current rider/driver ratio. This lets them earn more money than they would without “surge pricing,” and helps get more drivers out on the street in unpopular working hours.<ref>{{cite magazine| title=Pricing the surge | magazine=[[The Economist]] |department=Free exchange | date=2014-03-29 | url=https://www.economist.com/finance-and-economics/2014/03/29/pricing-the-surge |url-access=limited}}</ref>
+
=== Group privacy ===
 +
In general, ε-differential privacy is designed to protect the privacy between neighboring databases which differ only in one row. This means that no adversary with arbitrary auxiliary information can know if '''one''' particular participant submitted his information. However, this is also extendable: if we want to protect databases differing in <math>c</math> rows, this amounts to an adversary with arbitrary auxiliary information being unable to know whether '''<math>c</math>''' particular participants submitted their information. This can be achieved because if <math>c</math> items change, the probability dilation is bounded by <math>\exp ( \epsilon c )</math> instead of <math>\exp ( \epsilon )</math>,'''<ref name="Dwork, ICALP 2006" />''' i.e., for D<sub>1</sub> and D<sub>2</sub> differing on <math>c</math> items:
    +
In general, ε-differential privacy is designed to protect the privacy between neighboring databases which differ only in one row. This means that no adversary with arbitrary auxiliary information can know if one particular participant submitted his information. However, this is also extendable: if we want to protect databases differing in <math>c</math> rows, this amounts to an adversary with arbitrary auxiliary information being unable to know whether <math>c</math> particular participants submitted their information. This can be achieved because if <math>c</math> items change, the probability dilation is bounded by <math>\exp(\epsilon c)</math> instead of <math>\exp(\epsilon)</math>, i.e., for <math>D_1</math> and <math>D_2</math> differing on <math>c</math> items:
    +
In general, ε-differential privacy is designed to protect the privacy between neighboring databases which differ only in one row. This means that no adversary with arbitrary auxiliary information can know whether one particular participant submitted their information. However, this is also extendable: if we want to protect databases differing in <math>c</math> rows, this amounts to an adversary with arbitrary auxiliary information being unable to know whether <math>c</math> particular participants submitted their information. This can be achieved because if <math>c</math> items change, the probability dilation is bounded by <math>\exp(\epsilon c)</math> instead of <math>\exp(\epsilon)</math>, i.e., for <math>D_1</math> and <math>D_2</math> differing on <math>c</math> items:
   −
Urban computing can also improve public transportation cheaply. A University of Washington group developed OneBusAway, which uses public bus GPS data to provide real-time bus information to riders. Placing displays at bus stops to give information is expensive, but developing several interfaces (apps, website, phone response, SMS) to OneBusAway was comparatively cheap. Among surveyed OneBusAway users, 92% were more satisfied, 91% waited less, and 30% took more trips.<ref>{{cite conference | last=Ferris | first=Brian | last2=Watkins | first2=Kari | last3=Borning | first3=Alan | title=OneBusAway: results from providing real-time arrival information for public transit |conference=CHI '10: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems | publisher=ACM Press | location=New York, New York, USA | year=2010 | isbn=978-1-60558-929-9 | doi=10.1145/1753326.1753597 | pages=1807–1816}}</ref>
+
:<math>\Pr[\mathcal{A}(D_{1})\in S]\leq \exp(\epsilon c)\cdot\Pr[\mathcal{A}(D_{2})\in S]\,\!</math>
   −
Urban computing has a lot of potential to improve urban quality of life by improving the environment people live in, such as by raising air quality and reducing noise pollution. Many chemicals that are undesirable or poisonous are polluting the air, such as PM 2.5, PM 10, and carbon monoxide. Many cities measure air quality by setting up a few measurement stations across the city, but these stations are too expensive to cover the entire city. Because air quality is complex, it’s difficult to infer the quality of air in between two measurement stations.
+
:<math>\Pr[\mathcal{A}(D_{1})\in S]\leq \exp(\epsilon c)\cdot\Pr[\mathcal{A}(D_{2})\in S]\,\!</math>
   −
Urban computing has a lot of potential to improve urban quality of life by improving the environment people live in, for example by raising air quality and reducing noise pollution. Many undesirable or poisonous chemicals pollute the air, such as PM 2.5, PM 10, and carbon monoxide. Many cities measure air quality by setting up a few measurement stations across the city, but these stations are too expensive to cover the entire city. Because air quality is complex, it is difficult to infer the quality of the air between two measurement stations.
+
:<math>\Pr[\mathcal{A}(D_{1})\in S]\leq \exp(\epsilon c)\cdot\Pr[\mathcal{A}(D_{2})\in S]\,\!</math>
    +
Thus setting ε instead to <math>\epsilon/c</math> achieves the desired result (protection of <math>c</math> items). In other words, instead of having each item ε-differentially private protected, now every group of <math>c</math> items is ε-differentially private protected (and each item is <math>(\epsilon/c)</math>-differentially private protected).
    +
Thus setting ε instead to <math>\epsilon/c</math> achieves the desired result (protection of <math>c</math> items). In other words, instead of having each item ε-differentially private protected, now every group of <math>c</math> items is ε-differentially private protected (and each item is <math>(\epsilon/c)</math>-differentially private protected).
   −
Making decisions on transportation policy can also be aided with urban computing. London’s Cycle Hire system is a heavily used bicycle sharing system run by their transit authority. Originally, it required users to have a membership. They changed it to not require a membership after a while, and analyzed data of when and where bikes were rented and returned, to see what areas were active and what trends changed. They found that removing membership was a good decision that increased weekday commutes somewhat and heavily increased weekend usage.<ref>{{cite journal | last=Lathia | first=Neal | last2=Ahmed | first2=Saniul | last3=Capra | first3=Licia | title=Measuring the impact of opening the London shared bicycle scheme to casual users | journal=Transportation Research Part C: Emerging Technologies | publisher=Elsevier BV | volume=22 | year=2012 | issn=0968-090X | doi=10.1016/j.trc.2011.12.004 | pages=88–102}}</ref> Based on the patterns and characteristics of a bicycle sharing system, the implications for data-driven decision supports have been studied for transforming urban transportation to be more sustainable.<ref name="Xie2018Bike">{{cite journal |last1=Xie |first1=Xiao-Feng |last2=Wang |first2=Zunjing |title=Examining travel patterns and characteristics in a bikesharing network and implications for data-driven decision supports: Case study in the Washington DC area |journal=Journal of Transport Geography |volume=71 |date=2018 |pages=84–102 |doi=10.1016/j.jtrangeo.2018.07.010|arxiv=1901.02061 |bibcode=2019arXiv190102061X }}</ref>
+
因此,将 ε 设置为 epsilon/c 可以达到预期的结果(c 项的保护)。换句话说,取代了每个条目 ε- 差别私有保护,现在每组 c 条目都是 ε- 差别私有保护(每个条目 ε/c)-差别私有保护)。
   −
Various ways of adding more sensors to the cityscape have been researched, including Copenhagen wheels (sensors mounted on bike wheels and powered by the rider) and car-based sensors. While these work for carbon monoxide and carbon dioxide, aerosol measurement stations aren’t portable enough to move around.
+
== ε-differentially private mechanisms ==
Since differential privacy is a probabilistic concept, any differentially private mechanism is necessarily randomized. Some of these, like the Laplace mechanism, described below, rely on adding controlled noise to the function that we want to compute. Others, like the [[Exponential mechanism (differential privacy)|exponential mechanism]]<ref>[http://research.microsoft.com/pubs/65075/mdviadp.pdf F. McSherry and K. Talwar. Mechanism Design via Differential Privacy. Proceedings of the 48th Annual Symposium of Foundations of Computer Science, 2007.]</ref> and posterior sampling<ref>[https://arxiv.org/abs/1306.1066 Christos Dimitrakakis, Blaine Nelson, Aikaterini Mitrokotsa, Benjamin Rubinstein. Robust and Private Bayesian Inference. Algorithmic Learning Theory 2014]</ref> sample from a problem-dependent family of distributions instead.
=== Sensitivity ===
Let <math>d</math> be a positive integer, <math>\mathcal{D}</math> be a collection of datasets, and <math>f \colon \mathcal{D} \rightarrow \mathbb{R}^d</math> be a function. The ''sensitivity'' <ref name="DMNS06"/> of a function, denoted <math>\Delta f</math>, is defined by
: <math>\Delta f=\max \lVert f(D_1)-f(D_2) \rVert_1,</math>
where the maximum is over all pairs of datasets <math>D_1</math> and <math>D_2</math> in <math>\mathcal{D}</math> differing in at most one element and <math>\lVert \cdot \rVert_1</math> denotes the [[Taxicab geometry|<math>\ell_1</math> norm]].
In the example of the medical database below, if we consider <math>f</math> to be the function <math>Q_i</math>, then the sensitivity of the function is one, since changing any one of the entries in the database causes the output of the function to change by either zero or one.
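This can be checked numerically with a small illustrative sketch (our own Python, using a hypothetical 0/1 column; it is not part of the article): evaluate the partial-sum query on two databases that differ in a single record and take the largest difference.

<syntaxhighlight lang="python">
# Illustrative sketch: the partial-sum query Q_i over a 0/1 column has
# l1-sensitivity 1, because flipping one entry changes any partial sum
# by at most 1.
def partial_sum(column, i):
    """Q_i: sum of the first i entries of the column X."""
    return sum(column[:i])

d1 = [1, 1, 0, 0, 1, 0]   # "Has Diabetes" column of D1
d2 = [1, 1, 0, 0, 0, 0]   # neighbouring database D2: one entry flipped

sensitivity = max(abs(partial_sum(d1, i) - partial_sum(d2, i))
                  for i in range(len(d1) + 1))
print(sensitivity)  # 1
</syntaxhighlight>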
There are techniques (which are described below) using which we can create a differentially private algorithm for functions with low sensitivity.
=== The Laplace mechanism ===
{{See also|Additive noise mechanisms}}
The Laplace mechanism adds Laplace noise (i.e. noise from the [[Laplace distribution]], which can be expressed by probability density function <math>\text{noise}(y)\propto \exp(-|y|/\lambda)\,\!</math>, which has mean zero and standard deviation <math>\sqrt{2} \lambda\,\!</math>). Now in our case we define the output function of <math>\mathcal{A}\,\!</math> as a real valued function (called the transcript output by <math>\mathcal{A}\,\!</math>) as <math>\mathcal{T}_{\mathcal{A}}(x)=f(x)+Y\,\!</math> where <math>Y \sim \text{Lap}(\lambda)\,\!</math> and <math>f\,\!</math> is the original real valued query/function we planned to execute on the database. Now clearly <math>\mathcal{T}_{\mathcal{A}}(x)\,\!</math> can be considered to be a continuous random variable, where

:<math>\frac{\mathrm{pdf}(\mathcal{T}_{\mathcal{A},D_1}(x)=t)}{\mathrm{pdf}(\mathcal{T}_{\mathcal{A},D_2}(x)=t)}=\frac{\text{noise}(t-f(D_1))}{\text{noise}(t-f(D_2))}\,\!</math>
which is at most <math>e^{\frac{|f(D_{1})-f(D_{2})|}{\lambda}}\leq e^{\frac{\Delta(f)}{\lambda}}\,\!</math>. We can consider <math>\frac{\Delta(f)}{\lambda}\,\!</math> to be the privacy factor <math>\epsilon\,\!</math>. Thus <math>\mathcal{T}\,\!</math> follows a differentially private mechanism (as can be seen from [[#&epsilon;-differential privacy|the definition above]]). If we try to use this concept in our diabetes example then it follows from the above derived fact that in order to have <math>\mathcal{A}\,\!</math> as the <math>\epsilon\,\!</math>-differential private algorithm we need to have <math>\lambda=1/\epsilon\,\!</math>. Though we have used Laplace noise here, other forms of noise, such as the Gaussian Noise, can be employed, but they may require a slight relaxation of the definition of differential privacy.<ref name="Dwork, ICALP 2006" />
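A minimal sketch of the Laplace mechanism in Python follows (the function and parameter names are ours, and the example query is hypothetical); it simply adds Laplace noise with scale λ = Δ(f)/ε to the true answer:

<syntaxhighlight lang="python">
import numpy as np

def laplace_mechanism(true_answer, sensitivity, epsilon):
    """Release true_answer + Lap(sensitivity / epsilon) noise.

    With scale lambda = sensitivity / epsilon this satisfies
    epsilon-differential privacy for any query whose l1-sensitivity
    is at most `sensitivity`.
    """
    scale = sensitivity / epsilon
    return true_answer + np.random.laplace(loc=0.0, scale=scale)

# Hypothetical usage: release the number of diabetics in D1 with epsilon = 0.5.
d1 = [1, 1, 0, 0, 1, 0]
print(laplace_mechanism(sum(d1), sensitivity=1, epsilon=0.5))
</syntaxhighlight>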
    +
According to this definition, differential privacy is a condition on the release mechanism (i.e., the trusted party releasing information ''about'' the dataset) and not on the dataset itself. Intuitively, this means that for any two datasets that are similar, a given differentially private algorithm will behave approximately the same on both datasets. The definition gives a strong guarantee that presence or absence of an individual will not affect the final output of the algorithm significantly.
    +
For example, assume we have a database of medical records <math>D_1</math> where each record is a pair ('''Name''', '''X'''), where <math>X</math> is a [[Boolean algebra|Boolean]] denoting whether a person has diabetes or not. For example:
{| class="wikitable" style="margin-left: auto; margin-right: auto; border: none;"
 +
|-
 +
! Name !! Has Diabetes (X)
 +
|-
 +
| Ross
 +
|| 1
 +
|-
 +
| Monica
 +
|| 1
 +
|-
 +
| Joey
 +
|| 0
 +
|-
 +
| Phoebe
 +
|| 0
 +
|-
 +
| Chandler
 +
|| 1
 +
|-
 +
| Rachel
 +
|| 0
 +
|}
Now suppose a malicious user (often termed an ''adversary'') wants to find whether Chandler has diabetes or not. Suppose he also knows in which row of the database Chandler resides. Now suppose the adversary is only allowed to use a particular form of query <math>Q_i</math> that returns the partial sum of the first <math>i</math> rows of column <math>X</math> in the database. In order to find Chandler's diabetes status the adversary executes <math>Q_5(D_1)</math> and <math>Q_4(D_1)</math>, then computes their difference. In this example, <math>Q_5(D_1) = 3</math> and <math>Q_4(D_1) = 2</math>, so their difference is 1. This indicates that the "Has Diabetes" field in Chandler's row must be 1. This example highlights how individual information can be compromised even without explicitly querying for the information of a specific individual.
Continuing this example, if we construct <math>D_2</math> by replacing (Chandler, 1) with (Chandler, 0) then this malicious adversary will be able to distinguish <math>D_2</math> from <math>D_1</math> by computing <math>Q_5 - Q_4</math> for each dataset. If the adversary were required to receive the values <math>Q_i</math> via an <math>\epsilon</math>-differentially private algorithm, for a sufficiently small <math>\epsilon</math>, then he or she would be unable to distinguish between the two datasets.
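The sketch below (illustrative Python, not part of the article; it inlines a small Laplace helper like the one sketched earlier) contrasts the differencing attack on exact answers with the same attack against differentially private answers:

<syntaxhighlight lang="python">
import numpy as np

def laplace_mechanism(true_answer, sensitivity, epsilon):
    return true_answer + np.random.laplace(0.0, sensitivity / epsilon)

def q(column, i):
    """Partial-sum query Q_i."""
    return sum(column[:i])

d1 = [1, 1, 0, 0, 1, 0]   # Chandler is row 5 and has value 1

# Exact answers leak Chandler's value deterministically:
print(q(d1, 5) - q(d1, 4))                                   # always 1

# With an epsilon-DP release of each answer, the difference is dominated by
# noise (scale 1/epsilon per query), so 0 and 1 can no longer be told apart
# with confidence:
eps = 0.1
noisy = laplace_mechanism(q(d1, 5), 1, eps) - laplace_mechanism(q(d1, 4), 1, eps)
print(noisy)
</syntaxhighlight>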
===Randomized response ===
{{See also|Local differential privacy}}
A simple example, especially developed in the [[social science]]s,<ref>{{cite journal |last=Warner |first=S. L. |date=March 1965 |title=Randomised response: a survey technique for eliminating evasive answer bias |jstor=2283137 |journal=[[Journal of the American Statistical Association]] |publisher=[[Taylor & Francis]] |volume=60 |issue=309 |pages=63–69 |doi= 10.1080/01621459.1965.10480775|pmid=12261830 }}</ref> is to ask a person to answer the question "Do you own the ''attribute A''?", according to the following procedure:
# [[Coin flipping|Toss a coin]].
# If heads, then toss the coin again (ignoring the outcome), and answer the question honestly.
# If tails, then toss the coin again and answer "Yes" if heads, "No" if tails.
(The seemingly redundant extra toss in the first case is needed in situations where just the ''act'' of tossing a coin may be observed by others, even if the actual result stays hidden.) The confidentiality then arises from the [[Falsifiability|refutability]] of the individual responses.
But, overall, these data with many responses are significant, since people who do not have the ''attribute A'' give a positive response with probability one quarter, while people who actually possess it do so with probability three quarters.
Thus, if ''p'' is the true proportion of people with ''A'', then we expect to obtain (1/4)(1-''p'') + (3/4)''p'' = (1/4) + ''p''/2 positive responses. Hence it is possible to estimate ''p''.
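A small simulation (our own illustrative Python, with a made-up true proportion) shows the coin-flipping protocol and how ''p'' is recovered from the observed fraction of "Yes" answers:

<syntaxhighlight lang="python">
import random

def randomized_response(has_attribute):
    """One respondent's answer under the coin-flipping protocol."""
    if random.random() < 0.5:            # first coin heads: answer honestly
        return has_attribute
    return random.random() < 0.5         # first coin tails: random "Yes"/"No"

# Simulate a population whose true proportion with attribute A is 0.3 (made up).
true_p, n = 0.3, 100_000
answers = [randomized_response(random.random() < true_p) for _ in range(n)]
yes_fraction = sum(answers) / n

# E[yes_fraction] = 1/4 + p/2, so invert the relation to estimate p.
p_hat = 2 * (yes_fraction - 0.25)
print(p_hat)   # close to 0.3
</syntaxhighlight>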
In particular, if the ''attribute A'' is synonymous with illegal behavior, then answering "Yes" is not incriminating, insofar as any person, whatever their true status, has some probability of giving a "Yes" response.
Although this example, inspired by [[randomized response]], might be applicable to [[Microdata (statistics)|microdata]] (i.e., releasing datasets with each individual response), by definition differential privacy excludes microdata releases and is only applicable to queries (i.e., aggregating individual responses into one result) as this would violate the requirements, more specifically the plausible deniability that a subject participated or not.<ref>Dwork, Cynthia. "A firm foundation for private data analysis." Communications of the ACM 54.1 (2011): 86–95, supra note 19, page 91.</ref><ref>Bambauer, Jane, Krishnamurty Muralidhar, and Rathindra Sarathy. "Fool's gold: an illustrated critique of differential privacy." Vand. J. Ent. & Tech. L. 16 (2013): 701.</ref>
=== Stable transformations ===
A transformation <math>T</math> is <math>c</math>-stable if the [[Hamming distance]] between <math>T(A)</math> and <math>T(B)</math> is at most <math>c</math>-times the Hamming distance between <math>A</math> and <math>B</math> for any two databases <math>A,B</math>. Theorem 2 of McSherry's PINQ paper<ref name="PINQ"/> asserts that if there is a mechanism <math>M</math> that is <math>\epsilon</math>-differentially private, then the composite mechanism <math>M\circ T</math> is <math>(\epsilon \times c)</math>-differentially private.
This could be generalized to group privacy, as the group size could be thought of as the Hamming distance <math>h</math> between <math>A</math> and <math>B</math> (where <math>A</math> contains the group and <math>B</math> doesn't). In this case <math>M\circ T</math> is <math>(\epsilon \times c \times h)</math>-differentially private.
==Other notions of differential privacy==
Since differential privacy is considered to be too strong or weak for some applications, many versions of it have been proposed.<ref name="DP19"/> The most widespread relaxation is (ε, δ)-differential privacy,<ref name="DKMMN06"/> which weakens the definition by allowing an additional small δ density of probability on which the upper bound ε does not hold.
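For instance, the Gaussian mechanism satisfies (ε, δ)-differential privacy rather than pure ε-differential privacy. The sketch below is our own illustration (names and numbers are hypothetical), using a standard calibration from the differential privacy literature that is stated for ε < 1:

<syntaxhighlight lang="python">
import math
import numpy as np

def gaussian_mechanism(true_answer, l2_sensitivity, epsilon, delta):
    """(epsilon, delta)-DP release of a numeric query using Gaussian noise.

    Standard calibration, assumed here for 0 < epsilon < 1:
    sigma >= sqrt(2 * ln(1.25 / delta)) * l2_sensitivity / epsilon.
    """
    sigma = math.sqrt(2 * math.log(1.25 / delta)) * l2_sensitivity / epsilon
    return true_answer + np.random.normal(loc=0.0, scale=sigma)

# Hypothetical usage: release a count (l2-sensitivity 1) with a small delta.
print(gaussian_mechanism(3, l2_sensitivity=1, epsilon=0.5, delta=1e-5))
</syntaxhighlight>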
== Adoption of differential privacy in real-world applications ==
{{see also|Implementations of differentially private analyses}}
Several uses of differential privacy in practice are known to date:
* 2008: [[United States Census Bureau|U.S. Census Bureau]], for showing commuting patterns.<ref name="MachanavajjhalaKAGV08"/>
* 2014: [[Google]]'s RAPPOR, for telemetry such as learning statistics about unwanted software hijacking users' settings.<ref name="RAPPOR"/><ref>{{Citation|title=google/rappor|date=2021-07-15|url=https://github.com/google/rappor|publisher=GitHub}}</ref>
* 2015: Google, for sharing historical traffic statistics.<ref name="Eland"/>
* 2016: [[Apple Inc.|Apple]] announced its intention to use differential privacy in [[iOS 10]] to improve its [[Intelligent personal assistant]] technology.<ref>{{cite web|title=Apple - Press Info - Apple Previews iOS 10, the Biggest iOS Release Ever|url=https://www.apple.com/pr/library/2016/06/13Apple-Previews-iOS-10-The-Biggest-iOS-Release-Ever.html|website=Apple|access-date=16 June 2016}}</ref>
* 2017: Microsoft, for telemetry in Windows.<ref name="DpWinTelemetry"/>
* 2019: Privitar Lens is an API using differential privacy.<ref>{{cite web|title=Privitar Lens|url=https://www.privitar.com/privitar-lens|access-date=20 February 2018}}</ref>
* 2020: LinkedIn, for advertiser queries.<ref name="DpLinkedIn"/>
== Public purpose considerations ==
There are several public purpose considerations regarding differential privacy that are important to consider, especially for policymakers and policy-focused audiences interested in the social opportunities and risks of the technology:<ref>{{Cite web|title=Technology Factsheet: Differential Privacy|url=https://www.belfercenter.org/publication/technology-factsheet-differential-privacy|access-date=2021-04-12|website=Belfer Center for Science and International Affairs|language=en}}</ref>

* '''Data Utility & Accuracy.''' The main concern with differential privacy is the tradeoff between data utility and individual privacy. If the privacy loss parameter is set to favor utility, the privacy benefits are lowered (less “noise” is injected into the system); if the privacy loss parameter is set to favor heavy privacy, the accuracy and utility of the dataset are lowered (more “noise” is injected into the system). It is important for policymakers to consider the tradeoffs posed by differential privacy in order to help set appropriate best practices and standards around the use of this privacy preserving practice, especially considering the diversity in organizational use cases. It is worth noting, though, that decreased accuracy and utility is a common issue among all statistical disclosure limitation methods and is not unique to differential privacy. What is unique, however, is how policymakers, researchers, and implementers can consider mitigating against the risks presented through this tradeoff.

* '''Data Privacy & Security.''' Differential privacy provides a quantified measure of privacy loss and an upper bound and allows curators to choose the explicit tradeoff between privacy and accuracy. It is robust to still unknown privacy attacks. However, it encourages greater data sharing, which if done poorly, increases privacy risk. Differential privacy implies that privacy is protected, but this depends very much on the privacy loss parameter chosen and may instead lead to a false sense of security. Finally, though it is robust against unforeseen future privacy attacks, a countermeasure may be devised that we cannot predict.
==See also==
*[[Quasi-identifier]]
*[[Exponential mechanism (differential privacy)]] – a technique for designing differentially private algorithms
*[[k-anonymity]]
*[[Differentially private analysis of graphs]]
*[[Protected health information]]
==References==
{{Reflist|refs=
<ref name="DKMMN06">
Dwork, Cynthia, Krishnaram Kenthapadi, Frank McSherry, Ilya Mironov, and Moni Naor. "Our data, ourselves: Privacy via distributed noise generation." In Advances in Cryptology-EUROCRYPT 2006, pp. 486–503. Springer Berlin Heidelberg, 2006.</ref>

<!-- unused refs  <ref name="CABP13">
Chatzikokolakis, Konstantinos, Miguel E. Andrés, Nicolás Emilio Bordenabe, and Catuscia Palamidessi. "Broadening the scope of Differential Privacy using metrics." In Privacy Enhancing Technologies, pp. 82–102. Springer Berlin Heidelberg, 2013.</ref>

<ref name="HRW11">
Hall, Rob, Alessandro Rinaldo, and Larry Wasserman. "Random differential privacy." arXiv preprint arXiv:1112.2680 (2011).</ref>-->

<ref name="MachanavajjhalaKAGV08">
Ashwin Machanavajjhala, Daniel Kifer, John M. Abowd, Johannes Gehrke, and Lars Vilhuber. "Privacy: Theory meets Practice on the Map". In Proceedings of the 24th International Conference on Data Engineering (ICDE), 2008.</ref>

<ref name="RAPPOR">
Úlfar Erlingsson, Vasyl Pihur, Aleksandra Korolova. [https://dl.acm.org/doi/10.1145/2660267.2660348 "RAPPOR: Randomized Aggregatable Privacy-Preserving Ordinal Response".] In Proceedings of the 21st ACM Conference on Computer and Communications Security (CCS), 2014. {{doi|10.1145/2660267.2660348}}</ref>

<ref name="DMNS06">
[https://link.springer.com/chapter/10.1007%2F11681878_14 Calibrating Noise to Sensitivity in Private Data Analysis] by Cynthia Dwork, Frank McSherry, Kobbi Nissim, Adam Smith. In Theory of Cryptography Conference (TCC), Springer, 2006. {{doi|10.1007/11681878_14}}. The [https://journalprivacyconfidentiality.org/index.php/jpc/article/view/405 full version] appears in Journal of Privacy and Confidentiality, 7 (3), 17-51. {{doi|10.29012/jpc.v7i3.405}}</ref>

<ref name="PINQ">
[http://research.microsoft.com/pubs/80218/sigmod115-mcsherry.pdf Privacy integrated queries: an extensible platform for privacy-preserving data analysis] by Frank D. McSherry. In Proceedings of the 35th SIGMOD International Conference on Management of Data (SIGMOD), 2009. {{doi|10.1145/1559845.1559850}}</ref>

<ref name="Dwork, ICALP 2006">
[http://research.microsoft.com/pubs/64346/dwork.pdf Differential Privacy] by Cynthia Dwork, International Colloquium on Automata, Languages and Programming (ICALP) 2006, p.&nbsp;1–12. {{doi|10.1007/11787006 1}}</ref>

<ref name="DPBook">
[http://www.cis.upenn.edu/~aaroth/Papers/privacybook.pdf The Algorithmic Foundations of Differential Privacy] by Cynthia Dwork and Aaron Roth. Foundations and Trends in Theoretical Computer Science. Vol. 9, no. 3–4, pp.&nbsp;211‐407, Aug. 2014. {{doi|10.1561/0400000042}}</ref>

<ref name="Eland">[https://europe.googleblog.com/2015/11/tackling-urban-mobility-with-technology.html Tackling Urban Mobility with Technology] by Andrew Eland. Google Policy Europe Blog, Nov 18, 2015.</ref>

<ref name="DpWinTelemetry">[https://www.microsoft.com/en-us/research/publication/collecting-telemetry-data-privately/ Collecting telemetry data privately] by Bolin Ding, Jana Kulkarni, Sergey Yekhanin. NIPS 2017.</ref>

<ref name="DpLinkedIn">[https://arxiv.org/abs/2002.05839 LinkedIn's Audience Engagements API: A Privacy Preserving Data Analytics System at Scale] by Ryan Rogers, Subbu Subramaniam, Sean Peng, David Durfee, Seunghyun Lee, Santosh Kumar Kancha, Shraddha Sahay, Parvez Ahammad. arXiv:2002.05839.</ref>

<ref name="DP19">[https://arxiv.org/abs/1906.01337 SoK: Differential Privacies] by Damien Desfontaines, Balázs Pejó. 2019.</ref>
}}
==Further reading==
*[https://desfontain.es/privacy/index.html A reading list on differential privacy]
*[https://journalprivacyconfidentiality.org/index.php/jpc/article/view/404 Abowd, John. 2017. “How Will Statistical Agencies Operate When All Data Are Private?”. Journal of Privacy and Confidentiality 7 (3).] {{doi|10.29012/jpc.v7i3.404}} ([https://www2.census.gov/cac/sac/meetings/2017-09/role-statistical-agency.pdf slides])
* [http://www.jetlaw.org/wp-content/uploads/2018/12/4_Wood_Final.pdf "Differential Privacy: A Primer for a Non-technical Audience"], Kobbi Nissim, Thomas Steinke, Alexandra Wood, [[Micah Altman]], Aaron Bembenek, Mark Bun, Marco Gaboardi, David R. O’Brien, and Salil Vadhan, Harvard Privacy Tools Project, February 14, 2018
* Dinur, Irit and Kobbi Nissim. 2003. Revealing information while preserving privacy. In Proceedings of the twenty-second ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems (PODS '03). ACM, New York, NY, USA, 202-210. {{doi|10.1145/773153.773173}}.
* Dwork, Cynthia, Frank McSherry, Kobbi Nissim, and Adam Smith. 2006. in Halevi, S. & Rabin, T. (Eds.) Calibrating Noise to Sensitivity in Private Data Analysis. Theory of Cryptography: Third Theory of Cryptography Conference, TCC 2006, New York, NY, USA, March 4–7, 2006. Proceedings, Springer Berlin Heidelberg, 265-284, {{doi|10.1007/11681878 14}}.
* Dwork, Cynthia. 2006. Differential Privacy, 33rd International Colloquium on Automata, Languages and Programming, part II (ICALP 2006), Springer Verlag, 4052, 1-12, {{ISBN|3-540-35907-9}}.
* Dwork, Cynthia and Aaron Roth. 2014. The Algorithmic Foundations of Differential Privacy. Foundations and Trends in Theoretical Computer Science. Vol. 9, Nos. 3–4. 211–407, {{doi|10.1561/0400000042}}.
* Machanavajjhala, Ashwin, Daniel Kifer, John M. Abowd, Johannes Gehrke, and Lars Vilhuber. 2008. Privacy: Theory Meets Practice on the Map, International Conference on Data Engineering (ICDE) 2008: 277-286, {{doi|10.1109/ICDE.2008.4497436}}.
* Dwork, Cynthia and Moni Naor. 2010. On the Difficulties of Disclosure Prevention in Statistical Databases or The Case for Differential Privacy, Journal of Privacy and Confidentiality: Vol. 2: Iss. 1, Article 8. Available at: http://repository.cmu.edu/jpc/vol2/iss1/8.
* Kifer, Daniel and Ashwin Machanavajjhala. 2011. No free lunch in data privacy. In Proceedings of the 2011 ACM SIGMOD International Conference on Management of data (SIGMOD '11). ACM, New York, NY, USA, 193-204. {{doi|10.1145/1989323.1989345}}.
* Erlingsson, Úlfar, Vasyl Pihur and Aleksandra Korolova. 2014. RAPPOR: Randomized Aggregatable Privacy-Preserving Ordinal Response. In Proceedings of the 2014 ACM SIGSAC Conference on Computer and Communications Security (CCS '14). ACM, New York, NY, USA, 1054-1067. {{doi|10.1145/2660267.2660348}}.
* Abowd, John M. and Ian M. Schmutte. 2017. Revisiting the economics of privacy: Population statistics and confidentiality protection as public goods. Labor Dynamics Institute, Cornell University, at https://digitalcommons.ilr.cornell.edu/ldi/37/
* Abowd, John M. and Ian M. Schmutte. Forthcoming. An Economic Analysis of Privacy Protection and Statistical Accuracy as Social Choices. American Economic Review, {{arxiv|1808.06303}}
* Apple, Inc. 2016. Apple previews iOS 10, the biggest iOS release ever. Press Release (June 13). https://www.apple.com/newsroom/2016/06/apple-previews-ios-10-biggest-ios-release-ever.html.
* Ding, Bolin, Janardhan Kulkarni, and Sergey Yekhanin. 2017. Collecting Telemetry Data Privately, NIPS 2017.
* http://www.win-vector.com/blog/2015/10/a-simpler-explanation-of-differential-privacy/
* Ryffel, Theo, Andrew Trask, et al. [[arxiv:1811.04017|"A generic framework for privacy preserving deep learning"]]
==External links==
* [https://www.microsoft.com/en-us/research/wp-content/uploads/2016/02/dwork.pdf Differential Privacy] by Cynthia Dwork, ICALP July 2006.
* [http://www.cis.upenn.edu/~aaroth/Papers/privacybook.pdf The Algorithmic Foundations of Differential Privacy] by Cynthia Dwork and Aaron Roth, 2014.
* [http://research.microsoft.com/apps/pubs/default.aspx?id=74339 Differential Privacy: A Survey of Results] by Cynthia Dwork, Microsoft Research, April 2008
* [http://video.ias.edu/csdm/dynamicdata Privacy of Dynamic Data: Continual Observation and Pan Privacy] by Moni Naor, Institute for Advanced Study, November 2009
* [http://simons.berkeley.edu/talks/katrina-ligett-2013-12-11 Tutorial on Differential Privacy] by [[Katrina Ligett]], California Institute of Technology, December 2013
* [http://www.cerias.purdue.edu/news_and_events/events/security_seminar/details/index/j9cvs3as2h1qds1jrdqfdc3hu8 A Practical Beginner's Guide To Differential Privacy] by Christine Task, Purdue University, April 2012
* [https://commondataproject.org/blog/2011/04/27/the-cdp-private-map-maker-v0-2/ Private Map Maker v0.2] on the Common Data Project blog
* [https://research.googleblog.com/2014/10/learning-statistics-with-privacy-aided.html Learning Statistics with Privacy, aided by the Flip of a Coin] by Úlfar Erlingsson, Google Research Blog, October 2014
*[https://www.belfercenter.org/publication/technology-factsheet-differential-privacy Technology Factsheet: Differential Privacy] by Raina Gandhi and Amritha Jayanti, Belfer Center for Science and International Affairs, Fall 2020
[[Category:Differential privacy| ]]
[[Category:Theory of cryptography]]
[[Category:Information privacy]]
<noinclude>
<small>This page was moved from [[wikipedia:en:Differential privacy]]. Its edit history can be viewed at [[差分隐私/edithistory]]</small></noinclude>
    
[[Category:待整理页面]]
 