− | This article is about large collections of data. For the band, see Big Data (band). For the practice of buying and selling personal and consumer data, see Surveillance capitalism.[[File:Hilbert InfoGrowth.png|thumb|right|400px|Non-linear growth of digital global information-storage capacity and the waning of analog storage<ref>{{cite journal|url= http://www.martinhilbert.net/WorldInfoCapacity.html|title= The World's Technological Capacity to Store, Communicate, and Compute Information|volume= 332|issue= 6025|pages= 60–65|journal=Science|access-date= 13 April 2016|bibcode= 2011Sci...332...60H|last1= Hilbert|first1= Martin|last2= López|first2= Priscila|year= 2011|doi= 10.1126/science.1200970|pmid= 21310967|s2cid= 206531385}}</ref>|link=Special:FilePath/Hilbert_InfoGrowth.png]] | + | This article is about large collections of data. For the band, see Big Data (band). For the practice of buying and selling personal and consumer data, see Surveillance capitalism. |
− | '''Big data''' is a field that treats ways to analyze, systematically extract information from, or otherwise deal with [[data set]]s that are too large or complex to be dealt with by traditional [[data processing|data-processing]] [[application software]]. Data with many entries (rows) offer greater [[statistical power]], while data with higher complexity (more attributes or columns) may lead to a higher [[false discovery rate]].<ref>{{Cite journal|last=Breur|first=Tom|date=July 2016|title=Statistical Power Analysis and the contemporary "crisis" in social sciences|journal=Journal of Marketing Analytics |publisher=[[Palgrave Macmillan]]|location=London, England|volume=4 |issue=2–3 |pages=61–65 |doi=10.1057/s41270-016-0001-3 |issn=2050-3318|doi-access=free}}</ref> Big data analysis challenges include [[Automatic identification and data capture|capturing data]], [[Computer data storage|data storage]], [[data analysis]], search, [[Data sharing|sharing]], [[Data transmission|transfer]], [[Data visualization|visualization]], [[Query language|querying]], updating, [[information privacy]], and data source. Big data was originally associated with three key concepts: ''volume'', ''variety'', and ''velocity''.<ref name=":0" /> The analysis of big data presents challenges in sampling, which previously meant that only observations and sampling were possible. Big data often includes data with sizes that exceed the capacity of traditional software to process within an acceptable time and ''value''. | + | '''Big data''' is a field that treats ways to analyze, systematically extract information from, or otherwise deal with [[data set]]s that are too large or complex to be dealt with by traditional [[data processing|data-processing]] [[application software]]. Data with many entries (rows) offer greater [[statistical power]], while data with higher complexity (more attributes or columns) may lead to a higher [[false discovery rate]]. 
Big data analysis challenges include [[Automatic identification and data capture|capturing data]], [[Computer data storage|data storage]], [[data analysis]], search, [[Data sharing|sharing]], [[Data transmission|transfer]], [[Data visualization|visualization]], [[Query language|querying]], updating, [[information privacy]], and data source. Big data was originally associated with three key concepts: ''volume'', ''variety'', and ''velocity''.<ref name=":0" /> The analysis of big data presents challenges in sampling, which previously meant that only observations and sampling were possible. Big data often includes data with sizes that exceed the capacity of traditional software to process within an acceptable time and ''value''. |
− | Big data is a field that treats ways to analyze, systematically extract information from, or otherwise deal with data sets that are too large or complex to be dealt with by traditional data-processing application software. Data with many entries (rows) offer greater statistical power, while data with higher complexity (more attributes or columns) may lead to a higher false discovery rate. Big data analysis challenges include capturing data, data storage, data analysis, search, sharing, transfer, visualization, querying, updating, information privacy, and data source. Big data was originally associated with three key concepts: volume, variety, and velocity. The analysis of big data presents challenges in sampling, which previously meant that only observations and sampling were possible. Big data often includes data with sizes that exceed the capacity of traditional software to process within an acceptable time and value.
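| + | The field of big data studies how to systematically extract, analyze, and process information from data sets that are too large or too complex for traditional data-processing application software to handle. Data with many entries (rows) offer greater statistical power, while data with higher complexity (more attributes or columns) may lead to a higher false discovery rate.<ref>{{Cite journal|last=Breur|first=Tom|date=July 2016|title=Statistical Power Analysis and the contemporary "crisis" in social sciences|journal=Journal of Marketing Analytics |publisher=[[Palgrave Macmillan]]|location=London, England|volume=4 |issue=2–3 |pages=61–65 |doi=10.1057/s41270-016-0001-3 |issn=2050-3318|doi-access=free}}</ref> Big data analysis challenges include capturing data, data storage, data analysis, search, sharing, transfer, visualization, querying, updating, information privacy, and data source. Big data was originally associated with three key concepts: volume, variety, and velocity. Big data analysis poses challenges for sampling, so older techniques could only make observations and take samples. Big data analysis, by contrast, usually involves data volumes beyond what traditional software can process within limited time and performance. |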
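The relationship between data shape, statistical power, and false discoveries can be illustrated with a small calculation. The following is a minimal sketch and not a method taken from the cited sources: it assumes a two-sided one-sample z-test with unit variance, and the function names `z_test_power` and `approx_false_discovery_share`, the effect size of 0.1, the assumed 10 real effects, and the 5% significance level are all hypothetical choices made for illustration. The second function uses the ratio of expected false positives to expected total positives, which only approximates the false discovery rate.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1


def z_test_power(n_rows, effect_size, alpha=0.05):
    """Power of a two-sided one-sample z-test, assuming unit variance."""
    z_crit = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    shift = effect_size * n_rows ** 0.5  # mean of the test statistic
    # Probability that the statistic falls beyond either critical value.
    upper_tail = 1 - _STD_NORMAL.cdf(z_crit - shift)
    lower_tail = _STD_NORMAL.cdf(-z_crit - shift)
    return upper_tail + lower_tail


def approx_false_discovery_share(n_columns, n_true_effects, power, alpha=0.05):
    """Expected false positives divided by expected total positives.

    Illustrative approximation only: assumes each column is tested
    independently and exactly n_true_effects columns carry a real effect.
    """
    n_null = n_columns - n_true_effects
    expected_false = alpha * n_null
    expected_true = power * n_true_effects
    return expected_false / (expected_false + expected_true)


if __name__ == "__main__":
    # More rows: power to detect a small effect (0.1, assumed) goes up.
    for n_rows in (50, 500, 5000):
        print("rows", n_rows, "power", round(z_test_power(n_rows, 0.1), 3))

    # More columns with a fixed number of real effects (10, assumed):
    # a growing share of the "significant" results are false.
    for n_columns in (20, 100, 1000):
        share = approx_false_discovery_share(n_columns, 10, power=0.8)
        print("columns", n_columns, "false-discovery share", round(share, 3))
```

Under these assumptions, power rises as rows are added, while the share of false discoveries rises as attributes without a real effect are added and tested.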
| Current usage of the term ''big data'' tends to refer to the use of [[predictive analytics]], [[user behavior analytics]], or certain other advanced data analytics methods that extract [[Data valuation|value]] from big data, and seldom to a particular size of data set. "There is little doubt that the quantities of data now available are indeed large, but that's not the most relevant characteristic of this new data ecosystem."<ref>{{cite journal |last1=boyd |first1=danah |last2=Crawford |first2=Kate |title=Six Provocations for Big Data |journal=Social Science Research Network: A Decade in Internet Time: Symposium on the Dynamics of the Internet and Society |date=21 September 2011 |doi= 10.2139/ssrn.1926431|s2cid=148610111 |url=http://osf.io/nrjhn/ }}</ref> |