The difference between data analysis and big data analysis
First, what is data analysis?
Data analysis refers to the process of analyzing a large amount of collected data with
appropriate statistical methods, extracting useful information and forming conclusions;
that is, the process of studying and summarizing the data in detail.
Data analysis has two sides, "data" and "analysis". On the one hand, it covers
collecting, processing and organizing data; on the other hand, it covers analyzing
that data, extracting valuable information from it and forming conclusions that help
the business.
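The two sides can be illustrated with a small sketch. The records, the field names and the helper functions clean and summarize below are all hypothetical, made up only for illustration; the "data" side drops incomplete rows and converts text to numbers, and the "analysis" side computes one simple summary, the average per region.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical raw records as they might arrive from collection: (region, sales).
raw_records = [
    ("north", "120"),
    ("south", "95"),
    ("north", "  130 "),
    ("south", None),  # missing value
    ("east", "80"),
]


def clean(records):
    """'Data' side: drop incomplete rows and convert text to numbers."""
    cleaned = []
    for region, value in records:
        if value is None:
            continue
        cleaned.append((region, float(value.strip())))
    return cleaned


def summarize(records):
    """'Analysis' side: average sales per region."""
    by_region = defaultdict(list)
    for region, value in records:
        by_region[region].append(value)
    return {region: mean(values) for region, values in by_region.items()}


summary = summarize(clean(raw_records))
print(summary)
```

In real work the summary would feed an analysis report rather than a print statement, and the cleaning rules would depend on the data source.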
The results of data analysis are usually presented in the form of analysis reports.
In a data analysis report, the analysis is the argument and the data is the evidence;
both are indispensable.
The difference between data analysis and big data analysis:
1. Data analysis
As described above, data analysis is the process of applying appropriate statistical
methods to a large amount of collected data in order to extract useful information and
form conclusions; that is, the detailed study and summary of data.
Data analysis requires mastery of both mathematical knowledge and analytical tools.
The mathematical knowledge includes probability theory, mathematical statistics,
multivariate statistical analysis, time series analysis, and data mining; the tools
generally include Excel, SQL, R, Python, etc. A data analyst needs to learn basic data
processing and analysis methods, advanced data analysis and data mining methods (such
as multiple linear regression, Bayesian methods, neural networks, decision trees,
cluster analysis, association rules, time series, support vector machines, ensemble
learning, etc.), and visualization techniques.
2. Big data analysis
Big data analysis deals with big data, that is, collections of data that cannot be
captured, managed, and processed with conventional software tools within an acceptable
time frame. Big data is a massive, fast-growing, and diverse information asset that
requires new processing models in order to deliver stronger decision-making power,
insight and discovery, and process optimization capabilities.
Some people define big data analysis as follows: instead of taking the shortcut of
random sampling, analyze and process all of the data; there is then no need to consider
the distribution of the data, whereas with sampled data one must check whether the
sample is biased and whether it is consistent with the overall population; nor is there
a need for hypothesis testing. This is another difference between big data analysis and
general data analysis.
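The concern about biased samples can be made concrete with a small sketch. The population below is invented for illustration (90,000 small orders of 10 and 10,000 large orders of 100), and the sample size and random seed are arbitrary assumptions. The code compares the average computed from all of the data with the average from a random sample and from a biased sample that only sees the first records.

```python
import random

# Hypothetical population: 90,000 small orders of 10 and 10,000 large orders of 100.
population = [10] * 90_000 + [100] * 10_000


def average(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


# Full-data analysis: every record is used, so there is no sampling error.
full_mean = average(population)

# Random sampling: an estimate that is close to the full result, but not exact.
rng = random.Random(42)  # fixed seed so the run is repeatable
random_sample = rng.sample(population, 1_000)
random_mean = average(random_sample)

# Biased sampling: taking only the first records misses every large order.
biased_sample = population[:1_000]
biased_mean = average(biased_sample)

print(f"all data:      {full_mean:.2f}")
print(f"random sample: {random_mean:.2f}")
print(f"biased sample: {biased_mean:.2f}")
```

With all of the data the average is exactly 19, while the biased sample reports 10. The random sample lands near 19 but still carries some sampling error, which is why sample-based analysis has to check how the sample was drawn.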
The core difference between big data analysis and data analysis is the scale of the
data processed, which in turn calls for different skills from practitioners in the two
fields. The CDA talent competency standard defines data analysts and big data analysts
along five dimensions: theoretical basis, software tools, analysis methods, business
analysis, and visualization.