Data Processing and Analysis
MEANING OF DATA ANALYSIS
In any research, analysis of the data is one of the most crucial tasks, requiring proficient knowledge to handle the data collected as per the pre-decided research design of the project. Analysis is a process that enters into research in one form or another. Whether the research is qualitative or quantitative, even sufficient and valid data will not serve any purpose unless it is carefully processed, scientifically analyzed and interpreted.
DIFFERENCE BETWEEN DATA ANALYSIS, PROCESSING AND
INTERPRETATION
The general understanding is that data analysis and processing are one and the same. However, a number of researchers and authors are of the opinion that they are two distinct steps in the research process, where data processing leads to data analysis. Let us understand the difference between the two in more detail.
Data Processing
Once the data is collected, the following steps are taken to process it into a more measurable and concise form:
a. Editing
In the editing stage, all the raw data collected is checked for errors, omissions and, at times, legibility and consistency as well. This ensures a basic standard in the data collected and facilitates further processing.
b. Coding
Coding refers to the process of assigning numerals or other symbols to answers so that responses can be put into a limited number of categories or classes. Such classes should be appropriate to the research problem under consideration. Coding can be done before or after data collection. In pre-coding, codes are assigned while the questionnaire or interview schedule is being prepared; in post-coding, codes are assigned to the answers after they are collected.
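Post-coding can be illustrated with a short Python sketch. The codebook and the response categories below are hypothetical, invented purely for illustration:

```python
# Hypothetical codebook: each textual answer category gets a numeral,
# so responses fall into a limited number of classes (post-coding).
CODEBOOK = {"litigation": 1, "corporate": 2, "judiciary": 3, "further studies": 4}

def code_responses(responses):
    """Replace each free-text answer with its numeric code."""
    return [CODEBOOK[r.strip().lower()] for r in responses]

coded = code_responses(["Litigation", "Corporate", "Judiciary"])
print(coded)  # [1, 2, 3]
```

Pre-coding would work the same way, except the codebook is fixed while the questionnaire is being designed rather than after collection.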
c. Classification
Once the data is collected, it is divided into homogeneous groups on the basis of common characteristics for further analysis.
d. Tabulation
Tabulation is the process of summarizing raw data and displaying it in compact form (i.e., in the form of statistical tables) for further analysis. In a broader sense, tabulation is an orderly arrangement of data in columns and rows.
Tabulation is essential for the following reasons:
1. It conserves space and reduces explanatory and descriptive statements to a minimum.
2. It facilitates the process of comparison.
3. It facilitates the summation of items and the detection of errors and omissions.
4. It provides the basis for various statistical computations.
In relatively large inquiries, mechanical or computer tabulation may be used if other factors are favorable and the necessary facilities are available.
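The tabulation steps above can be sketched in a few lines of Python. The ages and class intervals below are invented, echoing the module's own example categories:

```python
from collections import Counter

# Fictitious respondent ages, grouped into the class intervals
# used in the module's univariate table.
ages = [8, 15, 25, 34, 38, 45, 52, 33, 19, 27]

def age_group(age):
    if age <= 10: return "Below 10"
    if age <= 20: return "11-20"
    if age <= 30: return "21-30"
    if age <= 40: return "31-40"
    if age <= 50: return "41-50"
    return "Above 50"

freq = Counter(age_group(a) for a in ages)   # frequency of each class
total = sum(freq.values())
for group, n in freq.items():
    print(f"{group:10} {n:3} {100 * n / total:6.1f}%")
```

This is exactly the frequency-and-percentage layout of the simple table shown below, produced mechanically.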
Types of tables
There are generally two types of tables: simple and complex. They are discussed below.
Simple table / frequency distribution
In a simple table, the different attributes are stated in the left-hand column, and the frequency or extent of occurrence of each of these classes is written in another column.
Univariate example:

| Age of the Respondents | Frequency | Percentage |
|---|---|---|
| Below 10 | 14 | 10.8 |
| 11-20 | 18 | 13.8 |
| 21-30 | 22 | 16.9 |
| 31-40 | 42 | 32.3 |
| 41-50 | 26 | 20.0 |
| Above 50 | 8 | 6.2 |
| Total | 130 | 100.0 |
Complex or cross table
In a complex table, bivariate or multivariate data are used. These tables have become more popular in research presentation in recent years. The following is an example:
| Income (Rupees) | Male, Rural | Male, Urban | Female, Rural | Female, Urban | Total |
|---|---|---|---|---|---|
| Below 100 | 20 | 23 | 8 | 12 | 63 |
| 101-500 | 18 | 30 | 10 | 36 | 94 |
| 501-1000 | 10 | 28 | 5 | 21 | 64 |
| 1001-5000 | 5 | 15 | 2 | 14 | 36 |
| 5001-10000 | 2 | 10 | 0 | 8 | 20 |
| Above 10000 | 1 | 8 | 0 | 5 | 14 |
In the above table, three variables, i.e. income, residence and sex, are being studied and tabulated.
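A cross table like the one above can also be built programmatically. The following is a minimal pure-Python sketch; the records are invented to mirror the income/residence/sex classification:

```python
# Each record carries the three variables being cross-tabulated.
# The data is fictitious, for illustration only.
records = [
    {"income": "Below 100", "sex": "Male",   "residence": "Rural"},
    {"income": "Below 100", "sex": "Female", "residence": "Urban"},
    {"income": "101-500",   "sex": "Male",   "residence": "Urban"},
    {"income": "101-500",   "sex": "Male",   "residence": "Urban"},
]

# Count how many records fall into each (income, sex, residence) cell.
table = {}
for r in records:
    cell = (r["income"], r["sex"], r["residence"])
    table[cell] = table.get(cell, 0) + 1

print(table[("101-500", "Male", "Urban")])  # 2
```

Each key of `table` corresponds to one cell of the cross table; summing over cells gives the row and column totals.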
Preparation of a table
The following guidelines should be kept in mind while preparing a table:
1. Title of the table - give a suitable heading to each table, which should be short and appropriate.
2. Sub-headings and captions - sub-headings must be given to the different columns and rows. Captions are given to the various classifications made, like income, age, sex, etc.
3. Size of the columns - each column must be of an appropriate size, which makes the table look more attractive.
4. Arrangement of items in rows and columns - items must be arranged in one order, e.g. alphabetically, chronologically, etc.
5. Totals - the totals for the different rows and columns must be given.
6. Demarcation of columns - if columns have been divided further into sub-groups, they should be in a suitable order with sub-headings.
7. Footnotes - if there is anything special about the table or its figures that needs to be brought to attention, the same should be mentioned in a footnote.
Data Interpretation
Once the data has been processed and analyzed, the final step in the research process is interpretation of the data. The line between analysis and interpretation is very thin. Through interpretation one understands what the given research findings really mean and what underlying generalization is manifested through the data collected. Interpretation can be descriptive, analytical or theoretical. The data is interpreted from the point of view of the research questions, and the hypothesis is tested. While interpretation is being done, generalizations are drawn. Thus, interpretation consists of the conclusions that the researcher has reached after the data has been processed and analyzed.
TYPES
OF DATA ANALYSIS
Data analysis depends upon the nature of the research that the researcher is undertaking. Types of data analysis vary depending upon whether the research is qualitative or quantitative in nature. In the present module, as stated earlier, we will study the various types of data analysis from the standpoint of quantitative research only.
Descriptive
analysis
According to C. Emory, "descriptive analysis is largely the study of distribution of one variable. This study provides us with profiles of companies, work groups, persons and other subjects on any multiple characteristics such as size, composition, efficiency, preferences."
Example: a researcher is collecting data from various law colleges in India to map the job preferences of students in the final year of LL.B. In such research, job preferences like litigation, corporate practice, further studies, judiciary, etc. become the variable. Statistical tools like percentages and means are used, and the data is then represented through a graph.
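The percentage computation in the example above might look like this in Python; the responses are fictitious:

```python
from collections import Counter

# Invented final-year responses, mirroring the job-preference example.
preferences = ["litigation", "corporate", "litigation", "judiciary",
               "further studies", "corporate", "litigation", "corporate"]

counts = Counter(preferences)
for option, n in counts.most_common():
    print(f"{option:15} {100 * n / len(preferences):5.1f}%")
```

The resulting percentages are what a bar or pie diagram of the variable would display.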
Inferential
analysis
Inferential analysis is concerned with the various tests of significance for testing hypotheses, in order to determine with what validity the data can be said to indicate some conclusion or conclusions. It is mainly on the basis of inferential analysis that the task of interpretation (i.e., the task of drawing inferences and conclusions) is performed.
Illustration: a researcher is studying the access-to-justice system in India, and his hypothesis is that the Indian justice delivery system favors the haves and marginalizes the have-nots. The data is collected from various stages in the delivery system, like police stations, courts of justice, litigants, etc. Once the data is collected and processed, the researcher performs inferential analysis to test the validity of the hypothesis.
STATISTICAL
ANALYSIS OF DATA
Statistics is not merely a device for collecting numerical data but also a means of sound techniques for their handling and analysis, and of drawing valid inferences from them.
Tools
of statistical analysis
There are various statistical tools which are
available for the researcher’s assistance.
1. Measures of central tendency: the term central tendency connotes the average. The most common measures of central tendency are the average or mean, the median, the mode, the geometric mean and the harmonic mean.
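All five measures are available in Python's standard `statistics` module; the observations below are invented for illustration:

```python
import statistics as st

data = [4, 8, 6, 5, 3, 8]  # fictitious observations

print(st.mean(data))            # arithmetic mean
print(st.median(data))          # middle value of the sorted data
print(st.mode(data))            # most frequent value
print(st.geometric_mean(data))  # nth root of the product
print(st.harmonic_mean(data))   # reciprocal of the mean of reciprocals
```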
2. Measures of dispersion: the measure of dispersion or variability is the most common corrective measure for the concept of average. The most common measure is the standard deviation. Others are the mean deviation and the range.
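The three dispersion measures just named can be sketched with the standard library (fictitious data):

```python
import statistics as st

data = [4, 8, 6, 5, 3, 8]  # fictitious observations
mean = st.mean(data)

range_ = max(data) - min(data)                            # range
mean_dev = sum(abs(x - mean) for x in data) / len(data)   # mean deviation
sd = st.pstdev(data)                                      # population standard deviation
print(range_, mean_dev, sd)
```

A small standard deviation relative to the mean indicates that the average is a good representative of the data.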
3. Measures of asymmetry: these are used to describe the shape of a distribution. Skewness indicates the degree of asymmetry of a distribution, while kurtosis is a measure that indicates the degree to which the curve of a frequency distribution is peaked or flat-topped.
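Kurtosis can be computed directly from its moment-based definition (the fourth central moment divided by the squared variance). This is an illustrative sketch; other conventions, such as subtracting 3 to get "excess kurtosis", also exist:

```python
import statistics as st

def kurtosis(data):
    """Moment-based (Pearson) kurtosis: m4 / m2**2.
    About 3.0 for a normal curve; larger values mean a more
    peaked (leptokurtic) distribution, smaller a flatter one."""
    m = st.mean(data)
    n = len(data)
    m2 = sum((x - m) ** 2 for x in data) / n  # variance
    m4 = sum((x - m) ** 4 for x in data) / n  # fourth central moment
    return m4 / m2 ** 2

print(kurtosis([1, 2, 3, 4, 5]))  # flat (uniform-like) data -> well below 3
```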
4. Measures of relationship: the coefficient of correlation is commonly used to measure relationships. It is mostly used for prediction: the higher the degree of correlation, the greater the accuracy with which one can predict a score. Karl Pearson's coefficient of correlation is the most frequently used measure in the case of statistics of variables, whereas Yule's coefficient of association is used in the case of statistics of attributes. The multiple correlation coefficient, partial correlation coefficient, regression analysis, etc. are other important measures often used by a researcher.
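Karl Pearson's coefficient can be computed from its definition. A minimal sketch with made-up paired observations:

```python
import math

def pearson_r(xs, ys):
    """Karl Pearson's coefficient of correlation:
    covariance of x and y divided by the product of their
    standard deviations; ranges from -1 to +1."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

print(pearson_r([1, 2, 3, 4], [2, 4, 6, 8]))  # 1.0 - perfect positive correlation
```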
5. Other measures: index numbers and the analysis of time series are some of the other tools of data analysis. Index numbers are indicators which reflect the relative changes in the level of a certain phenomenon in a given period (called the current period) with respect to its values in some other period (called the base period) selected primarily for this comparison.
6. Statistical software packages: to assist the researcher in quantitative data analysis, various statistical software packages are available for computerized statistical data analysis. Some of them are available in the open-source/public domain, i.e. free of cost, while others are paid, purchased software. They are of great help when analyzing large quantities of data. The two most commonly used packages are SAS (Statistical Analysis System) and SPSS (Statistical Package for the Social Sciences).
ANALYSIS WHEN A HYPOTHESIS EXISTS
When a specific hypothesis has been set down, the major part of the analysis involves getting the appropriate combinations of data and reading them so as to verify or falsify the hypothesis. A hypothesis which is tested for possible rejection is known as the 'null hypothesis'. The null hypothesis is very useful in testing the significance of the difference between an assumed and an observed value.
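Testing an assumed mean against observed values can be sketched with a one-sample t statistic. This is illustrative only: the sample values are invented, and the computed statistic must still be compared with the critical value from a t-table at the chosen significance level:

```python
import statistics as st

def t_statistic(sample, mu0):
    """One-sample t statistic for the null hypothesis that the
    population mean equals mu0. A large |t| (beyond the t-table's
    critical value) leads to rejection of the null hypothesis."""
    n = len(sample)
    return (st.mean(sample) - mu0) / (st.stdev(sample) / n ** 0.5)

# Fictitious observed values tested against an assumed mean of 5.0.
print(t_statistic([5.1, 4.9, 5.3, 5.0, 4.8, 5.2], 5.0))
```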
DIAGRAMMATIC REPRESENTATION
A very convenient and appealing method of data representation is the use of various forms of diagrams. They highlight the salient features of the data in a meaningful way, which makes the data easy to understand. The following are examples of some diagrammatic representations that may be employed in a research report. It may be noted that all the diagrams are fictitious, made only for illustrative purposes here.
a. Graph
In a graph there are two axes, X and Y. The X axis is horizontal and the Y axis is vertical, intersecting the X axis. The point where the intersection occurs is the origin. The independent variable is scaled on the X axis and the dependent one on the Y axis. The following is an illustration: the graph shows the growth of female literacy in India since independence, with the years on the X axis and the rate of growth of female literacy on the Y axis.
b. Bar diagram
Bar diagrams are drawn either vertically or horizontally. Each bar indicates the value of the variable. Illustration: the following bar diagram shows, by way of example, the voter turnout up to the 2010 general election in the state of Delhi. The data is merely for illustration purposes.
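As a rough sketch of the idea, a horizontal bar diagram can even be rendered in plain text; the turnout figures below are fictitious, like the module's own examples:

```python
# Text-only bar diagram sketch: one '#' per percentage point.
# The election years and turnout percentages are invented.
def bar(label, value):
    """Render one horizontal bar for the given label and value."""
    return f"{label} | {'#' * value} {value}%"

turnout = {"1998": 49, "2003": 53, "2008": 58}
for year, pct in turnout.items():
    print(bar(year, pct))
```

In a report, the same data would normally be drawn with charting software rather than text, but the principle (one bar per category, length proportional to value) is identical.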
c. Pie chart
In a pie chart, the data is presented in the form of a circle, with each category occupying a segment whose size is proportional to its share of the data.