Data Processing and Analysis

 


MEANING OF DATA ANALYSIS

In any research, analysis of the data is one of the most crucial tasks, requiring proficient knowledge to handle the data collected as per the pre-decided research design of the project. Analysis enters into research, in one form or another, through the data. Whether the research is qualitative or quantitative, even sufficient and valid data will serve no purpose unless it is carefully processed, scientifically analyzed and interpreted.

DIFFERENCE BETWEEN DATA ANALYSIS, PROCESSING AND INTERPRETATION

The general understanding is that data analysis and processing are one and the same. However, a number of researchers and authors are of the opinion that they are two very distinct steps in the research process, where data processing leads to data analysis. Let us understand the difference between the two in more detail.

Data Processing

Once the data is collected, the following steps are taken to process it into a more measurable and concise form:

a.       Editing

In the editing stage, all the raw data collected is checked for errors, omissions and, at times, legibility and consistency. This ensures a basic standard in the data collected and facilitates further processing.

b.      Coding

Coding refers to the process of assigning numerals or other symbols to answers so that responses can be put into a limited number of categories or classes. Such classes should be appropriate to the research problem under consideration. Coding can be done before or after collection: in pre-coding, codes are assigned while the questionnaire or interview schedule is being prepared; in post-coding, codes are assigned to the answers after they are collected.
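Post-coding can be sketched in a few lines. The following is a minimal illustration, assuming Python is used for processing; the codebook and the answers are invented for the example:

```python
# Hypothetical codebook mapping raw answers to numeric codes
# (the question and scale are invented for illustration).
CODEBOOK = {
    "strongly agree": 1,
    "agree": 2,
    "neutral": 3,
    "disagree": 4,
    "strongly disagree": 5,
}

def code_response(answer):
    """Return the numeric code for a raw answer, ignoring case and spacing."""
    return CODEBOOK[answer.strip().lower()]

responses = ["Agree", " Neutral ", "STRONGLY AGREE"]
coded = [code_response(r) for r in responses]
```

Normalizing case and whitespace before lookup keeps minor transcription differences from producing different codes for the same answer.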

c.       Classification

Once the data is collected, it is divided into homogeneous groups on the basis of common characteristics for further analysis.

d.      Tabulation

Tabulation is the process of summarizing raw data and displaying the same in compact form (i.e., in the form of statistical tables) for further analysis. In a broader sense, tabulation is an orderly arrangement of data in columns and rows.

Tabulation is essential for the following reasons:

1. It conserves space and reduces explanatory and descriptive statements to a minimum.

2. It facilitates the process of comparison.

 3. It facilitates the summation of items and the detection of errors and omissions.

4. It provides the basis for various statistical computations.

In relatively large inquiries, we may use mechanical or computer tabulation if other factors are favorable and the necessary facilities are available.

Types of tables: There are generally two types of tables, simple and complex. They are discussed below:

 

Simple table/frequency distribution

Under it, the different attributes are stated in the left-hand column, and the frequency or extent of occurrence of each of these classes is written in another column.

 

 

 

 

Univariate

 

Age of the Respondents | Frequency | Percentage
Below 10               |    14     |   10.8
11-20                  |    18     |   13.8
21-30                  |    22     |   16.9
31-40                  |    42     |   32.3
41-50                  |    26     |   20.0
Above 50               |     8     |    6.2
Total                  |   130     |  100.0
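A frequency distribution of this kind can be produced mechanically from the raw data. A minimal sketch, using Python's standard library; the ages are invented for illustration:

```python
from collections import Counter

def age_class(age):
    """Assign an age to the class intervals used in the table above."""
    if age <= 10:
        return "Below 10"
    if age > 50:
        return "Above 50"
    lower = ((age - 1) // 10) * 10 + 1   # 11-20, 21-30, ...
    return f"{lower}-{lower + 9}"

ages = [8, 15, 25, 35, 35, 45, 55]       # hypothetical respondents
freq = Counter(age_class(a) for a in ages)
total = sum(freq.values())
percentages = {k: round(100 * v / total, 1) for k, v in freq.items()}
```

The same counting step underlies both the Frequency and the Percentage columns; only the denominator (the total) differs.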

 

Complex or cross table

In a complex table, bivariate or multivariate data are used. These have become more popular in research representation in recent years. Following is an example:

Income (Rupees) | Male (Rural) | Male (Urban) | Female (Rural) | Female (Urban) | Total
Below 100       |      20      |      23      |       8        |       12       |   63
101-500         |      18      |      30      |      10        |       36       |   94
501-1000        |      10      |      28      |       5        |       21       |   64
1001-5000       |       5      |      15      |       2        |       14       |   36
5001-10000      |       2      |      10      |       0        |        8       |   20
Above 10000     |       1      |       8      |       0        |        5       |   14

 

In the above table, three variables, i.e. income, residence and sex, are being studied and tabulated.

Preparation of a table: Following are certain guidelines to be kept in mind while preparing a table:

1. Title of the table - give a suitable heading to each table; it should be short and appropriate.

2. Sub-headings and captions - sub-headings must be given to the different columns and rows. Captions are given to the various classifications made, like income, age, sex, etc.

3. Size of the columns - each column must have the correct size, which makes the table look more attractive.

4. Arrangement of items in rows and columns - items must be arranged in one order, e.g. alphabetically, chronologically, etc.

5. Totals - the totals of the different columns and rows must be given separately.

6. Demarcation of columns - if columns have been divided further into sub-groups, they should be in a suitable order and have sub-headings.

7. Footnotes - if there is anything special about the table or figures that needs to be brought to attention, it should be mentioned in a footnote.

Data Interpretation

Once the data has been processed and analyzed, the final step in the research process is interpretation of the data. The line between analysis and interpretation is very thin. Through interpretation one understands what the research findings really mean and what underlying generalization is manifested through the data collected. Interpretation can be descriptive, analytical or theoretical. The data is interpreted from the point of view of the research questions, and the hypothesis is tested. While interpretation is being done, generalizations are drawn. Thus, interpretation consists of the conclusions the researcher reaches after the data has been processed and analyzed.

TYPES OF DATA ANALYSIS

Data analysis depends upon the nature of the research the researcher is undertaking. Types of data analysis vary depending upon whether the research is qualitative or quantitative in nature. In the present module, as stated earlier, we will study the various types of data analysis from the standpoint of quantitative research only.

Descriptive analysis

According to C. Emory, "descriptive analysis is largely the study of the distribution of one variable. This study provides us with profiles of companies, work groups, persons and other subjects on any of multiple characteristics such as size, composition, efficiency, preferences."

Ex: The researcher collects data from various law colleges in India to map the job preferences of final-year LL.B. students. In such research, job preferences like litigation, corporate, further studies, judiciary, etc. become the variable. Statistical tools like percentages and means are used, and the data is then represented through a graph.
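Descriptive analysis of that kind reduces to counting and percentages. A sketch with invented survey responses (the counts are hypothetical, chosen only to make the arithmetic visible):

```python
# Hypothetical responses of 100 final-year LL.B. students.
preferences = (["litigation"] * 40 + ["corporate"] * 30 +
               ["further studies"] * 20 + ["judiciary"] * 10)

total = len(preferences)
# Percentage of respondents choosing each option.
distribution = {option: round(100 * preferences.count(option) / total, 1)
                for option in set(preferences)}
```

The resulting `distribution` dictionary holds exactly the figures a percentage column or bar chart would display.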

 

Inferential analysis

Inferential analysis is concerned with the various tests of significance for testing hypotheses in order to determine with what validity data can be said to indicate some conclusion or conclusions. It is mainly on the basis of inferential analysis that the task of interpretation (i.e., the task of drawing inferences and conclusions) is performed.

 

Illustration: The researcher is studying the access to justice system in India, and his hypothesis is that the Indian justice delivery system favors the haves and marginalizes the have-nots. Data is collected from various stages in the delivery system, like police stations, courts and litigants. Once the data is collected and processed, the researcher performs inferential analysis to test the validity of the hypothesis.

STATISTICAL ANALYSIS OF DATA

Statistics is not merely a device for collecting numerical data but also a set of sound techniques for their handling and analysis and for drawing valid inferences from them.

Tools of statistical analysis

There are various statistical tools available for the researcher's assistance.

1. Measures of central tendency: The term central tendency connotes the average. The most common central tendency tools are the average or mean, the median, the mode, the geometric mean and the harmonic mean.

2. Measures of dispersion: A measure of dispersion or variability is the most common corrective measure for the concept of the average. The most common such measure is the standard deviation; others are the mean deviation and the range.

3. Measures of asymmetry: These describe the shape of a distribution. Skewness indicates a distribution's lack of symmetry, while kurtosis indicates the degree to which the curve of a frequency distribution is peaked or flat-topped.

4. Measures of relationship: The coefficient of correlation is commonly used to measure relationships, mostly for prediction: the higher the degree of correlation, the greater the accuracy with which one can predict a score. Karl Pearson's coefficient of correlation is the most frequently used measure in the case of statistics of variables, whereas Yule's coefficient of association is used in the case of statistics of attributes. The multiple correlation coefficient, partial correlation coefficient, regression analysis, etc. are other important measures often used by a researcher.

5. Other measures: Index numbers and the analysis of time series are some of the other tools of data analysis. Index numbers are indicators which reflect the relative changes in the level of a certain phenomenon in a given period, called the current period, with respect to its values in some other period, called the base period, selected primarily for this comparison.

6. Statistical software packages

To assist the researcher in quantitative data analysis, various statistical software packages are available for computerized statistical analysis. Some of them are open source/in the public domain, i.e. free of cost, while others are paid, purchased software. They are of great help when analyzing large quantities of data. Two of the most commonly used packages are SAS (Statistical Analysis System) and SPSS (Statistical Package for the Social Sciences).
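Most of the measures listed above are one-liners in any statistical package. A sketch using Python's standard `statistics` module on invented scores; Karl Pearson's coefficient is computed by hand so that nothing beyond the standard library is assumed:

```python
import statistics

scores = [4, 8, 6, 5, 3, 8, 9, 7]            # hypothetical data

# Measures of central tendency.
mean = statistics.mean(scores)               # arithmetic average
median = statistics.median(scores)           # middle value of the sorted data
mode = statistics.mode(scores)               # most frequent value

# Measures of dispersion.
std_dev = statistics.stdev(scores)           # sample standard deviation
value_range = max(scores) - min(scores)      # largest minus smallest

# Karl Pearson's coefficient of correlation between two variables,
# computed from deviations about the means (paired data are invented).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
mx, my = statistics.mean(x), statistics.mean(y)
cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
r = cov / (sum((a - mx) ** 2 for a in x) ** 0.5
           * sum((b - my) ** 2 for b in y) ** 0.5)
```

A value of `r` near +1 or -1 indicates a strong linear relationship and hence more accurate prediction; a value near 0 indicates little linear relationship.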

ANALYSIS WHEN A HYPOTHESIS EXISTS

When a specific hypothesis has been set down, the major part of analysis involves getting the appropriate combinations of data and reading them so as to verify or falsify the hypothesis. A hypothesis which is tested for possible rejection is known as the 'null hypothesis'. The null hypothesis is very useful in testing the significance of the difference between an assumed and an observed value.
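Testing the difference between an assumed and an observed value is commonly done with a one-sample t-test. A sketch computed by hand on invented sample data:

```python
import statistics

# Null hypothesis: the population mean equals the assumed value.
assumed_mean = 50
sample = [52, 55, 48, 51, 56, 53, 54, 49, 52, 50]    # hypothetical data

n = len(sample)
sample_mean = statistics.mean(sample)
s = statistics.stdev(sample)                          # sample std deviation

# t statistic for a one-sample test of the mean.
t_stat = (sample_mean - assumed_mean) / (s / n ** 0.5)

# With 9 degrees of freedom, the two-tailed critical value at the
# 5% significance level is about 2.262; a larger |t| rejects the null.
reject_null = abs(t_stat) > 2.262
```

Here the observed mean differs enough from the assumed value, relative to the sample's variability, for the difference to be significant at the 5% level.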

 

DIAGRAMMATIC REPRESENTATION

A very convenient and appealing method of data representation is the use of various forms of diagrams. They highlight the salient features of the data in a very meaningful way, which makes the data easy to understand. Following are examples of some of the diagrammatic representations that may be employed in a research report. It may be noted that all the diagrams are fictitious and made only for illustrative purposes here:

a) Graph: A graph has two axes, the X axis and the Y axis. The X axis is horizontal and the Y axis is vertical, intersecting the X axis. The point where the intersection occurs is the origin. The independent variable is scaled on the X axis and the dependent one on the Y axis. Following is an illustration: the graph shows the growth of female literacy in India since independence, with the years on the X axis and the rate of growth of female literacy on the Y axis.

 

b) Bar diagram: Bar diagrams are drawn either vertically or horizontally. Each bar indicates the value of the variable. Illustration: the following bar diagram shows, by way of example, the voter turnout up to the 2010 general election in the state of Delhi. The data is merely for illustration purposes.


c) Pie chart: In a pie chart, the data is presented in the form of a circle, with each category occupying a segment proportional to the size of its data.
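The proportionality behind a pie chart is simple arithmetic: each category's segment angle is its share of 360 degrees. A sketch with invented category totals:

```python
# Hypothetical category totals (e.g. job preferences from a survey).
data = {"Litigation": 40, "Corporate": 30,
        "Further studies": 20, "Judiciary": 10}

total = sum(data.values())
# Each segment's angle is proportional to its share of the total.
angles = {category: round(360 * value / total, 1)
          for category, value in data.items()}
```

The angles necessarily sum to 360 degrees, which is a quick check that no category has been dropped or double-counted.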

