
Surveys as a Research Method: The Backbone of Quantitative Research is Within Your Organization's Reach

Data Analysis

Compiling the survey results

Once you have finalized your survey design, it's time to think about describing and formatting the data you will collect.  Your results should end up neatly organized in a table.  Each row (or "case") in it summarizes the answers given by a single respondent.  Each column (or "variable") contains some specific type of data you have opted to collect ("Age", "Gender", "Tenure in the community" and so on). 

Here is an example of several survey questions - and what the answers may look like in your data spreadsheet:

1. What is your gender?

  • Female
  • Male
  • Prefer not to answer

2. Which of the following channels do you use to find information about your neighborhood? (Please select all that apply.)

  • Newspapers
  • Magazines
  • Radio
  • Television
  • Internet (all online sources)
  • Talking to my neighbors



The first column gives you the ID assigned to a particular respondent. The second column shows you the respondent's gender (Question 1).  You will notice that Question 2 has a separate column for each potential answer. That is because "select-all-that-apply" items actually bundle together a number of separate questions that we want to ask:

  • Do you use newspapers to stay on top of your community? (yes/no)
  • Do you use magazines to stay on top of your community? (yes/no)

... and so on.
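If your survey tool exports each respondent's selections as a list, the split into separate yes/no columns might be sketched as below. This is only an illustration in Python: the shortened channel names, the `expand_multi_select` helper and the two sample respondents are made up for the example and are not part of the survey data described here.

```python
# Shortened, hypothetical column names for the answer options of Question 2.
CHANNELS = ["Newspapers", "Magazines", "Radio", "Television", "Internet", "Neighbors"]


def expand_multi_select(selected):
    """Turn one respondent's selections into one 1/0 (yes/no) value per channel."""
    chosen = set(selected)
    return {channel: 1 if channel in chosen else 0 for channel in CHANNELS}


# Made-up responses keyed by respondent ID, for illustration only.
responses = {
    1: ["Newspapers", "Internet"],
    2: ["Radio"],
}

# One row per respondent: the ID column followed by one column per channel.
rows = [{"ID": rid, **expand_multi_select(selected)} for rid, selected in responses.items()]

for row in rows:
    print(row)
```

Each printed row corresponds to one case in the table: the respondent ID, then a 1 for every channel the respondent selected and a 0 for every channel they did not.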

It is a good idea to give the variables in your table suggestive titles - names that will remind you what that column stands for. With larger surveys, however, it may be difficult to remember what all of those names and values mean. That's where a Codebook describing your data organization comes in handy.

Here's what a codebook for the sample questions above may look like:


It is useful to have a special way to mark cases in which respondents have skipped or refused to answer a question.  Here we have used "88" where a person has specifically indicated that they don't want to answer, and "99" where they have skipped the question.  To avoid confusion, it is best to use the same missing-answer values throughout the survey, and to pick values that are not likely to appear as legitimate answers.  If you are asking your respondents about their age in years, for example, "99" may be a valid answer, so that question would need a code well outside the plausible range (for instance "999").
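Missing-value codes need to be set aside before any calculation, or they will be treated as real answers. The sketch below shows the idea in Python, using the "88" and "99" codes from the codebook; the 1-to-5 rating question and its answers are made up for the example.

```python
REFUSED = 88  # respondent indicated they don't want to answer
SKIPPED = 99  # respondent skipped the question
MISSING_CODES = {REFUSED, SKIPPED}


def drop_missing(values):
    """Return only the legitimate answers, leaving out the missing-value codes."""
    return [v for v in values if v not in MISSING_CODES]


# Made-up answers to a hypothetical question rated on a 1-5 scale.
ratings = [4, 5, 99, 3, 88, 4]

valid = drop_missing(ratings)
naive_mean = sum(ratings) / len(ratings)  # wrongly counts 88 and 99 as ratings
clean_mean = sum(valid) / len(valid)      # uses the four real answers only

print("Mean with missing codes left in:", round(naive_mean, 2))
print("Mean of valid answers only:", clean_mean)
```

Leaving the codes in pushes the average far above the top of the scale, which is an easy way to spot that missing values have slipped into an analysis.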


Analyzing your data

There are many software tools that can help you analyze your survey results.  SPSS and SAS are two popular commercial statistics packages. R is a free, open-source alternative that supports a wide variety of advanced statistical tests. If you are looking for a free tool that is less powerful but simpler to use, SOFA Statistics may be helpful.

For small projects and simple data analysis tasks, you can use a spreadsheet application like Microsoft Excel.  Excel 2007 and later versions have an "Analysis ToolPak" add-in which gives easy access to frequently used data analysis functions.

(To make sure your Analysis ToolPak add-in is turned on, go to Excel Options -> Add-Ins, choose "Excel Add-ins" in the "Manage" box, click "Go" and make sure the box next to "Analysis ToolPak" is checked.)

Get Participatory! Involving community members in the analysis process can provide a highly valuable training opportunity. In addition, if you have included their participation in earlier parts of the research process, it makes sense that they should be included in this portion of the work.

Before you perform any analyses on your data, it's useful to get some descriptive statistics (median, mean, mode, range, standard deviation, and so on).

(In Excel 2007, go to the "Data" tab, click on "Data Analysis" and select "Descriptive Statistics")
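If you prefer a scripting language to a spreadsheet, the same descriptive statistics can be computed with Python's built-in `statistics` module. The sketch below is one possible approach; the `hours` values are made up for illustration.

```python
import statistics

# Made-up values: hours spent online per day by ten hypothetical respondents.
hours = [0, 1, 1, 2, 2, 2, 3, 4, 5, 10]

summary = {
    "count": len(hours),
    "mean": statistics.mean(hours),
    "median": statistics.median(hours),
    "mode": statistics.mode(hours),
    "range": max(hours) - min(hours),
    "stdev": statistics.stdev(hours),  # sample standard deviation (n - 1)
}

for name, value in summary.items():
    print(f"{name}: {value}")
```

In this made-up sample the mean (3) is higher than the median (2) because a single respondent reports 10 hours a day. Comparing the two is a quick check for skewed answers.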


Another step in getting acquainted with your data is taking a look at the frequency distribution of some important variables. For instance, in your survey you may ask your respondents how much time (in hours) they spend on the Internet on a typical day. A frequency distribution will tell you how many respondents are never online, how many are online for less than an hour a day, how many spend between 1 and 2 hours on the web - and so on. Many statistical tests assume that your variables are approximately normally distributed (the familiar bell-shaped curve, indicating that few people have very low or very high scores - and the majority of answers are clustered around the average), so it is worth checking the shape of the distribution before you run them.

(In Excel 2007, go to the "Data" tab, click on "Data Analysis" and select "Histogram")
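A frequency distribution like the one described above can also be tallied in a few lines of Python. In the sketch below, the bin boundaries, the `bin_label` helper and the sample answers are assumptions made for the example.

```python
from collections import Counter

# Bin labels in display order; the boundaries are chosen for this example only.
BIN_ORDER = ["never online", "less than 1 hour", "1 to 2 hours", "2 hours or more"]


def bin_label(hours):
    """Assign a number of hours per day to one of the bins above."""
    if hours == 0:
        return "never online"
    if hours < 1:
        return "less than 1 hour"
    if hours < 2:
        return "1 to 2 hours"
    return "2 hours or more"


# Made-up answers from ten hypothetical respondents.
hours_online = [0, 0.5, 1.5, 3, 0, 1, 0.25, 2, 6, 1.5]

frequencies = Counter(bin_label(h) for h in hours_online)

# Print a simple text histogram, one '#' per respondent.
for label in BIN_ORDER:
    print(f"{label:<18}{'#' * frequencies[label]}")
```

The text histogram gives a rough picture of the shape of the distribution, much like the chart produced by Excel's Histogram tool.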

Microsoft Excel can be used to compute correlation and covariance, which allows you to answer questions like "Do people who have longer tenure in the community also report being more involved in the local government?".  You can also perform regression analysis - a more comprehensive way to study relationships between variables.

(Functions CORREL and COVAR - or "Data" tab -> Data Analysis -> Correlation, Covariance, Regression)
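For readers who want to see what a correlation calculation involves, here is a minimal Python sketch of the Pearson correlation coefficient, the statistic Excel's CORREL function returns. The `pearson_r` helper, the tenure values and the involvement scores are made up for illustration.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length lists of numbers."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of the same length with at least 2 values")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    # Note: this divides by zero if either variable is constant.
    return sxy / math.sqrt(sxx * syy)


# Made-up data: years lived in the community and a 1-5 involvement score.
tenure_years = [1, 3, 5, 8, 12, 20]
involvement = [1, 2, 2, 3, 4, 5]

r = pearson_r(tenure_years, involvement)
print(f"r = {r:.2f}")
```

Values of r close to +1 or -1 indicate a strong linear relationship and values near 0 a weak one. A strong correlation by itself does not show that longer tenure causes greater involvement.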

A t-test or analysis of variance (ANOVA) will help you look for differences in group averages: a t-test compares two groups, while ANOVA can compare three or more. You can, for example, use those tests to find out whether ethnic groups tend to differ in reported approval for the local authorities.

(Function TTEST - or "Data" tab -> Data Analysis -> T Test, ANOVA)
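To illustrate what a two-group comparison computes, the Python sketch below calculates the t statistic and degrees of freedom for Welch's t-test, a version of the t-test that does not assume equal variances in the two groups. It is only a partial sketch: Excel's TTEST function returns a probability (p-value), and turning the statistic into a p-value requires the t distribution from a statistics package, which is left out here. The `welch_t` helper and the approval scores are made up for the example.

```python
import math
import statistics


def welch_t(a, b):
    """Return (t statistic, degrees of freedom) for Welch's two-sample t-test."""
    mean_a, mean_b = statistics.mean(a), statistics.mean(b)
    # Squared standard error of each group mean (sample variance / group size).
    se2_a = statistics.variance(a) / len(a)
    se2_b = statistics.variance(b) / len(b)
    t = (mean_a - mean_b) / math.sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (len(a) - 1) + se2_b ** 2 / (len(b) - 1)
    )
    return t, df


# Made-up approval scores (1-5) reported by two hypothetical groups.
group_a = [4, 5, 3, 4, 5, 4]
group_b = [2, 3, 3, 2, 4, 3]

t, df = welch_t(group_a, group_b)
print(f"t = {t:.2f}, df = {df:.1f}")
```

A larger absolute value of t means the difference between the two group averages is large relative to the variation within the groups.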

This is just the tip of the iceberg in the process of survey analysis. If you are serious about in-depth analysis of this work, it might make sense to partner with an individual or organization that has refined quantitative research skills. Still, even small organizations without highly specialized quantitative researchers can accomplish valuable research!
