Defining simple surveys
Surveys come in many forms. You can present questions and solicit answers in a large variety of ways. I won't be concerned with understanding how to analyze every possible type of survey; instead, I will try to be a bit more strategic by starting with the simplest possible types of surveys.
The first type of survey to examine is one in which every question requires a boolean response (yes/no, agree/disagree, male/female, and so forth). Think of this as a multiple-choice survey in which each question offers only two mutually exclusive response options, probably the simplest type of survey you can imagine constructing.
When participants take a Web survey, you need to store their answers in a form suitable for later analysis. For analysis purposes, the best way to store survey answers is in a database table dedicated to responses from a particular survey. The survey table should have columns devoted to recording the boolean-valued response for each question (denoted q1 to q3 in the following table):
Table 1. Storing survey answers so they're suitable for later analysis
Opinion poll surveys with binary response options are ideally collected in this format. Surveys with binary data collected in this format are referred to as binary surveys.
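As a rough sketch of the storage layout described above, the following Python snippet creates a dedicated response table with one 0/1 column per question. The table name survey_responses, the id column, and the four rows of answers are assumptions made up for illustration; only the q1 to q3 column names come from the text. It uses an in-memory SQLite database so it runs without any setup.

```python
import sqlite3

# In-memory database so the sketch needs no setup or cleanup.
conn = sqlite3.connect(":memory:")

# One table per survey, one column per question.
# The table name and id column are hypothetical; q1..q3 follow the text.
# The CHECK constraints keep every answer strictly boolean (0 or 1).
conn.execute(
    """
    CREATE TABLE survey_responses (
        id INTEGER PRIMARY KEY,
        q1 INTEGER NOT NULL CHECK (q1 IN (0, 1)),
        q2 INTEGER NOT NULL CHECK (q2 IN (0, 1)),
        q3 INTEGER NOT NULL CHECK (q3 IN (0, 1))
    )
    """
)

# Made-up answers from four participants (not data from the article).
made_up_answers = [(1, 0, 1), (0, 0, 1), (1, 1, 1), (0, 1, 0)]
conn.executemany(
    "INSERT INTO survey_responses (q1, q2, q3) VALUES (?, ?, ?)",
    made_up_answers,
)

# Read the stored responses back in insertion order.
rows = conn.execute(
    "SELECT q1, q2, q3 FROM survey_responses ORDER BY id"
).fetchall()

# Because answers are coded 0/1, SUM gives the number of "yes" answers.
yes_counts = conn.execute(
    "SELECT SUM(q1), SUM(q2), SUM(q3) FROM survey_responses"
).fetchone()

print(rows)
print(yes_counts)
```

Coding each answer as 0 or 1 means that simple aggregates such as SUM and AVG translate directly into counts and proportions of "yes" answers, which is one reason this layout is convenient for later analysis.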
A survey that is constructed for the purposes of classifying participants differs from the above in that it requires at least one extra classification field (denoted c1 below) to record, for example, the employment status of the participant (coded as 0 = unemployed and 1 = employed).
Table 2. Extra field allows participants to be classified
Note that records used for medical diagnostic testing are likely to have a similar format.
Surveys with binary data collected in this format are referred to as binary classification surveys.
When the adjective "simple" is used to describe a binary survey, it means that the survey consists of only one binary response per participant, which most people would call a poll. It can also be viewed as the limiting case of a survey.
When the adjective "simple" is used to describe a binary classification survey, this means that the survey consists of only two binary responses per participant, one being a response to the test question q1 and one being a response to the classification question c1.
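To make the shape of a simple binary classification survey concrete, here is a minimal Python sketch that tallies how often each combination of q1 and c1 occurs. The function name cross_tabulate and the six response pairs are hypothetical and not taken from any real survey; the 0 = unemployed, 1 = employed coding follows the example given earlier.

```python
from collections import Counter

# Made-up (q1, c1) pairs, one per participant.
# q1 is the test question; c1 is the classification question
# (for example 0 = unemployed, 1 = employed).
responses = [(1, 1), (0, 1), (1, 0), (1, 1), (0, 0), (1, 1)]


def cross_tabulate(pairs):
    """Count how often each (q1, c1) combination occurs."""
    counts = Counter(pairs)
    # Report all four cells, including any that never occurred.
    return {(q, c): counts[(q, c)] for q in (0, 1) for c in (0, 1)}


table = cross_tabulate(responses)
for c1 in (0, 1):
    print(f"c1={c1}: q1=0 -> {table[(0, c1)]}, q1=1 -> {table[(1, c1)]}")
```

With only two binary fields per participant, the whole data set reduces to these four counts.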
You will find the parameter estimation concepts that I discuss in this article to be useful for analyzing simple binary surveys. In my next article, I focus on concepts and code useful for analyzing simple binary classification surveys and multivariate binary classification surveys.
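As a small illustration of what parameter estimation can look like for a simple binary survey, the sketch below estimates the proportion of "yes" answers in a poll and its standard error. It assumes the responses can be treated as independent yes/no trials with a common probability, which is a standard textbook simplification and not necessarily the exact approach developed later. The function name estimate_proportion and the ten poll answers are made up for illustration.

```python
import math


def estimate_proportion(responses):
    """Return (p_hat, std_err) for a list of 0/1 responses.

    p_hat is the sample proportion of 1s; std_err is the usual
    estimate sqrt(p_hat * (1 - p_hat) / n) under the assumption
    of independent trials with a common probability.
    """
    n = len(responses)
    if n == 0:
        raise ValueError("need at least one response")
    p_hat = sum(responses) / n
    std_err = math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat, std_err


# Made-up poll answers (1 = yes, 0 = no).
poll = [1, 0, 1, 1, 0, 1, 1, 0, 1, 1]
p_hat, std_err = estimate_proportion(poll)
print(f"estimated proportion: {p_hat:.2f}, standard error: {std_err:.3f}")
```

Because the answers are coded as 0s and 1s, the estimate is just the mean of the response column.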
The range of binary surveys as defined here represents a distinct and interesting class of surveys to study. A cornucopia of literature and a palette of analytic techniques are available for analyzing binary data. Binary surveys are also interesting because responses coded as 0s and 1s are written in the native language of hardware-based computing. Fields as diverse as statistics, computer science, physics, medical diagnosis, data compression, and electrical engineering can be treated in a unified manner within the mathematics of binary data analysis and modeling.