Module 2: Data Collection and Sampling Design


MODULE 2: DATA COLLECTION AND SAMPLING DESIGN

Introduction

After identifying your research problem, the next step is to collect appropriate and relevant
data. Data collection is crucial to the success of any investigation or study. If the investigator is
not able to collect enough relevant data, the findings and results of the study will be affected; thus,
conclusions, generalizations, or implications derived from the available data may not be reliable or
valid. Becoming an expert in data collection methods and techniques requires time and effort.
Guidance from an experienced researcher or statistician may help you in working out your data
collection and sampling design.

In this module, you will be introduced to the basic types of data collection methods and
sampling designs/techniques.

Learning Outcomes

At the end of this module, you will be able to:


1. distinguish what good data is
2. identify different sources of data and data collection methods to be used in different
studies
3. construct valid and reliable data collection tools
4. apply the appropriate sampling technique during data collection

Expected Outputs

1. Research Methods, Sampling Design, and Research Instruments


2. Quiz

Date: August 31 – September 11, 2020

This work by Romell A. Ramos is licensed under CC BY-NC-ND 4.0.
To view a copy of this license, visit https://creativecommons.org/licenses/by-nc-nd/4.0 Updated: June 2020

Lesson 1: Sources of Data and Data Collection Methods
Data collection is a methodical process of gathering and analyzing specific information
to give solutions to relevant research questions.

Characteristics of Good Data


Ortega (2017) outlines seven (7) characteristics that define quality data.

1. Accuracy and Precision: This characteristic refers to the exactness of the data. It cannot
have any erroneous elements and must convey the correct message without being
misleading. This accuracy and precision have a component that relates to its intended
use. Without understanding how the data will be consumed, ensuring accuracy and
precision could be off-target or more costly than necessary. For example, accuracy in
healthcare might be more important than in another industry (which is to say, inaccurate
data in healthcare could have more serious consequences) and, therefore, justifiably
worth higher levels of investment.

2. Legitimacy and Validity: Requirements governing data set the boundaries of this
characteristic. For example, on surveys, items such as gender, ethnicity, and nationality
are typically limited to a set of options, and open answers are not permitted. Any answers
other than these would not be considered valid or legitimate based on the survey’s
requirement. This is the case for most data and must be carefully considered when
determining its quality. The people in each department in an organization understand what
data is valid or not to them, so the requirements must be leveraged when evaluating data
quality.

3. Reliability and Consistency: Many systems in today’s environments use and/or collect
the same source data. Regardless of what source collected the data or where it resides,
it cannot contradict a value residing in a different source or collected by a different system.
There must be a stable and steady mechanism that collects and stores the data without
contradiction or unwarranted variance.

4. Timeliness and Relevance: There must be a valid reason to collect the data to justify the
effort required, which also means it has to be collected at the right moment in time. Data
collected too soon or too late could misrepresent a situation and drive inaccurate
decisions.

5. Completeness and Comprehensiveness: Incomplete data is as dangerous as
inaccurate data. Gaps in data collection lead to a partial view of the overall picture to be
displayed. Without a complete picture of how operations are running, uninformed actions
will occur. It’s important to understand the complete set of requirements that constitute a
comprehensive set of data to determine whether or not the requirements are being fulfilled.

6. Availability and Accessibility: This characteristic can be tricky at times due to legal and
regulatory constraints. Regardless of the challenge, though, individuals need the right
level of access to the data to perform their jobs. This presumes that the data exists and is
available for access to be granted.

7. Granularity and Uniqueness: The level of detail at which data is collected is important
because confusion and inaccurate decisions can otherwise occur. Aggregated,
summarized, and manipulated collections of data could offer a different meaning than the
data implied at a lower level. An appropriate level of granularity must be defined to provide
sufficient uniqueness and distinctive properties to become visible. This is a requirement
for operations to function effectively.
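
As a rough illustration of the validity and completeness characteristics above, the following Python sketch checks survey records against a set of permitted options. The field names, the permitted options, and the sample records are hypothetical; they are not taken from Ortega (2017).

```python
# Hypothetical survey rules: each field maps to its permitted options.
ALLOWED = {
    "civil_status": {"single", "married", "widowed", "separated"},
    "year_level": {"1", "2", "3", "4"},
}

def check_record(record):
    """Return a list of data-quality problems found in one survey record."""
    problems = []
    for field, options in ALLOWED.items():
        value = record.get(field)
        if value is None or value == "":
            # A required item was left blank: a completeness problem.
            problems.append(f"{field}: missing (completeness)")
        elif value not in options:
            # The answer is outside the permitted options: a validity problem.
            problems.append(f"{field}: '{value}' not permitted (validity)")
    return problems

# Made-up records for illustration only.
records = [
    {"civil_status": "single", "year_level": "2"},
    {"civil_status": "S", "year_level": "2"},       # answer outside the options
    {"civil_status": "married", "year_level": ""},  # blank item
]

for number, record in enumerate(records, start=1):
    print(number, check_record(record) or "ok")
```

A check of this kind only covers two of the seven characteristics; accuracy, timeliness, and the rest still depend on how and when the data are collected.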

Types of Data
1. Primary Data. These are data collected by the investigator himself/herself for a specific
purpose. For instance, the data collected by an investigator for their own research project is
an example of primary data.
2. Secondary Data. These are data collected by someone else for some other purpose, but
are being utilized by the current investigator for another purpose. For instance, census
data used to analyze the impact of education on career choice and earnings is an
example of secondary data.

Data Collection Tools and Instruments (Bhat, 2020)

1. Interview Method. The interviews conducted to collect quantitative data are more
structured, wherein the researchers ask only a standard set of questions and nothing
more than that. There are three major types of interviews conducted for data collection:

• Telephone interviews: For years, telephone interviews ruled the charts of data
collection methods. However, nowadays, there is a significant rise in conducting video
interviews using the internet, Skype, or similar online video calling platforms.
• Face-to-face interviews: It is a proven technique to collect data directly from the
participants. It helps in acquiring quality data as it provides scope to ask detailed
questions and to probe further to collect rich and informative data. Literacy
requirements of the participant are irrelevant, and face-to-face interviews offer ample
opportunities to collect non-verbal data through observation or to explore complex and
unknown issues. Although it can be an expensive and time-consuming method, the
response rates for face-to-face interviews are often higher.
• Computer-Assisted Personal Interviewing (CAPI): It is a setup similar to
the face-to-face interview, where the interviewer carries a desktop or laptop
at the time of the interview to upload the data obtained from the interview directly into
the database. CAPI saves a lot of time in updating and processing the data and also
makes the entire process paperless, as the interviewer does not carry a bunch of
papers and questionnaires.

2. Survey or Questionnaire Method. Checklists and rating-scale questions make up
the bulk of quantitative surveys, as they help in simplifying and quantifying the attitude or
behavior of the respondents.

• Web-based questionnaire: This is one of the leading and most trusted methods for
internet-based research or online research. In a web-based questionnaire, the respondents receive
an email containing the survey link, clicking on which takes the respondent to a secure
online survey tool from where he/she can take the survey or fill in the survey
questionnaire.
• Mail Questionnaire: In a mail questionnaire, the survey is mailed out to a host of the
sample population, enabling the researcher to connect with a wide range of
audiences. The mail questionnaire typically consists of a packet containing a cover
sheet that introduces the audience to the type of research and the reason why it is
being conducted, along with a prepaid return envelope for sending back the responses.

3. Observation Method. In this method, researchers collect quantitative data through
systematic observations, using techniques such as counting the number of people present
at a specific event at a particular time and venue, or the number of people
attending an event in a designated place. Structured observation is used more often to collect
quantitative than qualitative data.

• Structured observation: In this type of observation method, the researcher
makes careful observations of one or more specific behaviors in a more comprehensive
or structured setting compared to naturalistic or participant observation. In a
structured observation, the researchers, rather than observing everything, focus only
on very specific behaviors of interest. This allows them to quantify the behaviors they are
observing. When the observations require judgment on the part of the observers,
the process is often described as coding, which requires clearly defining a set of target
behaviors.

4. Documents and Records. Document review is a process used to collect data by
reviewing existing documents. It is an efficient and effective way of gathering data, as
documents are manageable and are a practical resource for obtaining qualified data from the
past. Three primary document types are analyzed to collect supporting
quantitative research data.

• Public Records: Under this document review, official, ongoing records of an
organization are analyzed for further research. For example, annual reports, policy
manuals, student activities, game activities in the university, etc.
• Personal Documents: In contrast to public documents, this type of document review
deals with personal accounts of individuals' actions, behavior, health,
physique, etc. For example, the height and weight of the students, the distance students
travel to attend school, etc.
• Physical Evidence: Physical evidence or physical documents deal with previous
achievements of an individual or of an organization in terms of monetary and scalable
growth.

Lesson 2: Sampling Design

Sampling is a statistical procedure that is concerned with the selection of individual
observations. It allows us to make statistical inferences about the population.

Approaches to Determine the Sample Size


1. Using a census for a small population (N ≤ 200). This eliminates sampling error and
provides data on all the members or elements in the population.
2. Using a sample size of a similar study. The disadvantage of using the same method used
by other researchers is the possibility of repeating the same errors that were made in
determining the sample size for that study.
3. Using published tables. (research-advisors.com/tools/SampleSize.htm)
4. Using a formula.
a. http://www.raosoft.com/samplesize.html
b. https://www.surveymonkey.com/mp/sample-size-calculator/
c. http://sphweb.bumc.bu.edu/otlt/mph-modules/bs/bs704_power/BS704_Power_print.html

In using a formula to compute the sample size, the basic information needed is as follows.
a. Margin of error. It is the amount of error that you can tolerate. If 90% of respondents
answer yes, while 10% answer no, you may be able to tolerate a larger amount of
error than if the respondents are split 50-50 or 45-55. A lower margin of error requires
a larger sample size.
b. Confidence Level. It reflects the amount of uncertainty you can tolerate. Suppose that
you have 20 yes-no questions in your survey. With a confidence level of 95%, you
would expect that for one of the questions (1 in 20), the percentage of people who
answer yes would be more than the margin of error away from the true answer. The
true answer is the percentage you would get if you exhaustively interviewed everyone.
A higher confidence level requires a larger sample size.
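
The module does not state which formula to use, so the sketch below assumes Cochran's formula for a proportion with a finite population correction: n0 = z^2 p(1 - p) / e^2, then n = n0 / (1 + (n0 - 1) / N). Here e is the margin of error, z is the normal critical value for the chosen confidence level, p is the expected proportion answering yes (0.5 is the most conservative choice), and N is the population size. The function name and its default values are illustrative only, and the online calculators listed above may use a slightly different formula.

```python
import math
from statistics import NormalDist

def sample_size(population, margin_of_error=0.05, confidence=0.95, p=0.5):
    """Sample size for estimating a proportion (assumed: Cochran's formula
    with a finite population correction)."""
    # Two-sided critical value, e.g. about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Sample size for a very large population.
    n0 = z ** 2 * p * (1 - p) / margin_of_error ** 2
    # Adjust downward for a finite population of the given size.
    n = n0 / (1 + (n0 - 1) / population)
    return math.ceil(n)

print(sample_size(1000))                        # 278
print(sample_size(1000, margin_of_error=0.03))  # lower margin of error
print(sample_size(1000, confidence=0.99))       # higher confidence level
print(sample_size(1000, p=0.9))                 # 90-10 split instead of 50-50
```

Under these assumptions, a lower margin of error or a higher confidence level raises the required sample size, while a lopsided 90-10 split lowers it compared with a 50-50 split, in line with the two points above.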

Sampling Techniques

1. Probability Sampling. It is a sampling technique wherein every member of the population
has a known, nonzero (often equal) chance of being included in the sample.

a. Simple Random Sampling. All members of the population have an equal chance of being included
in the sample. Example: lottery method, random numbers
b. Systematic Random Sampling (with a random start). It selects every kth member of the
population with a starting point determined at random. Example: Selecting every 5th
member of N = 1000 to get a sample of 200. For instance, with a random start at the 2nd
member (chosen from the first 5), we select the 2nd, 7th, 12th, 17th, and so on.
c. Stratified Random Sampling. This is used when the population can be divided into several
smaller non-overlapping groups (strata); the sample is then randomly selected from each
group.
d. Cluster Sampling. Also called area sampling, in which groups or clusters, instead of
individuals, are selected randomly as the sample.

e. Multi-stage Sampling. If the population is too big, two or more sampling techniques may
be used until the desired sample is selected.
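
The probability sampling techniques above can be sketched in Python using only the standard library. This is a minimal illustration under stated assumptions: the population of 1000 numbered members, the three strata, the 20 sections of 50 members, and all function names are hypothetical.

```python
import random

rng = random.Random(2020)          # fixed seed so the sketch is repeatable
population = list(range(1, 1001))  # hypothetical member IDs 1..1000

# a. Simple random sampling: every member has the same chance of selection.
simple = rng.sample(population, 200)

# b. Systematic sampling: every k-th member after a random start in the first k.
def systematic_sample(members, k, rng):
    start = rng.randrange(k)       # 0-based index of the first pick
    return members[start::k]

systematic = systematic_sample(population, 5, rng)

# c. Stratified sampling: draw at random within each non-overlapping stratum.
def stratified_sample(strata, fraction, rng):
    sample = []
    for members in strata.values():
        size = round(len(members) * fraction)
        sample.extend(rng.sample(members, size))
    return sample

strata = {
    "first_year": population[:400],
    "second_year": population[400:700],
    "third_year": population[700:],
}
stratified = stratified_sample(strata, 0.2, rng)

# d. Cluster sampling: pick whole groups at random and keep all their members.
def cluster_sample(clusters, n_clusters, rng):
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [member for name in chosen for member in clusters[name]]

clusters = {f"section_{i}": population[i * 50:(i + 1) * 50] for i in range(20)}
clustered = cluster_sample(clusters, 4, rng)

# e. Multi-stage sampling: pick clusters at random, then sample within each.
multistage = [
    member
    for name in rng.sample(sorted(clusters), 4)
    for member in rng.sample(clusters[name], 10)
]

print(len(simple), len(systematic), len(stratified), len(clustered), len(multistage))
```

Taking the same fraction from every stratum (proportional allocation) is only one possible design choice; a study may instead fix a separate sample size for each stratum.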

2. Non-probability Sampling. It is a sampling technique wherein the sample is determined by set
criteria, purpose, or personal judgment.

a. Purposive or Judgment Sampling. The sample is selected based on predetermined criteria
set by the researcher. Example: To determine the difficulties encountered by students in
the 2017 national achievement test, only the Grade 6 pupils of the said school will be
included as a sample.
b. Convenience or Accidental Sampling. It relies on data collection from population members
who are conveniently available to participate in the study. Facebook polls or questions can
be mentioned as a popular example of convenience sampling.
c. Quota Sampling. It is a non-probability sampling technique in which researchers look for
a specific characteristic in their respondents, and then take a tailored sample that is in
proportion to a population of interest.
d. Snowball Sampling. The samples are determined by referrals made by previous members
of the sample.

Source: questionpro.com
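
Quota sampling, described above, can be sketched as follows. The respondent records, the grouping key, and the quotas are hypothetical. The sketch accepts respondents in the order they are met, so it shows why the result is not a random sample even when the group proportions match the population.

```python
def quota_sample(respondents, key, quotas):
    """Accept respondents in arrival order until each group's quota is filled."""
    counts = {group: 0 for group in quotas}
    sample = []
    for person in respondents:
        group = person[key]
        # Keep the person only if the group has a quota that is not yet full.
        if group in counts and counts[group] < quotas[group]:
            sample.append(person)
            counts[group] += 1
    return sample

# Hypothetical walk-in respondents, in the order they were met.
respondents = [
    {"name": "R1", "year": "first"},
    {"name": "R2", "year": "first"},
    {"name": "R3", "year": "second"},
    {"name": "R4", "year": "first"},
    {"name": "R5", "year": "second"},
    {"name": "R6", "year": "second"},
]
quotas = {"first": 2, "second": 2}

picked = quota_sample(respondents, "year", quotas)
print([person["name"] for person in picked])  # ['R1', 'R2', 'R3', 'R5']
```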

Watch these videos to understand the concept of sampling in statistics.
Simple Learning Pro. (25 November 2015). Types of Sampling Methods (4.1) [Video Clip].
Retrieved 25 June 2020 from https://youtu.be/pTuj57uXWlk
Simple Learning Pro. (26 November 2015). Census, Nonresponse, and Undercoverage (4.2)
[Video Clip]. Retrieved 25 June 2020 from
https://youtu.be/EZrP_av3cmA?list=PL0KQuRyPJoe6KjlUM6iNYgt8d0DwI-IGR

Learning Activities

A. Performance Task (Collaborative Activity)

1. Write an outline of the Research Methodology for your research proposal in
Activity 1. Download the template for Activity 2: Research Method and Sampling Design.
2. In writing the Research Methodology, include the following details: research design,
sample and sampling technique, data collection procedure and instruments. Include
the draft of the research instruments to be used.
3. Submit your output to the UBian LMS using the filename Activity2_Group#. (Example:
Activity2_Group1)
4. Comment on the output of your classmates.

B. Module Assessment
Take the online quiz via UBian LMS. You need to get a score of at least 80% to proceed
to the next module. You are given three (3) attempts for this assessment. If all attempts fail,
please send me an email requesting consideration.

Further Readings
The following resources/references were used to create this independent learning material.
Bhat, A. (2020). Five methods used for quantitative data collection. Retrieved 25 June 2020 from
https://www.questionpro.com/blog/quantitative-data-collection-methods/
Fleetwood, D. (2020). Types of Sampling: Sampling Methods with Examples. Retrieved 25 June
2020 from https://www.questionpro.com/blog/types-of-sampling-for-social-research/
Formplus Blog. (25 June 2020). Primary vs secondary data: 15 key differences & similarities.
Retrieved 25 June 2020 from https://www.formpl.us/blog/primary-secondary-data
Ortega, D. (26 January 2017). Seven characteristics that define quality data. Retrieved 25 June
2020 from https://www.blazent.com/seven-characteristics-define-quality-data/
Simple Learning Pro. (25 November 2015). Types of Sampling Methods (4.1) [Video Clip].
Retrieved 25 June 2020 from https://youtu.be/pTuj57uXWlk
Simple Learning Pro. (26 November 2015). Census, Nonresponse, and Undercoverage (4.2)
[Video Clip]. Retrieved 25 June 2020 from
https://youtu.be/EZrP_av3cmA?list=PL0KQuRyPJoe6KjlUM6iNYgt8d0DwI-IGR

Research Method and Sampling Design

Group Members:

Research Title:

Objectives of the Study:

Research Design:

Sample Size: (describe the sample and the population from which the sample is selected)

Sampling Technique: (discuss how samples will be selected and the criteria of selection)

Data Collection Instruments and Procedures: (Describe the content of the instruments; attach
the draft of the instruments. Identify the sources of data and how data will be collected/extracted)
