Sampling theory
Sampling is a process for obtaining relevant information and making inferences about a population by analysing a small group of its members for the purpose of research. It involves selecting a small portion of the aggregate or total population and examining that portion in order to draw inferences about the whole.
Population or Universe – It is the subject matter of research
study. It refers to the entire group or population of something taken into
consideration for the purpose of research. It may be finite or infinite.
Sample – A sample is the portion of the population that is analysed during a research study in order to make estimates or draw inferences about the entire population. It may be defined as a subset of units chosen from the population so as to represent the features or characteristics of the entire population.
Sampling Unit – It refers to one item of a sample. It may be one unit of anything, e.g. one consumer, one company, one state or one city.
Sampling Frame – The list of all the items or units from which the sample is drawn makes up the sampling frame. It consists of a list of all the items in the universe (only in the case of a finite universe, where it is possible to list all items).
Sampling Design – It is a definite plan for obtaining a sample out of a given population or universe, laid down in terms of sampling objectives, population, sampling frame, sample size, sampling unit, data collection etc. It is determined before the data collection step in order to obtain reliable, relevant and adequate information.
There are two ways in which information about a population can be obtained:
· Census Survey – When the entire population or universe is taken into consideration for the purpose of research.
· Sample Survey – When only a part of the population (a sample) is studied.
Sample Size – It is the number of observations that form a sample, i.e. the number of items selected from the entire population for the purpose of research. It is denoted by n. The following points must be kept in mind while selecting a sample size:
· Optimum – It must be optimum in size: neither too large nor too small.
· Representative – It must represent the entire population.
· Reliable – It must meet the parameters of interest of the research study.
Difference Between Census and Sampling
Census and sampling are two methods of collecting survey data about a population, and both are used by many countries. A census is a quantitative research method in which all the members of the population are enumerated. Sampling, on the other hand, is the method widely used in statistical testing, wherein a subset that represents the entire group is selected from the large population.
Census implies complete enumeration of the study objects, whereas sampling connotes enumeration of the subgroup of elements chosen for participation. These two survey methods are often contrasted with each other, so this article sets out the differences between census and sampling in detail.
Difference between census method and sample method:
Basis of difference | Census method | Sample method
1. Items to be studied | Each and every unit of the universe is studied. | Only some of the items, which represent the population, are studied.
2. Suitability | Suitable when the area of investigation is relatively small. | Suitable where the area of investigation is wide.
3. Conclusion | Conclusions are drawn on the basis of the whole universe. | Conclusions are drawn on the basis of a sample.
4. Time | It is a more time-consuming method. | It is a less time-consuming method.
5. Nature of items | Particularly suitable where the items in the population have diverse characteristics. | Particularly suitable when the items in the population are homogeneous.
6. Verification | Verification of the results of the investigation is generally not possible. | Results can be verified by taking out another sample.
7. Nature of method | It is an old method of investigation. | It is a new and practicable method.
8. Number of enumerators | Requires a large number of enumerators. | Does not require a large number of enumerators.
9. Expense | It is more expensive. | It is comparatively less expensive.
10. Error | Sampling error is not present. | Sampling error is present; it depends on the size of the sample.
Sampling Methods/Techniques of Sampling
Probability Sampling – In this sampling method every item in the universe has a known, non-zero probability of being selected for research; in the simplest case this probability is the same for every item. The sample collected through this method is therefore random in nature, and the method is also known as Random Sampling.
Non-Probability Sampling – In this sampling method the probability of an item in the universe being selected is not known and is not the same for every item. The sample collected through this method is therefore not random in nature, and the method is also known as Non-random Sampling.
Probability or Random Sampling Methods:
(1) Simple random sampling – This method involves selecting sampling units randomly out of the sampling frame. A researcher may use the following methods for selecting random samples: the lottery method, random numbers, software etc.
There are two types of simple random sampling:
· SRSWR – Simple random sampling with replacement
· SRSWOR – Simple random sampling without replacement
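The two variants can be illustrated with a minimal Python sketch that uses only the standard library. The frame of 20 numbered units and the function names `srswr` and `srswor` are assumptions made for illustration, not part of any standard API.

```python
import random

def srswr(frame, n, seed=None):
    """Simple random sampling WITH replacement: a unit may be drawn more than once."""
    rng = random.Random(seed)
    return rng.choices(frame, k=n)

def srswor(frame, n, seed=None):
    """Simple random sampling WITHOUT replacement: every drawn unit is distinct."""
    rng = random.Random(seed)
    return rng.sample(frame, n)

# Hypothetical sampling frame of 20 numbered units.
frame = list(range(1, 21))
print(srswr(frame, 5, seed=1))
print(srswor(frame, 5, seed=1))
```

With replacement, the sample can even be larger than the frame; without replacement, it can never exceed the number of units in the frame.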
(2) Stratified sampling – In this method a heterogeneous population is divided into smaller sub-groups, which are called strata. Each stratum is homogeneous within itself with respect to a certain factor or characteristic. Sampling units are randomly selected from every stratum, and together they make up the sample.
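A minimal sketch of stratified sampling is shown below. It assumes proportional allocation (the same fraction is drawn from every stratum), which is only one of several possible allocation rules; the student data and the function name `stratified_sample` are hypothetical.

```python
import random
from collections import defaultdict

def stratified_sample(units, stratum_of, fraction, seed=None):
    """Draw the same fraction at random from every stratum (proportional allocation)."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for unit in units:
        strata[stratum_of(unit)].append(unit)    # group units by stratum label
    sample = []
    for label in sorted(strata):                  # fixed order keeps runs reproducible
        members = strata[label]
        k = max(1, round(len(members) * fraction))
        sample.extend(rng.sample(members, k))     # SRSWOR inside each stratum
    return sample

# Hypothetical population: 60 first-year and 40 second-year students.
students = [(i, "first year") for i in range(60)] + \
           [(i, "second year") for i in range(60, 100)]
sample = stratified_sample(students, lambda s: s[1], fraction=0.1, seed=7)
print(sample)
```

Because every stratum contributes to the sample, no sub-group of the population can be missed by chance, which is the main reason for stratifying.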
(3) Systematic sampling – In this type of sampling the first unit is selected randomly and then every Kth item on the source list is selected and becomes part of the sample. The value of K is determined by:
K = Total no. of units in population / No. of units in sample
The essence of this method is the selection of items from the source list at a specified interval from the randomly selected first unit, hence forming a system for selecting items. The items may be arranged numerically, alphabetically or in increasing or decreasing order before the interval is applied.
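The procedure can be sketched in a few lines of Python. This is an illustration under simple assumptions: the source list is already ordered, the interval K is rounded down to a whole number, and the name `systematic_sample` is hypothetical.

```python
import random

def systematic_sample(source_list, n, seed=None):
    """Pick a random start, then take every K-th unit, with K = N // n."""
    rng = random.Random(seed)
    k = len(source_list) // n      # sampling interval K
    start = rng.randrange(k)       # random position among the first K units
    return [source_list[start + i * k] for i in range(n)]

# Hypothetical source list of 100 numbered units; a sample of 10 gives K = 10.
source_list = list(range(1, 101))
print(systematic_sample(source_list, 10, seed=3))
```

Only the starting point is random; once it is fixed, the rest of the sample is fully determined by the interval.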
(4) Cluster sampling – This method is used where the size of the population is very large. The population is divided into smaller groups called clusters, each of which is heterogeneous within itself so that it resembles the population as a whole. Some clusters are then selected at random, and all items belonging to the selected clusters become part of the sample.
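As a rough sketch, one-stage cluster sampling amounts to drawing whole clusters at random and keeping every unit inside them. The village names, household labels and the function name `cluster_sample` below are hypothetical.

```python
import random

def cluster_sample(clusters, m, seed=None):
    """Select m clusters at random and keep ALL units of each selected cluster."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), m)          # random draw of cluster names
    return {name: list(clusters[name]) for name in chosen}

# Hypothetical clusters: households grouped by village.
clusters = {
    "village A": ["a1", "a2", "a3"],
    "village B": ["b1", "b2"],
    "village C": ["c1", "c2", "c3", "c4"],
    "village D": ["d1", "d2", "d3"],
}
print(cluster_sample(clusters, 2, seed=5))
```

The randomness lies in which clusters are chosen, not in which units are taken from them.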
(5) Area sampling – If the clusters are formed on a geographical basis, the method is termed area sampling.
(6) Multi-stage sampling – In multi-stage sampling, sampling is performed in more than one step or stage. At the first stage, units are selected by some random sampling method, usually SRSWOR or systematic sampling, and at the second stage some units are again selected out of the previously selected units through a suitable method. It can be understood as an extension of the cluster sampling method where, instead of taking every item in a selected group, items are drawn randomly from each selected group to form the sample.
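A two-stage version can be sketched as follows, assuming SRSWOR at both stages. The district names, unit labels and the function name `two_stage_sample` are hypothetical.

```python
import random

def two_stage_sample(clusters, m, k, seed=None):
    """Stage 1: pick m clusters at random. Stage 2: pick up to k units in each."""
    rng = random.Random(seed)
    first_stage = rng.sample(sorted(clusters), m)
    sample = {}
    for name in first_stage:
        units = clusters[name]
        sample[name] = rng.sample(units, min(k, len(units)))   # SRSWOR within cluster
    return sample

# Hypothetical clusters: schools grouped by district.
clusters = {
    "district 1": ["s1", "s2", "s3", "s4", "s5"],
    "district 2": ["s6", "s7", "s8"],
    "district 3": ["s9", "s10", "s11", "s12"],
}
print(two_stage_sample(clusters, m=2, k=2, seed=11))
```

Compared with one-stage cluster sampling, only part of each selected group is examined, which reduces fieldwork at the cost of an additional source of sampling error.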
Non-Probability or Non-Random Sampling Methods
(1) Judgement sampling – In this method, the sampling units are chosen by the researcher on the basis of his or her own judgement. The researcher simply selects the sample which in his or her opinion will be best for the study.
(2) Quota sampling – In this method of sampling, quotas in the form of a reservation or percentage are established for different classes of the population on the basis of age, gender, nationality etc. A sample is then drawn on the basis of these quotas.
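A minimal sketch of how quotas might be filled is given below. It assumes respondents are taken in the order they become available until each quota is met, which is what makes the method non-random; the respondent list, the quotas and the name `quota_sample` are hypothetical.

```python
def quota_sample(stream, group_of, quotas):
    """Accept units in arrival order until the quota for each group is filled."""
    remaining = dict(quotas)
    sample = []
    for unit in stream:
        group = group_of(unit)
        if remaining.get(group, 0) > 0:
            sample.append(unit)
            remaining[group] -= 1
        if not any(remaining.values()):   # every quota has been filled
            break
    return sample

# Hypothetical respondents in the order they were approached.
respondents = [("r1", "F"), ("r2", "M"), ("r3", "M"),
               ("r4", "F"), ("r5", "M"), ("r6", "F")]
print(quota_sample(respondents, lambda r: r[1], {"F": 2, "M": 2}))
```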
(3) Panel sampling – In this method a researcher takes regular surveys from a panel of experts in a particular domain through questionnaires or schedules. The panellists may or may not know about each other during the research process.
(4) Convenience sampling – In convenience sampling, a researcher simply selects the sample and sampling units that are easily available and accessible. No extra effort is made by the researcher, who simply chooses the sample on the basis of convenience.
(5) Snowball sampling – This method is used in cases where the population to be studied is rare, so that it is difficult to find good representative sampling units. In this method the researcher initially selects a sampling unit (a doctor, a musician or a cancer patient, depending upon the study) based on his or her judgement and then takes further sampling units on the basis of directions, advice or referrals provided by the first sampling unit.
The researcher starts by interviewing one person or a small group of people and then asks them for references. The researcher then collects data from the suggested people and asks them for references in turn, and the chain continues until an adequate sample is formed.
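The referral chain can be sketched as a breadth-first walk over a referral list. The referral data, the single starting respondent and the function name `snowball_sample` are assumptions made for illustration.

```python
from collections import deque

def snowball_sample(referrals, seeds, target_size):
    """Interview the seeds, then the people they refer, until the sample is large enough."""
    sample = []
    seen = set(seeds)
    queue = deque(seeds)
    while queue and len(sample) < target_size:
        person = queue.popleft()
        sample.append(person)                      # "interview" this person
        for referred in referrals.get(person, []):
            if referred not in seen:               # avoid contacting anyone twice
                seen.add(referred)
                queue.append(referred)
    return sample

# Hypothetical referrals: who each respondent points the researcher to.
referrals = {"A": ["B", "C"], "B": ["D"], "C": ["D", "E"], "D": [], "E": ["F"]}
print(snowball_sample(referrals, ["A"], 4))   # ['A', 'B', 'C', 'D']
```

The sample depends entirely on the starting respondent and on who refers whom, which is why the method is classed as non-random.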
Sampling error
In statistics, sampling error is incurred when the
statistical characteristics of a population are estimated from a subset, or
sample, of that population. Since the sample does not include all members of
the population, statistics on the sample, such as means and quantiles,
generally differ from the characteristics of the entire population, which are
known as parameters. For example, if one measures the height of a thousand
individuals from a country of one million, the average height of the thousand
is typically not the same as the average height of all one million people in
the country. Since sampling is typically done to determine the characteristics
of a whole population, the difference between the sample and population values
is considered a sampling error.
· Increasing the sample size will reduce this type of error.
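The effect of sample size on sampling error can be checked with a small simulation. The sketch below is an illustration only: the population of 100,000 simulated heights (mean 170 cm, standard deviation 10 cm) and the function name `mean_abs_sampling_error` are assumptions, not data from any real survey.

```python
import random
import statistics

def mean_abs_sampling_error(population, n, repeats, rng):
    """Average absolute gap between the sample mean and the population mean."""
    mu = statistics.fmean(population)
    errors = []
    for _ in range(repeats):
        sample = rng.sample(population, n)          # SRSWOR of size n
        errors.append(abs(statistics.fmean(sample) - mu))
    return statistics.fmean(errors)

rng = random.Random(42)
# Hypothetical population of heights in centimetres.
population = [rng.gauss(170, 10) for _ in range(100_000)]
for n in (10, 100, 1000):
    print(n, round(mean_abs_sampling_error(population, n, 200, rng), 3))
```

The printed error shrinks as n grows. For simple random samples the standard error of the mean falls roughly in proportion to 1/sqrt(n), so quadrupling the sample size about halves the error.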
Sampling Errors – This refers to the inaccuracies or errors in the process of collection, analysis and interpretation of sample data. Errors in a sample survey arise for two reasons:
· Systematic, biased or non-sampling errors – These arise due to the use of faulty procedures and techniques in drawing a sample and a lack of experience in research.
· Unsystematic, unbiased or sampling errors – These arise due to the limitations of the sampling process itself.
Types of Sampling Error
Sample Errors
· Error caused by the act of taking a sample.
· They cause sample results to be different from the results of a census.
· Differences between the sample and the population that exist only because of the observations that happened to be selected for the sample.
· Statistical errors are sample errors.
Non Sample Errors
Non sample errors are not controlled by sample size. There are two types of non sample errors:
· Non Response Error
· Response Error
Non response errors
A non-response error occurs when units selected as part of the sampling procedure do not respond in whole or in part. Non-response becomes a problem when there is a significant difference between those who responded to your survey and those who did not. This may happen for a variety of reasons, including:
· Some people refused to participate. This could be because you are asking for embarrassing information, or information about illegal activities.
· Poorly constructed surveys. For example, a snail mail survey for young adults or a smartphone survey for older adults is likely to lead to a lower response rate for your targeted population.
· Some people simply forgot to return the survey.
· Your survey didn’t reach all members in your sample. For example, email invites might have disappeared into the Spam folder, or the code used in the email may not have rendered properly on certain devices (like cell phones).
· Certain groups were more inclined to answer. For example, people who are more active runners might be more inclined to answer a survey about running than people who aren’t as active in the community.
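The running example can be turned into a short calculation showing how unequal response rates distort a survey result. All figures below (group shares, response rates and weekly distances) are hypothetical, as is the function name `true_and_respondent_mean`.

```python
def true_and_respondent_mean(groups):
    """groups: list of (share_of_sample, response_rate, mean_value) tuples."""
    true_mean = sum(share * value for share, _, value in groups)
    responding = sum(share * rate for share, rate, _ in groups)
    respondent_mean = sum(share * rate * value
                          for share, rate, value in groups) / responding
    return true_mean, respondent_mean

# Hypothetical figures: weekly distance run (km) for two groups of runners.
groups = [
    (0.3, 0.8, 40.0),   # active runners: 30% of the sample, 80% respond
    (0.7, 0.2, 5.0),    # occasional runners: 70% of the sample, 20% respond
]
true_mean, respondent_mean = true_and_respondent_mean(groups)
print(round(true_mean, 1), round(respondent_mean, 1))
```

With these figures the mean over the whole sample is 15.5 km, while the mean among those who actually responded is about 27.1 km. Drawing a larger sample would not remove this gap, because the same response pattern would repeat.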
Response Errors
A response or data error is any systematic bias that occurs during data collection, analysis or interpretation. Sources include:
· Respondent error (e.g. lying, forgetting etc.)
· Interviewer bias
· Recording errors
· Poorly designed questionnaires
· Measurement error
· Errors in coding, tabulating and analysing data
· Lack of trained and qualified investigators
Respondent error
· Respondent gives an incorrect answer, e.g. due to prestige or competence implications, or due to the sensitivity or social undesirability of the question
· Respondent misunderstands the requirements
· Lack of motivation to give an accurate answer
· “Lazy” respondent gives an “average” answer
· Question requires memory/recall
· Proxy respondents are used, i.e. answers are taken from someone other than the respondent
Interviewer bias
· Different interviewers administer a survey in different ways
· Differences occur in the reactions of respondents to different interviewers, e.g. to interviewers of their own sex or own ethnic group
· Inadequate training of interviewers
· Inadequate attention to the selection of interviewers
· The workload for the interviewer is too high
Measurement Error
· The question is unclear, ambiguous or difficult to answer
· The list of possible answers suggested in the recording instrument is incomplete
· Requested information assumes a framework unfamiliar to the respondent
· The definitions used by the survey are different from those used by the respondent (e.g. how many part-time employees do you have?)
Some of the common procedures for controlling non-sampling errors are as under:
- Providing detailed guidelines for data collection and data processing
- Imparting proper training to the field workers and data processing personnel
- Introducing consistency checks
- Performing sample checks
- Carrying out post-census and post-survey checks
- Performing external record checks
- Introducing the scheme of interpenetrating sub-samples
The authors condense their recommendations into a top ten list to reduce survey measurement error:
o Follow basic administrative guidelines
o Clarify the “central players” in the region and nationally, and be certain to consider ways to work with them and reduce the chance of “spoilers”
o Conduct pre-testing for all questionnaires
o Hire relatively large numbers of interviewers, who should be tested in the course of training, while setting high goals and providing rewards for success
o Assign interviewers using interpenetrating sampling techniques
o Consider all potential errors of non-observation, including sampling, coverage and non-response
o Include questions that allow ex post identification of different types of measurement error
o Carefully evaluate whether there are systematic non-response patterns that might affect the interpretation of findings
o Design clear guidelines for filling in missing data, preferably applied by interviewer teams shortly after each day of data collection
o Attempt to compare results with those that might be obtainable from routine statistics or other data sources
Rules that support the idea of sampling (4 rules)
1. Law of statistical regularity.
2. Law of inertia of large numbers.
3. Law of optimization.
4. Law of reliability or law of validity.
Law of statistical regularity: The Law of Statistical Regularity states that a moderately large number of items selected at random from a larger group (technically called the ‘population’) are almost sure, on the whole, to possess the characteristics of the larger group. In other words, if a random sample of large size is taken, the sample will, to a fairly accurate degree, exhibit the characteristics of the entire population. Two points should be kept in mind:
- The sample must be drawn at random. This means that the items constituting the sample are obtained by giving an equal chance of selection to all the items in the population.
- The number of items included in the sample should be sufficiently large. The larger the size of the sample, the more representative the sample will be of the population.
Law of inertia of large numbers: This rule states that as the sample size increases, reliability also increases. The larger the sample drawn from the population/universe, the greater the accuracy or reliability of the results; when the sample is small compared to the population, the reliability is reduced.
For example: to calculate the average marks of 50 BBM students in English, samples of increasing size may be drawn:
Population 50, sample 5
Population 50, sample 10
Population 50, sample 15
Population 50, sample 20
However, this law does not apply in a few cases (for example, when testing a person's blood group, the result from a drop of blood and from a bottle of blood is the same).
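For the example of 50 students, the gain in reliability can be quantified with a standard textbook result that is not stated in the text above: under simple random sampling without replacement, the standard error of the sample mean is sigma * sqrt((N - n) / (n * (N - 1))), where N is the population size, n the sample size and sigma the population standard deviation. The sketch below computes only the multiplier of sigma; the function name `se_factor` is hypothetical.

```python
import math

def se_factor(N, n):
    """Standard error of the sample mean under SRSWOR, as a multiple of sigma."""
    return math.sqrt((N - n) / (n * (N - 1)))

# Population of 50 students, samples of 5, 10, 15 and 20, and a full census (n = 50).
for n in (5, 10, 15, 20, 50):
    print(n, round(se_factor(50, n), 3))
```

The multiplier falls steadily as the sample grows and reaches zero when n equals N, which matches the idea that a census carries no sampling error.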
Principle of optimization: This principle states that with the help of a sample one must be able to get optimum results with maximum efficiency and minimum cost, i.e. sampling should give maximum information about the population using minimum effort.
For example: average marks of students in a class
Census = 70%
Sample = 68%
As per this rule, the sample gives maximum information about the population but not the total information.
Principle of validity: A sampling design is termed valid only if valid tests can be derived from it, i.e. the sample should give us valid information about the population. Random or probability sampling satisfies this.
These are the principles of sampling in statistics.