Sampling theory

Sampling is a process for obtaining information and making inferences about a population by analysing a small group of units drawn from it. It involves selecting a small portion of the aggregate or total population and examining that portion in order to draw inferences about the whole.
Population or Universe – It is the subject matter of the research study. It refers to the entire group of items taken into consideration for the purpose of research. It may be finite or infinite.
Sample – A sample is that portion of the population which is critically analysed during a research study in order to make estimations or draw inferences about the entire population. A sample may be defined as a set of units chosen from the entire population which represents the features or characteristics of the entire population.
Sampling Unit – It refers to one item of a sample. It may be one unit of anything, e.g. one consumer, one company, one state, one city etc.
Sampling Frame – It is the list of all the items or units in the universe from which the sample is drawn. A frame can be prepared only for a finite universe, where it is possible to list all the items.
Sampling Design – It is a definite plan for obtaining a sample out of a given population, laid down in terms of sampling objectives, population, sampling frame, sample size, sampling unit, data collection etc. It is determined before data collection begins in order to obtain reliable, relevant and adequate information.
There are two ways in which information about a population can be obtained:
·         Census Survey – When the entire population or universe is taken into consideration for the purpose of research.
·         Sample Survey – When only a part of the population (a sample) is studied.
Sample Size – It is the number of observations that form a sample, i.e. the number of items selected from the entire population for the purpose of research. It is denoted by n. The following points must be kept in mind while selecting a sample size:
·         Optimum – It must be optimum in size: neither too large nor too small.
·         Representative – It must represent the entire population.
·         Reliable – It must meet the parameters of interest of the research study.
Difference Between Census and Sampling
Census and sampling are two methods of collecting survey data about a population, and both are used by many countries. A census is a quantitative research method in which all the members of the population are enumerated. Sampling, on the other hand, is the method widely used in statistical testing, in which a subset that represents the entire group is selected from the larger population.
A census implies complete enumeration of the study objects, whereas sampling means enumerating only the subgroup of elements chosen for participation. The two survey methods are often contrasted with each other, and the comparison below sets out the differences between them in detail.
Difference between census method and sample method:

| Basis of difference | Census method | Sample method |
| --- | --- | --- |
| 1. Items to be studied | Each and every unit of the universe is studied. | Only some of the items, chosen to represent the population, are studied. |
| 2. Suitability | Suitable when the area of investigation is relatively small. | Suitable when the area of investigation is wide. |
| 3. Conclusion | Conclusions are drawn on the basis of the whole universe. | Conclusions are drawn on the basis of a sample. |
| 4. Time | More time-consuming. | Less time-consuming. |
| 5. Nature of items | Particularly suitable where the items in the population have diverse characteristics. | Particularly suitable when the items in the population are homogeneous. |
| 6. Verification | Verification of the results is generally not possible. | Results can be verified by drawing another sample. |
| 7. Nature of method | An old method of investigation. | A newer and more practicable method. |
| 8. Number of enumerators | Requires a large number of enumerators. | Does not require a large number of enumerators. |
| 9. Expense | More expensive. | Comparatively less expensive. |
| 10. Sampling error | Not present. | Present; its magnitude depends on the size of the sample. |

Sampling Methods/Techniques of Sampling
Sampling methods fall into two categories:
Probability Sampling – In this sampling method every item in the universe has a known, non-zero chance of being selected for research; in the simplest case the chance is the same for every item. Because selection is left to chance rather than to the researcher, the sample is random in nature, and the method is also known as Random Sampling.
Non-Probability Sampling – In this sampling method the chance of selection is not known for each item, because items are chosen by judgement, convenience or availability rather than by a random mechanism. The sample collected through this method is therefore not random in nature, and the method is also known as Non-random Sampling.

Probability or Random Sampling Methods:
(1) Simple random sampling – This method involves selecting sampling units at random from the sampling frame. A researcher may use the following methods for selecting random samples: the lottery method, random numbers, software etc.
There are two types of random sampling:
·         SRSWR – Simple random sampling with replacement
·         SRSWOR – Simple random sampling without replacement
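The difference between the two types can be sketched in a few lines of Python. This is only an illustration: the frame of 20 numbered units, the sample size of 8 and the function names `srswr` and `srswor` are assumptions made for the example, not part of the original notes.

```python
import random

def srswr(frame, n, rng):
    """Simple random sampling with replacement: a unit can be drawn more than once."""
    return [rng.choice(frame) for _ in range(n)]

def srswor(frame, n, rng):
    """Simple random sampling without replacement: each unit is drawn at most once."""
    return rng.sample(frame, n)

rng = random.Random(42)            # fixed seed so the sketch is repeatable
frame = list(range(1, 21))         # hypothetical sampling frame: units numbered 1 to 20
with_replacement = srswr(frame, 8, rng)
without_replacement = srswor(frame, 8, rng)
print("SRSWR :", with_replacement)
print("SRSWOR:", without_replacement)
```

Under SRSWR the same unit may appear twice in the output, whereas the SRSWOR result never contains a repeat.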
(2) Stratified sampling – In this method a heterogeneous population is divided into smaller sub-groups called strata. Each stratum is internally homogeneous with respect to a certain factor or characteristic. Items or sampling units are then randomly selected from every stratum, and together these make up the sample.
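A minimal sketch of stratified sampling follows. It assumes proportional allocation (the same fraction drawn from every stratum), which is one common choice and not something the notes prescribe; the student data and the function name `stratified_sample` are hypothetical.

```python
import random
from collections import defaultdict

def stratified_sample(units, stratum_of, fraction, rng):
    """Draw the same fraction at random from every stratum (proportional allocation)."""
    strata = defaultdict(list)
    for unit in units:
        strata[stratum_of(unit)].append(unit)
    sample = []
    for members in strata.values():
        k = max(1, round(len(members) * fraction))  # at least one unit per stratum
        sample.extend(rng.sample(members, k))
    return sample

rng = random.Random(1)
# Hypothetical population: (student id, year of study); the year is the stratifying factor.
students = [(i, "first") for i in range(60)]
students += [(i, "second") for i in range(60, 90)]
students += [(i, "third") for i in range(90, 100)]
sample = stratified_sample(students, lambda s: s[1], 0.1, rng)
counts = {year: sum(1 for s in sample if s[1] == year)
          for year in ("first", "second", "third")}
print(counts)
```

Because each stratum is sampled separately, even the smallest stratum is guaranteed to be represented, which a simple random sample of the same size cannot promise.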
(3) Systematic sampling – In this type of sampling the first unit is selected randomly from among the first K items on the source list, and then every Kth item after it is selected to become part of the sample. The value of K is determined by:
K = Total no. of units in population / No. of units in sample
For example, to draw 10 units from a list of 100, K = 100/10 = 10. The essence of this method is the selection of items from the source list at a fixed interval from the randomly selected first unit, which forms a system for selecting items. The items may be arranged numerically, alphabetically, or in increasing or decreasing order before the formula is applied.
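The procedure can be sketched as below. The list of 100 units and the function name `systematic_sample` are assumptions for illustration, and rounding K down when the population size is not an exact multiple of the sample size is a simplification chosen here.

```python
import random

def systematic_sample(source_list, n, rng):
    """Pick a random start within the first K units, then take every K-th unit."""
    k = len(source_list) // n      # sampling interval K = N / n, rounded down
    start = rng.randrange(k)       # random position among the first K units
    return [source_list[start + i * k] for i in range(n)]

rng = random.Random(7)
source_list = list(range(1, 101))  # hypothetical ordered list of 100 units
sample = systematic_sample(source_list, 10, rng)
print(sample)
```

Only the starting point is random; once it is fixed, the rest of the sample follows mechanically from the interval.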
(4) Cluster sampling – This method is used where the population is very large. The population is divided into smaller groups called clusters, each of which is internally heterogeneous and so resembles the population in miniature. Some clusters are then selected at random, and all items belonging to the selected clusters become part of the sample.
(5) Area Sampling – If the clusters are divided on geographical basis, it is termed as area sampling.
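Cluster sampling with geographically defined clusters (that is, area sampling) might look like the sketch below. The city blocks, household labels and the function name `cluster_sample` are hypothetical and serve only to show that whole clusters, not individual units, are selected at random.

```python
import random

def cluster_sample(clusters, m, rng):
    """Select m whole clusters at random; every unit in a selected cluster is sampled."""
    chosen = rng.sample(sorted(clusters), m)
    units = [unit for name in chosen for unit in clusters[name]]
    return chosen, units

rng = random.Random(3)
# Hypothetical clusters: households grouped by city block (a geographical basis).
clusters = {f"block_{b}": [f"block_{b}_house_{h}" for h in range(5)] for b in range(6)}
chosen, units = cluster_sample(clusters, 2, rng)
print(chosen, len(units))
```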
(6) Multi-stage sampling – In multi-stage sampling, sampling is performed in more than one stage. At the first stage, units are selected by some random sampling method, usually SRSWOR or systematic sampling; at the second stage, some units are again selected out of the previously selected units through a suitable method. It can be understood as an extension of cluster sampling in which, instead of taking every item in a selected cluster, items are drawn randomly from each selected cluster to form the sample.
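A two-stage version can be sketched as follows, assuming SRSWOR at both stages. The villages, households and the function name `two_stage_sample` are made up for the example.

```python
import random

def two_stage_sample(clusters, m, k, rng):
    """Stage 1: SRSWOR of m clusters. Stage 2: SRSWOR of k units inside each one."""
    first_stage = rng.sample(sorted(clusters), m)
    return {name: rng.sample(clusters[name], k) for name in first_stage}

rng = random.Random(5)
# Hypothetical frame: 8 villages with 20 households each.
villages = {f"village_{v}": [f"v{v}_h{h}" for h in range(20)] for v in range(8)}
sample = two_stage_sample(villages, 3, 4, rng)
for village, households in sample.items():
    print(village, households)
```

Compared with plain cluster sampling, only 4 of the 20 households in each selected village are surveyed, which is what makes the second stage worthwhile when clusters are large.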
Non-Probability or Non-Random Sampling Methods
(1) Judgement sampling – In this method, the sampling units are chosen by the researcher on the basis of his or her own judgement. The researcher simply selects the sample which, in his or her opinion, will be best for the study.
(2) Quota sampling – In this method of sampling, quotas in the form of reservations or percentages are established for different classes of the population on the basis of age, gender, nationality etc. A sample is then drawn on the basis of these quotas.
(3) Panel sampling – In this method, regular surveys are taken by a researcher from a panel of experts in a particular domain through questionnaires or schedules. The panellists may or may not know about each other during the research process.
(4) Convenience sampling – In convenience sampling, a researcher simply selects the sample and sampling units that are easily available and accessible. No extra effort is made by the researcher, who simply chooses the sample on the basis of convenience.
(5) Snowball sampling – This method is used in cases where the population to be studied is rare, so that it is difficult to find good representative sampling units. In this method the researcher initially selects a sampling unit (a doctor, a musician, a cancer patient, depending upon the study) based on his or her judgement and then takes further samples on the basis of directions, advice or referrals provided by the first sampling unit.
The researcher starts by interviewing one person or small group of people and then asks them for references. He then collects data from the suggested people and asks them for references and the chain continues until an adequate sample is formed.  
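The referral chain in snowball sampling can be modelled, very roughly, as a walk over a "who refers whom" network. In the sketch below the `referrals` dictionary, the respondent labels and the function name `snowball_sample` are all hypothetical; in practice the referrals come from the interviews themselves.

```python
from collections import deque

def snowball_sample(referrals, seeds, target_size):
    """Interview the seeds, then the people they refer, until the sample is large enough."""
    sample = []
    queue = deque(seeds)
    seen = set(seeds)
    while queue and len(sample) < target_size:
        person = queue.popleft()
        sample.append(person)
        for referred in referrals.get(person, []):
            if referred not in seen:       # do not approach the same person twice
                seen.add(referred)
                queue.append(referred)
    return sample

# Hypothetical referral network: each respondent names further respondents.
referrals = {"A": ["B", "C"], "B": ["D"], "C": ["D", "E"], "D": [], "E": ["F"]}
sample = snowball_sample(referrals, ["A"], 5)
print(sample)
```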

Sampling error

In statistics, sampling error is incurred when the statistical characteristics of a population are estimated from a subset, or sample, of that population. Since the sample does not include all members of the population, statistics on the sample, such as means and quantiles, generally differ from the characteristics of the entire population, which are known as parameters. For example, if one measures the height of a thousand individuals from a country of one million, the average height of the thousand is typically not the same as the average height of all one million people in the country. Since sampling is typically done to determine the characteristics of a whole population, the difference between the sample and population values is considered a sampling error.
·         Increasing the sample size will reduce this type of error.
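The height example can be sketched in Python. The population here is synthetic (normally distributed heights with an assumed mean of 170 cm and standard deviation of 10 cm) and is scaled down to 100,000 people to keep the sketch quick; `sampling_error` is a name chosen for the illustration.

```python
import random
import statistics

rng = random.Random(0)
# Synthetic population of heights in cm; the distribution is an assumption.
population = [rng.gauss(170, 10) for _ in range(100_000)]
parameter = statistics.fmean(population)          # the population mean (a parameter)

def sampling_error(sample):
    """Sample statistic minus population parameter."""
    return statistics.fmean(sample) - parameter

# Three different samples of 1,000 people give three different sampling errors.
errors = [sampling_error(rng.sample(population, 1000)) for _ in range(3)]
for e in errors:
    print(f"sampling error: {e:+.3f} cm")
```

Each error is small but non-zero, and it changes from sample to sample purely because different individuals happened to be selected.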
Errors in a sample survey – These are the inaccuracies that arise in the process of collection, analysis and interpretation of sample data.
Such errors are of two kinds:
·         Systematic or biased or non-sampling errors – These arise due to the use of faulty procedures and techniques in drawing a sample and a lack of experience in research.
·         Unsystematic or unbiased or sampling errors – These arise due to the limitations of the sampling process itself.

Types of Error in Sampling
Sampling Errors
·         Errors caused by the act of taking a sample.
·         They cause sample results to differ from the results of a census.
·         They are differences between the sample and the population that exist only because of the observations that happened to be selected for the sample.
·         Statistical errors are sampling errors.

Non-Sampling Errors
Non-sampling errors are not controlled by sample size. There are two types of non-sampling errors:
·         Non Response Error
·         Response Error
Non response errors
A non-response error occurs when units selected as part of the sampling procedure do not respond in whole or in part.
Non-response becomes a problem when there is a significant difference between those who responded to your survey and those who did not. Non-response may happen for a variety of reasons, including:
·         Some people refused to participate. This could be because you are asking for embarrassing information, or information about illegal activities.
·         Poorly constructed surveys. For example, a snail mail survey aimed at young adults or a smartphone survey aimed at older adults is likely to lead to a lower response rate from the targeted population.
·         Some people simply forgot to return the survey.
·         Your survey didn’t reach all members in your sample. For example, email invites might have disappeared into the Spam folder, or the code used in the email may not have rendered properly on certain devices (like cell phones).
·         Certain groups were more inclined to answer. For example, people who are more active runners might be more inclined to answer a survey about running than people who aren’t as active in the community.

Response Errors
A response or data error is any systematic bias that occurs during data collection, analysis or interpretation.
·         Respondent error (e.g., lying, forgetting, etc.)
·         Interviewer bias.
·         Recording errors.
·         Poorly designed questionnaires.
·         Measurement error.
·         Errors in coding, tabulating and analysing data.
·         Lack of trained and qualified investigators.

Respondent error
·         respondent gives an incorrect answer, e.g. due to prestige or competence implications, or due to the sensitivity or social undesirability of the question
·         respondent misunderstands the requirements
·         lack of motivation to give an accurate answer
·         “lazy” respondent gives an “average” answer
·         question requires memory/recall
·         proxy respondents are used, i.e. taking answers from someone other than the respondent

Interviewer bias
·         Different interviewers administer a survey in different ways
·         Differences occur in reactions of respondents to different interviewers, e.g. to interviewers of their own sex or own ethnic group
·         Inadequate training of interviewers
·         Inadequate attention to the selection of interviewers
·         There is too high a workload for the interviewer

Measurement Error
·         The question is unclear, ambiguous or difficult to answer
·         The list of possible answers suggested in the recording instrument is incomplete
·         Requested information assumes a framework unfamiliar to the respondent
·         The definitions used by the survey are different from those used by the respondent (e.g. how many part-time employees do you have?)
Some of the common procedures for controlling non-sampling errors are as follows:
-       Providing detailed guidelines for data collection and data processing
-       Imparting proper training to the field workers and data processing personnel
-       Introducing consistency checks
-       Performing sample checks
-       Carrying out post-census and post-survey checks
-       Performing external record checks
-       Introducing the scheme of interpenetrating sub-samples
The recommendations for reducing survey measurement error can be condensed into a top ten list:
o   Follow basic administrative guidelines
o   Clarify the “central players” in the region and nationally, and be certain to consider ways to work with them and reduce the chance of “spoilers”
o   Conduct pre-testing for all questionnaires
o   Hire relatively large numbers of interviewers, who should be tested in the course of training, while setting high goals and providing rewards for success
o   Assign interviewers using interpenetrating sampling techniques
o   Consider all potential errors of non-observation, including sampling, coverage and non-response
o   Include questions that allow ex post identification of different types of measurement error
o   Carefully evaluate whether there are systematic non-response patterns that might affect the interpretation of findings
o   Design clear guidelines for filling in missing data, preferably done by interviewer teams shortly after each day of data collection
o   Attempt to compare results with those that might be obtainable from routine statistics or other data sources


Rules that support the idea of sampling (4 rules)
1.      Law of statistical regularity.
2.      Law of inertia of large numbers.
3.      Law of optimization.
4.      Law of reliability or law of validity.

Law of statistical regularity: The Law of Statistical Regularity states that a moderately large number of items selected at random from a larger group (technically called the ‘population’) is almost sure, on the whole, to possess the characteristics of the larger group. In other words, if a random sample of large size is taken, the sample will, to a fairly accurate degree, exhibit the characteristics of the entire population. Two points should be kept in mind:
  1. The sample must be drawn at random. This means that the items constituting the sample are obtained by giving equal chance of selection to all the items in the population.
  2. The number of items included in the sample should be sufficiently large. The larger the size of the sample, the more representative will the sample be of the population.

Law of inertia of large numbers: This law states that as the sample size increases, reliability also increases. The larger the sample drawn from the population or universe, the more accurate and reliable the results; when the sample is small compared with the population, reliability is reduced.
For example, to calculate the average marks in English of 50 BBM students, samples of increasing size may be drawn:
                      Population of 50, sample of 5
                      Population of 50, sample of 10
                      Population of 50, sample of 15
                      Population of 50, sample of 20
The larger the sample, the closer its average is likely to be to the average of all 50 students.
However, this law does not apply in a few cases. For example, when testing a person's blood group, the result is the same whether a drop of blood or a bottle of blood is tested.
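The 50-student example can be tried out with a small simulation. The marks below are synthetic (random integers between 35 and 95, an assumption made purely for illustration), and `mean_abs_error` is a hypothetical helper that averages the gap between sample mean and population mean over many repeated samples.

```python
import random
import statistics

rng = random.Random(11)
marks = [rng.randint(35, 95) for _ in range(50)]   # synthetic marks for 50 students
population_mean = statistics.fmean(marks)

def mean_abs_error(n, repeats=2000):
    """Average gap between the sample mean and the population mean for samples of size n."""
    errors = [abs(statistics.fmean(rng.sample(marks, n)) - population_mean)
              for _ in range(repeats)]
    return statistics.fmean(errors)

results = {n: mean_abs_error(n) for n in (5, 10, 15, 20)}
for n, err in results.items():
    print(f"sample of {n:2d}: average error {err:.2f} marks")
```

The average error should shrink steadily as the sample grows from 5 to 20 students, which is the behaviour the law describes.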
Principle of optimization: This principle states that a sample should yield optimum results with maximum efficiency and minimum cost, i.e. sampling should give maximum information about the population for minimum effort.
For example, consider the average marks of students in a class:
                      Census = 70%
                      Sample = 68%
The sample gives most of the information about the population, though not the total information.
Principle of validity: A sampling design is termed valid only if valid tests can be derived from it, i.e. the sample should give valid information about the population.
Random or probability sampling satisfies this principle.
These are the principles of sampling in statistics.


