There are many approaches to computing sample size. In public policy evaluation, for example, one is usually interested in testing whether there is statistical evidence of the impact of an intervention on a population of interest. This vignette is devoted to explaining the issues that you commonly encounter when computing sample sizes.
Note that the definition of power is tied to a hypothesis test. For example, if you are interested in testing whether a difference of proportions is statistically significant, then your hypotheses may look as follows:
$$H_0: P_1 - P_2 = 0 \quad \text{vs.} \quad H_a: P_1 - P_2 = D > 0$$
where D, known as the null effect, is any value greater than zero. First, notice that this kind of test induces a power function, defined as the probability of rejecting the null hypothesis when the true difference is D. Second, note that P1 and P2 should be estimated with unbiased sampling estimators (e.g. Horvitz-Thompson, Hansen-Hurwitz, calibration estimators, etc.), say P̂1 and P̂2, respectively. Third, in general, under a complex sampling design, we can define the variance of P̂1 − P̂2 as
$$Var(\hat{P}_1 - \hat{P}_2) = \frac{DEFF}{n}\left(1-\frac{n}{N}\right)(P_1Q_1+P_2Q_2)$$
where DEFF is the design effect, which captures the inflation of the variance due to the complex sampling design, n is the sample size, N is the population size, Q1 = 1 − P1 and Q2 = 1 − P2. The power function is usually denoted βD:
$$\begin{align*} \beta_D &\leq Pr\left(\dfrac{\hat{P}_1-\hat{P}_2}{\sqrt{\frac{DEFF}{n}\left(1-\frac{n}{N}\right)(P_1Q_1+P_2Q_2)}} > Z_{1-\alpha} \left | \right. P_1 -P_2 =D \right)\\ &= 1-\Phi\left(Z_{1-\alpha} - \dfrac{D}{\sqrt{\frac{DEFF}{n}\left(1-\frac{n}{N}\right)(P_1Q_1+P_2Q_2)}} \right) \end{align*} $$
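The algebra that links the power function to a sample size can be sketched as follows. This is a reconstruction under stated assumptions rather than part of the original derivation: σ² is shorthand introduced here for the variance of P̂1 − P̂2, βD is read as the target power, and both the confidence and the power are assumed to exceed 50%, so that Z_{1−α} + Z_{βD} is positive and squaring preserves the inequality.

$$\begin{align*}
1-\Phi\left(Z_{1-\alpha} - \dfrac{D}{\sigma}\right) \geq \beta_D
&\iff Z_{1-\alpha} - \dfrac{D}{\sigma} \leq Z_{1-\beta_D} = -Z_{\beta_D}\\
&\iff \sigma^2 \leq \dfrac{D^2}{(Z_{1-\alpha}+Z_{\beta_D})^2}\\
&\iff \dfrac{DEFF(P_1Q_1+P_2Q_2)}{n} \leq \dfrac{D^2}{(Z_{1-\alpha}+Z_{\beta_D})^2} + \dfrac{DEFF(P_1Q_1+P_2Q_2)}{N}
\end{align*}$$

The last line uses the fact that σ² equals DEFF(P1Q1 + P2Q2)/n minus DEFF(P1Q1 + P2Q2)/N, which follows from expanding the finite population correction.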
After some algebra, we find that the minimum sample size to detect a null effect D is
$$\begin{align} n \geq \dfrac{DEFF(P_1Q_1+P_2Q_2)}{\dfrac{D^2}{(Z_{1-\alpha}+Z_{\beta_D})^2}+\dfrac{DEFF(P_1Q_1+P_2Q_2)}{N}} \end{align}$$
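As a rough numerical check of the bound above, the following Python sketch evaluates it directly using only the standard library. It is not the package's implementation: the function name `sample_size_diff_props` and its argument names are hypothetical, chosen here for illustration, and the result is rounded up to the next integer.

```python
from math import ceil
from statistics import NormalDist


def sample_size_diff_props(N, P1, P2, D, DEFF=1.0, conf=0.95, power=0.80):
    """Smallest integer n satisfying the sample size bound above.

    Hypothetical helper (not the package's code): one-sided test,
    finite population of size N, design effect DEFF.
    """
    z_alpha = NormalDist().inv_cdf(conf)   # Z_{1-alpha}
    z_beta = NormalDist().inv_cdf(power)   # Z_{beta_D}
    # DEFF * (P1*Q1 + P2*Q2), with Q = 1 - P
    s = DEFF * (P1 * (1 - P1) + P2 * (1 - P2))
    n_bound = s / (D ** 2 / (z_alpha + z_beta) ** 2 + s / N)
    return ceil(n_bound)


n_min = sample_size_diff_props(N=1000, P1=0.5, P2=0.5, D=0.03, DEFF=2)
print(n_min)  # 873
```

With N = 1000, DEFF = 2, P1 = P2 = 0.5 and D = 0.03, the bound evaluates to just under 873, so rounding up gives 873.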
###Some comments
##The ss4dpH function
The ss4dpH function may be used to plot a graph that gives an idea of how the choice of D affects the sample size. For example, suppose that we draw a sample according to a complex design such that DEFF = 2, from a finite population of N = 1000 units, with P1 = P2 = 0.5. If we define the null effect to be D = 3%, then we would have to draw a sample of size n ≥ 873 for the probability of rejecting the null hypothesis to be 80% (the default power), with a confidence of 95% (the default confidence). Notice that as the null effect increases, the sample size decreases.
## [1] 873
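The relationship between the null effect and the sample size can also be tabulated without a plot. The Python sketch below is an illustration only, not the ss4dpH function itself: `min_n` is a hypothetical helper that evaluates the sample size bound with the same defaults as the example (N = 1000, P1 = P2 = 0.5, DEFF = 2, 95% confidence, 80% power), and the grid of effects is an arbitrary choice.

```python
from math import ceil
from statistics import NormalDist


def min_n(D, N=1000, P1=0.5, P2=0.5, DEFF=2.0, conf=0.95, power=0.80):
    """Hypothetical helper: minimum n for a null effect D (rounded up)."""
    z_sum = NormalDist().inv_cdf(conf) + NormalDist().inv_cdf(power)
    s = DEFF * (P1 * (1 - P1) + P2 * (1 - P2))
    return ceil(s / (D ** 2 / z_sum ** 2 + s / N))


# Arbitrary grid of null effects, from 1% to 10%
effects = [0.01, 0.02, 0.03, 0.05, 0.10]
sizes = [min_n(D) for D in effects]

for D, n in zip(effects, sizes):
    print(f"D = {D:.2f} -> n >= {n}")
```

The row for D = 0.03 reproduces the value of 873 reported above, and the required sample size falls steadily as D grows.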
##The b4dp function
The b4dp function may be used to plot a figure that gives an idea of how the choice of the sample size n affects the power of the test. For example, suppose that we draw a sample according to a complex design such that DEFF = 2, from a finite population of N = 1000 units, with a sample size of n = 873, a null effect of D = 3% and a confidence of 95%. Then the power of the test is β = 80.0228278%. Notice that as the sample size decreases, the power also decreases.
## With the parameters of this function: N = 1000 n = 873 P1 = 0.5 P2 = 0.5 D = 0.03 DEFF = 2 conf = 0.95 .
## The estimated power of the test is 80.02283 .
##
## $Power
## [1] 80.02283
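The reported power can be checked by evaluating the power function directly. The following Python sketch is a minimal illustration under the same parameters as the output above; `power_diff_props` is a hypothetical name, not the package's b4dp function, and it returns the power as a percentage.

```python
from math import sqrt
from statistics import NormalDist


def power_diff_props(N, n, P1, P2, D, DEFF=1.0, conf=0.95):
    """Hypothetical helper: power (in percent) of the one-sided test."""
    z_alpha = NormalDist().inv_cdf(conf)  # Z_{1-alpha}
    # Variance of the estimated difference under the complex design
    var = (DEFF / n) * (1 - n / N) * (P1 * (1 - P1) + P2 * (1 - P2))
    return 100 * (1 - NormalDist().cdf(z_alpha - D / sqrt(var)))


power_873 = power_diff_props(N=1000, n=873, P1=0.5, P2=0.5, D=0.03, DEFF=2)
power_600 = power_diff_props(N=1000, n=600, P1=0.5, P2=0.5, D=0.03, DEFF=2)

print(f"n = 873: power = {power_873:.2f}%")
print(f"n = 600: power = {power_600:.2f}%")
```

With n = 873 the sketch agrees with the 80.02% shown in the output, and with n = 600 the power drops to a little over 30%.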
You may have been told that you do not need a large sample. Sample size is an issue that deserves a lot of attention: the conclusions of your study can be misleading if the sample you draw is not large enough. For example, from the last figure one may conclude that, with a sample size close to 600, the power of the test is as low as 30%. That is simply unacceptable in social research.