Sampling the population proportion: a confidence interval for the difference between two proportions



Case Study:

The quality control department issues permits to general contractors to work on a quality improvement project. For each permit issued, the department inspects the result of the production by giving a “pass” or “fail” rating to each product. A failed product must be re-manufactured until it receives a pass rating. The department had been frustrated by the high cost of re-inspection and re-manufacturing, and decided to publish the inspection records of all products on the web. It was hoped that public access to the records would lower the re-inspection rate. A year after the records were made public, two samples of records were randomly selected: one from the pool of records before the web publication and one from the pool after it. The proportion of products that passed on the first inspection was noted for each sample. The results are summarized below. Construct a point estimate and a 90% confidence interval for the difference in the passing rate on the first inspection between the two time periods.

No access: sample size = 500, sample proportion = 0.67
Access permitted: sample size = 100, sample proportion = 0.80

Solution:

#Input:
sample_one_size = 500
sample_one_prop = 0.67

sample_two_size = 100
sample_two_prop = 0.80

alfa = 0.90  # confidence level (despite the name, this is not the significance level)

#STEP 1: The point estimate: p1 - p2
#the passing rate before public access (sample one) minus the passing rate after it (sample two)
point_estimate = sample_one_prop - sample_two_prop

#STEP 2: Check that both samples are sufficiently large:
#each interval p +/- 3*sqrt(p*(1 - p)/n) must lie entirely within [0, 1].
#between() comes from the dplyr package, so it is called here as dplyr::between().
#2.1 sample one
error_one = 3*sqrt((sample_one_prop*(1 - sample_one_prop))/sample_one_size)
c(dplyr::between(sample_one_prop - error_one, 0, 1), dplyr::between(sample_one_prop + error_one, 0, 1))

#[1] TRUE TRUE: sample 1 is large enough!

#2.2 sample two
error_two = 3*sqrt((sample_two_prop*(1 - sample_two_prop))/sample_two_size)

c(dplyr::between(sample_two_prop - error_two, 0, 1), dplyr::between(sample_two_prop + error_two, 0, 1))
#[1] TRUE TRUE: sample 2 is large enough!
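
The check in Step 2 can also be written out by hand. As a sketch, assuming the sample proportions stated in the case study (0.67 and 0.80), each three-standard-deviation interval has to fall entirely inside [0, 1]:

```latex
\begin{align*}
\hat{p}_1 \pm 3\sqrt{\frac{\hat{p}_1(1-\hat{p}_1)}{n_1}}
  &= 0.67 \pm 3\sqrt{\frac{(0.67)(0.33)}{500}}
  \approx 0.67 \pm 0.063
  \;\Rightarrow\; [0.607,\ 0.733] \subset [0, 1] \\
\hat{p}_2 \pm 3\sqrt{\frac{\hat{p}_2(1-\hat{p}_2)}{n_2}}
  &= 0.80 \pm 3\sqrt{\frac{(0.80)(0.20)}{100}}
  = 0.80 \pm 0.12
  \;\Rightarrow\; [0.68,\ 0.92] \subset [0, 1]
\end{align*}
```

Both intervals stay within [0, 1], which agrees with the TRUE TRUE results above.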

# STEP 3: Confidence Interval:
# positive critical value z for the two-sided interval (about 1.645 when alfa = 0.90)
z_alfa = qnorm(1 - (1 - alfa)/2, mean = 0, sd = 1)

# standard error of the difference between the two sample proportions
std_error = sqrt(
  (sample_one_prop*(1 - sample_one_prop))/sample_one_size +
    (sample_two_prop*(1 - sample_two_prop))/sample_two_size
)

# lower and upper limits: point estimate -/+ z * standard error
c((sample_one_prop - sample_two_prop) - z_alfa*std_error,
  (sample_one_prop - sample_two_prop) + z_alfa*std_error)

Conclusion:

The 90% confidence interval is [−0.20, −0.06]. We are 90% confident that the difference in the population proportions lies in the interval [−0.20, −0.06], in the sense that in repeated sampling 90% of all intervals constructed from the sample data in this manner will contain p1 − p2. Taking into account the labeling of the two populations, this means that we are 90% confident that the proportion of products that pass on the first inspection is between 6 and 20 percentage points higher after public access to the records than before.
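
For readers who want to verify the interval by hand, the arithmetic can be sketched as follows. It assumes the standard large-sample formula for the difference of two proportions, the sample values stated in the case study (0.67 and 0.80), and a critical value of about 1.645 for 90% confidence; the symbols are introduced here for illustration only.

```latex
\begin{align*}
\hat{p}_1 - \hat{p}_2 &= 0.67 - 0.80 = -0.13 \\
E &= z_{0.05}\sqrt{\frac{\hat{p}_1(1-\hat{p}_1)}{n_1} + \frac{\hat{p}_2(1-\hat{p}_2)}{n_2}}
   \approx 1.645\sqrt{\frac{(0.67)(0.33)}{500} + \frac{(0.80)(0.20)}{100}} \\
  &\approx 1.645 \times 0.0452 \approx 0.074 \\
(\hat{p}_1 - \hat{p}_2) \pm E &\approx -0.13 \pm 0.074
   \;\Rightarrow\; [-0.204,\ -0.056] \approx [-0.20,\ -0.06]
\end{align*}
```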
