OBJECTIVES: We used simulation to evaluate an appropriate sampling strategy for estimating the epidemiologic volume of diabetes.
METHODS: We analyzed about 250 million medical insurance claims submitted to the Health Insurance Review & Assessment Service in 2003 in which diabetes appeared as a principal or subsequent diagnosis at least once during the year. The database was restructured into a 'patient-hospital profile' of 3,676,164 cases and then into a 'patient profile' of 2,412,082 observations. The patient profile data were then used to test the validity of a proposed sampling frame and sampling methods for developing diabetes-related epidemiologic indices.
RESULTS: The simulation showed that a stratified two-stage cluster sampling design with a total sample size of 4,000 would yield an estimate of 57.04% (95% prediction range, 49.83-64.24%) for the treatment prescription rate of diabetes. The proposed design first stratifies the nation by area (metropolitan/city/county) and by type of hospital (tertiary/secondary/primary/clinic), with a proportion of 5:10:10:75 across the hospital types. Hospitals are then randomly selected within each stratum as primary sampling units, followed by random selection of patients within the selected hospitals as secondary sampling units. The difference between the estimate and the parameter value was projected to be less than 0.3%.
CONCLUSIONS: The proposed sampling scheme will be applied in a subsequent nationwide field survey, not only to estimate the epidemiologic volume of diabetes but also to assess the current status of diabetes control nationwide.
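The stratified two-stage cluster design described in the results can be illustrated with a minimal simulation sketch. It is not the authors' procedure. The population is synthetic, and the hospital counts, hospital sizes, the equal split of the sample across areas, the 10 patients drawn per hospital, and the size-weighted ratio estimator are all assumptions made for illustration. Only the strata, the 5:10:10:75 proportion, and the total sample size of 4,000 come from the text. The 2.5th and 97.5th percentiles of the replicated estimates stand in for the reported 95% prediction range.

```python
import random

AREAS = ["metropolitan", "city", "county"]
# 5:10:10:75 proportion across hospital types, as stated in the abstract.
TYPE_SHARE = {"tertiary": 5, "secondary": 10, "primary": 10, "clinic": 75}
# Hypothetical population per area: type -> (number of hospitals, patients per hospital).
TYPE_POP = {"tertiary": (15, 400), "secondary": (40, 150),
            "primary": (60, 60), "clinic": (600, 15)}
PATIENTS_PER_HOSPITAL = 10  # assumed second-stage sample size


def build_population(rng):
    """Return {(area, type): [hospital, ...]}; a hospital is a list of 0/1 treated flags."""
    population = {}
    for area in AREAS:
        for htype, (n_hosp, n_pat) in TYPE_POP.items():
            hospitals = []
            for _ in range(n_hosp):
                # Hospital-level rates vary around 0.57 (assumed) to create clustering.
                rate = min(max(rng.gauss(0.57, 0.15), 0.0), 1.0)
                hospitals.append([1 if rng.random() < rate else 0 for _ in range(n_pat)])
            population[(area, htype)] = hospitals
    return population


def allocate(total_n):
    """Split total_n by type in 5:10:10:75, then equally across areas (assumption)."""
    share_sum = sum(TYPE_SHARE.values())
    return {(area, htype): total_n * share // (share_sum * len(AREAS))
            for area in AREAS for htype, share in TYPE_SHARE.items()}


def estimate_rate(population, allocation, rng):
    """One two-stage sample; returns the stratum-weighted treatment-rate estimate."""
    total_patients = sum(len(h) for hosps in population.values() for h in hosps)
    estimate = 0.0
    for stratum, hospitals in population.items():
        stratum_size = sum(len(h) for h in hospitals)
        n_hosp = max(1, min(len(hospitals), allocation[stratum] // PATIENTS_PER_HOSPITAL))
        num = den = 0.0
        for hospital in rng.sample(hospitals, n_hosp):  # stage 1: hospitals
            k = min(PATIENTS_PER_HOSPITAL, len(hospital))
            patients = rng.sample(hospital, k)          # stage 2: patients
            num += len(hospital) * sum(patients) / k    # size-weighted hospital rate
            den += len(hospital)
        estimate += (stratum_size / total_patients) * (num / den)
    return estimate


def simulate(total_n=4000, replications=200, seed=0):
    """Return (true rate, mean estimate, 2.5th percentile, 97.5th percentile)."""
    rng = random.Random(seed)
    population = build_population(rng)
    allocation = allocate(total_n)
    treated = sum(sum(h) for hosps in population.values() for h in hosps)
    total = sum(len(h) for hosps in population.values() for h in hosps)
    estimates = sorted(estimate_rate(population, allocation, rng)
                       for _ in range(replications))
    lo = estimates[int(0.025 * replications)]
    hi = estimates[int(0.975 * replications) - 1]
    return treated / total, sum(estimates) / replications, lo, hi


if __name__ == "__main__":
    true_rate, mean_est, lo, hi = simulate()
    print(f"true={true_rate:.4f} mean={mean_est:.4f} range=({lo:.4f}, {hi:.4f})")
```

Comparing the mean estimate with the known rate of the synthetic population mirrors the abstract's comparison between the estimate and the parameter value. The numbers this sketch produces depend entirely on the assumed population and are not comparable to the reported figures.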