Biostatistics Branch Journal Club Papers
Reference List
1. Benjamini Y, Hochberg Y. Controlling
the false discovery rate: a practical and powerful approach to multiple
testing. Journal of the Royal Statistical Society. Series B (Methodological)
1995;57:289-300.
Abstract: The common approach to the multiplicity problem calls
for controlling the familywise error rate (FWER). This approach, though,
has faults, and we point out a few. A different approach to problems of
multiple significance testing is presented. It calls for controlling the
expected proportion of falsely rejected hypotheses, the false discovery
rate. This error rate is equivalent to the FWER when all hypotheses are
true but is smaller otherwise. Therefore, in problems where the control
of the false discovery rate rather than that of the FWER is desired, there
is potential for a gain in power. A simple sequential Bonferroni-type
procedure is proved to control the false discovery rate for independent
test statistics, and a simulation study shows that the gain in power is
substantial. The use of the new procedure and the appropriateness of the
criterion are illustrated with examples.
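The step-up rule summarized in this abstract can be sketched as follows. This is a minimal illustration, not code from the paper: the function name, the default level q = 0.05, and the example p-values are assumptions made here. With m hypotheses and ordered p-values p(1) <= ... <= p(m), the procedure finds the largest k with p(k) <= (k/m)q and rejects the k hypotheses with the smallest p-values.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Return one boolean per hypothesis: True where it is rejected.

    Illustrative sketch of the Benjamini-Hochberg step-up rule;
    the name and signature are hypothetical, not from the paper.
    """
    m = len(p_values)
    # Indices of the hypotheses, ordered from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])

    # Largest rank k whose p-value lies at or below the line (k / m) * q.
    k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * q:
            k = rank

    # Reject every hypothesis ranked 1..k, even those above their own threshold.
    rejected = [False] * m
    for idx in order[:k]:
        rejected[idx] = True
    return rejected


# Made-up p-values: the two smallest miss their own thresholds (0.01, 0.02)
# but are still rejected because the fourth p-value clears its threshold.
print(benjamini_hochberg([0.012, 0.021, 0.028, 0.039, 0.9], q=0.05))
# [True, True, True, True, False]
```

A Bonferroni correction at the same level would compare each of these p-values with 0.05/5 = 0.01 and reject none of them, which is the kind of power difference the abstract refers to.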
2. Benjamini Y, Yekutieli D. The control of the false discovery rate
in multiple testing under dependency. Annals of Statistics 2001;29:1165-88.
Abstract: Benjamini and Hochberg suggest that the false discovery
rate may be the appropriate error rate to control in many applied multiple
testing problems. A simple procedure was given there as an FDR controlling
procedure for independent test statistics and was shown to be much more
powerful than comparable procedures which control the traditional familywise
error rate. We prove that this same procedure also controls the false
discovery rate when the test statistics have positive regression dependency
on each of the test statistics corresponding to the true null hypotheses.
This condition for positive dependency is general enough to cover many
problems of practical interest, including the comparisons of many treatments
with a single control, multivariate normal test statistics with positive
correlation matrix and multivariate t. Furthermore, the test statistics
may be discrete, and the tested hypotheses composite without posing special
difficulties. For all other forms of dependency, a simple conservative
modification of the procedure controls the false discovery rate. Thus
the range of problems for which a procedure with proven FDR control can
be offered is greatly increased.
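The "simple conservative modification" for arbitrary dependency is commonly stated as running the same step-up rule at level q divided by the harmonic sum 1 + 1/2 + ... + 1/m. The sketch below assumes that form; the function name, the default q, and the example p-values are illustrative and not taken from the paper.

```python
def benjamini_yekutieli(p_values, q=0.05):
    """Step-up rule at the reduced level q / (1 + 1/2 + ... + 1/m).

    Illustrative sketch; the name and signature are hypothetical.
    """
    m = len(p_values)
    if m == 0:
        return []

    # Harmonic-sum penalty that makes the rule valid under any dependency.
    penalty = sum(1.0 / i for i in range(1, m + 1))
    q_adjusted = q / penalty

    # Same step-up search as in the independent case, at the adjusted level.
    order = sorted(range(m), key=lambda i: p_values[i])
    k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * q_adjusted:
            k = rank

    rejected = [False] * m
    for idx in order[:k]:
        rejected[idx] = True
    return rejected


# Made-up p-values: with m = 5 the penalty is about 2.28, so the largest
# threshold drops from 0.05 to about 0.022 and nothing is rejected.
print(benjamini_yekutieli([0.012, 0.021, 0.028, 0.039, 0.9], q=0.05))
# [False, False, False, False, False]
```

The penalty grows roughly like the logarithm of m, so the price of dropping the dependency assumption increases slowly with the number of hypotheses.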
3. Efron B, Tibshirani R. Empirical Bayes methods
and false discovery rates for microarrays. Genet. Epidemiol. 2002;23:70-86.
4. Hildesheim A, Apple RJ, Chen CJ, Wang SS,
Cheng YJ, Klitz W et al. Association of HLA class I and II alleles and
extended haplotypes with nasopharyngeal carcinoma in Taiwan. J. Natl. Cancer
Inst. 2002;94:1780-9.
Abstract: BACKGROUND: Nasopharyngeal carcinoma (NPC), which
occurs at a disproportionately high rate among Chinese individuals, is
associated with Epstein-Barr virus (EBV). Human leukocyte antigen (HLA)
polymorphisms appear to play a role in NPC, because they are essential
in the immune response to viruses. We used high-resolution HLA genotyping
in a case-control study in Taiwan to systematically evaluate the association
between various HLA alleles and NPC. METHODS: We matched 366 NPC case
patients to 318 control subjects by age, sex, and geographic residence.
Participants were interviewed and provided blood samples for genotyping.
High-resolution (polymerase chain reaction-based) genotyping of HLA class
I (A and B) and II (DRB1, DQA1, DQB1, and DPB1) genes was performed in
two phases. In phase I, 210 case patients and 183 control subjects were
completely genotyped. In phase II, alleles associated with NPC in the
phase I analysis were evaluated in another 156 case patients and 135 control
subjects. Extended haplotypes were inferred. RESULTS: We found a consistent
association between HLA-A*0207 (common among Chinese but not among Caucasians)
and NPC (odds ratio [OR] = 2.3, 95% confidence interval [CI] = 1.5 to
3.5) but not between HLA-A*0201 (most common HLA-A2 allele in Caucasians)
and NPC (OR = 0.79, 95% CI = 0.55 to 1.2). Individuals with HLA-B*4601,
which is in linkage disequilibrium with HLA-A*0207, had an increased risk
for NPC (OR = 1.8, 95% CI = 1.2 to 2.5) as did individuals with HLA-A*0207
and HLA-B*4601 (OR = 2.8, 95% CI = 1.7 to 4.4). Individuals homozygous
for HLA-A*1101 had decreased risks for NPC (OR = 0.24, 95% CI = 0.13 to
0.46). The extended haplotype HLA-A*3303-B*5801/2-DRB1*0301-DQB1*0201/2-DPB1*0401,
specific to this ethnic group, was associated with a statistically significantly
increased risk for NPC (OR = 2.6, 95% CI = 1.1 to 6.4). CONCLUSIONS: The
restriction of the association of HLA-A2 with NPC to HLA-A*0207 probably
explains previously observed associations of HLA-A2 with NPC among Chinese
but not Caucasians. The extended haplotypes associated with NPC might,
in part, explain the higher rates of NPC in this ethnic group.
5. Hill WG. Estimation of linkage disequilibrium
in randomly mating populations. Heredity 1974;33:229-39.
6. Rieder MJ, Taylor SL, Clark AG, Nickerson DA.
Sequence variation in the human angiotensin converting enzyme. Nat. Genet.
1999;22:59-62.
Abstract: Angiotensin converting enzyme (encoded by the gene DCP1,
also known as ACE) catalyses the conversion of angiotensin I to the physiologically
active peptide angiotensin II, which controls fluid-electrolyte balance
and systemic blood pressure. Because of its key function in the renin-angiotensin
system, many association studies have been performed with DCP1. Nearly
all studies have associated the presence (insertion, I) or absence (deletion,
D) of a 287-bp Alu repeat element in intron 16 with the levels of circulating
enzyme or cardiovascular pathophysiologies. Many epidemiological studies
suggest that the DCP1*D allele confers increased susceptibility to cardiovascular
disease; however, other reports have found no such association or even
a beneficial effect. We present here the complete genomic sequence of
DCP1 from 11 individuals, representing the longest contiguous scan (24
kb) for sequence variation in human DNA. We identified 78 varying sites
in 22 chromosomes that resolved into 13 distinct haplotypes. Of the variant
sites, 17 were in absolute linkage disequilibrium with the commonly typed
Alu insertion/deletion polymorphism, producing two distinct and distantly
related clades. We also identified a major subdivision in the Alu deletion
clade that enables further analysis of the traits associated with this
gene. The diversity uncovered in DCP1 is comparable to that described
for other regions in the human genome. The highly correlated structure
in DCP1 raises important issues for the determination of functional DNA
variants within genes and genetic studies in humans based on marker association.
7. Satagopan JM, Verbel DA, Venkatraman ES,
Offit KE, Begg CB. Two-stage designs for gene-disease association studies.
Biometrics 2002;58:163-70.
Abstract: The goal of this article is to describe a two-stage design
that maximizes the power to detect gene-disease associations when the
principal design constraint is the total cost, represented by the total
number of gene evaluations rather than the total number of individuals.
In the first stage, all genes of interest are evaluated on a subset of
individuals. The most promising genes are then evaluated on additional
subjects in the second stage. This will eliminate wastage of resources
on genes unlikely to be associated with disease based on the results of
the first stage. We consider the case where the genes are correlated and
the case where the genes are independent. Using simulation results, it
is shown that, as a general guideline when the genes are independent or
when the correlation is small, utilizing 75% of the resources in stage
1 to screen all the markers and evaluating the most promising 10% of the
markers with the remaining resources provides near-optimal power for a
broad range of parametric configurations. This translates to screening
all the markers on approximately one quarter of the required sample size
in stage 1.
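One reading of the arithmetic behind the "one quarter" figure, under assumptions made here rather than stated in the abstract: cost is counted in gene evaluations (markers times individuals), stage 1 types every marker on n1 individuals, and stage 2 types the retained markers on n2 additional individuals. The function name and the example budget are hypothetical.

```python
def two_stage_allocation(total_evaluations, n_markers,
                         stage1_fraction=0.75, carry_fraction=0.10):
    """Split a budget of gene evaluations between two stages.

    Returns (n1, n2, share), where n1 and n2 are the individuals typed in
    stages 1 and 2 and share is n1 / (n1 + n2). Illustrative sketch only.
    """
    # Stage 1: every marker is typed on n1 individuals.
    n1 = stage1_fraction * total_evaluations / n_markers

    # Stage 2: only the most promising markers are typed, on n2 new individuals.
    n_carried = carry_fraction * n_markers
    n2 = (1.0 - stage1_fraction) * total_evaluations / n_carried

    return n1, n2, n1 / (n1 + n2)


# Hypothetical budget: 100,000 evaluations spread over 100 markers.
n1, n2, share = two_stage_allocation(100_000, 100)
print(round(n1), round(n2), round(share, 2))  # 750 2500 0.23
```

Under these assumptions the stage 1 share is 0.75 / (0.75 + 0.25 / 0.10) = 0.75 / 3.25, or about 23% of all individuals, whatever the budget and number of markers, which is in line with the "approximately one quarter" in the abstract.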