Past Journal webinars
A scalable bootstrap for massive data (RSS Series B, Volume 76, Issue 4, 2014)
Author: Michael I. Jordan is the Pehong Chen Distinguished Professor in the Department of Electrical Engineering and Computer Science and the Department of Statistics at the University of California, Berkeley. He received his Masters in Mathematics from Arizona State University, and earned his PhD in Cognitive Science in 1985 from the University of California, San Diego. He was a professor at MIT from 1988 to 1998. His research interests bridge the computational, statistical, cognitive and biological sciences, and have focused in recent years on Bayesian nonparametric analysis, probabilistic graphical models, spectral methods, kernel machines and applications to problems in distributed computing systems, natural language processing, signal processing and statistical genetics. Prof. Jordan is a member of the National Academy of Sciences, a member of the National Academy of Engineering and a member of the American Academy of Arts and Sciences. He is a Fellow of the American Association for the Advancement of Science. He has been named a Neyman Lecturer and a Medallion Lecturer by the Institute of Mathematical Statistics. He received the IJCAI Research Excellence Award in 2016, the David E. Rumelhart Prize in 2015 and the ACM/AAAI Allen Newell Award in 2009. He is a Fellow of the AAAI, ACM, ASA, CSS, IEEE, IMS, ISBA and SIAM.
Co-authors: Ariel Kleiner, Ameet Talwalkar, Purnamrita Sarkar
Chair: Richard Samworth, Cambridge University
The bootstrap provides a simple and powerful means of assessing the quality of estimators. However, in settings involving large data sets—which are increasingly prevalent—the calculation of bootstrap-based quantities can be prohibitively demanding computationally. Although variants such as subsampling and the m out of n bootstrap can be used in principle to reduce the cost of bootstrap computations, these methods are generally not robust to specification of tuning parameters (such as the number of subsampled data points), and they often require knowledge of the estimator's convergence rate, in contrast with the bootstrap. As an alternative, we introduce the ‘bag of little bootstraps’ (BLB), which is a new procedure which incorporates features of both the bootstrap and subsampling to yield a robust, computationally efficient means of assessing the quality of estimators. The BLB is well suited to modern parallel and distributed computing architectures and furthermore retains the generic applicability and statistical efficiency of the bootstrap. We demonstrate the BLB's favourable statistical performance via a theoretical analysis elucidating the procedure's properties, as well as a simulation study comparing the BLB with the bootstrap, the m out of n bootstrap and subsampling. In addition, we present results from a large-scale distributed implementation of the BLB demonstrating its computational superiority on massive data, a method for adaptively selecting the BLB's tuning parameters, an empirical study applying the BLB to several real data sets and an extension of the BLB to time series data.
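The procedure described in the abstract can be illustrated with a short sketch. This is a minimal illustration under stated assumptions, not the authors' implementation: the estimator is taken to be a weighted mean, the quality measure is the standard error, and the names `blb_standard_error`, `gamma`, `s` and `r` are chosen here for illustration. Each small subset of size b = n^gamma is resampled up to the full size n by drawing multinomial counts, so the estimator only ever touches b distinct points.

```python
import random
import statistics


def multinomial_counts(n, b, rng):
    """Spread n draws uniformly at random over b categories."""
    counts = [0] * b
    for _ in range(n):
        counts[rng.randrange(b)] += 1
    return counts


def weighted_mean(values, counts):
    """Mean of the values, each repeated counts[i] times."""
    return sum(v * c for v, c in zip(values, counts)) / sum(counts)


def blb_standard_error(data, estimator=weighted_mean, gamma=0.7,
                       s=10, r=50, seed=0):
    """Bag-of-little-bootstraps style estimate of an estimator's standard error.

    gamma, s and r are illustrative tuning parameters (assumptions):
    subset size b = n**gamma, s subsets, r resamples per subset.
    """
    rng = random.Random(seed)
    n = len(data)
    b = max(2, int(n ** gamma))
    per_subset = []
    for _ in range(s):
        subset = rng.sample(data, b)          # small subsample, no replacement
        estimates = []
        for _ in range(r):
            counts = multinomial_counts(n, b, rng)   # resample of size n
            estimates.append(estimator(subset, counts))
        per_subset.append(statistics.stdev(estimates))
    # Average the quality assessments obtained from the individual subsets.
    return statistics.mean(per_subset)


if __name__ == "__main__":
    rng = random.Random(1)
    sample = [rng.gauss(0.0, 1.0) for _ in range(1000)]
    print(blb_standard_error(sample))
```

Because each subset is processed independently, the outer loop is the natural unit to distribute across workers; the sketch runs it sequentially for simplicity.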
Download slides (PowerPoint), watch video (YouTube)
‘Is the Carli Index flawed? Assessing the case for the RPIJ’ published in JRSS-A in 2015 (Vol 178:2)
Author: Peter Levell is an economics researcher at the Institute for Fiscal Studies (IFS) and a part-time PhD student at University College London. His work at the IFS has so far covered a diverse set of subjects: from measurement issues in expenditure surveys, to issues around taxation, the distributional impact of inflation and behavioural economics. To date his academic work has focused on trying to answer questions concerning the appropriate measurement of consumer price inflation and on achieving a better understanding of household decisions over consumption and labour supply.
Chair: Paul Smith, Associate Professor in Official Statistics, University of Southampton
Discussant: Andrew Baldwin, a former employee of Statistics Canada
This paper discusses the decision in March 2013 of the UK's Office for National Statistics to replace the controversial Carli index with the Jevons index in a new version of the Retail Prices Index - the RPIJ. In doing so we make three contributions to the way price indices should be selected for measures of consumer price inflation when quantity information is not available (i.e. at the 'elementary' level). Firstly, we introduce a new price bouncing test under the test approach for choosing index numbers. Secondly, we provide empirical evidence on the performance of the Carli and Jevons indices in different contexts under the statistical approach. Thirdly, applying something analogous to the principle of insufficient reason, we argue, contrary to received wisdom in the literature, that the economic approach can be used to choose indices at the elementary level, and moreover that it favours the use of the Jevons index. Overall, we conclude that there is a case against the Carli index and that the Jevons index is to be preferred.
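For readers unfamiliar with the two formulae, the sketch below may help. It uses the standard definitions (Carli as the arithmetic mean of price relatives, Jevons as their geometric mean). The "bouncing" example, in which two shops simply swap prices and then swap back, is an assumption made here to illustrate the kind of behaviour a price bouncing test is concerned with; it is not taken from the paper, and the function names are illustrative.

```python
import math


def carli(base_prices, current_prices):
    """Arithmetic mean of the price relatives p1/p0."""
    relatives = [p1 / p0 for p0, p1 in zip(base_prices, current_prices)]
    return sum(relatives) / len(relatives)


def jevons(base_prices, current_prices):
    """Geometric mean of the price relatives p1/p0."""
    relatives = [p1 / p0 for p0, p1 in zip(base_prices, current_prices)]
    return math.exp(sum(math.log(r) for r in relatives) / len(relatives))


# Two shops swap prices, then swap back: the set of prices never changes.
month0 = [1.0, 2.0]
month1 = [2.0, 1.0]
month2 = [1.0, 2.0]

carli_chained = carli(month0, month1) * carli(month1, month2)
jevons_chained = jevons(month0, month1) * jevons(month1, month2)

print(carli(month0, month1))   # 1.25, although prices have only been permuted
print(carli_chained)           # 1.5625 after prices return to where they started
print(jevons_chained)          # 1.0 up to floating-point rounding
```

The upward drift in the chained Carli figure follows from the inequality between arithmetic and geometric means: when the relatives multiply to one, their arithmetic mean is at least one.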
Download slides (PDF), Download comments (PowerPoint)
6 July 2016
Presented by Alan Philipps, Vice President Biostatistics, ICON Clinical Research, and Peter Diggle, President of the RSS - webcast (YouTube), slides1, slides2
Chair: Andrew Garrett
Estimands (what is to be estimated) is a new and hot topic in clinical development, in particular with the regulatory authorities who are responsible for approving new treatments. New regulatory guidance will be developed that will also address the subject of sensitivity analyses. This webinar will explore the current thinking on the topic and revisit some of the earlier work to consider how the topic has evolved over 10 years and how it might look in the future.
1 March 2016
Optimal design: getting more out of experiments with hard-to-change factors
Presented by Professor Peter Goos - download slides (PDF), watch video (YouTube)
Discussant: Maria Lanzerath
We introduce a new method for generating optimal split-plot designs. These designs are optimal in the sense that they are efficient for estimating the fixed effects of the statistical model that is appropriate given the split-plot design structure. One advantage of the method is that it does not require the prior specification of a candidate set. This makes the production of split-plot designs computationally feasible in situations where the candidate set is too large to be tractable. The method allows for flexible choice of the sample size and supports inclusion of both continuous and categorical factors. The model can be any linear regression model and may include arbitrary polynomial terms in the continuous factors and interaction terms of any order. We demonstrate the usefulness of this flexibility with a 100-run polypropylene experiment involving 11 factors where we found a design that is substantially more efficient than designs that are produced by using other approaches.
Peter Goos is a professor at the Faculty of Bio-Science Engineering of the University of Leuven and at the Faculty of Applied Economics of the University of Antwerp, where he teaches various introductory and advanced courses on statistics and probability. His main research area is the statistical design and analysis of experiments. He has published books on 'The Optimal Design of Blocked and Split-Plot Experiments', 'Optimal Experimental Design: A Case-Study Approach', 'Statistics with JMP: Graphs, Descriptive Statistics and Probability' and 'Statistics with JMP: Hypothesis Tests, ANOVA and Regression'.
To date, Peter Goos has received the Shewell Award and the Lloyd S Nelson Award of the American Society for Quality, the Ziegel Award and the Statistics in Chemistry Award from the American Statistical Association, and the Young Statistician Award of the European Network for Business and Industrial Statistics.
21 October 2015
Frequentist accuracy of Bayesian estimates
Presented by Bradley Efron, Max H Stein Professor of Humanities and Sciences, Professor of Statistics at Stanford University - download slides
Discussant: Andrew Gelman of Columbia University.
Chair: Peter Diggle.
Bradley's paper 'Frequentist accuracy of Bayesian estimates' was recently published in the Royal Statistical Society's Series B Journal (Volume 77 (2015), part 3). The abstract is as follows:
In the absence of relevant prior experience, popular Bayesian estimation techniques usually begin with some form of 'uninformative' prior distribution intended to have minimal inferential influence. Bayes' rule will still produce nice-looking estimates and credible intervals, but these lack the logical force attached to experience-based priors and require further justification. This paper concerns the frequentist assessment of Bayes estimates. A simple formula is shown to give the frequentist standard deviation of a Bayesian point estimate. The same simulations required for the point estimate also produce the standard deviation. Exponential family models make the calculations particularly simple, and bring in a connection to the parametric bootstrap.
Bradley Efron is Max H Stein professor of humanities and sciences, professor of statistics at Stanford University, and professor of biostatistics with the Department of Health Research and Policy in the School of Medicine. He is a former president of both the American Statistical Association and the Institute of Mathematical Statistics. He is a recipient of the Ford Prize of the Mathematical Association of America and of both the Wilks Medal and the Noether Prize from the American Statistical Association (ASA). In 2003 Bradley was given the inaugural Rao Prize for outstanding research in statistics by Pennsylvania State University, and in 2005 he received the National Medal of Science. In 2014, Bradley was awarded the Guy Medal in Gold by the Royal Statistical Society for his 'seminal contributions to many areas of statistics'.
21 May 2015
Speakers: Dr Guosheng Yin from the Department of Statistics and Actuarial Science at the University of Hong Kong - download slides (PDF)
Professor Franz Koenig from the Center for Medical Statistics, Informatics and Intelligent Systems at the Medical University of Vienna - download slides (PDF)
Chair: Robert Cuffe of ViiV Healthcare.
Paper 1: Two-stage adaptive randomization for delayed response in clinical trials
Despite the widespread use of equal randomisation in clinical trials, response adaptive randomisation has attracted considerable attention. There is typically a prerun of equal randomisation before the implementation of response-adaptive randomisation, while it is often not clear how many subjects are needed in this prephase. Real-time response-adaptive randomisation often requires patients’ responses to be immediately available after the treatment, whereas clinical responses may take a relatively long period of time to exhibit. We propose a two-stage procedure to achieve a balance between power and response, which is equipped with a likelihood ratio test before skewing the allocation probability towards a better treatment. Furthermore, we develop a non-parametric fractional model and a parametric survival design with an optimal allocation scheme to tackle the common problem caused by delayed response. We evaluate the operating characteristics of the two-stage designs through simulation studies and show that the methods proposed satisfactorily resolve the arbitrary size of the equal randomisation phase and the delayed response problem in response-adaptive randomisation.
Dr Guosheng Yin, currently a professor at the University of Hong Kong, is also an adjunct professor at the University of Texas MD Anderson Cancer Center. He received a PhD in Biostatistics from the University of North Carolina and worked in the Department of Biostatistics at MD Anderson Cancer Center, before becoming associate professor in the Department of Statistics and Actuarial Science at the University of Hong Kong in 2009. Dr Yin was elected as a Fellow of the American Statistical Association in 2013, and a Member of the International Statistical Institute in 2012. He is Associate Editor for the Journal of the American Statistical Association, Bayesian Analysis, and Contemporary Clinical Trials. His main research areas include Bayesian adaptive design in clinical trials and survival analysis. He has published over 100 peer-reviewed papers and a book on ‘Clinical Trial Design: Bayesian and Frequentist Adaptive Methods’ in the John Wiley Series.
Paper 2: Adaptive graph-based multiple testing procedures
Multiple testing procedures defined by directed, weighted graphs have recently been proposed as an intuitive visual tool for constructing multiple testing strategies that reflect the often complex contextual relations between hypotheses in clinical trials. Many well-known sequentially rejective tests, such as (parallel) gatekeeping tests or hierarchical testing procedures, are special cases of the graph-based tests. We generalise these graph-based multiple testing procedures to adaptive trial designs with an interim analysis. These designs permit mid-trial design modifications based on unblinded interim data as well as external information, while providing strong familywise error rate control. Because the adaptive test does not require knowledge of the multivariate distribution of test statistics, it is applicable in a wide range of scenarios including trials with multiple treatment comparisons, endpoints or subgroups, or combinations thereof.
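As background, the non-adaptive graph-based procedure that the paper generalises can be sketched as follows. This is an illustration of the sequentially rejective weighted-graph algorithm as it is commonly described, not the adaptive extension presented in the webinar. The function name `graphical_test` and its argument layout are assumptions made here: `weights` are the initial node weights (summing to at most 1) and `transitions[i][j]` is the fraction of hypothesis i's weight passed to hypothesis j once i is rejected.

```python
def graphical_test(p_values, weights, transitions, alpha=0.025):
    """Return indices of rejected hypotheses, in order of rejection."""
    w = list(weights)
    g = [row[:] for row in transitions]
    active = set(range(len(p_values)))
    rejected = []
    while True:
        # A hypothesis can be rejected if its p-value is within its local level.
        candidates = [j for j in sorted(active) if p_values[j] <= w[j] * alpha]
        if not candidates:
            return rejected
        j = candidates[0]
        active.remove(j)
        rejected.append(j)
        new_w = w[:]
        new_g = [row[:] for row in g]
        for l in active:
            # Pass the weight of the rejected node along its outgoing edges.
            new_w[l] = w[l] + w[j] * g[j][l]
            for k in active:
                if l == k:
                    new_g[l][k] = 0.0
                    continue
                # Re-route edges that used to go through the rejected node.
                denom = 1.0 - g[l][j] * g[j][l]
                new_g[l][k] = (
                    (g[l][k] + g[l][j] * g[j][k]) / denom if denom > 0 else 0.0
                )
        new_w[j] = 0.0
        w, g = new_w, new_g


# Holm-type graph: equal weights, each node splits its weight evenly.
holm_weights = [1 / 3, 1 / 3, 1 / 3]
holm_graph = [[0.0, 0.5, 0.5], [0.5, 0.0, 0.5], [0.5, 0.5, 0.0]]
print(graphical_test([0.01, 0.02, 0.2], holm_weights, holm_graph, alpha=0.05))

# Hierarchical (fixed-sequence) graph: test the second only after the first.
print(graphical_test([0.01, 0.03], [1.0, 0.0], [[0.0, 1.0], [0.0, 0.0]], alpha=0.05))
```

The two example graphs correspond to special cases of the kind mentioned in the abstract: a Holm-type procedure and a hierarchical testing procedure.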
Franz Koenig is currently associate professor at the Section of Medical Statistics at the Medical University of Vienna, Austria. He is currently a member of the ethics committee of the Medical University of Vienna and also of the ethics committee of the community of Vienna. From 2008 to 2010 he was seconded to the European Medicines Agency (EMA) in London as statistical expert in the Unit Human Medicines Development and Evaluation, where he also held the Scientific Secretariat of the Biostatistics Working Party (BSWP). He was involved in the development of guidelines and assessment of statistical methods and clinical trial protocols. His main research interests are multiple testing, adaptive/flexible designs, interim analyses and data safety monitoring boards (DSMB). Professor Koenig has served as guest editor for special issues in Biometrical Journal and Statistics in Medicine. He is currently leading the work package ‘adaptive designs’ in the EU funded research project IDEAL and is deputy coordinator of an EU Horizon 2020 funded Marie Curie ITN network IDEAS on early drug development studies.
24 February 2015
Doubly robust estimation of the local average treatment effect curve
Speaker: Elizabeth Ogburn, Assistant Professor of Biostatistics at Johns Hopkins University. Summary
Chair: Dr Dylan Small, The Wharton School, University of Pennsylvania.
Co-authors: Andrea Rotnitzky and Jamie Robins
This paper is about estimation of the causal effect of a binary treatment on an outcome, conditional on covariates, from observational studies or natural experiments in which there may be unmeasured confounding of the treatment-outcome relationship but there is a binary instrument for treatment.
The paper describes a doubly robust, locally efficient estimator of the parameters indexing a model for the local average treatment effect, conditional on covariates V, when randomisation of the instrument is only true conditional on a high dimensional vector of covariates X, possibly bigger than V. (The local average treatment effect is the treatment effect among compliers, or those subjects whose treatment value would agree with their instrument value, whether that value were 0 or 1). It also discusses the surprising result that inference is identical to inference for the parameters of a model for an additive treatment effect on the treated conditional on V that assumes no treatment-instrument interaction.
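To make the target quantity concrete, the sketch below computes the simplest estimator of a local average treatment effect with a binary instrument: the ratio of the instrument's effect on the outcome to its effect on treatment uptake (often called the Wald estimator). This is a swapped-in, unconditional illustration that ignores covariates entirely; it is not the doubly robust estimator the paper proposes. The function name `wald_late` and the toy data are assumptions made here.

```python
from statistics import mean


def wald_late(z, d, y):
    """Wald ratio: (E[Y|Z=1] - E[Y|Z=0]) / (E[D|Z=1] - E[D|Z=0]).

    z: binary instrument, d: binary treatment received, y: outcome.
    """
    y1 = mean([yi for zi, yi in zip(z, y) if zi == 1])
    y0 = mean([yi for zi, yi in zip(z, y) if zi == 0])
    d1 = mean([di for zi, di in zip(z, d) if zi == 1])
    d0 = mean([di for zi, di in zip(z, d) if zi == 0])
    if d1 == d0:
        raise ValueError("instrument does not change treatment uptake")
    return (y1 - y0) / (d1 - d0)


# Toy data: four subjects per instrument arm (purely illustrative values).
z = [0, 0, 0, 0, 1, 1, 1, 1]
d = [0, 0, 0, 1, 1, 1, 0, 1]
y = [1, 2, 1, 5, 3, 4, 1, 5]

print(wald_late(z, d, y))  # 2.0: outcome difference 1.0 over uptake difference 0.5
```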
Elizabeth Ogburn (Betsy) has been an Assistant Professor of Biostatistics at Johns Hopkins University since August 2013. She received her PhD in biostatistics from Harvard University, where she worked with Andrea Rotnitzky and Jamie Robins, followed by a postdoctoral fellowship with Tyler VanderWeele at the Harvard School of Public Health Program on Causal Inference. She works on developing statistical methodology for causal inference, with a focus on novel data sources and structures; for example, using electronic medical records to inform individual-level healthcare decisions and using social network and other data that evince complex dependence among observations.
The paper is published in the Journal of the Royal Statistical Society: Series B (Statistical Methodology) and is available online to subscribers of the journal.
Webcast (YouTube), Slides (PDF)
20 November 2014
Modelling/predicting criminal behaviour
Chair: Professor Chris Skinner, professor of statistics at the London School of Economics & Political Science.
The item count method for sensitive survey questions: Modelling criminal behaviour
Speakers: Jouni Kuha and Jonathan Jackson
The item count method is a way of asking sensitive survey questions which protects the anonymity of the respondents by randomization before the interview. It can be used to estimate the probability of sensitive behaviour and to model how it depends on explanatory variables. The results of the authors’ analysis of criminal behaviour highlight the fact that careful design of the questions is crucial for the success of the item count method.
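A common basic form of the item count method can be sketched as follows. The assumption here, not stated in the summary above, is the usual design in which a randomly chosen control group reports how many of a list of innocuous items apply to them, while the treatment group receives the same list plus the sensitive item. The difference in mean counts then estimates the prevalence of the sensitive behaviour. The function name and toy counts are illustrative, and the sketch does not cover the regression modelling discussed in the paper.

```python
from math import sqrt
from statistics import mean, variance


def item_count_estimate(control_counts, treatment_counts):
    """Difference-in-means prevalence estimate with its standard error."""
    estimate = mean(treatment_counts) - mean(control_counts)
    std_error = sqrt(
        variance(treatment_counts) / len(treatment_counts)
        + variance(control_counts) / len(control_counts)
    )
    return estimate, std_error


# Reported counts only; no respondent reveals which items apply to them.
control = [1, 2, 0, 3, 2, 1]      # list of innocuous items
treatment = [2, 2, 1, 3, 2, 2]    # same list plus the sensitive item

estimate, std_error = item_count_estimate(control, treatment)
print(estimate, std_error)
```

With samples this small the standard error is large relative to the estimate, which is one reason the design of the item lists matters so much in practice.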
Which method predicts recidivism best? A comparison of statistical, machine learning and data mining prediction models
Speakers: Nikolaj Tollenaar and Peter van der Heijden
Risk assessment instruments are widely used in criminal justice settings all over the world. However, in recent times, different approaches to prediction have been developed. This paper investigates whether modern techniques in data mining and machine learning provide an improvement in predictive performance over classical statistical methods such as logistic regression and linear discriminant analysis. Using data from criminal conviction histories of offenders, these models are compared. Results indicate that in these data, classical methods tend to do equally well as or better than their modern counterparts.
Webcast (YouTube), Slides (PPTX)
1 April 2014
Joint event with Statisticians in the Pharmaceutical Industry (PSI) sponsored by Quintiles and Wiley
Chair: James Carpenter (London School of Hygiene & Tropical Medicine)
A Bayesian dose finding design for oncology clinical trials of combinational biological agents
Speaker: Ying Yuan, Department of Biostatistics, University of Texas
Co-authors: Chunyan Cai, Yuan Ji; Journal of the Royal Statistical Society: Series C (Applied Statistics), Volume 63, Issue 1, Pages 159–173, January 2014
Escalation strategies for combination therapy Phase I trials
Speaker: Michael J Sweeting (Department of Public Health and Primary Care, University of Cambridge)
Discussant: Tony Sabin (Amgen)
Co-author: Adrian P Mander; Pharmaceutical Statistics, Volume 11, Issue 3, Pages 258–266, May/June 2012
10 December 2013
Point process modelling for directed interaction networks
Speakers: Patrick O Perry and Patrick J Wolfe
Chair: John Aston (J.A.D.Aston@warwick.ac.uk)
Network data often take the form of repeated interactions between senders and receivers tabulated over time. A primary question to ask of such data is which traits and behaviours are predictive of interaction. To answer this question, a model is introduced for treating directed interactions as a multivariate point process: a Cox multiplicative intensity model using covariates that depend on the history of the process. Consistency and asymptotic normality are proved for the resulting partial-likelihood-based estimators under suitable regularity conditions, and an efficient fitting procedure is described. Multicast interactions – those involving a single sender but multiple receivers – are treated explicitly. The resulting inferential framework is then employed to model message sending behaviour in a corporate email network. The analysis gives a precise quantification of which static shared traits and dynamic network effects are predictive of message recipient selection.
Webcast (Flash), Audio (MP3), slides (PDF)
30 September 2013
A likelihood-based sensitivity analysis for publication bias in meta-analysis
Speaker: Professor John B Copas (Emeritus Professor of Statistics, University of Warwick)
Chair: Professor James Carpenter (London School of Hygiene & Tropical Medicine)
Publication bias, a serious threat to the validity of meta-analysis, is essentially a problem of non-random sampling. If the research studies identified in a systematic review are thought of as a sample from the population of all studies which have been done in the area of interest, and if studies which report a statistically significant result are more likely to be published than studies whose results are inconclusive, then a meta-analysis of the studies selected in the review will be biased, giving over-estimated treatment effects and exaggerated assessments of significance. This recent paper in Applied Statistics discusses a sample selection model for meta-analysis and suggests a sensitivity analysis that can be useful for assessing how large the effect of publication bias is likely to be. Two examples are discussed in detail, including an example of a published meta-analysis whose conclusion was completely contradicted by evidence from a later large collaborative clinical trial.
Webcast (Flash), audio (MP3) and slides (PDF) available.
13 June 2013
Speakers: Ron S Kenett (KPA, Raanana, Israel, University of Turin, Italy, and New York University–Poly, USA) and Galit Shmueli (Indian School of Business, Gachibowli, India)
Chair: Dr Shirley Coleman
We define the concept of information quality ‘InfoQ’ as the potential of a data set to achieve a specific (scientific or practical) goal by using a given empirical analysis method. InfoQ is different from data quality and analysis quality, but is dependent on these components and on the relationship between them. We survey statistical methods for increasing InfoQ at the study design and post-data-collection stages, and we consider them relative to what we define as InfoQ.
We propose eight dimensions that help to assess InfoQ: data resolution, data structure, data integration, temporal relevance, generalizability, chronology of data and goal, construct operationalization and communication. We demonstrate the concept of InfoQ, its components (what it is) and assessment (how it is achieved) through three case studies in on-line auctions research. We suggest that formalising the concept of InfoQ can help to increase the value of statistical analysis and data mining, both methodologically and practically, thus contributing to a general theory of applied statistics.
Ron Kenett's slides (PDF) and PowerPoint presentation. Webcast (YouTube video) and Galit Shmueli's slides (PDF) also available.
16 April 2013
Survival analysis (joint session with PSI)
Chair: James Carpenter (London School of Hygiene & Tropical Medicine)
Evaluating joint effects of induction – salvage treatment regimes on overall survival in acute leukaemia
Speaker: Abdus S Wahed (University of Pittsburgh and RSS)
Co-author: Peter F Thall, Journal of the Royal Statistical Society: Series C (Applied Statistics), Volume 62, Issue 1, Pages 67–83, January 2013
Slides (PDF). Abstract and the article are available on the Wiley Online Library website.
Attenuation of treatment effect due to measurement variability in assessment of progression-free survival
Speaker: Nicola Schmitt (AstraZeneca)
Co-authors: S Hong, A Stone, J Denne, Pharmaceutical Statistics, Volume 11, Issue 5, Pages 394–402, September/October 2012
Slides (PDF). Abstract and the article are available on the Wiley Online Library website. Webcast with slides (WMV | MP4) and audio only (MP3) available.