Discussion meetings

Discussion Meetings are events where articles ('papers for reading') appearing in the Journal of the Royal Statistical Society are presented and discussed. The discussion and authors' replies are then published in the relevant Journal series. About half of the meetings are organised by the Society's Research Section and the events are often preceded by an informal session on the issues raised by the papers. See our guidelines for papers for discussion.

To encourage discussion at our Discussion Meetings, preprints of journal papers are available to download before publication in one of our journals. Other papers, such as Presidential addresses, are also available to download. All preprints available here are provisional and subject to later amendment by the authors.

Contact Judith Shorten if you would like to make a written contribution to a discussion meeting or receive a preprint for each meeting by email.

Videos from past discussion meetings are available to watch.

Preprint discussion papers

2018

Research Section Discussion Meeting, Tuesday, 4 December 2018
Covariate-assisted ranking and screening for large-scale two-sample inference
T. Tony Cai, Wenguang Sun and Weinan Wang


A comparison of sample survey measures of earnings of English graduates with administrative data
Jack Britton, Neil Shephard and Anna Vignoles


Extended Discussion Meeting, Wednesday, 5 September 2018, at the Royal Statistical Society’s annual conference in Cardiff

Three papers on ‘Data visualization’:
‘Visualizing spatiotemporal models with virtual reality: from fully immersive environments to applications in stereoscopic view’ (S. Castruccio, M. G. Genton and Y. Sun);
‘Visualization in Bayesian workflow’ (J. Gabry, D. Simpson, A. Vehtari, M. Betancourt and A. Gelman);
‘Graphics for uncertainty’ (A. W. Bowman).


Preprints

2018

Research Section Discussion Meeting, Tuesday, 4 December 2018

Covariate-assisted ranking and screening for large-scale two-sample inference
T. Tony Cai (University of Pennsylvania, Philadelphia) and Wenguang Sun and Weinan Wang (University of Southern California, Los Angeles)

Two-sample multiple testing has a wide range of applications. The conventional practice first reduces the original observations to a vector of p-values and then chooses a cut-off to adjust for multiplicity. However, this data reduction step could cause significant loss of information and thus lead to suboptimal testing procedures. We introduce a new framework for two-sample multiple testing by incorporating a carefully constructed auxiliary variable in inference to improve the power. A data-driven multiple-testing procedure is developed by employing a covariate-assisted ranking and screening (CARS) approach that optimally combines the information from both the primary and the auxiliary variables. The proposed CARS procedure is shown to be asymptotically valid and optimal for false discovery rate control. The procedure is implemented in the R package CARS. Numerical results confirm the effectiveness of CARS in false discovery rate control and show that it achieves substantial power gain over existing methods. CARS is also illustrated through an application to the analysis of a satellite imaging data set for supernova detection.
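As an illustration of the conventional practice described in the abstract (reduce the data to p-values, then choose a cut-off that adjusts for multiplicity), the sketch below applies the Benjamini-Hochberg step-up procedure to a vector of p-values. It is not the CARS procedure and is not taken from the paper or the R package; the function name benjamini_hochberg, the level alpha = 0.05 and the example p-values are assumptions chosen for illustration.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return one boolean per hypothesis: True where it is rejected.

    Illustrative sketch of the Benjamini-Hochberg step-up procedure,
    which controls the false discovery rate at level alpha.
    """
    m = len(p_values)
    # Indices of the hypotheses, ordered by ascending p-value.
    order = sorted(range(m), key=lambda i: p_values[i])

    # Find the largest rank k such that p_(k) <= k * alpha / m.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * alpha / m:
            cutoff_rank = rank

    # Reject every hypothesis ranked at or below the cut-off.
    rejected = [False] * m
    for idx in order[:cutoff_rank]:
        rejected[idx] = True
    return rejected


# Hypothetical p-values, for illustration only.
p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
decisions = benjamini_hochberg(p, alpha=0.05)
print(decisions)
```

Only the p-values enter this calculation, which is the data reduction step that the abstract argues can lose information.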

To be published in Series B; for more information go to the Wiley Online Library.

The preprint is available to download.
‘Covariate-assisted ranking and screening for large-scale two-sample inference’ (PDF)
Supporting information (PDF)
Data and computer code (.zip)


A comparison of sample survey measures of earnings of English graduates with administrative data

Jack Britton (Institute for Fiscal Studies, London), Neil Shephard (Harvard University, Cambridge) and Anna Vignoles (University of Cambridge)

Administrative data sets are increasingly used in research because of their excellent coverage and large scale. However, in the UK the use of administrative data on individuals’ earnings, and particularly graduates’ earnings, is novel. Understanding the strengths and weaknesses of such data is important as they are set to be used extensively for research and to inform policy. Here we compare survey-based labour earnings data from the UK’s Labour Force Survey (LFS) with UK Government administrative sources of individual level earnings data, focusing separately on young (up to age 32 years) graduates and non-graduates. This type of administrative data set has few sample selection issues and is longitudinal and its large samples mean that the earnings of subpopulations can potentially be studied with low error. Overall we find a similar share of individuals with zero earnings in the LFS and administrative data, but a considerably higher share (conditionally on working) earning below £8000 in the administrative data. The LFS has generally higher earnings right through the distribution, though above the median a large share of the differences can potentially be explained by employee pension contributions. We also find considerably larger gender differences in the survey data. The findings hold for both graduates and non-graduates. These differences are substantively important and suggest different conclusions about the gender wage gap, the graduate earnings premium and the extent of earnings inequality.

To be published in Series A; for more information go to the Wiley Online Library.

The preprint is available to download.
‘A comparison of sample survey measures of earnings of English graduates with administrative data’ (PDF)
Supporting information (PDF)
Data and computer code (.zip)


Extended Discussion Meeting on ‘Data visualization’, Wednesday, 5 September 2018

Stefano Castruccio (University of Notre Dame, USA) and Marc G. Genton and Ying Sun (King Abdullah University of Science and Technology, Thuwal)

‘Visualizing spatiotemporal models with virtual reality: from fully immersive environments to applications in stereoscopic view’

Recent advances in computing hardware and software present an unprecedented opportunity for statisticians who work with data indexed in space and time to visualize, explore and assess the structure of the data and to improve resulting statistical models. We present results of a 3-year collaboration with a team of visualization experts on the use of stereoscopic view and virtual reality (VR) to visualize spatiotemporal data with animations on non-trivial manifolds. We first present our experience with fully immersive VR with motion tracking devices that enable users to explore global three-dimensional time–temperature fields on a spherical shell interactively. We then introduce a suite of applications with VR mode, freely available for smartphones, to port a visualization experience to any interested people. We also discuss recent work with head-mounted devices such as a VR headset with motion tracking sensors.

To be published in Series A; for more information go to the Wiley Online Library.

The preprint is available to download.
‘Visualizing spatiotemporal models with virtual reality: from fully immersive environments to applications in stereoscopic view’ (PDF)
View animation (.zip)
Supporting information (PDF)

Jonah Gabry (Columbia University, New York), Daniel Simpson (University of Toronto), Aki Vehtari (Aalto University, Espoo), Michael Betancourt (Columbia University, New York, and Symplectomorphic, New York) and Andrew Gelman (Columbia University, New York)

‘Visualization in Bayesian workflow’

Bayesian data analysis is about more than just computing a posterior distribution, and Bayesian visualization is about more than trace plots of Markov chains. Practical Bayesian data analysis, like all data analysis, is an iterative process of model building, inference, model checking and evaluation, and model expansion. Visualization is helpful in each of these stages of the Bayesian workflow and it is indispensable when drawing inferences from the types of modern, high dimensional models that are used by applied researchers.

To be published in Series A; for more information go to the Wiley Online Library.

The preprint is available to download.
‘Visualization in Bayesian workflow’ (PDF)
Supporting information (PDF)

Adrian W. Bowman (University of Glasgow)

‘Graphics for uncertainty’

Graphical methods such as colour shading and animation, which are widely available, can be very effective in communicating uncertainty. In particular, the idea of a ‘density strip’ provides a conceptually simple representation of a distribution and this is explored in a variety of settings, including a comparison of means, regression and models for contingency tables. Animation is also a very useful device for exploring uncertainty and this is explored particularly in the context of flexible models, expressed in curves and surfaces whose structure is of particular interest. Animation can further provide a helpful mechanism for exploring data in several dimensions. This is explored in the simple but very important setting of spatiotemporal data.
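One common reading of a density strip is a strip whose darkness at each point is proportional to the density of the distribution there. The sketch below computes such shading levels for a normal distribution on a grid of values. It is an assumption-laden illustration, not the paper's code: the function name density_strip_shades, the normal model and the grid are all hypothetical choices.

```python
import math


def density_strip_shades(mean, sd, grid):
    """Return a shade in (0, 1] for each grid point (1 = darkest).

    Each shade is the normal density at that point divided by the
    largest density on the grid, so the darkest band of the strip
    marks the most plausible value among the grid points.
    """
    densities = [
        math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))
        for x in grid
    ]
    peak = max(densities)
    return [d / peak for d in densities]


# Hypothetical example: an estimate of 0 with standard error 1,
# evaluated from -3 to 3 in steps of 0.5.
grid = [k / 2 for k in range(-6, 7)]
shades = density_strip_shades(mean=0.0, sd=1.0, grid=grid)
print([round(s, 3) for s in shades])
```

Passing the shades to any plotting tool as grey levels along a horizontal bar would give a strip that fades out towards the tails of the distribution.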

To be published in Series A; for more information go to the Wiley Online Library.

The preprint is available to download.
‘Graphics for uncertainty’ (PDF)
View animations (.zip)

This meeting forms part of the RSS 2018 Conference, and anyone registered for that day can attend the meeting automatically. If you cannot attend the conference but wish to attend only the discussion meeting session, please contact the conference office.