SciPy Proceedings 2021 Survey

Last year, the SciPy Conference Proceedings Committee (Proccom) started collecting demographic data from authors and reviewers, in order to understand:

  1. how authors compare to conference attendees; and,
  2. how authors compare to reviewers.

This post is an update with new data from 2021; the discussion of the 2020 results is available here.

Both 2020 and 2021 were unusual years for the SciPy conference, due to the ongoing global pandemic, and the conference moving to an online format. Last year, this seemed to have the effect of increasing the number of conference participants in general, and especially those from countries besides the United States. SciPy 2021 had a record number of participants:

but this did not translate into a record number of papers or reviewers. The number of full papers submitted to SciPy 2021 was on the low end for a typical year (about the same number of papers as 2017). The number of volunteer reviewers, by contrast, was less than half of what we typically see, and was not enough to field reviewers for all the full papers this year (Proccom chairs stepped in to review the remainder).

Out of the total number of reviewers and [corresponding] authors, fewer than a quarter filled out the post-conference demographics survey. Since the original population was already fewer than a hundred individuals, this has left us with a dataset too small to draw any meaningful conclusions, or to compare it to the demographic data for the conference as a whole. Nevertheless, we will present these data "as-is" for posterity.

To protect the privacy of the individuals who responded to the survey, any responses with fewer than five counts have been combined into an "Other" category. This has had the effect of reducing most comparisons to binaries.
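The combining step described above can be sketched roughly as follows. This is a minimal illustration, not the code Proccom actually used: the function name `combine_small_categories`, its parameters, and the example responses are all hypothetical.

```python
from collections import Counter


def combine_small_categories(responses, min_count=5, other_label="Other"):
    """Replace every response value seen fewer than min_count times with other_label.

    Hypothetical helper for illustration; the real survey pipeline may differ.
    """
    counts = Counter(responses)
    return [r if counts[r] >= min_count else other_label for r in responses]


# Made-up responses, NOT the actual survey data.
responses = ["A"] * 7 + ["B"] * 5 + ["C"] * 3 + ["D"] * 2
combined = combine_small_categories(responses)
print(Counter(combined))
```

Note that in a sketch like this, the combined "Other" category can itself end up with fewer than five counts if only a few rare responses exist; handling that case is left out here.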

How is gender distributed across authors?

Similarly to last year, ~10% of authors and reviewers identified as something besides male. This, of course, could be a result of sampling error.

scipy-proceedings-2021-gender-plot
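To give a sense of how much sampling error a sample this small can carry, here is a rough sketch of a 95% Wilson score interval for a proportion. The counts used (2 out of 20) are hypothetical, chosen only to mimic "about 10% of a couple dozen respondents"; they are not the survey's actual numbers, and the helper name `wilson_interval` is our own.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 is roughly 95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half_width, center + half_width


# Hypothetical counts, NOT the actual survey data.
low, high = wilson_interval(successes=2, n=20)
print(f"{low:.1%} to {high:.1%}")
```

With counts like these, the interval spans from a few percent to roughly 30%, which is why the comparisons in this post should be read with caution.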

How is age distributed across authors?

Like last year, the 35-50 age group is overrepresented as authors, and underrepresented as reviewers. This, of course, could be a result of sampling error.

scipy-proceedings-2021-age-plot

How is ethnicity distributed across authors?

Unlike last year, respondents who identify as "white" appeared in roughly equal proportions as both authors and reviewers, although they still appear to be overrepresented as participants in the proceedings in general. This, of course, could be a result of sampling error.

scipy-proceedings-2021-ethnicity-plot

Like last year, our ethnicity categories are derived from the U.S. Census, and as such may not be useful or accurate for participants from other countries. This may be confounding the relationship between ethnicity and authorship, as authors are overwhelmingly American (>75% in 2020 and 2021) but reviewership has been closer to 50/50 over the last two years. This, of course, could be a result of sampling error.

scipy-proceedings-2021-country-plot

How is employment distributed across authors?

Unlike last year, both authors and reviewers tended to hold academic positions. This might be a result of non-academic conference attendees reducing their academic activities, possibly due to stress or higher workloads in other parts of their lives during the pandemic. This, of course, could be a result of sampling error.

scipy-proceedings-2021-employer-plot

How is career stage distributed across authors?

In 2020, respondents had a hard time deciphering the survey language around career stage. This year, Proccom opted for survey questions that were more strongly tied to job titles:

scipy-proceedings-2021-career-plot

then mapped these afterward to career stages. Similar to 2020, mid-career respondents were more likely to be authors, and early career respondents were more likely to be reviewers. This, of course, could be a result of sampling error.

scipy-proceedings-2021-career-stage-plot

How are roles in the library ecosystem distributed across authors?

Like last year, about a third of respondents report being a maintainer of a core scientific Python library. Also like last year, core developers were more likely to be authors, and less likely to be reviewers. This, of course, could be a result of sampling error.

scipy-proceedings-2021-tools-plot

How are relationships to the conference distributed across authors?

Like last year, fewer than half of reviewers reported presenting or having presented a paper at SciPy. This is not shown in the plot below, but like last year, some fraction of reviewers still report never having attended a SciPy conference in the past, and not attending this year's conference either. This, of course, could be a result of sampling error.

scipy-proceedings-2021-conf-plot