Panel Provides Open Access Insight in COVID-19 Era
Eleonora Presani’s first day as Executive Director of arXiv was March 16, 2020 – the same day that much of New York State was locked down.
Even though she was new to arXiv, the free, open-source science research database maintained and operated by Cornell, she immediately realized that some things needed to change.
“Basically out of nowhere we started to see over 100 [coronavirus-related] papers per week,” Presani told an Oct. 19 panel, “Rapid response: How arXiv and other open access resources are adapting to the needs of the research community during the pandemic.”
“We needed to do something about it,” she said, “to make sure we could treat them to the same level of service that we provide for our other papers.”
The mission of arXiv is to make scholarly research that has not yet been published available to researchers around the world. When COVID-19 hit, that mission became even more urgent, as scientists rushed to learn more about the coronavirus and develop ways to control the pandemic. But it also highlighted the challenges facing pre-print servers, including arXiv, bioRxiv, and medRxiv, in providing easy access to a huge volume of research that has yet to undergo peer review while minimizing the spread of potential disinformation.
With nearly 1.8 million articles in its collection, arXiv relies on a team of 185 volunteer moderators, supported by a small team of collaborators, to screen the articles submitted. But since the majority of its content is in the fields of physics, mathematics and computer science, the organization needed to quickly increase its expertise in COVID-related research.
In addition to creating a new search category for articles on coronaviruses, arXiv has hired a Cornell postdoctoral researcher in biomedicine to help filter these articles for potentially misleading content. arXiv has also partnered with other organizations compiling open access research, in order to make these articles more easily accessible and to accelerate the pace of discovery around the world.
At bioRxiv and medRxiv, which have seen huge spikes in submissions as COVID-19 spread, moderators soon realized they needed improved measures to address the risk of harmful misinformation. When an article published in January claimed that inserts in the coronavirus spike protein resembled protein sequences from HIV, technical issues with the paper sparked a storm of criticism that led to its removal, John Inglis, co-founder of bioRxiv and medRxiv, said during the panel.
“But we also noticed over the weekend that the paper was picked up by conspiracy-oriented websites discussing the possibility that the virus was human-designed,” Inglis said. “We have decided to post an additional disclaimer regarding the use of pre-prints – a large yellow banner that reminds readers that these are pre-prints and, among other things, should not be reported in the US media as established news.”
The sites, hosted by the Cold Spring Harbor Laboratory, also have additional screening practices in place, including a two-step review of each article. Screeners look for issues like plagiarism and claims that could harm the public if they turn out to be wrong, Inglis said.
“Part of it was a realization that in all the furor over hydroxychloroquine, for example, the run on the drug was such that patients who really needed hydroxychloroquine were denied access,” he said. “We were very aware that we didn’t want to add to this problem or create another similar problem.”
The arXiv, bioRxiv and medRxiv articles are among those included in CORD-19 (COVID-19 Open Research Dataset), a new dataset made available so that artificial intelligence can be used to mine the vast body of coronavirus research.
“We wanted to take all the research on COVID-19 and the coronavirus, going back to the 1960s and 1970s, when the initial research on the coronavirus began to be published, and make it available in a machine-readable format,” said panelist Sebastian Kohlmeier, CORD-19 Project Manager and Senior Director at Semantic Scholar, Allen Institute for Artificial Intelligence, which created CORD-19 in partnership with other research groups.
When CORD-19 started in March, it contained 29,000 items. Today there are more than 280,000.
“It has been absolutely amazing,” Kohlmeier said. “And one of the main drivers of making all content available was that many scientific publishers and preprint servers came together to make COVID-19 related research open access, which was a very important part of making this research accessible to scientists who wanted to analyze the entire universe of coronavirus research.”