University scientists at the Minor Lab have created a database to provide the most accurate and up-to-date information about the COVID-19 virus, in hopes of giving the biomedical community trusted structural information about the virus’s components. The team hopes the website will help scientists and the broader research community find and use accurate information about the virus, and it is actively looking for collaborators.
Wladek Minor, a Harrison Distinguished Teaching Professor at the School of Medicine, teamed up with Medical School research scientists Ivan Shabalin and Dariusz Brzezinski to create the resource.
Structural models are instrumental in the search for vaccines and other drugs, as pharmaceutical companies often use this information to develop compounds that stop the virus from replicating and block the function of its proteins. In this race, it is crucial for scientists and researchers to test their products on the most structurally accurate models possible. However, determining a molecular structure is neither easy nor foolproof.
“Among hundreds of very high-quality models, a small number reminds us that the process of structure determination is not fail-safe,” Minor said in an email to The Cavalier Daily. “The number of suboptimal structures is very low, but the ripple effect of suboptimal or sometimes even erroneous structures may invalidate many biomedical hypotheses when discovered.”
According to Minor, many researchers and structural biologists have been adding protein structures related to COVID-19 to the Protein Data Bank daily, making them publicly available prior to publication and peer review, an evaluation of scientific or academic work done by other professionals in the same field.
“Since the structures from the first-line research are produced in an accelerated mode, there is an elevated chance of mistakes and errors with the ultimate risk of hindering, rather than speeding up, drug development,” Minor said.
Understanding these issues, Minor and his team set out to create an accessible database that provides the most accurate representation of the structure of COVID-19 and to share this knowledge with the scientific community.
“Our first goal is to organize it in a way that is easily accessible for the biomedical community,” Minor said. “And second, to provide scientists with structures which are optimal or close to optimal, so that they can trust the data.”
When the spread of COVID-19 accelerated in March and most students and faculty were sent home, Minor and his team were hard at work.
“The hardest time was the end of March,” Shabalin said. “We were working very hard, pushing this as fast as we could, [and] working through weekends.”
Brzezinski added that it was challenging to curate all the data as soon as it became available.
“The idea was that we should show [the information] to the world, and what information we should show … how and with only a couple of days to do that, it was definitely something hard,” Brzezinski said. “You have data coming in all the time but we have to share it with other people and doing that very quickly was definitely a hurdle.”
Shabalin, Minor, Brzezinski and a few other top scientists used their past experience in experimental validation and interpretation to improve COVID-19 structural models. Their existing tools and resources fueled the development of the new database.
Every week the Protein Data Bank, a database of three-dimensional structures of biological molecules, is updated, and the team extracts the information related to COVID-19. The team then uses validation tools to verify the data and manually inspects the protein models.
The website opens with a filtering system, followed by a search bar and a table listing each structure’s Protein Data Bank code, resolution (a measure of the structure’s clarity), release date, title, detection method and number of ligands, which are small molecules that bind to a protein.
“The website categorizes the analyzed proteins according to the experimental method used to determine the structure, virus type, protein type and ligand status,” Minor said.
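As a rough illustration of this kind of categorization, the Python sketch below filters a small table of entries by experimental method, protein type and ligand status. It is not the team's code: the `Entry` fields, the `filter_entries` helper and the placeholder rows are all hypothetical and are not taken from the actual database.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    # Hypothetical record layout; the real database's fields may differ.
    pdb_code: str
    method: str        # experimental method, e.g. "X-ray" or "cryo-EM"
    virus: str
    protein: str
    resolution: float  # in angstroms; lower values mean a sharper structure
    ligand_count: int


def filter_entries(entries, method=None, protein=None, has_ligands=None):
    """Keep the entries that match every filter that was supplied."""
    kept = []
    for entry in entries:
        if method is not None and entry.method != method:
            continue
        if protein is not None and entry.protein != protein:
            continue
        if has_ligands is not None and (entry.ligand_count > 0) != has_ligands:
            continue
        kept.append(entry)
    return kept


# Made-up placeholder rows, not real Protein Data Bank entries.
ENTRIES = [
    Entry("demo1", "X-ray", "SARS-CoV-2", "main protease", 1.8, 1),
    Entry("demo2", "cryo-EM", "SARS-CoV-2", "spike", 3.2, 0),
    Entry("demo3", "X-ray", "SARS-CoV-2", "main protease", 2.1, 0),
]

if __name__ == "__main__":
    hits = filter_entries(ENTRIES, method="X-ray", has_ligands=True)
    print([entry.pdb_code for entry in hits])
```

Leaving a filter unset means it is ignored, so calling `filter_entries(ENTRIES)` with no filters returns every row, much as a website with no filters selected would show the full table.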
The resource also reports on the quality of each structural model.
“Because structural models are the interpretation of the diffraction data, they are sometimes suboptimal or even erroneous,” Minor said. “In most cases, only minor corrections were suggested. However, in several cases, the revisions were significant, especially in the sensitive area of protein-ligand complexes that are critical for follow-up research, like drug discovery.”
The website is updated daily, and any errors that are spotted are corrected immediately.
“We have a couple of ideas on how to visualize the virus and connect it with the proteins,” Brzezinski said. “[In the future,] we want to use some machine learning and verify that.”
The team continues to add and refine information on the virus, contributing to the fight against COVID-19.
“The whole project is still continuing to improve new structures that appear in the Protein Data Bank,” Shabalin said.
Minor also noted that strong collaboration has played a key role in the project’s development.
“We are not looking for certain types of institutions,” Minor said. “We are ready to collaborate with anyone who is willing to help us.”
The team’s members come from diverse backgrounds: Minor and Shabalin have experience in structural biology, while Brzezinski is a computer scientist. The team also collaborates with a number of international organizations.
Much of the team's funding came from international organizations including the Polish National Science Centre, the Polish National Agency for Academic Exchange, the Austrian Science Foundation and the Center for Cancer Research.
“Working on a project driven by strong international collaborations is an enormous opportunity for younger scientists, like Ivan Shabalin and Dariusz Brzezinski, who will undoubtedly lead other highly impactful studies in the near future,” Minor said.