The Cavalier Daily
Serving the University Community Since 1890

Improved data algorithms could help physicians diagnose, treat diseases

Researchers at the School of Data Science explore new technologies at the intersection of data science and medicine

Recent advances in algorithms are providing physicians with new tools to predict, diagnose and even treat diseases. Professors and students at the School of Data Science are at the forefront of this progress.

Algorithms are sets of rules that software follows to sort and analyze data. The algorithms being developed in the medical field incorporate new forms of data, including how patients speak about their symptoms and very high-resolution images that can be zoomed in to the nuclear level.

Engineering Prof. Don Brown, founding director of the Data Science Institute, is working on the development of these algorithms and their implications for medicine.

“[The algorithm development] has allowed us to look at and better diagnose diseases,” Brown said. “For example, when you look at an image from a biopsy, it's hard to read that image. So it makes it much easier for us to use computers to understand what's happening in images like that.”

Deep learning models computerize processes that humans do naturally, such as identifying images of dogs and cats. Image identification functions can also be applied to analyze medical data. For example, an image of a biopsied cell will have features that lead a physician to identify the cell as abnormal or healthy. 

These common features, or patterns that distinguish images of healthy cells from images of abnormal cells, serve as guidelines for the algorithm. Once a deep learning model has learned them, it can sort through new images and label each one as healthy or abnormal. The benefit of using a model is that many more images can be analyzed rapidly.
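To make the idea of learning from labeled examples concrete, here is a minimal sketch. It is not the researchers' method: it swaps in a much simpler nearest-centroid classifier for a deep learning model, and the two numeric features and all of the values are hypothetical, invented only for illustration.

```python
from math import dist
from statistics import fmean


def fit_centroids(examples):
    """Average the feature vectors of each label to get one centroid per label."""
    by_label = {}
    for features, label in examples:
        by_label.setdefault(label, []).append(features)
    return {
        label: tuple(fmean(column) for column in zip(*rows))
        for label, rows in by_label.items()
    }


def predict(centroids, features):
    """Label a new example with the label of the closest centroid."""
    return min(centroids, key=lambda label: dist(centroids[label], features))


# Hypothetical features per image: (nucleus size, staining irregularity).
# These numbers are made up; real models learn features from raw pixels.
training = [
    ((1.0, 0.2), "healthy"),
    ((1.2, 0.1), "healthy"),
    ((0.9, 0.3), "healthy"),
    ((2.8, 0.9), "abnormal"),
    ((3.1, 0.8), "abnormal"),
    ((2.9, 1.0), "abnormal"),
]

centroids = fit_centroids(training)

# Two new, unlabeled images described by the same hypothetical features.
new_images = [(1.1, 0.25), (3.0, 0.85)]
labels = [predict(centroids, features) for features in new_images]
print(labels)
```

The same two steps, learning from labeled images and then labeling new ones, apply to a deep learning model. The difference is that such a model discovers its own features from the image pixels instead of relying on hand-picked measurements.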

Graduate data science student Saurav Sengupta collaborated with peers at the University and others in Zambia, London and Pakistan on a capstone project that applied these models to the diagnosis of celiac disease.

“We were able to build a model that was able to predict with a high degree of accuracy if the image that we were seeing is a celiac disease image, or a normal image or environmental enteropathy,” Sengupta said. “We had to classify each image into the three classes and see if there are medical insights that could be had when we investigate those models.”

Part of the model Sengupta worked on classified images of environmental enteropathy, a chronic intestinal inflammation disorder. These algorithms are now being used at the School of Data Science to analyze a wide variety of diseases, including Barrett's, Crohn's and Alzheimer's disease.
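One common way for a classifier to choose among three classes like those Sengupta describes is to turn raw scores into probabilities with a softmax function and pick the most likely class. The sketch below assumes that approach; the article does not say how the team's model produces its final answer, and the scores are hypothetical.

```python
from math import exp

# The three classes named in the article.
CLASSES = ("celiac disease", "normal", "environmental enteropathy")


def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [exp(score - peak) for score in scores]
    total = sum(exps)
    return [value / total for value in exps]


def classify(scores):
    """Return the most probable class and its probability."""
    probabilities = softmax(scores)
    best = max(range(len(CLASSES)), key=probabilities.__getitem__)
    return CLASSES[best], probabilities[best]


# Hypothetical raw scores that a model might output for one biopsy image.
label, confidence = classify([2.0, 0.5, 1.0])
print(label, round(confidence, 2))
```

The probability attached to the chosen class gives a rough sense of how sure the model is, which matters when a prediction may inform a diagnosis.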

“If you're making the prediction that the person has a disease, you have to be very sure of that prediction, and you have to be able to explain why you made that decision,” Sengupta said. “A lot of like real-world, state-of-the-art methods don't really have those things and the major challenge for us is to make the models more explainable such that they are giving you a high degree of accuracy.”

The role of the physician in this process remains important as well. Dr. Sana Syed, a pediatric gastroenterologist at U.Va. Health, uses artificial intelligence for pattern recognition in biopsy images. 

“You have to have a human because there are all these limitations of bias,” Syed said. “And then the other thing is an algorithm can't tell you what to do if something goes wrong. So a human has to be part of that, but it can enhance your decision-making.”

Biases, or a model's tendency to favor certain outcomes, come from training on a data set that is too small or not representative, Syed said. ImageNet, a research project created by Prof. Fei-Fei Li at Stanford University, enables researchers to train image recognition models and has had a major impact on the field, according to Syed. The power of ImageNet comes from its extremely large data set of 15 million data points. The larger the data set a model is trained on, the more accurate the model is likely to be when encountering new data.
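The effect of an unrepresentative data set can be shown with a toy example. The sketch below assumes a single made-up measurement per patient and two hypothetical patient groups whose healthy ranges differ; the model is a simple threshold rule, not a deep learning model, and none of the numbers come from real data.

```python
from statistics import fmean


def fit_threshold(samples):
    """Place the cutoff halfway between the healthy and abnormal averages."""
    healthy = [value for value, label in samples if label == "healthy"]
    abnormal = [value for value, label in samples if label == "abnormal"]
    return (fmean(healthy) + fmean(abnormal)) / 2


def accuracy(threshold, samples):
    """Fraction of samples labeled correctly by the cutoff."""
    correct = sum(
        ("abnormal" if value > threshold else "healthy") == label
        for value, label in samples
    )
    return correct / len(samples)


# Hypothetical training data: group B's healthy values run higher than group A's.
group_a = [(1.0, "healthy"), (1.2, "healthy"), (3.0, "abnormal"), (3.2, "abnormal")]
group_b = [(2.4, "healthy"), (2.6, "healthy"), (4.4, "abnormal"), (4.6, "abnormal")]

# Hypothetical new patients from group B, not seen during training.
held_out_b = [(2.5, "healthy"), (2.3, "healthy"), (4.5, "abnormal"), (4.3, "abnormal")]

skewed_threshold = fit_threshold(group_a)                   # trained on group A only
representative_threshold = fit_threshold(group_a + group_b)  # trained on both groups

skewed_accuracy = accuracy(skewed_threshold, held_out_b)
representative_accuracy = accuracy(representative_threshold, held_out_b)
print(skewed_accuracy, representative_accuracy)
```

In this toy setup, the model trained only on group A flags every healthy group B patient as abnormal, while the model trained on both groups labels the same patients correctly.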

The next steps for research at the intersection of data science and medicine lie in improving the accuracy of these models. Researchers at the School of Data Science and U.Va. Health are working together to improve this technology and continue to apply it in a medical setting.

“There's a lot of work that needs to be done on improving the algorithms and better understanding the characteristics of the algorithms so that we can drive those improvements,” Brown said. “There's a lot of work that needs to be done in building out these kinds of techniques — these kinds of data science machine learning techniques — that will do an even better job of prediction, diagnosis and classification.”
