Publication Date

Spring 2020

Advisor(s) - Committee Chair

Qi Li (Director), Guangming Xing, and Zhonghang Xi

Degree Program

School of Engineering and Applied Sciences

Degree Type

Master of Science


Inflammatory bowel disease (IBD) is a set of disorders that involve chronic inflammation of the digestive tract, e.g., Crohn's disease (CD) and ulcerative colitis (UC). Millions of people around the world have IBD. However, IBD remains difficult to treat because its cause is unknown. Accurately diagnosing IBD can also be very challenging, since some IBD symptoms mimic those of other conditions. In this work, we apply classification methods to help improve the success rate of diagnosis. We study four formulations of IBD classification: i) IBD and non-IBD (binary classification), ii) CD and non-IBD (binary classification), iii) UC and non-IBD (binary classification), and iv) CD, UC, and non-IBD (ternary classification). We applied a number of classification methods, including decision tree, Naive Bayes, K-nearest neighbor, and rule-based classifiers, to these IBD classification problems using a metagenomic dataset collected from stool samples. Our study shows that a rule-based classifier achieves the best combination of classification accuracy and readability. We also explored the roles of attributes in the diagnosis of IBD by interpreting the learned models. Studying the importance of specific attributes could lead to a better understanding of IBD, either by discovering new connections or by reinforcing known ones.
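As an illustration only, the four formulations can be sketched as relabelings of a single diagnosis column. This is a minimal sketch, not the thesis's implementation. The label strings, the formulation names, and the `relabel` helper are all hypothetical. The ternary case is taken to be CD, UC, and non-IBD. The sketch also assumes that samples of the other IBD subtype are dropped in formulations ii) and iii); the abstract does not say how they are handled.

```python
# Hypothetical label strings; the dataset's actual encoding is not given in the abstract.
CD, UC, NON_IBD = "CD", "UC", "nonIBD"

def relabel(labels, formulation):
    """Return (kept_indices, new_labels) for one of the four formulations.

    Assumption: in the CD-vs-non-IBD and UC-vs-non-IBD cases, samples of
    the other IBD subtype are dropped rather than merged into non-IBD.
    """
    kept, new_labels = [], []
    for i, y in enumerate(labels):
        if formulation == "IBD_vs_nonIBD":
            # Merge both subtypes into a single IBD class.
            new = "IBD" if y in (CD, UC) else NON_IBD
        elif formulation == "CD_vs_nonIBD":
            if y == UC:
                continue  # drop UC samples (assumption)
            new = y
        elif formulation == "UC_vs_nonIBD":
            if y == CD:
                continue  # drop CD samples (assumption)
            new = y
        elif formulation == "CD_UC_nonIBD":
            new = y  # ternary problem: keep all three classes
        else:
            raise ValueError(f"unknown formulation: {formulation}")
        kept.append(i)
        new_labels.append(new)
    return kept, new_labels
```

The returned indices can be used to select the matching rows of the feature matrix before training any of the classifiers named above.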


Bioinformatics | Computer Sciences | Physical Sciences and Mathematics