Week 9

So this is week 9, meaning that there's only one more week left. Right now, I'm focusing mostly on finalizing my results and making my presentation more interesting. Since this was a pretty uneventful week, I had a lot of time to think about past weeks and about where I am in this project. Although the progress wasn't as fast as I initially hoped for, I do have some interesting findings. If I had more time, I think I could significantly expand on my results and do other cool things with the data.

After getting all the series that were referenced in each radiologist report, I found that the references were spread very unevenly across the series. When radiologists read the MRI scans to examine the prostate, the most commonly referenced images were in series 9, 31002, 1100, 5, 10, and 11. In fact, these series accounted for around half of all series referenced, while around a dozen series were very rarely referenced. This means that in order to cut time, a radiologist who wants to find and describe biomarkers on the prostate could look first at images in the 6 most commonly referenced series.
This still doesn't tell us much about how the series are used in describing specific traits of the prostate. I can expand on this later by looking at how series are grouped together with phrases and keywords such as "neurovascular bundle invasion". We can, however, get some idea of how the popularity of certain series differs between organs/regions of the abdomen. For example, I also created a pie chart showing the distribution of series referenced by radiologists when describing the lymph nodes surrounding the prostate.
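To make the "around half of all series referenced" figure concrete, here is a minimal sketch of how such a tally could be computed in Python. It is not necessarily the code I used: the function names are hypothetical, and it assumes the references have already been pulled out of the reports into a flat list with one series number per mention. The input list is made-up toy data, not my actual counts.

```python
from collections import Counter

def series_shares(references):
    """Return (series, count, share) tuples, most referenced first."""
    counts = Counter(references)
    total = sum(counts.values())
    return [(s, n, n / total) for s, n in counts.most_common()]

def top_series_covering(references, target=0.5):
    """Smallest group of most-referenced series whose combined share reaches target."""
    covered = 0.0
    chosen = []
    for series, _, share in series_shares(references):
        chosen.append(series)
        covered += share
        if covered >= target:
            break
    return chosen, covered

# Made-up toy input (NOT the real data): one entry per series mention in a report.
toy_refs = [9, 9, 9, 9, 5, 5, 5, 10, 10, 11]
chosen, covered = top_series_covering(toy_refs, target=0.5)
print(chosen, round(covered, 2))
```

Sorting by count before accumulating shares is what makes the result the smallest set of series that reaches the target coverage.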

The series distribution for the lymph nodes reveals that, at least for the approximately 250 cases I examined, there were not as many relevant series as there were for the prostate. In fact, only 13 series were ever used across these 250 cases to describe the lymph nodes, and of these series, only 8 appeared more than 5% of the time. The most common images were from series 16, 4, 6, and 14, which were hardly ever referenced when examining the prostate. Of course this is pretty obvious, since the prostate and the lymph nodes are separate organs, but it's still interesting.
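The 5% cutoff and the comparison between the two regions can be sketched the same way. This is only an illustration under assumptions: `frequent_series` is a hypothetical helper, each region's references are assumed to be a flat list of series numbers, and the counts below are invented for the example rather than taken from my results.

```python
from collections import Counter

def frequent_series(references, min_share=0.05):
    """Set of series whose share of all references is above min_share."""
    counts = Counter(references)
    total = sum(counts.values())
    return {s for s, n in counts.items() if n / total > min_share}

# Invented toy counts (NOT the real data), 25 references per region.
toy_prostate = [9] * 10 + [5] * 8 + [10] * 6 + [16] * 1
toy_lymph = [16] * 9 + [4] * 7 + [6] * 5 + [14] * 3 + [9] * 1

prostate_common = frequent_series(toy_prostate)
lymph_common = frequent_series(toy_lymph)

# Series that are commonly referenced for both regions.
overlap = prostate_common & lymph_common
print(sorted(prostate_common), sorted(lymph_common), sorted(overlap))
```

An empty overlap in a comparison like this would match what the pie charts suggest: the series that matter for one region are rarely the ones that matter for the other.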

To get more meaningful results, I need to mine more data from the radiologist reports, such as the clustering of certain phrases, the scan parameters, and details about the case/patient. I also want to do some statistical analysis on my data to get a clearer picture of what's going on. But for now, these findings can help radiologists tasked with reading prostate MRI scans narrow down which series to look at first.


