Week 7

The fact that it’s already week 7 makes me feel bad for student researchers at 8-week programs. Even though I haven’t made the sort of progress I wanted to this week, I think that Rob and I will be fine presenting in week 10.


Goals

  1.  Present on the current research findings.
    •  Both of my presentations this week went well. Practice really pays off for that sort of thing.
  2.  Analyze how well the current models predict CAC.
    •  This had several setbacks this week, but I now know how to calculate the accuracy, F-score, and FNR correctly. Since the predictive ability of the current model isn’t great, we are looking at how to improve it.

Weekend

Rob and I spent a good chunk of the weekend working on our presentation for Monday. We rehearsed for nearly 3 hours on both Saturday and Sunday to get the talk running smoothly.

However, I didn’t spend all weekend thinking about work. On Saturday night, I drove some of the REU girls to watch the big fireworks show in Bloomington. We ended up watching it in the Walmart parking lot, which is ultra-American. Afterward, we took some basic photos in front of IU’s light wall (picture picture picture). I also couldn’t resist the Hendrix urge to cool off in the fountain (picture).

On Sunday, I visited the First Presbyterian Church of Bloomington alongside Thomas and Disaiah. The organ and stained glass were phenomenal (picture).

Monday

I thought that the presentation today went very well. You can view the PowerPoint here. Dr. Kersting advised us to compare the reliability of the different networks we created and put those scores in our paper.

Just listening to Dr. Kersting and Dr. Natarajan talk was very interesting. I didn’t understand a lot of the more technical topics, but the advice they gave for pursuing a research career was valuable. The conversation on authorship order was eye-opening. It made me realize that I need to become a very strong scientific writer to make it far in research. I also got an idea of the pressure put on graduate students to publish frequently. The only research I’d been exposed to previously was in a wet laboratory. Research with living organisms can take years to complete and publish.  In the machine learning community, however, graduate students are expected to publish two papers per year.

After the meeting ended, I worked for a few hours on finishing up the things Dr. Natarajan asked me to do on Friday. First I made a network with all of the years’ feature data combined. This involved me adding another few lines onto the data reading/creation code I’ve been using in Python. I also made the intersection and union network of the BDe scoring metric for all years together.

Tuesday

It’s Independence Day! I spent the morning watching the Bloomington parade with my suite-mates (picture). There were a lot of really interesting floats, and the whole thing had a nice small-town vibe to it.

Since Dr. Natarajan wanted the network scores ASAP, I worked from home most of the day. Nandini had mentioned finding the AUC-ROC for the networks, so I spent a really long time trying to figure out how to do that. The only two R packages that seemed useful were pROC and ROCR. However, both packages took continuous data for predictions and binary data for labels. I had thought that the predictions should be binary as well, so I was pretty confused. Finally, I put in the CAC CPT values for each person based on their observed features. This returned a ROC graph that looked reasonable to me. I can ask whether I did it correctly tomorrow.
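For anyone else puzzled by the continuous-versus-binary question: a ROC curve is built by sweeping a threshold across the predicted scores, so it needs a score per person and not a yes/no prediction. Here is a minimal sketch in plain Python of what the AUC summarizes, namely the chance that a randomly chosen positive case is scored higher than a randomly chosen negative one. The function name and the toy numbers are made up for illustration; this is not the pROC or ROCR implementation.

```python
def auc_from_scores(labels, scores):
    """Estimate AUC-ROC by comparing every positive/negative pair.

    labels: 0/1 ground truth (e.g. 1 = CAC present)
    scores: continuous predictions (e.g. a predicted probability of CAC)
    Both names are hypothetical; ties count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy example with invented scores: 3 of the 4 pairs are ranked correctly.
toy_labels = [1, 1, 0, 0]
toy_scores = [0.9, 0.4, 0.5, 0.1]
toy_auc = auc_from_scores(toy_labels, toy_scores)
```

If the predictions were already binary, every score would be 0 or 1 and the curve would collapse to a single threshold, which is presumably why both packages ask for continuous values.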

Wednesday

Today, two of Dr. Natarajan’s students had their thesis defenses. Everyone not on the graduate’s committee was booted from the room during the defense, but we all at least got to see their thesis presentations. I’d seen Philip’s talk before, but he had made some improvements to it. Shuo’s presentation was very interesting to listen to because my current project is based on her previous research. I think that being better able to map temporal changes is important for applying probabilistic models to the medical domain.

The theses took 4 hours, so I had to miss the Wednesday Workshop this week. After the theses, I got Nandini to come talk to me about the AUC-ROC graphs I made. When I asked her what values I should have put in for the probabilities, she seemed confused about why I couldn’t put in the values given by bnlearn’s predict function. She ended up telling me to just throw away the AUC-ROC stuff.

Since the CARDIA data is very skewed (in year 20 only ~11% have CAC), Nandini suggested I try undersampling the data in different ratios. She showed me how to calculate the F-score, False Negative Rate, and Accuracy for the resulting network. I am supposed to run these calculations for each of the different undersampling ratios, both using the training set as a test set and splitting the data into a 4:1 ratio of testing data to training data.
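As a rough illustration of undersampling at a chosen ratio, here is a minimal Python sketch. It assumes the usual approach of keeping every positive (CAC) case and randomly dropping negatives until the requested ratio is reached; the function name, the `neg_per_pos` parameter, and the toy data are all hypothetical and are not taken from my actual scripts.

```python
import random


def undersample(records, labels, neg_per_pos, seed=0):
    """Keep all positive cases and a random subset of the negatives.

    neg_per_pos: desired number of negatives per positive
                 (1.0 gives a balanced set). Hypothetical parameter.
    """
    rng = random.Random(seed)  # fixed seed so a run can be repeated
    pos_idx = [i for i, y in enumerate(labels) if y == 1]
    neg_idx = [i for i, y in enumerate(labels) if y == 0]

    # Never ask for more negatives than the data contains.
    n_neg = min(len(neg_idx), round(neg_per_pos * len(pos_idx)))
    keep = pos_idx + rng.sample(neg_idx, n_neg)
    rng.shuffle(keep)

    return [records[i] for i in keep], [labels[i] for i in keep]


# Toy data shaped like the skew described above: 11 positives out of 100.
toy_labels = [1] * 11 + [0] * 89
toy_records = list(range(100))
sub_records, sub_labels = undersample(toy_records, toy_labels, neg_per_pos=2)
```

Only the majority class is thinned, so no positive cases are lost; the cost is that the model sees fewer negative examples.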

Thursday

Today was Dr. Natarajan’s last day in his ProHealth office. He seemed really happy; his students passed their defenses, Mina made him some cool glasses, and he had several academic friends visiting from out of town. Rob and I got to show the visiting professors our research so far, and they seemed to think that it was pretty neat.

Rob worked on the Reviewer-Response table, which can be viewed here. He also worked some on the final video outline and the poster.

I ran my code through all of the undersampling data this morning. It is kind of tedious because I have some code in Python and some in R; this means that I keep having to transfer data back and forth through text files. I also need to either work on a better file-naming system or start putting more files in folders, because it can get kind of confusing.

The FNRs for all of the data were really low. I thought this was odd since I didn’t believe the networks were predicting that well. It wasn’t until later in the evening that I realized I’d misunderstood how to calculate the FNR. Calculating the FNRs correctly returned much worse results, but it’s a lot better than going forward with the wrong results. You can view the revised Python code for calculating the various scores here.
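For reference, here is a minimal sketch of the three scores using their standard definitions. It is not the revised code linked above; the function name and the toy labels are invented for illustration, and I am assuming "F-score" means the usual F1 score. The part that is easy to get wrong is the FNR denominator: it is the number of people who actually have CAC (false negatives plus true positives), not the total number of predictions.

```python
def classification_scores(actual, predicted):
    """Return accuracy, F1 and false negative rate for 0/1 labels.

    actual / predicted: equal-length lists of 0 or 1 (1 = CAC present).
    Hypothetical helper written for this post.
    """
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == 1 and p == 1)
    tn = sum(1 for a, p in pairs if a == 0 and p == 0)
    fp = sum(1 for a, p in pairs if a == 0 and p == 1)
    fn = sum(1 for a, p in pairs if a == 1 and p == 0)

    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    # FNR: share of true positives cases that the model missed.
    fnr = fn / (fn + tp) if (fn + tp) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "f1": f1, "fnr": fnr}


# Toy example: 4 people with CAC, only 1 of them caught by the model.
toy_actual = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
toy_predicted = [1, 0, 0, 0, 0, 0, 0, 0, 0, 1]
toy_scores = classification_scores(toy_actual, toy_predicted)
```

On skewed data these numbers can disagree sharply. A model that predicts "no CAC" for everyone would be about 89% accurate when only about 11% of people have CAC, yet its FNR would be 1.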

I was really frustrated about calculating the FNRs incorrectly, but Max took my mind off of it with a baking party (picture). I baked chocolate mint brownies for Dr. Natarajan’s party on Saturday, and Max made cupcakes for Tom’s birthday tomorrow. I’m really excited for it; Max, Gabby, and I went kind of overboard with buying decorations.

Friday

My primary focus today was Tom’s surprise birthday party. It took a long time to get Tom to leave his office long enough to decorate it, but when the opportunity arose we took full advantage. Within minutes, his desk was covered in balloons and streamers (picture). We even got him a piñata (picture). He was very surprised! It was a lot of fun watching his reaction and seeing him so happy (picture).

The party continued in the ProHealth Tea (picture). Everyone loved Max’s cupcakes; they really made the birthday party complete (picture). We all got to eat party food while watching students give ~1-minute lightning talks about their research. It was good presentation practice, and the talks were recorded in case we ever need to send them to a conference to demonstrate public speaking skills.

I may not have gotten a lot done today, but I feel like Tom’s party was the best part of my week.


Unknown Animal of the Week: Bald Uakari

A bright red face is the epitome of a healthy mate for these distinct Amazonian monkeys.