Week 2

This weekly update comes to you in 5 parts (feel free to count on your fingers if you get lost):

  1. The introduction (Yay! You’re almost done with part 1!)
  2. A very tentative outline for completing my deliverables this summer
  3. My goals for this week and progress towards them
  4. A day-by-day breakdown of the week
  5. Unknown Animal of the Week (a reward for if you make it to the end)

Depending on why you are reading this blog, it may appear too personal/technical/detailed. It is what it is. Don’t feel obligated to read anything that doesn’t interest you. It’s super duper long.


Summer Outline

Week 2 –

  • Research
    • Learn about Bayesian Networks
    • Get Familiar with Netica and Weka
  • Paper
    • Outline Related Work
    • Outline Methods

Week 3 –

  • Research
    • Continue learning about Bayesian Networks
    • Start working with machine learning tools
  • Paper
    • Outline Introduction
    • Draft Related Work
    • Draft Methods

Week 4 –

  • Research
    • Start practicing machine learning techniques
  • Paper
    • Outline Abstract
    • Draft Introduction
    • Peer Review Related Work
    • Peer Review Methods

Week 5 –

  • Research
    • Continue working with the methods
  •  Paper
    • Outline Findings
    • Outline Discussion
    • Draft Abstract
    • Peer Review Introduction
    • Re-Draft Related Works
    • Finalize Methods

Week 6 –

  • Research
    • Begin looking at the data
  • Paper
    • Draft Findings
    • Draft Discussion
    • Peer Review Abstract
    • Re-Draft Introduction
    • Finalize Related Works

Week 7 –

  • Research
    • Begin implementing Bayesian Network structures on the data
  • Paper
    • Peer Review Findings
    • Peer Review Discussion
    • Re-Draft Abstract
    • Finalize Introduction

Week 8 –

  • Research
    • Continue working on modeling the data
  • Paper
    • Draft Findings
    • Draft Discussion
    • Peer Review Abstract

Week 9 –

  • Research
    • Make modifications to the algorithms as needed
  • Paper
    • Finalize Findings
    • Finalize Discussion
    • Finalize Abstract

Week 10 –

  • Research
    • Revel in the Progress Made
  • Paper
    • Have Everything Camera Ready

Goals

  1. Clearly Articulate Research Goals
    • The research will have something to do with Bayesian Networks, but Dr. Natarajan wants to get a better feel for our comfort level with coding and machine learning concepts before making solid, achievable plans.
  2. Outline Methods
    • Machine learning is an iterative process. We don’t know exactly which algorithms we will use, though they will be in the realm of Bayesian models. As mentioned above, we aren’t even totally sure which project we are doing; it will be a while before we have concrete methods. However, we do know that we will probably be using Weka and/or Netica in our research. What we know so far is in our ShareLaTeX document.
  3. Outline Related Works Section
    • The articles and books that we have read are in ShareLaTeX. The number of readings in the document doesn’t represent how much reading we have done over the past few days, however. To learn the basics of Bayes Nets, Rob and I read many websites/PowerPoints/Lectures that aren’t in the Related Works section at all. Of the articles cited, two are over 50 pages. Also, two books appear in the related works; we’ve read over a hundred pages of both. Since the writing is highly technical and contains equations on every page, it took a long time to fully digest. We will add to the Related Work section as we continue reading and get a better idea of exactly what we need.

Memorial Day Weekend

This weekend, I had the opportunity to explore Bloomington a little bit more. It’s a really cool town; though it’s around 8 times the size of Batesville, it still retains a small-town feel. Also, it has a very strong hipster vibe.

For instance, on Saturday morning, I drove some of my REU friends down to the farmer’s market. It was huge! There was live music playing for the couple of hours that we were there. I got a pastry, coffee, and 2 heads of lettuce.

Image

That afternoon, some of us went to Kroger to stock up on lunch food because we wanted to start bringing lunch to work instead of eating out all of the time. That way we could save time/money/calories.

On Sunday, I attended Trinity Episcopal Church as part of my church-hopping plan this summer. The choir was very good and the church was packed, which is impressive given that it’s summer in a college town. I brought Disaiah along with me. It was his first time in an Episcopal Church, so that was fun too.

Image

Once I got back, I read the articles Nandini gave me. Some of it was over my head, but I was able to make sense of most of the unfamiliar terms/concepts thanks to my good friend Wikipedia.

After reading for several hours, I decided to take the evening off. Gabby, Anne, and I went out to dinner to try the famous Mother Bear’s Pizza. It lived up to the hype! After we ate, I watched Logan with some friends. It was a lot more depressing than I thought it would be.

Image

I spent most of Monday rereading the articles given to me and finding a few related works to peruse. In the evening, Rob showed me some of the basics of Weka, which is user-friendly Data-Mining software. I like how easy it is to play around with learning algorithms and statistically test models. However, I know that the research I’m doing this summer is going to require more than pushing a few buttons.

I hope you liked the pictures thus far. You won’t see them again until the Friday post.

Tuesday

The morning was a bit slow. Rob and I weren’t meeting with Nandini to discuss our weekend readings until 2. I skimmed over the assigned readings and worked on finding good related work. Also, I created a group on Mendeley so that Rob and I could easily share articles and sort through what we have read/want to read.

About an hour before the meeting, Nandini sent us some data to analyze in Weka. I felt really unprepared because neither Rob nor I could figure out how to run the learning algorithms on the data. When Nandini showed up, she showed us how to filter the data so that it was marked as categorical, fixing our error.
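For anyone curious what "marking data as categorical" amounts to, here is a minimal Python sketch of the idea. It is not what Weka does internally, and the column names and values are made up for illustration. In Weka itself this is typically handled by a filter such as NumericToNominal, though I can't say for sure that it is the one Nandini used.

```python
def to_categorical(rows, column):
    """Convert one column's values to string labels so a learner treats them as categories.

    Returns the converted rows and the sorted list of distinct labels.
    """
    # Copy each row so the caller's data is left untouched.
    converted = [{**row, column: str(row[column])} for row in rows]
    labels = sorted({row[column] for row in converted})
    return converted, labels


# Hypothetical toy data: "smoker" is stored as 0/1 but is really a category.
rows = [
    {"age": 34, "smoker": 0},
    {"age": 51, "smoker": 1},
    {"age": 47, "smoker": 0},
]
converted, labels = to_categorical(rows, "smoker")
print(labels)
```

The point is that 0 and 1 stop being numbers that can be averaged or ordered and become two distinct labels, which is what many classification algorithms expect for the class column.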

After that was sorted out, I had a chance to ask questions that I had about the article. Nandini said that most of my questions were too technical, but she kindly answered them anyways. I also asked her a lot of questions regarding the project in general, so that I could cover everything that Dr. Siek wanted me to in this blog. Since what Rob and I are working on this summer isn’t fully determined, a lot of the answers were pretty vague. Some of the questions needed to be run by Dr. Natarajan. I felt kind of bad because it just added more work to Nandini’s plate.

Nandini ducked into Dr. Natarajan’s office for a minute and then hurried back out to tell us that Dr. Natarajan wanted to meet with us. That was surprising because earlier Nandini told us that he was too busy this week to meet with us. Rob and I rushed into his office, pen and paper in hand.

Dr. Natarajan spoke very quickly; it was hard to keep up with taking notes. He told us that this project didn’t have the neat methods and deadlines that other projects do. Roughly, we would spend the next week understanding the methods of Bayes Nets, 2 weeks working with the methods, a week working with the data, and 3 weeks formulating the findings. Our task for the next few days was to read everything we could about Bayesian Networks, particularly with regard to parameter learning. The author he recommended starting with was Kevin Murphy. We also needed to download Netica and get familiar with it. He told us to meet with him on Thursday to discuss what we had learned.
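To give a rough sense of what "parameter learning" means, here is a minimal Python sketch under simplifying assumptions: the network structure is already known, the data has no missing values, and each probability is estimated by counting (maximum likelihood, with optional add-alpha smoothing). The function name, variable names, and toy data are hypothetical, and this is only an illustration, not the method we will end up using.

```python
from collections import Counter, defaultdict


def learn_cpt(records, child, parents, alpha=0.0):
    """Estimate P(child | parents) from complete data by counting.

    alpha > 0 adds that many pseudo-counts to every child value (smoothing).
    Only parent configurations that appear in the data get an entry.
    """
    child_values = sorted({r[child] for r in records})
    counts = defaultdict(Counter)
    for r in records:
        parent_config = tuple(r[p] for p in parents)
        counts[parent_config][r[child]] += 1

    cpt = {}
    for parent_config, counter in counts.items():
        total = sum(counter.values()) + alpha * len(child_values)
        cpt[parent_config] = {v: (counter[v] + alpha) / total for v in child_values}
    return cpt


# Hypothetical toy data for a two-node network: rain -> wet_grass.
records = [
    {"rain": "yes", "wet_grass": "yes"},
    {"rain": "yes", "wet_grass": "yes"},
    {"rain": "yes", "wet_grass": "no"},
    {"rain": "no", "wet_grass": "no"},
    {"rain": "no", "wet_grass": "no"},
    {"rain": "no", "wet_grass": "yes"},
    {"rain": "no", "wet_grass": "no"},
    {"rain": "no", "wet_grass": "no"},
]
cpt = learn_cpt(records, child="wet_grass", parents=["rain"])
smoothed = learn_cpt(records, child="wet_grass", parents=["rain"], alpha=1.0)
```

Here `cpt[("yes",)]["yes"]` is the fraction of rainy records with wet grass (2 out of 3), and the smoothed version pulls every estimate slightly toward uniform so that unseen combinations never get probability zero.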

I was slightly dazed when I left his office. There was so much to do and so little time.

Rob found the textbook Machine Learning: A Probabilistic Perspective, which was written by Kevin Murphy, in PDF format. I wish I could have printed it out, but it is 1000+ pages so I kept notes in an outline to keep myself focused. Over the course of the day, I read chapters 1, 2, part of 3, and most of chapter 5. It was interesting and I feel that I learned a lot from it, but it was very dense. The introduction said that it was for graduate students with a background in statistics, multi-variable calculus, and linear algebra. I’ve only taken through Calculus 2. The pages were packed full of equations and how they could be manipulated. Though I could figure out what was happening in each equation, it took a while. As the book progressed, the equations became composed nearly entirely of symbols that referred to other equations that just continued the cycle. I think that this book will be a handy reference if we need a certain equation for, say, measuring the accuracy of our model. Given the time we had to read it, however, it was impossible to memorize formulas.

Wednesday

I read more today than I have in a very long time. I spent 30 minutes reading over breakfast. At the office, I spent 7 hours reading continuously, with just a 15 minute lunch. Once I was home, I read for another 4 hours before falling asleep.

What did I read, exactly? In the morning, I finished up chapter 5 of the textbook I started yesterday. I also read 2 really long articles about Bayes Nets. Dr. Goadrich, a kind soul, emailed me a link to one of Kevin Murphy’s websites about Bayes Nets (which also contained further reading). Dr. Goadrich’s email also included a link to a PowerPoint that contained this gem: a mathematical formula in Comic Sans!

My main break was the Wednesday Workshop, led by Ben Jelen. He talked about keeping a journal and how to outline our Methods and Related Works sections on ShareLaTeX. I started the Related Works with the readings I had completed and Rob worked on the Methods for a little while.

I also took periodic breaks to learn Netica. It was a long process just to get it downloaded and running. When I first opened the GUI, I laughed out loud – it looked like MS Paint for statisticians! However, once I started using it, I kind of fell in love. The tutorial I followed to learn the basics of creating networks using a file of data can be found here.

Thursday

I feel like I’ve reached my saturation point for retaining information.

Rob found another textbook today that looks like it could be helpful: Bayesian Reasoning and Machine Learning. I started on it, and I like that it is actually written at an undergraduate level, building up to higher concepts. I’ve read nearly a hundred pages thus far, and I plan to try to read most of the rest over the weekend.

Sadly, I missed my opportunity to meet with Dr. Natarajan today. He was free for nearly an hour this morning but, by the time Rob and I got ready to go in, his office was filled with graduate students. At 11, he left for a meeting and I didn’t see him again after that.

We were upset that we didn’t get to talk with him, since we needed to get a firmer outline of our deliverables and some direction for continued reading. As soon as he comes in tomorrow, we will make up for today.

Friday

We got in to talk to Dr. Natarajan today. It was really quick because he is swamped with a proposal he is trying to get in. We basically asked what we needed to be reading and told him the two textbooks that we had started on. He told us that those were too much and instead gave us Artificial Intelligence: A Modern Approach, telling us to read chapters 14 and 15.

It didn’t take me very long to read and outline those chapters. Having a physical book was really nice and helped me concentrate compared to trying to highlight a PDF. Also, the material was a lot less complicated than the stuff we had been reading. It was nice. I also got a better sense of the basics that I hadn’t quite gotten from the other material.
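For the curious, the most basic of those basics fits in one line. A Bayesian network says that the joint probability of all the variables factors into one term per variable, each conditioned only on that variable's parents in the graph. The two-variable example underneath is a made-up illustration (A is a parent of B), not something taken from our project.

```latex
P(x_1, \dots, x_n) = \prod_{i=1}^{n} P\bigl(x_i \mid \mathrm{parents}(X_i)\bigr)

% Hypothetical two-node network A -> B:
P(a, b) = P(a)\, P(b \mid a)
```

This is why the structure matters so much: the fewer parents each variable has, the fewer numbers have to be learned from data.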

That’s pretty much all the reading I did today. Most of the day was spent writing. I pretty much wrapped up this blog post, refined the outline of Methods and Related Works in ShareLaTeX, and sent a reflection of my time thus far to Dr. Goadrich, my Odyssey advisor for this summer. (Quick shout out to Dr. Goadrich! He is awesome for agreeing to mentor me and reading these excessively long blog posts!)

All of the ProHealth students also met to discuss our progress over the past week. My favorite part of the meeting was when the faculty mentors talked about their experience with Qualifying Exams and gave some other great graduate school advice. They highlighted the power of networking, especially at conferences.

After work, I drove some people to Food Truck Friday. Afterwards we walked to Baked, which is a personalized cookie venue. You can tell them exactly what you want in your cookie and they bake them fresh. Mina (shown below) gave me two of her cookies. I can’t articulate how good they were.

Image

After eating, I dropped everyone but Mina off and a few other REU students joined us to go to the Gallery Walk. We only got to see two galleries, but the artwork was phenomenal. I just got my first REU check today and I had trouble not spending it on one piece I really liked.

Image

What really made the night was that, as we were walking out of the second gallery, Mina spotted Kamau Bell! He is a really famous comedian who was in town for the Limestone Comedy Festival this weekend. Mina was the most excited that I’ve seen anyone in a long time. She was too nervous to ask him for a photo, so I just went up to him and pointed her out. The resulting photo is below. Look how happy she is!

Image

So that was probably the highlight of my week. 🙂


Unknown Animal of the Week: Amazonian Royal Flycatcher

These little guys like to spread out. Though they are typically just over 6 inches long, their nests can be up to 6 feet long! These nests are placed on branches over the water to lessen the risk from predators.