Posts

Showing posts from 2017

Day 31

Presentation day! I'll admit that I was pretty nervous before going up to present. There wasn't much reason to feel so pressured, but the other interns did so well, so it felt like there was a standard to be met. As per tradition, my presentation was under time, even though it was over time when I was rehearsing the night before. Not a big deal. My biggest concern going into today was that nobody would understand what I was talking about. After presenting, I still got the feeling that nobody understood what I was talking about. I guess I'll never know for sure. Dmitry did ask a good question though (to which I provided a mediocre response), so I suppose that proves something. I think all of the interns did really well today, and this was a great way to end the internship. Postmortem time! Coming into this internship, I had nagging doubts about the program, most of which, to be honest, revolved around it being unpaid. These doubts were first addressed on the day of my ...

Day 30

Last day before presentations. The SMCAE tests weren't going anywhere, so I fell back to the old data. I added it to my presentation, applied some finishing touches, and gave it to Joe. After lunch, Titus rehearsed his presentation in the auditorium, and Aditi, Anjana, and I gave him a lot of feedback. He managed to cut the time down significantly, but the organization of his presentation made it a little bit confusing, even to Aditi. I didn't feel like I had to rehearse mine, since I've been pretty good on time. At the lab, there wasn't much to do, since Ron was out for most of the day and all of the presentation work was done. Nate did receive instructions for the GRSS challenge from Ron, so I helped him with that for the rest of the day.

Day 29

This morning, we ran through all of the presentations in the auditorium. Joe was really strict about staying within the time limit because he wanted to finish the presentations by noon, both today and on Thursday. Today we finished almost exactly at noon, but in all likelihood, that's not happening on Thursday. I found myself stuttering and going into less depth than I did with the lab rehearsals. With time to spare, I think I could do well by speaking slower and focusing on the important content. The only feedback I got was a question from Joe: "What's the difference between your work and Nate's?" Nate and I gave a lackluster answer, but I genuinely think that compacting our presentations into 15 minutes would be a bit impractical. Possible, maybe, but I wouldn't want to.

Day 28

Week 6 of the internship; the final presentation is on Thursday. It doesn't feel like week 6, but that's probably because I missed almost all of week 4. Regardless, there was much work to do before Thursday, so Nate and I rehearsed our presentations twice in front of the lab. Many revisions were made, and I'm feeling much more prepared for the real deal. Thank you to everybody who was willing to listen to four lame high school presentations about the same topic in the same day. Your contribution will make the very same presentations significantly less lame on Thursday. Apart from presentation-related stuff, Ron and I ran some ladder network and SMCAE experiments all day. Ron tackled the ladder network with PELU activation (which was apparently very difficult) and I continued with adjusting initial layer sizes on the ladder network and doing SMCAE runs with fixed component count PCA as a feature scaler. The results aren't looking too hot, but I'll withhold judgem...

Day 27

Nate and I ran our presentations today in front of the lab and got some feedback. I was worried about going over 10 minutes, so I rushed the presentation and finished well under 10 minutes. It was pretty sloppy too, but now I have a better idea of what needs to change. It should go much smoother on Monday. The ladder network tests ran overnight and hadn't finished by the time I came in today. In fact, they still hadn't finished when I left. I think they should be done Saturday morning, at which point I'll start some new tests.

Day 26

Today we made some promising progress on both experiments. Ron noticed that the ladder network performed exceptionally well on features generated by 4-5 layer SMCAEs and suggested that the ladder network may work best when the first layer reduces dimensionality instead of adding dimensionality. This theory was completely contrary to the results of the original paper and the "gold standard" MNIST ladder network, so I was a bit skeptical, but there was no reason not to try the same approach with 1-3 layer SMCAEs. So far, these kinds of models work marginally better than an SVM. I will withhold judgement until we do more trials. On the SCAE/SMCAE side of the experiments, Ron suggested that since the stacks with more layers produce more features, the SVM used to classify may be prone to overfitting (explaining our poor results from last time). Hence, we should perform PCA with a fixed number of output components before classifying, so that the number of features remains constant. I...
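To illustrate the fixed-component PCA idea from this post, here is a minimal NumPy sketch. It is not the lab's pipeline: the function name `pca_fixed_components`, the random stand-in feature matrices, and the choice of 20 components are all assumptions made for illustration. The point is only that stacks producing different numbers of features end up with the same feature count before classification.

```python
import numpy as np

def pca_fixed_components(features, n_components):
    """Project features onto their top n_components principal directions.

    Hypothetical helper for illustration; fits and transforms the same array.
    """
    centered = features - features.mean(axis=0)
    # Rows of vt are the principal directions, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T

rng = np.random.default_rng(0)
shallow = rng.normal(size=(200, 40))   # stand-in for features from a small stack
deep = rng.normal(size=(200, 160))     # stand-in for features from a deeper stack

reduced_shallow = pca_fixed_components(shallow, 20)
reduced_deep = pca_fixed_components(deep, 20)
print(reduced_shallow.shape, reduced_deep.shape)  # both have 20 columns
```

In a real experiment the projection would be fit on the training samples only and then applied to the held-out samples, so that no test information leaks into the features.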

Day 25

Today we switched directions a bit and tried the ladder network on SMCAE-generated features. It's performing a little better than the SVM, which is looking promising for my presentation and Ron's paper. We also looked into the SCAE data from a while ago, and it honestly looks pretty bad. The multiloss CAEs aren't performing very well, so we might have to reframe that experiment too. Luckily the CAE vs. MCAE data is still pretty good, so it's not a complete loss. I also sat in on a DIRS meeting, where Prof. Rottman (from Ben-Gurion University of the Negev) gave a funny and informative presentation on target detection in hyperspectral images. It's an interesting problem, and he explained the mathematics behind it quite well.

Day 24

Similar to yesterday, I went about poking the ladder network in hopes of exploiting every percentage of accuracy that it can offer. I really do think that this model has a lot of potential, seeing how well it does with MNIST with only 100 labeled samples. It's just frustratingly difficult to find the magic algorithms and numbers that make it suited for Pavia University. Even if I had a very deep understanding of how it works, nobody really knows how to pick the perfect learning rate, layer sizes, activation functions, etc. Research is hard.

Day 23

More noodling around with the ladder network today. Ron was helping out, and he put in PELU activation and weight decay to see if they yielded better performance. It hasn't been going very well; I think our highest accuracy is around 82%, which is still significantly lower than a standard SVM-RBF. On top of that, the model is still severely overfitting. While running tests, I worked on my presentation, which is coming along pretty well. I also didn't notice how close our last day was until today. I hope I'm able to get some good results before presentation day rolls around.

Day 22

Today was my first day back in the lab, just in time for the undergraduate research symposium. Nate and I headed down for a bit to look around at the posters. I didn't know RIT had so many REU students, and there was a huge variety of topics. Everyone from the lab went down to stop by Angelina's poster, and Nate had a friend presenting there as well. There was definitely some cool stuff being presented. I mainly kept working on the ladder network today, trying to find out how to improve it. Once again, Andrej Karpathy's wonderful course notes for his CS231n class helped out a ton. I mainly focused on testing different learning rate functions. These functions determine how much a neural network adjusts its weights after every iteration of training. The goal is to minimize loss more efficiently. These are a few examples of learning rate functions. I originally had the "Rectified Linear" function, but I noticed that loss and accuracy would hardly change after...
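For readers unfamiliar with learning rate functions, the sketch below shows three common decay schedules of the kind covered in the CS231n notes (step decay, exponential decay, and 1/t decay). These are not necessarily the functions tested in the lab, and the function names and constants (`drop`, `every`, `k`) are assumptions chosen for illustration.

```python
import math

def step_decay(lr0, epoch, drop=0.5, every=10):
    """Multiply the learning rate by `drop` once every `every` epochs."""
    return lr0 * (drop ** (epoch // every))

def exponential_decay(lr0, epoch, k=0.05):
    """Shrink the learning rate smoothly: lr0 * exp(-k * epoch)."""
    return lr0 * math.exp(-k * epoch)

def inverse_time_decay(lr0, epoch, k=0.05):
    """Shrink the learning rate as lr0 / (1 + k * epoch)."""
    return lr0 / (1.0 + k * epoch)

# Print each schedule at a few epochs to compare how quickly it decays.
for epoch in (0, 10, 20, 50):
    print(
        epoch,
        round(step_decay(0.1, epoch), 5),
        round(exponential_decay(0.1, epoch), 5),
        round(inverse_time_decay(0.1, epoch), 5),
    )
```

All three start at the same initial rate and differ only in how fast they shrink it, which is one of the knobs that affects whether loss keeps improving late in training.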

Days 18-21

I was out of town for these few days. I hope I didn't miss too much.

Outline

- Motivation/Background: HSI classification
  - What is HSI?
    - Comparison to RGB
    - The advantages of HSI over other kinds of data
      - Instant, remote, nondestructive data collection
      - Hundreds of data points per pixel
  - What is classification?
    - Introduce using toy data set, like CUB-200
  - Applications of HSI classification
    - Cancer detection
    - Art authentication
    - Food quality analysis
    - Remote sensing
      - Urban development
      - Agricultural monitoring
      - Environmental assessment
  - Problems faced
    - High dimensionality
    - Scarcity of labeled data
- Method: MCAE and Ladder Network
  - Feature Extractors
    - Baseline: PCA
    - Inspiration: Autoencoder
    - Our proposed method: MCAE
  - Classifiers
    - Baseline: SVM
    - Inspiration: Neural network
    - Our proposed method: Ladder network
- Results: State-of-the-art Comparison
  - Comparison of MCAE against PCA and state-of-the-art
  - Comparison of Ladder Network against SVM and state-of-the-art

Day 17

Working with the ladder network all day has brought some things to light. Firstly, and most importantly, the model is overfitting like there is no tomorrow. This means that the model memorizes the training data instead of generalizing. Contrary to what high school has taught me, this is undesired. The original model predicts up to 15% more accurately on training data than validation data which is very bad. I've been trying to change parameters to reduce this margin, but I haven't had much success. The main problem is that there are way too many knobs to turn. I could add layers, remove layers, change the sizes of layers, adjust the amount of noise, change weights of costs, change the initial learning rate, change the learning rate function, or change the optimizer. Decisions, decisions.

Day 16

The first thing I did when I got to the lab was test my changes. I believe the problem I was trying to solve was an exploding gradient, and my solution was gradient clipping. Don't quote me on that though. To my relief, it worked! Turns out there was a very simple solution to what seemed like a disastrous problem. I spent the rest of my time in the lab working on various parts of the ladder network. At 2:00, Mingming presented his driving simulator at Slaughter Hall. Watching Gerry drive like a madman and trying it out for myself was pretty fun. It's a cool project, and I think Mingming got some really good feedback/suggestions, which is nice to see. At the end of the day, the interns, REU students, and some staff drove down to the Mees Observatory. I'd never been to an observatory, so this was pretty cool. Unfortunately, we didn't get to see anything because of the weather. It was a good time nonetheless.
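Assuming the fix really was gradient clipping, the idea can be sketched without any framework. This is not the ladder network code, which was written in TensorFlow; `clip_by_global_norm` here is a hypothetical pure-Python helper, and gradients are represented as plain lists of floats for illustration.

```python
import math

def clip_by_global_norm(grads, max_norm):
    """Rescale gradient vectors so their combined L2 norm is at most max_norm.

    Returns the (possibly rescaled) gradients and the norm before clipping.
    """
    total_norm = math.sqrt(sum(g * g for vec in grads for g in vec))
    if total_norm <= max_norm:
        # Small gradients pass through unchanged.
        return [list(vec) for vec in grads], total_norm
    scale = max_norm / total_norm
    return [[g * scale for g in vec] for vec in grads], total_norm

# One huge gradient that would otherwise blow up the weight update.
clipped, norm_before = clip_by_global_norm([[300.0, 400.0]], max_norm=5.0)
print(norm_before, clipped)
```

Because every component is scaled by the same factor, the update keeps its direction and only its size is capped, which is why clipping can stop a loss from diverging to NaN without otherwise changing training.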

Day 15

Today I made a lot of slow and steady progress. With the paper and the CAE experiment out of the way (for now), I focused on the ladder network code. First, I adapted the code to be object-oriented, so that it would (mostly) work within our pipeline. Every few iterations, I tested the code with Pavia U and MNIST, just to make sure I didn't stray away from the 80% and 98% I achieved earlier. I would guess that failing to do so is what gives rise to hard-to-trace bugs, and nobody likes those. There was one issue where the model would crash and burn around epoch 50, and predict nothing at all. This might be linked to the NaN loss that was occurring in the previous ladder network code, and I think I've found the solution. Whether I have succeeded or failed will be in tomorrow's blog post. I haven't had enough time to run >50 epochs with the fix yet, and everybody likes a good cliffhanger. Tune in next week to find out who shot Nate and the real reason he left to run in t...

Day 14

"Happiness is when the code works." - poster from the lab I have finally made some progress on the ladder network. With the code from last time, I wrote a method to pass the Pavia University data into the ladder network. There were a surprising number of difficulties, but I got it to work. There were a few problems with how I formatted the data that I didn't notice at first, so I was heartbroken when the network made abysmal predictions. Turns out these algorithms don't learn too well when fed garbage, so I fixed the data and kept trying. Here is the end result. The ladder network was trained on 450 labeled HSI samples and 42,776 unlabeled samples trained over 30 epochs. I'll admit the results aren't great (an SVM achieved 84% on similar data). However, only necessary changes were made to the network, so this is basically the same model that was predicting digits yesterday. There's still room for optimization with this model, so I expect the accurac...

Day 13

Today, Nate came back from the mountains to work in the lab. Welcome back Nate! I kept on doing the same work from last time. With the CAE experiment v.2 running in the background, I proofread and edited my section of the paper, and I worked with the ladder network for the rest of the day. First, I got rinuboney's implementation of the ladder network in TensorFlow (github) working in Python 3. Testing it with the MNIST dataset, I got the expected results (very good). Since we all love graphs, I quickly made one for said results. The idea is pretty simple. The ladder network is trained on 60,000 images of handwritten digits, 100 of which are labeled. The network makes predictions about the labels of 10,000 other images, and we calculate how accurate those predictions are. This process comprises 1 epoch, and is repeated 149 more times. By the end, the network achieves almost 99% accuracy. This is what we're hoping to do with HSI classification, but there seems to be an...

Day 12

Today I started rereading the ladder network papers and trying to find out why the code doesn't work well. It was a fruitless endeavor; there's just too much that could go wrong. Perhaps the ladder network just does poorly on HSI data, and there is no flaw. Maybe it's as simple as changing some hyperparameters. Who knows? Most, if not all, of the interns came outside for lunch, which was a nice surprise. After lunch I was tired of looking at the ladder network code, so I went back to writing the paper. Taking the feedback Ron had last time, I improved on the SSL section a lot. It's not perfect, but we're getting there. I also had some time to visit the Perform Lab (Titus and Aditi's lab), where I got to see their SMI eye tracker (which was unfortunately malfunctioning), motion trackers, and HTC Vive, which they let me try. Very cool stuff; thanks to everyone at the Perform Lab.

Day 11

Today was pretty eventful. At the morning meeting, we finished up judging each other's abstracts (sans Emma's). Arriving at the lab, Ron reviewed some work with me. I now have three tasks on my plate:

1. Redo the (S)(M)CAE experiments, now with a slightly different setup.
2. Draft the semi-supervised learning portion of Ron's paper.
3. Debug the ladder network code.

Of the three, the last is by far the most difficult. There are hundreds of lines of code using a library I'm not familiar with, and it's emulating a model I don't fully understand. We're also dealing with big data, intertwined systems, and random distributions, which make it hard to debug with small, isolated tests. I'm planning on re-reading the papers on ladder networks until it completely makes sense. Today I also attended the MVRL meeting about the Pupil Labs eye tracker. It seemed useful for high quality video capture, but I'm not really familiar with the research. It's also cool...

Day 10

At the morning meeting we critiqued a few more abstracts, including mine. None of the other interns gave me any feedback, and Joe clarified a few points that were not fully explained in the abstract. I guess I'll just leave it as it is. I talked to Ron about finding papers that relate to the research, and apparently I had a misconception last time. Building upon the silly extended metaphor from last time, I present a continuation: like trying to find a needle in a haystack, and finding a handful of paper clips, then realizing that there is another haystack, and paper clips were what you were supposed to look for. I did this for most of the day today as well. There was a tech talk today about quantum physics and quantum computing. It seemed pretty out-of-place for CIS, but informative nonetheless. I also ran into a Victor graduate at the tech talk, which was kind of surreal. For the last part of the day, I wrote a bit about semi-supervised learning in Ron's research paper, and ...

Day 9

This morning we looked at some of the interns' abstracts. This showed me that I should probably edit my own abstract. It's really quite difficult to maintain brevity (recommended <100 words) while trying to include everything I think I should. This morning's peer review helped me sort it out a bit more though. Today was Ron's first day back, and he had a lot of catching up to do with Angelina, Marc, and me. He showed me how to use RIT Libraries and Google Scholar to find papers about semi-supervised models to compare our own performance to. This task basically took up all of my time today, and I'm likely going to do something similar tomorrow. The process is really tedious, but also clearly necessary for the paper I am helping Ron with. I'd compare it to trying to find a needle in a haystack, but finding a handful of paperclips instead. Not quite what you're looking for, but it'll have to do until you find a better one. As promised from last time, I ...

Day 8

Today Joe came back for morning meetings and Nate left to run in the mountains. Ron's supposed to be back tomorrow. Over the weekend, I collected data for the CAE experiments, so I don't have any more work with that until Ron gets back. So, I started looking at the ladder network architecture Ron implemented with TensorFlow. It's probably the most intricate model I've looked at thus far, but like anything else, it slowly made sense as I broke it down. The papers I read earlier helped a lot by providing a general deconstruction of the model. I used these papers, example code, and a healthy dose of common sense to document the code. I had a lot of downtime, so I did some reading and personal experimentation. In particular, I wanted to get familiar with what HSI data looks and feels like. Here's the result of a small script I wrote: This is a visualization of the 103 bands included in the Pavia University HSI dataset. On the right, brighter colors show higher refl...

Abstract

The greatest challenge posed by hyperspectral image (HSI) classification is extracting features which are useful for classification. Convolutional autoencoders (CAEs) are designed to generate relevant and generalized features from this kind of data, and advancements such as stacked CAEs (SCAEs) and multiloss CAEs (MCAEs) attempt to improve upon the model. In this experiment, we train these models on randomly selected labeled samples of the Pavia University HSI dataset. A support vector machine (SVM) uses the features extracted by these models to make predictions about unlabeled data. The accuracy of these predictions is used to evaluate the quality of each feature extractor.

Day 7

This morning, Ron sent me some instructions on building a neural network in Keras, saving me from having nothing to do for the entire day. Nice! I kept the experiment running in the background while I worked on the neural network. Fortunately (or unfortunately) Keras made it very easy to create a sophisticated and successful model. I finished testing everything I needed to by midday. Nice? After lunch, there was still stuff to do. As per usual, I did some reading. I highly recommend the notes from Andrej Karpathy's class on using CNNs for visual recognition (link). In my Day 2 blog post, I mentioned trying to help Nate format his data to be compatible with the SVM. We are seven days into the internship, and nothing has changed. I figured out how to format my data for the image classification pipeline (one of our internship projects) two days ago, and this format happens to be different. Explaining this format and my implementation to Nate was pretty difficult, and I had a naggi...

Day 6

My experiment has officially started! I first cleaned up the code and prepared it for the 24 test cases, with 30 trials each. Then, I ran it. And waited. ... I predict that this will be a common trend for the next few days. During my downtime, I wrote my abstract and read some papers about machine learning models. The first test finished just after I came back from lunch. After collecting the results and calculating the relevant statistics, I started the second test. It was some time after this that I became aware of the fact that I was using 100% of the memory of all four GPUs on the server. After promptly stopping the experiment, I changed a setting to limit the damage to only one GPU and started it back up. Luckily, the program's performance didn't change significantly, since some packages like to allocate memory they don't necessarily need (hence why it was happily devouring ~48000 MiB). I've been running the experiment remotely since I came home, and I plan t...
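The post does not say which statistics were calculated for each test case, so the following is only a plausible sketch of summarizing repeated trials: mean accuracy, sample standard deviation, and standard error of the mean. The helper name `summarize_trials` and the example accuracies are made up for illustration.

```python
import statistics

def summarize_trials(accuracies):
    """Summarize repeated trials of one test case (hypothetical helper)."""
    n = len(accuracies)
    mean = statistics.mean(accuracies)
    stdev = statistics.stdev(accuracies)  # sample standard deviation
    sem = stdev / n ** 0.5                # standard error of the mean
    return {"n": n, "mean": mean, "stdev": stdev, "sem": sem}

# Made-up accuracies standing in for the trials of a single test case.
example = [0.81, 0.84, 0.79, 0.83, 0.82]
print(summarize_trials(example))
```

Running many trials per test case matters because the labeled samples are chosen at random, so a single run can look better or worse than the model really is.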

Day 5

Today I made really good progress, and I didn't really notice until I was about to leave. I ran into a lot of issues along the way: memory allocation issues (?!?), struggling to import a module, incorrect data dimensions, the list goes on. Huzzah. It took the entire day to iron out the bugs, and when the entire pipeline worked for the first time, it felt like I finished a marathon. A surprisingly quick runtime and high accuracy were my reward. There's still one bug where SCAEs work properly and CAEs don't, which is puzzling, since the SCAE code is completely reliant on the CAE code. That'll be a problem for tomorrow though. At noon, there was a talk about biomedical imaging from Professor Linte. It was a high-level overview of a variety of topics from cardiac models to VR. It was a nice way to break up the day, and it helped me get a better idea of the other things that go on at CIS. Tomorrow, I'm going to be working out that one bug and hopefully starting my first exp...

Day 4

Today was Joe's first day out on vacation and Matt's first day subbing in for the morning meetings. With no audio to accompany the cheesy video, we had a brief talk about imaging science. My work today began with more reading. This paper was really great at explaining ladder networks, and once I was fairly comfortable with the subject matter, I started working on my first experiment: testing convolutional autoencoders (CAEs) and multiloss CAEs (MCAEs) on the Pavia U dataset. So far, I've documented the CAE and stacked CAE (SCAE) files and set up the data formatting and a small part of the pipeline.

Day 3

After today's daily briefing about writing abstracts, Nate and I got to work at the lab. I picked up where I left off with autoencoders and finished the convolutional and denoising ones pretty quickly. I'll probably come back to the more complex autoencoders some other time. Left with a nagging feeling that my SVM code from last time was subpar, I revisited it. I ended up rewriting almost all of it. The results included elegant data manipulation, good runtime, optimized preprocessing, tuned hyperparameters, complete metrics, and (this is the important one) okay accuracy. Not great, but not bad either. I think that under the given constraints, I nearly achieved the best that the SVM could perform, so I'm ready to close the book on this tutorial experiment. That's not actually true; Nate's SVM is having some performance problems that we have yet to diagnose... I also continued reading about the topics relevant to my experiments. I have to admit: a lot of it still wen...

Day 2

Day 2 started with a very brief meeting where we went over the schedule for the next six weeks. Abstract, outline, presentation; two weeks per task. It was at this point that I realized the difficulty of making technical topics sound interesting, then I immediately dismissed the problem as one for my future self. Today was dedicated to Support Vector Machine (SVM) creation, evaluation, and optimization. I was unaware of many of the wonderful functions included in the provided libraries, and ended up doing the data reformatting less elegantly. It still worked, so I left it. When I finished up my SVM code, I tried to help Nate with the errors he was running into. The keyword here is "tried." Note that it's not easy to explain concepts that you figured out hours earlier and happen to be more complicated than they should be. Nate was a good sport about it though, and with Ron's help, he had a good idea of the solution by the end of the day. For the last part of the day, ...

Day 1

The day started with a brief gander at the intern handbook and a tour around the Carlson building, then we interns went down to the Red Barn for some team building. We got to learn each other's names before inspecting many ropes, building PVC pipe structures of questionable integrity, unsafely flipping magic carpets, and more. We're a pretty quiet group, which made the team building a bit difficult, but we got to know each other nonetheless. Ending with some final thoughts, we wandered our way back to the Carlson building for lunch, which was provided for us (thanks Mr. Pow!). After lunch we got to meet with our respective research groups. When we got to the lab, Ron and Dr. Kanan talked to Nate (my fellow intern) and me about college, research, and the internship in general. Ron went through all of the topics we'd be covering through tutorials and readings throughout the first week, just to give us an idea about our focus. I'm really glad he took the time to put tog...