Outline

- Motivation/Background: HSI classification
  - What is HSI?
    - Comparison to RGB
    - The advantages of HSI over other kinds of data
      - Instant, remote, nondestructive data collection
      - Hundreds of data points per pixel
  - What is classification?
    - Introduce using a toy dataset, like CUB-200
  - Applications of HSI classification
    - Cancer detection
    - Art authentication
    - Food quality analysis
    - Remote sensing (urban development, agricultural monitoring, environmental assessment)
  - Problems faced
    - High dimensionality
    - Scarcity of labeled data
- Method: MCAE and Ladder Network
  - Feature extractors
    - Baseline: PCA
    - Inspiration: Autoencoder
    - Our proposed method: MCAE
  - Classifiers
    - Baseline: SVM
    - Inspiration: Neural network
    - Our proposed method: Ladder network
- Results: State-of-the-art comparison
  - Comparison of MCAE against PCA and state-of-the-art
  - Comparison of Ladder Network against SVM and state-of-the-art

Day 17

Working with the ladder network all day has brought some things to light. Firstly, and most importantly, the model is overfitting like there's no tomorrow. This means that the model memorizes the training data instead of generalizing. Contrary to what high school taught me, this is undesirable. The original model predicts up to 15% more accurately on training data than on validation data, which is very bad. I've been trying to change parameters to reduce this margin, but I haven't had much success. The main problem is that there are way too many knobs to turn: I could add layers, remove layers, change the sizes of layers, adjust the amount of noise, change the weights of the costs, change the initial learning rate, change the learning rate schedule, or change the optimizer. Decisions, decisions.
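
To give a sense of just how many knobs there are, here's the kind of configuration I've been fiddling with. The names and values below are illustrative, not copied from our code:

```python
# Illustrative ladder network hyperparameters (made up for this post, not our actual settings).
config = {
    "layer_sizes": [103, 512, 256, 128, 9],             # input bands -> hidden layers -> classes
    "noise_std": 0.3,                                    # Gaussian noise added in the corrupted encoder
    "denoising_costs": [1000.0, 10.0, 0.1, 0.1, 0.1],    # per-layer reconstruction cost weights
    "initial_learning_rate": 0.02,
    "decay_after": 15,                                   # epoch at which the learning rate starts decaying
    "optimizer": "adam",
}
```

Every single one of these affects how badly the model overfits, and they all interact with each other.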

Day 16

The first thing I did when I got to the lab was test my changes. I believe the problem I was trying to solve was an exploding gradient, and my solution was gradient clipping. Don't quote me on that though. To my relief, it worked! Turns out there was a very simple solution to what seemed like a disastrous problem. I spent the rest of my time in the lab working on various parts of the ladder network. At 2:00, Mingming presented his driving simulator at Slaughter Hall. Watching Gerry drive like a madman and trying it out for myself was pretty fun. It's a cool project, and I think Mingming got some really good feedback and suggestions, which is nice to see. At the end of the day, the interns, REU students, and some staff drove down to the Mees Observatory. I'd never been to an observatory, so this was pretty cool. Unfortunately, we didn't get to see anything because of the weather. It was a good time nonetheless.
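
For anyone wondering what gradient clipping actually looks like, here's a minimal sketch in TensorFlow 1.x. This is not the exact code in the ladder network; the variable and loss here are stand-ins:

```python
import tensorflow as tf

# Toy variable and loss standing in for the ladder network's parameters and total cost.
w = tf.Variable([1.0, 2.0])
loss = tf.reduce_sum(tf.square(w))

optimizer = tf.train.AdamOptimizer(learning_rate=0.02)
grads_and_vars = optimizer.compute_gradients(loss)
grads, variables = zip(*grads_and_vars)
clipped, _ = tf.clip_by_global_norm(grads, clip_norm=5.0)   # cap the overall gradient norm
train_step = optimizer.apply_gradients(zip(clipped, variables))
```

The idea is simply that no single update can blow up, no matter how steep the loss surface gets.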

Day 15

Today I made a lot of slow and steady progress. With the paper and the CAE experiment out of the way (for now), I focused on the ladder network code. First, I adapted the code to be object-oriented so that it would (mostly) work within our pipeline. Every few iterations, I tested the code with Pavia U and MNIST, just to make sure I hadn't strayed from the 80% and 98% I achieved earlier. I would guess that failing to do so is what gives rise to hard-to-trace bugs, and nobody likes those. There was one issue where the model would crash and burn around epoch 50 and predict nothing at all. This might be linked to the NaN loss that was occurring in the previous ladder network code, and I think I've found the solution. Whether I have succeeded or failed will be in tomorrow's blog post. I haven't had enough time to run >50 epochs with the fix yet, and everybody likes a good cliffhanger. Tune in next week to find out who shot Nate and the real reason he left to run in the mountains.
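
About that object-oriented refactor: it mostly amounts to hiding the training and inference code behind the interface the rest of the pipeline expects. A minimal sketch of the idea (the class and method names here are hypothetical, not our pipeline's actual ones):

```python
class LadderNetworkClassifier:
    """Hypothetical wrapper so the ladder network plugs into a fit/predict style pipeline."""

    def __init__(self, layer_sizes, noise_std=0.3, epochs=50):
        self.layer_sizes = layer_sizes
        self.noise_std = noise_std
        self.epochs = epochs
        self.model = None  # built lazily in fit()

    def fit(self, X_labeled, y_labeled, X_unlabeled):
        # Build the graph and run the semi-supervised training loop here.
        ...

    def predict(self, X):
        # Run the clean encoder forward and return the most likely class per sample.
        ...
```

Once everything hides behind fit/predict, swapping classifiers in and out of the pipeline becomes trivial, which is the whole point.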

Day 14

"Happiness is when the code works." - poster from the lab I have finally made some progress on the ladder network. With the code from last time, I wrote a method to pass the Pavia University data into the ladder network. There were a surprising number of difficulties, but I got it to work. There were a few problems with how I formatted the data that I didn't notice at first, so I was heartbroken when the network made abysmal predictions. Turns out these algorithms don't learn too well when fed garbage, so I fixed the data and kept trying. Here is the end result. The ladder network was trained on 450 labeled HSI samples and 42,776 unlabeled samples trained over 30 epochs. I'll admit the results aren't great (an SVM achieved 84% on similar data). However, only necessary changes were made to the network, so this is basically the same model that was predicting digits yesterday. There's still room for optimization with this model, so I expect the accurac

Day 13

Today, Nate came back from the mountains to work in the lab. Welcome back, Nate! I kept on doing the same work from last time. With the CAE experiment v.2 running in the background, I proofread and edited my section of the paper, and I worked with the ladder network for the rest of the day. First, I got rinuboney's implementation of the ladder network in TensorFlow (GitHub) working in Python 3. Testing it with the MNIST dataset, I got the expected results (very good). Since we all love graphs, I quickly made one for said results. The idea is pretty simple. The ladder network is trained on 60,000 images of handwritten digits, 100 of which are labeled. The network makes predictions about the labels of 10,000 other images, and we calculate how accurate those predictions are. This process comprises one epoch, and is repeated 149 more times. By the end, the network achieves almost 99% accuracy. This is what we're hoping to do with HSI classification, but there seems to be an issue or two in the way.
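
Making that kind of graph is just a matter of recording the test accuracy after every epoch and plotting the list at the end. A sketch of the idea, where train_one_epoch() and evaluate() are hypothetical helpers standing in for the real training and evaluation code:

```python
import matplotlib.pyplot as plt

test_accuracies = []
for epoch in range(150):
    train_one_epoch()                   # hypothetical helper: one pass over the 60,000 images
    test_accuracies.append(evaluate())  # hypothetical helper: accuracy on the 10,000 test images

plt.plot(range(1, 151), test_accuracies)
plt.xlabel("Epoch")
plt.ylabel("Test accuracy")
plt.title("Ladder network on MNIST, 100 labeled examples")
plt.show()
```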

Day 12

Today I started rereading the ladder network papers and trying to figure out why the code doesn't work well. It was a fruitless endeavor; there's just too much that could go wrong. Perhaps the ladder network just does poorly on HSI data, and there is no flaw. Maybe it's as simple as changing some hyperparameters. Who knows? Most, if not all, of the interns came outside for lunch, which was a nice surprise. After lunch I was tired of looking at the ladder network code, so I went back to writing the paper. Taking the feedback Ron gave last time, I improved the SSL section a lot. It's not perfect, but we're getting there. I also had some time to visit the Perform Lab (Titus and Aditi's lab), where I got to see their SMI eye tracker (which was unfortunately malfunctioning), motion trackers, and HTC Vive, which they let me try. Very cool stuff; thanks to everyone at the Perform Lab.

Day 11

Today was pretty eventful. At the morning meeting, we finished up judging each other's abstracts (sans Emma's). Arriving at the lab, Ron reviewed some work with me. I now have three tasks on my plate:

1. Redo the (S)(M)CAE experiments, now with a slightly different setup.
2. Draft the semi-supervised learning portion of Ron's paper.
3. Debug the ladder network code.

Of the three, the last is by far the most difficult. There are hundreds of lines of code using a library I'm not familiar with, and it's emulating a model I don't fully understand. We're also dealing with big data, intertwined systems, and random distributions, which make it hard to debug with small, isolated tests. I'm planning on re-reading the papers on ladder networks until the model completely makes sense. Today I also attended the MVRL meeting about the Pupil Labs eye tracker. It seemed useful for high-quality video capture, but I'm not really familiar with the research. It's also cool to see what the other labs around here are working on.

Day 10

At the morning meeting we critiqued a few more abstracts, including mine. None of the other interns gave me any feedback, and Joe clarified a few points that were not fully explained in the abstract. I guess I'll just leave it as it is. I talked to Ron about finding papers that relate to the research, and apparently I had a misconception last time. Building upon the silly extended metaphor from last time, I present a continuation: it's like trying to find a needle in a haystack and finding a handful of paperclips, then realizing that there is another haystack, and that paperclips were what you were supposed to look for. I did this for most of the day today as well. There was a tech talk today about quantum physics and quantum computing. It seemed pretty out-of-place for CIS, but informative nonetheless. I also ran into a Victor graduate at the tech talk, which was kind of surreal. For the last part of the day, I wrote a bit about semi-supervised learning in Ron's research paper, and called it a day.

Day 9

This morning we looked at some of the interns' abstracts. This showed me that I should probably edit my own abstract. It's really quite difficult to maintain brevity (recommended <100 words) while trying to include everything I think I should. This morning's peer review helped me sort it out a bit more, though. Today was Ron's first day back, and he had a lot of catching up to do with Angelina, Marc, and me. He showed me how to use RIT Libraries and Google Scholar to find papers about semi-supervised models to compare our own performance to. This task basically took up all of my time today, and I'm likely going to do something similar tomorrow. The process is really tedious, but also clearly necessary for the paper I am helping Ron with. I'd compare it to trying to find a needle in a haystack, but finding a handful of paperclips instead. Not quite what you're looking for, but it'll have to do until you find a better one. As promised from last time, I

Day 8

Today Joe came back for morning meetings and Nate left to run in the mountains. Ron's supposed to be back tomorrow. Over the weekend, I collected data for the CAE experiments, so I don't have any more work with that until Ron gets back. So, I started looking at the ladder network architecture Ron implemented with TensorFlow. It's probably the most intricate model I've looked at thus far, but like anything else, it slowly made sense as I broke it down. The papers I read earlier helped a lot by providing a general deconstruction of the model. I used these papers, example code, and a healthy dose of common sense to document the code. I had a lot of downtime, so I did some reading and personal experimentation. In particular, I wanted to get familiar with what HSI data looks and feels like. Here's the result of a small script I wrote: a visualization of the 103 bands included in the Pavia University HSI dataset. On the right, brighter colors show higher reflectance values.
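
The script itself is only a few lines; something along these lines would reproduce the gist of it. I'm assuming the standard PaviaU.mat download, where the cube lives under the 'paviaU' key; the file and key names may differ depending on where you get the data:

```python
import scipy.io
import matplotlib.pyplot as plt

# Assumed file/key names for the standard Pavia University download.
cube = scipy.io.loadmat("PaviaU.mat")["paviaU"]   # shape (610, 340, 103)

fig, axes = plt.subplots(2, 5, figsize=(15, 8))
for ax, band in zip(axes.ravel(), range(0, 103, 11)):   # a spread of the 103 bands
    ax.imshow(cube[:, :, band], cmap="viridis")          # brighter = higher reflectance
    ax.set_title(f"Band {band}")
    ax.axis("off")
plt.tight_layout()
plt.show()
```

Flipping through the bands really drives home how much more information there is per pixel than in an RGB image.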

Abstract

The greatest challenge posed by hyperspectral image (HSI) classification is extracting features that are useful for classification. Convolutional autoencoders (CAEs) are designed to generate relevant and generalized features from this kind of data, and advancements such as stacked CAEs (SCAEs) and multiloss CAEs (MCAEs) attempt to improve upon the model. In this experiment, we train these models on randomly selected labeled samples of the Pavia University HSI dataset. A support vector machine (SVM) uses the features extracted by these models to make predictions about unlabeled data. The accuracy of these predictions is used to evaluate the quality of each feature extractor.

Day 7

This morning, Ron sent me some instructions on building a neural network in Keras, saving me from having nothing to do for the entire day. Nice! I kept the experiment running in the background while I worked on the neural network. Fortunately (or unfortunately), Keras made it very easy to create a sophisticated and successful model. I finished testing everything I needed to by midday. Nice? After lunch, there was still stuff to do. As per usual, I did some reading. I highly recommend the notes from Andrej Karpathy's class on using CNNs for visual recognition (link). In my Day 2 blog post, I mentioned trying to help Nate format his data to be compatible with the SVM. We are seven days into the internship, and nothing has changed. I figured out how to format my data for the image classification pipeline (one of our internship projects) two days ago, and this format happens to be different. Explaining this format and my implementation to Nate was pretty difficult, and I had a nagging feeling that my implementation was more complicated than it needed to be.
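
"Very easy" in Keras looks roughly like this. This is a toy sketch of a small fully connected classifier with placeholder layer sizes, not the actual network from Ron's instructions:

```python
from keras.models import Sequential
from keras.layers import Dense, Dropout

# Toy classifier: 103-band spectra in, 9 classes out (sizes are placeholders).
model = Sequential([
    Dense(128, activation="relu", input_shape=(103,)),
    Dropout(0.5),
    Dense(64, activation="relu"),
    Dense(9, activation="softmax"),
])
model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])
# Training would then be a single call, e.g.:
# model.fit(X_train, y_train, epochs=20, batch_size=32, validation_split=0.1)
```

A handful of lines and you have a compiled model with dropout, which is exactly why it felt almost too easy.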

Day 6

My experiment has officially started! I first cleaned up the code and prepared it for the 24 test cases, with 30 trials each. Then, I ran it. And waited. ... I predict that this will be a common trend for the next few days. During my downtime, I wrote my abstract and read some papers about machine learning models. The first test finished just after I came back from lunch. After collecting the results and calculating the relevant statistics, I started the second test. It was some time after this that I became aware of the fact that I was using 100% of the memory of all four GPUs on the server. After promptly stopping the experiment, I changed a setting to confine the damage to only one GPU and started it back up. Luckily, the program's performance didn't change significantly, since some packages like to allocate memory they don't necessarily need (hence why it was happily devouring ~48,000 MiB). I've been running the experiment remotely since I came home, and I plan to keep it going overnight.
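
For anyone who finds themselves hogging a whole server's worth of GPUs, the usual fixes are one (or both) of these; I won't swear to which exact line I changed. This assumes a TensorFlow 1.x backend:

```python
import os
import tensorflow as tf

# 1) Only make one GPU visible to the process (set this before TensorFlow touches the GPUs).
os.environ["CUDA_VISIBLE_DEVICES"] = "0"

# 2) Let TensorFlow grow its memory use as needed instead of grabbing everything up front.
config = tf.ConfigProto()
config.gpu_options.allow_growth = True
sess = tf.Session(config=config)
```

Either one keeps the framework from cheerfully reserving every byte of memory it can see.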

Day 5

Today I made really good progress, and I didn't really notice until I was about to leave. I ran into a lot of issues along the way: memory allocation issues (?!?), struggling to import a module, incorrect data dimensions, the list goes on. Huzzah. It took the entire day to iron out the bugs, and when the entire pipeline worked for the first time, it felt like I'd finished a marathon. A surprisingly quick runtime and high accuracy were my reward. There's still one bug where SCAEs work properly and CAEs don't, which is puzzling, since the SCAE code is completely reliant on the CAE code. That'll be a problem for tomorrow though. At noon, there was a talk about biomedical imaging from Professor Linte. It was a high-level overview of a variety of topics, from cardiac models to VR. It was a nice way to break up the day, and it helped me get a better idea of the other things that go on at CIS. Tomorrow, I'm going to be working out that one bug and hopefully starting my first experiment.

Day 4

Today was Joe's first day out on vacation and Matt's first day subbing in for the morning meetings. With no audio to accompany the cheesy video, we had a brief talk about imaging science. My work today began with more reading. This paper was really great at explaining ladder networks, and once I was fairly comfortable with the subject matter, I started working on my first experiment: testing convolutional autoencoders (CAEs) and multiloss CAEs (MCAEs) on the Pavia U dataset. So far, I've documented the CAE and stacked CAE (SCAE) files and set up the data formatting and a small part of the pipeline.
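
For anyone who hasn't met one before, the general shape of a convolutional autoencoder looks something like this. It's a generic 1D sketch over the spectral axis with made-up sizes, not the CAE/MCAE code I'm actually documenting:

```python
from keras.models import Model
from keras.layers import Input, Conv1D, MaxPooling1D, UpSampling1D

# Generic 1D convolutional autoencoder over a spectrum (illustrative sizes).
# Spectra are assumed zero-padded from 103 to 104 bands so the pooling divides evenly.
inp = Input(shape=(104, 1))
x = Conv1D(16, 7, activation="relu", padding="same")(inp)
x = MaxPooling1D(2, padding="same")(x)                           # compress along the spectral axis
encoded = Conv1D(8, 7, activation="relu", padding="same")(x)     # the learned features

x = UpSampling1D(2)(encoded)
decoded = Conv1D(1, 7, activation="linear", padding="same")(x)   # reconstruct the input spectrum

autoencoder = Model(inp, decoded)
autoencoder.compile(optimizer="adam", loss="mse")
```

The network is trained to reproduce its own input, so the bottleneck in the middle is forced to learn a compact description of each spectrum; those are the features we then hand to a classifier.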

Day 3

After today's daily briefing about writing abstracts, Nate and I got to work at the lab. I picked up where I left off with autoencoders and finished the convolutional and denoising ones pretty quickly. I'll probably come back to the more complex autoencoders some other time. Left with a nagging feeling that my SVM code from last time was subpar, I revisited it. I ended up rewriting almost all of it. The results included elegant data manipulation, good runtime, optimized preprocessing, tuned hyperparameters, complete metrics, and (this is the important one) okay accuracy. Not great, but not bad either. I think that under the given constraints, I got close to the best the SVM can do, so I'm ready to close the book on this tutorial experiment. That's not actually true; Nate's SVM is having some performance problems that we have yet to diagnose... I also continued reading about the topics relevant to my experiments. I have to admit: a lot of it still went over my head.
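
"Elegant data manipulation, tuned hyperparameters, complete metrics" sounds fancier than it is; scikit-learn does most of the work. Roughly the shape of the final code, as a sketch that assumes X holds the per-pixel spectra and y their labels:

```python
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.metrics import classification_report

# X: (n_samples, n_bands) spectra, y: class labels; assumed already loaded.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

pipeline = make_pipeline(StandardScaler(), SVC())                 # preprocessing + classifier
params = {"svc__C": [1, 10, 100], "svc__gamma": [0.001, 0.01, 0.1]}
search = GridSearchCV(pipeline, params, cv=3)                     # tune C and gamma by cross-validation
search.fit(X_train, y_train)

print(classification_report(y_test, search.predict(X_test)))      # the "complete metrics" part
```

Bundling the scaler and the SVM into one pipeline is also what keeps the preprocessing from leaking test data into training, which was one of the sloppier parts of my first attempt.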

Day 2

Day 2 started with a very brief meeting where we went over the schedule for the next six weeks. Abstract, outline, presentation; two weeks per task. It was at this point that I realized the difficulty of making technical topics sound interesting, then I immediately dismissed the problem as one for my future self. Today was dedicated to Support Vector Machine (SVM) creation, evaluation, and optimization. I was unaware of many of the wonderful functions included in the provided libraries, and ended up doing the data reformatting less elegantly than I could have. It still worked, so I left it. When I finished up my SVM code, I tried to help Nate with the errors he was running into. The keyword here is "tried." It turns out it's not easy to explain concepts that you figured out only hours earlier and that happen to be more complicated than they should be. Nate was a good sport about it though, and with Ron's help, he had a good idea of the solution by the end of the day. For the last part of the day,

Day 1

The day started with a brief gander at the intern handbook and a tour around the Carlson building, then we interns went down to the Red Barn for some team building. We got to learn each other's names before inspecting many ropes, building PVC pipe structures of questionable integrity, unsafely flipping magic carpets, and more. We're a pretty quiet group, which made the team building a bit difficult, but we got to know each other nonetheless. Ending with some final thoughts, we wandered our way back to the Carlson building for lunch, which was provided for us (thanks Mr. Pow!). After lunch we got to meet with our respective research groups. When we got to the lab, Ron and Dr. Kanan talked to Nate (my fellow intern) and me about college, research, and the internship in general. Ron went through all of the topics we'd be covering through tutorials and readings throughout the first week, just to give us an idea about our focus. I'm really glad he took the time to put that together for us.