RE Class: Experiences With Malware Term Projects

Mon 21 April 2014 by Wesley

This week in my reverse engineering class, the students are presenting the work they've done throughout the semester on their term-length projects. A running theme through every course I teach is the importance I place on good technical writing and the ability to convey the results of your work to others. Essentially: if you didn't document it in a way someone else can follow, you might as well not have done it. The students who don't wind up going into infosec or reverse engineering jobs will at least have exercised and demonstrated some ability to communicate. So, in this class, they have to write a whitepaper and give a 25-minute presentation on it.

The assignment is fairly open-ended, which creates some (not insurmountable) problems I'll discuss in a bit. The students are to:

  • Select a malware sample (massive thanks to VirusShare for providing a good repository for my students to use)
    • Windows PE
    • Compiled IA32 code (no .NET, Java, AutoIT, etc.)
  • Write a proposal covering what is already known from other online analyses, plus their own basic static analysis
  • Spend the rest of the semester hacking away at it, as they see fit
  • Write a whitepaper based on their findings and experiences
  • Present their work in front of the class (the rest of the department is invited as well)
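The sample constraints in the list above (Windows PE, IA32) can be partly checked before committing to a sample. The following is a minimal sketch, not part of the course materials: the function name `check_sample` and the result keys are made up for illustration. It reads the PE headers by hand and reports whether the file is a PE, whether it targets 32-bit x86, and whether it carries a CLR header (which indicates .NET). It cannot detect wrapped interpreters such as AutoIT or a bundled Java runtime, since those ship as ordinary native executables.

```python
import struct

IMAGE_FILE_MACHINE_I386 = 0x014C   # COFF machine type for 32-bit x86
PE32_MAGIC = 0x10B                 # optional-header magic for PE32
PE32_PLUS_MAGIC = 0x20B            # optional-header magic for PE32+ (64-bit)
CLR_DIRECTORY_INDEX = 14           # "CLR Runtime Header" data directory


def check_sample(data: bytes) -> dict:
    """Hypothetical helper: does this buffer look like a native IA32 PE?"""
    result = {"is_pe": False, "is_i386": False, "is_dotnet": False}

    # DOS header: "MZ" magic, with the PE header offset (e_lfanew) at 0x3C.
    if len(data) < 0x40 or data[:2] != b"MZ":
        return result
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)

    # PE signature, followed by the 20-byte COFF file header.
    if len(data) < pe_offset + 24 or data[pe_offset:pe_offset + 4] != b"PE\x00\x00":
        return result
    result["is_pe"] = True

    (machine,) = struct.unpack_from("<H", data, pe_offset + 4)
    result["is_i386"] = machine == IMAGE_FILE_MACHINE_I386

    # Optional header: the data directories start at offset 96 (PE32)
    # or 112 (PE32+), preceded by a 4-byte directory count.
    opt = pe_offset + 24
    if len(data) < opt + 2:
        return result
    (magic,) = struct.unpack_from("<H", data, opt)
    if magic == PE32_MAGIC:
        dirs = opt + 96
    elif magic == PE32_PLUS_MAGIC:
        dirs = opt + 112
    else:
        return result
    if len(data) < dirs:
        return result
    (num_dirs,) = struct.unpack_from("<I", data, dirs - 4)

    # A non-empty CLR directory entry means the file is a .NET assembly.
    entry = dirs + CLR_DIRECTORY_INDEX * 8
    if num_dirs > CLR_DIRECTORY_INDEX and len(data) >= entry + 8:
        rva, size = struct.unpack_from("<II", data, entry)
        result["is_dotnet"] = rva != 0 and size != 0
    return result
```

A sample fits the assignment's basic criteria when `is_pe` and `is_i386` are true and `is_dotnet` is false. To try it on a file, read the file in binary mode and pass the bytes to `check_sample`. Only do this inside an isolated analysis environment, since the input is live malware.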

Students are allowed to work individually or in pairs. That's their choice.

Today was the first day of presentations, with five of the twelve presentations given. We were treated to some interesting talks on:

  • Two variants of Zeus (with very different approaches by the analysts and reversed functionality)
  • BANGAT, from the Mandiant APT1 set (I love these because most are relatively easy to reverse and make excellent teaching aids)
  • CryptoLocker
  • Swen.A, an ancient (especially by my students' standards) worm

I've made some observations:

Comfort with reading disassembled code

It's easy to tell who's comfortable with IA32 assembly language by now, because they're standing up in front of the class generalizing about blocks of code and looping groups of blocks in very high-level (but functionally accurate) terms and statements. It's like they're finally seeing The Matrix within all the falling green symbols. They know when the code's doing very mundane things and they can speed through it, and when it's doing something they need to pay attention to.

On the other side of things, you can tell who hasn't read enough assembly by how close to verbatim their discussion of each instruction gets. You'll see them fret over things like the PUSHes leading up to a function call. Their minds are stuck permanently in the mode of a virtual x86, executing line after line. It's a tough thing for the students, since their only prior exposure to assembly language is on PIC microcontrollers.

It really seems to be a function of how much time they've spent looking at disassembled code.

Building capability

As a result of the class, it seems like we're getting more and more students who are capable with, and interested in, the low-level details of computing. Honestly, most of my students are now where I was just a few years ago. Putting together this class and teaching it has taught me a lot. Many of the RE skills turn out to be the foundational skills of other interesting areas of computer security as well: vulnerability discovery, exploit development, etc.

Students are building their own tools now, in addition to using IDA Pro and Immunity. They're getting interested in performing research in this area and furthering the art and science of it, as graduate students and undergraduate researchers. This is creating a great environment for me to improve my skills as well, and I think it'll be good for the department and the university in the long run.

Handling criticism

Each student is subject to a 5-minute Q&A period, in which I encourage the rest of the class to challenge the speaker on decisions made, techniques used, and findings that don't appear to be well-supported. If they don't, I tell them I will. It usually turns out, though, that most of the questions I've jotted down wind up being asked by the students' peers before I get a chance to jump in.

I've been very impressed with how professionally the students have handled this, and I can tell that, knowing this is coming, students have taken care in their work. This leads to confidence in their presentations.

Malware selection is hard

...especially early in the semester. The students essentially don't know what they don't know. It's easy to bite off more than they can chew with a sample that's very large and complex. It's also easy to accidentally pick something that's too trivial. I try to get them to err on the side of complexity. I'd rather they cover a subset of functionality very well than exhaust all the functionality of a sample too early.

Focus and direction

It's also very difficult for those new to this to know where to focus their efforts on a new and unknown sample. We occasionally wind up with analyses that are incredibly in-depth on installation procedures while ignoring other much more interesting functionality. As long as the students have been productive with their time, this doesn't bother me very much, but I do need to try harder in the future to cover ways of slicing up large codebases and narrowing our focus to more interesting functionality.

Old malware is kinda fun

Swen.A spreads over Kazaa and NNTP, and has all sorts of other oddities of late '90s/early 2000s malware programming that make it almost comical to look at now. The student who analyzed it worked on it about as hard as anyone else worked on newer samples, though, and came up with a good analysis. He had some interesting challenges in setting up an environment for it to live in for dynamic analysis. In this case, the journey itself was a very instructive reward.

High expectations are rewarded

Challenge your students to write command-and-control servers, and a lot of them will do it. Expect professional-looking whitepapers, and they'll largely deliver. On a more basic level, if you tell them they're to teach themselves a new language (IA32 assembly) within a couple of weeks and you start testing them on it shortly thereafter, they'll surprise themselves with how quickly they pick it up. A lot of the students have learned some good problem-solving and research skills as a result of this class.

Overall, it's been a lot of fun and I'm looking forward to the remaining presentations.