Hey all,
My name is Eric, and I’ll be one of your regular authors here on int main(). You’ve probably already read a little bit about me in the post that my friend Sam put up earlier, but I’ll start with a quick introduction and background. I’m a second-year Computer Science student at the University of California, Los Angeles. I consider myself well-versed primarily in C, C++, and Java. I also enjoy (if that’s the word) working with assembly on occasion (I’ll post a discussion of some assembly topics later on). Anyway, that’s a little bit about myself. I’d also like to state here that if any of our readers have side projects that they are working on and would like an extra set of eyes or hands, feel free to email. Now that I’ve dispensed with the introductory remarks, let’s dive right into today’s topic.
I feel that as a reader, I would be hesitant to read and trust someone’s opinion on things unless I knew that they actually had some experience with the things that they write about. To that end, I’d like to start this blog by talking about a program that I was a developer on: AudioGuardian. This calls for a bit of background information, I’m sure. The project started as part of a competition back in February 2009. I participated in SS12, put on by ProjectPossibility, an organization aiming to make technology more accessible to the disabled community. AudioGuardian was originally named MobileSoundNotifier, and its purpose was to alert hearing-impaired users to hazards in their environment. I will append links to the project’s webpage, the project wiki, and the SVN if any of our readers would like to see some of my “production” code. I put that in quotes because it’s definitely hacked code: the project was completed from scratch over the span of approximately 30 hours. With that said, here are the links to the project information:
Project Webpage
Project Wiki
The SVN is linked from the bottom right-hand corner of the webpage, if any of you enterprising developers would like to take the project and work on it yourself. It is licensed under the GPL, so keep that in mind.
I guess you can’t really fork this project without knowing a bit about how we actually designed and developed the program. And since this is a CS blog and not a self-promotion blog, I will now move into a technical discussion of the program. I’m going to keep this as high-level as possible so it will appeal to all audiences. Let’s get to it.
This application is built using J2ME, which is the Java 2 Platform, Micro Edition. It has most of the capabilities of the full Java platform, but it’s missing some key features. It does not support Swing, so we used the built-in J2ME graphics library to build a primitive GUI. The programming paradigm is event-driven, like the majority of Java programs: a control loop listens for commands and acts appropriately. The basic application logic should be simple to follow. But that’s all academic and really not what you’re here for. Let’s get into the problems with working with audio.
The heart of this program is an audio analyzer. It listens to the surrounding environment and raises alerts when it detects sounds deemed dangerous by either us (the programmers) or the user. Those of you with a physics background may have heard of the discrete Fourier transform (DFT) before. If so, you will know that it is a wave analysis algorithm that translates a signal from the time domain to the frequency domain. This was important for our application because we could not perform sound matching on the raw time-domain samples. The only factors we could correlate between sound waves were amplitude and frequency, and we needed a DFT to extract those from the data stream. Sounds simple, right? Without looking at any of the project code (it’s too long to post here, but you can download it yourself and look at it), let’s discuss the issues here.
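For readers who have not met the DFT before, here is a minimal sketch of the idea in plain Java. This is not the AudioGuardian code: the class name DftSketch, the method names, and the 8000 Hz sample rate are all made up for illustration, and it uses the slow textbook formula rather than anything tuned for a phone.

```java
// Hypothetical illustration only; not taken from the AudioGuardian source.
final class DftSketch {

    // Returns the magnitude of frequency bins 0..n/2 for real-valued samples,
    // using the direct O(n^2) definition of the DFT.
    static double[] magnitudes(double[] samples) {
        int n = samples.length;
        double[] mags = new double[n / 2 + 1];
        for (int k = 0; k < mags.length; k++) {
            double re = 0.0;
            double im = 0.0;
            for (int t = 0; t < n; t++) {
                double angle = 2.0 * Math.PI * k * t / n;
                re += samples[t] * Math.cos(angle);
                im -= samples[t] * Math.sin(angle);
            }
            mags[k] = Math.sqrt(re * re + im * im);
        }
        return mags;
    }

    // Index of the strongest bin, skipping bin 0 (the DC component).
    // Assumes mags has at least two entries.
    static int dominantBin(double[] mags) {
        int best = 1;
        for (int k = 2; k < mags.length; k++) {
            if (mags[k] > mags[best]) {
                best = k;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        int n = 64;
        double sampleRate = 8000.0; // assumed sample rate in Hz
        double[] samples = new double[n];
        for (int t = 0; t < n; t++) {
            // A pure 1000 Hz tone as a stand-in for recorded audio.
            samples[t] = Math.sin(2.0 * Math.PI * 1000.0 * t / sampleRate);
        }
        int bin = dominantBin(magnitudes(samples));
        System.out.println("dominant frequency: " + (bin * sampleRate / n) + " Hz");
    }
}
```

Bin k corresponds to a frequency of k * sampleRate / n, so the strongest bin gives a rough idea of the dominant pitch, and its magnitude gives a rough idea of how loud that pitch is. Those are the two quantities a matcher can compare.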
Not all microphones are created equal. Every mic has a different sensitivity and applies a different DC offset to the raw data stream. This was a problem for us, since we were doing development across different laptops with different microphones. To get around it, we decided to build a calibration function that listens to the ambient environment and basically filters out the offset and the outside noise. This also had the unintended side effect of increasing the accuracy of the matching function. It gave us cross-compatibility as well, so our application should work with any microphone out there. The program also had some problems processing data streams in byte array form. The Recorder class in J2ME had a nice little habit of including the file header when it passed the data to a byte array. This caused garbage to be passed to the graphing function, which in turn caused some serious inaccuracy and lag time in sound recognition. To fix this, we had to strip the file header ourselves. I’ll leave it to the reader to read the code and see exactly what we did.
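To make those two fixes a bit more concrete, here is a rough sketch in plain Java. It is not what the project actually does. The names are hypothetical, the 44-byte header length is an assumption (it is the size of a canonical PCM WAV header, and the real recording format may differ), and the calibration here is the simplest thing I could think of: average the ambient samples to estimate the offset, then treat the largest ambient deviation as the noise floor.

```java
// Hypothetical illustration only; not taken from the AudioGuardian source.
final class CalibrationSketch {

    // Assumption: a canonical PCM WAV header is 44 bytes long.
    static final int ASSUMED_HEADER_BYTES = 44;

    // Returns a copy of the recording with the first headerBytes bytes removed.
    static byte[] stripHeader(byte[] recorded, int headerBytes) {
        if (recorded.length <= headerBytes) {
            return new byte[0];
        }
        byte[] pcm = new byte[recorded.length - headerBytes];
        System.arraycopy(recorded, headerBytes, pcm, 0, pcm.length);
        return pcm;
    }

    // Estimates the DC offset as the mean of samples taken in a quiet room.
    static double dcOffset(double[] ambient) {
        if (ambient.length == 0) {
            return 0.0;
        }
        double sum = 0.0;
        for (int i = 0; i < ambient.length; i++) {
            sum += ambient[i];
        }
        return sum / ambient.length;
    }

    // Estimates the noise floor as the largest deviation from the offset.
    static double noiseFloor(double[] ambient, double offset) {
        double floor = 0.0;
        for (int i = 0; i < ambient.length; i++) {
            floor = Math.max(floor, Math.abs(ambient[i] - offset));
        }
        return floor;
    }

    // Subtracts the offset and zeroes anything at or below the noise floor.
    static double[] clean(double[] samples, double offset, double floor) {
        double[] out = new double[samples.length];
        for (int i = 0; i < samples.length; i++) {
            double centered = samples[i] - offset;
            out[i] = Math.abs(centered) <= floor ? 0.0 : centered;
        }
        return out;
    }
}
```

A caller would run dcOffset and noiseFloor once on a short ambient recording, then pass every later buffer through stripHeader and clean before handing it to the analyzer.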
Other problems we encountered:
The DFT itself was not easy to write. It didn’t help that J2ME was very poorly documented, and as such we could not find a standard library to lean on. We ended up borrowing an implementation of quicksort from an outside source. It was optimized for use on pairs of data, which was what we were using for matching. Writing the DFT alone took us almost an entire day of work, and the rest of the time was spent building a GUI and other small features into the program.
One final problem was the issue of data persistence. We wanted to be able to store settings and data across instances of the program. Sounds simple, right? Write the data to a file and restore it when you open the program? Easy enough... until you look at how mobile phones and J2ME store data. We ended up using something called a RecordStore, which is a class in J2ME’s record management system. It has to be opened before it can be used, and it holds records as raw byte arrays. It is indexed in a very counterintuitive manner, not like a normal array or vector of objects (for one thing, record IDs start at 1). We essentially ended up taking this RecordStore object and building our own suite of methods to store and access data, because the built-in functionality was not doing what we needed it to do. This took a large amount of time as well, and took brains away from cracking the DFT. All in all, however, we managed to get everything working.
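Since a RecordStore only holds byte arrays, most of the work in a wrapper like ours is turning settings into bytes and back. Here is a small sketch of that step in plain Java. The Settings fields are invented for illustration and are not the ones AudioGuardian stores, and the RecordStore call itself is left out so the example runs on a desktop JVM.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

// Hypothetical illustration only; not taken from the AudioGuardian source.
final class SettingsCodec {

    // Made-up settings; the real application's fields may be quite different.
    static final class Settings {
        final int sensitivity;
        final boolean vibrate;
        final String soundName;

        Settings(int sensitivity, boolean vibrate, String soundName) {
            this.sensitivity = sensitivity;
            this.vibrate = vibrate;
            this.soundName = soundName;
        }
    }

    // Packs the settings into a byte array suitable for a single record.
    static byte[] encode(Settings s) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            out.writeInt(s.sensitivity);
            out.writeBoolean(s.vibrate);
            out.writeUTF(s.soundName);
            out.flush();
            return bytes.toByteArray();
        } catch (IOException e) {
            // In-memory streams should not fail; rethrow unchecked if they do.
            throw new IllegalStateException(e.getMessage());
        }
    }

    // Reads the fields back in the same order they were written.
    static Settings decode(byte[] record) {
        try {
            DataInputStream in = new DataInputStream(new ByteArrayInputStream(record));
            int sensitivity = in.readInt();
            boolean vibrate = in.readBoolean();
            String soundName = in.readUTF();
            return new Settings(sensitivity, vibrate, soundName);
        } catch (IOException e) {
            throw new IllegalStateException(e.getMessage());
        }
    }
}
```

On a phone, the array returned by encode is the kind of thing you would hand to RecordStore.addRecord(data, 0, data.length), and decode would be applied to whatever comes back from getRecord. The fields have to be read in exactly the order they were written, which is one reason it pays to keep both methods side by side.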
Well, that was long. Hopefully I gave you all a bit of insight into that project, and a bit of perspective on what goes on in developing a mobile phone application when you’ve never done it before. If any of you have questions or would like me to cover something more in-depth, please post a comment or drop us an email. We also appreciate email if there is any topic you’d like us to talk about in general. See you next time.