As we get older our hearing declines. This is a problem that needs solving.
My father has a refrigerator door alarm that goes off when the door is left open for more than two minutes. Unfortunately the alarm sounds at a frequency he cannot hear. Assisting the elderly makes me feel like a good person and I figured I could hack the door alarm so it would have the intended effect.
Various Approaches, None Straightforward
My first problem was determining my strategy. If I just modulated the frequency of every noise in the room and echoed it back, that would quickly become annoying. I needed to actually detect the alarm. So I did what anyone would do in this situation and bought a Raspberry Pi.
I had no experience with a Raspberry Pi or an Arduino, but I figured, how hard can it be? Eight-year-olds are making lightsabers with these things. It turned out hubris was my second problem.
A Raspberry Pi is a computer. It runs a thin Linux distro, has a bunch of USB ports, an audio port, and an HDMI port. It will run programs written in any language that has a compiler or runtime for Linux on its ARM processor. The microphone I bought was a serial port microphone and would only work on an Arduino, so any chance of having sample code to work from was lost. I ended up using a USB microphone and the audio line out to speakers.
Problem and Approach
So, here was the task: detect a beep that lasts one second, pauses nine seconds, and repeats. Once a beep is detected, allow the occupants of the room a few polling intervals to shut off the alarm. If the alarm is still sounding after some number of beeps, raise a flag (I chose to play a car alarm wav file, but considered leveraging my company’s API to send an SMS or place an IVR phone call).
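The grace period described above can be sketched as a small counter. This is a minimal Python illustration, not the code from the project: the class name `AlarmTracker`, the `grace_beeps` parameter, and the choice to reset on a single missed beep are all assumptions made for the example.

```python
class AlarmTracker:
    """Counts consecutive detected beeps and decides when to escalate.

    Hypothetical helper: grace_beeps is an assumed tuning value, and
    resetting on one missed beep is an assumed policy.
    """

    def __init__(self, grace_beeps=3):
        self.grace_beeps = grace_beeps
        self.consecutive = 0

    @property
    def in_alarm_mode(self):
        # We are "in alarm mode" as soon as one beep has been heard.
        return self.consecutive > 0

    def update(self, beep_detected):
        """Feed one polling result; return True when it is time to raise the flag."""
        if not beep_detected:
            self.consecutive = 0
            return False
        self.consecutive += 1
        return self.consecutive > self.grace_beeps


tracker = AlarmTracker(grace_beeps=2)
for heard in (True, True, True):
    if tracker.update(heard):
        print("door still open: play the car alarm")
```

With `grace_beeps=2`, the first two beeps are tolerated and the third one triggers the escalation; a single quiet poll puts the tracker back in its idle state.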
The problem boiled down to detecting the beep, and everything else was trivial.
At first, I had some false starts using fast Fourier transforms (FFT). Running analysis on the entire band of detectable frequencies (0-22,050Hz) is computationally expensive, slow, and overkill. I really only needed to look at the small band of frequencies right near the target noise.
I found success with the Goertzel algorithm, which evaluates a single bin of the discrete Fourier transform (DFT) instead of the whole spectrum. I used it to measure the amplitude at one frequency: 2711Hz. I found the target by running FFTs on a recording of the alarm and finding the outlier in terms of amplitude.
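For readers who have not met the Goertzel algorithm, here is a minimal Python sketch of it. It is not the code from the project. The 44,100Hz sample rate is inferred from the 0-22,050Hz band mentioned earlier, and the function names and the synthetic test tone are assumptions made for the example.

```python
import math


def goertzel_power(samples, sample_rate, target_hz):
    """Return the squared magnitude of the DFT bin nearest target_hz."""
    n = len(samples)
    k = round(n * target_hz / sample_rate)  # index of the nearest DFT bin
    coeff = 2.0 * math.cos(2.0 * math.pi * k / n)
    s_prev = 0.0   # s[i-1]
    s_prev2 = 0.0  # s[i-2]
    for x in samples:
        s = x + coeff * s_prev - s_prev2
        s_prev2 = s_prev
        s_prev = s
    # Standard Goertzel power formula from the last two state values.
    return s_prev * s_prev + s_prev2 * s_prev2 - coeff * s_prev * s_prev2


def tone(freq_hz, sample_rate, n):
    """Synthetic sine wave, standing in for a microphone recording."""
    return [math.sin(2.0 * math.pi * freq_hz * i / sample_rate) for i in range(n)]


SAMPLE_RATE = 44_100          # assumed; matches a 0-22,050Hz band
WINDOW = SAMPLE_RATE // 5     # 200ms of audio

alarm_like = goertzel_power(tone(2711, SAMPLE_RATE, WINDOW), SAMPLE_RATE, 2711)
other = goertzel_power(tone(1000, SAMPLE_RATE, WINDOW), SAMPLE_RATE, 2711)
print(alarm_like > other)
```

The loop does one multiply and two additions per sample, which is why it is so much cheaper than a full FFT when only one frequency matters. A real detector would compare the returned power against a threshold tuned on actual recordings; no threshold is suggested here because it depends on the microphone and the room.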
Timing is Key
The next problem was: how do I guarantee that if an alarm is triggered, I’ll capture it? I found that keeping the polling interval shorter than the noise’s duration ensures that.
My approach was to record for 200ms (0.2 seconds) and run the DFT on the recording. Because my implementation consistently finished the analysis in less than 200ms, I could then let the thread sleep for a fixed amount of time and still keep the total elapsed time per cycle under one second.
The guarantee holds because we record a 200ms sample at least once every second and never go more than 700ms without recording. This means that any sound lasting 1000ms must overlap at least one recording, so it cannot go undetected. It may help to visualize this on a timeline.
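The timing argument can be checked with a few lines of Python. This sketch assumes the worst case described above: 200ms recordings separated by exactly 700ms, so one cycle is 900ms. It slides a 1000ms beep across every starting offset and reports how much of the beep lands inside the best-placed recording. The function name and defaults are made up for the illustration.

```python
def best_overlap_ms(beep_start_ms, record_ms=200, gap_ms=700, beep_ms=1000):
    """Largest overlap (ms) between the beep and any single recording window.

    Recording windows are assumed to start at 0, cycle, 2*cycle, ...
    where cycle = record_ms + gap_ms.
    """
    cycle = record_ms + gap_ms
    beep_end = beep_start_ms + beep_ms
    first = max(0, (beep_start_ms - record_ms) // cycle)
    last = beep_end // cycle
    best = 0
    for i in range(first, last + 1):
        start = i * cycle
        end = start + record_ms
        overlap = min(end, beep_end) - max(start, beep_start_ms)
        best = max(best, overlap)  # negative overlaps (no contact) are ignored
    return best


# Try every millisecond offset within one polling cycle.
worst = min(best_overlap_ms(start) for start in range(900))
print(worst)
```

Under these assumptions the beep always overlaps some recording, and in the worst case 150ms of it falls inside a single 200ms window. Whether a partial window like that is enough to trip the detector depends on the threshold, which is one reason to keep the real gap comfortably below 700ms.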
If it detects an alarm, it enters an “in alarm mode” state. I needed some precision here to ensure that the next recording started exactly 10 seconds after the prior one began. My approach was to mark the time, start recording, run the analysis, and mark the time again. Let x be the time in milliseconds between starting the recording and finishing the analysis, then:
If in alarm mode: sleep 10,000ms - x
Otherwise: sleep 500ms
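The sleep rule above translates almost directly into code. Here is a minimal Python sketch. The function names are hypothetical, the clamp to zero is an added safeguard for the case where x exceeds 10 seconds, and `time.monotonic` is used so that a system clock adjustment cannot skew the measurement.

```python
import time

ALARM_PERIOD_MS = 10_000  # one beep plus the nine-second pause
IDLE_SLEEP_MS = 500       # sleep between polls when no alarm is active


def next_sleep_ms(in_alarm_mode, elapsed_ms):
    """How long to sleep after a record-and-analyze step that took elapsed_ms."""
    if in_alarm_mode:
        # Aim for the next recording to start 10s after this one started.
        return max(0.0, ALARM_PERIOD_MS - elapsed_ms)
    return float(IDLE_SLEEP_MS)


def timed_ms(step, clock=time.monotonic):
    """Run step() and return (result, elapsed milliseconds)."""
    started = clock()
    result = step()
    return result, (clock() - started) * 1000.0


# Example: pretend the record-and-analyze step is a 10ms pause that finds a beep.
detected, x = timed_ms(lambda: time.sleep(0.01) or True)
print(next_sleep_ms(detected, x) <= ALARM_PERIOD_MS)
```

In a real loop the value returned by `next_sleep_ms` would be divided by 1000 and passed to `time.sleep`.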
Deploying
Once that was implemented and tested, I wrapped my service using YAJSW, began logging, and deployed it. I also created a job to restart the service at midnight to ensure that if it crashed today it would still work tomorrow.
Conclusion and Code
It’s fun to see your ideas come to life. It’s why I do what I do. Developing things that have tangible impacts is a very engaging task. Even if your results are less than stellar, the experience is often well worth the effort. Seeing your code work the first time, seeing your app launch, seeing your refrigerator door alarm work – they all bring me the same smile, childish enjoyment, and feeling of accomplishment.
Source Code: GitHub
A funny side note: the car alarm wav file is about 40 seconds long. A few days after deploying, my dad called me laughing to tell me that he’d gotten the message, and could I please have it shut up once the door was closed? After I explained the technical hurdles (while it’s playing the alarm, it’s not recording), he promised he’d “be a quick learner”.