7/18/2014– Jon McKay
Image source: Matthew Bergman
Several months ago, Eran Hammer messaged me about helping him with a project he was working on for NodeConf. My only experience with Eran Hammer up to that point was watching his over-the-top presentation at Realtime Conf. I’ve since forgotten a majority of the presentation, but I do distinctly remember a dime bag containing blue rock candy, dozens of remotely controlled mortar and pestles, and several hundred pounds of dirt and edible plants in a warehouse. In short, it was weird and intriguing and made me feel vaguely like I had abused moderately-strong hallucinogens.
Naturally, I was excited about helping with the next project, especially since I could develop it on Tessel.
Eran was set on creating an LED fireworks display to be “set off” on July 4th. The idea was scaled down from an overhead, 15-foot-wide, 10-foot-long LED grid (estimated cost of $75,000) to “simply” 1260 Neopixel LEDs laid out in the shape of an exploding firework.
Neopixel is Adafruit’s brand for addressable LED strips using a specific chip called the WS2812. They are extremely popular due to their simple programming interface and mesmerizing light displays.
In all, Eran’s laptop ran a Node server responsible for initiating the Tessel script, generating animations, and sending those animations as binary packets to Tessel. Tessel was responsible for routing the buffer to the Neopixel driver.
Each RGB LED receives 3 bytes of data (one byte for each color) and then uses a shift register to pop those bytes off before sending the rest of the animation buffer on to the next pixel. Each transmitted bit is essentially a PWM signal with a period of 1.25 microseconds. A “0” is represented by a duty cycle of 32% and a “1” is represented by a duty cycle of 85% (with a ~10% error margin). At the end of an animation frame, the signal should be held low for at least 50 microseconds.
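To make the encoding concrete, here is a minimal Python sketch that turns a pixel's bytes into per-bit high and low durations using the period and duty cycles quoted above. The function name `encode_bits` and the most-significant-bit-first ordering are assumptions for illustration; this is not the Tessel firmware.

```python
# Sketch only: per-bit pulse timing from the figures quoted in the text.
PERIOD_PS = 1_250_000                   # 1.25 microseconds per bit, in picoseconds
ZERO_HIGH_PS = PERIOD_PS * 32 // 100    # "0" bit: high for 32% of the period
ONE_HIGH_PS = PERIOD_PS * 85 // 100     # "1" bit: high for 85% of the period
RESET_LOW_PS = 50_000_000               # hold low for at least 50 microseconds per frame


def encode_bits(data: bytes) -> list[tuple[int, int]]:
    """Return one (high_ps, low_ps) pair per transmitted bit.

    Assumption: bits are sent most-significant first.
    """
    pulses = []
    for byte in data:
        for shift in range(7, -1, -1):
            bit = (byte >> shift) & 1
            high = ONE_HIGH_PS if bit else ZERO_HIGH_PS
            pulses.append((high, PERIOD_PS - high))
    return pulses


if __name__ == "__main__":
    pixel = bytes([0xFF, 0x00, 0x80])   # one pixel = 3 bytes = 24 pulses
    print(len(encode_bits(pixel)), "pulses for one pixel")
```

Picoseconds are used so that every duration stays an integer.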
Eran and I were initially worried that we wouldn’t be able to get a good frame rate with a single strand of 1260 pixels. A reasonable frame rate for animations is 30 frames per second. To determine the expected frame rate with Neopixels, we’d need to do a little math:
1260 pixels * (24 bits/pixel * 1.25 microseconds/bit) + 50 microseconds = 37850 microseconds, or about .0379 seconds per animation frame. In theory, that’s about 26 frames per second, just short of the 30 we were hoping for, and real life is always a bit slower. We weren’t sure by exactly how much.
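The estimate is easy to check by evaluating the expression term by term. This is a small sketch; the names `frame_time_ns` and `max_fps` are made up for illustration.

```python
PIXELS = 1260
BITS_PER_PIXEL = 24
BIT_PERIOD_NS = 1250     # 1.25 microseconds per bit
RESET_NS = 50_000        # 50 microsecond low period after each frame


def frame_time_ns(pixels: int) -> int:
    """Time to clock out one frame on a single strand, in nanoseconds."""
    return pixels * BITS_PER_PIXEL * BIT_PERIOD_NS + RESET_NS


def max_fps(pixels: int) -> float:
    """Theoretical upper bound on frame rate, ignoring all software overhead."""
    return 1e9 / frame_time_ns(pixels)


if __name__ == "__main__":
    t = frame_time_ns(PIXELS)
    print(f"{t / 1000:.0f} microseconds per frame, {max_fps(PIXELS):.1f} fps at best")
```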
Fortunately, our LPC1830 microcontroller features a State Configurable Timer (SCT) which provides programmers with an extremely flexible interface for pin manipulation. Essentially, it allows you to define different timing states associated with different pin inputs and outputs.
In the case of a single strand of neopixels, I could have used 4 different states. A simplified example of how it could work would be:
A timer for the entire period. This timer sets the output pin high at the beginning of a bit transmission which makes the PWM active.
A timer for the “0” duty cycle completion (at 32% as mentioned above). When this event fires, it will actually check the value of the current bit being transferred. If it’s a 0, then there is no conflict and the pin state will be pulled low. If it’s a 1, there is a conflict, and the pin will not be changed.
A timer for the “1” PWM completion (at 85%) will pull the pin low regardless of the value of the bit being transmitted. This is because if it’s a 0, the pin was already pulled low and it doesn’t matter, and if it’s a 1, it needs to be pulled down immediately.
A timer that only fires at the end of 8 bits. This timer will call an interrupt routine to set up the next byte to be transferred.
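The four events above can be sketched as a tick-by-tick software simulation. This is not SCT configuration code: the 100-tick resolution, the `simulate` name, and the most-significant-bit-first order are assumptions chosen to keep the example readable.

```python
PERIOD = 100      # hypothetical resolution: 1 tick = 1% of the bit period
ZERO_DONE = 32    # tick at which the "0" duty cycle completes
ONE_DONE = 85     # tick at which the "1" duty cycle completes


def simulate(data: bytes):
    """Return (waveform, interrupts): pin level per tick and byte-done count."""
    waveform = []
    interrupts = 0
    for byte in data:
        for shift in range(7, -1, -1):          # assumption: MSB first
            bit = (byte >> shift) & 1
            level = 0
            for tick in range(PERIOD):
                if tick == 0:
                    level = 1                   # event 1: start of period, pin high
                if tick == ZERO_DONE and bit == 0:
                    level = 0                   # event 2: pull low only for a 0 bit
                if tick == ONE_DONE:
                    level = 0                   # event 3: pull low unconditionally
                waveform.append(level)
        interrupts += 1                         # event 4: byte done, load the next one
    return waveform, interrupts


if __name__ == "__main__":
    wave, count = simulate(b"\x80")
    print(sum(wave[:PERIOD]), "high ticks for the first bit;", count, "interrupt")
```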
But in reality, I couldn’t set up the next byte after the previous one finished, because the time it takes to run the setup logic would leave a gap in the signal between the two bytes. That delay would totally ruin the animation being sent to Neopixels and the colors would be all wrong (in the best case).
In fact, what I needed was a double buffer. A double buffer would allow me to set up the next byte while the current byte is being transmitted. I implemented that functionality by adding another SCT state. I set a GPIO pin either high or low to indicate which buffer the SCT should use to generate the signal, and the interrupt routine that runs after each byte is sent loads the next byte into the other buffer.
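A minimal software model of that double buffer might look like the following. The class and method names are hypothetical, and the model runs sequentially, whereas on the real hardware the refill happens while the other slot is being clocked out.

```python
class DoubleBuffer:
    """Two one-byte slots: one being transmitted, one being refilled."""

    def __init__(self, data: bytes):
        self._pending = iter(data)
        self.active = 0                        # stands in for the GPIO select pin
        # Preload both slots before transmission starts.
        self.slots = [next(self._pending, None), next(self._pending, None)]

    def byte_done(self) -> None:
        """Stand-in for the interrupt routine that runs after each byte."""
        finished = self.active
        self.active ^= 1                       # signal generation moves to the other slot
        self.slots[finished] = next(self._pending, None)   # refill the idle slot


def transmit(data: bytes) -> list[int]:
    """Drain the buffer in transmission order; None marks the end of data."""
    buf = DoubleBuffer(data)
    sent = []
    while buf.slots[buf.active] is not None:
        sent.append(buf.slots[buf.active])
        buf.byte_done()
    return sent


if __name__ == "__main__":
    print(transmit(bytes([1, 2, 3, 4, 5])))
```

The point of the design is that the slot being refilled is never the one driving the pin, so the setup time no longer shows up as a gap in the signal.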
I ended up finishing the single strand implementation in a few hours, but wanted to take the system to the extreme. Tessel has three PWM pins connected to its GPIO port, and I wanted to enable all three of them to output animation data in parallel, effectively tripling the possible frame rate.
One frustrating weekend, I attempted to add four more events for the two other pins (each requires two events for the double buffer). Unfortunately, there just wasn’t enough time in the interrupt routine to service three different buffers at a time.
1.25us/(1/180MHz) = 225 clock cycles to update three buffers
I found that I could update two PWM channels fairly well, but three channels took too long to update. I think my implementation could have been cleaner if I didn’t use so many structs within structs (thanks OOP).
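The same budget worked out in integer arithmetic, as a quick sketch. Splitting the cycles evenly across channels is an illustrative assumption, not a measurement of the actual interrupt routine.

```python
CLOCK_HZ = 180_000_000    # 180 MHz core clock
BIT_PERIOD_NS = 1250      # 1.25 microseconds per bit


def cycles_per_bit(clock_hz: int, period_ns: int) -> int:
    """Clock cycles that elapse during one bit period."""
    return clock_hz * period_ns // 1_000_000_000


def cycles_per_channel(channels: int) -> int:
    """Assumption: the budget is shared evenly between channels."""
    return cycles_per_bit(CLOCK_HZ, BIT_PERIOD_NS) // channels


if __name__ == "__main__":
    for n in (1, 2, 3):
        print(n, "channel(s):", cycles_per_channel(n), "cycles each")
```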
I ran out of time before I could try implementing the more complicated, but ever so elegant and efficient, DMA transfer. DMA effectively allows for the movement of data over physical pipes without requiring any overhead from the processor. You just tell the microcontroller the destination and source of data transfer and it all happens automagically. I wish DMA were taught in more undergraduate computer architecture classes; it’s a really cool technology.
Using DMA would allow us to use all three PWM channels, because the hardware would load the next byte automatically instead of relying on an interrupt routine that takes all too long.
If you’re interested in contributing, this would be a super fun way to get started! As someone still getting familiar with writing firmware myself, I learned a ton about how timing, double buffers, DMA, and PWM work on a much deeper level and was able to visually see my achievement with an awesome light show.