Blast Motion – Advanced Logging to Help Advanced Athletes
Blast Motion is one of those companies that’s so cool you can’t help but brag that they are your customer. I mean, these guys make sensors and apps that analyze athletes’ swings, hits, jumps, and motion. They are the official bat sensor technology for Major League Baseball!
And they use Bugfender for logging from their iOS and Android Apps.
I was able to chat with Steve Wehba, the System Architect for the Blast Motion apps, about how they use logging during development, testing, and production and how Bugfender has helped them.
Lots of Logging for Lots of Devices
First, it’s important to set some context. Blast Motion makes small devices that you attach to athletic equipment (like baseball bats or golf clubs) or even to athletes themselves. These devices gather a lot of data from highly calibrated sensors and then send that data to a connected phone or tablet, typically over Bluetooth Low Energy (BLE), for further processing and display. The combination of the sensor data and the processing done in the mobile apps gives athletes deep insights into their performance. You can check out a short video on how this works for baseball at https://blastmotion.com/products/baseball/.
If you’re like me, this setup gives you two thoughts. 1) That sounds like a really fun product to build. 2) That’s a lot of different components, each with its own special way of causing hard-to-debug problems. Steve confirmed that making everything work reliably in tough situations (like using BLE in a stadium full of people with smartphones) has been a lot of work.
To help with this, Blast Motion has created a cross-platform framework of code shared between their mobile apps. The framework handles common things, like BLE connections and some of the low-level motion analysis, as well as logging. For logging, the framework automatically captures the usual metadata beyond what the developer explicitly logs, like method name and line number. It also captures Blast Motion specific data, such as information about the user (who can be either the player or a coach).
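To make the idea of automatic log metadata concrete, here is a minimal sketch in Python. It is not Blast Motion's framework (which the article does not show); the logger name, the `user_role` field, and the `connect_sensor` function are all hypothetical. It only illustrates how a shared logging layer can attach the calling function, line number, and user context to every message without the developer writing them out.

```python
import io
import logging

# Hypothetical format: funcName and lineno are filled in by the logging module;
# user_role is extra context supplied by the adapter created below.
LOG_FORMAT = "%(levelname)s %(funcName)s:%(lineno)d [user=%(user_role)s] %(message)s"


def make_logger(stream, user_role):
    """Return a logger that stamps every record with caller info and a user role."""
    base = logging.getLogger("sensor_app_sketch")  # hypothetical logger name
    base.handlers.clear()
    base.propagate = False
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter(LOG_FORMAT))
    base.addHandler(handler)
    base.setLevel(logging.DEBUG)
    # LoggerAdapter merges the extra dict into every record it emits.
    return logging.LoggerAdapter(base, {"user_role": user_role})


def connect_sensor(log):
    # The developer writes only the message; the rest is added automatically.
    log.info("connecting over BLE")


buffer = io.StringIO()
connect_sensor(make_logger(buffer, "coach"))
output = buffer.getvalue()
```

The developer's call site stays a one-liner, while the shared layer decides which context travels with each message.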
They have been disciplined about adding a large amount of logging at the correct log levels, allowing them to capture only errors for normal production usage or detailed debugging if the situation calls for it (configured at runtime without pushing new builds). Remember, also, that the sensor doesn’t have an independent network connection and must, instead, rely on the mobile apps to perform logging.
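A runtime-configurable log level can be sketched as follows. This assumes the requested level arrives as a plain string from some remote configuration; how Blast Motion actually delivers that setting is not described, and the function and logger names here are hypothetical. Unknown values fall back to errors only, which matches the production default described above.

```python
import logging

# Levels that a remote setting is allowed to select (an assumption for this sketch).
ALLOWED_LEVELS = {
    "ERROR": logging.ERROR,
    "WARNING": logging.WARNING,
    "INFO": logging.INFO,
    "DEBUG": logging.DEBUG,
}


def apply_log_level(logger, requested, default=logging.ERROR):
    """Set the logger level from a runtime setting, falling back to errors only."""
    level = ALLOWED_LEVELS.get(str(requested).upper(), default)
    logger.setLevel(level)
    return level


log = logging.getLogger("runtime_level_sketch")  # hypothetical logger name

# Normal production usage: only errors are captured.
apply_log_level(log, "error")
production_debug_enabled = log.isEnabledFor(logging.DEBUG)

# Support asks for detailed logs: the level changes without a new build.
apply_log_level(log, "debug")
support_debug_enabled = log.isEnabledFor(logging.DEBUG)
```

Because the debug statements are already in the shipped build, flipping the level is enough to start collecting them.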
The Blast sensor is actually quite interesting (beyond the inherent interestingness of the large number of highly calibrated sensors packed into it). It can operate without a mobile device for a while, capturing and storing data, and then download that data to a mobile device once it’s connected. One use case for this is skating or snowboarding, where the athlete makes a run with just the sensor and connects to a mobile device afterward for analysis. The Blast sensor can also use an activity-specific mode to capture only the data around interesting events, such as the several seconds before and after it detects a bat hitting a ball.
This all means that Blast Motion has had to be disciplined and organized about what logging they capture, when they capture it, and how they collect and forward that data. They need to get the information they require while respecting the bandwidth and storage limits of their users’ devices.
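Capturing data only around interesting events is commonly done with a rolling buffer. The sketch below is an illustration of that general technique, not the sensor's firmware: the function name, the sample values, and the event test are all hypothetical. It keeps the last `before` samples at all times and, once an event is detected, records the event plus the next `after` samples.

```python
from collections import deque


def capture_around_events(samples, is_event, before, after):
    """Return windows of samples surrounding each detected event."""
    history = deque(maxlen=before)  # rolling buffer of the most recent samples
    windows = []
    current = None   # window being recorded, or None when idle
    remaining = 0    # samples still to record in the current window
    for sample in samples:
        if is_event(sample):
            if current is None:
                current = list(history)  # start with the pre-event samples
            remaining = after + 1        # the event itself plus `after` more
        if current is not None:
            current.append(sample)
            remaining -= 1
            if remaining == 0:
                windows.append(current)
                current = None
        history.append(sample)
    if current is not None:
        windows.append(current)  # stream ended while a window was still open
    return windows


# Ten numbered samples with a single "impact" at sample 4.
windows = capture_around_events(range(10), lambda s: s == 4, before=2, after=3)
```

Everything outside the windows is discarded as it ages out of the buffer, which is what keeps storage and transfer costs low.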
Logs for Customer Support – Not Just Developers
A striking aspect of how Blast Motion uses Bugfender is that their customer support also has access to the logs and regularly reviews them. Customer support also directs customers to enable additional logging in the app when the situation calls for it. This access to the logs helps support solve more problems more quickly and, if escalation to development is necessary, provide more complete information to ease debugging.
This arrangement was made possible by the mobile-focused, easy-to-use interface that Bugfender provides. Before using Bugfender, Blast Motion was putting their mobile app logs into the centralized logging system that also handles their backend logs. That system is much more complex and has to handle the large log volume from the backend. The added complexity made it infeasible for customer support to use (and irritating for developers and QA). But once they switched to Bugfender for the mobile app logging, opening up access for support was suddenly simple.
Splitting Logs by Development Phase
One of the most valuable insights that Steve shared with me, and one of the ways they make giving broader access to the logs workable, is how they manage logging for the different phases of development and make finding logs for relevant devices easier.
For each of their iOS or Android apps available to users, they actually create 3 versions of the app in Bugfender: one each for development, test, and production. This arrangement makes it simple for users, depending on their role, to see the logs that are relevant to them. The developers can add the messy, detailed debug logging that they need during development without causing confusion and false positives in the production logs accessible by support. Similarly, QA can focus solely on the apps that are currently in test.
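One way to wire this up is to pick the logging app key from the build environment. The sketch below is an assumption about how such a split could be configured, not a description of Blast Motion's build setup; the key values are placeholders and the function name is hypothetical.

```python
# Placeholder keys: one logging "app" per development phase.
APP_KEYS = {
    "development": "DEV_APP_KEY_PLACEHOLDER",
    "test": "TEST_APP_KEY_PLACEHOLDER",
    "production": "PROD_APP_KEY_PLACEHOLDER",
}


def app_key_for(environment):
    """Return the logging app key for a build environment, failing loudly if unknown."""
    try:
        return APP_KEYS[environment]
    except KeyError:
        raise ValueError(f"unknown environment: {environment!r}") from None


selected = app_key_for("test")
```

Failing on an unknown environment, rather than defaulting to production, keeps stray debug logs out of the view that support relies on.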
They are also very disciplined in how they name development and test devices, making it quick and easy for everyone to find the logs for the devices they are testing or developing with.
I love these solutions because of how simple they are. No queries or configuration. Just navigate to the app that corresponds to the work you are doing and find the device that you need to view.
Debugging Cloud Sync
Steve also shared one particularly hard bug that Bugfender helped them track down. They have a custom cloud sync component (both on the mobile and backend side) that lets them share athletic performance analysis among multiple devices (imagine an athlete capturing data on a phone but doing detailed review on an iPad later or giving access to a coach). At some point, they started having trouble with data not syncing.
It was – as is the case with all terrible bugs – intermittent and hard to reproduce. They looked through the Bugfender logs and became even more confused:
- The apps were correctly initiating sync to the server
- Connections were being made to the server (so it was not a basic network problem)
- But the servers were returning 400 HTTP error codes – indicating that it was not a server problem (which would be a 500 error), but a client issue
The next step was standard procedure: they ran the app under the debugger and set breakpoints where the app attempted to send data to the server so they could step through the code. And, of course, everything worked correctly. They could reproduce the error when running without the debugger, but never under it.
At this point they cranked the logging up to the highest level and managed to capture a failure with full logs. As they pored over the detailed client and server logs they started to notice a very minor skew in the timestamps of the logs – the client was always ahead by a tiny amount.
That was the key to understanding the bug. It turns out that the server, in an attempt to protect itself from misconfigured or misbehaving clients, would reject any data with a timestamp in the future. So when the mobile devices had a clock ahead of the server, it was possible that data would be sent with a timestamp that, from the point-of-view of the server, was invalid because it was in the future. The clock skew was small enough that, when the app was paused at a breakpoint in the debugger, enough time would pass on the server for data it received after the app was resumed to appear valid.
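The interaction between the clock skew and the debugger can be reproduced in a few lines. This is a sketch under stated assumptions: the 300 ms skew, the 5 second pause, and the function name are made up for illustration, and the article does not say how the fix was implemented. Allowing a small tolerance window is one common approach, shown here only as a possibility.

```python
from datetime import datetime, timedelta, timezone


def is_acceptable(record_time, server_now, tolerance=timedelta(0)):
    """Server-side check: reject records stamped later than the server's clock."""
    return record_time <= server_now + tolerance


server_now = datetime(2020, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
client_skew = timedelta(milliseconds=300)  # hypothetical: client clock runs ahead

# The client stamps the record using its own, slightly fast clock.
record_time = server_now + client_skew

# Sent immediately: the timestamp is in the server's future, so it is rejected.
rejected_live = not is_acceptable(record_time, server_now)

# Paused at a breakpoint for 5 seconds: the server clock overtakes the timestamp.
accepted_after_pause = is_acceptable(record_time, server_now + timedelta(seconds=5))

# One possible mitigation (an assumption): tolerate a small amount of skew.
accepted_with_tolerance = is_acceptable(
    record_time, server_now, tolerance=timedelta(seconds=2)
)
```

The same record fails or passes depending only on how much time elapses before it reaches the server, which is why stepping through the code hid the bug.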
The fix was easy once they understood the problem and reminded themselves that distributed clocks are incredibly difficult (the challenges caused by distributed clocks are one reason why Google’s Spanner distributed database relies on atomic clocks and GPS receivers in its datacenters!).
This, to me, is a perfect example of why logging can be so valuable: it allows you to test your software under exactly the same conditions as when it runs normally while still getting the data you need to understand the problem. Debuggers are great, but they necessarily change how the software runs. And in this case, where clocks were the culprit, that difference made the bug unreproducible. You see the same kind of problem with other time sensitive code or code where ordering matters (which can be the case with threading or asynchronous code).
Blast Motion is a great example of using Bugfender with a large, complex set of apps in a full production environment. It was inspiring for us to hear how the mobile-focused feature set of Bugfender made it the right tool in their toolbox. I’d like to thank Steve for sharing his stories and insights with us. It’s always great to talk with a talented developer who has found practical solutions to hard problems.
About Blast Motion
Based in Carlsbad, California, Blast Motion is defining the future of wearable motion capture technology. By combining the industry’s most complete performance improvement solution and real-time metrics analysis with auto-curated video highlights, Blast Motion has created a contextually rich user experience that enhances the way people capture, analyze, and improve their game. For additional information on Blast Motion, please visit: blastmotion.com.