Monitoring on the Move: A Developers' Guide to Mobile App Stability and Scalability

Explore how we raised our mobile app's stability from 68% to 99%, along with hands-on tips developers can use to replicate the effort.

App stability and scalability are two crucial app performance metrics that determine the overall quality and usability of the app.

While an app might be fairly stable initially, constant updates, data overload, and sudden scaling bring a slew of changes that derail its stability.

Here, we discuss various tips for mobile app developers to create scalable and stable apps. We also share the efforts we invested in one of our apps to make it 99% stable and highly scalable.

Developer Tip #1 - Identify and Define Key Events to Track and Record Them Separately

Diversify your analytics by tracking and recording every key event, including each type of conversion-related event, such as:

  • App browsing
  • Wishlist tracking
  • Check out
  • Cart abandonment
  • Visitor activity

Create a separate event for each step in the customer journey, such as conversion, and track each one individually.

This way, upper management gets highly granular business intelligence in the form of actionable insights, such as:

  • Increase or decrease in conversions after a new feature release
  • Conversion trends for different campaigns
  • Sales and impressions generated for similar products with different imagery or promotional campaigns
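
As a rough illustration of keeping each journey step as its own event, here is a minimal sketch in plain Python. The event names, the `TrackedEvent` fields, and the `count_by_version` helper are all hypothetical and not taken from our codebase; in a real app the events would be sent through your analytics SDK rather than kept in a list.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum


class JourneyEvent(str, Enum):
    """One distinct event name per customer-journey step (hypothetical names)."""
    APP_BROWSE = "app_browse"
    WISHLIST_ADD = "wishlist_add"
    CHECKOUT = "checkout"
    CART_ABANDON = "cart_abandon"


@dataclass(frozen=True)
class TrackedEvent:
    name: JourneyEvent
    app_version: str  # release that emitted the event
    campaign: str     # campaign that brought the user in


def count_by_version(events, name):
    """Count occurrences of one event type per app release."""
    return Counter(e.app_version for e in events if e.name == name)


# Made-up sample data, for illustration only.
events = [
    TrackedEvent(JourneyEvent.CHECKOUT, "1.0", "spring_sale"),
    TrackedEvent(JourneyEvent.CART_ABANDON, "1.0", "spring_sale"),
    TrackedEvent(JourneyEvent.CHECKOUT, "1.1", "spring_sale"),
    TrackedEvent(JourneyEvent.CHECKOUT, "1.1", "newsletter"),
]

checkouts = count_by_version(events, JourneyEvent.CHECKOUT)
print(dict(checkouts))  # {'1.0': 1, '1.1': 2}
```

Because every step has its own name, comparing conversions before and after a release becomes a simple filter and count instead of a search through one catch-all event.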

Takeaway:

Creating different events for all the crucial steps in the customer journey facilitates event tracking and app monitoring at a granular level. It makes issue tagging easier and faster.

Developer Tip #2 - Proactive Root Cause Analysis (RCA)

We started working on the stability and scalability of one of our mobile apps a few years back, and a few of our priorities were:

  • Proactive event tracking and issue resolution
  • Constant on-the-move monitoring
  • Repeating the cycle for a glitch-free UI/UX

Our list of challenges included the following:

  • Gathering information about events that led to issues, such as crashes, missed orders, or alerts, from the end-users
  • Making our mobile app more stable and gaining visibility into runtime crashes in production
  • Making a large number of API calls through a pull-based mechanism

Now, these three challenges spawned the following three issues:

  • Inability to locate and identify the root cause of an issue, such as a wrong entry, an existing bug, or a process fault
  • Inability to track and monitor app stability at runtime from a developer’s perspective
  • A massive number of API calls that persisted regardless of how efficient each query was

Now, take a look at the following image for a basic overview of the implications these issues caused:

[Image: overview of the implications of these issues. Source: Shipsy]

So, what did we do?

We took a two-pronged approach with Firebase: Analytics and Crashlytics.

Let us explore them one by one.

Analytics - What, How, and Why?

We created an event for every user action and recorded each one, including:

  • Button clicks
  • Screen navigation
  • API calls and status
  • Functional events

The entire process is shown below:

[Image: the event recording process. Source: Shipsy]

Next, we gathered the event data (analytics data) in a data store for in-depth analysis. The store recorded both past and intra-day events, organized by date.
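
To make the date-wise organization concrete, here is a minimal in-memory sketch. The `DateWiseEventStore` class, its method names, the event names, and the sample timestamps are all hypothetical; a production setup would rely on the analytics backend's own storage rather than a Python dictionary.

```python
from collections import defaultdict
from datetime import datetime, timezone


class DateWiseEventStore:
    """In-memory stand-in for an analytics store bucketed by calendar date."""

    def __init__(self):
        self._by_date = defaultdict(list)

    def record(self, name, timestamp, **params):
        # Bucket each event under the UTC date on which it happened.
        day = timestamp.astimezone(timezone.utc).date()
        self._by_date[day].append({"name": name, "at": timestamp, **params})

    def events_on(self, day):
        """Return every event recorded for one date (empty list if none)."""
        return list(self._by_date.get(day, []))

    def intra_day(self, now):
        """Return the events recorded so far on the current date."""
        return self.events_on(now.astimezone(timezone.utc).date())


store = DateWiseEventStore()
store.record("button_click", datetime(2022, 5, 29, 18, 0, tzinfo=timezone.utc),
             button="new_order")
store.record("api_call", datetime(2022, 5, 30, 9, 0, tzinfo=timezone.utc),
             endpoint="/orders", status=200)
store.record("screen_view", datetime(2022, 5, 30, 9, 5, tzinfo=timezone.utc),
             screen="order_list")

now = datetime(2022, 5, 30, 11, 0, tzinfo=timezone.utc)
intra_day_names = [e["name"] for e in store.intra_day(now)]
print(intra_day_names)  # ['api_call', 'screen_view']
```

Keeping one bucket per date makes it cheap to separate today's events from historical ones when building reports.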

Finally, we analyzed and reported on these data sets to generate actionable insights into overall user activity and app performance. We also created multiple dashboards to monitor the status of core functionalities.

Gathering such analytics gave us a granular view of user activity and experience. It allowed us to:

  • Analyze user behavior, such as the average time spent on a specific tab or page
  • Identify the features or updates that increased crashes or app issues
  • Track and monitor the user journey to find the most-used app features
  • Optimize and improve the app UI by reprioritizing the app screen order
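
As one example of this kind of analysis, the sketch below estimates the average time spent per screen from screen navigation events. It assumes a hypothetical input shape, `(user_id, screen, seconds_since_session_start)`, and treats the gap until a user's next screen view as the time spent on the previous screen; this is an illustrative simplification, not the exact method we used.

```python
from collections import defaultdict


def average_time_per_screen(screen_views):
    """Average seconds per screen from time-ordered screen-view events.

    screen_views: iterable of (user_id, screen, timestamp_seconds).
    Time on a screen is the gap until the same user's next screen view.
    Each user's final view has no end time, so it is skipped.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    last_seen = {}  # user_id -> (screen, timestamp)
    for user_id, screen, ts in screen_views:
        if user_id in last_seen:
            prev_screen, prev_ts = last_seen[user_id]
            totals[prev_screen] += ts - prev_ts
            counts[prev_screen] += 1
        last_seen[user_id] = (screen, ts)
    return {screen: totals[screen] / counts[screen] for screen in totals}


# Made-up sample data, for illustration only.
views = [
    ("u1", "home", 0), ("u1", "orders", 10), ("u1", "home", 40),
    ("u2", "home", 0), ("u2", "orders", 20),
]
averages = average_time_per_screen(views)
print(averages)  # {'home': 15.0, 'orders': 30.0}
```

Screens with a high average and heavy traffic are natural candidates to move earlier in the screen order.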

So, we leveraged the following 4-step process to improve the UI via analytics:

[Image: the 4-step process for improving the UI via analytics. Source: Shipsy]

Crashlytics: Proactive tracking, monitoring, and resolution of issues

A few years back, one of our worst-case scenarios looked like this:

  • 5% of 6,800 order numbers (340) were lost
  • We were getting 35,000 API calls per minute!
  • Every end user was polling for updates and new orders via a “PULL” mechanism, which multiplied the API hits

To overcome these challenges, we ensured that every time our mobile app crashed or threw an exception, the stack trace was recorded along with the exception.

So, our crash reports now included:

  • Crash versions
  • Names
  • Keys
  • Logs
  • Event data

This way, we no longer needed app users to report the exact sequence of events that led to a crash or an exception.
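
The idea of a crash report that carries its own context can be sketched as follows. The `CrashReporter` class below is hypothetical and only mimics the kind of data listed above (version, name, keys, logs, stack trace); it is not the Crashlytics API, and the key names and log messages are invented for illustration.

```python
import traceback
from collections import deque


class CrashReporter:
    """Hypothetical reporter that attaches context to every recorded exception."""

    def __init__(self, app_version, max_logs=50):
        self.app_version = app_version
        self.keys = {}                      # custom key/value state
        self.logs = deque(maxlen=max_logs)  # most recent breadcrumbs only
        self.reports = []

    def set_key(self, key, value):
        self.keys[key] = value

    def log(self, message):
        self.logs.append(message)

    def record_exception(self, exc):
        # Snapshot the context at the moment of failure.
        self.reports.append({
            "app_version": self.app_version,
            "name": type(exc).__name__,
            "stack_trace": "".join(
                traceback.format_exception(type(exc), exc, exc.__traceback__)),
            "keys": dict(self.keys),
            "logs": list(self.logs),
        })


reporter = CrashReporter(app_version="2.3.1")
reporter.set_key("screen", "delivery_update")
reporter.log("opened delivery record")
reporter.log("tapped New Order notification")
try:
    {}["order_id"]  # stand-in for the call that fails
except KeyError as exc:
    reporter.record_exception(exc)

report = reporter.reports[0]
print(report["name"], "|", report["logs"][-1])
# KeyError | tapped New Order notification
```

The breadcrumb logs answer the question a user rarely can: what happened immediately before the failure.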

Also, every monitored crash was followed by a crash report and a crash fix, and we repeated this cycle proactively.

Instead of relying on the pull mechanism, we started “pushing down” the latest information about events as alerts and notifications.

This significantly reduced the number of API calls and helped us scale efficiently and sustainably.
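
A back-of-the-envelope model shows why moving from pull to push cuts API traffic. The two functions and every number below are illustrative assumptions, not measurements from our system: with pull, load grows with the number of clients and their polling rate; with push, it grows with the number of actual updates.

```python
def pull_calls_per_minute(clients, polls_per_minute):
    """Every client polls on a timer, whether or not anything changed."""
    return clients * polls_per_minute


def push_calls_per_minute(updates_per_minute, fetches_per_update=1):
    """Clients call the API only after a pushed notification reports a change."""
    return updates_per_minute * fetches_per_update


# Illustrative numbers only; they are not figures from our system.
pull = pull_calls_per_minute(clients=1000, polls_per_minute=6)  # one poll every 10 s
push = push_calls_per_minute(updates_per_minute=200)
print(pull, push)  # 6000 200
```

In this model, idle clients cost nothing under push, which is where most of the savings come from.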

This proactive RCA and in-depth event analytics increased our app stability from 68% to 99%. It also gave us insights that helped us resolve issues in our mobile app before the client could notice them.

Takeaway:

Proactive Root Cause Analysis can be of real help when it comes to building highly scalable and stable mobile apps. Using actionable insights from the event records can improve the overall app usability and performance.

Let Your App Tell You What Is Wrong

While technical aptitude varies from one app user to the next, a user's perspective is generally very different from a developer's.

For example, an app user would say: “I pressed the New Order button and the app didn’t work properly. It did nothing.”

A developer, on the other hand, is looking for something like: “I was in the middle of updating a delivery record when I got the New Order message. I tapped on it and the app didn't do anything. I was not able to record the delivery and had to relaunch the app.”

Situations like these can be endless.

Therefore, it is important that every crash event detail is fetched from the most trustworthy source - your app.

Developer Tip #3 - Track, Monitor, Record, and Analyze Data

Following a consistent and robust app data recording practice pays off in various ways. You always, always have the right event data for debugging, app improvements, and issue resolution.

This makes it easier for the developers to locate and identify the exact cause of the crash event and resolve it properly.

Takeaway:

Making your mobile apps “tell you” what went wrong allows you to make them more stable and resilient.

Bridging the Gap Between Support and Production

Earlier, our Support and Production teams used separate dashboard systems.

This led to disjointed information gathering and redundant event data collection.

Each support team member used different wording to record crash information, so the production team had to call end users individually to gather the details.

We overcame this challenge by bridging the gap between our Support and Production teams.

Now, every event has an event ID that can be tracked, monitored, and referenced in the future.
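
A shared event ID can be as simple as a generated identifier attached to a structured record that both teams look up. The sketch below is a minimal illustration; `EventRecord`, `EventRegistry`, and the field names are hypothetical, and a real system would persist records in a shared database rather than in memory.

```python
import uuid
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class EventRecord:
    event_type: str
    details: dict
    # Generated once, then quoted in support tickets and bug reports.
    event_id: str = field(default_factory=lambda: uuid.uuid4().hex)
    created_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc))


class EventRegistry:
    """Shared lookup that support and production could both query by ID."""

    def __init__(self):
        self._records = {}

    def register(self, event_type, **details):
        record = EventRecord(event_type, details)
        self._records[record.event_id] = record
        return record.event_id

    def lookup(self, event_id):
        """Return the record for an ID, or None if it is unknown."""
        return self._records.get(event_id)


registry = EventRegistry()
ticket_ref = registry.register("crash", screen="delivery_update",
                               app_version="2.3.1")
record = registry.lookup(ticket_ref)
print(record.event_type, record.details["screen"])  # crash delivery_update
```

With the ID in hand, support staff pass along a reference instead of a free-text description, and the production team reads the same structured record.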

Developer Tip #4 - Reduce the Time Spent in Gathering App Event Data

By standardizing event reporting around a shared event ID, you can:

  • Reduce the overall time spent gathering event data
  • Skip the agony of processing inconsistent analysis from support staff
  • Track each event
  • Record and monitor it for future reference

Takeaway:

Reduce the number of steps for event data gathering and standardize the event reporting, monitoring, and tracking process. This makes the debugging process more efficient and less redundant.

App Performance Monitoring: An Ongoing Process

Creating a stable, robust, and scalable mobile app is a daunting task that requires consistent effort. This is because the scope, functionality, and utility of an enterprise mobile app change over time as the user base grows and new user categories emerge.

We, at Shipsy, believe in the consistent improvement of our products, codebase, and underlying tech to ensure that our products stay relevant, high-performing, and razor-sharp.

To become a part of our developer community, please visit the Careers Page.

Acknowledgments and Contributions

As an effort towards consistent learning and skill development, we have regular “Tech-A-Break” sessions at Shipsy where team members exchange notes on specific ideas and topics. This write-up stems from a recent Tech-A-Break session on Mobile App stability and scalability, helmed by Pankaj Yadav.

Technical Contributions: Sahil Arora and Kalpesh Kundanani.