At most telco operators, end-to-end tests are the dream. Given that each new network update or service offering potentially requires dozens of testcases verifying conformance, acceptance, functionality, and performance, true end-to-end tests that validate the entire functioning of the network from the end-user’s perspective are daunting and often difficult to accomplish (especially if you’re testing by hand). As such, your average network tester might balk at the idea that end-to-end testing isn’t enough to ensure high network quality, considering that going end-to-end is no mean feat in itself.
And yet, the endpoints sometimes don’t tell the full story. Often, the most telling details are happening on the signalling level. If protocols aren’t being implemented correctly and operations aren’t happening in the right order, the results at the endpoints aren’t necessarily going to tell you so—especially if things otherwise seem to be working normally. Still, the fact that there might be some information hidden in signal traces hasn’t automatically made it a top priority for telco testers, no doubt because it seems difficult to incorporate into workflows without ceding even more time to testing. This is understandable, but it might be shortsighted. Here’s why:
What Is MAP Signalling?
To begin with, let’s talk a little bit about MAP signalling. MAP (or Mobile Application Part) is an application layer protocol used in cellular telephony to manage call handling, SMS messages, PDP contexts, mobility services, and a host of other critical functionality. Naturally, it’s not the only protocol for which a trace capture or trace analysis might be appropriate, but because SRVCC and SMS are such key parts of modern network functionality, it seems like an appropriate starting point. MAP connects voice calls through circuit-switched channels, while data and signalling information are both in the packet-switched domain, meaning that the transfer of signalling information isn’t as restricted as it was in pre-packet-switched networks. As such, MAP signalling control information is sent before setup, during the call, and after teardown, and includes much more than just the caller’s telephone number and subscriber ID.
By using available tools to gain access to this trace data, testers can suddenly access the inner workings of the system under test (SUT). They can learn which operations and procedures (e.g. VLR location update, SIP messaging, NP inquiry) the network is executing while any given testcase is running. This might seem relevant only to failed testcases, but in point of fact the difference in trace data between failed testcases and related testcases that may have just barely passed can tell testers a lot about what’s going on in their networks and how it’s affecting quality and customer experience.
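To make the comparison between a failed testcase and a barely passing one more concrete, here is a minimal Python sketch. It assumes each trace has already been reduced to an ordered list of operation names; the two lists and the `diff_operations` helper are invented for illustration and are not taken from any real capture or tool.

```python
from difflib import SequenceMatcher


def diff_operations(passed, failed):
    """Return (tag, passed_slice, failed_slice) for each region where the
    two operation sequences differ. Tags come from difflib: 'delete',
    'insert' or 'replace'."""
    matcher = SequenceMatcher(a=passed, b=failed, autojunk=False)
    return [
        (tag, passed[i1:i2], failed[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


# Hypothetical operation sequences extracted from two related testcases.
passed_trace = [
    "sendAuthenticationInfo",
    "updateLocation",
    "insertSubscriberData",
    "sendRoutingInfoForSM",
    "mt-forwardSM",
]
failed_trace = [
    "sendAuthenticationInfo",
    "updateLocation",
    "sendRoutingInfoForSM",
    "mt-forwardSM",
]

for tag, in_passed, in_failed in diff_operations(passed_trace, failed_trace):
    print(tag, "| passed run:", in_passed, "| failed run:", in_failed)
```

In this made-up example the only difference is an operation that appears in the passing run and is absent from the failing one, which is exactly the kind of detail the endpoints alone would not reveal.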
Uncovering the Root Cause
As we said above, signalling traces for MAP (or other protocols) go “beyond end-to-end” to give testers insights into testcases that simply wouldn’t have been possible otherwise. The value here becomes obvious when you stop thinking of test operations as a siloed process adjacent to other network activity and begin thinking of the process more holistically. What we’re getting at here is that while testing solutions that tout their speed above all else are all well and good, it’s generally not savvy to sacrifice test quality for faster turnaround times. Why? Because when a testcase fails and requires follow-up attention, you’re potentially drawing your bug fixes from the same pool of resources that handles testing in the first place. In other words, you may be speeding up testing, but you’re not necessarily speeding up time-to-market. Since the point of rapid end-to-end verification isn’t speeding up testing itself but, indeed, improving time-to-market and network quality, the integration of testing into other network activities is key.
How do you accomplish this integration? More importantly, how do you speed up not just verification itself but root cause identification for bugs? Well, it can start with reporting: more readable reports will help open up test results beyond the testing silo, and granular keyword-based test reports can give readers clear and useful clues about what, exactly, went wrong. From there, the more information you can provide for aftercare, the better. If, for instance, you’re able to look at a failed testcase, visualize the results on an in-depth level (pinpointing the exact moment of the failure, for example whether the phone failed to connect to the network in the first place or fell victim to a misfiring SRVCC handover), and then determine the activity on the protocol level, you can find root causes that much more quickly. Since this has the potential to radically speed up test aftercare, it can also radically speed up time-to-market without sacrificing network quality.
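As a rough illustration of pinpointing the moment of failure and then dropping down to the protocol level, the sketch below finds the first failed step in a keyword-based test report and pulls the trace messages captured around it. The `Step` and `Message` structures, the keyword names, the timestamps, and the one-second margin are all assumptions made up for this example; a real framework would supply its own report and trace formats.

```python
from dataclasses import dataclass


@dataclass
class Step:
    keyword: str   # name of the test keyword, as shown in the report
    start: float   # seconds since the start of the testcase
    end: float
    passed: bool


@dataclass
class Message:
    timestamp: float  # seconds since the start of the testcase
    protocol: str
    summary: str


def messages_around_first_failure(steps, messages, margin=1.0):
    """Return the first failed step and the trace messages captured
    within `margin` seconds of it, or (None, []) if nothing failed."""
    failed = next((s for s in steps if not s.passed), None)
    if failed is None:
        return None, []
    window = [
        m for m in messages
        if failed.start - margin <= m.timestamp <= failed.end + margin
    ]
    return failed, window


# Hypothetical report and trace for a single testcase.
steps = [
    Step("Attach To Network", 0.0, 2.0, True),
    Step("Start Voice Call", 2.0, 5.0, True),
    Step("Trigger SRVCC Handover", 5.0, 8.0, False),
]
messages = [
    Message(1.0, "MAP", "updateLocation"),
    Message(4.5, "SIP", "INVITE"),
    Message(6.2, "SIP", "BYE"),
    Message(9.5, "MAP", "purgeMS"),
]

failed, window = messages_around_first_failure(steps, messages)
print("First failure:", failed.keyword)
for m in window:
    print(f"  {m.timestamp:>5.1f}s  {m.protocol}  {m.summary}")
```

The point of the design is simply that the report and the trace share a timeline, so a reader outside the testing silo can go from "which keyword failed" to "what the network was doing at that moment" in one step.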
Automation Beyond End-to-end
Okay, we’ve seen how signalling trace analyses can add value for telecom testers by making root causes of faults easier to pinpoint and address. But that doesn’t actually solve our original problem: how do testers possibly find the time to add this to their service verification flows? Luckily, it’s not as difficult as it might seem. Step one is to ensure that your testing framework is actually equipped to capture and analyze traces. This means making signalling trace support a priority when choosing, for instance, an automation provider.
If, for instance, you were verifying HLR/HSS provisioning using an automation framework that boasted a) Wireshark integration and b) a robust library of signalling trace-based keywords, you could automatically test that any modification in settings was leading to the correct changes in functionality. Because this would be part of a larger set of automated test suites, you’d be able to capture these traces with the push of a few buttons (or the typing in of a few keywords, as the case may be). Thus, you would essentially include signalling trace analysis without adding any real time to your overall testing workflow, while the time spent on bug fixes would likely shorten as a result of the increase in granular, protocol-level information. Sure, there will be plenty of test failures that don’t require this added analysis in order to be resolved, but for those that do it’s worth incorporating trace capture into your automated testing.
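To give a feel for what a signalling trace-based keyword might do under the hood, here is a small Python sketch that checks whether the expected operations show up in a captured trace in the right order. It is not the API of any particular automation framework: the function names, the expected sequence, and the captured list are hypothetical, and the capture step itself (for example via Wireshark) is assumed to have happened already.

```python
def operations_in_order(trace, expected):
    """True if every expected operation appears in the trace in the
    given order. Unrelated messages may be interleaved between them."""
    remaining = iter(trace)
    # `op in remaining` advances the iterator until it finds op, so each
    # later operation is only searched for after the previous match.
    return all(op in remaining for op in expected)


def first_missing_operation(trace, expected):
    """Return the first expected operation that is missing or out of
    order, or None if the whole expected sequence was found."""
    remaining = iter(trace)
    for op in expected:
        if op not in remaining:
            return op
    return None


# Hypothetical operations captured while a provisioning testcase ran.
captured = ["sendAuthenticationInfo", "updateLocation", "insertSubscriberData"]

# Assumed expectation for this example: the location update is followed
# by the subscriber data being pushed out.
expected = ["updateLocation", "insertSubscriberData"]

print(operations_in_order(captured, expected))
print(first_missing_operation(captured, ["insertSubscriberData", "updateLocation"]))
```

A check like this runs in the same automated suite as the end-to-end verification, so it adds protocol-level detail to the report without adding a manual step.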