Design Description

As a systems engineer, your job is to explain a novel idea to other systems engineers, developers, testers, and possibly external customers. The information you share with these different audiences may vary, but all of them require an explanation of the same design. An idea is a strange thing: it may germinate differently in different minds depending on level of knowledge, preconceptions, similar experiences, or the general disposition of the person on the day. Thus, the idea must be developed and presented with the maximum level of clarity. The best way to achieve this is to explain the same concept thoroughly and from different angles:

A) Introduce the topic.  Start by describing the design at a high level. Stress any high level concepts such as applicability, benefits, prerequisites, general workings of design, and affected functions. Next, provide an architecture diagram showing the various functions in the system and identifying the affected functionality. Where applicable, show the control and data flows within your diagrams.

B) Provide a detailed description of the design. Divide the idea or design into atomic topics, order the topics to ensure a good flow for the presentation of the ideas, and then tackle the sub-topics one by one. For each sub-topic: 

B1) Describe the sub-topic in detail. For every functional box, describe the inputs, outputs, and operations. For every algorithm, provide a step-by-step description. For every message, describe the fields and the actions taken at the source and destination for each packet and field.

B2) Add pictures. Humans are visual beings. Besides the aforementioned architecture diagram, add figures to explain the detailed workings of your design. The diagrams may zoom in on parts of the architecture diagram to show the interactions between the various functions at a finer level of detail. Try to have at least one picture per slide if presenting, and an introductory section with good illustrations per main chapter if writing a design description document.

B3) Include flow charts for algorithms. A flow chart should accompany the step-by-step description of each algorithm.

B4) Add use cases. Use cases describe the various ways a product will be used. By this point, the functionality has been described textually and in pictures; use cases, however, dig into specific examples of how the product will work. Start with the main use cases, and then cover all corner cases.

B5) Add call flows. Call flows describe the exchange of messages between different functions within a product, or between different nodes of a system. Frequently, a use case can be fully described by a call flow.

Review your document repeatedly to ensure a good flow.

Finally, clean up the document of any typos or mismatched styles and fonts. You don’t want cosmetics to distract from your message.

Configuration

Configurability is one of the most crucial, and one of the most frequently mishandled, features of a product.

A good configuration enables crucial calibration and evolution of a product to meet evolving external conditions.

Systems engineers must provide a list of configurable variables with an explanation of usage, impact on the product, range of settable values, default value, and, within the range of settable values, the list of values to test. The test list has a multiplicative effect on unit testing, and potentially on system test. The rookie systems engineer may define too many configuration variables, and large ranges of values to test, only to find out later that only the default values have been tested for many of the configuration variables due to time constraints. The experienced systems engineer will define the minimum set of configuration variables and restrict the testable range space as much as possible. The engineer may highlight hard-coded variables that are candidates for future change so that the plumbing is created to enable adding configurability to these variables, if desired, in future product updates.
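Such a specification can be sketched as a simple record. This is only an illustrative sketch: the field names, and the example variable `retry_timer_s`, are hypothetical and not drawn from any particular product.

```python
from dataclasses import dataclass, field

@dataclass
class ConfigVariable:
    """One entry in the list of configurables a systems engineer provides."""
    name: str
    usage: str             # what the variable controls
    impact: str            # effect on product behavior
    settable_range: tuple  # (min, max) of allowed values
    default: int
    test_values: list = field(default_factory=list)  # values to actually test

# A hypothetical variable: the test list is kept to the boundaries plus the
# default, to contain the multiplicative effect on unit and system test.
retry_timer = ConfigVariable(
    name="retry_timer_s",
    usage="Delay before retransmitting an unacknowledged request",
    impact="Lower values react faster but increase load",
    settable_range=(1, 60),
    default=5,
    test_values=[1, 5, 60],
)
```

Keeping `test_values` to a minimal, deliberate subset of the settable range is what keeps the configuration space testable within schedule.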

The Magic of Math

Me: in the equation, t represents the arrival time of the first packet carrying the object…

Meeting attendee: wait, wait, t, x, y, that is too complicated…

And that is how some engineers, holding Master of Science or PhD degrees and with 10 to 15 years on the job, are completely disconnected from the simplest of math equations.

We work with interfaces, call flows, and requirements. We rinse and repeat until our skills expand in defining APIs, defining functional interactions, optimizing behaviors, and designing the simplest behavior that achieves efficiency. However, we sometimes embark on an optimization quest without first trying to identify the obvious mathematical optimum. Identifying the optimal achievable performance allows one to stop searching for improvements when none are to be had.

One example we had to deal with is extracting the optimal playback time for HTTP streaming objects delivered in constant-bit-rate bursts over a channel.

Assume that HTTP streaming objects are created at a periodic interval T1 and each carries T1 seconds' worth of video. The segments are of maximum size S. Assume, for simplicity, that all segments are of maximum size. Furthermore, assume that the packets are delivered using short bursts carrying at most (S/T1)*T2 bytes per burst every T2 seconds, where T2 < T1.

Assume that the segments are generated at an encoder once every T1 seconds. The segments are delivered to the delivery engine, which maps the arriving data onto the scheduling opportunities every T2 seconds. Since T1 and T2 are not aligned, the packets will exhibit variable delay until scheduled. The data is then delivered to an end client, which must decide the optimal playback time of the stream of segments based on the delivery characteristics of the first received segments.

One can show for the above stream that the receiver can schedule the playback of the first received segment at the arrival time of the first packet of the segment plus ceil(T1/T2)*T2, plus a slight margin. The playback of every subsequent segment can then be scheduled at T1 intervals, with segments guaranteed to be delivered before the playback point of these future segments.
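The rule above can be sketched in a few lines; the function and variable names are illustrative, not from any real client implementation.

```python
import math

def playback_start(first_packet_arrival, t1, t2, margin=0.0):
    """Earliest safe playback time for the first received segment.

    Each segment carries t1 seconds of media and is delivered in bursts
    every t2 seconds (t2 < t1), so the whole segment is guaranteed to
    arrive within ceil(t1/t2) burst intervals of its first packet.
    """
    if not 0 < t2 < t1:
        raise ValueError("requires 0 < T2 < T1")
    return first_packet_arrival + math.ceil(t1 / t2) * t2 + margin

def playback_time(segment_index, first_start, t1):
    """Subsequent segments play back at fixed t1 intervals."""
    return first_start + segment_index * t1
```

For example, with T1 = 2 s, T2 = 0.5 s, and the first packet arriving at t = 10 s, playback can start at 10 + ceil(2/0.5)*0.5 = 12 s, plus the chosen margin.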

Our software engineers had tried to use multiple ad-hoc formulas based on the first or last packet arrival of a segment and based on T1. These formulas were not optimal and, worse, sometimes failed depending on the actual setting of T2, which was a variable characteristic of the system.

A simple analysis was all it took; it would be a simple exercise for an engineering student. So why was it so complicated for experienced and highly educated engineers?

My advice is to keep your accumulated experience and knowledge close and your math toolbox closer.

A Call for An Open Approach

In my rookie days, I worked helter-skelter with a group of well-meaning engineers. I tried my best but hated criticism. I would work alone, preparing my design, consult a few times with my direct lead, then run a concept review followed by the actual design review. Once the design was reviewed by systems, it was reviewed by the software and test teams. The software and test teams used to be so far behind that they did not see the design until three months after systems had completed it. At that point, I was reluctant to make changes. In any case, I received comments as criticism of my work and sought to avoid them.

With more experience, I have tried to involve as many co-workers as possible in my design work. Feedback is a way of canvassing the knowledge in the team to improve the design. At the end of the day, the software team knows the simplest implementation, and the test team knows the pain points encountered in previous testing. Obtaining this knowledge, and being flexible in modifying the design to incorporate the comments, leads to the best design possible. It also improves the thinking and clarity of the designer and leads to more efficient designs in the future.

I have experienced negative feedback on my approach, but only from occasional contributors. These occasional commenters may believe that they are correcting a flawed design, while what they are really doing is taking part in a design team on a quest for the best feasible design. People who work with me love this approach, as they feel a sense of ownership of the design.

Finally, I would like to note that this should never become design by consensus. The designer should still spend a lot of time coming up with original approaches and vetting his or her ideas with colleagues until convinced of the best approach. The designer should have the strength of their conviction when discussing the design and push it through if negative feedback is unconvincing. The goal is always to push the envelope within the limits of the solution space.

Signs of Trouble

At one point or another, you will find that a team is struggling to deliver. The most frequent signs of trouble in my experience are:

  1. A rejection of every proposal as too complicated. The struggling team will reject every proposed enhancement or change as too complicated, and every small change will be associated with a disproportionate man-month estimate. In general, one must trust the estimates returned by the team specialized in each layer and task. However, estimates that don’t make sense are an indication that the corresponding team has lost its handle on the product.
  2. A reluctance to provide time estimates. Time estimates are always taken with a grain of salt, yet every engineer worth their salt will provide one. It may be optimistic, pessimistic, or on the money, but a confident engineer will give an estimate. A struggling team may refuse to give one even when it is understood that the estimate is a loose one.
  3. An overly protective lead. Bad habits, like good ones, require gentle handling to blossom. An overly protective lead may shield the team in bad times, leading to a loss of accountability and openness.
  4. A defensive attitude. An us against them mentality and a refusal to accept just criticism are hallmarks of struggling teams that are in denial. Defensiveness of this kind goes against the healthy ethos of information sharing and open discussion. It may be needed when interacting with other companies (although, even there, a more balanced approach works better), but it is poison to improvement when applied in-house.
  5. Turnover. When a team is struggling, the good and nimble players will scramble away first. Some good, stubborn apples may remain, but the core strength of the team will be weakened.
  6. Finger pointing. This is the ugliest red flag. I recall working on a project where the team could not deliver a firmware update and constantly had bugs in their system. The team finally identified one of the junior engineers and blamed him for checking in code without properly notifying the team members. The guy may have made a mistake, but he was most likely trying to compensate for the chronic failures around him. Ten years later, the scapegoat is leading a firmware team and his prior bosses have been laid off or have left the company.

A Moving Target

Prior to my current job, I worked for a competitor; let’s call it company E. I had started out helping the standards team at E, contributing ideas to our standards proposals. The standards groups, at the time, were in the midst of defining a new radio access technology. We proposed what we thought was a good design. Lo and behold, the company I work for now, let’s call it Q, had contributed a standards design as well. Their design was inferior to ours and had obvious flaws. We were relieved. However, over the next couple of cycles of standards meetings, Q brought updates to their proposal that closed all the loopholes. Their design formed the bulk of the new standard.

The moral is that good companies present a moving, ever-improving target. Weak ones will give it their best shot but then stagnate.

The same applies to groups within a company. A strong group will successfully fix the flaws in its product. A weak group will be overwhelmed by the size of the fixes and will stagnate and fail.

At a micro level, the same can be said of individuals 🙂

The Multi-Branch Delusion

Early on in the life of a project, the development of a product progresses in consecutive releases developed in succession.

At some point in the life of the project, there may come a time where the software team finds it hard to develop multiple features requested by different customers.

At this point, some may propose branching into 2 or more product lines, one for each customer. In my experience, this is usually the worst possible decision. A team that is already struggling with one product release, now has to face the issues arising from multiple product releases.

The extra systems engineering effort is relatively small and consists of distributing features among the new branches. The testing effort is the most affected; it is directly multiplied by the number of branches. The development effort, while seemingly the same, is also significantly increased since the developers have to support change requests from multiple testing teams, then incorporate the fixes into multiple branch releases.

A multi-branch solution may make sense for a while, but the shrewd project manager should plan to merge all branches in a unifying release at the earliest possible opportunity. Creating multiple branches because a team is in arrears, however, is like trying to block cracks in a dam with your finger.

Rushed Designs For the Wrong Reasons

As a follow-up to my previous post, I will discuss the opposite of a well-thought-out design: the rushed design for the wrong reasons. This phenomenon takes place when a bug or product requirement is discovered and resolved as soon as possible through a system design change. The change may be the only way out of a self-inflicted wound, or it may be a last-minute requirement change from a customer. I have seen both. However, more often than not, the fix will bring its own slew of problems. Eventually, the fix will be removed at great expense and replaced with the more correct solution.

In a project I was working on, a customer asked our radio team to introduce hysteresis on the reporting of a cell level system characteristic. On a handoff from one cell to another, a change in this parameter is reported to the application. My team fought the idea because the handoff operation itself has its own hysteresis mechanism. It was the last requirement before launch and my company complied. The impact was:

  1. The hysteresis introduced delay between the knowledge at the radio stack and the knowledge at the application level, affecting most functions at the radio level. The functioning at the radio level had to be changed in multiple low-level functions to mimic the delay in knowledge.
  2. The complicated changes led to complicated use cases, with multiple rounds of back and forth with the customer on the behavior of tangential use cases.
  3. The hysteresis itself did not introduce any tangible benefit and the customer stopped using it after a while.
  4. We had to keep two branches of the code, one for this particular customer and one for the rest of the world. The reason is that we did not want external intellectual property to be available to a 3rd party. This doubled our testing load for every release.
  5. We removed the hysteresis code after a couple of years.

In the example above, a stronger product team, or better communication channels with the customer, would have allowed pushback on these requirements. In retrospect, it was not worth it. However, at the time, it was the cost of being accepted at launch.

Squeezing the Lemon

As a chess grandmaster once told me: “once you find a good move, search for a great move.” The same applies to systems engineering design: once you have found a good design, search for a better design.

A better design may not necessarily improve performance but may be easier to implement, or be future-proof, or circumvent some implementation difficulties.

The design should start with an individual exploration. This solo effort has the highest chance of being original, and here the most original ideas should be considered. The metaphorical lemon is squeezed to ensure that the best possible solution is proposed. Once an individual conceptualization of the solution is present in the mind of the systems engineer, cross-pollination with the ideas of other systems engineers and software or test engineers may rechannel or replace the ideas obtained through the solo investigation.

For cross-pollination and guidance in the design process, it helps to have a good systems team that is involved in the concept review and the drafting of the design. The best reviews are the ones where the attendees are attentive and where feedback is thorough at all levels, from the high-level concepts down to the details of the presentation of the design. I was lucky to work on a small and experienced systems engineering team of five. Design reviews were brutally frank and incisive. We all gave as good as we received, and the outcome was excellent designs that were well organized and attentive to detail. By the time the design reached the software and test teams, we could concentrate on implementation issues and seldom had to change the high-level concepts.

Beyond the systems team, good communication with the software and test teams at or before the concept review ensures that impossible approaches are discarded, practical points of view are considered, and previously identified pain points are remedied or avoided.

As part of fleshing out the design in the detailed design phase, the systems engineer should consider detailed use cases, including adversarial scenarios. Problem use cases identified in the systems design phase avoid expensive problems in the field, where bug fixes are costly and slow to reach the already deployed product. The metaphorical lemon is squeezed again to identify all the possible impacts of the design. An error in the design at the systems level is at least ten times harder and ten times costlier to fix than an error in the software design, which in turn is at least ten times costlier than an error in the coding of the software design. This multiplicative effect is the reason that enough time should be allocated to the systems design phase, and that the systems design should be given the attention it deserves.

The Development Process

The development process of a product is illustrated in the following V-shape diagram:

Product Design <——————————————> Customer Interop Tests

  Systems Design <——————————> System Test

    Software Design <————> Unit Test

      Implementation

The product implementation and testing start at the upper-left cusp of the V in the diagram, descend to the bottom, then ascend the other branch of the V.

First come the product requirements from the product management team. The systems engineer produces system requirements and a systems design based on the product requirements. The software engineer produces a software design based on the systems design and requirements, and proceeds to implement the module. The software test team or the software engineer runs basic unit tests on the modified code. The code is then delivered to the system test team, which verifies that all the systems requirements are implemented. Finally, whenever applicable, interoperability tests are run at the customer or partner facility to verify that end-to-end customer requirements are met.