Friday, December 18, 2020

Software inertia

Imagine software as a snowball that you want to roll forward. As Uncle Bob said, if you don't set aside time to clean the software, its inertia increases over time. In the beginning the ball is light and you add features, i.e. move the ball, easily. With time the code becomes messier, and changes that took one day in the beginning start to take a couple of days and later weeks. The snowball gets heavier the further you move it. In other engineering disciplines, the more you work on a product, the better it gets, or at least it doesn't get much worse. Good luck to software project managers who try to estimate when the project will be done, because the further down the road, the less reliable the estimates become. For any non-trivial project, you can only come up with reasonable time estimates if the code is continuously cleaned up.

Monday, November 30, 2020

Difference between Simulink model and executable results

When you generate code from a Simulink model, build an executable and compare the results of the Simulink model and the executable, you will usually notice differences. To make sure that these differences are normal, do the following:

  1. Verify that Simulink and executable inputs are exactly (up to 16 digits) the same. Use format long before printing Simulink inputs and outputs. Do the same for executable input and output printing. Most of the time, differences in results are due to differences in significant digits. Pro Tip: If your Simulink model uses constant blocks for all inputs, when you generate C code, those inputs will be embedded into the code. If you build this code and don't touch any of the inputs, you can be sure that Simulink and executable inputs are the same without doing anything extra.
  2. In the executable main(), use double (64bit floating point) data type for inputs and outputs.
  3. Run the executable twice and verify that the results of the two runs are exactly the same. If they are not, a random number generator in the model is the likely cause. Add the random seed as an input so that the same random numbers are generated in every run, which guarantees that consecutive runs give identical results.
  4. Build the executable as 32bit, run it and save the results. Build it as 64bit, run it and save the results. Compare the 32bit and 64bit results. The difference between them gives you an idea of the order of magnitude of difference to expect.
  5. Verify that the difference between the Simulink run and the executable run is of a similar order of magnitude to the difference between the 32bit and 64bit runs.
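
Steps 1, 4 and 5 can be sketched in C as follows. This is only a minimal sketch: the function names (print_full_precision, relative_difference, difference_is_plausible) and the safety factor are assumptions made for illustration and are not part of any generated code. It prints with 17 significant digits, which is enough to round-trip an IEEE 754 double through text.

```c
#include <stdio.h>

/* Absolute value without pulling in the math library. */
static double abs_double(double v) { return v < 0.0 ? -v : v; }

/* Print a double with 17 significant digits so that no digit is lost
   when the value is compared with the Simulink printout. */
void print_full_precision(const char *name, double value)
{
    printf("%s = %.17g\n", name, value);
}

/* Relative difference between a Simulink result and an executable result. */
double relative_difference(double simulink_value, double executable_value)
{
    double a = abs_double(simulink_value);
    double b = abs_double(executable_value);
    double scale = (a > b) ? a : b;
    if (scale == 0.0) {
        return 0.0; /* both values are exactly zero */
    }
    return abs_double(simulink_value - executable_value) / scale;
}

/* Returns 1 when the Simulink-vs-executable difference is at most
   `factor` times the reference difference (e.g. 32bit vs 64bit run).
   `factor` is a hypothetical safety margin chosen by the user. */
int difference_is_plausible(double sim_vs_exe, double ref_32_vs_64, double factor)
{
    return sim_vs_exe <= factor * ref_32_vs_64;
}
```

For example, if the 32bit and 64bit runs differ by about 1e-15 in relative terms, a Simulink-vs-executable difference of 1e-9 would be flagged as suspicious with a factor of 10.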

Letters to a novice programmer

Previous letter

When the problem you need to solve has only two states, don't create a monster (e.g. FizzBuzz Enterprise), with state machine builders, polymorphism and templates sprinkled around! Use simple switch statements so that a developer can easily move through code with ctrl+left clicks, without ever needing to use ctrl+F.
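
As a minimal sketch of what "simple" can look like, here is a two-state machine written with a plain switch. The valve, its states and its events are hypothetical names chosen only for illustration.

```c
/* Hypothetical two-state example: a valve that is either closed or open. */
typedef enum { VALVE_CLOSED, VALVE_OPEN } ValveState;
typedef enum { EVENT_NONE, EVENT_BUTTON_PRESSED } ValveEvent;

/* Returns the next state. All transitions are visible in one place,
   so a reader can follow the logic without searching the code base. */
ValveState valve_next_state(ValveState state, ValveEvent event)
{
    switch (state) {
    case VALVE_CLOSED:
        return (event == EVENT_BUTTON_PRESSED) ? VALVE_OPEN : VALVE_CLOSED;
    case VALVE_OPEN:
        return (event == EVENT_BUTTON_PRESSED) ? VALVE_CLOSED : VALVE_OPEN;
    }
    return state; /* not reached for valid enum values */
}
```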

Monday, September 14, 2020

Tip when reading data from file

When I read data from a file in Matlab, I sometimes read the wrong file because files have the same name but different paths and I want to read the latest file. If you print the file date/time to Matlab's command window, you will get a visual indication which can alert you if you read an older file with the same name. You can use the following two lines to print file date:
fileInfo = dir(fileName);
fprintf("File date: %s\n", fileInfo.date);

Friday, September 11, 2020

Printing all permutations of a string

You can use the following C code to print all two character permutations (with repetition allowed) of a string:
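
The two-character case could look like the sketch below. It is a reconstruction under assumptions, not necessarily the original listing: the function name and the returned count are added for illustration, while the buffer name p[] follows the text.

```c
#include <stdio.h>
#include <string.h>

/* Prints every two-character sequence (repetition allowed) drawn from s
   and returns how many sequences were printed. */
int print_two_char_permutations(const char *s)
{
    size_t n = strlen(s);
    char p[3] = {0}; /* two characters plus the terminating '\0' */
    int count = 0;

    for (size_t i = 0; i < n; i++) {
        p[0] = s[i];
        for (size_t j = 0; j < n; j++) {
            p[1] = s[j];
            printf("%s\n", p);
            count++;
        }
    }
    return count;
}
```

For the string "abc" this prints the 3*3 = 9 sequences from aa to cc.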

If you would like to print three character permutations, you have to add another for loop:

As you can see, this is not generalizable because you have to keep adding or deleting for loops manually. Also notice that with each loop, the p[] index is increased by one. We could use this fact to write a recursive function that could handle all character permutations:
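
A recursive version could be sketched as follows. The names permute, print_permutations and MAX_LEN are assumptions made for illustration. Each recursion level plays the role of one for loop and fills p[pos], mirroring the observation that the p[] index grows by one with each loop.

```c
#include <stdio.h>
#include <string.h>

#define MAX_LEN 16 /* hypothetical upper bound on the sequence length */

/* Fills p[pos] with every character of s and recurses for the next
   position. Prints p once all len positions are filled.
   Returns the number of sequences printed. */
static int permute(const char *s, size_t n, char *p, size_t pos, size_t len)
{
    if (pos == len) {
        p[len] = '\0';
        printf("%s\n", p);
        return 1;
    }
    int count = 0;
    for (size_t i = 0; i < n; i++) {
        p[pos] = s[i];
        count += permute(s, n, p, pos + 1, len);
    }
    return count;
}

/* Prints all sequences of length len (repetition allowed) drawn from s. */
int print_permutations(const char *s, size_t len)
{
    char p[MAX_LEN + 1];
    if (len > MAX_LEN) {
        return 0; /* requested length does not fit into the buffer */
    }
    return permute(s, strlen(s), p, 0, len);
}
```

Calling print_permutations("abc", 2) prints the same 9 sequences as two nested loops, and print_permutations("abc", 3) prints 27.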

This is a nice example of first writing the simple cases, seeing a pattern and reaching the general case, i.e. inductive reasoning.

Friday, August 21, 2020

Clean Code

Software project requirements are often not fully known in advance, requiring teams to learn a significant portion of the necessary information during the development process. This frequently leads to the addition of new features after the initial requirements and design phases, as well as extensive debugging. A key property of software is its ability to improve a system without changing the hardware. To take advantage of this flexibility, software must be designed in such a way that it is easily modifiable. Otherwise, it risks becoming as inflexible as hardware. Since most of the development time is spent reading and modifying code, it is imperative to optimize for this fact, which means writing clean code, i.e. code that is easy to understand.

Clean Code - Uncle Bob / Lesson 1:

  • [28:51] It is not the old people training the new people; it is the old code training the new people.
  • [31:06] No one writes clean code first because it is just too hard to get the code to work... Cleaning code requires as much time as making it work in the first place. Nobody wants to put that extra effort in. You are not done when it works, you are done when it's right/clean.
  • Clean code should read like well written prose.
  • [46:30] Every line in a function must be at the same abstraction level.
  • [58:45] A function should do one thing, i.e. you should not be able to extract another function from it.
  • Don't pass a boolean to a function. Bad: setCentered(true, false). OK: setVisible(true). By using enumeration values or constants rather than a boolean, you make your code more readable, e.g. repaint(PAINT::immediate).
  • Replace switch/if with polymorphism
  • Command and query separation: A function returning void must have a side effect [change system state], a function returning a value should have no side effect. With this convention, when you see a function that returns a value, you assume it is safe to call it because it should leave the system in the same state before you called it.
  • [31:27] The design and code should get better with time, not get worse [continuous improvement].
  • [32:15] If you touch it [the messy code], you will break it. If you break it, it becomes yours(!) Minimizing personal risk vs improving a messy system.
  • Unit tests result in fearless competence.
  • [35:32] Always check it in a little bit better than you found it.
Tools are not the Answer: Raise the level of software discipline and professionalism. Never make excuses for sloppy work.
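
Two of the points above, the enumeration instead of a boolean and command/query separation, can be sketched in C as follows. The Widget type and its functions are hypothetical and only serve as an illustration.

```c
/* An enumeration makes the call site self-explanatory:
   widget_repaint(&w, PAINT_IMMEDIATE) instead of widget_repaint(&w, 1). */
typedef enum { PAINT_DEFERRED, PAINT_IMMEDIATE } PaintMode;

typedef struct {
    int repaint_count; /* number of repaints performed so far */
    int pending;       /* 1 if a deferred repaint is waiting */
} Widget;

/* Command: returns void and changes the state of the widget. */
void widget_repaint(Widget *w, PaintMode mode)
{
    if (mode == PAINT_IMMEDIATE) {
        w->repaint_count++;
        w->pending = 0;
    } else {
        w->pending = 1;
    }
}

/* Query: returns a value and, as the const pointer documents,
   leaves the widget unchanged. */
int widget_repaint_count(const Widget *w)
{
    return w->repaint_count;
}
```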

When working in an environment where messy code is common, it's crucial to commit to the repository in as small increments as possible. This is because even minor code changes can unexpectedly break the system. Often, you may only realize this after several days or weeks. Given the complexity of the code, your primary method of diagnosing the issue will be reverting to earlier revisions. If your commits are small, it will be much easier and quicker to isolate the cause of the code breakage.

Monday, August 10, 2020

TortoiseSVN: List all externals in a repository

When you have a repository that uses other repos as externals, the easiest way to list them is to right click on your local repo root folder and select Branch/tag. The bottom of the window displays all externals in that repo.

When you tag a repo, don't forget to check all externals so that when you go back to that tag at a later date, the externals will also be at the revisions they had on the tag date. If you do not check them, they will come back as head revisions, which might cause problems if your tagged non-external code is not consistent with the latest revisions of the external repos. The only problem with this approach is that if you test your code and then let time pass before tagging, another developer might update one of the externals in the meantime, and when you tag, TortoiseSVN uses the revision on the server, not the one in your local working copy.

If you want to be sure that you tag the external revision in your local, you should use svn copy [local path] [server path] --pin-externals -m "this is a test"

Note that you need version 1.9 or later of TortoiseSVN in order to use pin-externals.

Friday, June 19, 2020

Letters to a novice programmer

I decided to move software related posts to this blog. See my latest letter. Recently I saw C++ code similar to the following:
myAlgo.setInputs(inputStruct);
myAlgo.calculate();
myAlgo.getOutputs(outputStruct);
A better design is to refactor the calculate() method so that it takes the inputs as an argument and returns the outputs:
outputStruct out = calculate(inputStruct);
This version saves the user of myAlgo a couple of lines and removes the risk of forgetting to set the inputs. In the previous version, if you forget to call setInputs(), the compiler will happily build your code and you will waste time finding the bug at run time. In the new version, if you forget to pass inputs to calculate(), the code won't build and you will see the bug instantly.

Whenever you have multiple public initialization functions, try to combine them into a constructor. Your design should be such that when your code builds successfully, you can be confident that it has no initialization- or finalization-related bugs. Let the compiler help you.
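
The same idea can be sketched in plain C (the code in the post is C++). The struct fields and the function name algo_calculate are hypothetical. The point is that the inputs are a parameter, so a call without them is rejected at compile time.

```c
/* Hypothetical input and output types of the algorithm. */
typedef struct {
    double gain;
    double signal;
} AlgoInputs;

typedef struct {
    double scaled;
} AlgoOutputs;

/* Inputs go in as an argument and outputs come back as the return value.
   There is no separate "set inputs" step that could be forgotten. */
AlgoOutputs algo_calculate(const AlgoInputs *in)
{
    AlgoOutputs out;
    out.scaled = in->gain * in->signal;
    return out;
}
```

With this prototype in scope, writing algo_calculate() without an argument is a compile error instead of a run-time surprise.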

Tuesday, June 16, 2020

From Workspace uses first element as time

If you create a Simulink model with a From Workspace block and use it to get a vector, you will be surprised to see that the first element is missing:


The reason is that From Workspace treats the first element of a vector as time and the rest as data.

Tuesday, May 12, 2020

Applying derivative and integral on noisy data

Let's say you have a motion with sinusoidal acceleration, i.e. xddot(t) = sin(t). The analytic solutions for speed and position are xdot(t) = -cos(t) + 1 and x(t) = -sin(t) + t (assuming zero initial speed and position). Let's say you don't know the analytical expressions of position, speed and acceleration and only have their values at a fixed sample rate. If you want to increase the number of samples, you can use Matlab's spline function.

If you have position as input and calculate speed and acceleration via differentiation, any noise in the signal will be amplified. If you have acceleration as input and calculate speed and position via integration, noise will be attenuated. Another advantage of using integration instead of differentiation is that we can use linear interpolation, since integration does not need smooth second derivatives. This decreases both the computational load and the lag, because linear interpolation needs two points while a spline needs four.
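
The amplification and attenuation can be illustrated with a small C sketch. It is not the original Matlab example, and the function names are assumptions. Because differentiation and integration are linear, the error they add to a clean signal equals the operation applied to the noise alone, so the sketch only processes the noise. The noise flips sign at every sample, which is an assumed worst case for differentiation.

```c
#include <stddef.h>

static double abs_double(double v) { return v < 0.0 ? -v : v; }

/* Fill x with noise that flips sign every sample: +eps, -eps, +eps, ... */
void fill_alternating_noise(double *x, size_t n, double eps)
{
    for (size_t i = 0; i < n; i++) {
        x[i] = (i % 2 == 0) ? eps : -eps;
    }
}

/* Largest magnitude of the forward difference (x[k+1] - x[k]) / dt. */
double max_abs_derivative(const double *x, size_t n, double dt)
{
    double worst = 0.0;
    for (size_t k = 0; k + 1 < n; k++) {
        double d = abs_double((x[k + 1] - x[k]) / dt);
        if (d > worst) {
            worst = d;
        }
    }
    return worst;
}

/* Largest magnitude of the running rectangle-rule integral sum(x[k] * dt). */
double max_abs_integral(const double *x, size_t n, double dt)
{
    double sum = 0.0;
    double worst = 0.0;
    for (size_t k = 0; k < n; k++) {
        sum += x[k] * dt;
        if (abs_double(sum) > worst) {
            worst = abs_double(sum);
        }
    }
    return worst;
}
```

With eps = 0.01 and dt = 0.01, the differentiated noise reaches about 2*eps/dt = 2, while the integrated noise stays at about eps*dt = 1e-4.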



Thursday, May 7, 2020

Mex only on file change

For large Simulink projects with many s-functions, performing a mex of all files can be time consuming. Leaving it up to the user might result in simulation runs with outdated mex files, because the user might forget to re-mex an updated file. You can automate the process as follows:
  • Create a hash list file that consists of file names used by s-functions and corresponding hashes.
  • Before simulation start, in InitFcn, calculate hashes of files used by s-functions and compare them with the values saved in hash list file. Mark files whose hashes are different as "changed".
  • Mex files marked as "changed".
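For hash calculation you can use [status, cmdout] = system(['certutil -hashfile "', fileName, '" SHA256']) and parse cmdout via str = splitlines(cmdout); hash = str{2}; to get the hash of the file. Note the space after -hashfile and the curly braces: the hash is on the second output line of certutil.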

From Workspace Sample Time

Let's say you have a single Simulink model with plant and controller, and another that just has the plant.

If you save the outputs of the plant&controller model with To Workspace blocks, add From Workspace blocks to the plant only model and run it with the outputs of plant&controller, you might see differences between the results. To get a close match, don't leave the sample time of the From Workspace blocks at zero; set it equal to the sample time of the plant&controller model.

Tuesday, March 24, 2020

Documenting a real time system

A developer handbook is indispensable if you need to remember what you did a year ago (due to a bug or a new feature to be added) or if you want to bring a junior developer up to speed without babysitting them. A good handbook should include the following:
  1. Introduction: Why does the system exist, what function does it serve?
  2. Components of the system and how they communicate with each other (serial, ethernet etc.). Photos of the real system and a mind map denoting peripherals (sensors, actuators) would be handy.

  3. Setup: All software and hardware setup procedures, links to software installation folders.
  4. Sanity tests to verify that setup works properly. For example, have a test to verify that the system really works in real time by comparing system time with an external time source. A very crude test would be using the timer in your smartphone.
  5. Development use cases: How to add a new state to the state machine.
  6. Design: 
    1. Architecture.
    2. Main workflow, especially external input/output.
    3. Task priorities and rationale.
    4. Design decisions and their reasons. Why is the current design the best one under existing constraints (time, budget, experience)? Why is there no simpler way to satisfy requirements? What were the disadvantages of the alternatives? Examples: Why did you write your own file transfer protocol instead of using an existing one? Why did you choose Micrium instead of an OS like VxWorks? Couldn't you have done it without an OS?
  7. Troubleshooting guide for frequent problems.

Thursday, March 12, 2020

HWIL development workflow

A typical HWIL development workflow starts with Simulink models and ends with code deployed on hardware with software that is expected to run in real time:

If there are differences between Simulink and Visual Studio runs, most of the time it is because you forgot to equalize all inputs. If you are sure that the inputs are the same, the remaining sources of difference are details of floating point representation and, sometimes, a Simulink block generating wrong code. For example, the code generated for the Aerospace Blockset ECEF2LLA block results in latitude lagging one time step behind, which does not occur during a Simulink run.

If there are differences between the results of software running on hardware and the Visual Studio results, it might be an indication of insufficient stack size allocation or of tasks not meeting their deadlines. If you are lucky and have an advanced real time OS like VxWorks, it might throw a segmentation violation if there is too little stack available. If you don't want to count on luck, you have to test thoroughly, preferably using automated tests.

Monday, March 9, 2020

Stack corruption due to different pointer types

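The C programming language has traditionally allowed you to pass a float pointer to a function that expects a double pointer, at most with a compiler warning. The function then typically writes 8 bytes into a 4 byte variable, which corrupts the stack. Example code:
When you run it, you will see that val_f2 is zero (it should be 5):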

Visual Studio 2015 will only display stack corruption message when you build in debug mode. In release mode, you don't get a message.

If you copy the same code to a cpp file, Visual Studio will use the C++ compiler and it will not build the code, saying that types are incompatible.
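
The mechanism can be illustrated without relying on undefined behavior. The sketch below is not the original example: it uses memcpy on a two-element float array to simulate what a callee does when it believes it received a double pointer. The function names are hypothetical, and it assumes that a double is twice as large as a float, which the typedef checks at compile time.

```c
#include <string.h>

/* Compile-time check of the size assumption used below. */
typedef char double_is_two_floats[(sizeof(double) == 2 * sizeof(float)) ? 1 : -1];

/* Simulates a callee that was told "this address holds a double":
   it stores sizeof(double) bytes there, no matter what is really there. */
void store_as_double(void *dest, double value)
{
    memcpy(dest, &value, sizeof value);
}

/* vals[0] is the intended float target and vals[1] is its neighbour.
   Returns the neighbour after the oversized store. */
float neighbor_after_store(void)
{
    float vals[2] = {0.0f, 5.0f};
    store_as_double(&vals[0], 3.0); /* overwrites vals[0] AND vals[1] */
    return vals[1];
}
```

The neighbour no longer holds 5 after the call. Which value it ends up with depends on the byte order of the machine, just as the damage on a real stack depends on how the compiler lays out the local variables.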

Unconventional interfacing

When you have to interface with a program that has a GUI but whose API you don't know, you can use a video camera with image processing if you only need to read data. If you also have to change values in GUI, you can use tools like Macro Scheduler.

Sunday, March 8, 2020

Simulink: Passing strings to s-functions

As of R2019b, you can't pass a string (e.g. a file path) to an s-function via string constant block. You can convert a string to an ASCII value array, pass it to s-function, and inside the s-function convert the array back to string. Note that this means you are limited to the ASCII table of characters. You can use the following function to construct string from s-function input:
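
A conversion function could look like the sketch below. It is a reconstruction under assumptions, not the original listing: it takes a plain pointer to doubles (Simulink's default signal type) instead of using the s-function API, and the name ascii_codes_to_string is hypothetical.

```c
#include <stddef.h>

/* Converts n ASCII codes stored as doubles into a C string.
   Conversion stops at the first value outside 1..127 (e.g. zero padding)
   or when out is full. out always ends up null terminated.
   Returns the number of characters written, excluding the terminator. */
size_t ascii_codes_to_string(const double *codes, size_t n,
                             char *out, size_t out_size)
{
    size_t i = 0;
    if (out_size == 0) {
        return 0;
    }
    while (i < n && i + 1 < out_size) {
        double c = codes[i];
        if (!(c >= 1.0 && c <= 127.0)) {
            break; /* padding, NaN or a non-ASCII value */
        }
        out[i] = (char)(int)c;
        i++;
    }
    out[i] = '\0';
    return i;
}
```

On the Matlab side, double('C:\a') produces the code array [67 58 92 97] that this function turns back into a string.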

Wednesday, March 4, 2020

Verifying that your system runs in real time

Usually custom designed HWIL electronic cards don't have real time clocks independent of the CPU. Your RTOS will calculate time by clock_frequency*time_const. Both of these values are part of the card's configuration, which is set by the hardware designer before the card is handed over to you. If these values are not consistent with each other, your software won't run in real time. You might waste weeks looking for bugs in your software while the root cause is wrongly configured hardware.

For your own sanity, one of the first tests you should run on hardware is to verify that your software runs in real time. The simplest way is to run for 5 minutes as measured by an independent time source (like the stopwatch app on your phone or Windows system time) and track the simulation time. You cannot rely on time values provided by the RTOS. The simulation time between start and stop should be 5*60 = 300s. If not, contact the person responsible for the card.

After you have established that simulation time and real time are the same, you can start to test the system under different loading conditions to verify that all your software tasks meet their deadlines. If not, you update your software, because at this stage you are sure that the hardware is ok in the real time sense.

Tuesday, March 3, 2020

Embedding data into code

When working in Windows, you can easily read data from a text file. But when you deploy your code to custom hardware with an operating system that does not have text file reading and parsing functions, it is easier to embed the data from the file into arrays in code.

When data is large like Geoid Height Data (721*1440 elements), you cannot embed it into a header file. If you do and try to build in Visual Studio, the build might hang. You have to embed data into a cpp file and expose it in header via extern keyword.

To use two dimensional (2D) data in a function you can use the following example:
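
A minimal C sketch of the idea is shown below (the same layout works in a cpp file). The file names in the comments, the array name geoid_height, the tiny 3x4 size and the values are placeholders chosen for illustration, not the real geoid data.

```c
#include <stddef.h>

/* --- geoid_data.h (hypothetical header): declaration only --- */
#define GEOID_ROWS 3
#define GEOID_COLS 4
extern const double geoid_height[GEOID_ROWS][GEOID_COLS];

/* --- geoid_data.c (hypothetical source file): the actual numbers --- */
const double geoid_height[GEOID_ROWS][GEOID_COLS] = {
    {1.0,  2.0,  3.0,  4.0},
    {5.0,  6.0,  7.0,  8.0},
    {9.0, 10.0, 11.0, 12.0},
};

/* A 2D array parameter must state every dimension except the first,
   otherwise the compiler cannot compute the address of table[row][col]. */
double table_lookup(const double table[][GEOID_COLS], size_t rows,
                    size_t row, size_t col)
{
    if (row >= rows || col >= GEOID_COLS) {
        return 0.0; /* out of range; a real implementation might report an error */
    }
    return table[row][col];
}
```

A call such as table_lookup(geoid_height, GEOID_ROWS, 1, 2) then returns the element in the second row and third column.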

Debug C++ file

To debug a C++ file within a Simulink run, you compile the file with the "-g" flag. Consider the following Simulink model:

  1. Compile file: mex -g mySFunction.cpp. This will create mySFunction.mexw64 and mySFunction.mexw64.pdb.
  2. Open Visual Studio, open mySFunction.cpp
  3. Click Debug/Attach to process, select Matlab.exe.
  4. Set breakpoint in Visual Studio
  5. Run Simulink
  6. Code will stop in Visual Studio.

Thursday, January 16, 2020

Using workspace variables in C++ Mex functions

When you have C or C++ files that you would like to use in your Simulink model, you have to interface them with a mex file. Variables in Matlab workspace can be used inside C++ mex functions. Example m file and mex function demonstrating scalar, vector, matrix, string and structure usage:

When you run mexDriver.m, you will get an output similar to this:


Tuesday, January 7, 2020

Comparing two models

When you have an mdl/slx model, to compare it with another version of the same model:
  1. Open model in Simulink.
  2. Go to Analysis menu, click Compare Simulink XML Files...
  3. Select the model file you want to compare with the opened one.
  4. Simulink will open a window displaying the differences. When you click on a line, it will display both models, highlighting the differences.

Monday, January 6, 2020

File closing bug

Recently, I saw that when I ran a Simulink model multiple times in a row from a script with the sim() command, the model would throw an error on the 509th run with a message indicating that it could not find a file, i.e. an fopen() call returned NULL. The curious thing is that the file exists, otherwise the model would not have been able to run the previous 508 times.

After one day of debugging, I saw that I had forgotten to close a file in another, unrelated s-function. I guess that after a certain number of unclosed file pointers, fopen() calls start to fail. This reminded me of an old heap corruption bug that took me months to find and fix.
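
The effect can be sketched in C as follows. The function names and MAX_TRACKED are hypothetical, and tmpfile() stands in for the fopen() call of the s-function. As far as I know, the Microsoft C runtime allows 512 simultaneously open streams by default, three of which are taken by stdin, stdout and stderr, which would be consistent with a failure around the 509th run.

```c
#include <stdio.h>

#define MAX_TRACKED 4096 /* hypothetical cap for this demonstration */

/* Correct pattern: every stream is closed, so the number of open
   streams stays constant. Returns the number of successful iterations. */
int open_and_close(int iterations)
{
    int ok = 0;
    for (int i = 0; i < iterations; i++) {
        FILE *f = tmpfile();
        if (f == NULL) {
            break;
        }
        fclose(f);
        ok++;
    }
    return ok;
}

/* Leaky pattern: opens up to `limit` streams without closing them and
   returns how many could be opened before the first failure.
   The streams are closed at the end only to clean up after the demo. */
int count_opens_until_failure(int limit)
{
    static FILE *open_files[MAX_TRACKED];
    int n = 0;
    if (limit > MAX_TRACKED) {
        limit = MAX_TRACKED;
    }
    while (n < limit) {
        FILE *f = tmpfile();
        if (f == NULL) {
            break; /* the runtime or OS ran out of stream slots */
        }
        open_files[n++] = f;
    }
    int opened = n;
    while (n > 0) {
        fclose(open_files[--n]);
    }
    return opened;
}
```

Calling count_opens_until_failure with a large limit shows where the platform's limit lies, while open_and_close can run for any number of iterations.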