Monday, December 30, 2024

Embedded Software: Simple vs. Complex

Embedded software is integral to modern technology, ranging from simple home appliances to advanced autonomous systems. It can be broadly classified into two categories: simple (Non-OS) and complex (OS-driven) embedded software.

Simple Embedded Software: When simplicity and low cost are priorities and an OS would be overkill

Examples:

  1. Power or temperature monitoring systems.
  2. Simple applications in household appliances like ovens and washing machines.
Characteristics:
  1. Typically designed for applications with few tasks.
  2. No operating system necessary.
  3. Software interacts directly with the microcontroller’s hardware (registers etc.), forcing rewrites if the hardware changes.
Advantages:
  1. Low power consumption, low cost.
  2. Can be developed by electronics engineers; computer engineers are not needed because basic embedded programming knowledge is sufficient.
  3. Deterministic: By sidestepping the complexity of OS schedulers, simple systems achieve predictable performance.
  4. Fewer abstraction layers make verification and validation straightforward, which is a huge advantage for safety-critical certification.
Complex Embedded Software: When multi-tasking, file operations and networking necessitate an OS

Examples:

  1. IoT devices requiring seamless connectivity.
  2. Systems involving advanced sensor integration or navigation.
Characteristics:
  1. Runs on an operating system that manages tasks and system resources.
  2. Capable of handling multiple tasks and applications simultaneously.
  3. Safety-critical certification is difficult. To make it easier, safety-critical parts should be developed as separate, simpler modules.
Advantages:
  1. Less competition and higher profit margins, provided that you have a strong technical team.
  2. Requires computer engineers to lead the development because of increased software complexity. Besides embedded software courses, algorithms, data structures, and operating systems are core parts of computer engineering but not of electronics engineering.
  3. The OS abstracts low-level hardware management, enabling developers to focus on application logic. A POSIX-compliant application, for instance, can run on any POSIX-supporting OS with minimal changes.
  4. Easier for new developers to adapt and contribute due to less hardware dependency.
  5. A broad range of pre-existing libraries simplifies development.
  6. Operating systems provide abstraction layers (e.g., Linux Device Model), allowing drivers to expose standard interfaces while interacting with specific hardware.
  7. Simplifies adding new functionality (e.g. telemetry) or adapting to new hardware (e.g. new/different sensors).
  8. With minor modifications, software can be tested on a PC, speeding up testing with less effort (no need for electronic cards, power supplies, etc.) and reducing bugs.
Operating systems can also be categorized as either simple (e.g., FreeRTOS) or complex (e.g., real time Linux with ROS) - but let's leave that topic for another blog post.

Monday, December 16, 2024

Serialization

In C++, to serialize simple data, aka plain old data (POD), where the layout in memory is predictable, you can use a char* (byte) buffer:
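A minimal sketch of the byte-buffer approach follows. The `Sample` type and the `serialize`/`deserialize` helpers are hypothetical names chosen for illustration, and the sketch assumes that the same compiler, padding rules, and endianness are used on both sides.

```cpp
#include <cstddef>
#include <cstring>
#include <type_traits>
#include <vector>

// Hypothetical POD type, used only for illustration.
struct Sample {
    int    id;
    double value;
};

// memcpy-based serialization is only valid for trivially copyable types.
static_assert(std::is_trivially_copyable<Sample>::value,
              "Sample must be trivially copyable");

// Copy the raw bytes of the object into a byte buffer.
std::vector<char> serialize(const Sample& s) {
    std::vector<char> buffer(sizeof(Sample));
    std::memcpy(buffer.data(), &s, sizeof(Sample));
    return buffer;
}

// Rebuild the object from the buffer; returns false if the size does not match.
bool deserialize(const std::vector<char>& buffer, Sample& out) {
    if (buffer.size() != sizeof(Sample)) return false;
    std::memcpy(&out, buffer.data(), sizeof(Sample));
    return true;
}
```

Here `buffer.data()` is the `char*` that would be written to a file or socket. The `static_assert` makes the compiler reject types for which a raw byte copy is not safe.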


This method cannot be used for non-POD types (e.g., those with pointers or virtual methods) because their memory layout is not portable. Examples are std::string, std::vector. For such types, you can use std::ostringstream:
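One possible way to do this is sketched below. The `Record` type and the length-prefixed text format are assumptions made for this example, not a standard format; the length prefix is there so that names containing spaces survive the round trip.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical non-POD type, used only for illustration.
struct Record {
    std::string      name;
    std::vector<int> values;
};

// Text format: <name length> <name bytes> <value count> <v0> <v1> ...
std::string serialize(const Record& r) {
    std::ostringstream os;
    os << r.name.size() << ' ' << r.name << ' ' << r.values.size();
    for (int v : r.values) os << ' ' << v;
    return os.str();
}

// Parse the format written by serialize(); returns false on malformed input.
bool deserialize(const std::string& data, Record& out) {
    std::istringstream is(data);
    std::size_t nameLen = 0;
    if (!(is >> nameLen)) return false;
    is.get();  // skip the single separator after the length
    std::string name(nameLen, '\0');
    if (nameLen > 0 &&
        !is.read(&name[0], static_cast<std::streamsize>(nameLen))) return false;
    std::size_t count = 0;
    if (!(is >> count)) return false;
    std::vector<int> values(count);
    for (int& v : values) {
        if (!(is >> v)) return false;
    }
    out.name   = std::move(name);
    out.values = std::move(values);
    return true;
}
```

Because the numbers are written as text, this format does not depend on endianness, at the cost of larger output than a binary format.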


Both approaches assume that the serialized data format and endianness match between serialization and deserialization. For more complex cases, use libraries like nlohmann/json for JSON-based serialization and Boost.Serialization for binary/text serialization with more features.

Data Structure Alignment

The C++ compiler aligns data structures to the largest alignment required by any field (8 bytes in the case below, due to double). This ensures faster memory access, as modern CPUs perform better when data is aligned to specific boundaries because it results in single memory word access. As a side effect, sizeof(MyStructure) (40 bytes due to padding) is larger than the sum of individual fields (33 bytes).
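The structure definition is reconstructed here from the field list in this post, so the exact declaration is an assumption. The offsets in the comments hold on typical 64-bit desktop compilers (4-byte int, 8-byte double with 8-byte alignment) and can differ on other platforms.

```cpp
#include <cstddef>  // offsetof, std::size_t

struct MyStructure {
    int    i1;    // offset 0, followed by 4 bytes of padding
    double d1;    // offset 8
    char   s[9];  // offset 16, followed by 3 bytes of padding
    int    i2;    // offset 28
    double d2;    // offset 32
};

// Sum of the field sizes, ignoring padding.
constexpr std::size_t kSumOfFields =
    sizeof(int) + sizeof(double) + sizeof(char[9]) + sizeof(int) + sizeof(double);

// Number of padding bytes the compiler inserted.
constexpr std::size_t kPaddingBytes = sizeof(MyStructure) - kSumOfFields;
```

Printing `offsetof(MyStructure, d1)`, `offsetof(MyStructure, i2)`, and `sizeof(MyStructure)` on your own target is a quick way to confirm the layout before relying on it.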

Field Offsets and Padding

  1. int i1:

    • Requires 4-byte alignment.
    • Starts at offset 0.
    • Takes 4 bytes.
    • The next field, d1, requires 8-byte alignment. Since i1 ends at offset 4, the compiler adds 4 bytes of padding after i1.
  2. double d1:

    • Requires 8-byte alignment.
    • Starts at offset 4 + 4 = 8.
    • Takes 8 bytes.
  3. char s[9]:

    • Requires no specific alignment (1-byte alignment is sufficient).
    • Starts at offset 16 (immediately after d1).
    • Takes 9 bytes.
    • The next field, int i2, requires 4-byte alignment. Therefore, the compiler adds 3 bytes of padding after s to ensure proper alignment.
  4. int i2:

    • Requires 4-byte alignment.
    • Starts at offset 28.
    • Takes 4 bytes.
  5. double d2:

    • Requires 8-byte alignment.
    • The next offset must be a multiple of 8. Since i2 ends at offset 32 (already aligned), no padding is required.
    • Starts at offset 32.
    • Takes 8 bytes.

Tuesday, December 3, 2024

Simulation variable names

In simulation, you have to be precise when talking about a parameter. For example, it is never enough to say "height". You should always say "height with respect to mean sea level, with units in feet". The reason is that height can also be measured from the WGS84 ellipsoid or from the ground (AGL). Every couple of months, I see engineers waste days, sometimes weeks, due to such misunderstandings.

Here is a list that I frequently encounter, with bad and good variable naming:
  • height: h - hMSL_ft (height measured from MSL, units in feet)
  • time: t - timeFreeFlight_s (time started at free flight start, units in seconds)
  • velocity: v - v_bc_Fn_mps (velocity of body fixed frame Fb wrt ground fixed frame Fc, with components expressed in NED frame, units in m/s)
  • speed: v - speed_Mach
  • acceleration: a - a_bi_noG_Fb_mps2 (acceleration of Fb wrt inertial frame Fi, without gravity components, expressed in Fb, units in m/s^2)
  • Euler angles: euler - euler_Fn2FbRFB321_rpy_rad (321 yaw pitch roll sequence rotated frame based Euler angles that convert a vector in Fn to a vector in Fb, array index order is roll pitch yaw, units in radians)
  • Azimuth: az - azimuthTrueNorth_deg (azimuth angle measured from True North, units in degrees)
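As a small illustration of why such names help, here is a sketch with two hypothetical helper functions (the names `hMSL_m_from_ft`, `hAGL_ft`, and `terrainElevationMSL_ft` are made up for this example). With the reference and the unit in every name, a missing conversion or a mixed-up reference is visible where the function is called.

```cpp
// Exact by definition of the international foot.
constexpr double kMetersPerFoot = 0.3048;

// Convert height above mean sea level from feet to meters.
double hMSL_m_from_ft(double hMSL_ft) {
    return hMSL_ft * kMetersPerFoot;
}

// Height above ground level, from height above MSL and terrain elevation above MSL.
// Both inputs and the result are in feet.
double hAGL_ft(double hMSL_ft, double terrainElevationMSL_ft) {
    return hMSL_ft - terrainElevationMSL_ft;
}
```

A call such as `hAGL_ft(hMSL_m, ...)` looks wrong immediately, whereas `agl(h, terrain)` would hide the same mistake.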

Monday, November 25, 2024

Denormalized floating-point numbers

Recently, when debugging with Visual Studio 2019, I noticed that a variable had the unusual value of 6.324...e-322#DEN. This is "almost" equal to zero. In Visual Studio's debugger, the #DEN label refers to a denormalized (or subnormal) floating-point number. This occurs when the value of a floating-point variable is so close to zero that it cannot be represented in the normalized format of the IEEE 754 standard.

A floating-point number is typically represented as:

(-1)^sign × 1.fraction × 2^(exponent - bias)

This is called a normalized number because the leading digit of the fraction (mantissa) is assumed to be 1.

When the exponent is at its smallest possible value (the minimum exponent), and the number is still too small to be represented in the normalized form, it is stored as a denormalized number. For denormalized numbers, the leading digit is 0 instead of 1, so the representation becomes:

(-1)^sign × 0.fraction × 2^(minimum exponent)

Denormalized numbers fill the gap between the smallest normalized number and zero. 

By the way, it turned out that the strange 6.3...e-322#DEN value I was observing during debugging was due to an uninitialized variable.

You can use the following C++ code to investigate further:
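The original listing is not reproduced here, so the following is only a sketch of what such an investigation could look like. It uses std::fpclassify and std::numeric_limits from the standard library; the helper names `classify` and `printDenormalExamples` are made up for this example.

```cpp
#include <cmath>
#include <iostream>
#include <limits>
#include <string>

// Name the IEEE 754 category of a double using std::fpclassify.
std::string classify(double x) {
    switch (std::fpclassify(x)) {
        case FP_ZERO:      return "zero";
        case FP_SUBNORMAL: return "subnormal";
        case FP_NORMAL:    return "normal";
        case FP_INFINITE:  return "infinite";
        case FP_NAN:       return "nan";
        default:           return "unknown";
    }
}

// Print a few values around the normalized/denormalized boundary.
void printDenormalExamples() {
    const double smallestNormal   = std::numeric_limits<double>::min();        // about 2.2e-308
    const double smallestDenormal = std::numeric_limits<double>::denorm_min(); // about 4.9e-324
    const double values[] = {1.0,
                             smallestNormal,
                             smallestNormal / 2,      // below the normalized range
                             128 * smallestDenormal,  // about 6.32e-322
                             smallestDenormal,
                             0.0};
    for (double v : values) {
        std::cout << v << " is " << classify(v) << '\n';
    }
}
```

Calling `printDenormalExamples()` from `main` lists each value with its category. The fourth value has the same magnitude as the one seen in the debugger, which is consistent with a few stray low-order bits in otherwise zeroed memory.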

Friday, November 15, 2024

Strange error when generating C code from Simulink

When I tried to generate code from a recently updated Simulink model, I got "index exceeds the number of array elements. Index must not exceed 0". This error happened only during code generation, not when running the model. After spending a day, I found out that the reason was forgetting the folder separator "\" in an entry in Model Configuration Parameters > Code Generation > Custom Code > Source Code list, i.e. instead of "abc\file.cpp" the entry was "abcfile.cpp". A very unhelpful error message for such a simple error... A better error message would be "could not find file abcfile.cpp".

By the way, you can generate code from the MATLAB command line with slbuild('your_model_name').

Another error is "Invalid setting for input port dimensions of ...". If you are sure that there is no problem with port dimensions, close MATLAB, delete existing .mexw64 files, open MATLAB and regenerate them.

Friday, November 8, 2024

Finding root cause of latencies

A common issue in real-time hardware-in-the-loop simulations is latency in components that run on separate computers. These latencies can significantly degrade system performance. To differentiate between delays in component operations and network latency, you can send timestamps along with the data, log the time at reception, and record both timestamps:

Since components A and B have different clocks, you cannot directly compare the send and receive times. However, you can calculate the differences separately. If diff(timeSend) == diff(timeReceive) (within a small tolerance), the latencies are likely not due to network congestion or a faulty switch/NIC but rather to delays in the operation of component A. It is highly unlikely for the time differences to be the same if there is a problem with the network; that could only happen if the network added the same constant delay to every packet.

Of course, you should perform the file saving in a separate thread to prevent blocking other operations. To minimize disk access, write to the log file periodically, for example, by flushing the log buffer to the file once per second.
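The comparison of the two logged time series can be sketched as follows. The function names and the idea of reporting the largest mismatch are assumptions for this example; timestamps are taken to be in seconds, each measured with the local clock of its own computer.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Differences between consecutive timestamps.
std::vector<double> diff(const std::vector<double>& t) {
    std::vector<double> d;
    for (std::size_t i = 1; i < t.size(); ++i) {
        d.push_back(t[i] - t[i - 1]);
    }
    return d;
}

// Largest absolute mismatch between send intervals and receive intervals.
// A value close to zero suggests the network is not the source of the delays.
double maxIntervalMismatch(const std::vector<double>& timeSend,
                           const std::vector<double>& timeReceive) {
    const std::vector<double> ds = diff(timeSend);
    const std::vector<double> dr = diff(timeReceive);
    const std::size_t n = std::min(ds.size(), dr.size());
    double worst = 0.0;
    for (std::size_t i = 0; i < n; ++i) {
        worst = std::max(worst, std::fabs(ds[i] - dr[i]));
    }
    return worst;
}
```

The clock offset between the two computers cancels out because only intervals are compared. What counts as "close to zero" depends on the timer resolution and normal network jitter of your setup.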

Thursday, October 31, 2024

Handling long operations in observer chains

If you have lengthy observer notification chains where observers notify other observers, making the trigger order unpredictable, and these chains include time-consuming operations like updating a map drawing, you can use the following approach to only update the drawing when the last observer in the chain is reached:
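The original listing is not reproduced here. One possible way to get this behavior, which may differ from the approach the post had in mind, is a nesting counter: every notification handler opens a scope, observers only request the expensive update, and the update runs once when the outermost scope closes. The class and method names below are hypothetical, and the sketch assumes single-threaded notification.

```cpp
#include <functional>
#include <utility>

// Defers an expensive action (e.g. redrawing a map) until the outermost
// notification in a chain has finished.
class DeferredUpdate {
public:
    explicit DeferredUpdate(std::function<void()> action)
        : action_(std::move(action)) {}

    // RAII scope: create one at the start of every notification handler.
    class Scope {
    public:
        explicit Scope(DeferredUpdate& owner) : owner_(owner) { ++owner_.depth_; }
        ~Scope() {
            // Run the action only when the outermost scope ends and an
            // update was requested somewhere in the chain.
            if (--owner_.depth_ == 0 && owner_.pending_) {
                owner_.pending_ = false;
                owner_.action_();
            }
        }
        Scope(const Scope&) = delete;
        Scope& operator=(const Scope&) = delete;

    private:
        DeferredUpdate& owner_;
    };

    // Observers call this instead of performing the expensive action directly.
    void request() { pending_ = true; }

private:
    std::function<void()> action_;
    int  depth_   = 0;
    bool pending_ = false;
};
```

With this, the order in which observers fire no longer matters: however many of them call `request()`, the drawing is updated once per chain. Since the action runs in a destructor, it should not throw.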

Thursday, October 24, 2024

ChatGPT 4o vs o1-preview

I normally use ChatGPT 4o because it is much faster than o1-preview. Today I asked 4o the following:

Write a function that performs the following 2 byte hex to 2 byte signed int transformations:
0x801D --> -29
0x811D --> -285
0x821D --> -541
0xFF1D --> -32541
0x001D --> 29
0xAA1D --> -10781
0x101D --> 4125

It wrote a function that resulted in the following, mostly wrong, output:

-32739
-32483
-32227
-227
29
-21987
4125

I fed this output back, and it apologized and rewrote something slightly different, but the output was still wrong. I repeated the steps, and I got responses like 'I see the issue more clearly now' and 'Thank you for your patience,' but the output remained incorrect. When I fed the same prompt to o1-preview, it solved the problem in a single iteration. Here is the final Python function (sign-magnitude representation):

def hex_to_signed_int(N):
    isNegative = N & 0x8000 # equivalent to "N >= 0x8000"
                            # 0x8000=1000 0000 0000 0000
    if isNegative:
        magnitude = N & 0x7FFF # 0x7FFF=0111 1111 1111 1111
        return -magnitude
    else:
        return N

test_values = [0x801D, 0x811D, 0x821D, 0xFF1D, 0x001D, 0xAA1D, 0x101D]

for val in test_values:
    print(f"0x{val:04X} --> {val:05} --> {hex_to_signed_int(val)}")

Sunday, September 22, 2024

Using code from Simulink model in HIL

Steps of converting a Simulink model to code usable in Hardware-in-the-loop (HIL) simulation:
  1. Check out/pull the Simulink model from the repository to your local machine.
  2. Run Simulink model and confirm it finishes as expected. If not, inform the model maintainer and ask them to commit/push the model with correct settings to repo. 
  3. Confirm that C/C++ code can be generated from model. Sometimes an s-function build file (mexw64) exists but its source code is missing, which allows the model to run but prevents code generation.
  4. Copy code to Visual Studio and verify that you can build and run it. In some cases Simulink is more forgiving of errors such as uninitialized variables or a missing #include <cmath>, while Visual Studio refuses to build the code.
  5. Copy code to the real-time Linux PC and verify that the code can be built there too.
  6. Commit code to its own repo.
  7. Run HIL with new code and verify HIL works as expected.

Monday, September 9, 2024

Why is file hash comparison faster than byte-wise comparison

Question: Since calculating the hash of a file requires reading every byte, why is comparing hashes of two files faster than byte-wise comparison of file contents?

Answer: In hash comparison, each file's hash is computed once (by reading all its bytes), and then the two hashes, which are small fixed-size values (e.g., 256-bit or 512-bit), are compared. Comparing two hashes takes constant time, regardless of file size. In a byte-wise comparison of files with N bytes, the worst case is when the files are identical, because all N comparisons have to be made. The benefit of hashing is largest when the hash is computed once and reused, for example when one file is compared against many others or when the files are on different machines.
  • Hash comparison = reading file + 1 comparison.
  • Byte-wise comparison = reading file + N comparisons.
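The idea can be sketched with a simple non-cryptographic hash. FNV-1a is used below only because it fits in a few lines; it is a stand-in, not a recommendation, and real tools typically use something like SHA-256. To keep the sketch self-contained, file contents are represented as in-memory strings.

```cpp
#include <cstdint>
#include <string>

// 64-bit FNV-1a hash of a byte sequence (non-cryptographic).
std::uint64_t fnv1a64(const std::string& bytes) {
    std::uint64_t hash = 14695981039346656037ULL;  // FNV offset basis
    for (unsigned char c : bytes) {
        hash ^= c;
        hash *= 1099511628211ULL;                  // FNV prime
    }
    return hash;
}

// Compare two contents through their precomputed hashes.
// Different hashes prove the contents differ; equal hashes only make
// equality very likely, since collisions are possible.
bool probablySame(std::uint64_t hashA, std::uint64_t hashB) {
    return hashA == hashB;
}
```

Each hash is computed with one pass over the data and can then be stored. Every later comparison against that file is a single integer comparison, no matter how large the file is.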

Tuesday, July 30, 2024

LONG_MAX is different for Windows 64 and Linux 64

When you generate code with Simulink (MATLAB R2023b) using ert.tlc, the default OS is Windows 64 (see Configuration Parameters - Hardware Implementation - Device type). When you generate C code, the <model name>_private.h file will contain checks for ULONG_MAX and LONG_MAX.

On 64-bit Windows, the long type is typically 32 bits, so LONG_MAX is 0x7FFFFFFF. On 64-bit Linux systems, the long type is typically 64 bits, i.e. LONG_MAX is 0x7FFFFFFFFFFFFFFF. When you take code generated with the Windows 64 setting and build it on a Linux 64 OS, the check in <model name>_private.h will fail. The solution is to use the Linux 64 setting in Simulink, which removes the LONG_MAX check from the header file.
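To see which case applies on a given build machine, a small check like the following can help. It is only a sketch: the constant and function names are made up for this example, and it does not reproduce the checks that Simulink writes into the generated header.

```cpp
#include <climits>
#include <string>

// True where long is 32 bits (e.g. 64-bit Windows, the LLP64 data model).
constexpr bool kLongIs32Bit = (LONG_MAX == 0x7FFFFFFFLL);

// True where long is 64 bits (e.g. 64-bit Linux, the LP64 data model).
constexpr bool kLongIs64Bit = (LONG_MAX == 0x7FFFFFFFFFFFFFFFLL);

// Human-readable summary of the long type on the current platform.
std::string describeLong() {
    return "long has " + std::to_string(sizeof(long) * CHAR_BIT) +
           " bits, LONG_MAX = " + std::to_string(LONG_MAX);
}
```

Printing `describeLong()` on both the Windows and the Linux machine makes the mismatch visible before the generated code is moved between them.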

These checks seem to have been added after MATLAB R2022b because code generated with R2022b does not have them.

Friday, July 12, 2024

Passing JSON to ProcessBuilder

I am using one JVM to prepare inputs for a simulation in another JVM. The simulation uses a C++ DLL, and when that DLL crashes, it takes the JVM with it. Running it in a separate JVM protects the first JVM from crashing as well. I prepare the simulation inputs as a JSON string in the first JVM and pass it to the second using ProcessBuilder. However, when passing a standard JSON, ProcessBuilder strips away the double quotes, e.g., "count": 5 becomes count: 5, which results in an invalid JSON that cannot be parsed in the main(String[] args) method. The workaround is to use jsonStr.replace("\"", "\\\"") before passing jsonStr to ProcessBuilder.

Tuesday, March 26, 2024

When do you need HIL tests?

The steps to create an autonomous aircraft, from design to product, are as follows:
  1. Concept of Operation
  2. Requirements
  3. Design
  4. Ground tests
    1. Test components
    2. Test system
  5. Flight tests
  6. Deployment
  7. Maintenance/Updates
As you progress through these steps, the cost of fixing problems increases exponentially.

Consider a typical closed-loop diagram:
The "plant" consists of the airframe, actuators, and engine. The environment includes the atmosphere, aerodynamics, gravity, and electromagnetic interference.

During the design phase, you start without any hardware and simulate everything with software-in-the-loop (SIL) simulations. The advantage of SIL is that it allows you to run millions of automated tests in a short time and at low cost to verify that you don't have any logic errors in your software.

As hardware becomes available, you proceed to ground tests, transitioning more and more of your software from standard PCs to custom hardware. This slower and more costly step is called hardware-in-the-loop (HIL/HWIL) tests. HIL tests are necessary because:
  1. Your system might work in SIL but since certain bugs only manifest themselves on a particular OS - compiler - hardware configuration, you cannot be sure with just SIL tests that your software is bug-free. Note: Instead of bug-free, the term 'tolerable' might be more appropriate because, for complex software, it is statistically improbable to achieve an entirely bug-free state.
  2. Resource constraints (memory, processing power, network speed, etc.) of real hardware might differ from those in SIL which might cause a working system in SIL to fail in HIL due to missed timings etc.
  3. Electromagnetic conditions (interference, noise, etc.) might differ from those in SIL. Components that work individually in isolation might cause problems when integrated together.
  4. Although you can't test as extensively as with SIL, you can still conduct far more tests than with flight tests.

Wednesday, March 6, 2024

Visual Studio call hierarchy

With Visual Studio 2022, the call hierarchy feature has two problems:
  1. Although I am always interested in the incoming calls to a function, never the outgoing calls, I always have to click the "Calls To" icon for each level and also click the entry in Call Sites to go to the line.
  2. If at any stage there is a function pointer assignment, Visual Studio loses track (code sample from open source Blender project):

ReSharper C++ call tracking fixes these problems and makes navigating the call hierarchy a breeze and you can follow the sequence up to main:

Visual Assist X, which is $70 cheaper per license than ReSharper C++, does not have this feature, which is a deal breaker for me.

ReSharper C++ also helps with useful tips to modernize the code base. For more, see ReSharper C++ Quick Tips


Note that if instead of Visual Studio, you use CLion, it includes ReSharper features.

Thursday, February 8, 2024

NetBeans "the type of ... is erroneous"

If your Java project builds successfully but NetBeans 8.2 shows red marks in code with the message "the type of ... is erroneous", close NetBeans and delete all files inside the 8.2 cache folder. On Windows, the folder is located at %LOCALAPPDATA%/NetBeans/Cache/8.2
If this does not get rid of the red marks, close NetBeans, open the project.properties file, and delete any extra "}" at the end of the javac.classpath section, e.g. the final line in that section should not be ${reference.mylib}} but ${reference.mylib}. This usually happens when you manually edit the file.

Friday, January 19, 2024

The curse of band-aid solutions

The flexibility inherent in software development can become a curse because it allows developers to implement quick and dirty fixes without fully understanding the root cause of a problem. Suppose you are tasked with writing a factorial function, knowing that factorial(1) = 1 and factorial(2) = 2. You write a function to satisfy these conditions:
double factorial(size_t a) {
    return a;
}
Then, during live tests, you realize that the function should return 6 for an input of 3, and 24 for an input of 4. Instead of investigating the correct mathematical approach, you modify your code by adding if statements, because that is what you know:
double factorial(size_t a) {
    if (a < 3) return a;
    else if (a < 4) return a*(a-1);
    else return a*(a-1)*(a-2);
}
You add new if conditions as failed tests pile up. For such a simple case, all developers agree that this is not the way to go. However, as problems become more complex, they often lack straightforward solutions that a single-line prompt to ChatGPT can provide. Also, there is always pressure to get things done quickly, and you don't have time to get to the bottom of things. Most engineers yield under pressure, which over time leads to a growing mess, dissatisfaction, burnout and resignation.
The optimum strategy is to use a band-aid solution in the short term, make a note of it (preferably in an issue tracking system), and as soon as you get a chance, spend time on how your solution could fail and make it more robust. It is crucial to be interested in the problem rather than merely viewing it as something to be gotten rid of. You never attain perfection, you approach it asymptotically. Those who are curious and have the discipline to conduct thorough root cause analysis will become 100X engineers. Those who don't will be replaced by AI. 
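For contrast with the patched versions above, here is a sketch of what the function looks like once the definition n! = 1 × 2 × ... × n (with 0! = 1) is used directly. It keeps the double return type of the earlier snippets; the parameter is renamed to n for readability.

```cpp
#include <cstddef>

// n! = 1 * 2 * ... * n, with 0! = 1.
// The result is a double, so it overflows to infinity for n > 170.
double factorial(std::size_t n) {
    double result = 1.0;
    for (std::size_t k = 2; k <= n; ++k) {
        result *= static_cast<double>(k);
    }
    return result;
}
```

It is shorter than the version with stacked if statements and covers every input instead of only the cases that happened to be tested.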

Tuesday, January 16, 2024

Visual Studio C++ debug and release configurations

Visual Studio 2019 C++ debug configuration is more forgiving than release. Examples that a debug build allows but release build does not:
enum E {a = 0, b=1};
E r[3] = {0}; //causes "initializing: cannot convert from int to E"

double a = 0;
E d = (E) a; //causes "typecast: cannot convert from double to E"

typedef struct {
    double d[3] {0}; //causes "an in-class initializer is not allowed for a member of an anonymous union in non-class scope"
} S;

class C {
    double s[3] {0}; //causes "cannot specify explicit initializer for arrays"
};
Therefore, to make sure that the C/C++ code you generated from a Simulink model is suitable for building, build it in release configuration in your nightly tests.

Monday, January 15, 2024

Visual Studio: Include all *.c and *.cpp by default

In Eclipse CDT, implementation files (*.c, *.cpp) are included until you exclude them. In Visual Studio 2019, it is the reverse, i.e. files are excluded until you manually include them in your project. To include all files automatically and exclude only sim\ert_rtw\ert_main.c, open your *.vcxproj file with an editor, remove all <ClCompile Include = .../> lines and use the following:
<ItemGroup>
    <ClCompile Include="**\*.cpp" />
    <ClCompile Include="**\*.c" Exclude="sim\ert_rtw\ert_main.c" />
</ItemGroup>