**Look into Palanteer and get an omniscient view of your program** Improve software code quality - C++ and Python

@@ Overview

@@ Getting started

@@ Base concepts

@@ C++ Instrumentation API

@@ C++ Instrumentation configuration

For boolean configuration values, the expected values are 0 (disabled) or 1 (enabled). ## Configuration strategies The `Palanteer` configuration is defined inside the `palanteer.h` file as a list of macro constants with default values that can be overriden with compilation flags. Three types of configuration parameter exist: - `Palanteer` parameters that are used only [where the library is implemented](#configurationoftheimplementation) (with `#define PL_IMPLEMENTATION`) - `Palanteer` parameters that are used [by the instrumentation](#configurationoftheinstrumentation), so potentially everwhere (see note below) - Your own [groups](base_concepts.md.html#groups) declaration/configuration !!! warning For the sake of consistency, never override a parameter of the second type directly in each instrumented file.
First, it is not convenient. And if you miss one file, inconsistent configuration can lead to undefined behavior. Defining your own configuration can be done safely in several ways (non-exhaustive): 1. Directly modify the file `palanteer.h` - Safe and easy way but with potential conflict when upgrading 2. Create your own `myPalanteer.h` header which includes `palanteer.h` after defining some flags, and then use this header exclusively ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ C++ File myPalanteer.h: ------------------- // Set your configuration #define PL_EXTERNAL_STRINGS 1 #define PL_COLLECTION_BUFFER_BYTE_QTY 100000 // Define the group "DEBUG_ASSERT". // Note: configuration through the build system is possible as we check existence before setting #ifndef PL_GROUP_DEBUG_ASSERT #define PL_GROUP_DEBUG_ASSERT 0 #endif // Define the group "CONTAINER" #ifndef PL_GROUP_CONTAINER #define PL_GROUP_CONTAINER 1 #endif ... // Then include the original file #include "palanteer.h" ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 3. In addition to (1) or (2), have some files dedicated to group definition, and include one of your choice in each source file. - this strategy limits the scope of the code recompilation when changing a value for a group ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ C++ File groups3DEngine.h: ---------------------- #ifndef PL_GROUP_DEBUG_ASSERT #define PL_GROUP_DEBUG_ASSERT 0 #endif // Define the group "CONTAINER" #ifndef PL_GROUP_CONTAINER #define PL_GROUP_CONTAINER 1 #endif File myCode.cpp: ---------------- // Include your instrumentation header (or the modified palanteer.h) #include "myPalanteer.h" // Include the groups definition (can be done before or after the instrumentation header) #include "groups3DEngine.h" // Your code with instrumentation covered by the groups defined in groups3DEngine.h ... ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ !!! In short, adapt the strategy to the need.
For large projects it is recommended to use groups and at least wrap the instrumentation library as in point (2) above. The rest of this chapter describes all configuration parameters for the instrumentation. ## Configuration of the implementation "Implementing" the library means defining the flag __PL_IMPLEMENTATION__ before including it.
The library shall be implemented once and only once. ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ C++ #define PL_IMPLEMENTATION 1 #include palanteer.h" ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Some configuration parameters are applicable only in this "implementation file" (and ignored otherwise).
These parameters are listed below and start with `PL_IMPL_`. They shall be set *before* the include of the `palanteer.h` file.
They fully define the [memory usage](index.html#c++memoryusage) of the library. !!! note No worry: The default values fit most needs These configuration variables exist just in case a finer control is required on the memory, or in case of "outstanding" usage.
| Event collection | Description | Default | |-------------------------------------------------------------------------------------|----------------------------------------------------------------------|:------------:| | [PL_IMPL_AUTO_INSTRUMENT](#pl_impl_auto_instrument) | Enables the auto instrumentation feature (GCC Linux only) | 0 (disabled) | | [PL_IMPL_AUTO_INSTRUMENT_PIC](#pl_impl_auto_instrument_pic) | Enables the Position Independent Code for auto instrumentation | 1 (enabled) | | [PL_IMPL_CATCH_SIGNALS](#pl_impl_catch_signals) | Enables catching OS signals (segv, etc...) | 1 | | [PL_IMPL_COLLECTION_BUFFER_BYTE_QTY](#pl_impl_collection_buffer_byte_qty) | Buffer size for event collection (2 are needed) | 5000 KB | | [PL_IMPL_DYN_STRING_QTY](#pl_impl_dyn_string_qty) | Defines the maximum quantity of dynamic strings per collection round | 1024 | | [PL_IMPL_REMOTE_REQUEST_BUFFER_BYTE_QTY](#pl_impl_remote_request_buffer_byte_qty) | Buffer size for remote command request | 8 KB | | [PL_IMPL_REMOTE_RESPONSE_BUFFER_BYTE_QTY](#pl_impl_remote_response_buffer_byte_qty) | Buffer size for remote command response | 8 KB | | [PL_IMPL_STRING_BUFFER_BYTE_QTY](#pl_impl_string_buffer_byte_qty) | Buffer size for new string batch sending | 8 KB | | [PL_IMPL_MAX_EXPECTED_STRING_QTY](#pl_impl_max_expected_string_qty) | Expected quantity of unique string for the program under test | 4096 | | [PL_IMPL_MAX_CLI_QTY](#pl_impl_max_cli_qty) | Maximum registered CLI quantity | 128 | | [PL_IMPL_CLI_MAX_PARAM_QTY](#pl_impl_cli_max_param_qty) | Defines the maximum CLI parameter quantity | 8 | | [PL_IMPL_MANAGE_WINDOWS_SOCKET](#pl_impl_manage_windows_socket) | Enables socket initialization (Windows only) | 1 |
| Automatic events | Description | Default | | --------- | ----------- | :-----: | | [PL_IMPL_OVERLOAD_NEW_DELETE](#pl_impl_overload_new_delete) | Enables the new/delete operators overload to collect memory events | 1 | | [PL_IMPL_CONTEXT_SWITCH](#pl_impl_context_switch) | Enables the collection of OS context switches, if enough privileges | 1 |
| Crash and stacktrace | Description | Default | |-------------------------------------------------|-----------------------------------------------|:-------------------------:| | [PL_IMPL_STACKTRACE](#pl_impl_stacktrace) | Enables the stacktrace dump if a crash occurs | Windows: 1
Linux: 0 | | [PL_IMPL_CONSOLE_COLOR](#pl_impl_console_color) | Enable the color display of all console dumps | 1 |
| Platform support | Description | Default | | --------- | ----------- | :----: | | [PL_IMPL_STACKTRACE_FUNC](#pl_impl_stacktrace_func) | Function called to log the stacktrace | `built-in` | | [PL_IMPL_CRASH_EXIT_FUNC](#pl_impl_crash_exit_func) | Function called when a crash occurs | `built-in` | | [PL_IMPL_PRINT_STDERR](#pl_impl_print_stderr) | Low level error text display function | printf on stderr | ### PL_IMPL_AUTO_INSTRUMENT The automatic C++ functions instrumentation feature is based on `GCC` built-in mechanism enabled with the `-finstrument-functions` flag.
It inserts a hook at the entering and leaving of each instrumented function. The default value is "disabled": ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ C++ #define PL_IMPL_AUTO_INSTRUMENT 0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Enabling the feature requires additional steps described [here](getting_started.md.html#quickc++automaticfunctionsinstrumentation), with a discussion about the use cases. **Principles** The hooks introduced when using the `-finstrument-functions` flag provide the bare address of the called function. To get a readable name, additional work is required: - either dynamically by calling a special service (`dlfcn` library) at run time to read the debug symbol, then demangle it. - (+) this solution allows a fully integrated feature without any external step - (-) the overhead easily reaches 1 µs per event, so a factor ~100x compared to the manually instrumented version - either externally, by just storing the function address and resolving its name later, outside of the program. - (+) the overhead of this solution is minimal, reduced to the additional calls of the hooks. - (-) an additional and external step is required to resolve the function name. Only the external resolution of the function name is compatible with the "high performance" target: `Palanteer` uses this scheme.
And fortunately, this requirement is very close to the `external strings` feature and shares its resolution mechanism: - the event carries a representation of the filename and name of the scope as string hashes, which are resolved on server side - the tool `stringLookupGenerator.py` used to generate the external strings from the source code can also extract functions addresses and string names from the ELF binary (using `nm`) - the main difference is that the automatically instrumented functions have a hash-representation which varies at each build, while the external strings have a constant hash-representation for a given string content. !!! The hash of the automatically instrumented functions are XORed with the hash of the application name, in order to seamlessly support multiple stream auto instrumented.
Also, both the symbol address and the address+1 are used, the latter being the coded string containing both the filename and the line number. The last piece of the puzzle is the matching of the collected function addresses during runtime, after relocation in a virtual memory space, with the function addresses from the ELF binary.
ASLR (Address Space Layout Randomization) is enabled by default on many Linux distributions. Its aim is to protect against certain types of attacks by changing the base virtual address of programs at each run.
This base address shall thus be retrieved at runtime and substracted from the function addresses from the hook. This is done by calling the `dlsym()` function from the `dl` library, and enabled by this `PL_IMPL_AUTO_INSTRUMENT=1` flag. **Excluding some functions** Any functions used inside the low level `Palanteer` event tracing **must not** be auto-instrumented. Else an infinite call recursion is created and the program crashes due to a stack overflow.
As the library uses small inlined functions during tracing for efficiency reasons, this means that the following parts shall be excluded from the auto-instrumentation: 1. the file `palanteer.h` (because "header only" library so it contains functions during event tracing) 2. all inlined standard library functions used during tracing, directly or indirectly Thanks to the GCC flag `-finstrument-functions-exclude-file-list` excluding file by substring pattern matching, enforcing (1) is straightforward.
The (2) depends on the GCC installation paths. On a Debian 11, the exclusion parameter values `-finstrument-functions-exclude-file-list=palanteer.h,include/c++` and `-finstrument-functions-exclude-function-list=__tls_init` do the job, excluding all the standard library calls, inlined (which creates the problem) or not.
A more fine grained working exclusion set (Debian 11 at least) is: `-finstrument-functions-exclude-file-list=palanteer.h,atomic,chrono,/bits/ -finstrument-functions-exclude-function-list=atomic,thread,operator,condition_variable,__tls_init` - it uses also the `-finstrument-functions-exclude-function-list` flag which excludes function based on a substring pattern matching. - it excludes more precisely the part of the standard library used in the event tracing, letting some parts of it auto-instrumented (compared to the version (1)) - it was not easily found as any function called internally shall also be excluded, which means that the standard library structure and internals shall be taken into account. !!! The recommended exclusion set is to use the coarse parameters (and adapt them to the GCC installation if needed)
Indeed, the fine-grained exclusion method is much trickier to set properly and: - does not bring any real value as the standard library is heavily templatized and the function names get very long and are unreadable. - create an heavy tracing which affects the reliability of the performance measures. !!! tip If you quickly encounter a `segmentation fault` specifically with automatic function instrumentation, this is probably due to an incorrect exclusion list. This can be checked under a debugger by looking at the stack trace.
Beware that 2 code paths shall be considered: the event tracing itself (using `atomic`, `thread`...), and the contention resolution in case of buffer saturation which additionally uses `std::this_thread::yield()` with many sub function calls. ### PL_IMPL_AUTO_INSTRUMENT_PIC The automatic C++ functions instrumentation feature receives function address values either relative to a runtime known offset or as an absolute value.
This flag enables the removal of the dynamic offset of the address. The purpose is to have a match with the symbol addresses inside the ELF (using the UNIX 'nm' tool). On 64 bits Linux, ASLR (Address Space Layout Randomization) is enabled by default, so this flag should be enabled.
On 32 bits Linux, the function addresses are often absolute, so this flag should be disabled.
All depends on your build configuration... The default value is "enabled", which matches typical 64 bits Linux configuration: ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ C++ #define PL_IMPL_AUTO_INSTRUMENT_PIC 1 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ This flag is considered only if PL_IMPL_AUTO_INSTRUMENT is 1 !!! When this Position Independent Code flag is disabled, linking with the 'dl' library is not required. ### PL_IMPL_CATCH_SIGNALS Catching signals allows the identification of some class of issues and their location (if it is an internal signal and if the stacktrace dump is enabled). This compilation flag enables the registration of an handler to catch useful signals for debugging purpose (SIGABRT, SIGFPE, SIGILL, SIGSEGV, SIGINT and SIGTERM).
On Windows, it also registers vectored exceptions. The usual reason not to use this feature is if the program also wants to control signals. The default value is "enabled": ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ C++ #define PL_IMPL_CATCH_SIGNALS 1 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ### PL_IMPL_COLLECTION_BUFFER_BYTE_QTY As explained previously, events are collected with a [double storage bank mechanism](base_concepts.md.html#baseconcepts/c++specific/eventcollectionmechanism). The byte size of each bank buffer is defined by this constant. * Too small a size and your threads may have to busy-wait until a fresh and empty buffer is available * Too big a size and memory is wasted The default value is: ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ C++ #define PL_IMPL_COLLECTION_BUFFER_BYTE_QTY 5000000 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ !!! The size shall be at least 56 byte (internal event size) * the peak event rate / the polling frequency.
Ex: for 1 million events per second and a polling frequency of 200 Hz, the minimum buffer size is 56 * 1000000 / 200 = 280000 bytes (without margin) ### PL_IMPL_DYN_STRING_QTY [Dynamic strings](base_concepts.md.html#staticanddynamicstrings) are preallocated at initialization time for efficiency and fine control reasons.
Their maximum size and their quantity per collection cycle shall be known from the beginning. !!! Dynamic strings are more flexible than static strings but have a performance price and some constraints due to preallocation. This constant defines the maximum quantity of strings per collection cycle.
If this pool of dynamic strings is depleted during a collection cycle, the tracing thread will *busy-wait* until the pool is refilled after the collection process. Refer to the description of the [collection mechanism](base_concepts.md.html#baseconcepts/c++specific/eventcollectionmechanism) for details. The default value is: ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ C++ #define PL_IMPL_DYN_STRING_QTY 1024 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ### PL_IMPL_REMOTE_REQUEST_BUFFER_BYTE_QTY The maximum byte size of a received CLI request.
If a larger request is received, the program stops on a failed assertion. The default value is: ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ C++ #define PL_IMPL_REMOTE_REQUEST_BUFFER_BYTE_QTY 8*1024 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ### PL_IMPL_REMOTE_RESPONSE_BUFFER_BYTE_QTY The maximum byte size of a sent remote CLI response.
If a CLI response is larger than the capacity of the buffer, an error status is returned for this CLI. The default value is: ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ C++ #define PL_IMPL_REMOTE_RESPONSE_BUFFER_BYTE_QTY 8*1024 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ### PL_IMPL_STRING_BUFFER_BYTE_QTY The size of the buffer used to send unique new strings to the server.
It shall be larger enough to hold any string in the program, else the program strops on a failed assertion.
If more quantity of unique and new strings than this buffer can contain shall be sent to the server during one collection round, the sending is done in several batches. The default value is: ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ C++ #define PL_IMPL_STRING_BUFFER_BYTE_QTY 8*1024 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ### PL_IMPL_MAX_EXPECTED_STRING_QTY This value defines the initial allocation for the lookup identifying unique strings.
If not large enough during the program execution, a reallocation occurs and doubles the size of the lookup.
This parameter provides a way to avoid such additional allocation after `Palanteer` initialization, if it matters and if an upper bound on the quantity of unique strings in the program under test is known. The default value is: ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ C++ #define PL_IMPL_MAX_EXPECTED_STRING_QTY 4096 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ### PL_IMPL_MAX_CLI_QTY This constant defines the maximum quantity of registered CLI.
If this quantity is exceeded, the program stops on a failed assertion. The default value is: ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ C++ #define PL_IMPL_CLI_MAX_PARAM_QTY 128 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ### PL_IMPL_CLI_MAX_PARAM_QTY This constant defines the maximum quantity of remote CLI parameters.
If this quantity is exceeded, the program stops on a failed assertion. The default value is: ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ C++ #define PL_IMPL_CLI_MAX_PARAM_QTY 8 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ### PL_IMPL_MANAGE_WINDOWS_SOCKET This parameter is applicable only on Windows OS and has no effect under other OSes. Using sockets on Windows requires a global initialization (WSA). If your program already manages sockets by itself (initialization and uninitialization), then this flag shall be set to 0.
When enabled, `Palanteer` instrumentation library automatically handles socket management and call the required functions (namely `WSAStartup(...)` and `WSACleanup()`) The default value is: ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ C++ #define PL_IMPL_MANAGE_WINDOWS_SOCKET 1 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ### PL_IMPL_OVERLOAD_NEW_DELETE Dynamic memory allocation can be tracked automatically by overloading operator new & delete. It works in most cases but you may have to disable the overload. The alternative to track the dynamic memory allocations, in case these operators are already overloaded, is to instrument them directly. The default value is: ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ C++ #define PL_IMPL_OVERLOAD_NEW_DELETE 1 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Change it to zero to disable the overload. ### PL_IMPL_CONTEXT_SWITCH Tracking "context switches" from the operating system shows you the usage of the processor. In particular, which thread of your program is currently under execution, and if not, what the cores are doing. !!! warning Accessing these information requires elevated privileges for security reasons.
On Windows, your program shall be run as administrator, and on Linux as root (or sudo).
If required privileges are not matching, context switches will simply not be collected. This variable configures the collection or not of the context switches. If this collection is enabled, it is effective only if the privileges are high enough, as described above. The default value is: ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ C++ #define PL_IMPL_CONTEXT_SWITCH 1 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ### PL_IMPL_STACKTRACE Dumping a clear stacktrace when a crash occurs is a great debugging tool.
Ex: ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ C++ Crash without stacktrace feature enabled: ----------------------------------------- [PALANTEER] signal 11 'Segmentation fault' received Crash with stacktrace feature enabled: -------------------------------------- [PALANTEER] signal 11 'Segmentation fault' received #0 0x7FA65883C83C : (null) #1 testProgram.cpp(81) : void crashSubContractor(unsigned short, int) #2 testProgram.cpp(95) : doCrash_Please(int) #3 testProgram.cpp(248) : doSomeWork(char const*, char const*, unsigned int, int, int, bool) #4 invoke.h(60) : void std::__invoke_impl(std::__invoke_other, void (*&&)(char const*, char const*, unsigned int, int, int, bool), char const*&&, char const*&&, int&&, int&&, int&&, bool&&) #5 invoke.h(95) : std::__invoke_result::type std::__invoke(void (*&&)(char const*, char const*, unsigned int, int, int, bool), char const*&&, char const*&&, int&&, int&&, int&&, bool&&) #6 thread(244) : decltype (__invoke((_S_declval<0ul>)(), (_S_declval<1ul>)(), (_S_declval<2ul>)(), (_S_declval<3ul>)(), (_S_declval<4ul>)(), (_S_declval<5ul>)(), (_S_declval<6ul>)())) std::thread::_Invoker >::_M_invoke<0ul, 1ul, 2ul, 3ul, 4ul, 5ul, 6ul>(std::_Index_tuple<0ul, 1ul, 2ul, 3ul, 4ul, 5ul, 6ul>) #7 thread(253) : std::thread::_Invoker >::operator()() #8 thread(196) : std::thread::_State_impl > >::_M_run() #9 0x7FA658C41B2B : (null) #10 pthread_create.c(476) : start_thread #11 clone.S(93) : __clone ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Getting a stacktrace requires two base functionalities: - unwinding the call stack - decoding the program counter addresses to get the associated filename, line number, and function name, demangled. On Windows, the standard libraries provide all the required functions.
On Linux, the stacktrace requires two external dynamic libraries: - libunwind.so to unwind the call stack - installed with "sudo apt install libunwind-dev" - libdw.so, to read the DWARF debug information from the ELF binary and get the information on addresses - installed with "sudo apt install libdw-dev" If your system meets the requirements, just set `PL_IMPL_STACKTRACE` to 1 in the build system to enable the feature. !!! On Windows, the stacktrace will be directly visible only if your program is not graphical.
If it is graphical, you need to redirect the "stderr" output in a dialog box. This can be done by overriding the `PL_IMPL_PRINT_STDERR` macro. ### PL_IMPL_CONSOLE_COLOR This parameter simply enables the usage of terminal escape codes to add color into the text messages of the builtin stacktrace handler. The default value is: ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ C++ #define PL_IMPL_CONSOLE_COLOR 1 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ### PL_IMPL_STACKTRACE_FUNC Getting a stacktrace is very dependent on the Operating System.
At the moment, `Palanteer` provides builtin handlers only for Linux and Windows. This macro points to the function called to retrieve the stacktrace and dump it with both with `PL_IMPL_PRINT_STDERR` and as text in the record.
In order to implement the feature for another Operating System, this macro shall be overridden. The default value is (builtin function): ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ C++ #define PL_IMPL_STACKTRACE_FUNC() plPriv::crashLogStackTrace() ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ### PL_IMPL_CRASH_EXIT_FUNC This macro points to the last function called to terminate the program when an assertion fails or a signal is received.
Before this call and if applicable, the record is already flushed to the server and stacktraces are already dumped). The typical use case for overriding this macro is on an embedded system where "exiting" has no meaning. The default value is simply calling the system exit function: ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ C++ #define PL_IMPL_CRASH_EXIT_FUNC() exit(1) ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ### PL_IMPL_PRINT_STDERR This macro function aims at displaying error messages issued by assertions or signals. It takes a text string as input, and two boolean values: - The boolean `isCrash` tells that there will be several call to build the crash information. Else it is a lonely message. - The boolean `isLastFromCrash` tells the end of the last information for the crash. A dialog box can then be displayed. The default value is: ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ C++ #define PL_IMPL_PRINT_STDERR(msg, isCrash, isLastFromCrash) fprintf(stderr, "%s", msg) ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ On Windows OS, the display shall be redirected on a MessageBox for graphical programs, as there is no console display. ## Configuration of the instrumentation This configuration **must** be consistent in all source files using `Palanteer` (see configuration [strategies](#configurationstrategies)). | Instrumentation parameters | Description | Default | |---------------------------------------------------|---------------------------------------------------------------|:-------:| | [PL_EXTERNAL_STRINGS](#pl_external_strings) | Enables external strings | 0 | | [PL_VIRTUAL_THREADS](#pl_virtual_threads) | Enables the virtual threads (=userland threads) | 0 | | [PL_DYN_STRING_MAX_SIZE](#pl_dyn_string_max_size) | Defines the maximum size of a dynamic string | 512 B | | [PL_SHORT_STRING_HASH](#pl_short_string_hash) | Use 32 bits hash rather than 64 bits | 0 | | [PL_SIMPLE_ASSERT](#pl_simple_assert) | Disables the assertion enhancements and reverts to basic ones | 0 | | [PL_SHORT_DATE](#pl_short_date) | Use a 32 bits clock output, which implies wraps | 0 | | [PL_COMPACT_MODEL](#pl_compact_model) | Use the "compact app model" to reduce the transferred data | 0 |
| Platform support | Description | Default | | --------- | ----------- | :----: | | [PL_GET_CLOCK_TICK_FUNC](#pl_get_clock_tick_func) | Getter for the high resolution clock | builtin | | [PL_GET_SYS_THREAD_ID](#pl_get_sys_thread_id) | Getter for the current system thread identifier | builtin | ### PL_EXTERNAL_STRINGS The [external strings](index.html#externalstrings) feature" strips at compile time all static strings used by `Palanteer` so that: * the instrumentation is obfuscated (no string content in the binary) * the binary size is reduced due to this string removal The "price to pay" is the necessity to generate of a lookup *hash->strings* from the source code, and use it in the viewer to decode strings.
The default value is: ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ C++ #define PL_EXTERNAL_STRINGS 0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ !!! warning Dynamic strings are not concerned by this feature as they cannot be processed at compile time by definition. Another reason to avoid them. ### PL_VIRTUAL_THREADS The "virtual thread" feature enables the profiling of non-OS threads (userland threads), like "fibers" or in a simulated OS environment. Some hooks in the non-OS thread framework are required to leverage the non-OS thread switching notification.
This [section](instrumentation_api_cpp.md.html#virtualthreads) explains in detail how to use these hooks to make the feature effective. The default value is: ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ C++ #define PL_VIRTUAL_THREADS 0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Virtual thread support is not enabled by default due to the additional allocations and processing (minor but non zero). ### PL_DYN_STRING_MAX_SIZE [Dynamic strings](base_concepts.md.html#staticanddynamicstrings) are preallocated at initialization time for efficiency and fine control reasons.
This constant defines the maximum dynamic string size in bytes (null termination included).
Larger strings are truncated. !!! tip Important Stacktraces are sent as dynamic strings and C++ templatized names are known to be long... The default value is: ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ C++ #define PL_DYN_STRING_MAX_SIZE 512 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ### PL_SHORT_STRING_HASH All strings used by `Palanteer` are hashed.
The selected hash function is the Fowler–Noll–Vo which is a good trade-off between: * the compile time capability of C++11 (`constexpr` is pretty limited in this version) * the run-time performance for dynamic strings * the discriminating power of the hash By default, the 64 bits version is used to ensure that virtually no collision happens.
This constant forces the usage of the 32 bits version of FNV. The only 'gain' is for 32 bits systems and only when using dynamic strings or recording context switches: run-time computation speed will then be slightly improved.
Note that storage size and transmitted data are not reduced when using this flag. The default value is: ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ C++ #define PL_SHORT_STRING_HASH 0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ### PL_SIMPLE_ASSERT The [enhanced assertions feature](index.html#enhancedassertions) uses variadic templates which may increase the code size. With this compilation flag it is possible to keep the "per category compilation" feature of the `Palanteer` assertions but revert to the lighter and standard behavior of assertions by ignoring the extra context parameters. The typical usage of this feature is in constrained embedded systems where code size really matters. The default value is: ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ C++ #define PL_SIMPLE_ASSERT 0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ### PL_SHORT_DATE This compilation flag switches the instrumentation to use a 32 bits clock, and the server side to handle its wraps. The two main use cases are: - The high resolution clock is indeed 32 bits (on 32 bits architectures for instance) - The compact application model is used (see [PL_COMPACT_MODEL](#pl_compact_model) below) If set, the server manages clock wraps and reconstitutes the unique clock date.
For proper wrap handling, the wrap period shall be twice larger than the event sending period (should always be the case in theory). The default value is: ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ C++ #define PL_SHORT_DATE 0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ### PL_COMPACT_MODEL A "compact application model" is an instrumented program which transfers 50% less data with the server (12 bytes per events instead of 24).
This applies both for storage on file and for connected mode. To be compatible with the "compact model", the following constraints shall be fulfilled: - At most 65535 unique strings are used - Only 32 bits types are used - If used, `double` and 64 bits integers are truncated to their 32 bits equivalent !!! note This mode is not restricted to small embedded devices The biggest constraint is usually the restriction to 32 bits data types (which is even easier on 32 bits architectures).
Indeed, if the usage of formatted dynamic string is moderate, the limit of 65k unique strings should be reached only for large programs In this mode, the following settings are forced: - `PL_SHORT_DATE` is set to 1 (32 bits clock) - `PL_SHORT_STRING_HASH` is set to 1 (32 bits string hashes) The default value is: ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ C++ #define PL_COMPACT_MODEL 0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ !!! tip The record sizes on the server side are unaffected by this flag. Only the `.pltraw` file size and the byte quantity sent by socket are reduced. ### PL_GET_CLOCK_TICK_FUNC This macro points to the function which provides a high resolution clock stored on a uint64_t. The builtin function for Windows and Linux provide an efficient one, which is also compatible with the one of the context switch (the source of information may differ). !!! The value of the clock frequency does not matter as the clock is calibrated at initialization time. The default value is (builtin function): ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ C++ // Returned value is a uint64_t #define PL_GET_CLOCK_TICK_FUNC() plPriv::getClockTick() ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ### PL_GET_SYS_THREAD_ID This macro points to the function which returns the current system thread identifier. The default value is (built-in function): ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ C++ #if defined(__unix__) #define PL_GET_SYS_THREAD_ID() syscall(SYS_gettid) #endif #if defined(_WIN32) #define PL_GET_SYS_THREAD_ID() GetCurrentThreadId() #endif ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ## Troubleshooting **What is the recommended setting for small embedded devices?** Such devices usually use a 32 bits architecture and run software of moderate size.
Two settings have been designed with this use-case in mind: - The [PL_COMPACT_MODEL](#pl_compact_model), which reduces the data transfer on socket (or any transport layer) - The [PL_EXTERNAL_STRINGS](#pl_external_strings) feature, which zeroes the cost of using static strings in the instrumentation Note that the compact model forces the activation of [PL_SHORT_DATE](#pl_short_date) and [PL_SHORT_STRING_HASH](#pl_short_string_hash).
For very limited resource cases, the additional flag [PL_SIMPLE_ASSERT](#pl_simple_assert) can be used. The remote control can also be disabled (with `PL_NOCONTROL=1`), but be able to remotely control and test an embedded device is always interesting.
**I want to hide the events from the Palanteer threads** The events generated by the instrumentation library itself are controlled by the group `PL_VERBOSE`, disabled by using the flag `-DPL_GROUP_PL_VERBOSE=0` !!! tip On Windows, a second group `PL_VERBOSE_CS_CBK` is used in the additional thread to handle the context switch, but is disabled by default (debug only).
**What is the impact of these settings on the memory usage?** See the description in the [performance section](index.html#c++memoryusage).
**I have some troubles with sockets under Windows** First, ensure that "`#define _WINSOCKAPI_`" is set before including any windows header (see testProgram.cpp).
Then, if your program already manages the WSA, you have to tell Palanteer not to do it also, with `#define PL_IMPL_MANAGE_WINDOWS_SOCKET 0` (default is 1)
**PL_IMPL_CONTEXT_SWITCH is set to 1 but I do not see any CPU related event** The access to such data is restricted on many OS for security reasons.
Their collection implies that the program runs in privileged mode (root on Linux, administrator on Windows).
**The stacktrace is not displayed on Windows** On Windows, the stacktrace decoding requires some initialization which cannot be safely done at crash time. The following points shall be checked: - The compilation flag `USE_PL` is set to 1. - If only the stacktrace feature is desired, set all `USE_PL`, `PL_NOEVENT` and `PL_NOCONTROL` to 1 - The compilation flag `PL_IMPL_STACKTRACE` is not explicitely set to zero (it is set to the expected value 1 by default on Windows) - The function `plInitAndStart` is called - The stacktrace decoding initialization is done at this momen - If no event recording is desired, the mode `PL_MODE_INACTIVE` shall be used.
**The external string lookup tool is saying that I have "collisions"** It may happen statistically that different strings are associated to the same hash value, especially with short hashes on 32 bits.
It is called a "hash collision". In this case, the collided strings will simply be displayed with the value of the first one which has been encountered.
The string lookup generation tool checks for collision and returns a non zero status in this case, listing on stderr the colliding strings. The mechanism to use in this case is to use the configuration flag PL_HASH_SALT, initially set to 0.
By giving it another positive value, all hashes are scrambled and collision is likely to disappear.
**I have a segmentation fault quickly after launching an auto-instrumented program** Check the list of excluded functions. As explained in [PL_IMPL_AUTO_INSTRUMENT](#pl_impl_auto_instrument), it is crucial to exclude all functions that are used during the event tracing in order to avoid a deadly infinite call recursion.
**My auto-instrumented program displays a lot of hexadecimal values instead of function names** These are the value of the string hashes of names that could not be resolved by using the strings lookup.
Note that each program built may scramble the function addresses, so a new string lookup is required each time you rebuild the program. A description of how to build and use such strings lookup is present in the [getting started](getting_started.md.html#gettingstarted/quickc++automaticfunctionsinstrumentation) section

@@ Python Instrumentation API

@@ Scripting API

@@ More