Logging and Tracing Solutions
In many systems, tracing events recorded in log files are the main source of information about system operation.
The Elbrus team has proven experience in the design and implementation of system tracing.
- To improve observability of the system
- To replace printf for debugging, integration and testing purposes (to avoid multiple code changes and recompilations)
To provide a solid basis for automatic testing and diagnosis tools:
- postmortem analysis of the system behavior
- timing analysis for performance optimization
- log based visualization tools
- log based replay tools
- Should provide the level of detail necessary to analyze system behavior without needing to reproduce a problem
- Flexible control of verbosity
- Fixed format to support automatic analysis
- Timestamps with the highest possible time resolution
- Buffered and unbuffered disk operation modes
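The requirements above can be illustrated with a minimal sketch. It is not the Elbrus implementation; the names `Verbosity`, `format_record`, and `Tracer`, the `|` separator, and the field order are all assumptions made for illustration. It shows run-time verbosity control, a fixed one-line record format, a nanosecond timestamp, and a switch between buffered and unbuffered output.

```python
import time
from enum import IntEnum


class Verbosity(IntEnum):
    """Hypothetical verbosity levels; higher values are more detailed."""
    ERROR = 0
    INFO = 1
    DEBUG = 2


def format_record(timestamp_ns: int, level: Verbosity, subsystem: str, message: str) -> str:
    """Build one fixed-format line: timestamp|level|subsystem|message."""
    # Keep every record on one line and keep the separator unambiguous,
    # so that automatic analysis tools can split on "|" safely.
    safe = message.replace("\n", " ").replace("|", "/")
    return f"{timestamp_ns:020d}|{level.name}|{subsystem}|{safe}"


class Tracer:
    """Writes trace records to any sink that has write() and flush()."""

    def __init__(self, sink, verbosity=Verbosity.INFO, buffered=True):
        self.sink = sink
        self.verbosity = verbosity  # may be changed at run time, no recompilation
        self.buffered = buffered

    def trace(self, level: Verbosity, subsystem: str, message: str) -> bool:
        """Write the record if the current verbosity allows it."""
        if level > self.verbosity:
            return False  # filtered out by the current verbosity setting
        line = format_record(time.time_ns(), level, subsystem, message)
        self.sink.write(line + "\n")
        if not self.buffered:
            self.sink.flush()  # unbuffered mode: push every record out at once
        return True
```

In this sketch, unbuffered mode trades throughput for a better chance that the last records before a crash reach the disk. The actual timestamp resolution depends on the platform clock.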
Log file management system:
- disk space checking
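Disk space checking could look roughly like the sketch below. The function names and the default 100 MiB reserve are hypothetical; `shutil.disk_usage` is the standard Python call that reports the free space on the volume holding a path.

```python
import shutil


def has_enough_space(free_bytes: int, required_bytes: int, reserve_bytes: int = 0) -> bool:
    """Return True if writing required_bytes still leaves reserve_bytes free."""
    return free_bytes - required_bytes >= reserve_bytes


def can_write_log(log_dir: str, required_bytes: int,
                  reserve_bytes: int = 100 * 1024 * 1024) -> bool:
    """Check the volume that holds log_dir before opening or growing a log file."""
    free = shutil.disk_usage(log_dir).free
    return has_enough_space(free, required_bytes, reserve_bytes)
```

Keeping the decision in a separate pure function makes the policy easy to test without touching a real disk.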
Event Types for Tracing:
- Configuration events and parameters
- Initialization events
- External System I/O
Internal Communication events:
- system wide messages related to external I/O
- system level messages
- subsystem level messages
- periodic messages
- Events of main data objects and state machines
- Internal process events
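One way to make these event types usable by automatic tools is to give each a short tag that appears in every record. The sketch below assumes hypothetical tag strings and a hypothetical `filter_events` helper; the source does not name any.

```python
from enum import Enum


class EventType(Enum):
    """Hypothetical tags for the event types listed above."""
    CONFIGURATION = "CFG"
    INITIALIZATION = "INIT"
    EXTERNAL_IO = "EXT_IO"
    INTERNAL_COMM = "COMM"
    OBJECT_STATE = "STATE"
    INTERNAL_PROCESS = "PROC"


def filter_events(records, wanted):
    """Return the (event_type, text) pairs whose type is in the wanted set."""
    return [record for record in records if record[0] in wanted]
```

A viewer could use such a filter to show, for example, only external I/O and state machine events while a problem is being localized.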
There is a trade-off between the logging verbosity necessary for problem localization, the time period covered by a log file, and the log file size.
The following resources limit the possible log file size:
- Required disk space
- RAM usage during nominal operation
- Viewer Performance
Disk space and RAM limitations are less significant with current disk and memory capacities.
Log Viewer performance becomes a critical element in the effectiveness of a tracing system.
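The trade-off can be made concrete with simple arithmetic: log size is roughly event rate times average record size times the covered time period. The sketch below uses hypothetical function names and assumes a constant event rate and record size, which real systems only approximate.

```python
def log_size_bytes(events_per_second: float, bytes_per_event: float, seconds: float) -> float:
    """Estimate the log size for a given event rate, record size, and period."""
    return events_per_second * bytes_per_event * seconds


def covered_seconds(max_bytes: float, events_per_second: float, bytes_per_event: float) -> float:
    """Estimate how long a period fits into a log file of at most max_bytes."""
    return max_bytes / (events_per_second * bytes_per_event)
```

For example, with a 1,000,000,000-byte limit, 1,000 events per second, and 100 bytes per event, `covered_seconds` gives 10,000 seconds, or under three hours. Doubling the verbosity halves the covered period for the same file size.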
Applications and integration with other tools
Log files provide a significant part of the information about system behavior. The tracing subsystem is one of the most important parts of the infrastructure and is used by both nominal software and testing tools.
Log files are an integral part of the problem reports generated by the Bugrep tool.
System Monitor analyzes system behavior with Log Analyzer modules.
Bugview analyzes and displays the different logs from a problem report.
Access to log files on systems in the field is the most basic feature of Remote Service and predictive diagnostics.