Troubleshooting and Postmortems
This document captures recurring failures observed during cross-OS packaging and VM validation, plus the implemented fixes.
Diagram source: triage-decision-tree.mmd
1) GLIBCXX_* not found when calling system binaries from packaged app
Symptoms
- apt-get/oscap failures with errors like: libstdc++.so.6: version GLIBCXX_3.4.29 not found
Root cause
The packaged runtime (AppImage/PyInstaller) leaked LD_LIBRARY_PATH into subprocesses, so host binaries loaded the bundled, incompatible libraries instead of the host's own.
Fix
- Added get_clean_env() in core/utils.py
- Routed subprocess execution through sanitized env in ToolManager and wrappers
Validation
System package commands and scanner binaries run with host libraries instead of packaged library overrides.
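The sanitization step can be sketched as follows. This is a minimal illustration, not the project's actual get_clean_env(): it relies on PyInstaller's documented behavior of saving the pre-launch value in LD_LIBRARY_PATH_ORIG, and assumes that dropping the variable entirely is acceptable when no saved value exists (for example under AppImage). The base_env parameter and the run_host_command helper are hypothetical additions for illustration.

```python
import os
import subprocess


def get_clean_env(base_env=None):
    """Return a copy of the environment without bundled library overrides."""
    env = dict(os.environ if base_env is None else base_env)
    # PyInstaller keeps the value the user had before launch in *_ORIG.
    original = env.pop("LD_LIBRARY_PATH_ORIG", None)
    if original:
        env["LD_LIBRARY_PATH"] = original
    else:
        # Assumption: no saved value means the host had none set.
        env.pop("LD_LIBRARY_PATH", None)
    return env


def run_host_command(argv):
    """Run a host binary with the sanitized environment (hypothetical helper)."""
    return subprocess.run(
        argv, env=get_clean_env(), capture_output=True, text=True, check=False
    )
```

Passing env= explicitly on every subprocess call keeps the sanitization visible at the call site, instead of mutating os.environ for the whole packaged process.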
2) Server start failures in packaged builds due to missing frontend assets
Symptoms
- server start --detach fails in VM runner
- web routes return frontend missing errors
Root cause
PyInstaller bundle omitted frontend assets.
Fix
- Build workflow includes --add-data "frontend:frontend" (Linux) and equivalent Windows data embedding.
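A startup check along these lines can surface a missing bundle before the first web request fails. It is a sketch under assumptions: frontend_dir and check_frontend are hypothetical names, and the only PyInstaller behavior relied on is that frozen apps set sys.frozen and expose the unpacked data directory as sys._MEIPASS.

```python
import sys
from pathlib import Path


def frontend_dir(source_root):
    """Locate the frontend directory in frozen and unfrozen runs."""
    # In a PyInstaller bundle, --add-data targets are unpacked under _MEIPASS.
    if getattr(sys, "frozen", False) and hasattr(sys, "_MEIPASS"):
        base = Path(sys._MEIPASS)
    else:
        base = Path(source_root)
    return base / "frontend"


def check_frontend(source_root):
    """Fail early with a clear message if the assets were not bundled."""
    path = frontend_dir(source_root)
    if not path.is_dir():
        raise FileNotFoundError(f"frontend assets not found at {path}")
    return path
```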
3) OpenSCAP package name drift across distributions/versions
Symptoms
- Install failures on Debian/Ubuntu variants because package names differ.
Root cause
Single static package list is insufficient for mixed distro/version targets.
Fix
- Added PACKAGE_MAPPINGS with distro+version keys
- Added fallback by distro and defaults
- Added unsupported guards where packages are unavailable (Debian 11 OpenSCAP)
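The lookup order described above (distro+version, then distro, then defaults, with an unsupported guard in front) might look like the sketch below. The key format, the package lists, and the names UNSUPPORTED, UnsupportedTarget, and resolve_packages are illustrative assumptions, not the project's real data or API.

```python
# Illustrative data only: keys and package lists are assumptions.
PACKAGE_MAPPINGS = {
    "openscap": {
        "debian:12": ["openscap-scanner", "openscap-utils"],
        "debian": ["openscap-scanner"],
        "default": ["openscap-scanner"],
    }
}

# Targets where the packages are known to be unavailable.
UNSUPPORTED = {("openscap", "debian", "11")}


class UnsupportedTarget(Exception):
    """Raised when a tool cannot be installed on the given distro/version."""


def resolve_packages(tool, distro, version):
    """Resolve package names: exact distro+version, then distro, then default."""
    if (tool, distro, version) in UNSUPPORTED:
        raise UnsupportedTarget(f"{tool} is unavailable on {distro} {version}")
    mapping = PACKAGE_MAPPINGS[tool]
    for key in (f"{distro}:{version}", distro, "default"):
        if key in mapping:
            return list(mapping[key])
    raise KeyError(f"no package mapping for {tool} on {distro} {version}")
```

Raising a dedicated exception for unsupported targets lets callers report a skip rather than an install failure.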
4) Debian 11 OpenSCAP unavailability
Findings
- Confirmed from Debian package search and in-VM apt-cache checks:
  - no usable OpenSCAP packages in default Debian 11 repos
External evidence used during triage:
- Debian package search (bullseye): no openscap packages returned
- Debian package search (bookworm): openscap-utils/openscap-scanner available
Resolution
- Removed Debian 11 from active VM matrix
- Added unsupported-version skip behavior for OpenSCAP on Debian 11
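The in-VM apt-cache check can be automated roughly as below. This is a sketch, not the check used during triage: it assumes the usual `apt-cache policy` layout, where a known package prints a "Candidate:" line and an unknown package prints nothing on stdout. The function names are hypothetical.

```python
import os
import subprocess


def has_install_candidate(policy_output):
    """Return True if `apt-cache policy` output lists a real candidate version."""
    for line in policy_output.splitlines():
        stripped = line.strip()
        if stripped.startswith("Candidate:"):
            return stripped.split(":", 1)[1].strip() != "(none)"
    # Empty output: apt does not know the package at all.
    return False


def apt_package_available(package):
    """Query apt-cache on the current host (hypothetical helper)."""
    result = subprocess.run(
        ["apt-cache", "policy", package],
        capture_output=True,
        text=True,
        check=False,
        # Force untranslated output so the "Candidate:" label is stable.
        env={**os.environ, "LC_ALL": "C"},
    )
    return has_install_candidate(result.stdout)
```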
5) OpenSUSE and package manager coverage
Symptoms
- Tool install paths skipped on SUSE before zypper support.
Fix
- Added zypper package manager path in ToolManager
- Added install_pkgs_zypper in tool config
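Package manager selection with a zypper path can be sketched as follows. The structure and names here (INSTALL_COMMANDS, detect_package_manager, build_install_command) are assumptions for illustration and do not reflect how ToolManager is actually organized; the command-line flags themselves are standard for each tool.

```python
import shutil

# Non-interactive install invocations, checked in this order.
INSTALL_COMMANDS = {
    "apt-get": ["apt-get", "install", "-y"],
    "dnf": ["dnf", "install", "-y"],
    "zypper": ["zypper", "--non-interactive", "install"],
}


def detect_package_manager(which=shutil.which):
    """Return the first supported package manager found on PATH, or None."""
    for name in INSTALL_COMMANDS:
        if which(name):
            return name
    return None


def build_install_command(packages, which=shutil.which):
    """Build the argv for installing packages with the detected manager."""
    manager = detect_package_manager(which)
    if manager is None:
        raise RuntimeError("no supported package manager found")
    return INSTALL_COMMANDS[manager] + list(packages)
```

The which parameter exists only so the detection can be exercised without the real binaries installed.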
6) AlmaLinux installation instability under low memory
Symptoms
- Package operations were terminated or behaved inconsistently under constrained memory.
Fix
- Increased AlmaLinux VM memory in Vagrantfile to 2048 MB.
7) Windows CIS-CAT not running due to Java prerequisite
Symptoms
- CIS-CAT install/scan path incomplete on Windows.
Root cause
Java runtime not always available by default.
Fix
- Shifted Java prerequisite handling into ToolManager._install_windows() for ciscat.
- Added runtime detection and installer fallbacks (winget/choco).
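The detection-then-fallback order can be sketched like this. It is not the body of ToolManager._install_windows(): the function name and parameters are hypothetical, and the package identifiers are deliberately left to the caller because winget and Chocolatey use different names for the same runtime.

```python
import shutil


def java_install_command(winget_id, choco_id, which=shutil.which):
    """Return None if Java is already on PATH, else an installer argv to run.

    winget_id and choco_id are caller-supplied package identifiers.
    """
    if which("java"):
        return None
    if which("winget"):
        return ["winget", "install", "--id", winget_id, "--exact", "--silent"]
    if which("choco"):
        return ["choco", "install", choco_id, "-y"]
    raise RuntimeError(
        "Java runtime missing and neither winget nor choco is available"
    )
```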
8) USG false-failure noise on non-Ubuntu systems
Symptoms
- VM tests reported expected non-support as failures.
Fix
- Added distro support constraints (supported_distros) for usg
- Skip unsupported distro installs cleanly
- Keep explicit note that Ubuntu Pro entitlement may still be required
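A clean skip can be modeled by returning a decision instead of attempting the install and letting it fail. In this sketch only the supported_distros key comes from the fix above; the install_decision function, the config shape, and the returned tuple are assumptions.

```python
def install_decision(tool_name, tool_config, distro):
    """Decide whether to install a tool or skip it as unsupported.

    A missing supported_distros key is treated as "supported everywhere".
    """
    supported = tool_config.get("supported_distros")
    if supported is not None and distro not in supported:
        return ("skip", f"{tool_name} is not supported on {distro}")
    return ("install", "")


# Hypothetical config entry for usg.
usg_config = {"supported_distros": ["ubuntu"]}
decision, reason = install_decision("usg", usg_config, "almalinux")
```

Test reports can then count a "skip" separately from a failure, which removes the false-failure noise.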
9) VM artifact folder accidentally tracked in git
Symptoms
- Large generated VM artifacts polluted git history.
Fix
- Added vms/ to .gitignore
- Removed tracked artifacts and rewrote affected commit/tag state
10) Practical triage playbook
- Check per-VM log tails first.
- Separate environment failures (box download, OOM, ports) from code failures.
- Reproduce in single VM with manual install command.
- Patch installer config (tools_config) and installer logic (tool_manager) together.
- Re-run targeted VM tests before full concurrent suite.
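The first two playbook steps (read the log tail, then separate environment failures from code failures) can be partly automated. The sketch below is illustrative only: the regular expressions are guesses at typical OOM, port, and download messages rather than signatures taken from this project's logs, and the function names are hypothetical.

```python
import re

# Assumed signatures for environment-level failures; tune to real logs.
ENVIRONMENT_PATTERNS = {
    "oom": re.compile(r"out of memory|oom-kill|cannot allocate memory", re.IGNORECASE),
    "port": re.compile(r"address already in use|port collision", re.IGNORECASE),
    "box_download": re.compile(
        r"error while downloading|could not resolve host", re.IGNORECASE
    ),
}


def tail(text, lines=40):
    """Return the last `lines` lines of a log as a single string."""
    return "\n".join(text.splitlines()[-lines:])


def classify_failure(log_text):
    """Label a failing VM log as an environment failure or a code failure."""
    snippet = tail(log_text)
    for label, pattern in ENVIRONMENT_PATTERNS.items():
        if pattern.search(snippet):
            return f"environment:{label}"
    # Nothing environmental matched, so treat it as a code failure to reproduce.
    return "code"
```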
11) File path changes to be aware of during troubleshooting
When investigating issues, note these structural changes:
- CLI: cli.py has been replaced by the cli/ package (8 modules). Stack traces will reference cli/<module>.py instead of cli.py.
- Reporter: report_gen.py has been deleted and merged into reporter.py. The parsing logic now lives in core/parsers/ (a package with per-format parser modules). The slim orchestrator remains in reporter.py.
- CIS-CAT Linux: CIS-CAT execution logic was extracted from linux.py to wrappers/ciscat_linux.py.
- New core modules: core/deps.py (dependency checks), core/constants.py (shared constants), core/os_detect.py (OS detection), core/ws_patch.py (WebSocket patching).
- Database: database.py now uses lazy initialization. The scans table includes a completed_at column. New helpers: delete_scan, pagination support.
- Logging: scan_logger.py now writes per-tool log files. logger.py supports LOG_FORMAT=json.
- Docker: Single Dockerfile (uv multi-stage); Dockerfile.rocky removed.
- setup.py: Deleted. Use pyproject.toml only.
Postmortem policy going forward
For each new issue, capture:
- Signature (exact error snippets)
- Scope (which OS/version/artifact)
- Root cause class (packaging/runtime/config/repo/network)
- Patch summary (files + behavior)
- Validation proof (command + log evidence)
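If postmortems are ever captured in a structured form, the five fields above map naturally onto a small record type. This is a hypothetical sketch, not an existing module in the project; the Postmortem class and its validation rules are assumptions.

```python
from dataclasses import dataclass

ROOT_CAUSE_CLASSES = ("packaging", "runtime", "config", "repo", "network")


@dataclass
class Postmortem:
    signature: str         # exact error snippets
    scope: str             # which OS/version/artifact
    root_cause_class: str  # one of ROOT_CAUSE_CLASSES
    patch_summary: str     # files + behavior
    validation_proof: str  # command + log evidence

    def __post_init__(self):
        # Reject unknown root cause classes and empty fields up front.
        if self.root_cause_class not in ROOT_CAUSE_CLASSES:
            raise ValueError(f"unknown root cause class: {self.root_cause_class}")
        missing = [name for name, value in vars(self).items() if not value.strip()]
        if missing:
            raise ValueError(f"missing fields: {', '.join(missing)}")


# Example entry based on issue 1 in this document.
entry = Postmortem(
    signature="libstdc++.so.6: version GLIBCXX_3.4.29 not found",
    scope="AppImage/PyInstaller builds calling apt-get or oscap",
    root_cause_class="packaging",
    patch_summary="core/utils.py: get_clean_env(); subprocesses use sanitized env",
    validation_proof="system package commands run with host libraries",
)
```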