Python Symbolization
eBPF symbol collection
Python symbols are stored in a BPF_MAP_TYPE_LRU_HASH map called python_symbols. The map is filled by an eBPF program during the stack unwinding process.
python_symbols maps each symbol ID to the function name and filename of that symbol. The symbol ID is a (code_object_address, pid, co_firstlineno) tuple, which uniquely identifies a Python symbol within the system.
The Python stack is passed to user space as an array of Python symbol IDs.
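The data model described above can be illustrated with a small user-space model. This is only a sketch: the real map lives in the kernel and is filled by the eBPF program, and the names PythonSymbolKey, PythonSymbol, and record_frame, as well as the field names function_name and file_name, are assumptions made for illustration rather than the actual definitions.

```python
from typing import Dict, List, NamedTuple


class PythonSymbolKey(NamedTuple):
    """Symbol ID: (code_object_address, pid, co_firstlineno)."""
    code_object_address: int
    pid: int
    co_firstlineno: int


class PythonSymbol(NamedTuple):
    """Value stored per symbol ID (field names are hypothetical)."""
    function_name: str
    file_name: str


# Plain dict standing in for the python_symbols BPF map.
# Unlike BPF_MAP_TYPE_LRU_HASH, it never evicts entries.
python_symbols: Dict[PythonSymbolKey, PythonSymbol] = {}


def record_frame(stack: List[PythonSymbolKey], key: PythonSymbolKey,
                 function_name: str, file_name: str) -> None:
    """Model of what happens for one frame during unwinding:
    store the symbol if it is new, then append its ID to the stack."""
    python_symbols.setdefault(key, PythonSymbol(function_name, file_name))
    stack.append(key)
```

Including pid in the key keeps two processes that happen to place a code object at the same address from sharing an entry; only the IDs, not the strings, travel with each sample.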
User space symbolization
Upon receiving a Python sample from the perf buffer, user space converts the Python symbol IDs to function names and filenames by looking them up in the python_symbols BPF map. A user-space cache sits in front of the map, so repeated IDs do not pay the cost of a map lookup syscall.
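A minimal sketch of that cached lookup follows. It is not the actual implementation: SymbolCache, its methods, and the UNKNOWN placeholder are hypothetical names, and map_lookup stands in for whatever call performs the real BPF map lookup. The sketch also assumes that a lookup can miss, for example because the LRU map evicted the entry before user space read it.

```python
from typing import Callable, Dict, List, Optional, Tuple

# (code_object_address, pid, co_firstlineno)
SymbolId = Tuple[int, int, int]
# (function name, filename)
Symbol = Tuple[str, str]

# Placeholder returned when the map has no entry (hypothetical).
UNKNOWN: Symbol = ("<unknown>", "<unknown>")


class SymbolCache:
    """User-space cache in front of the python_symbols map."""

    def __init__(self, map_lookup: Callable[[SymbolId], Optional[Symbol]]):
        # map_lookup stands in for the syscall-backed BPF map lookup.
        self._map_lookup = map_lookup
        self._cache: Dict[SymbolId, Symbol] = {}
        self.map_lookups = 0  # counts calls that reach the map

    def resolve(self, symbol_id: SymbolId) -> Symbol:
        cached = self._cache.get(symbol_id)
        if cached is not None:
            return cached  # cache hit: the map is not consulted
        self.map_lookups += 1
        symbol = self._map_lookup(symbol_id)
        if symbol is None:
            # Misses are not cached, so a later sample can retry.
            return UNKNOWN
        self._cache[symbol_id] = symbol
        return symbol

    def symbolize(self, stack: List[SymbolId]) -> List[Symbol]:
        """Convert a stack of symbol IDs to (function, filename) pairs."""
        return [self.resolve(symbol_id) for symbol_id in stack]
```

For example, with a dict standing in for the map, `SymbolCache(fake_map.get).symbolize(stack)` reaches the map once per distinct ID that resolves. A real cache would likely also need a size bound and a way to drop entries for exited processes, which this sketch leaves out.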