vllm.utils.lite_profiler.lite_profiler ¶
Minimal helpers for opt-in lightweight timing collection.
LiteScope ¶
Lightweight context manager for timing code blocks with minimal overhead.
This class provides a simple way to measure and log the execution time of code blocks using Python's context manager protocol (the with statement). It is designed for high-frequency profiling with minimal performance impact.
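The implementation is not reproduced here, but a minimal sketch of such a timing context manager, assuming a scope-name constructor argument and a plain print for output (the real class presumably hands the measurement to the module's log writer), could look like this:

```python
from __future__ import annotations

import time
from types import TracebackType


class LiteScope:
    """Sketch of a lightweight timing scope; names and output are assumptions."""

    def __init__(self, name: str) -> None:
        self._name = name
        self._start = 0.0

    def __enter__(self) -> "LiteScope":
        # Use a monotonic, high-resolution clock so the measurement is not
        # affected by wall-clock adjustments.
        self._start = time.perf_counter()
        return self

    def __exit__(
        self,
        exc_type: type[BaseException] | None,
        exc_value: BaseException | None,
        traceback: TracebackType | None,
    ) -> None:
        # Compute the elapsed time and report it; the real implementation
        # would forward this to the profiler's log writer instead of print.
        elapsed_ms = (time.perf_counter() - self._start) * 1000.0
        print(f"{self._name}: {elapsed_ms:.3f} ms")


# Usage: wrap the block whose duration should be measured.
with LiteScope("toy_workload"):
    sum(range(1_000_000))
```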
__enter__ ¶
__exit__ ¶
__exit__(
exc_type: type[BaseException] | None,
exc_value: BaseException | None,
traceback: TracebackType | None,
) -> None
_should_log_results ¶
_should_log_results() -> bool
Check if the current process should log results. In multi-process deployments, only the data-parallel rank 0 engine core and worker 0 should emit logs, so that identical timing data is not written multiple times.
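The rank check itself is not shown here; a minimal sketch of this kind of gate, assuming the ranks are exposed through hypothetical environment variables (the actual implementation reads vLLM's own parallel/process state), might look like this:

```python
import os


def _should_log_results() -> bool:
    # The variable names below are illustrative assumptions, not vLLM's API.
    dp_rank = int(os.environ.get("DP_RANK", "0"))
    worker_rank = int(os.environ.get("WORKER_RANK", "0"))
    # Only one process per deployment emits timing logs to avoid duplicates.
    return dp_rank == 0 and worker_rank == 0
```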
_write_log_entry ¶
Write a profiler entry using a cached file handle for optimal performance.
This function implements an efficient caching approach where the file handle is opened once and reused for all subsequent writes. This eliminates the significant overhead of opening/closing files for every profiler entry, which is crucial for maintaining the lightweight nature of the profiler.
The cached file handle is automatically closed on program exit via atexit.
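A minimal sketch of this cached-handle pattern, assuming a hypothetical environment variable for the log path and a simple CSV-style entry format (not the module's actual names or format):

```python
from __future__ import annotations

import atexit
import os
from typing import IO, Optional

_log_file: Optional[IO[str]] = None


def _write_log_entry(name: str, elapsed_ms: float) -> None:
    global _log_file
    if _log_file is None:
        # Open the log file lazily on the first write and keep it open;
        # the path and entry format here are assumptions for illustration.
        path = os.environ.get("LITE_PROFILER_LOG", "lite_profiler.log")
        _log_file = open(path, "a", encoding="utf-8")
        # Close (and flush) the cached handle automatically on program exit.
        atexit.register(_log_file.close)
    _log_file.write(f"{name},{elapsed_ms:.3f}\n")
```

Keeping the handle open avoids an open/close pair per entry, which matters when entries are written at high frequency.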
maybe_emit_lite_profiler_report ¶
Generate and display a summary report of profiling data if available.
This function serves as the main entry point for analyzing and displaying profiling results. It checks if profiling was enabled and a log file exists, then delegates to the lite_profiler_report module to generate statistics like function call counts, timing distributions, and performance insights.
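A minimal sketch of this entry point, assuming a hypothetical environment variable for the log path and a summarize() helper in the report module (the actual interface may differ):

```python
import os


def maybe_emit_lite_profiler_report() -> None:
    # Assumed opt-in switch: only report if profiling wrote a log file.
    path = os.environ.get("LITE_PROFILER_LOG")
    if not path or not os.path.exists(path):
        return
    # Delegate aggregation and display to the report module; the function
    # name used here is an assumption for illustration.
    from vllm.utils.lite_profiler import lite_profiler_report

    lite_profiler_report.summarize(path)
```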