Measuring the Peak Memory Usage of a Server Process with Python

2024-11-15 14:53:46

Using Python to Measure the Peak Memory Usage of a Server Process

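How can you measure the peak memory usage of a server process between startup and the moment it receives SIGINT? The question comes up often in performance analysis and resource monitoring. This article shows how to do it with the /usr/bin/time command and Python's subprocess module, and walks through the problems you may run into and how to solve them.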

`/usr/bin/time` and Python subprocesses

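/usr/bin/time is a utility that tracks a program's execution time and resource usage. With the -f %M option (GNU time), it reports the process's maximum resident set size in kilobytes. Python's subprocess module makes it easy to start and manage the child process.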
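For a command that exits on its own, the combination described above can be sketched as follows. This is a minimal illustration, not the article's final solution: the helper names `parse_peak_rss_kb` and `peak_rss_kb` are hypothetical, and the sketch assumes GNU time is installed at /usr/bin/time.

```python
import subprocess


def parse_peak_rss_kb(report):
    """Return the last purely numeric line of a `time -f %M` report, or None."""
    peak = None
    for line in report.splitlines():
        line = line.strip()
        # Status lines such as "Command exited with non-zero status 1"
        # are skipped; only a bare number is treated as the %M value.
        if line.isdigit():
            peak = int(line)
    return peak


def peak_rss_kb(command):
    """Run `command` (a list of arguments) under /usr/bin/time.

    Assumes GNU time at /usr/bin/time. The timed command's own stderr
    shares the same pipe, which is why the report has to be filtered.
    """
    completed = subprocess.run(
        ["/usr/bin/time", "-f", "%M", *command],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.PIPE,
        text=True,
    )
    return parse_peak_rss_kb(completed.stderr)
```

Calling `peak_rss_kb(["python", "-c", "pass"])` would then return the interpreter's peak RSS in KB, provided the assumptions above hold.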

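However, launching the command with subprocess.Popen and collecting its output through communicate() can fail to return any stderr output when the target process has to be stopped with a signal. /usr/bin/time writes its report to standard error, and if SIGINT is sent to the child and communicate() is called right away, the report may be lost before the buffer is flushed. Note also that Popen.send_signal() signals only the process that Popen started, which here is /usr/bin/time itself rather than the server it launched.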

Solution: monitor the output with `select`

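To work around this, use the select module to watch the child's output stream. select tells us whether a file descriptor is readable, so we can avoid blocking while we wait.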
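Before applying this to a subprocess, it may help to see how `select.poll` behaves on a plain pipe. The sketch below is only an illustration on a pipe created with `os.pipe()`; the variable names are arbitrary, and `select.poll` is not available on Windows.

```python
import os
import select

read_fd, write_fd = os.pipe()

poller = select.poll()
poller.register(read_fd, select.POLLIN)

# Nothing has been written yet, so a zero-timeout poll reports no events.
empty_events = poller.poll(0)

# Once data is in the pipe, poll reports the descriptor as readable.
os.write(write_fd, b"12345\n")
ready_events = poller.poll(1000)  # timeout is in milliseconds
data = os.read(read_fd, 1024)

# After the writer closes, a read returns b"" (end of file).
os.close(write_fd)
closed_events = poller.poll(1000)
eof = os.read(read_fd, 1024)
os.close(read_fd)
```

The same pattern applies to a child's stderr pipe: poll with a timeout, read when the descriptor is ready, and treat an empty read as the end of the stream.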

import os
import select
import signal
import subprocess


def measure_max_memory(command):
    # Run the command in its own session so SIGINT can be sent to the whole
    # process group, reaching both /usr/bin/time and the server it launched.
    # stdout is discarded so an unread pipe cannot fill up and stall the server.
    process = subprocess.Popen(
        command,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.PIPE,
        start_new_session=True,
    )
    poller = select.poll()
    poller.register(process.stderr, select.POLLIN)

    max_memory = None
    signal_sent = False
    while True:
        # Wait up to 100 ms for stderr activity instead of spinning on a zero timeout.
        if poller.poll(100):
            line = process.stderr.readline()
            if not line:
                break  # EOF: stderr has been closed, the command is done
            # /usr/bin/time may print status lines before the report; with
            # -f %M the peak RSS is assumed to be the last purely numeric line.
            try:
                max_memory = int(line.decode().strip())
            except ValueError:
                pass  # not a number, so not the %M value

        # ... Your logic to determine when to send SIGINT ...
        # As written, the signal is sent once, on the first pass through the loop.
        if not signal_sent:
            os.killpg(process.pid, signal.SIGINT)
            signal_sent = True

    process.wait()
    if max_memory is None:
        raise RuntimeError("no peak memory value found in /usr/bin/time output")
    return max_memory



if __name__ == "__main__":
    # Example: the command is a list holding the program name and its arguments.
    command = ["/usr/bin/time", "-f", "%M", "python", "./terminate-me.py"]

    try:
        max_mem = measure_max_memory(command)
        print(f"Maximum memory usage: {max_mem} KB")
    except KeyboardInterrupt:
        # Ctrl+C in the terminal interrupts this script as well.
        print("Main script interrupted by Ctrl+C")
    except (ValueError, RuntimeError):
        # Raised when no peak-memory value could be read or parsed.
        print("Memory data read and parse was interrupted")