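This is because the operating system automatically copies the current process (the parent process) into a new one (the child process) and then returns in both. In the child the call always returns 0; in the parent it returns the child's process ID.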
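The reason for this design is that one parent can fork many children, so the parent has to record the ID of each child, whereas a child only needs to call getppid to get its parent's ID.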
Python's os module wraps the common system calls, fork among them, so creating a child process from a Python program is easy.
    import os

    print('Process (%s) start...' % os.getpid())
    # Only works on Unix/Linux/Mac:
    pid = os.fork()
    if pid == 0:
        print('I am child process (%s) and my parent is %s.' % (os.getpid(), os.getppid()))
    else:
        print('I (%s) just created a child process (%s).' % (os.getpid(), pid))
Because Windows has no fork call, the code above cannot run on Windows. macOS is built on a BSD-derived kernel (BSD being a flavor of Unix), so it runs on a Mac without problems.
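The point about a parent recording each child's ID can be sketched as follows. This is a minimal illustration, not part of the original example: the function name `fork_children` and the use of the exit code to report the `getppid` check are assumptions made for the demo, and it only runs on Unix-like systems.

```python
import os


def fork_children(count):
    """Fork `count` children and return {child_pid: exit_status}."""
    parent_pid = os.getpid()
    child_pids = []
    for _ in range(count):
        pid = os.fork()
        if pid == 0:
            # Child branch: fork() returned 0 here.
            # getppid() should report the process that forked us.
            ok = os.getppid() == parent_pid
            # Leave immediately so the child never runs the parent's loop.
            os._exit(0 if ok else 1)
        # Parent branch: fork() returned the new child's ID, so record it.
        child_pids.append(pid)

    results = {}
    for pid in child_pids:
        # Wait for each recorded child and decode its exit status.
        _, status = os.waitpid(pid, 0)
        results[pid] = os.WEXITSTATUS(status)
    return results


if __name__ == '__main__':
    for pid, status in fork_children(3).items():
        print('Child %s exited with status %s' % (pid, status))
```

Each child exits with status 0 when `getppid` matches the parent's own ID, so the parent can confirm the relationship just by waiting on the IDs it saved.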
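    from multiprocessing import Process
    import os

    # Code run by the child process (reconstructed from the output shown below).
    def run_proc(name):
        print('Run child process %s (%s)...' % (name, os.getpid()))

    if __name__ == '__main__':
        print('Parent process %s.' % os.getpid())
        p = Process(target=run_proc, args=('test',))
        print('Child process will start.')
        p.start()
        p.join()
        print('Child process end.')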
    Parent process 588.
    Child process will start.
    Run child process test (19096)...
    Child process end.
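To create a child process, you only need to pass in a target function and its arguments to build a Process instance, then launch it with the start method. Creating a process this way is even simpler than using fork.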
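The same pattern extends to several children at once. The sketch below is an illustration under assumptions: the helper names `greet` and `start_workers` are hypothetical, and it reads each `Process` object's `exitcode` after `join` to confirm the child finished normally.

```python
from multiprocessing import Process
import os


def greet(name):
    # Runs inside the child process.
    print('Child %s running in process %s' % (name, os.getpid()))


def start_workers(names):
    """Start one Process per name, wait for all, return their exit codes."""
    procs = [Process(target=greet, args=(name,)) for name in names]
    for p in procs:
        p.start()
    for p in procs:
        p.join()  # block until this child has finished
    return [p.exitcode for p in procs]


if __name__ == '__main__':
    print('Exit codes:', start_workers(['a', 'b', 'c']))
```

An exit code of 0 means the target function returned without raising.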
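    from multiprocessing import Pool
    import os, time, random

    # Reconstructed from the output below; the sleep duration is an assumption.
    def long_time_task(name):
        print('Run task %s (%s)...' % (name, os.getpid()))
        start = time.time()
        time.sleep(random.random() * 3)
        end = time.time()
        print('Task %s runs %0.2f seconds.' % (name, (end - start)))

    if __name__ == '__main__':
        print('Parent process %s.' % os.getpid())
        p = Pool()
        for i in range(5):
            p.apply_async(long_time_task, args=(i,))
        print('Waiting for all subprocesses done...')
        p.close()
        p.join()
        print('All subprocesses done.')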
    Parent process 37332.
    Waiting for all subprocesses done...
    Run task 0 (21488)...
    Run task 1 (35776)...
    Run task 2 (19912)...
    Run task 3 (18472)...
    Task 0 runs 0.37 seconds.
    Run task 4 (21488)...
    Task 2 runs 0.94 seconds.
    Task 1 runs 1.30 seconds.
    Task 4 runs 0.95 seconds.
    Task 3 runs 2.75 seconds.
    All subprocesses done.
Calling join on a Pool object waits for all child processes to finish. You must call close before calling join, and once close has been called no new tasks can be added to the pool.
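The close-then-join rule can be checked with a small sketch. The names `square` and `run_pool` are hypothetical, and the `ValueError` on a late submission is what recent CPython versions raise; treat that detail as an assumption if you are on an older interpreter.

```python
from multiprocessing import Pool


def square(x):
    return x * x


def run_pool():
    """Submit tasks, close the pool, then show a late submission is rejected."""
    p = Pool(2)
    async_results = [p.apply_async(square, args=(i,)) for i in range(5)]
    p.close()  # no more tasks may be submitted after this point
    try:
        p.apply_async(square, args=(99,))
        rejected = False
    except ValueError:
        # Recent CPython versions raise ValueError for a pool that is not running.
        rejected = True
    p.join()  # wait for the worker processes to finish the queued tasks
    return [r.get() for r in async_results], rejected


if __name__ == '__main__':
    values, rejected = run_pool()
    print('Results:', values)
    print('Late submission rejected:', rejected)
```

Each `apply_async` call returns a result handle, so the return values are still available through `get` after the pool has been joined.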
By default a Pool has as many worker processes as there are CPU cores; the processes argument lets you set a different size.
    # Excerpt of the argument checks inside Pool.__init__ (not runnable on its own)
    if processes is None:
        processes = os.cpu_count() or 1
    if processes < 1:
        raise ValueError("Number of processes must be at least 1")
    if maxtasksperchild is not None:
        if not isinstance(maxtasksperchild, int) or maxtasksperchild <= 0:
            raise ValueError("maxtasksperchild must be a positive int or None")
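The effect of these checks can be observed from the outside. The sketch below uses a hypothetical helper, `try_pool`, and assumes the error message matches the excerpt above; newer CPython releases may compute the default size differently, so only the explicit `processes` argument is exercised here.

```python
from multiprocessing import Pool


def try_pool(processes):
    """Return 'ok' if a Pool of the given size can be created, else the error text."""
    try:
        # The with-block terminates the pool on exit.
        with Pool(processes=processes):
            return 'ok'
    except ValueError as exc:
        return str(exc)


if __name__ == '__main__':
    print(try_pool(2))  # an explicit, valid pool size
    print(try_pool(0))  # rejected by the "at least 1" check
```

Passing a size below 1 fails before any worker process is started, which is why the invalid case returns almost immediately.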
Controlling child processes
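After creating a child process, you sometimes also need to control its input and output. The subprocess module makes it very convenient to start a child process and then control its input and output.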
    import subprocess

    print('$ nslookup www.python.org')
    r = subprocess.call(['nslookup', 'www.python.org'])
    print('Exit code:', r)
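The example above only reads the exit code. To also feed input to the child and capture what it prints, `subprocess.run` accepts `input` and `capture_output`. The sketch below is an illustration under assumptions: the helper name `shout` is hypothetical, and the child is a one-line Python program started through `sys.executable` so the demo does not depend on `nslookup` being installed.

```python
import subprocess
import sys


def shout(text):
    """Send `text` to a child's stdin and return (exit_code, captured_stdout)."""
    # The child simply upper-cases whatever arrives on its standard input.
    child_code = 'import sys; sys.stdout.write(sys.stdin.read().upper())'
    completed = subprocess.run(
        [sys.executable, '-c', child_code],
        input=text,           # written to the child's stdin
        capture_output=True,  # collect stdout and stderr instead of inheriting them
        text=True,            # exchange str rather than bytes
    )
    return completed.returncode, completed.stdout


if __name__ == '__main__':
    code, out = shout('hello from the parent\n')
    print('Exit code:', code)
    print('Output:', out, end='')
```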
    from multiprocessing import Process, Queue
    import os, time, random

    # Code run by the writer process:
    def write(q):
        print('Process to write: %s' % os.getpid())
        for value in ['A', 'B', 'C']:
            print('Put %s to queue...' % value)
            q.put(value)
            time.sleep(random.random())

    # Code run by the reader process:
    def read(q):
        print('Process to read: %s' % os.getpid())
        while True:
            value = q.get(True, 1)
            print('Get %s from queue.' % value)
    Process to write: 26020
    Put A to queue...
    Process to read: 916
    Get A from queue.
    Put B to queue...
    Get B from queue.
    Put C to queue...
    Get C from queue.
    Process Process-2:
    ...
    # An error is raised here because q.get(True, 1) sets a 1-second timeout
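One way to avoid the traceback is to let the reader stop cleanly, either when it sees an end marker or when the timeout expires. The sketch below is a variation on the example, not the original code: the `SENTINEL` marker, the second `results` queue, and the `run_pipeline` helper are all assumptions added for illustration.

```python
from multiprocessing import Process, Queue
import queue  # only needed for the queue.Empty exception class

SENTINEL = None  # hypothetical end-of-stream marker


def write(q):
    for value in ['A', 'B', 'C']:
        q.put(value)
    q.put(SENTINEL)  # tell the reader that nothing more is coming


def read(q, results):
    received = []
    while True:
        try:
            value = q.get(True, 1)
        except queue.Empty:
            break  # the writer went quiet for a second: stop instead of crashing
        if value is SENTINEL:
            break
        received.append(value)
    results.put(received)  # hand the collected values back to the parent


def run_pipeline():
    q, results = Queue(), Queue()
    pw = Process(target=write, args=(q,))
    pr = Process(target=read, args=(q, results))
    pw.start()
    pr.start()
    # Fetch the result before joining so the reader is never blocked on a full pipe.
    received = results.get(timeout=10)
    pw.join()
    pr.join()
    return received


if __name__ == '__main__':
    print('Reader received:', run_pipeline())
```

Catching `queue.Empty` keeps the timeout as a safety net, while the sentinel lets the reader finish as soon as the writer is done.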
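    def consumer():
        r = ''
        while True:
            # Pause here; resume when the producer sends a value.
            n = yield r
            if not n:
                return
            print('[CONSUMER] Consuming %s...' % n)
            r = '200 OK'

    def produce(c):
        # Advance the generator to its first yield before sending values.
        next(c)
        n = 0
        while n < 5:
            n = n + 1
            print('[PRODUCER] Producing %s...' % n)
            r = c.send(n)
            print('[PRODUCER] Consumer return: %s' % r)
        c.close()

    # Driver: create the consumer generator and hand it to the producer.
    c = consumer()
    produce(c)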