Subprocess Module

A running program is called a process. Each process has its own system state, which includes memory, lists of open files, a program counter that keeps track of the instruction being executed, and a call stack used to hold the local variables of functions.
Normally, a process executes statements one after the other in a single sequence of control flow, which is sometimes called the main thread of the process. At any given time, the program is only doing one thing.
A program can create new processes using library functions such as os.fork() or subprocess.Popen(), found in the os and subprocess modules. These processes, known as subprocesses, run as completely independent entities, each with its own private system state and main thread of execution.
Because a subprocess is independent, it executes concurrently with the original process. That is, the process that created the subprocess can go on to work on other things while the subprocess carries out its own work behind the scenes.
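As a rough illustration of this concurrency, here is a minimal sketch written for Python 3. It uses sys.executable (the running interpreter) as a stand-in for any external program; the sleep duration and the arithmetic done by the parent are arbitrary choices for the example.

```python
import subprocess
import sys

# Start a child process that sleeps briefly and then exits.
# sys.executable is the running interpreter, used here as a stand-in
# for any external program.
child = subprocess.Popen(
    [sys.executable, "-c", "import time; time.sleep(0.2)"]
)

# Popen() returns immediately, so the parent is free to do other work
# while the child is (most likely) still sleeping.
parent_result = sum(range(1000))

# wait() blocks until the child terminates and returns its exit code.
exit_code = child.wait()
print(parent_result, exit_code)
```

The parent only blocks at wait(); everything before that line runs alongside the child.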
The subprocess module allows us to:
- spawn new processes
- connect to their input/output/error pipes
- obtain their return codes
It offers a higher-level interface than some of the other available modules, and is intended to replace the following functions:
- os.system()
- os.spawn*()
- os.popen*()
- popen2.*()
- commands.*()
We cannot use UNIX commands in a Python script as if they were Python code. For example, echo name causes a syntax error because echo is not a built-in statement or function in Python. In a Python script we use print name instead.
To run UNIX commands we need to create a subprocess that runs the command. The recommended approach to invoking subprocesses is to use the convenience functions for all the use cases they can handle. For anything else, the underlying Popen interface can be used directly.
The simplest way of running a UNIX command is to use os.system().
>>> import os
>>> os.system('echo $HOME')
/user/khong
0
>>> # or we can use
>>> os.system('echo %s' %'$HOME')
/user/khong
0
As expected, the value of $HOME was written to stdout (the terminal). We also got a return value of 0, which is the exit status of the command and means it ran without error.
os.system('command with args') passes the command and arguments to our system's shell. This means we can run multiple commands at once and set up pipes and input/output redirections:
os.system('command_1 < input_file | command_2 > output_file')
If we run os.system('echo $HOME') in Python IDLE, we only see the 0, because the command's stdout goes to a terminal and IDLE does not display it. To see the command output we should redirect it to a file and then read from it:
>>> import os
>>> os.system('echo $HOME > outfile')
0
>>> f = open('outfile','r')
>>> f.read()
'/user/khong\n'
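The same redirect-and-read pattern can be sketched as a script. This is a Python 3 sketch for a POSIX system; it uses a temporary file instead of a fixed name such as outfile, and the echoed text hello is just a placeholder.

```python
import os
import tempfile

# Create a temporary file to receive the command's output.
fd, path = tempfile.mkstemp()
os.close(fd)

# The shell performs the '>' redirection; os.system() itself only
# returns the exit status of the command.
status = os.system("echo hello > '%s'" % path)

# Read back what the command wrote, then clean up.
with open(path) as f:
    captured = f.read()
os.remove(path)

print(status, repr(captured))
```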
os.popen(command[, mode[, bufsize]]) opens a pipe to or from command. The return value is an open file object connected to the pipe, which can be read or written depending on whether mode is 'r' (default) or 'w'. The bufsize argument has the same meaning as the corresponding argument to the built-in open() function. The exit status of the command (encoded in the format specified for wait()) is available as the return value of the close() method of the file object, except that when the exit status is zero (termination without errors), None is returned.
>>> import os
>>> stream = os.popen('echo $HOME')
>>> stream.read()
'/user/khong\n'
os.popen() does the same thing as os.system except that it gives us a file-like stream object that we can use to access standard input/output for that process. There are 3 other variants of popen that all handle the i/o slightly differently.
If we pass everything as a string, then our command is passed to the shell; if we pass them as a list then we don't need to worry about escaping anything.
However, os.popen() has been deprecated since version 2.6. The documentation calls it obsolete and recommends the subprocess module instead.
This is basically just like the Popen class and takes all of the same arguments, but it simply waits until the command completes and gives us the return code.
subprocess.call(args, *, stdin=None, stdout=None, stderr=None, shell=False)
Run the command described by args. Wait for command to complete, then return the returncode attribute.
>>> import os
>>> os.chdir('/')
>>> import subprocess
>>> subprocess.call(['ls','-l'])
total 181
drwxr-xr-x 2 root root 4096 Mar 3 2012 bin
drwxr-xr-x 4 root root 1024 Oct 26 2012 boot
...
The command line arguments are passed as a list of strings, which avoids the need for escaping quotes or other special characters that might be interpreted by the shell.
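To make the "no escaping needed" point concrete, here is a small Python 3 sketch. The child is the Python interpreter itself (sys.executable), chosen only so the example is self-contained; the tricky string is an arbitrary value containing spaces and quotes.

```python
import subprocess
import sys

# An argument containing spaces and quotes. No shell escaping is needed
# because each list item reaches the child as exactly one argument.
tricky = 'two words and a "quote"'

# The child exits with status 0 only if it received the string unchanged.
child_code = "import sys; sys.exit(0 if sys.argv[1] == %r else 1)" % tricky

returncode = subprocess.call([sys.executable, "-c", child_code, tricky])
print(returncode)
```

A return code of 0 here means the argument arrived intact, without any quoting on our side.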
>>> import subprocess
>>> subprocess.call('echo $HOME')
Traceback (most recent call last):
...
OSError: [Errno 2] No such file or directory
>>>
>>> subprocess.call('echo $HOME', shell=True)
/user/khong
0
Setting the shell argument to a true value causes subprocess to spawn an intermediate shell process and tell it to run the command. Using an intermediate shell means that variables, glob patterns, and other special shell features in the command string are processed before the command is run. In the example, $HOME was expanded before echo ran. This is a command with shell expansion, whereas ls -l is a simple command that needs none.
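The difference can be checked side by side. The sketch below is for Python 3 on a POSIX system; DEMO_VAR is a made-up variable name used so the result does not depend on anyone's real $HOME, and check_output() (covered later in this tutorial) is used only to capture the output for comparison.

```python
import os
import subprocess

# Give the children a private variable (DEMO_VAR is a hypothetical name).
env = dict(os.environ, DEMO_VAR="expanded")

# Without a shell, '$DEMO_VAR' is passed to echo literally.
literal = subprocess.check_output(["echo", "$DEMO_VAR"], env=env)

# With shell=True, /bin/sh expands the variable before echo runs.
expanded = subprocess.check_output("echo $DEMO_VAR", shell=True, env=env)

print(literal, expanded)
```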
Here is a sample script (PyGoogle/FFMpeg/). It downloads a YouTube video and then extracts its I-frames to a subfolder:
'''
 - download video and ffmpeg i-frame extraction

Usage:
(ex) python -u

This code does two things:
1. Download using youtube-dl
   cmd = ['youtube-dl', '-f', videoSize, '-k', '-o', video_out, download_url]
2. Extract i-frames via ffmpeg
   cmd = [ffmpeg,'-i', inFile,'-f', 'image2','-vf',
          "select='eq(pict_type,PICT_TYPE_I)'",'-vsync','vfr', imgFilenames]
'''

from __future__ import unicode_literals
import youtube_dl

import sys
import os
import subprocess
import argparse
import glob

# sys.platform is 'win32' on Windows, never 'Windows'
if sys.platform.startswith("win"):
    FFMPEG_BIN = "ffmpeg.exe"
    MOVE = "move"
    MKDIR = "mkdir"
else:
    FFMPEG_BIN = "ffmpeg"
    MOVE = "mv"
    MKDIR = "mkdir"


def iframe_extract(inFile):
    # ffmpeg -i inFile -f image2 -vf \
    #   "select='eq(pict_type,PICT_TYPE_I)'" -vsync vfr oString%03d.png

    # infile : video file name
    #          (ex) 'FoxSnowDive-Yellowstone-BBCTwo.mp4'
    imgPrefix = inFile.split('.')[0]
    # imgPrefix : image file

    # start extracting i-frames
    home = os.path.expanduser("~")
    ffmpeg = home + '/bin/ffmpeg'

    imgFilenames = imgPrefix + '%03d.png'
    cmd = [ffmpeg, '-i', inFile, '-f', 'image2', '-vf',
           "select='eq(pict_type,PICT_TYPE_I)'", '-vsync', 'vfr', imgFilenames]

    # create iframes
    print("creating iframes ....")
    subprocess.call(cmd)

    # Move the extracted iframes to a subfolder
    # imgPrefix is used as a subfolder name that stores iframe images
    cmd = 'mkdir -p ' + imgPrefix
    os.system(cmd)
    print("make subdirectory %s" % cmd)
    mvcmd = 'mv ' + imgPrefix + '*.png ' + imgPrefix
    print("moving images to subdirectory %s" % mvcmd)
    os.system(mvcmd)


def get_info_and_download(download_url):
    # Get video meta info and then download using youtube-dl

    ydl_opts = {}

    # get meta info from the video
    with youtube_dl.YoutubeDL(ydl_opts) as ydl:
        meta = ydl.extract_info(download_url, download=False)

    # renaming the file
    # remove special characters from the file name
    print('meta[title]=%s' % meta['title'])
    out = ''.join(c for c in meta['title']
                  if c.isalnum() or c == '-' or c == '_')
    print('out=%s' % out)
    extension = meta['ext']
    video_out = out + '.' + extension
    print('video_out=%s' % video_out)
    videoSize = 'bestvideo[height<=540]+bestaudio/best[height<=540]'
    cmd = ['youtube-dl', '-f', videoSize, '-k', '-o', video_out, download_url]
    print('cmd=%s' % cmd)

    # download the video
    subprocess.call(cmd)

    # Sometimes output file has format code in name such as 'out.f248.webm'
    # so, in this case, we want to rename it 'out.webm'
    found = False
    extension_list = ['mkv', 'mp4', 'webm']
    for e in extension_list:
        glob_str = '*.' + e
        for f in glob.glob(glob_str):
            if out in f:
                if os.path.isfile(f):
                    video_out = f
                    found = True
                    break
        if found:
            break

    # call iframe-extraction : ffmpeg
    print('before iframe_extract() video_out=%s' % video_out)
    iframe_extract(video_out)
    return meta


def check_arg(args=None):
    # Command line options
    # Currently, only the url option is used
    parser = argparse.ArgumentParser(description='download video')
    parser.add_argument('-u', '--url',
                        help='download url',
                        required=True)
    parser.add_argument('-i', '--infile',
                        help='input to iframe extract')
    parser.add_argument('-o', '--outfile',
                        help='output name for iframe image')

    results = parser.parse_args(args)
    return (results.url, results.infile, results.outfile)


# Usage sample:
# syntax: python -u url
# (ex) python -u

if __name__ == '__main__':
    u, i, o = check_arg(sys.argv[1:])
    meta = get_info_and_download(u)
subprocess.check_call(args, *, stdin=None, stdout=None, stderr=None, shell=False)
The check_call() function works like call() except that the exit code is checked, and if it indicates an error happened then a CalledProcessError exception is raised.
>>> import subprocess
>>> subprocess.check_call(['false'])
Traceback (most recent call last):
...
subprocess.CalledProcessError: Command '['false']' returned non-zero exit status 1
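In a script we would normally catch the exception rather than let it propagate. A minimal Python 3 sketch follows; run_checked is a hypothetical helper name, and the child is the Python interpreter exiting with a status of our choosing.

```python
import subprocess
import sys


def run_checked(exit_status):
    """Run a child that exits with exit_status and report what check_call() did."""
    cmd = [sys.executable, "-c", "import sys; sys.exit(%d)" % exit_status]
    try:
        subprocess.check_call(cmd)
    except subprocess.CalledProcessError as err:
        # err.returncode holds the child's non-zero exit status.
        return "failed with %d" % err.returncode
    return "ok"


print(run_checked(0))
print(run_checked(3))
```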
subprocess.check_output(args, *, stdin=None, stderr=None, shell=False, universal_newlines=False)
The standard input and output channels for the process started by call() are bound to the parent's input and output. That means the calling program cannot capture the output of the command. To capture the output, we can use check_output() for later processing.
>>> import subprocess
>>> output = subprocess.check_output(['ls','-l'])
>>> print output
total 181
drwxr-xr-x 2 root root 4096 Mar 3 2012 bin
drwxr-xr-x 4 root root 1024 Oct 26 2012 boot
...
>>> output = subprocess.check_output('echo $HOME', shell=True)
>>> print output
/user/khong
This function was added in Python 2.7.
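For readers on Python 3, note that check_output() returns bytes unless text mode is requested, so the captured output usually needs decoding. A small sketch, again using the interpreter itself as the child so that it runs anywhere:

```python
import subprocess
import sys

# Capture the child's stdout; in Python 3 the result is a bytes object.
raw = subprocess.check_output([sys.executable, "-c", "print('captured')"])

# Decode and strip the trailing newline for later processing.
text = raw.decode().strip()
print(type(raw).__name__, text)
```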
The underlying process creation and management in this module is handled by the Popen class. It offers a lot of flexibility so that developers are able to handle the less common cases not covered by the convenience functions.
subprocess.Popen() executes a child program in a new process. On Unix, the class uses os.execvp()-like behavior to execute the child program. On Windows, the class uses the Windows CreateProcess() function.
class subprocess.Popen(args, bufsize=0, executable=None, stdin=None, stdout=None, stderr=None, preexec_fn=None, close_fds=False, shell=False, cwd=None, env=None, universal_newlines=False, startupinfo=None, creationflags=0)
- args:
should be a sequence of program arguments or else a single string. By default, the program to execute is the first item in args if args is a sequence. If args is a string, the interpretation is platform-dependent. It is recommended to pass args as a sequence.
- shell:
shell argument (which defaults to False) specifies whether to use the shell as the program to execute. If shell is True, it is recommended to pass args as a string rather than as a sequence.
On Unix with shell=True, the shell defaults to /bin/sh.
- If args is a string, the string specifies the command to execute through the shell. This means that the string must be formatted exactly as it would be when typed at the shell prompt. This includes, for example, quoting or backslash escaping filenames with spaces in them.
- If args is a sequence, the first item specifies the command string, and any additional items will be treated as additional arguments to the shell itself. That is to say, Popen does the equivalent of:
Popen(['/bin/sh', '-c', args[0], args[1], ...])
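This rule is easy to trip over, so here is a short Python 3 sketch for a POSIX system. With shell=True and a sequence, the extra items do not become arguments of the command; they become the shell's positional parameters ($0, $1, and so on). The strings first, second and hello are arbitrary.

```python
import subprocess

# Equivalent to: /bin/sh -c 'echo "$0 and $1"' first second
# The extra items are visible to the shell as $0 and $1.
out = subprocess.check_output(['echo "$0 and $1"', "first", "second"], shell=True)

# Here 'hello' becomes $0 of the shell, NOT an argument to echo,
# so echo prints only an empty line.
blank = subprocess.check_output(["echo", "hello"], shell=True)

print(out, blank)
```

This is why a string is the recommended form of args when shell=True.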
- bufsize:
if given, has the same meaning as the corresponding argument to the built-in open() function:
- 0 means unbuffered
- 1 means line buffered
- any other positive value means use a buffer of (approximately) that size
- A negative bufsize means to use the system default, which usually means fully buffered
- The default value for bufsize is 0 (unbuffered)
- executable:
specifies a replacement program to execute. It is very seldom needed.
- stdin, stdout and stderr:
- specify the executed program's standard input, standard output and standard error file handles, respectively.
- Valid values are PIPE, an existing file descriptor (a positive integer), an existing file object, and None.
- PIPE indicates that a new pipe to the child should be created.
- With the default settings of None, no redirection will occur; the child's file handles will be inherited from the parent.
- Additionally, stderr can be STDOUT, which indicates that the stderr data from the child process should be captured into the same file handle as for stdout.
- preexec_fn:
if set to a callable object, this object will be called in the child process just before the child is executed. (Unix only)
- close_fds:
if true, all file descriptors except 0, 1 and 2 will be closed before the child process is executed. (Unix only). Or, on Windows, if close_fds is true then no handles will be inherited by the child process. Note that on Windows, we cannot set close_fds to true and also redirect the standard handles by setting stdin, stdout or stderr.
- cwd:
if not None, the child's current directory will be changed to cwd before it is executed. Note that this directory is not considered when searching the executable, so we can't specify the program's path relative to cwd.
- env:
if not None, it must be a mapping that defines the environment variables for the new process; these are used instead of inheriting the current process' environment, which is the default behavior.
- universal_newlines:
if True, the file objects stdout and stderr are opened as text files in universal newlines mode. Lines may be terminated by any of '\n', the Unix end-of-line convention, '\r', the old Macintosh convention or '\r\n', the Windows convention. All of these external representations are seen as '\n' by the Python program.
- startupinfo:
if given, will be a STARTUPINFO object, which is passed to the underlying CreateProcess function. (Windows only)
- creationflags:
if given, can be CREATE_NEW_CONSOLE or CREATE_NEW_PROCESS_GROUP. (Windows only)
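A few of these parameters can be tried out together. The following Python 3 sketch sets cwd, env and universal_newlines for a child that reports its working directory and one environment variable; DEMO_SETTING is a made-up variable name and the temporary directory is created only for the demonstration.

```python
import os
import subprocess
import sys
import tempfile

# A scratch directory for the child to run in.
workdir = tempfile.mkdtemp()

# Copy the parent's environment and add one variable (hypothetical name).
env = dict(os.environ, DEMO_SETTING="42")

child_code = "import os; print(os.getcwd()); print(os.environ['DEMO_SETTING'])"
proc = subprocess.Popen(
    [sys.executable, "-c", child_code],
    cwd=workdir,               # child's current directory
    env=env,                   # child's environment
    stdout=subprocess.PIPE,
    universal_newlines=True,   # text mode: str output, '\n' line endings
)
out, _ = proc.communicate()
child_cwd, child_setting = out.splitlines()

# realpath() guards against symlinked temp directories.
same_dir = os.path.realpath(child_cwd) == os.path.realpath(workdir)
os.rmdir(workdir)
print(same_dir, child_setting)
```

Because env replaces the whole environment rather than adding to it, the sketch starts from a copy of os.environ instead of an empty dictionary.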
This is intended as a replacement for os.popen, but it is more complicated. For example, we use
subprocess.Popen("echo Hello World", stdout=subprocess.PIPE, shell=True).stdout.read()
instead of
os.popen("echo Hello World").read()
But it is comprehensive, and it has all of the options in one unified class instead of spread across the different os.popen functions.
>>> import subprocess
>>> proc = subprocess.Popen(['echo', '"Hello world!"'],
...                         stdout=subprocess.PIPE)
>>> stddata = proc.communicate()
>>> stddata
('"Hello world!"\n', None)
Note that the communicate() method returns a tuple (stdoutdata, stderrdata) : ('"Hello world!"\n' ,None). If we don't include stdout=subprocess.PIPE or stderr=subprocess.PIPE in the Popen call, we'll just get None back.
Popen.communicate(input=None)
Popen.communicate() interacts with the process: it sends data to stdin, reads data from stdout and stderr until end-of-file is reached, and waits for the process to terminate. The optional input argument should be a string to be sent to the child process, or None if no data should be sent to the child.
So, actually, we could have done as below:
>>> import subprocess
>>> proc = subprocess.Popen(['echo', '"Hello world!"'],
...                         stdout=subprocess.PIPE)
>>> (stdoutdata, stderrdata) = proc.communicate()
>>> stdoutdata
'"Hello world!"\n'
or we can explicitly specify which one we want from proc.communicate():
>>> import subprocess
>>> proc = subprocess.Popen(['echo', '"Hello world!"'],
...                         stdout=subprocess.PIPE)
>>> stdoutdata = proc.communicate()[0]
>>> stdoutdata
'"Hello world!"\n'
The simplest code for the example above might be sending the stream directly to console:
>>> import subprocess
>>> proc = subprocess.Popen(['echo', '"Hello world!"'],
...                         stdout=subprocess.PIPE)
>>> proc.communicate()[0]
'"Hello world!"\n'
The code below is to test the stdout and stderr behaviour:
import sys
sys.stdout.write('Testing message to stdout\n')
sys.stderr.write('Testing message to stderr\n')
If we run it:
>>> proc = subprocess.Popen(['python', ''],
...                         stdout=subprocess.PIPE)
>>> Testing message to stderr
>>> proc.communicate()
('Testing message to stdout\n', None)
>>>
Note that the message to stderr gets displayed as it is generated but the message to stdout is read via the pipe. This is because we only set up a pipe to stdout.
Next, let's make both stdout and stderr accessible from Python:
>>> proc = subprocess.Popen(['python', ''],
...                         stdout=subprocess.PIPE,
...                         stderr=subprocess.PIPE)
>>> proc.communicate()
('Testing message to stdout\n', 'Testing message to stderr\n')
The communicate() method only reads data from stdout and stderr, until end-of-file is reached. So, after all the messages have been printed, if we call communicate() again we get an error:
>>> proc.communicate()
Traceback (most recent call last):
...
ValueError: I/O operation on closed file
If we want messages to stderr to be piped to stdout, we set stderr=subprocess.STDOUT:
>>> proc = subprocess.Popen(['python', ''],
...                         stdout=subprocess.PIPE,
...                         stderr=subprocess.STDOUT)
>>> proc.communicate()
('Testing message to stdout\r\nTesting message to stderr\r\n', None)
As we see from the output, the stderr slot of the tuple is None because stderr has been redirected to stdout.
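The two configurations can be compared in one self-contained script. This is a Python 3 sketch in which the test program is passed inline with -c, so no separate script file is needed; the message texts are placeholders, and the explicit flush() is there so the two streams interleave predictably when merged.

```python
import subprocess
import sys

# Inline stand-in for a small script that writes to both streams.
child_code = (
    "import sys\n"
    "sys.stdout.write('to stdout\\n')\n"
    "sys.stdout.flush()\n"
    "sys.stderr.write('to stderr\\n')\n"
)
cmd = [sys.executable, "-c", child_code]

# Separate pipes: each stream comes back in its own slot of the tuple.
proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE,
                        universal_newlines=True)
out, err = proc.communicate()

# stderr=STDOUT: both streams share the stdout pipe, stderr slot is None.
proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT,
                        universal_newlines=True)
merged, no_err = proc.communicate()

print(repr(out), repr(err), repr(merged), no_err)
```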
Writing to a process can be done in a very similar way. If we want to send data to the process's stdin, we need to create the Popen object with stdin=subprocess.PIPE.
To test it, let's write another program which simply prints Received: and then repeats the message we send it:
import sys
input = sys.stdin.read()
sys.stdout.write('Received: %s'%input)
To send a message to stdin, we pass the string we want to send as the input argument to communicate():
>>> proc = subprocess.Popen(['python', ''], stdin=subprocess.PIPE)
>>> proc.communicate('Hello?')
Received: Hello?(None, None)
Notice that the message created in the process was printed to stdout and then the return value (None, None) was printed. That's because no pipes were set up to stdout or stderr.
Here is another output after we specified stdout=subprocess.PIPE and stderr=subprocess.PIPE just as before to set up the pipe.
>>> proc = subprocess.Popen(['python', ''],
...                         stdin=subprocess.PIPE,
...                         stdout=subprocess.PIPE,
...                         stderr=subprocess.PIPE)
>>> proc.communicate('Hello?')
('Received: Hello?', '')
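A runnable Python 3 version of this round trip might look as follows. The echoing program is passed inline with -c instead of living in its own file, and universal_newlines=True is set so that communicate() accepts and returns str rather than bytes.

```python
import subprocess
import sys

# Inline stand-in for the small echo program described above.
child_code = "import sys; sys.stdout.write('Received: %s' % sys.stdin.read())"

proc = subprocess.Popen(
    [sys.executable, "-c", child_code],
    stdin=subprocess.PIPE,
    stdout=subprocess.PIPE,
    stderr=subprocess.PIPE,
    universal_newlines=True,
)

# communicate() writes the input, closes stdin, then reads both pipes.
out, err = proc.communicate("Hello?")
print(repr(out), repr(err))
```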
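Popen objects can also be chained to build a shell-style pipeline such as df -h | grep sda1, by connecting the stdout of one process to the stdin of the next:
>>> p1 = subprocess.Popen(['df','-h'], stdout=subprocess.PIPE)
>>> p2 = subprocess.Popen(['grep', 'sda1'], stdin=p1.stdout, stdout=subprocess.PIPE)
>>> p1.stdout.close()  # Allow p1 to receive a SIGPIPE if p2 exits
>>> output = p2.communicate()[0]
>>> output
'/dev/sda1 19G 9.2G 8.3G 53% /\n'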
The p1.stdout.close() call after starting the p2 is important in order for p1 to receive a SIGPIPE if p2 exits before p1.
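The same wiring works for any pair of programs. Below is a portable Python 3 sketch in which both stages are small inline Python programs standing in for df and grep; the words being filtered are arbitrary sample data.

```python
import subprocess
import sys

# Stage 1: a producer that prints three lines (stand-in for 'df -h').
producer_code = "print('alpha'); print('beta'); print('alphabet')"

# Stage 2: a filter that keeps lines containing 'alpha' (stand-in for 'grep').
filter_code = (
    "import sys\n"
    "for line in sys.stdin:\n"
    "    if 'alpha' in line:\n"
    "        sys.stdout.write(line)\n"
)

p1 = subprocess.Popen([sys.executable, "-c", producer_code],
                      stdout=subprocess.PIPE)
p2 = subprocess.Popen([sys.executable, "-c", filter_code],
                      stdin=p1.stdout, stdout=subprocess.PIPE,
                      universal_newlines=True)

# Close the parent's copy of the pipe so p1 can get SIGPIPE if p2 exits early.
p1.stdout.close()

output = p2.communicate()[0]
p1.wait()
print(output.splitlines())
```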
Here is another example of a piped command. The code gets the window ID of the currently active window. The command looks like this:
xprop -root | awk '/_NET_ACTIVE_WINDOW\(WINDOW\)/{print $NF}'
Python code:
# This code runs the following awk to get a window id for the currently active X11 window
# xprop -root | awk '/_NET_ACTIVE_WINDOW\(WINDOW\)/{print $NF}'

import subprocess

def py_xwininfo():
    winId = getCurrentWinId()
    print 'winId = %s' %winId

def getCurrentWinId():
    cmd_1 = ['xprop', '-root']
    cmd_2 = ['awk', '/_NET_ACTIVE_WINDOW\(WINDOW\)/{print $NF}']
    p1 = subprocess.Popen(cmd_1, stdout = subprocess.PIPE)
    p2 = subprocess.Popen(cmd_2, stdin = p1.stdout, stdout=subprocess.PIPE)
    id = p2.communicate()[0]
    return id

if __name__ == '__main__':
    py_xwininfo()
winId = 0x3c02035
Avoid shell=True whenever possible.
shell=True means executing the command through the shell. Executing programs through the shell means that all user input passed to the program is interpreted according to the syntax and semantic rules of the invoked shell. At best, this only causes inconvenience to the user, because the user has to obey these rules. For instance, paths containing special shell characters like quotation marks or blanks must be escaped. At worst, it causes security holes, because the user can execute arbitrary programs.
shell=True is sometimes convenient for making use of specific shell features like word splitting or parameter expansion. However, if such a feature is required, make use of the other modules available to you (e.g. os.path.expandvars() for parameter expansion or shlex for word splitting). This means more work, but avoids other problems. - from Actual meaning of 'shell=True' in subprocess.
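As a sketch of that advice (Python 3, POSIX echo assumed), the snippet below splits a command line with shlex.split() and expands variables with os.path.expandvars(), so no shell is involved. DEMO_DIR and its value are made up for the example.

```python
import os
import shlex
import subprocess

# A hypothetical variable whose value contains a space.
os.environ["DEMO_DIR"] = "/tmp/demo dir"

command_line = 'echo "$DEMO_DIR" done'

# Split first, then expand each word: the expanded value is never
# re-split, so the embedded space stays inside a single argument.
args = [os.path.expandvars(word) for word in shlex.split(command_line)]

out = subprocess.check_output(args)  # shell=False by default
print(args, out)
```

Splitting before expanding matters: doing it the other way round would break the value at its space, which is one of the shell pitfalls this approach is meant to avoid.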
If args is a string, the string specifies the command to execute through the shell. This means that the string must be formatted exactly as it would be when typed at the shell prompt. This includes, for example, quoting or backslash escaping filenames with spaces in them:
>>> proc = subprocess.Popen('echo $HOME', shell=True)
>>> /user/khong
The string is formatted exactly as it would be typed at the shell prompt:
$ echo $HOME
So, the following would not work:
>>> proc = subprocess.Popen('echo $HOME', shell=False)
Traceback (most recent call last):
...
OSError: [Errno 2] No such file or directory
The following would not work either:
>>> subprocess.Popen('echo "Hello world!"', shell=False)
Traceback (most recent call last):
...
OSError: [Errno 2] No such file or directory
That's because we are still passing it as a string. Python assumes the entire string is the name of the program to execute, and since there isn't a program called echo "Hello world!", the call fails. Instead we have to pass each argument separately.
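The failing and the working form can be put next to each other. This is a Python 3 sketch for a POSIX system; in Python 3 the error raised is FileNotFoundError, which is a subclass of OSError, so catching OSError covers both versions.

```python
import subprocess

# Whole command line as one string, no shell: Python looks for a program
# literally named 'echo "Hello world!"' and fails.
try:
    subprocess.Popen('echo "Hello world!"')
    failed = False
except OSError:
    failed = True

# Each argument as its own list item: no quoting needed, no shell needed.
out = subprocess.check_output(["echo", "Hello world!"])

print(failed, out)
```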
psutil.Popen(*args, **kwargs) is a more convenient interface to stdlib subprocess.Popen().
"It starts a sub process and deals with it exactly as when using subprocess.Popen class but in addition also provides all the properties and methods of psutil.Process class in a single interface".
- from the psutil documentation
The following code runs python -c "print 'hi, psutil'" in a subprocess:
>>> import psutil
>>> import subprocess
>>> proc = psutil.Popen(["/usr/bin/python", "-c", "print 'hi, psutil'"], stdout=subprocess.PIPE)
>>> proc
<psutil.Popen(pid=4304, name='python') at 140431306151888>
>>> proc.uids
user(real=1000, effective=1000, saved=1000)
>>> proc.username
'khong'
>>> proc.communicate()
('hi, psutil\n', None)