Python Functions
- lambda 2020
Python supports the creation of anonymous functions (i.e. functions that are not bound to a name) at runtime, using a construct called lambda. This is not exactly the same as lambda in functional programming languages such as Lisp, but it is a very powerful concept that's well integrated into Python and is often used in conjunction with typical functional concepts like filter(), map() and reduce().
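As a rough illustration of how lambda pairs with filter(), map() and reduce(), here is a minimal sketch. The list numbers and the result names are made up for this example; note that in Python 3 reduce() has to be imported from functools.

```python
from functools import reduce  # reduce() lives in functools in Python 3

numbers = [1, 2, 3, 4, 5, 6]  # sample data, chosen only for illustration

# Keep the items for which the lambda returns a true value.
evens = list(filter(lambda n: n % 2 == 0, numbers))

# Apply the lambda to every item.
squares = list(map(lambda n: n ** 2, numbers))

# Fold the items into a single value, two at a time.
total = reduce(lambda acc, n: acc + n, numbers)

print(evens)    # [2, 4, 6]
print(squares)  # [1, 4, 9, 16, 25, 36]
print(total)    # 21
```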
Like def, a lambda expression creates a function to be called later, but it returns the function instead of assigning it to a name. This is why lambdas are sometimes known as anonymous functions. In practice, they are used as a way to inline a function definition, or to defer execution of a piece of code.
The following code shows the difference between a normal function definition, func and a lambda function, lamb:
>>> def func(x): return x ** 3
...
>>> print(func(5))
125
>>> lamb = lambda x: x ** 3
>>> print(lamb(5))
125
As we can see, func() and lamb() do exactly the same thing and can be used in the same ways. Note that the lambda definition does not include a return statement: its body is always a single expression whose value is returned. Also note that we can put a lambda definition anywhere a function is expected, and we don't have to assign it to a variable at all.
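To make the last point concrete, the sketch below uses a lambda without ever binding it to a name. The helper apply_twice is a hypothetical function written just for this example.

```python
# Call a lambda immediately, without ever naming it.
cube_of_five = (lambda x: x ** 3)(5)


def apply_twice(func, value):
    """Hypothetical helper: apply func to value two times."""
    return func(func(value))


# Pass a lambda directly where a function argument is expected.
result = apply_twice(lambda x: x ** 3, 2)  # (2 ** 3) ** 3

print(cube_of_five)  # 125
print(result)        # 512
```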
The lambda's general form is:
lambda arg1, arg2, ...argN : expression using arguments
Function objects returned by running lambda expressions work exactly the same as those created and assigned by defs. However, there are a few differences that make lambda useful in specialized roles:
- lambda is an expression, not a statement.
Because of this, a lambda can appear in places where a def is not allowed, such as inside a list literal or a function call's arguments. As an expression, lambda returns a value that can optionally be assigned a name. In contrast, the def statement always assigns the new function to the name in the header, instead of returning it as a result.
- lambda's body is a single expression, not a block of statements.
The lambda's body is similar to what we'd put in a def body's return statement. We simply type the result as an expression instead of explicitly returning it. Because it is limited to an expression, a lambda is less general than a def. We can only squeeze so much logic into a lambda body without using statements. This is by design, to limit program nesting: lambda is designed for coding simple functions, and def handles larger tasks.
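The expression-only restriction can be checked directly. In this sketch (the names sign and statement_allowed are invented for the example), a conditional expression works inside a lambda body, while a return statement is rejected by the compiler.

```python
# A conditional expression is allowed in a lambda body.
sign = lambda n: 'negative' if n < 0 else 'non-negative'

# A statement (here, return) is not allowed; compiling it fails.
try:
    compile("square = lambda n: return n * n", "<example>", "exec")
    statement_allowed = True
except SyntaxError:
    statement_allowed = False

print(sign(-3))           # negative
print(sign(4))            # non-negative
print(statement_allowed)  # False
```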
>>> def f(x, y, z): return x + y + z
...
>>> f(2, 30, 400)
432
We can achieve the same effect with a lambda expression by explicitly assigning its result to a name through which we can call the function later:
>>> f = lambda x, y, z: x + y + z
>>> f(2, 30, 400)
432
Here, f is assigned the function object that the lambda expression creates. This is how def works, too, except that with def the assignment happens automatically.
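One visible trace of this difference is the function's __name__ attribute. The short sketch below uses the made-up names add_def and add_lambda: def records the name from its header, whereas a lambda is always named '<lambda>', whatever variable it is later assigned to.

```python
def add_def(x, y, z):
    return x + y + z


add_lambda = lambda x, y, z: x + y + z

print(add_def.__name__)     # add_def
print(add_lambda.__name__)  # <lambda>

# Both objects behave the same when called.
print(add_def(2, 30, 400) == add_lambda(2, 30, 400))  # True
```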
Default values work on lambda arguments, too:
>>> mz = (lambda a = 'Wolfgangus', b = ' Theophilus', c = ' Mozart': a + b + c)
>>> mz('Wolfgang', ' Amadeus')
'Wolfgang Amadeus Mozart'
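In the following example, the lambda finds the name title in the enclosing function's scope. In older versions of Python without nested scopes, the value for the name title would have been passed in as a default argument value instead: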
>>> def writer():
...     title = 'Sir'
...     name = (lambda x: title + ' ' + x)
...     return name
...
>>> who = writer()
>>> who('Arthur Ignatius Conan Doyle')
'Sir Arthur Ignatius Conan Doyle'
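For comparison, here is a sketch of both ways of getting title into the lambda: the enclosing-scope lookup used above, and the default-argument technique. The function names writer_closure and writer_default are invented for this example.

```python
def writer_closure():
    title = 'Sir'
    # The lambda looks up title in the enclosing function's scope.
    return lambda x: title + ' ' + x


def writer_default():
    title = 'Sir'
    # The current value of title is stored as a default argument value.
    return lambda x, title=title: title + ' ' + x


print(writer_closure()('Arthur Conan Doyle'))         # Sir Arthur Conan Doyle
print(writer_default()('Arthur Conan Doyle'))         # Sir Arthur Conan Doyle
# With the default-argument form, a caller can also override the title.
print(writer_default()('Arthur Conan Doyle', 'Dr.'))  # Dr. Arthur Conan Doyle
```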
Lambdas can be used as a function shorthand that allows us to embed a function within the code that uses it. For instance, callback handlers are frequently coded as inline lambda expressions embedded directly in a registration call's arguments list, instead of being defined with a def elsewhere in a file and referenced by name. Lambdas are also commonly used to code jump tables, which are lists or dictionaries of actions to be performed on demand.
>>> L = [lambda x: x ** 2, lambda x: x ** 3, lambda x: x ** 4]
>>> for f in L:
...     print(f(3))
...
9
27
81
>>> print(L[0](11))
121
In the example above, a list of three functions was built up by embedding lambda expressions inside a list. A def won't work inside a list literal like this because it is a statement, not an expression. If we really want to use def for the same result, we need temporary function names and definitions outside:
>>> def f1(x): return x ** 2
...
>>> def f2(x): return x ** 3
...
>>> def f3(x): return x ** 4
...
>>> # Reference by name
>>> L = [f1, f2, f3]
>>> for f in L:
...     print(f(3))
...
9
27
81
>>> print(L[0](3))
9
We can do the same thing with dictionaries:
>>> key = 'quartic'
>>> {'square': (lambda x: x ** 2),
...  'cubic': (lambda x: x ** 3),
...  'quartic': (lambda x: x ** 4)}[key](10)
10000
Here, we built a temporary dictionary in which each of the nested lambdas generates and leaves behind a function to be called later. We fetched one of those functions by indexing on the key, and the trailing parentheses forced the fetched function to be called.
Again, let's do the same thing without lambda.
>>> def f1(x): return x ** 2
...
>>> def f2(x): return x ** 3
...
>>> def f3(x): return x ** 4
...
>>> key = 'quartic'
>>> {'square': f1, 'cubic': f2, 'quartic': f3}[key](10)
10000
This works, but our defs may be far away in our file. The code proximity that lambda provides is useful for functions that will only be used in a single context. In particular, if the three functions are not going to be used anywhere else, it makes sense to embed them within the dictionary as lambdas. Also, def requires more names for these little functions, which may clash with other names in the file.
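The callback-handler use mentioned earlier follows the same pattern. The sketch below is only an illustration: the registry handlers and the functions register and fire are hypothetical and not part of any library.

```python
handlers = {}  # hypothetical registry: event name -> list of callbacks
log = []       # collects what the callbacks did


def register(event, callback):
    """Remember callback so it can be run when event is fired."""
    handlers.setdefault(event, []).append(callback)


def fire(event, payload):
    """Call every callback registered for event, in registration order."""
    for callback in handlers.get(event, []):
        callback(payload)


# The handlers are written inline, right where they are registered.
register('saved', lambda name: log.append('saved ' + name))
register('saved', lambda name: log.append(name.upper()))

fire('saved', 'report.txt')
print(log)  # ['saved report.txt', 'REPORT.TXT']
```

Keeping each handler next to its registration call is the code proximity described above; a def would be the better choice as soon as a handler needs more than one expression.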
If we know what we're doing, we can code most statements as expressions:
>>> lower = (lambda x, y: x if x < y else y)
>>> lower(101*99, 102*98)
9996
>>> lower(102*98, 101*99)
9996
If we need to perform loops within a lambda, we can also embed things like map calls and list comprehension expressions.
>>> import sys
>>> fullname = lambda x: list(map(sys.stdout.write, x))
>>> f = fullname(['Wassily ', 'Wassilyevich ', 'Kandinsky'])
Wassily Wassilyevich Kandinsky
>>> fullname = lambda x: [sys.stdout.write(a) for a in x]
>>> t = fullname(['Wassily ', 'Wassilyevich ', 'Kandinsky'])
Wassily Wassilyevich Kandinsky
Here is the description of the built-in map function:
map(function, iterable, ...)
Return an iterator that applies function to every item of iterable, yielding the results. If additional iterable arguments are passed, function must take that many arguments and is applied to the items from all iterables in parallel. With multiple iterables, the iterator stops when the shortest iterable is exhausted.
So, in the example above, sys.stdout.write is the function argument, and x (a list of strings in the example) is the iterable.
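The description also covers the case of several iterables. As a small sketch with made-up data (lengths and widths), the lambda below takes one argument per iterable, and map() stops as soon as the shorter list runs out.

```python
lengths = [3, 4, 5]
widths = [10, 20, 30, 40]  # one extra item, which map() never reaches

# The lambda receives one item from each iterable on every step.
areas = list(map(lambda length, width: length * width, lengths, widths))

print(areas)  # [30, 80, 150]
```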
In the following example, the lambda appears inside a def and so can access the value that the name x has in the function's scope at the time that the enclosing function was called:
>>> def action(x):
...     # Make and return function, remember x
...     return (lambda newx: x + newx)
...
>>> ans = action(99)
>>> ans
<function <lambda> at 0x0000000003334648>
>>> ans(100)
199
Though not clear in this example, note that lambda also has access to the names in any enclosing lambda. Let's look at the following example:
>>> action = (lambda x: (lambda newx: x + newx))
>>> ans = action(99)
>>> ans
<function <lambda> at 0x0000000003308048>
>>> ans(100)
199
>>> ((lambda x: (lambda newx: x + newx))(99))(100)
199
In the example, we nested lambdas to make a function that makes a function when called. This is fairly convoluted and should generally be avoided.
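A more readable alternative is to write the same function factory with nested defs. This is only a sketch, and the names make_adder, adder and add_99 are invented for the example.

```python
def make_adder(x):
    """Return a function that adds x to its argument."""
    def adder(newx):
        # x is remembered from the enclosing call to make_adder.
        return x + newx
    return adder


add_99 = make_adder(99)
print(add_99(100))      # 199
print(add_99.__name__)  # adder
```

Unlike the nested lambdas, the inner function has a real name here, which tends to make tracebacks and debugging output easier to read.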
Here is a simple example of using lambda with the built-in function sorted():
sorted(iterable, *, key=None, reverse=False)
sorted() has a key parameter that specifies a function to be called on each element prior to making comparisons.
>>> death = [
...     ('James', 'Dean', 24),
...     ('Jimi', 'Hendrix', 27),
...     ('George', 'Gershwin', 38),
... ]
>>> sorted(death, key=lambda age: age[2])
[('James', 'Dean', 24), ('Jimi', 'Hendrix', 27), ('George', 'Gershwin', 38)]
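Because the list above is already in age order, the effect of key is easy to miss. The sketch below reuses the same records under the made-up name people, sorts them by a different field, and also uses reverse.

```python
people = [
    ('James', 'Dean', 24),
    ('Jimi', 'Hendrix', 27),
    ('George', 'Gershwin', 38),
]

# Sort by last name (index 1) instead of age.
by_last_name = sorted(people, key=lambda person: person[1])

# Sort by age (index 2), largest first.
oldest_first = sorted(people, key=lambda person: person[2], reverse=True)

print([p[1] for p in by_last_name])  # ['Dean', 'Gershwin', 'Hendrix']
print([p[1] for p in oldest_first])  # ['Gershwin', 'Hendrix', 'Dean']
```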
In this example, we want to read a video file with ffprobe and sort its packets by byte position (the pos field), using a lambda as the sort key. We also want to count the number of chunks, each of which starts at a key frame.
#!/usr/bin/env python3
import subprocess
import time

import psutil
import simplejson

procs_id = 0
procs = {}
procs_data = []


def getMetadata(video):
    cmd = ['ffprobe', '-show_streams', '-show_packets', '-print_format', 'json', video]
    print('cmd=', cmd)
    stdout = runCommand(cmd, return_stdout=True, busy_wait=False)
    # The pipe yields bytes in Python 3, so decode before parsing
    data = simplejson.loads(stdout.decode('utf-8'))
    metadata = {}
    if data:
        # Obtain duration here
        if 'streams' in data:
            for item in data['streams']:
                if 'codec_type' in item and 'duration' in item and 'video' in item['codec_type']:
                    metadata['duration'] = float(item['duration'])
                else:
                    metadata['duration'] = float(0)
        # Obtain iframes here
        iframes = []
        if 'packets' in data:
            # Filter out packet types, sorting each group by byte position
            video_packets = sorted(
                [packet for packet in data['packets']
                 if (packet['codec_type'] == "video" and 'pos' in packet)],
                key=lambda packet: int(packet['pos']))
            video_positions = sorted([int(packet['pos']) for packet in video_packets])
            audio_packets = sorted(
                [packet for packet in data['packets']
                 if (packet['codec_type'] == "audio" and 'pos' in packet)],
                key=lambda packet: int(packet['pos']))
            audio_positions = sorted([int(packet['pos']) for packet in audio_packets])
            # Search for iframes (packets flagged "K" are key frames)
            iframe_packets = [packet for packet in video_packets if (packet['flags'] == "K")]
            positions = sorted([int(packet['pos']) for packet in data['packets']
                                if ('pos' in packet)])
            start_byte = 0
            end_byte = 0
            duration = None
            for iframe in iframe_packets:
                start_byte = int(iframe['pos'])
                end_byte = 0
                # The chunk ends just before the next packet after start_byte
                for pos in positions:
                    if pos > start_byte:
                        end_byte = pos - 188
                        break
                if duration is None:
                    duration = float(iframe['pts_time'])
                else:
                    new_duration = float(iframe['pts_time'])
                    iframes.append({
                        'byte_start': start_byte,
                        'byte_end': end_byte,
                        'duration': (new_duration - duration)
                    })
                    duration = new_duration
            # Close the final chunk using the last video packet's timestamp
            last_duration = float(video_packets[-1]['pts_time'])
            iframes.append({
                'byte_start': start_byte,
                'byte_end': end_byte,
                'duration': last_duration - duration
            })
        metadata['iframes'] = iframes
    print('metadata=', metadata)
    return metadata


# Runs command silently
def runCommand(cmd, use_shell=False, return_stdout=False, busy_wait=True, poll_duration=0.5):
    global procs_id
    global procs
    global procs_data
    # Sanitize cmd to a list of strings
    cmd = list(map(lambda x: '%s' % x, cmd))
    if use_shell:
        command = ' '.join(cmd)
    else:
        command = cmd
    if return_stdout:
        proc = psutil.Popen(command, shell=use_shell,
                            stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    else:
        proc = psutil.Popen(command, shell=use_shell,
                            stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    proc_id = procs_id
    procs[proc_id] = proc
    procs_id += 1
    data = {}
    while busy_wait:
        returncode = proc.poll()
        if returncode is None:
            try:
                # Attribute names as used by psutil 2.0 and later
                data = proc.as_dict(attrs=['io_counters', 'cpu_times'])
            except Exception:
                pass
            time.sleep(poll_duration)
        else:
            break
    (stdout, stderr) = proc.communicate()
    returncode = proc.returncode
    del procs[proc_id]
    if returncode != 0:
        raise Exception(stderr)
    else:
        if data:
            procs_data.append(data)
        return stdout


if __name__ == '__main__':
    segMeta = getMetadata('bunny_400.ismv')
    print('segMeta=', segMeta)
    for k in segMeta.keys():
        if k == 'iframes':
            print('iframe size =', len(segMeta[k]))
            break
After reading in the video using ffprobe, the data looks like this:
{
  "packets": [
    {
      "codec_type": "video",
      "stream_index": 0,
      "pts": 0,
      "pts_time": "0.000000",
      "dts": 0,
      "dts_time": "0.000000",
      "size": "847",
      "pos": "2927",
      "flags": "K"
    },
    {
      "codec_type": "video",
      "stream_index": 0,
      "pts": 1200000,
      "pts_time": "0.120000",
      "dts": 1200000,
      "dts_time": "0.120000",
      "size": "486",
      "pos": "3804",
      "flags": "_"
    },
    ........
  ],
  "streams": [
    {
      "index": 0,
      "codec_name": "h264",
      "codec_long_name": "H.264 / AVC / MPEG-4 AVC / MPEG-4 part 10",
      "profile": "High",
      "codec_type": "video",
      "codec_time_base": "1/50",
      "codec_tag_string": "avc1",
      "codec_tag": "0x31637661",
      "width": 288,
      "height": 160,
      "has_b_frames": 2,
      "sample_aspect_ratio": "80:81",
      "display_aspect_ratio": "16:9",
      "pix_fmt": "yuv420p",
      "level": 13,
      "r_frame_rate": "25/1",
      "avg_frame_rate": "0/0",
      "time_base": "1/10000000",
      "start_pts": 0,
      "start_time": "0.000000",
      "duration_ts": 5964400000,
      "duration": "596.440000",
      "bit_rate": "400074",
      "nb_read_packets": "14911",
      "disposition": {
        "default": 1,
        "dub": 0,
        "original": 0,
        "comment": 0,
        "lyrics": 0,
        "karaoke": 0,
        "forced": 0,
        "hearing_impaired": 0,
        "visual_impaired": 0,
        "clean_effects": 0,
        "attached_pic": 0
      },
      "tags": {
        "language": "und",
        "handler_name": "VideoHandler"
      }
    }
  ]
}
The run shown here used an input file named video.dat, which is actually a fragmented MP4 file (the script itself hard-codes bunny_400.ismv as the file name).
Output looks like this:
cmd= ['ffprobe', '-show_streams', '-show_packets', '-print_format', 'json', 'video.dat']
metadata= {
 'duration': 596.44,
 'iframes': [
   {'duration': 10.0, 'byte_end': 399823, 'byte_start': 377082},
   {'duration': 10.0, 'byte_end': 998254, 'byte_start': 984197},
   {'duration': 10.0, 'byte_end': 1833216, 'byte_start': 1804498},
   {'duration': 10.0, 'byte_end': 2591816, 'byte_start': 2569925},
   ....
   {'duration': 10.0, 'byte_end': 29431348, 'byte_start': 29422617},
   {'duration': 10.0, 'byte_end': 29633871, 'byte_start': 29633940},
   {'duration': 10.0, 'byte_end': 29801180, 'byte_start': 29793525},
   {'duration': 6.399999999999977, 'byte_end': 29801180, 'byte_start': 29793525}]}
iframe size = 60
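The lambda-based sorting step can be tried on its own, without ffprobe. The sketch below uses a few hand-written packet dictionaries that only mimic the fields shown above; the values are invented. It also shows why the key converts pos with int(): ffprobe reports positions as strings, and strings compare character by character.

```python
# Hand-written stand-ins for ffprobe packets (illustrative values only).
packets = [
    {'codec_type': 'video', 'pos': '3804', 'flags': '_'},
    {'codec_type': 'audio', 'pos': '3100', 'flags': 'K'},
    {'codec_type': 'video', 'pos': '10000', 'flags': 'K'},
    {'codec_type': 'video', 'pos': '2927', 'flags': 'K'},
    {'codec_type': 'video', 'flags': '_'},  # no pos field: filtered out
]

# Keep video packets that have a position, ordered by numeric position.
video_packets = sorted(
    [p for p in packets if p['codec_type'] == 'video' and 'pos' in p],
    key=lambda p: int(p['pos']),
)
positions = [int(p['pos']) for p in video_packets]

# Sorting on the raw strings would put '10000' first.
string_order = sorted(p['pos'] for p in video_packets)

# Key frames are the packets flagged "K".
keyframes = [p for p in video_packets if p['flags'] == 'K']

print(positions)       # [2927, 3804, 10000]
print(string_order)    # ['10000', '2927', '3804']
print(len(keyframes))  # 2
```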