Java Bugs with and without Fuzzing – AFL-based Java fuzzers and the Java Security Manager

Over the last half year I have been doing some fuzzing with AFL-based Java fuzzers, namely Kelinci and JQF. I didn’t really work with java-afl.

Various AFL-based Java fuzzers are available that can be used to find more or less severe security issues. By combining these with sanitizers provided by the Java Security Manager, additional instrumentation can be achieved.

This blog post mentions several files; they are available on GitHub. The zip file there also includes several other files that reproduce the same bugs.

AFL-based Java fuzzing tools

The AFL fuzzer is really popular nowadays as it performs instrumented fuzzing. If you are not familiar with AFL, it’s probably better if you at least quickly look at AFL before you read this post. It is especially important to understand how AFL handles hangs (test cases that take too much time to process) and crashes (e.g. target program segfault).

Kelinci is one of the first AFL for Java implementations and is very promising, although the approach with having two processes per fuzzing instance is a little clumsy and can get confusing. One process is the native C side, which takes mutated inputs produced by AFL and sends them to the second process via TCP socket. The second process is the Java process that feeds the input to the target program and sends back the code paths taken with this input. There are certain error messages in the Java part of this fuzzer that are not always exactly clear (at least to me), but they seem to indicate that the fuzzer is not running in a healthy state anymore. However, so far Kelinci worked very well for me and came up with a lot of results. There has not been any development for 7 months, so I hope the author will pick it up again.

JQF is actively maintained and the last changes were committed a couple of days ago. It does not take the classic fuzzer approach that most fuzzers for security researchers take, but instead is based on Java unit tests and focuses more on developers. It currently has only limited support for AFL’s -t switch for the timeout settings, and there is also only rudimentary afl-cmin support. While this is perfect for developers using unit tests, it is not the most flexible fuzzer for security researchers fuzzing Java code.

java-afl has not been updated in four months. This is actually the fuzzer I didn’t successfully use at all. I tried to ask the developer about how to run it properly, but didn’t get an answer that would help me run it on the test case I had in mind. If you have better luck with java-afl, please let me know, it would be interesting to hear how this fuzzer performs.

First steps with Apache Commons

I started with the Apache Commons Imaging JPEG parser as a target. The choice was simple because it was one of the examples explained for the Kelinci fuzzer. Apache Commons is a very popular library for all kinds of things that are missing or incomplete in the Java standard library. When going through the author’s example, I realized that he gave the fuzzer only one input file containing the text “hello”, which is not a JPEG file and not a very good starting corpus. While it’s probably lcamtuf’s very interesting experiment that makes people believe using such corpus data is a valid choice, it is not a valid choice for proper fuzzing runs. lcamtuf’s experiment was good to prove the point that the fuzzer is smart, but for productive fuzzing proper input files have to be used to achieve good results. Fuzzing is all about corpus data in the end. So I took the JPEG files in lcamtuf’s corpus on the AFL website and some files from my private collection. The fuzzer quickly turned up an additional ArrayIndexOutOfBoundsException, which I reported to Apache (file ArrayIndexOutOfBoundsException_DhtSegment_79.jpeg). That was quite an easy start into Java fuzzing. If you did the same for other parsers of Apache Commons (for example the PNG parser), you would most probably find some more unchecked exceptions.

Goals: Taking a step back

After this quick experiment I gave the idea of fuzzing Java some more thought. Fuzzing was originally applied to programs that are not memory safe, in the hope of finding memory corruption issues. Out-of-bounds reads or writes in Java code simply do not result in memory corruption but in more or less harmless exceptions such as IndexOutOfBoundsException. While it might be desirable to find such (code robustness) issues, and they might result in Denial of Service, their severity is usually low. The question is: what kind of behavior and fuzzing results are we looking for? There are different scenarios that might be of interest, but the attack vector (how does the attacker exploit the issue in the real world?) matters a lot when looking at them. Here is my rough (over)view on Java fuzzing:

  • Finding bugs in the JVM.
    • Arbitrary Java code as input. This could be helpful in more exotic scenarios, for example when you need to escape from a sandboxed JVM. In most other scenarios this attack vector is probably just unrealistic, as an attacker would be executing Java code already.
    • Feeding data into built-in classes/functions (fuzzing the standard library), such as strings. This is not very likely to come up with results, but you never know, maybe there are Java deserialization vulnerabilities lurking deep in the JVM code?
    • Finding low-severity or non-security issues such as code that throws an Exception it didn’t declare to throw (RuntimeExceptions).
  • Finding memory corruption bugs in Java code that uses native code (for example JNI or CNI). This is probably a very good place to use Java fuzzing, but I don’t encounter this situation very much except in Android apps. And fuzzing Android apps is an entirely different beast that is not covered here.
  • Fuzzing pure Java code.
    • We could go for custom goals. This might depend on your business logic. For example, if the code heavily uses file read/writes maybe there is some kind of race condition? Also the idea of differential fuzzing for crypto libraries makes a lot of sense.
    • Finding “resource management” issues, such as Denial of Service (DoS) issues, OutOfMemoryErrors, high CPU load, high disk space usage, or functions that never return.
    • Finding low-severity or non-security issues such as RuntimeExceptions.
    • Finding well-known security issues for Java code, such as Java deserialization vulnerabilities, Server Side Request Forgery (SSRF), and XML External Entity (XXE) injection.

I was especially interested in the last three points in this list: finding resource issues, RuntimeExceptions and well-known Java security issues. While I had already found a RuntimeException in my little experiment described above, I was pretty sure that I would be able to detect certain resource management issues by checking the “hangs” directory of AFL. However, the last point of finding well-known security issues such as SSRF seems tricky. The fuzzer would need additional instrumentation or sanitizers to detect such insecure behavior. Just as Address Sanitizer (ASAN) aborts on invalid memory access for native code (which then leads to a crash inside AFL), it would be nice to have sanitizers that take care of such areas in the Java world. A file sanitizer, for example, might take a whitelist of files that the process is allowed to access and abort if any other file is accessed. This could be used to detect XXE and SSRF scenarios. A network sanitizer might do the same if sockets are used. Imagine a Java image file parsing library as a target. From a security perspective such a library should never open network sockets, as this would indicate Server Side Request Forgery. This is a very realistic scenario, and I have found XXE issues in PNG XMP metadata parsing libraries before.

Java Security Manager

After doing some research it turned out that there is nothing like a file whitelist sanitizer for native code, where AFL is usually used. So if we fuzzed any C/C++ code we would have to write our own, and as stated by Jakub Wilk it might be tricky to implement due to async-signal-safe filesystem functions. So if you feel like writing one, please go ahead.

Back to Java: I found out that such a sanitizer already exists. The best part is that it’s a built-in feature of the JVM, called the Java Security Manager. Look at this simple Java Security Manager policy file I created for running the Kelinci fuzzer with our simple Apache Commons JPEG parsing code:

grant {
    permission java.io.FilePermission "/tmp/*", "read,write,delete";
    permission java.io.FilePermission "in_dir/*", "read";
    permission java.io.FilePermission "/opt/kelinci/kelinci/examples/commons-imaging/out_dir/*", "read, write, delete";
    permission java.io.FilePermission "/opt/kelinci/kelinci/examples/commons-imaging/out_dir/master/*", "read, write, delete";
    permission java.io.FilePermission "/opt/kelinci/kelinci/examples/commons-imaging/out_dir/master0/*", "read, write, delete";
    permission java.io.FilePermission "/opt/kelinci/kelinci/examples/commons-imaging/out_dir/master1/*", "read, write, delete";
    permission java.io.FilePermission "/opt/kelinci/kelinci/examples/commons-imaging/out_dir/slave/*", "read, write, delete";
    permission java.io.FilePermission "/opt/kelinci/kelinci/examples/commons-imaging/out_dir/slave0/*", "read, write, delete";
    permission java.io.FilePermission "/opt/kelinci/kelinci/examples/commons-imaging/out_dir/slave1/*", "read, write, delete";
    permission java.net.SocketPermission "localhost:7007-", "accept, listen, resolve";
    permission java.lang.RuntimePermission "modifyThread";
};

All it does is allow file access to the temporary directory, reading from the input directory (in_dir) and writing to the output directory (out_dir) of AFL. Moreover, it allows the Kelinci Java process to listen on TCP ports 7007 and above (that is what the trailing dash in “7007-” means) as well as to modify other threads. As the Security Manager is built into every JVM, you can simply enable it by adding two more arguments to your usual command line:

java -Djava.security.manager -Djava.security.policy=java-security-policy.txt

So in our case we can run the Kelinci fuzzer server process with:

java -Djava.security.manager -Djava.security.policy=java-security-policy.txt -Djava.io.tmpdir=/tmp/ -cp bin-instrumented:commons-imaging-1.0-instrumented.jar edu.cmu.sv.kelinci.Kelinci driver.Driver @@
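To get a feeling for how the FilePermission entries in such a policy are matched, here is a minimal sketch that uses the JDK’s java.io.FilePermission class directly, without enabling the Security Manager. The class name FilePermissionDemo, the helper covers() and the sample paths are made up for illustration and are not part of Kelinci or of my setup; the sketch assumes a Unix-like system where “/” is the path separator.

```java
import java.io.FilePermission;

// Sketch: how FilePermission patterns from a policy file match concrete paths.
// FilePermissionDemo and covers() are hypothetical names used for illustration.
final class FilePermissionDemo {

    // Does a grant for `pattern` with `grantedActions` cover `action` on `path`?
    static boolean covers(String pattern, String grantedActions, String path, String action) {
        FilePermission granted = new FilePermission(pattern, grantedActions);
        FilePermission requested = new FilePermission(path, action);
        return granted.implies(requested);
    }

    public static void main(String[] args) {
        String actions = "read,write,delete";

        // "dir/*" covers the files directly inside dir ...
        System.out.println(covers("/tmp/*", actions, "/tmp/sample.jpeg", "read"));
        // ... but not files in sub-directories; that needs the recursive "dir/-" form.
        System.out.println(covers("/tmp/*", actions, "/tmp/sub/sample.jpeg", "read"));
        System.out.println(covers("/tmp/-", actions, "/tmp/sub/sample.jpeg", "read"));

        // Anything outside the granted directories is not covered.
        System.out.println(covers("/tmp/*", actions, "/etc/passwd", "read"));

        // The action matters too: a read-only grant does not cover writes.
        System.out.println(covers("in_dir/*", "read", "in_dir/seed.jpeg", "write"));
    }
}
```

This is also why the policy above lists each AFL output sub-directory (master, slave0, …) separately: a “*” entry does not descend into sub-directories.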

I went back and ran the Kelinci fuzzer some more hours on the Apache Commons JPEG parser without getting any new results with the Java Security Manager. However, at this point I was convinced that the Java Security Manager would take Java fuzzing to the next level. I just needed a different target first.

Targeting Apache Tika

Fast forward several days: I stumbled over the Apache Tika project. As Apache Tika was formerly part of Apache Lucene, I was convinced that a lot of servers on the Internet would allow users to upload arbitrary files to be parsed by Apache Tika. As I’m currently maintaining related research on web-based file upload functionality (the UploadScanner Burp extension), this got me even more interested.

Apache Tika is a content analysis toolkit and can extract text content from over a thousand different file formats. A quick’n’dirty grep estimate showed that it has about 247 Java JAR files as dependencies at compile time. Apache Tika also had some severe security issues in the past. So as a test target Apache Tika seemed to fit perfectly. On the other hand I also knew that using such a big code base is a bad idea when fuzzing with AFL. AFL will more or less quickly deplete the fuzzing bitmap when the instrumented code is too large. Afterwards, AFL will be unable to detect when an input results in an interesting code path being taken. I was also not sure if I could successfully use the Java fuzzers to instrument the huge Apache Tika project. However, I decided to go on with this experiment.

I first tried to get things running with Kelinci, but ran into multiple issues and ended up creating a “works-for-me” Kelinci fork. After Kelinci was running, I also tried to get the JQF fuzzer running. However, I ran into similar but distinct problems and therefore decided to stick with Kelinci at this point. For Tika I had to adapt the Java Security Manager policy:

grant {
    //Permissions required by Kelinci
    permission java.lang.RuntimePermission "modifyThread";
    
    permission java.net.SocketPermission "localhost:7007", "listen, resolve";
    permission java.net.SocketPermission "localhost:7008", "listen, resolve";
    permission java.net.SocketPermission "localhost:7009", "listen, resolve";
    permission java.net.SocketPermission "localhost:7010", "listen, resolve";
    permission java.net.SocketPermission "[0:0:0:0:0:0:0:1]:*", "accept, resolve";
    
    permission java.io.FilePermission "in_dir/*", "read";
    permission java.io.FilePermission "corpus/*", "read, write";
    permission java.io.FilePermission "crashes/*", "read";
    permission java.io.FilePermission "out_dir/*", "read, write";
    
    //Permissions required by Tika
    permission java.io.FilePermission "tika-app-1.17.jar", "read";
    permission java.io.FilePermission "tika-app-1.17-instrumented.jar", "read";

    permission java.io.FilePermission "/tmp/*", "read, write, delete";
    
    permission java.lang.RuntimePermission "getenv.TIKA_CONFIG";
    
    permission java.util.PropertyPermission "org.apache.tika.service.error.warn", "read";
    permission java.util.PropertyPermission "tika.config", "read";
    permission java.util.PropertyPermission "tika.custom-mimetypes", "read";
    permission java.util.PropertyPermission "org.apache.pdfbox.pdfparser.nonSequentialPDFParser.eofLookupRange", "read";
    permission java.util.PropertyPermission "org.apache.pdfbox.forceParsing", "read";
    permission java.util.PropertyPermission "pdfbox.fontcache", "read";
    permission java.util.PropertyPermission "file.encoding", "read";

    //When parsing certain PDFs...
    permission java.util.PropertyPermission "user.home", "read";
    permission java.util.PropertyPermission "com.ctc.wstx.returnNullForDefaultNamespace", "read";
    
    //When parsing certain .mdb files...
    permission java.util.PropertyPermission "com.healthmarketscience.jackcess.resourcePath", "read";
    permission java.util.PropertyPermission "com.healthmarketscience.jackcess.brokenNio", "read";
    permission java.util.PropertyPermission "com.healthmarketscience.jackcess.charset.VERSION_3", "read";
    permission java.util.PropertyPermission "com.healthmarketscience.jackcess.columnOrder", "read";
    permission java.util.PropertyPermission "com.healthmarketscience.jackcess.enforceForeignKeys", "read";
    permission java.util.PropertyPermission "com.healthmarketscience.jackcess.allowAutoNumberInsert", "read";
    permission java.util.PropertyPermission "com.healthmarketscience.jackcess.timeZone", "read";
};

To produce this policy file manually was much more annoying than for Apache Commons. The reason is that the necessary permissions we need to whitelist depend on the input file. So if a PNG file is fed into Apache Tika, it will need other runtime property permissions than if a PDF file is fed into Apache Tika. This means that we have to do a dry run first that will go through the entire input corpus of files and run them once with the minimum policy file. If a security exception occurs, it might be necessary to whitelist another permission. This process takes a lot of time. However, as an article from 2004 states:

There’s currently no tool available to automatically generate a [Java security] policy file for specific code.

So that’s why I wrote another quick’n’dirty hack/tool to generate Java security policy files. As it’s not a beauty I gave it the ugly name TMSJSPGE on GitHub. However, it does its job and generates a Java security policy file. It feeds each corpus file to the target process (Tika in this case) and adds a new rule to the security policy whenever a permission is missing.
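The core idea of such a generator can be sketched in a few lines. When the Security Manager denies something, the JVM reports an AccessControlException whose message has the form access denied ("class" "name" "actions"), and that message already contains everything needed for a policy rule. The following is only an illustration of that idea and not the actual TMSJSPGE code; the class name PolicyLineSketch and the method toPolicyLine() are hypothetical.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Sketch: turn an "access denied (...)" message into a policy file rule.
// Hypothetical helper for illustration, not taken from TMSJSPGE.
final class PolicyLineSketch {

    // access denied ("<permission class>" "<target name>" "<actions>")
    // The actions part is optional, e.g. for RuntimePermission.
    private static final Pattern DENIED = Pattern.compile(
            "access denied \\(\"([^\"]+)\" \"([^\"]+)\"(?: \"([^\"]*)\")?\\)");

    static Optional<String> toPolicyLine(String exceptionMessage) {
        Matcher m = DENIED.matcher(exceptionMessage);
        if (!m.find()) {
            return Optional.empty(); // not a denial message we understand
        }
        StringBuilder line = new StringBuilder("    permission ")
                .append(m.group(1))
                .append(" \"").append(m.group(2)).append('"');
        String actions = m.group(3);
        if (actions != null && !actions.isEmpty()) {
            line.append(", \"").append(actions).append('"');
        }
        return Optional.of(line.append(';').toString());
    }

    public static void main(String[] args) {
        String msg = "access denied (\"java.util.PropertyPermission\" \"user.home\" \"read\")";
        // Prints a rule that can be pasted into the grant { ... } block
        toPolicyLine(msg).ifPresent(System.out::println);
    }
}
```

A generator built on this would run the target once per corpus file, collect the rules, append them to the grant block and repeat until no new denials show up. Every generated rule should still be reviewed by hand, otherwise you might whitelist exactly the behavior you wanted the fuzzer to flag.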

If you look at the above property permissions, I’m still not sure what they are all doing. However, I just decided I’ll go with them and allow Tika to use them.

If you run your fuzzer with different input files, you might have to adapt the Java security policy, as other code paths might require new permissions. So the above security policy for Apache Tika is likely to be incomplete.

Apache Tika findings

As already explained, a good input corpus is vital for a successful fuzzing run. Additionally, I had to run Tika with as many files as possible to make sure the Java Security Policy covered most permissions necessary. Over the years I’ve collected many input sample files (around 100’000) by doing fuzzing runs with various libraries and by collecting third-party files (that’s actually a topic for another day). So I decided I will run the TMSJSPGE tool with each of these 100’000 files to create the best Security Policy I can. When I checked back on the TMSJSPGE I saw that the tool was stuck feeding a certain file to Apache Tika. This means that Apache Tika never returned a result and the process hung. And that meant I already found security issues in Apache Tika 1.17 before I even started fuzzing. After removing the file that resulted in a hang and restarting TMSJSPGE, Apache Tika hung with several other files as well. Some of the files triggered the same hang and after deduplicating, the following two security issues were reported to Apache Tika:

I was wondering where these input files I had in my collection were coming from. Several BPG files triggering the issue were from a fuzzing run I once did for libbpg, so they were produced by AFL when creating BPG files for a native library. But the chm file triggering the other issue was a file that I downloaded a long time ago from the fuzzing project. It was a file Hanno Böck provided that came out of a fuzzing run for CHMLib. Interesting.

So here I was and had already found an uncaught exception in Apache Commons and two low severity issues in Apache Tika without even starting to do proper fuzzing.

To get an idea of the Java classes causing the issue I ran Apache Tika with a debugger and the triggering file, stopped the execution during the infinite loop and printed a stack trace. But most of the hard work to figure out the actual root causes of these issues was done by the maintainers, most importantly by Tim Allison and the Apache Tika team. That is also true for all the upcoming issues.
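If you don’t want to attach a debugger for every hanging file, a small watchdog can do roughly the same job: run the parser call on a separate thread, wait for a fixed time, and print the stack trace of the thread if it is still busy. The following is a minimal sketch of that idea and not what I actually used; HangWatchdog and finishesWithin() are hypothetical names, and the Runnable stands in for whatever call feeds a file into the parser.

```java
// Sketch: detect a hanging call and show where it is stuck.
// HangWatchdog is a hypothetical helper, not part of Tika, Kelinci or JQF.
final class HangWatchdog {

    // Runs `task` on a daemon thread and returns true if it finished in time.
    // If not, the current stack trace of the worker is printed to stderr.
    static boolean finishesWithin(Runnable task, long timeoutMillis) {
        if (timeoutMillis <= 0) {
            // join(0) would wait forever, which defeats the purpose
            throw new IllegalArgumentException("timeout must be positive");
        }
        Thread worker = new Thread(task, "parser-under-test");
        worker.setDaemon(true); // a stuck worker must not keep the JVM alive
        worker.start();
        try {
            worker.join(timeoutMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
        }
        if (!worker.isAlive()) {
            return true;
        }
        System.err.println("Still running after " + timeoutMillis + " ms, stuck at:");
        for (StackTraceElement frame : worker.getStackTrace()) {
            System.err.println("    at " + frame);
        }
        worker.interrupt(); // polite request; a real endless loop will likely ignore it
        return false;
    }

    public static void main(String[] args) {
        // Stand-in for an endless loop in a parser (this one at least stops on interrupt)
        Runnable stuck = () -> {
            while (!Thread.currentThread().isInterrupted()) {
                // spin
            }
        };
        System.out.println("finished in time: " + finishesWithin(stuck, 500));
    }
}
```

The topmost frames of the printed trace usually point at the loop that never terminates, which is a good starting point for a bug report.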

Fuzzing Apache Tika with Kelinci

After sorting out the input files that resulted in a hang, I started a couple of afl-fuzz fuzzing instances and waited. The behavior of the Kelinci fuzzer is sometimes a little brittle, so I often got the “Queue full” error message. It means the fuzzer is not running properly anymore and that timeouts will occur. I had to restart the fuzzing instances several times and tried to tweak the command line settings to improve stability. However, over time the instances often managed to fill up the queue again. Anyway, a couple of instances ran fine and found several “AFL crashes”. Keep in mind that “AFL crashes” in this case just mean uncaught Java exceptions. After looking through and deduplicating issues, I reported the following non-security (or very low severity, a matter of definition) issues to the maintainers of the libraries used by Apache Tika:

The hang directory of AFL did not show any interesting results. After running each of the files in the hang directory with Apache Tika I found a PDF file that took nearly a minute to process, but none of the files led to a full hang of the Tika thread. I suspect that the synchronization of the two processes was one of the reasons no infinite hangs were found by the fuzzer.

However, at this stage I was most disappointed that none of the crashes indicated that anything outside of the specified Java Security Manager policy was triggered. I guess this was a combination of my brittle configuration of Kelinci and the fact that it is probably not as easy to find arbitrary file read or write issues. But in the end you often simply don’t know what’s exactly the reason for not being successful with fuzzing.

JQF and a bug in Java

At one point I also wanted to try the JQF fuzzer on my ARM fuzzing machines with Apache Tika. It didn’t work for me at first and I found out that OpenJDK on ARM had horrible performance with JQF, so I switched to Oracle’s Java. Additionally, Apache Tika would simply not run with JQF. After the Tika 1.17 issues were fixed in Apache Tika I thought it was time to notify the maintainers of the fuzzers, so they could try to fuzz Apache Tika themselves. Rohan (maintainer of JQF) quickly fixed three independent issues and implemented a test case/benchmark for the fixed Tika 1.18 in JQF. After that I was able to fuzz Tika with my own corpus, but the performance was very bad for various reasons. One reason was the weak ARM boxes, but JQF couldn’t handle timeouts either (AFL’s -t switch). Rohan attempted a fix, but it’s only working sometimes. Rohan was also very quick to implement afl-cmin and said running with a Java Security Manager policy should be no problem. However, I couldn’t try those features properly due to the performance problems on the ARM machines. As I was not in the mood to switch fuzzing boxes, I just tried to get the fuzzer running somehow. After cutting down the input corpus and removing all PDF files that were taking potentially longer to be processed by Apache Tika, the fuzzer crept slowly forward. After not paying attention for 10 days, another hang was found by JQF in Apache Tika 1.18… I thought! However, after submitting this bug to Apache Tika, they pointed out that this was actually a bug in the Java standard libraries affecting Java before version 10 that I rediscovered:

The hang file was created by the JQF fuzzer by modifying a sample QCP file “fart_3.qcp” from the public ffmpeg samples. So without actively targeting Java itself, I had rediscovered a bug in Java’s standard libraries, as Tika used it. Quite an interesting twist.

Adding a x86 fuzzing box

At the same time I also realized that these ARM JQF fuzzer instances were stuck. The endless RIFF loop file was detected as a crash (which might just be bad behavior of JQF for hangs), so I didn’t really know why they were stuck this time. I tried to run the current input file on another machine, but the testcase didn’t hang. So I didn’t figure out why the fuzzer got stuck, but as Rohan pointed out, the timeout handling (AFL’s “hangs”) isn’t optimal yet. JQF will detect timeouts when the infinite loop hits an instrumented part of the Java code, as it will be able to measure the time that passed. However, JQF will hang for now if a test file makes the code loop forever in non-instrumented code. I removed all RIFF/QCP input files so hopefully I wouldn’t rediscover the RIFF endless loop bug again (I never switched to Java 10) and restarted the fuzzing instances.

I decided to additionally use a 32-bit x86 VMware fuzzing box; maybe it would run more stably there. I set up JQF with Java 8 again and without RIFF files as inputs. The x86 virtual machine performed much better, executing around 10 testcases per second. So I let these instances run for several days… just to realize when I came back that both instances got stuck after 7 hours of running. I checked again if the current input file could be the reason and this time this was exactly the problem, so another bug. Rinse and repeat, the next morning another bug. So after a while (at least 5 iterations) I had a bag full of bugs:

  • An endless loop in Junrar (file 11_hang_junrar_zero_header2.rar), where the code simply never returned when the rar header size is zero. I contacted one of the maintainers, beothorn. It was fixed and this issue ended up as CVE-2018-12418.
  • Infinite loop in Apache Tika’s IptcAnpaParser for handling IPTC metadata (file 12_hang_tika_iptc.iptc), where the code simply never returned. This was fixed and assigned CVE-2018-8017.
  • Infinite loop in Apache PDFbox’ AdobeFontMetricsParser (file 16_570s_fontbox_OOM.afm), after nearly 10 minutes (on my machine) leading to an out of memory situation. This was fixed and assigned CVE-2018-8036.
  • An issue when a specially crafted zip content is read with Apache Commons Compress (file 14_69s_tagsoup_HTMLScanner_oom.zip) that leads to an out of memory exception. This was fixed in Apache Commons Compress and CVE-2018-11771 was assigned. Another zip file created (file 15_680s_commons_IOE_push_back_buffer_full.zip) runs for 11 minutes (on my machine) leading to IOException with a message that the push back buffer is full and is probably related to the issue. Also probably the same issue is a file where Tika takes an arbitrary amount of time (during the tests between 20 seconds and 11 minutes) to process a zip file (file 13_48s_commons_truncated_zip_entry3.zip). This last one is worth a note as JQF correctly detected this as a hang and put it in AFL’s hang directory. The underlying problem of CVE-2018-11771 was that a read operation started to return alternating values of -1 and 345 when called by an InputStreamReader with UTF-16. The minimal code to reproduce is:
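import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import org.apache.commons.compress.archivers.zip.ZipArchiveEntry;
import org.apache.commons.compress.archivers.zip.ZipArchiveInputStream;
import org.junit.Test; // assuming JUnit 4

@Test
public void testMarkResetLoop() throws Exception {
    // Adjust the path to wherever the file is stored
    try (InputStream is = Files.newInputStream(Paths.get("C:/14_69s_tagsoup_HTMLScanner_oom.zip"));
         ZipArchiveInputStream archive = new ZipArchiveInputStream(is)) {
        ZipArchiveEntry entry = archive.getNextZipEntry();
        while (entry != null) {
            if (entry.getName().contains("one*line-with-eol.txt")) {
                // Read the current entry character by character as UTF-16LE
                Reader r = new InputStreamReader(archive, StandardCharsets.UTF_16LE);
                int i = r.read();
                int cnt = 0;
                while (i != -1) {
                    // With the bug present read() keeps returning data, so bail out
                    if (cnt++ > 100000) {
                        throw new RuntimeException("Infinite loop detected...");
                    }
                    i = r.read();
                }
            }
            entry = archive.getNextZipEntry();
        }
    }
}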

After all these fixes I ran the fuzzer again on a nightly build of Apache Tika 1.19 and it didn’t find any new issues in more than 10 days. So my approach of fuzzing Tika seems to be exhausted. As always, it doesn’t mean another approach wouldn’t find new issues.

Summary

This is where I stopped my journey of Java fuzzing for now. I was a little disappointed that the approach with the Java Security Manager still did not find any security issues such as SSRF and that I only found resource management issues. However, I’m pretty sure this strategy is still the way to go; it probably just needs other targets. As you can see there are loose ends everywhere and I’m definitely planning to go back to Java fuzzing:

  • Use Kelinci/JQF with other Apache Commons parsers, e.g. for PNG
  • Write sanitizers such as file or socket opening for native code AFL
  • Contribute to the AFL-based Java fuzzers

However, for now there are other things to break on my stack.

I would like to thank Tim Allison of the Apache Tika project, it was a pleasure to do coordinated disclosure with him. And also a big thanks to Rohan Padhye who was really quick implementing new features in JQF.

Make sure you add the files included on github to your input corpus collection, as we saw it’s worth having a collection of crashes for other libraries when targeting new libraries.

Crash bash

Fuzzing Bash-4.4 patch 12 with AFL mainly fork bombed the fuzzing machine, but it also found this crash (they all have the same root cause):

<&-<${}
<&"-"<"$[~]"
<&"-"<"${}"
<&"-"<"${$0}"
<&"-"<$(())

It also works on Bash 3.2.57, but some friends told me that they needed the following to reproduce:

echo -ne '<&-<${}'|bash

An Ubuntu user told me it was not reproducible at all, but I rather suspect his whoopsie didn’t want him to see it. Edit: As pointed out by Matthew in the comments it also works on Ubuntu.

It looks like a null pointer dereference to me:

Program received signal SIGSEGV, Segmentation fault.
0x000912a8 in buffered_getchar () at input.c:565
565	  return (bufstream_getc (buffers[bash_input.location.buffered_fd]));
(gdb) bt
#0  0x000912a8 in buffered_getchar () at input.c:565
#1  0x0002f87c in yy_getc () at /usr/homes/chet/src/bash/src/parse.y:1390
#2  0x000302cc in shell_getc (remove_quoted_newline=1) at
/usr/homes/chet/src/bash/src/parse.y:2299
#3  0x0002e928 in read_token (command=0) at
/usr/homes/chet/src/bash/src/parse.y:3115
#4  0x00029d2c in yylex () at /usr/homes/chet/src/bash/src/parse.y:2675
#5  0x000262cc in yyparse () at y.tab.c:1834
#6  0x00025efc in parse_command () at eval.c:261
#7  0x00025de8 in read_command () at eval.c:305
#8  0x00025a70 in reader_loop () at eval.c:149
#9  0x0002298c in main (argc=1, argv=0xbefff824, env=0xbefff82c) at
shell.c:792
(gdb) p bash_input.location.buffered_fd
$1 = 0
(gdb) p buffers
$2 = (BUFFERED_STREAM **) 0x174808
(gdb) x/10x 0x174808
0x174808:	0x00000000	0x00000000	0x00000000	0x00000000
0x174818:	0x00000000	0x00000000	0x00000000	0x00000000
0x174828:	0x00000000	0x00000000

The maintainers of bash were notified.

About the CVEs in libtiff 4.0.3

There has been a lot of AFL fuzzing going on and a lot of image libraries were targeted. I also fuzzed some libraries, for example libtiff (back when it was still on remotesensing.org…). I sent the maintainer around 10 to 20 crash files for the different tools that seemed to be more or less unique crash cases, although I didn’t analyze many of the crashes in depth. Others found similar issues, and CVEs like CVE-2014-8129, CVE-2014-8128, CVE-2014-8127 and CVE-2014-9330 were assigned. Additionally, I got CVE-2015-8870.

Here’s the example that I analyzed a little bit more closely (and that got the identifier CVE-2015-8870) in libtiff version 4.0.3 (until this month the last stable). It’s one of the errors in the bmp2tiff command line tool. Here’s what happens when you run it with one of my crash files (bmp2tiff crash-file.bmp outfile.tiff).

First, width and length variables are read from the bmp file header. Then the needed memory for the uncompressed image is calculated and allocated (line 595 in bmp2tiff.c):

uncompr_size = width * length;
...
uncomprbuf = (unsigned char *)_TIFFmalloc(uncompr_size);

However, there is no check for an integer overflow. So in my example afl made a file that results in the following values (gdb output):

(gdb) p width
$70 = 65536
(gdb) p length
$71 = 65544
(gdb) p uncompr_size
$72 = 524288

Here 524288 is (65536 * 65544) mod 2^32, i.e. the product truncated to 32 bits. However, later on the width and length are used to calculate offsets into the uncomprbuf buffer, which results in pointers that are far off (heap buffer overflow).

Although I didn’t check the entire code, I think this is not easily exploitable, as it can only be used to read (more or less) arbitrary memory regions and write them to the output file. While this might be interesting in scenarios where you look for memory leaks, I doubt that it’s useful in any realistic attack scenario. Drop me a comment if I’m wrong. So the fix was to check if an integer overflow occurs on line 595 in bmp2tiff.c, which is done in the new version according to the maintainer.
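The arithmetic is easy to check. The following sketch redoes the calculation in Java instead of C, simply because Java’s int multiplication wraps to 32 bits in the same way; it is an illustration only and not the code of the libtiff fix, which I haven’t looked at. The class and method names are made up, and the checked variant uses the JDK’s Math.multiplyExact as one possible way to detect the overflow before allocating.

```java
// Sketch: the size calculation from bmp2tiff, redone with Java's 32-bit int.
// AllocationSizeOverflow is a hypothetical name used for illustration.
final class AllocationSizeOverflow {

    // Same pattern as "uncompr_size = width * length": silently wraps to 32 bits.
    static int uncheckedSize(int width, int length) {
        return width * length;
    }

    // One possible fix: refuse to continue if the product does not fit.
    static int checkedSize(int width, int length) {
        return Math.multiplyExact(width, length); // throws ArithmeticException on overflow
    }

    public static void main(String[] args) {
        int width = 65536;
        int length = 65544;

        // 65536 * 65544 = 2^32 + 2^19 = 4295491584, which needs more than 32 bits
        System.out.println("real product:    " + ((long) width * length));
        // Only the low 32 bits survive: 2^19 = 524288, matching the gdb output
        System.out.println("wrapped product: " + uncheckedSize(width, length));

        try {
            checkedSize(width, length);
        } catch (ArithmeticException e) {
            System.out.println("overflow detected before allocating");
        }
    }
}
```

So the buffer is allocated with 524288 bytes, while the later offset calculations assume an image of more than 4 GB.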

Take a second and think about how many projects are probably using libtiff.

Looking into another crash file with an arbitrary WRITE and turning it into a fully weaponized exploit is still on my TODO list… we’ll see.

cheers,
floyd

New year – Vallader app, fuzzing and advisories

Happy new year everybody,

As some of you know I’m learning a new language (Vallader Romansh) and because that language is only spoken by a few ten thousand people there is no dictionary Android app. So hey, here is a version I coded in half a day on github and on Google Play. I never took the time to improve it, so I thought I’d simply release it today (which took me another half a day). The app isn’t very stable and not well tested, but I guess some app is better than no app at all. Send me pull requests 😉

Moreover, I’ve been fuzzing quite a lot in the last few months and the results are crazy, thanks to AFL. I’m writing heap buffer overflow exploits and I hope I’ll write some more posts about it soon.

If you haven’t seen it, we’ve been releasing a few advisories in 2014.

Additionally, I just changed some settings on this page. You won’t be bothered with third party JavaScript includes on this domain anymore.

How webservers react on specific characters

One thing I did during my Master Thesis a while ago was to test how different webservers react to all kinds of characters. One of the first things I tested was all characters represented by one byte (00 to FF) and their percent-encoded equivalents (%00 to %FF). Of course the results may vary with other server versions, server configurations, server side code, client libraries or the sent HTTP headers. For example Python’s urllib2 is not able to send 0A (line feed) in a URI (which makes sense). I tried to use standard components as best as I could. The webservers I used were:

  • An Apache 2.2.12 server (port 80), Ubuntu 9.10 machine with PHP 5.2.10
  • On the same machine a Tomcat 6.0.26 server (port 8080) with JSP (Java Server Pages)
  • On a Microsoft-IIS/6.0, Windows 2003 Server R2/SP2 with ASP.NET 2.0.50727 a script in C# on Virtualbox 3.1.8

So here are the main results in one picture:

[Image: character table for testing webservers]

The ‘Name’ column means that the character was injected into the parameter name, e.g. na%00me=value&a=b. The fields with ‘S’ are explained in another section of my Master Thesis, but some of the time you can guess the behavior. E.g. I think you know what & stands for in GET parameters, right? 😉

This kind of information is useful when you are trying to write a fuzzer that focuses on tests that make sense. I’d be interested to hear if this table is useful for someone else.
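If you want to repeat the ‘Name’ column test against your own server, generating the 256 percent-encoded query strings is the easy part. Here is a minimal sketch; it only builds the strings in the na%00me=value&a=b style mentioned above and leaves sending them to whatever HTTP client you prefer. ByteProbeQueries and nameProbes() are hypothetical names and not the code from my thesis.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch: build one query string per byte value, with the percent-encoded
// byte injected into the parameter name. Hypothetical helper for illustration.
final class ByteProbeQueries {

    static List<String> nameProbes() {
        List<String> probes = new ArrayList<>(256);
        for (int b = 0x00; b <= 0xFF; b++) {
            // "%%" is a literal percent sign, "%02X" the byte as two hex digits
            probes.add(String.format("na%%%02Xme=value&a=b", b));
        }
        return probes;
    }

    public static void main(String[] args) {
        List<String> probes = nameProbes();
        System.out.println(probes.size() + " probes, from " + probes.get(0)
                + " to " + probes.get(probes.size() - 1));
    }
}
```

Sending the raw (not percent-encoded) bytes is more work, because many client libraries refuse to put them into a request line; that usually means writing the request to a plain socket yourself.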