Various AFL-based Java fuzzers are available that can be used to find more or less severe security issues. Combining them with the Java Security Manager, which can act as a sanitizer, adds further instrumentation.
Over the last six months I have been doing some fuzzing with AFL-based Java fuzzers, namely Kelinci and JQF. I didn’t really work with java-afl. The contents of this post are:
- Brief explanation of AFL-based Java fuzzers
- First steps and explanation of how an uncaught exception in Apache Commons was found
- Explanation of the goals for Java fuzzing
- Using the Java Security Manager for fuzzing
- Targeting Apache Tika
- Apache Tika findings
- Fuzzing Apache Tika with Kelinci
- JQF and a bug in Java
- Adding an x86 fuzzing box
- Summary
This blog post mentions several files, which are included on GitHub. Additionally, the zip file includes several other files that reproduce the same bugs.
AFL-based Java fuzzing tools
The AFL fuzzer is really popular nowadays as it performs instrumented fuzzing. If you are not familiar with AFL, it’s probably better if you at least quickly look at AFL before you read this post. It is especially important to understand how AFL handles hangs (test cases that take too much time to process) and crashes (e.g. target program segfault).
Kelinci is one of the first AFL for Java implementations and is very promising, although the approach with having two processes per fuzzing instance is a little clumsy and can get confusing. One process is the native C side, which takes mutated inputs produced by AFL and sends them to the second process via TCP socket. The second process is the Java process that feeds the input to the target program and sends back the code paths taken with this input. There are certain error messages in the Java part of this fuzzer that are not always exactly clear (at least to me), but they seem to indicate that the fuzzer is not running in a healthy state anymore. However, so far Kelinci worked very well for me and came up with a lot of results. There has not been any development for 7 months, so I hope the author will pick it up again.
JQF is actively maintained and the last changes were committed a couple of days ago. It does not take the classic fuzzer approach that most fuzzers for security researchers take, but instead is based on Java unit tests and focuses more on developers. It currently has only limited support for AFL’s -t switch for the timeout settings and there is also only rudimentary afl-cmin support. While this is perfect for developers using unit tests, it is not the most flexible fuzzer for security researchers fuzzing Java code.
java-afl has not been updated in four months. This is actually the fuzzer I didn’t successfully use at all. I tried to ask the developer about how to run it properly, but didn’t get an answer that would help me run it on the test case I had in mind. If you have better luck with java-afl, please let me know, it would be interesting to hear how this fuzzer performs.
First steps with Apache Commons
I started with the Apache Commons Imaging JPEG parser as a target. The choice was simple because it was one of the examples explained for the Kelinci fuzzer. Apache Commons is a very popular library for all kinds of things that are missing or incomplete in the Java standard library. When going through the author’s example, I realized that he gave the fuzzer only one input file containing the text “hello”, which is not a JPEG file and not a very good starting corpus. While it’s probably lcamtuf’s very interesting experiment that makes people believe using such corpus data is a valid choice, it is not a valid choice for proper fuzzing runs. lcamtuf’s experiment was good to prove the point that the fuzzer is smart, but for productive fuzzing proper input files have to be used to achieve good results. Fuzzing is all about corpus data in the end. So I took the JPEG files in lcamtuf’s corpus on the AFL website and some files from my private collection. The fuzzer quickly turned up an additional ArrayIndexOutOfBoundsException, which I reported to Apache (file ArrayIndexOutOfBoundsException_DhtSegment_79.jpeg). That was quite an easy start into Java fuzzing. If you did the same for other parsers of Apache Commons (for example the PNG parser), you would most probably find some more unchecked exceptions.
Goals: Taking a step back
After this quick experiment I gave the idea of fuzzing Java some more thought. Fuzzing was originally applied to programs that are not memory safe, in the hope of finding memory corruption issues. Out-of-bounds reads or writes in Java code simply do not result in memory corruption but in more or less harmless exceptions such as IndexOutOfBoundsException. While it might be desirable to find such (code robustness) issues, and they might result in Denial of Service issues, their severity is usually low. The question is what kind of behavior and fuzzing results we are looking for. There are different scenarios that might be of interest, but the attack vector (how does the attacker exploit the issue in the real world?) matters a lot when looking at them. Here is my rough (over)view on Java fuzzing:
- Finding bugs in the JVM.
  - Arbitrary Java code as input. This could be helpful in more exotic scenarios, for example when you need to escape from a sandboxed JVM. In most other scenarios this attack vector is probably just unrealistic, as an attacker would be executing Java code already.
  - Feeding data into built-in classes/functions (fuzzing the standard library), such as strings. This is not very likely to come up with results, but you never know, maybe there are Java deserialization vulnerabilities lurking deep in the JVM code?
  - Finding low-severity or non-security issues such as code that throws an Exception it didn’t declare to throw (RuntimeExceptions).
- Finding memory corruption bugs in Java code that uses native code (for example JNI or CNI). This is probably a very good place to use Java fuzzing, but I don’t encounter this situation very much except in Android apps. And fuzzing Android apps is an entirely different beast that is not covered here.
- Fuzzing pure Java code.
  - We could go for custom goals. This might depend on your business logic. For example, if the code heavily uses file reads/writes, maybe there is some kind of race condition? Also the idea of differential fuzzing for crypto libraries makes a lot of sense.
  - Finding “resource management” issues, such as Denial of Service (DoS) issues, OutOfMemoryErrors, high CPU load, high disk space usage, or functions that never return.
  - Finding low-severity or non-security issues such as RuntimeExceptions.
  - Finding well-known security issues for Java code, such as Java deserialization vulnerabilities, Server Side Request Forgery (SSRF), and External Entity Injection (XXE).
I was especially interested in the last three points in this list: finding resource issues, RuntimeExceptions and well-known Java security issues. While I already found a RuntimeException in my little experiment described above, I was pretty sure that I would be able to detect certain resource management issues by checking the “hangs” directory of AFL. However, the last point of finding well-known security issues such as SSRF seems tricky. The fuzzer would need additional instrumentation or sanitizers to detect such insecure behavior. Just as Address Sanitizer (ASAN) aborts on invalid memory access for native code (which then leads to a crash inside AFL), it would be nice to have sanitizers that take care of such areas in the Java world. A file sanitizer for example might take a whitelist of files that are allowed to be accessed by the process, but abort if any other file is accessed. This could be used to detect XXE and SSRF scenarios. A network sanitizer might do the same if sockets are used. Imagine a Java image file parsing library as a target. From a security perspective such a library should never open network sockets, as this would indicate Server Side Request Forgery. This is a very realistic scenario, and I did find XXE issues in PNG XMP metadata parsing libraries before.
Java Security Manager
After doing some research it turned out that there is nothing like a file whitelist sanitizer for native code, where AFL is usually used. So if we fuzzed any C/C++ code we would have to write our own sanitizer and, as stated by Jakub Wilk, it might be tricky to implement due to async-signal-safe filesystem functions. So if you feel like writing one, please go ahead.
Back to Java I found out that there is already such a sanitizer. The best part is that it’s a built-in feature of the JVM and it’s called Java Security Manager. Look at this simple Java Security Manager policy file I created for running the Kelinci fuzzer with our simple Apache Commons JPEG parsing code:
grant {
    permission java.io.FilePermission "/tmp/*", "read,write,delete";
    permission java.io.FilePermission "in_dir/*", "read";
    permission java.io.FilePermission "/opt/kelinci/kelinci/examples/commons-imaging/out_dir/*", "read, write, delete";
    permission java.io.FilePermission "/opt/kelinci/kelinci/examples/commons-imaging/out_dir/master/*", "read, write, delete";
    permission java.io.FilePermission "/opt/kelinci/kelinci/examples/commons-imaging/out_dir/master0/*", "read, write, delete";
    permission java.io.FilePermission "/opt/kelinci/kelinci/examples/commons-imaging/out_dir/master1/*", "read, write, delete";
    permission java.io.FilePermission "/opt/kelinci/kelinci/examples/commons-imaging/out_dir/slave/*", "read, write, delete";
    permission java.io.FilePermission "/opt/kelinci/kelinci/examples/commons-imaging/out_dir/slave0/*", "read, write, delete";
    permission java.io.FilePermission "/opt/kelinci/kelinci/examples/commons-imaging/out_dir/slave1/*", "read, write, delete";
    permission java.net.SocketPermission "localhost:7007-", "accept, listen, resolve";
    permission java.lang.RuntimePermission "modifyThread";
};
All it does is allow file access to the temporary directory, reading from the input directory (in_dir) and writing to the output directory (out_dir) of AFL. Moreover, it allows the Kelinci Java process to listen on TCP ports 7007 and above (the port range “7007-”) as well as to modify other threads. As the Security Manager is built into the JVM, you can simply enable it by adding two more arguments to your usual command line:
java -Djava.security.manager -Djava.security.policy=java-security-policy.txt
So in our case we can run the Kelinci fuzzer server process with:
java -Djava.security.manager -Djava.security.policy=java-security-policy.txt -Djava.io.tmpdir=/tmp/ -cp bin-instrumented:commons-imaging-1.0-instrumented.jar edu.cmu.sv.kelinci.Kelinci driver.Driver @@
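To make the matching rules behind such FilePermission entries a bit more concrete, here is a minimal sketch. It is not part of Kelinci; the class name PolicyMatchSketch and the helper isAllowed are made up for illustration. It reuses two paths from the policy above and only evaluates java.io.FilePermission objects, without installing a Security Manager.

```java
import java.io.FilePermission;
import java.util.List;

/** Sketch: checks requested file accesses against a whitelist of FilePermission entries. */
final class PolicyMatchSketch {

    // Whitelist mirroring two entries of the policy above (illustration only).
    static final List<FilePermission> ALLOWED = List.of(
            new FilePermission("/tmp/*", "read,write,delete"),
            new FilePermission(
                    "/opt/kelinci/kelinci/examples/commons-imaging/out_dir/*",
                    "read,write,delete"));

    /** Returns true if at least one whitelisted entry implies the requested access. */
    static boolean isAllowed(String path, String actions) {
        FilePermission requested = new FilePermission(path, actions);
        return ALLOWED.stream().anyMatch(p -> p.implies(requested));
    }

    public static void main(String[] args) {
        // A file directly inside /tmp, a file in a subdirectory, and an unrelated file.
        System.out.println(isAllowed("/tmp/input.jpeg", "read"));
        System.out.println(isAllowed("/tmp/nested/input.jpeg", "read"));
        System.out.println(isAllowed("/etc/passwd", "read"));
    }
}
```

Note that a path ending in “/*” covers the files directly inside that directory but not those in subdirectories; a path ending in “/-” would match recursively. This is probably why the policy above lists each AFL output subdirectory separately.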
I went back and ran the Kelinci fuzzer some more hours on the Apache Commons JPEG parser without getting any new results with the Java Security Manager. However, at this point I was convinced that the Java Security Manager would take Java fuzzing to the next level. I just needed a different target first.
Targeting Apache Tika
A few days later I stumbled over the Apache Tika project. As Apache Tika was formerly part of Apache Lucene, I was convinced that a lot of servers on the Internet would allow users to upload arbitrary files to be parsed by Apache Tika. As I’m currently maintaining related research about web-based file upload functionality (the UploadScanner Burp extension), this got me even more interested.
Apache Tika is a content analysis toolkit and can extract text content from over a thousand different file formats. A quick’n’dirty grep estimate showed that it has about 247 Java JAR files as dependencies at compile time. Apache Tika also had some severe security issues in the past. So as a test target Apache Tika seemed to fit perfectly. On the other hand I also knew that using such a big code base is a bad idea when fuzzing with AFL. AFL will more or less quickly deplete the fuzzing bitmap when the instrumented code is too large. Afterwards, AFL will be unable to detect when an input results in an interesting code path being taken. I was also not sure if I could successfully use the Java fuzzers to instrument the huge Apache Tika project. However, I decided to go on with this experiment.
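Why a large code base exhausts the bitmap can be illustrated with a small simulation. The sketch below assumes the scheme from AFL’s technical documentation: a 64 KiB map, a random ID per basic block, and a slot index computed as the current ID XOR the previous ID shifted right by one. Random block IDs and randomly chosen edges are a simplification, and the names CoverageMapSketch and slotsUsed are hypothetical.

```java
import java.util.Random;

/** Sketch: simulates how AFL-style edge coverage fills a fixed-size bitmap. */
final class CoverageMapSketch {

    static final int MAP_SIZE = 1 << 16; // AFL's default map size (64 KiB)

    /** Simulates random edges between basic blocks and returns the number of map slots hit. */
    static int slotsUsed(int blocks, int edges, long seed) {
        Random rnd = new Random(seed);
        int[] ids = new int[blocks];
        for (int i = 0; i < blocks; i++) {
            ids[i] = rnd.nextInt(MAP_SIZE); // random ID assigned at instrumentation time
        }
        boolean[] hit = new boolean[MAP_SIZE];
        int used = 0;
        for (int e = 0; e < edges; e++) {
            int prev = ids[rnd.nextInt(blocks)] >> 1; // previous location, shifted as in AFL
            int cur = ids[rnd.nextInt(blocks)];
            int slot = cur ^ prev;                    // always below MAP_SIZE
            if (!hit[slot]) {
                hit[slot] = true;
                used++;
            }
        }
        return used;
    }

    public static void main(String[] args) {
        int[][] sizes = {{1_000, 5_000}, {20_000, 100_000}, {100_000, 1_000_000}};
        for (int[] s : sizes) {
            int used = slotsUsed(s[0], s[1], 1L);
            System.out.printf("%d blocks, %d edges: %.1f%% of the map used%n",
                    s[0], s[1], 100.0 * used / MAP_SIZE);
        }
    }
}
```

Once nearly every slot has been hit, a newly taken edge almost always lands on a slot that is already set, so the fuzzer can no longer tell that the input did something new.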
I first tried to get things running with Kelinci, but ran into multiple issues and ended up creating a “works-for-me” Kelinci fork. After Kelinci was running, I also tried to get the JQF fuzzer running. However, I ran into similar but distinct problems and therefore decided to stick with Kelinci at this point. For Tika I had to adapt the Java Security Manager policy:
grant {
    //Permissions required by Kelinci
    permission java.lang.RuntimePermission "modifyThread";
    permission java.net.SocketPermission "localhost:7007", "listen, resolve";
    permission java.net.SocketPermission "localhost:7008", "listen, resolve";
    permission java.net.SocketPermission "localhost:7009", "listen, resolve";
    permission java.net.SocketPermission "localhost:7010", "listen, resolve";
    permission java.net.SocketPermission "[0:0:0:0:0:0:0:1]:*", "accept, resolve";
    permission java.io.FilePermission "in_dir/*", "read";
    permission java.io.FilePermission "corpus/*", "read, write";
    permission java.io.FilePermission "crashes/*", "read";
    permission java.io.FilePermission "out_dir/*", "read, write";
    //Permissions required by Tika
    permission java.io.FilePermission "tika-app-1.17.jar", "read";
    permission java.io.FilePermission "tika-app-1.17-instrumented.jar", "read";
    permission java.io.FilePermission "/tmp/*", "read, write, delete";
    permission java.lang.RuntimePermission "getenv.TIKA_CONFIG";
    permission java.util.PropertyPermission "org.apache.tika.service.error.warn", "read";
    permission java.util.PropertyPermission "tika.config", "read";
    permission java.util.PropertyPermission "tika.custom-mimetypes", "read";
    permission java.util.PropertyPermission "org.apache.pdfbox.pdfparser.nonSequentialPDFParser.eofLookupRange", "read";
    permission java.util.PropertyPermission "org.apache.pdfbox.forceParsing", "read";
    permission java.util.PropertyPermission "pdfbox.fontcache", "read";
    permission java.util.PropertyPermission "file.encoding", "read";
    //When parsing certain PDFs...
    permission java.util.PropertyPermission "user.home", "read";
    permission java.util.PropertyPermission "com.ctc.wstx.returnNullForDefaultNamespace", "read";
    //When parsing certain .mdb files...
    permission java.util.PropertyPermission "com.healthmarketscience.jackcess.resourcePath", "read";
    permission java.util.PropertyPermission "com.healthmarketscience.jackcess.brokenNio", "read";
    permission java.util.PropertyPermission "com.healthmarketscience.jackcess.charset.VERSION_3", "read";
    permission java.util.PropertyPermission "com.healthmarketscience.jackcess.columnOrder", "read";
    permission java.util.PropertyPermission "com.healthmarketscience.jackcess.enforceForeignKeys", "read";
    permission java.util.PropertyPermission "com.healthmarketscience.jackcess.allowAutoNumberInsert", "read";
    permission java.util.PropertyPermission "com.healthmarketscience.jackcess.timeZone", "read";
};
To produce this policy file manually was much more annoying than for Apache Commons. The reason is that the necessary permissions we need to whitelist depend on the input file. So if a PNG file is fed into Apache Tika, it will need other runtime property permissions than if a PDF file is fed into Apache Tika. This means that we have to do a dry run first that will go through the entire input corpus of files and run them once with the minimum policy file. If a security exception occurs, it might be necessary to whitelist another permission. This process takes a lot of time. However, as an article from 2004 states:
There’s currently no tool available to automatically generate a [Java security] policy file for specific code.
So that’s why I wrote another quick’n’dirty hack/tool to generate Java security policy files. As it’s not a beauty I gave it the ugly name TMSJSPGE on GitHub. However, it does its job and generates a Java security policy file. It feeds each corpus file to the target process (Tika in this case) and adds a new rule to the security policy when one is needed.
If you look at the above property permissions, I’m still not sure what they are all doing. However, I just decided I’ll go with them and allow Tika to use them.
If you run your fuzzer with different input files, you might be required to adapt the Java security policy, as other code paths might require new permissions. So the above security policy for Apache Tika is likely to be incomplete.
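The last step of such a generator, turning a set of collected permissions into policy file syntax, could look roughly like the following sketch. This is not the TMSJSPGE code; the class PolicyWriterSketch and its method toPolicy are hypothetical, and how the permissions are collected during the dry run is left out entirely.

```java
import java.io.FilePermission;
import java.security.Permission;
import java.util.Collection;
import java.util.List;
import java.util.PropertyPermission;

/** Sketch: renders Permission objects as a Java security policy "grant" block. */
final class PolicyWriterSketch {

    /** Builds one "permission <class> "<name>"[, "<actions>"];" line per permission. */
    static String toPolicy(Collection<? extends Permission> permissions) {
        StringBuilder sb = new StringBuilder("grant {\n");
        for (Permission p : permissions) {
            sb.append("    permission ").append(p.getClass().getName())
              .append(" \"").append(p.getName()).append('"');
            String actions = p.getActions();
            // Permissions without actions (e.g. RuntimePermission) return an empty string.
            if (actions != null && !actions.isEmpty()) {
                sb.append(", \"").append(actions).append('"');
            }
            sb.append(";\n");
        }
        return sb.append("};\n").toString();
    }

    public static void main(String[] args) {
        // Three permissions taken from the Tika policy above, used as sample input.
        System.out.print(toPolicy(List.of(
                new RuntimePermission("modifyThread"),
                new PropertyPermission("user.home", "read"),
                new FilePermission("/tmp/*", "read,write,delete"))));
    }
}
```

The sketch does no escaping, so names containing quotes or backslashes (Windows paths, for example) would need extra handling.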
Apache Tika findings
As already explained, a good input corpus is vital for a successful fuzzing run. Additionally, I had to run Tika with as many files as possible to make sure the Java Security Policy covered most permissions necessary. Over the years I’ve collected many input sample files (around 100’000) by doing fuzzing runs with various libraries and by collecting third-party files (that’s actually a topic for another day). So I decided I will run the TMSJSPGE tool with each of these 100’000 files to create the best Security Policy I can. When I checked back on the TMSJSPGE I saw that the tool was stuck feeding a certain file to Apache Tika. This means that Apache Tika never returned a result and the process hung. And that meant I already found security issues in Apache Tika 1.17 before I even started fuzzing. After removing the file that resulted in a hang and restarting TMSJSPGE, Apache Tika hung with several other files as well. Some of the files triggered the same hang and after deduplicating, the following two security issues were reported to Apache Tika:
- CVE-2018-1338 – DoS (Infinite Loop) Vulnerability in Apache Tika’s BPGParser (file 3_hang_and_uncaught_TiffProcessingException.bpg), where the code simply never returned.
- CVE-2018-1339 – DoS (Infinite Loop) Vulnerability in Apache Tika’s ChmParser (file 1_100_percent_cpu_dos.chm), which also used 100% CPU during this infinite loop.
I was wondering where these input files I had in my collection were coming from. Several BPG files triggering the issue were from a fuzzing run I once did for libbpg, so they were produced by AFL when creating BPG files for a native library. But the chm file triggering the other issue was a file that I downloaded a long time ago from the fuzzing project. It was a file Hanno Böck provided that came out of a fuzzing run for CHMLib. Interesting.
So here I was and had already found an uncaught exception in Apache Commons and two low severity issues in Apache Tika without even starting to do proper fuzzing.
To get an idea of the Java classes causing the issue I ran Apache Tika with a debugger and the triggering file, stopped the execution during the infinite loop and printed a stack trace. But most of the hard work to figure out the actual root causes of these issues was done by the maintainers, most importantly by Tim Allison and the Apache Tika team. That is also true for all the upcoming issues.
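The same information can also be gathered without a debugger. The following is only a sketch of that idea, not what I actually ran: a watchdog starts the suspicious call in a daemon thread and, if it is still running after a timeout, captures that thread’s stack trace. The class HangWatchdogSketch and the stand-in method spinForever are hypothetical; in practice the task would be the call into the parser with the triggering file.

```java
/** Sketch: captures the stack trace of a task that does not finish in time. */
final class HangWatchdogSketch {

    static volatile long sink; // keeps the busy loop below from being optimized away

    /** Runs the task in a daemon thread; returns its stack trace if it is still alive after the timeout, else null. */
    static StackTraceElement[] traceIfHung(Runnable task, long timeoutMillis) {
        Thread worker = new Thread(task, "parser-under-test");
        worker.setDaemon(true); // a hung task must not keep the JVM alive
        worker.start();
        try {
            worker.join(timeoutMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return worker.isAlive() ? worker.getStackTrace() : null;
    }

    /** Stand-in for a parser stuck in an endless loop. */
    static void spinForever() {
        while (true) {
            sink++;
        }
    }

    public static void main(String[] args) {
        StackTraceElement[] trace = traceIfHung(HangWatchdogSketch::spinForever, 500);
        if (trace == null) {
            System.out.println("task finished in time");
            return;
        }
        System.out.println("task still running, current stack:");
        for (StackTraceElement frame : trace) {
            System.out.println("    at " + frame);
        }
    }
}
```

The frames at the top of the captured trace point to the loop that never terminates, which is usually enough for a first bug report.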
Fuzzing Apache Tika with Kelinci
After sorting out the input files that resulted in a hang, I started a couple of afl-fuzz fuzzing instances and waited. The behavior of the Kelinci fuzzer is sometimes a little brittle, so I often got the “Queue full” error message. It means the fuzzer is not running properly anymore and that timeouts will occur. I had to restart the fuzzing instances several times and tried to tweak the command line settings to improve stability. However, over time the instances often managed to fill up the queue again. Anyway, a couple of instances ran fine and found several “AFL crashes”. Keep in mind that “AFL crashes” in this case just mean uncaught Java exceptions. After looking through and deduplicating issues, I reported the following non-security (or very low severity, a matter of definition) issues to the maintainers of the libraries used by Apache Tika:
- Two independent StackOverflowError issues in Apache PDFBox when parsing PDF files (files 5_uncaught_stackoverflow_checkPagesDictionary.pdf and 6_uncaught_stackoverflow_getInheritableAttribute.pdf)
- An ArrayIndexOutOfBoundsException in Apache Commons ZipFile when parsing zip files (files 7_uncaught_ArrayIndexOutOfBoundsException_1.zip and 7_uncaught_ArrayIndexOutOfBoundsException_2.zip)
- An IllegalArgumentException in Gagravarr VorbisJava when parsing ogg files (files 8_uncaught_IllegalArgumentException_Skeleton.ogv and 9_uncaught_IllegalArgumentException_ogg_page.ogv)
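The distinction between a rejected input and an “AFL crash” can be sketched as follows. This is a simplified illustration and not the Kelinci driver: the Target interface, the Outcome enum and the class DriverSketch are all hypothetical, and a real driver would let the unexpected exception propagate so that the fuzzer records a crash instead of returning a value.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

/** Sketch: separates declared parser failures from uncaught exceptions. */
final class DriverSketch {

    /** Hypothetical stand-in for the parser being fuzzed. */
    interface Target {
        void parse(InputStream in) throws IOException;
    }

    enum Outcome { OK, REJECTED, CRASH }

    static Outcome run(Target target, byte[] input) {
        try {
            target.parse(new ByteArrayInputStream(input));
            return Outcome.OK;
        } catch (IOException expected) {
            return Outcome.REJECTED; // declared failure: the parser refused the input
        } catch (RuntimeException | StackOverflowError unexpected) {
            return Outcome.CRASH;    // undeclared failure: what ends up as an "AFL crash"
        }
    }

    public static void main(String[] args) {
        // Toy parser: reads a length byte and indexes into a fixed table without a bounds check.
        Target toy = in -> {
            int length = in.read();
            if (length < 0) {
                throw new IOException("empty input");
            }
            int[] table = new int[4];
            table[length] = 1; // ArrayIndexOutOfBoundsException for length >= 4
        };
        System.out.println(run(toy, new byte[] {2}));
        System.out.println(run(toy, new byte[] {}));
        System.out.println(run(toy, new byte[] {9}));
    }
}
```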
The hang directory of AFL did not show any interesting results. After running each of the files in the hang directory with Apache Tika I found a PDF file that took nearly a minute to process, but none of the files led to a full hang of the Tika thread. I suspect that the synchronization of the two processes was one of the reasons no infinite hangs were found by the fuzzer.
However, at this stage I was most disappointed that none of the crashes indicated that anything outside of the specified Java Security Manager policy was triggered. I guess this was a combination of my brittle configuration of Kelinci and the fact that it is probably not as easy to find arbitrary file read or write issues. But in the end you often simply don’t know what’s exactly the reason for not being successful with fuzzing.
JQF and a bug in Java
At one point I also wanted to try the JQF fuzzer on my ARM fuzzing machines with Apache Tika. It didn’t work for me at first and I found out that OpenJDK on ARM had horrible performance with JQF, so I switched to Oracle’s Java. Additionally, Apache Tika would simply not run with JQF. After the Tika 1.17 issues were fixed in Apache Tika I thought it was time to notify the maintainers of the fuzzers, so they could try to fuzz Apache Tika themselves. Rohan (maintainer of JQF) quickly fixed three independent issues and implemented a test case/benchmark for the fixed Tika 1.18 in JQF. After that I was able to fuzz Tika with my own corpus, but the performance was very bad for various reasons. One reason was the weak ARM boxes, but JQF couldn’t handle timeouts either (AFL’s -t switch). Rohan attempted a fix, but it’s only working sometimes. Rohan was also very quick to implement afl-cmin and said running with a Java Security Manager policy should be no problem. However, I couldn’t try those features properly due to the performance problems on the ARM machines. As I was not in the mood to switch fuzzing boxes, I just tried to get the fuzzer running somehow. After cutting down the input corpus and removing all PDF files that were taking potentially longer to be processed by Apache Tika, the fuzzer crept slowly forward. After not paying attention for 10 days, another hang was found by JQF in Apache Tika 1.18… I thought! However, after submitting this bug to Apache Tika, they pointed out that this was actually a bug in the Java standard libraries affecting Java before version 10 that I rediscovered:
- Endless loop in RiffReader (file 10_hang.riff), where the code simply never returned.
Unfortunately Java/Oracle never assigned a CVE for this, so Tim Allison from Apache Tika asked them to assign one. After 3 months and an endless stream of status update emails with no content we are still waiting for a CVE number (edit: after four months Oracle assigned CVE-2018-3214 to this issue with a CVSS score of 5.3 and fixed it with a Java update). As this is not fixed in Java 8 (edit: depending on the minor version update), Tim Allison also mitigated it on the Apache Tika side.
The hang file was created by the JQF fuzzer by modifying a sample QCP file “fart_3.qcp” from the public ffmpeg samples. So without actively targeting Java itself, I had rediscovered a bug in Java’s standard libraries, as Tika used it. Quite an interesting twist.
Adding an x86 fuzzing box
At the same time I also realized that these ARM JQF fuzzer instances were stuck. The endless RIFF loop file was detected as a crash (which might just be bad behavior of JQF for hangs), so I didn’t really know why they were stuck. I tried to run the current input file on another machine, but the testcase didn’t hang. So I didn’t figure out why the fuzzer got stuck, but as Rohan pointed out the timeout handling (AFL’s “hangs”) isn’t optimal yet. JQF will detect timeouts when the infinite loop hits an instrumented part of the Java code, as it will be able to measure the time that passed. However, JQF will hang for now if a test file makes the code loop forever in non-instrumented code. I removed all RIFF/QCP input files so that I hopefully wouldn’t rediscover the RIFF endless loop bug (I never switched to Java 10) and restarted the fuzzing instances.
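The difference between loops in instrumented and non-instrumented code can be sketched like this. It is a conceptual illustration only and not JQF’s implementation: the class TimeoutProbeSketch, the hook onBranch and the exception RunTimedOut are hypothetical names for the idea that a time check can only fire where instrumentation has injected a callback.

```java
/** Sketch: a time budget that is only checked from instrumented code. */
final class TimeoutProbeSketch {

    /** Thrown when the time budget of one run is used up. */
    static final class RunTimedOut extends RuntimeException {
        RunTimedOut(String message) {
            super(message);
        }
    }

    private final long deadlineNanos;

    TimeoutProbeSketch(long budgetMillis) {
        this.deadlineNanos = System.nanoTime() + budgetMillis * 1_000_000L;
    }

    /** Hypothetical hook that injected instrumentation would call on every branch. */
    void onBranch() {
        if (System.nanoTime() - deadlineNanos >= 0) {
            throw new RunTimedOut("time budget exceeded");
        }
    }

    /** Endless loop in "instrumented" code: every iteration reaches the hook. */
    static void instrumentedLoop(TimeoutProbeSketch probe) {
        while (true) {
            probe.onBranch();
        }
    }

    public static void main(String[] args) {
        try {
            instrumentedLoop(new TimeoutProbeSketch(100));
        } catch (RunTimedOut e) {
            System.out.println("aborted: " + e.getMessage());
        }
    }
}
```

A loop in code without the hook never calls onBranch, so a check of this kind cannot interrupt it, and something external to the looping code would be needed to stop the run.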
I decided to additionally use a 32-bit x86 VMware fuzzing box; maybe it would run more stably there. I set up JQF with Java 8 again and without RIFF files as inputs. The x86 virtual machine performed much better, executing around 10 testcases per second. So I let these instances run for several days… just to realize when I came back that both instances got stuck after 7 hours of running. I checked again if the current input file could be the reason and this time this was exactly the problem, so another bug. Rinse and repeat, the next morning another bug. So after a while (at least 5 iterations) I had a bag full of bugs:
- An endless loop in Junrar (file 11_hang_junrar_zero_header2.rar), where the code simply never returned when the rar header size is zero. I contacted one of the maintainers, beothorn. It was fixed and this issue ended up as CVE-2018-12418.
- Infinite loop in Apache Tika’s IptcAnpaParser for handling IPTC metadata (file 12_hang_tika_iptc.iptc), where the code simply never returned. This was fixed and assigned CVE-2018-8017.
- Infinite loop in Apache PDFBox’s AdobeFontMetricsParser (file 16_570s_fontbox_OOM.afm), leading to an out of memory situation after nearly 10 minutes (on my machine). This was fixed and assigned CVE-2018-8036.
- An issue when a specially crafted zip content is read with Apache Commons Compress (file 14_69s_tagsoup_HTMLScanner_oom.zip) that leads to an out of memory exception. This was fixed in Apache Commons Compress and CVE-2018-11771 was assigned. Another zip file created (file 15_680s_commons_IOE_push_back_buffer_full.zip) runs for 11 minutes (on my machine) leading to IOException with a message that the push back buffer is full and is probably related to the issue. Also probably the same issue is a file where Tika takes an arbitrary amount of time (during the tests between 20 seconds and 11 minutes) to process a zip file (file 13_48s_commons_truncated_zip_entry3.zip). This last one is worth a note as JQF correctly detected this as a hang and put it in AFL’s hang directory. The underlying problem of CVE-2018-11771 was that a read operation started to return alternating values of -1 and 345 when called by an InputStreamReader with UTF-16. The minimal code to reproduce is:
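// Imports assumed here: JUnit 4 and Apache Commons Compress.
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import org.apache.commons.compress.archivers.zip.ZipArchiveEntry;
import org.apache.commons.compress.archivers.zip.ZipArchiveInputStream;
import org.junit.Test;

// Place this method inside a test class.
@Test
public void testMarkResetLoop() throws Exception {
    InputStream is = Files.newInputStream(Paths.get("C:/14_69s_tagsoup_HTMLScanner_oom.zip"));
    ZipArchiveInputStream archive = new ZipArchiveInputStream(is);
    ZipArchiveEntry entry = archive.getNextZipEntry();
    while (entry != null) {
        if (entry.getName().contains("one*line-with-eol.txt")) {
            // Decode the entry as UTF-16; read() should eventually return -1 and stay there.
            Reader r = new InputStreamReader(archive, StandardCharsets.UTF_16LE);
            int i = r.read();
            int cnt = 0;
            while (i != -1) {
                if (cnt++ > 100000) {
                    throw new RuntimeException("Infinite loop detected...");
                }
                i = r.read();
            }
        }
        entry = archive.getNextZipEntry();
    }
}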
After all these fixes I ran the fuzzer again on a nightly build of Apache Tika 1.19 and it didn’t find any new issues in more than 10 days. So my approach of fuzzing Tika seems to be exhausted. As always, it doesn’t mean another approach wouldn’t find new issues.
Summary
This is where I stopped my journey of Java fuzzing for now. I was a little disappointed that the approach with the Java Security Manager still did not find any security issues such as SSRF and that I only found resource management issues. However, I’m pretty sure this strategy is still the way to go; it probably just needs other targets. As you can see there are loose ends everywhere and I’m definitely planning to go back to Java fuzzing:
- Use Kelinci/JQF with other Apache Commons parsers, e.g. for PNG
- Write sanitizers such as file or socket opening for native code AFL
- Contribute to the AFL-based Java fuzzers
However, for now there are other things to break on my stack.
I would like to thank Tim Allison of the Apache Tika project, it was a pleasure to do coordinated disclosure with him. And also a big thanks to Rohan Padhye who was really quick implementing new features in JQF.
Make sure you add the files included on GitHub to your input corpus collection. As we saw, it’s worth having a collection of crashes for other libraries when targeting new libraries.