Java Bugs with and without Fuzzing – AFL-based Java fuzzers and the Java Security Manager

This is a cross-posting of a blog post already published on the modzero company blog. If you’ve already read that, there is no need to read this, as both blog posts have the same content.

In the last half a year I have been doing some fuzzing with AFL-based Java fuzzers, namely Kelinci and JQF. I didn’t really work with java-afl. The contents of this post are:

Various AFL-based Java fuzzers are available that can be used to find more or less severe security issues. By combining these with sanitizers provided by the Java Security Manager, additional instrumentation can be achieved.

This blog post will mention several files, they are included on https://github.com/modzero/mod0javaFuzzingResults. Additionally, the zip file includes several other files that reproduce the same bugs.

AFL-based Java fuzzing tools

The AFL fuzzer is really popular nowadays as it performs instrumented fuzzing. If you are not familiar with AFL, it’s probably better if you at least quickly look at AFL before you read this post. It is especially important to understand how AFL handles hangs (test cases that take too much time to process) and crashes (e.g. target program segfault).

Kelinci is one of the first AFL for Java implementations and is very promising, although the approach with having two processes per fuzzing instance is a little clumsy and can get confusing. One process is the native C side, which takes mutated inputs produced by AFL and sends them to the second process via TCP socket. The second process is the Java process that feeds the input to the target program and sends back the code paths taken with this input. There are certain error messages in the Java part of this fuzzer that are not always exactly clear (at least to me), but they seem to indicate that the fuzzer is not running in a healthy state anymore. However, so far Kelinci worked very well for me and came up with a lot of results. There has not been any development for 7 months, so I hope the author will pick it up again.

JQF is actively maintained and the last changes were commited a couple of days ago. It does not take the classic fuzzer approach that most fuzzers for security researchers take, but instead is based on Java Unit Tests and focuses more on developers. It currently has only limited support of AFL’s -t switch for the timeout settings and there is also only rudimentary afl-cmin support. While this is perfect for developers using Unit Tests, it is not the most flexible fuzzer for security researchers fuzzing Java code.

java-afl has not been updated in four months. This is actually the fuzzer I didn’t successfully use at all. I tried to ask the developer about how to run it properly, but didn’t get an answer that would help me run it on the test case I had in mind. If you have better luck with java-afl, please let me know, it would be interesting to hear how this fuzzer performs.

First steps with Apache Commons

I started with the Apache Common’s Imaging JPEG parser as a target. The choice was simple because it was one of the examples explained for the Kelinci fuzzer. Apache Commons is a very popular library for all kind of things that are missing or incomplete in the Java standard library. When going through the author’s example, I realized that he gave the fuzzer only one input file containing the text “hello”, which is not a JPEG file and not a very good starting corpus. While it’s probably lcamtuf’s very interesting experiment that makes people believe using such corpus data is a valid choice, it is not a valid choice for proper fuzzing runs. lcamtuf’s experiment was good to proof the point that the fuzzer is smart, but for productive fuzzing proper input files have to be used to achieve good results. Fuzzing is all about corpus data in the end. So I took the JPEG files in lcamtuf’s corpus on the AFL website and some files from my private collection. The fuzzer quickly turned up with an additional ArrayIndexOutOfBoundsException which I reported to Apache (file ArrayIndexOutOfBoundsException_DhtSegment_79.jpeg). That was quite an easy start into Java fuzzing. If you would do the same for other parsers of Apache Commons (for example PNG parser), you would most probably find some more unchecked exceptions.

Goals: Taking a step back

After this quick experiment I gave the idea of fuzzing Java some more thoughts. Fuzzing is originally applied to programs that are not memory safe, hoping that we are able to find memory corruption issues. Out of bound read or writes in Java code simply do not result in memory corruption but in more or less harmless Exceptions such as IndexOutOfBoundsException. While it might be desirable to find (code robustness) issues and might result in Denial of Service issues, the severity of these issues is usually low. The question is what kind of behavior and fuzzing results are we looking for? There are different scenarios that might be of interest, but the attack vector (how does the attacker exploit the issue in the real world?) matters a lot when looking at them. Here is my rough (over)view on Java fuzzing:

  • Finding bugs in the JVM.
    • Arbitrary Java code as input. This could be helpful in more exotic scenarios, for example when you need to escape from a sandboxed JVM. In most other scenarios this attack vector is probably just unrealistic, as an attacker would be executing Java code already.
    • Feeding data into built-in classes/functions (fuzzing the standard library), such as strings. This is not very likely to come up with results, but you never know, maybe there are Java deserialization vulnerabilities lurking deep in the JVM code?
    • Finding low-severity or non-security issues such as code that throws an Exception it didn’t declare to throw (RuntimeExceptions).
  • Finding memory corruption bugs in Java code that uses native code (for example JNI or CNI). This is probably a very good place to use Java fuzzing, but I don’t encounter this situation very much except in Android apps. And fuzzing Android apps is an entirely different beast that is not covered here.
  • Fuzzing pure Java code.
    • We could go for custom goals. This might depend on your business logic. For example, if the code heavily uses file read/writes maybe there is some kind of race condition? Also the idea of differential fuzzing for crypto libraries makes a lot of sense.
    • Finding “ressource management” issues, such as Denial of Service (DoS) issues, OutOfMemoryExceptions, high CPU load, high disk space usage, or functions that never return.
    • Finding low-severity or non-security issues such as RuntimeExceptions.
    • Finding well-known security issues for Java code, such as Java deserialization vulnerabilities, Server Side Request Forgery (SSRF), and External Entity Injection (XXE).

I was especially interested in the last three points in this list: Finding ressource issues, RuntimeExceptions and well-known Java security issues. While I already found a RuntimeException in my little experiment described above, I was pretty sure that I would be able to detect certain ressource management issues by checking the “hangs” directory of AFL. However, the last point of finding well-known security issues such as SSRF seems tricky. The fuzzer would need additional instrumentation or sanitizers to detect such insecure behavior. Just as Address Sanitizer (ASAN) aborts on invalid memory access for native code (which then leads to a crash inside AFL), it would be nice to have sanitizers that take care about such areas in the Java world. A file sanitizer for example might take a whitelist of files that are allowed to be accessed by the process, but abort if any other file is accessed. This could be used to detect XXE and SSRF scenarios. A network sanitizer might do the same if sockets are used. Imagine a Java image file parsing library as a target. From a security perspective such a library should never open network sockets, as this would indicate Server Side Request forgery. This is a very realistic scenario, and I did find XXE issues in PNG XMP metadata parsing libraries before.

Java Security Manager

After doing some research it turned out that there is nothing like a file whitelist sanitizer for native code where AFL is usually used. So if we would fuzz any C/C++ code we would have to write our own parser and as stated by Jakub Wilk it might be tricky to implement due to async-signal-safe filesystem functions. So if you feel like writing one, please go ahead.

Back to Java I found out that there is already such a sanitizer. The best part is that it’s a built-in feature of the JVM and it’s called Java Security Manager. Look at this simple Java Security Manager policy file I created for running the Kelinci fuzzer with our simple Apache Commons JPEG parsing code:

grant {
    permission java.io.FilePermission "/tmp/*", "read,write,delete";
    permission java.io.FilePermission "in_dir/*", "read";
    permission java.io.FilePermission "/opt/kelinci/kelinci/examples/commons-imaging/out_dir/*", "read, write, delete";
    permission java.io.FilePermission "/opt/kelinci/kelinci/examples/commons-imaging/out_dir/master/*", "read, write, delete";
    permission java.io.FilePermission "/opt/kelinci/kelinci/examples/commons-imaging/out_dir/master0/*", "read, write, delete";
    permission java.io.FilePermission "/opt/kelinci/kelinci/examples/commons-imaging/out_dir/master1/*", "read, write, delete";
    permission java.io.FilePermission "/opt/kelinci/kelinci/examples/commons-imaging/out_dir/slave/*", "read, write, delete";
    permission java.io.FilePermission "/opt/kelinci/kelinci/examples/commons-imaging/out_dir/slave0/*", "read, write, delete";
    permission java.io.FilePermission "/opt/kelinci/kelinci/examples/commons-imaging/out_dir/slave1/*", "read, write, delete";
    permission java.net.SocketPermission "localhost:7007-", "accept, listen, resolve";
    permission java.lang.RuntimePermission "modifyThread";
};

All it does is allowing file access to the temporary directory, reading from the input directory (in_dir) and writing to the output directory (out_dir) of AFL. Moreover, it allows the Kelinci Java process to listen on TCP port 7007 as well as to modify other threads. As the Security Manager is built into every Java JVM, you can simply start it with your usual command line with two more arguments:

java -Djava.security.manager -Djava.security.policy=java-security-policy.txt

So in our case we can run the Kelinci fuzzer server process with:

java -Djava.security.manager -Djava.security.policy=java-security-policy.txt -Djava.io.tmpdir=/tmp/ -cp bin-instrumented:commons-imaging-1.0-instrumented.jar edu.cmu.sv.kelinci.Kelinci driver.Driver @@

I went back and ran the Kelinci fuzzer some more hours on the Apache Commons JPEG parser without getting any new results with the Java Security Manager. However, at this point I was convinced that the Java Security Manager would take Java fuzzing to the next level. I just needed a different target first.

Targeting Apache Tika

Fast forward several days later, I stumbled over the Apache Tika project. As Apache Tika was formerly part of Apache Lucene, I was convinced that a lot of servers on the Internet would allow users to upload arbitrary files to be parsed by Apache Tika. As I’m currently maintaining another related research about web based file upload functionalities (UploadScanner Burp extension) this got me even more interested.

Apache Tika is a content analysis toolkit and can extract text content from over a thousand different file formats. A quick’n’dirty grep-estimate turned out that it has about 247 Java JAR files as dependencies at compile time. Apache Tika also had some severe security issues in the past. So as a test target Apache Tika seemed to fit perfectly. On the other hand I also knew that using such a big code base is a bad idea when fuzzing with AFL. AFL will more or less quickly deplete the fuzzing bitmap when the instrumented code is too large. Afterwards, AFL will be unable to detect when an input results in an interesting code path being taken. I was also not sure if I could successfully use the Java fuzzers to instrument the huge Apache Tika project. However, I decided to go on with this experiment.

I first tried to get things running with Kelinci, but ran into multiple issues and ended up creating a “works-for-me” Kelinci fork. After Kelinci was running, I also tried to get the JQF fuzzer running, however, I ran into similar but distinct problems and therefore decided to stick with Kelinci at this point. For Tika I had to adopt the Java Security Manager Policy:

grant {
    //Permissions required by Kelinci
    permission java.lang.RuntimePermission "modifyThread";
    
    permission java.net.SocketPermission "localhost:7007", "listen, resolve";
    permission java.net.SocketPermission "localhost:7008", "listen, resolve";
    permission java.net.SocketPermission "localhost:7009", "listen, resolve";
    permission java.net.SocketPermission "localhost:7010", "listen, resolve";
    permission java.net.SocketPermission "[0:0:0:0:0:0:0:1]:*", "accept, resolve";
    
    permission java.io.FilePermission "in_dir/*", "read";
    permission java.io.FilePermission "corpus/*", "read, write";
    permission java.io.FilePermission "crashes/*", "read";
    permission java.io.FilePermission "out_dir/*", "read, write";
    
    //Permissions required by Tika
    permission java.io.FilePermission "tika-app-1.17.jar", "read";
    permission java.io.FilePermission "tika-app-1.17-instrumented.jar", "read";

    permission java.io.FilePermission "/tmp/*", "read, write, delete";
    
    permission java.lang.RuntimePermission "getenv.TIKA_CONFIG";
    
    permission java.util.PropertyPermission "org.apache.tika.service.error.warn", "read";
    permission java.util.PropertyPermission "tika.config", "read";
    permission java.util.PropertyPermission "tika.custom-mimetypes", "read";
    permission java.util.PropertyPermission "org.apache.pdfbox.pdfparser.nonSequentialPDFParser.eofLookupRange", "read";
    permission java.util.PropertyPermission "org.apache.pdfbox.forceParsing", "read";
    permission java.util.PropertyPermission "pdfbox.fontcache", "read";
    permission java.util.PropertyPermission "file.encoding", "read";

    //When parsing certain PDFs...
    permission java.util.PropertyPermission "user.home", "read";
    permission java.util.PropertyPermission "com.ctc.wstx.returnNullForDefaultNamespace", "read";
    
    //When parsing certain .mdb files...
    permission java.util.PropertyPermission "com.healthmarketscience.jackcess.resourcePath", "read";
    permission java.util.PropertyPermission "com.healthmarketscience.jackcess.brokenNio", "read";
    permission java.util.PropertyPermission "com.healthmarketscience.jackcess.charset.VERSION_3", "read";
    permission java.util.PropertyPermission "com.healthmarketscience.jackcess.columnOrder", "read";
    permission java.util.PropertyPermission "com.healthmarketscience.jackcess.enforceForeignKeys", "read";
    permission java.util.PropertyPermission "com.healthmarketscience.jackcess.allowAutoNumberInsert", "read";
    permission java.util.PropertyPermission "com.healthmarketscience.jackcess.timeZone", "read";
};

To produce this policy file manually was much more annoying than for Apache Commons. The reason is that the necessary permissions we need to whitelist depend on the input file. So if a PNG file is fed into Apache Tika, it will need other runtime property permissions than if a PDF file is fed into Apache Tika. This means that we have to do a dry run first that will go through the entire input corpus of files and run them once with the minimum policy file. If a security exception occurs, it might be necessary to whitelist another permission. This process takes a lot of time. However, as an article from 2004 states:

There’s currently no tool available to automatically generate a [Java security] policy file for specific code.

So that’s why I wrote another quick’n’dirty hack/tool to generate Java security policy files. As it’s not a beauty I gave it the ugly name TMSJSPGE on github. However, it does it’s job and generates a Java security policy file. It will feed each corpus file to the target process (Tika in this case) and add a new rule to the security policy.

If you look at the above property permissions, I’m still not sure what they are all doing. However, I just decided I’ll go with them and allow Tika to use them.

If you run your fuzzer with different input files, you might be required to adopt the Java Security policy, as other code paths might require new permissions. So the above security policy for Apache Tika is likely to be incomplete.

Apache Tika findings

As already explained, a good input corpus is vital for a successful fuzzing run. Additionally, I had to run Tika with as many files as possible to make sure the Java Security Policy covered most permissions necessary. Over the years I’ve collected many input sample files (around 100’000) by doing fuzzing runs with various libraries and by collecting third-party files (that’s actually a topic for another day). So I decided I will run the TMSJSPGE tool with each of these 100’000 files to create the best Security Policy I can. When I checked back on the TMSJSPGE I saw that the tool was stuck feeding a certain file to Apache Tika. This means that Apache Tika never returned a result and the process hung. And that meant I already found security issues in Apache Tika 1.17 before I even started fuzzing. After removing the file that resulted in a hang and restarting TMSJSPGE, Apache Tika hung with several other files as well. Some of the files triggered the same hang and after deduplicating, the following two security issues were reported to Apache Tika:

I was wondering where these input files I had in my collection were coming from. Several BPG files triggering the issue were from a fuzzing run I once did for libbpg, so they were produced by AFL when creating BPG files for a native library. But the chm file triggering the other issue was a file that I downloaded a long time ago from the fuzzing project. It was a file Hanno Böck provided that came out of a fuzzing run for CHMLib. Interesting.

So here I was and had already found an uncaught exception in Apache Commons and two low severity issues in Apache Tika without even starting to do proper fuzzing.

To get an idea of the Java classes causing the issue I ran Apache Tika with a debugger and the triggering file, stopped the execution during the infinite loop and printed a stack trace. But most of the hard work to figure out the actual root causes of these issues was done by the maintainers, most importantly by Tim Allison and the Apache Tika team. That is also true for all the upcoming issues.

Fuzzing Apache Tika with Kelinci

After sorting out the input files that resulted in a hang, I started a couple of afl-fuzz fuzzing instances and waited. The behavior of the Kelinci fuzzer is sometimes a little brittle, so I often got the “Queue full” error message. It means the fuzzer is not running properly anymore and that timeouts will occur. I had to restart the fuzzing instances several times and tried to tweak the command line settings to improve stability. However, over time the instances often managed to fill up the queue again. Anyway, a couple of instances ran fine and found several “AFL crashes”. Keep in mind that “AFL crashes” in this case just mean uncaught Java exceptions. After looking through and deduplicating issues, I reported the following non-security (or very low severity, a matter of definition) issues to the maintainers of the libraries used by Apache Tika:

The hang directory of AFL did not show any interesting results. After running each of the files in the hang directory with Apache Tika I found a PDF file that took nearly a minute to process, but none of the files lead to a full hang of the Tika thread. I suspect that the synchronization of the two processes was one of the reasons no infinite hangs were found by the fuzzer.

However, at this stage I was most disappointed that none of the crashes indicated that anything outside of the specified Java Security Manager policy was triggered. I guess this was a combination of my brittle configuration of Kelinci and the fact that it is probably not as easy to find arbitrary file read or write issues. But in the end you often simply don’t know what’s exactly the reason for not being successful with fuzzing.

JQF and a bug in Java

At one point I also wanted to try the JQF fuzzer on my ARM fuzzing machines with Apache Tika. It didn’t work for me at first and I found out that OpenJDK on ARM had horrible performance with JQF, so I switched to Oracle’s Java. Additionally, Apache Tika would simply not run with JQF. After the Tika 1.17 issues were fixed in Apache Tika I thought it was time to notify the maintainers of the fuzzers, so they could try to fuzz Apache Tika themselves. Rohan (maintainer of JQF) quickly fixed three independent issues and implemented a test case/benchmark for the fixed Tika 1.18 in JQF. After that I was able to fuzz Tika with my own corpus, but the performance was very bad for various reasons. One reason was the weak ARM boxes, but JQF couldn’t handle timeouts either (AFL’s -t switch). Rohan attempted a fix, but it’s only working sometimes. Rohan was also very quick to implement afl-cmin and said running with a Java Security Manager policy should be no problem. However, I couldn’t try those features properly due to the performance problems on the ARM machines. As I was not in the mood to switch fuzzing boxes, I just tried to get the fuzzer running somehow. After cutting down the input corpus and removing all PDF files that were taking potentially longer to be processed by Apache Tika, the fuzzer crept slowly forward. After not paying attention for 10 days, another hang was found by JQF in Apache Tika 1.18… I thought! However, after submitting this bug to Apache Tika, they pointed out that this was actually a bug in the Java standard libraries affecting Java before version 10 that I rediscovered:

The hang file was created by the JQF fuzzer by modifying a sample QCP file “fart_3.qcp” from the public ffmpeg samples. So without actively targeting Java itself, I had rediscovered a bug in Java’s standard libraries, as Tika used it. Quite an interesting twist.

Adding a x86 fuzzing box

At the same time I also realized that these ARM JQF fuzzer instances were stuck. The endless RIFF loop file was detected as a crash (which might just be bad behavior of JQF for hangs), so I didn’t really know the reason why they were stuck currently. I tried to run the current input file on another machine, but the testcase didn’t hang. So I didn’t figure out why the fuzzer got stuck, but as Rohan pointed out the timeout handling (AFL’s “hangs”) isn’t optimal yet. JQF will detect timeouts when the infinite loop hits instrumented part of the Java code, as it will be able to measure the time that passed. However, JQF will hang for now if a test file makes the code loop forever in non-instrumented code. I removed all RIFF/QCP input files so hopefully I wouldn’t rediscover the RIFF endless loop bug again (I never switched to Java 10) and restarted the fuzzing instances.

I decided to additionally use a 32bit x86 VMWare fuzzing box, maybe it would run more stable there. I setup JQF with Java 8 again and without RIFF files as inputs. The x86 virtual machine performed much better, executing around 10 testcases per second. So I let these instances run for several days… just to realize when I came back that both instances got stuck after 7 hours of running. I checked again if the current input file could be the reason and this time this was exactly the problem, so another bug. Rinse and repeat, the next morning another bug. So after a while (at least 5 iterations) I had a bag full of bugs:

  • An endless loop in Junrar (file 11_hang_junrar_zero_header2.rar), where the code simply never returned when the rar header size is zero. I contacted one of the maintainers, beothorn. It was fixed and this issue ended up as CVE-2018-12418.
  • Infinite loop in Apache Tika’s IptcAnpaParser for handling IPTC metadata (file 12_hang_tika_iptc.iptc), where the code simply never returned. This was fixed and assigned CVE-2018-8017.
  • Infinite loop in Apache PDFbox’ AdobeFontMetricsParser (file 16_570s_fontbox_OOM.afm), after nearly 10 minutes (on my machine) leading to an out of memory situation. This was fixed and assigned CVE-2018-8036.
  • An issue when a specially crafted zip content is read with Apache Commons Compress (file 14_69s_tagsoup_HTMLScanner_oom.zip) that leads to an out of memory exception. This was fixed in Apache Commons Compress and CVE-2018-11771 was assigned. Another zip file created (file 15_680s_commons_IOE_push_back_buffer_full.zip) runs for 11 minutes (on my machine) leading to IOException with a message that the push back buffer is full and is probably related to the issue. Also probably the same issue is a file where Tika takes an arbitrary amount of time (during the tests between 20 seconds and 11 minutes) to process a zip file (file 13_48s_commons_truncated_zip_entry3.zip). This last one is worth a note as JQF correctly detected this as a hang and put it in AFL’s hang directory. The underlying problem of CVE-2018-11771 was that a read operation started to return alternating values of -1 and 345 when called by an InputStreamReader with UTF-16. The minimal code to reproduce is:
@Test
public void testMarkResetLoop() throws Exception {
    InputStream is = Files.newInputStream(Paths.get("C:/14_69s_tagsoup_HTMLScanner_oom.zip"));
    ZipArchiveInputStream archive = new ZipArchiveInputStream(is);
    ZipArchiveEntry entry = archive.getNextZipEntry();
    while (entry != null) {
        if (entry.getName().contains("one*line-with-eol.txt")) {
            Reader r = new InputStreamReader(archive, StandardCharsets.UTF_16LE);
            int i = r.read();
            int cnt = 0;
            while (i != -1) {
                if (cnt++ > 100000) {
                    throw new RuntimeException("Infinite loop detected...");
                }
                i = r.read();
            }
        }
        entry = archive.getNextZipEntry();
    }
}

After all these fixes I ran the fuzzer again on a nightly build of Apache Tika 1.19 and it didn’t find any new issues in more than 10 days. So my approach of fuzzing Tika seems to be exhausted. As always, it doesn’t mean another approach wouldn’t find new issues.

Summary

This is where I stopped my journey of Java fuzzing for now. I was a little disappointed that the approach with the Java Security Manager still did not find any security issues such as SSRF and that I only found ressource management issues. However, I’m pretty sure this strategy is still the way to go, it probably just needs other targets. As you can see there are loose ends everywhere and I’m definitely planning to go back to Java fuzzing:

  • Use Kelinci/JQF with other Apache Commons parsers, e.g. for PNG
  • Write sanitizers such as file or socket opening for native code AFL
  • Contribute to the AFL-based Java fuzzers

However, for now there are other things to break on my stack.

I would like to thank Tim Allison of the Apache Tika project, it was a pleasure to do coordinated disclosure with him. And also a big thanks to Rohan Padhye who was really quick implementing new features in JQF.

Make sure you add the files included on https://github.com/modzero/mod0javaFuzzingResults to your input corpus collection, as we saw it’s worth having a collection of crashes for other libraries when targeting new libraries.

Schubser and his cookie dealing friend

I actually forgot to post this in February, so I’m a little late but the topic is as current as it was back then. One week in February my colleague at modzero AG, Jan Girlich and me took some time to review our tools and make three of them available on github.

Jan wrote a Proof of Concept (PoC) Android app that allows exploiting Java object deserialization vulnerabilities in Android and named this project modjoda (Modzero Java Object Deserialization on Android). To test the issue, he also wrote a vulnerable demo application to try the exploit.

I wrote mod0schubser, which provides a simple TCP- and TLS-level Man-In-The-Middle (MITM) proxy for people with python experience. It can be used when all the other proxy tools seem to be too complicated and you just want to do some modifications of the traffic in Python. Additionally, I wrote the mod0cookiedealer tool, a tool to demonstrate the impact of missing HTTP cookie flags (secure and HTTPonly). If you remember Firesheep, mod0cookiedealer is a modern implementation of Firesheep as a browser web-extension.

BSides Zurich – Nail in the JKS coffin

On Saturday I was happy to speak at the fabulous BSides Zurich about the Java Key Store topic. You can find my slides “Nail in the JKS coffin” as a PDF here. It was my second time at a BSides format and I really like the idea of having a short talk and then some more time to discuss the topic with interested people. I also included the “after the presentation” slides we used for roughly 50% of the discussion time. I hope you enjoyed the talk and I’m looking forward to hear some feedback. Although it was sold out, you should definitely come next year, it was one of my favorite public conferences.

cheers,
floyd

Java Key Store (JKS) format is weak and insecure (CVE-2017-10356)

While preparing my talk for the marvelous BSides Zurich I noticed again how nearly nobody on the Internet warns you that Java’s JKS file format is weak and insecure. While users only need to use very strong passwords and keep the Key Store file secret to be on the safe side (for now!), I think it is important to tell people when a technology is weak. People should stop using JKS now, as I predict a very long phase-out period. JKS was around and the default since Java had its first Key Store. Your security relies on a single SHA-1 calculation here.

Please note that I’m not talking about any other Key Store type (BKS, PKCS#12, etc.), but see the cryptosense website for articles about them.

I don’t want to go into the details “why” JKS is insecure, you can read all about it here:

I wrote an email to the Oracle security team, as I think assigning a CVE number would help people to refer to this issue and raise awareness for developers. My original email sent on September, 18 2017:

I would like to ask Oracle to assign a CVE Number for Java’s weak
encryption in JKS files for secure storage of private keys (Java Key
Store files). JKS uses a weak encryption scheme based on SHA1.

I think it is important to raise awareness that JKS is weak by assigning
a CVE number, even when it is going to be replaced in Java 1.9 with PKCS#12.

The details of the weakness are published on the following URLs:

– As an article in the POC||GTFO 0x15 magazine, I attached it to this
email, the full magazine can also be found on
https://www.alchemistowl.org/pocorgtfo/pocorgtfo15.pdf
– https://cryptosense.com/mighty-aphrodite-dark-secrets-of-the-java-keystore/
– https://github.com/floyd-fuh/JKS-private-key-cracker-hashcat

As the article states, no documentation anywhere in the Java world
mentions that JKS is a weak storage format. I would like to change this,
raise awareness and a CVE assignment would help people refer to this issue.

The timeline so far:

  • September, 18 2017: Notified Oracle security team via email
  • September, 18 2017: Generic response that my email was forwarded to the Oracle team that investigates these issues
  • September, 20 2017: Oracle assigned a tracking number (S0918336)
  • September, 25 2017: Automated email status report: Under investigation / Being fixed in main codeline
  • October, 10 2017: Requested an update and asked if they could assign a CVE number
  • October, 11 2017: Response, they are still investigating.
  • October, 13 2017: Oracle writes “We have confirmed the issue and will be addressing it in a future release”. In an automated email I get Oracle states “The following issue reported by you is fixed in the upcoming Critical Patch Update, due to be released at 1:00 PM, U.S. Pacific Time, on October 17, 2017.”.
  • October 17, 2017: Oracle assigned a CVE in their Oracle Critical Patch Update Advisory – October 2017: CVE-2017-10356. The guys from Cryptosense got credited too it seems. However, the documentation of Oracle so far didn’t change anywhere I could see it.
  • November 16, 2017: I asked again to clarify what the countermeasures are and what they are planning to do with JKS. They seem to be mixing my CVE and the JKS issues with other issue in other Key Store types.
  • November 17, 2017: Oracle replied (again, mixing-in issues of other Key Store types): “In JDK 9 the default keystore format is PKCS#11 which doesn’t have the limits of the JKS format — and we’ve put in some migration capability also. For all versions we have increased the iteration counts [sic!] used significantly so that even though the algorithms are weak, a brute-force search will take a lot longer. For older versions we will be backporting the missing bits of PKCS#11 so that it can be used as the keystore type.”. That was the good part of the answer, even though JKS has no iteration count. The second part where I asked if they could add some links to their Critical Path Update Advisory was: “In order to prevent undue risks to our customers, Oracle will not provide additional information about the specifics of vulnerabilities beyond what is provided in the Critical Patch Update (or Security Alert) advisory and pre-release note, the pre-installation notes, the readme files, and FAQs.”.

That’s it for me for now. I’m too tired to start arguing about keeping technical details secret. So basically I have to hope that everyone finds this blog posts when searching for CVE-2017-10356.

Cracking Java’s weak encryption – Nail in the JKS coffin

POC||GTFO journal edition 0x15 came out a while ago and I’m happy to have contributed the article “Nail in the JKS coffin”. You should really read the article, I’m not going to repeat myself here. I’ve also made the code available on my “JKS private key cracker hashcat” github repository.

For those who really need a TL;DR, the developed cracking technique relies on three main issues with the JKS format:

  1. Due the unusual design of JKS the key store password can be ignored and the private key password cracked directly.
  2. By exploiting a weakness of the Password Based Encryption scheme for the private key in JKS described by cryptosense, the effort to try a password is minimal (one SHA-1 calculation).
  3. As public keys are not encrypted in the JKS file format, we can determine the algorithm and key size of the public key to know the PKCS#8 encoded fingerprint we have to expect in step 2.

For a practical TL;DR, see the github repository on how JksPrivkPrepare.jar can be used together with the hashcat password cracking tool to crack passwords.

Not affected of the described issues are other key store file formats such as JCEKS, PKCS12 or BKS. It is recommended to use the PKCS12 format to store private keys and to store the files in a secure location. For example it is recommended to store Android app release JKS files somewhere else than a repository such as git.

OWASP AntiSamy Project XSS

From the OWASP AntiSamy Project page’s “What is it” section:

It’s an API that helps you make sure that clients don’t supply malicious cargo code in the HTML they supply for their profile, comments, etc., that get persisted on the server. The term “malicious code” in regards to web applications usually mean “JavaScript.” Cascading Stylesheets are only considered malicious when they invoke the JavaScript engine. However, there are many situations where “normal” HTML and CSS can be used in a malicious manner. So we take care of that too.

So as far as I understand it, it is trying to prevent Cross Site Scripting (XSS). But to be fair, the user guide is a little bit more realistic:

AntiSamy does a very good job of removing malicious HTML, CSS and JavaScript, but in security, no solution is guaranteed. AntiSamy’s success depends on the strictness of your policy file and the predictability of browsers’ behavior. AntiSamy is a sanitization framework; it is up to the user how it does its sanitization. Using AntiSamy does not guarantee filtering of all malicious code. AntiSamy simply does what is described by the policy file.

Anyway, I found a XSS that worked in the case of the web application I was testing:

<a href="http://example.com"&amp;/onclick=alert(8)>foo</a>

Version: antisamy-1.5.2.jar with all default configuration files there are.

Browsers tested (all working): Firefox 22.0 (on Mac OSX), Safari 6.0.5 (on Mac OSX), Internet Explorer 11.0.9200 (Windows 7) and Android Browser (Android 2.2).

Disclosure timeline:
July 16th, 2013: Wrote to Arshan (maintainer) about the issue
July 16th, 2013: Response, questions about version and browser compatiblity
July 16th, 2013: Clarification about versions/browser and that I only tried getNumberOfErrors(), informed that I’m planning to release this blog post end of August
July 23rd, 2013: Some more E-Mails about similar issue that was just resolved (not the same issue), including Kristian who comitted a fix
July 25th, 2013: Kristian sent a mail, he will have a look at the getNumberOfErrors() logic before releasing an update
July 31st, 2013: Asked if there are any updates on the issue, no response
Sept 09th, 2013: Asked if there are any updates on the issue, response that it should be fixed. Requested new .jar file
Oct 21st, 2013: Tested with the newest version available for download, antisamy 1.5.3. Problem still present. Public release.

Antisamy doesn’t give any error messages and getNumberOfErrors() is 0. Although the getCleanHTML() will give back sanitised code without the XSS, people relying only on the getNumberOfErrors() method to check if an input is valid or not will have a XSS.

Btw. the included configuration file names are somehow misleading (the names include company names like ebay). Those names are made up, doesn’t mean the companies use those conifg files at all. I don’t even know if they are using the antisamy project at all.

I can’t recommend relying on that project. Proper output encoding is important and is the real XSS prevention. And validating HTML with regex is hard. Very hard. Very, very hard. Don’t.

A Weird Idea – Java, Strings and Memory

Disclaimer: I’m not a Java internals specialist. Let me know if I got a point wrong here.

Wiping sensitive information when you don’t need it anymore is a very basic security rule. It makes sense to everyone when we talk about hard discs, but the same rule should be applied to memory (RAM). While memory management is quite straight forward in C derived languages (eg. allocate some memory, override it with random bytes when the sensitive information is not needed anymore), it’s quite hard to do it in Java. Is it even possible? Java Strings are immutable, that means you can not change the memory allocated for a String. The Java VM will make a copy of the String and change the copy. So from this perspective, Strings are not a good way to store sensitive information. So let’s do it in char arrays I thought and wrote the following Java class:

/**
* ----------------------------------------------------------------------------
* "THE BEER-WARE LICENSE" (Revision 42):
* <floyd at floyd dot ch> wrote this file. As long as you retain this notice you
* can do whatever you want with this stuff. If we meet some day, and you think
* this stuff is worth it, you can buy me a beer in return
* floyd http://floyd.ch @floyd_ch <floyd at floyd dot ch>
* August 2012
* ----------------------------------------------------------------------------
**/

import java.util.Arrays;
import java.io.IOException;
import java.io.Serializable;
import java.lang.Exception;

/**
 * ATTENTION: This method is only ASCII safe, not Unicode.
 * It should be used to store sensitive information which are ASCII
 * 
 * This class keeps track of the passed in char[] and wipes all the old arrays
 * whenever they are changed. This means:
 * 
 * char[] r1 = {0x41, 0x41, 0x41, 0x41, 0x41, 0x41};
 * SecureString k = new SecureString(r1);
 * k.destroy();
 * System.out.println(r1); //Attention! r1 will be {0x00, 0x00, 0x00, 0x00, 0x00, 0x00}
 * 
 * char[] r2 = {0x41, 0x41, 0x41, 0x41, 0x41, 0x41};
 * SecureString k = new SecureString(r2);
 * k.append('C');
 * System.out.println(r2); //Attention! r2 will be {0x00, 0x00, 0x00, 0x00, 0x00, 0x00}
 * System.out.println(k.getByteArray()); //correct
 * 
 * General rule for this class:
 * - Never call toString() of this class. It will just return null:
 * char[] r3 = {0x41, 0x41, 0x41, 0x41, 0x41, 0x41};
 * SecureString k = new SecureString(r3);
 * System.out.println(k); //will print null, because it calls toString()
 * 
 * - Choose the place to do the destroy() wisely. It should be done when an operation
 * is finished and the string is not needed anymore.
 */

public class SecureString implements Serializable{
	private static final long serialVersionUID = 1L;
	private char[] charRepresentation;
	private char[] charRepresentationLower;
	private byte[] byteRepresentation;
	private boolean isDestroyed = false;
	private int length=0;
	private final static int OBFUSCATOR_BYTE = 0x55;
	
	/**
	 * After calling the constructor you should NOT wipe the char[]
	 * you just passed in (the stringAsChars reference)
	 * @param stringAsChars
	 */
	public SecureString(char[] stringAsChars){
		//Pointing to the same address as original char[]
		this.length = stringAsChars.length;
		this.initialise(stringAsChars);
	}
	
	/**
	 * After calling the constructor you should NOT wipe the byte[]
	 * you just passed in (the stringAsBytes reference)
	 * @param stringAsBytes
	 */
	public SecureString(byte[] stringAsBytes){
		//Pointing to the same address as original byte[]
		this.length = stringAsBytes.length;
		this.initialise(stringAsBytes);
	}
	
	/**
	 * @return length of the string
	 */
	public int length(){
		return this.length;
	}
	
	/**
	 * Initialising entire object based on a char array
	 * @param stringAsChars
	 */
	private void initialise(char[] stringAsChars){
		this.length = stringAsChars.length;
		charRepresentation = stringAsChars; 
		charRepresentationLower = new char[stringAsChars.length];
		byteRepresentation = new byte[stringAsChars.length];
		for(int i=0; i<stringAsChars.length; i++){
			charRepresentationLower[i] = Character.toLowerCase(charRepresentation[i]);
			byteRepresentation[i] = (byte)charRepresentation[i];
		}
	}
	
	/**
	 * Initializing entire object based on a byte array
	 * @param stringAsBytes
	 */
	private void initialise(byte[] stringAsBytes){
		this.length = stringAsBytes.length;
		byteRepresentation = stringAsBytes;
		charRepresentation = new char[stringAsBytes.length];
		charRepresentationLower = new char[stringAsBytes.length];
		for(int i=0; i<stringAsBytes.length; i++){
			charRepresentation[i] = (char) byteRepresentation[i];
			charRepresentationLower[i] = Character.toLowerCase(charRepresentation[i]);
		}
	}
	
	/**
	 * We first obfuscate the string by XORing with OBFUSCATOR_BYTE
	 * So it doesn't get serialized as plain text. THIS METHOD
	 * WILL DESTROY THIS OBJECT AT THE END.
	 * @param out
	 * @throws IOException
	 */
	private void writeObject(java.io.ObjectOutputStream out) throws IOException{		
		out.writeInt(byteRepresentation.length);
		for(int i = 0; i < byteRepresentation.length; i++ ){
			out.write((byte)(byteRepresentation[i]) ^ (byte)(OBFUSCATOR_BYTE));
		}
		//TODO: Test if the Android Intent is really only calling writeObject once,
		//otherwise this could be a problem. Then I suggest that the SecureString
		//is not passed in an intent (and never serialized)
		destroy();
	}
	
	/**
	 * We first have to deobfuscate the string by XORing with OBFUSCATOR_BYTE
	 * @param out
	 * @throws IOException
	 */
	private void readObject(java.io.ObjectInputStream in) throws IOException, ClassNotFoundException{
		//we first have to deobfuscate
		int newLength = in.readInt();
		byteRepresentation = new byte[newLength];
		for(int i = 0; i < newLength; i++ ){
			byteRepresentation[i] = (byte) ((byte)(in.read()) ^ (byte)(OBFUSCATOR_BYTE));
		}
		initialise(byteRepresentation);
	}
	
	/**
	 * Use this method wisely
	 * 
	 * When the destroy() method of this SecureString is called, the byte[] 
	 * this method had returned will also be wiped!
	 * 
	 * Never WRITE directly to this byte[], but only READ
	 * 
	 * @return reference to byte[] of this SecureString
	 */
	public byte[] getByteArray(){
		if(this.isDestroyed)
			return null;
		return byteRepresentation;
	}
	
	/**
	 * This method should never be called.
	 */
	@Deprecated
	@Override
	public String toString(){
		return null;
	}
	
	/**
	 * This method should never be called, because it means you already stored the
	 * sensitive information in a String.
	 * @param string
	 * @return
	 * @throws Exception
	 */
	
	@Deprecated
	public boolean equals(String string) throws Exception{
		throw new Exception("YOU ARE NOT SUPPOSED TO CALL equals(String string) of SecureString, "+
				"because your are not supposed to have the other value in a string in the first place!");
	}
	/**
	 * This method should never be called, because it means you already stored the
	 * sensitive information in a String.
	 * @param string
	 * @return
	 * @throws Exception
	 */
	
	@Deprecated
	public boolean equalsIgnoreCase(String string) throws Exception{
		throw new Exception("YOU ARE NOT SUPPOSED TO CALL equalsIgnoreCase(String string) of SecureString, "+
				"because your are not supposed to have the other value in a string in the first place!");
	}
	/**
	 * Comparing if the two strings stored in the SecureString objects are equal
	 * @param secureString2
	 * @return true if strings are equal
	 */
	public boolean equals(SecureString secureString2){
		if(this.isDestroyed)
			return false;
		return Arrays.equals(this.charRepresentation, secureString2.charRepresentation);
	}
	
	/**
	 * Comparing if the two strings stored in the SecureString objects are equalsIgnoreCase
	 * @param secureString2
	 * @return true if strings are equal ignoring case
	 */
	public boolean equalsIgnoreCase(SecureString secureString2){
		if(this.isDestroyed)
			return false;
		return Arrays.equals(this.charRepresentationLower, secureString2.charRepresentationLower);
	}
	
	/**
	 * Delete a char at the given position
	 * @param index
	 */
	public void deleteCharAt(int index){
		if(this.isDestroyed)
			return;
		if(this.length == 0 || index >= length)
			return;
		wipe(charRepresentationLower);
		wipe(byteRepresentation);
		char[] newCharRepresentation = new char[length-1];
		for(int i=0; i<index; i++)
			newCharRepresentation[i] = charRepresentation[i];
		for(int i=index+1; i<this.length-1; i++)
			newCharRepresentation[i-1] = charRepresentation[i];
		wipe(charRepresentation);
		initialise(newCharRepresentation);
	}
	
	/**
	 * Append a char at the end of the string
	 * @param c
	 */
	public void append(char c){
		wipe(charRepresentationLower);
		wipe(byteRepresentation);
		char[] newCharRepresentation = new char[length+1];
		for(int i=0; i<this.length; i++)
			newCharRepresentation[i] = charRepresentation[i];
		newCharRepresentation[length-1] = c;
		wipe(charRepresentation);
		initialise(newCharRepresentation);
	}
	
	/**
	 * Internal method to wipe a char[]
	 * @param chars
	 */
	private void wipe(char[] chars){
		for(int i=0; i<chars.length; i++){
			//set to hex AA
			int val = 0xAA;
			chars[i] = (char) val; 
			//set to hex 55
			val = 0x55;
			chars[i] = (char) val; 
			//set to hex 00
			val = 0x00;
			chars[i] = (char) val; 
		}
	}
	
	/**
	 * Internal method to wipe a byte[]
	 * @param chars
	 */
	private void wipe(byte[] bytes){
		for(int i=0; i<bytes.length; i++){
			//set to hex AA
			int val = 0xAA;
			bytes[i] = (byte) val; 
			//set to hex 55
			val = 0x55;
			bytes[i] = (byte) val; 
			//set to hex 00
			val = 0x00;
			bytes[i] = (byte) val;
		}
	}
	
	/**
	 * Safely wipes all the data
	 */
	public void destroy(){
		if(this.isDestroyed)
			return;
		wipe(charRepresentation);
		wipe(charRepresentationLower);
		wipe(byteRepresentation);
		
		//loose references
		charRepresentation = null; 
		charRepresentationLower = null; 
		byteRepresentation = null; 
		
		this.length = 0;
		
		//TODO: call garbage collector?
		//Runtime.getRuntime().gc();
		
		this.isDestroyed = true;		
	}
	
	/**
	 * This method is only to check if non-sensitive data
	 * is part of the SecureString
	 * @return true if SecureString contains the given String
	 */
	public boolean contains(String nonSensitiveData){
		if(this.isDestroyed)
			return false;
		if(nonSensitiveData.length() > this.length)
			return false;
		char[] nonSensitiveDataChars = nonSensitiveData.toCharArray();
		int positionInNonSensitiveData = 0;
		for(int i=0; i < this.length; i++){
			if(positionInNonSensitiveData == nonSensitiveDataChars.length)
				return true;
			if(this.charRepresentation[i] == nonSensitiveDataChars[positionInNonSensitiveData])
				positionInNonSensitiveData++;
			else
				positionInNonSensitiveData = 0;
		}
		return false;
	}
}

Does that make any sense? How does the Java VM handle this? And how does the Android Dalvik Java VM handle this? Does the Android system cache the UI inputs anyway (so there would be no point in wiping our copies)? Is the data really going to be wiped when I use this class? Where are the real Java internals heroes out there?

Update 1, 11th September 2013: Added Beerware license. Note: This was just some idea I had. Use at your own risk.

Google Ads Encryption Key

Just this last time I will talk about this encryption key, but then I’ll shut up (for me this is so 2011). I presented this thing last year at conferences, but if you just look at the slides I think it’s hard to get the point.

During my Android research in 2011 I encountered several bad practises that are wide spread. Especially hard-coded symmetric encryption keys in the source code of Android apps are often used. One special occurence of one of these encryption keys I just couldn’t forget: Why would Google (Ads) themself use such a thing? It’s not a big deal, but it just adds no real protection and I kept wondering.

445 apps I decompiled used Google Ads code and included the following AES symmetric encryption key:

byte[] arrayOfByte1 = { 10, 55, -112, -47, -6, 7, 11, 75, 
                       -7, -121, 121, 69, 80, -61, 15, 5 };

This AES key is used to encrypt the last known location of the user (all location providers: GPS, WIFI, etc) and send it to Google via the JSON uule parameter. Most of the time the key is located in com.google.ads.util.AdUtil.java, com.google.ads.LocationTracker.java or if the app uses an obfuscator in u.java or r.java. Here’s the corresponding code:

String getLocationParam(){
    List localList = getLastKnownLocations();
    StringBuilder localStringBuilder1 = new StringBuilder();
    int i = 0;
    int j = localList.size();
    if (i < j){
      Location localLocation = (Location)localList.get(i);
      String str1 = protoFromLocation(localLocation);
      String str2 = encodeProto(str1);
      if (str2 != null){
        if (i != 0)
          break label89;
        StringBuilder localStringBuilder2 = localStringBuilder1.append("e1+");
      }
      while (true){
        StringBuilder localStringBuilder3 = localStringBuilder1.append(str2);
        i += 1;
        break;
        label89: StringBuilder localStringBuilder4 = localStringBuilder1.append("+e1+");
      }
    }
    return localStringBuilder1.toString();
  }

String encodeProto(String paramString){
    try{
      Cipher localCipher = Cipher.getInstance("AES/CBC/PKCS5Padding");
      byte[] arrayOfByte1 = { 10, 55, 144, 209, 250, 7, 11, 75, 249, 135, 121, 69, 80, 195, 15, 5 };
      SecretKeySpec localSecretKeySpec = new SecretKeySpec(arrayOfByte1, "AES");
      localCipher.init(1, localSecretKeySpec);
      byte[] arrayOfByte2 = localCipher.getIV();
      byte[] arrayOfByte3 = paramString.getBytes();
      byte[] arrayOfByte4 = localCipher.doFinal(arrayOfByte3);
      int i = arrayOfByte2.length;
      int j = arrayOfByte4.length;
      byte[] arrayOfByte5 = new byte[i + j];
      int k = arrayOfByte2.length;
      System.arraycopy(arrayOfByte2, 0, arrayOfByte5, 0, k);
      int m = arrayOfByte2.length;
      int n = arrayOfByte4.length;
      System.arraycopy(arrayOfByte4, 0, arrayOfByte5, m, n);
      String str1 = Base64.encodeToString(arrayOfByte5, 11);
      str2 = str1;
      return str2;
    }
    catch (GeneralSecurityException localGeneralSecurityException){
      while (true)
        String str2 = null;
    }
  }

This code was produced by a decompiler, so it won’t compile, but you get the idea. Instead of using symmetric crypto, they should have used asymmetric crypto (public key in app, private key on server). There would be no way to intercept the last known location on the network then. Ok, here’s the timeline:

June 2011: Notified security@google.com (auto-response)
October 2011: Talked about it at 0sec / #days and spoke to two friends at Google who are not in the Android/Ads team. One of them (thank you!) sent my mail to the responsible people at Google.
Februar 2012: After a couple of emails I got the answer that they are going to switch to SSL

If you ever find the uule parameter on your network, let me know. One last thing: Depending on your decompiler, the AES key will look like this (wrong decompilation):

byte[] arrayOfByte1 = { 10, 55, 144, 209, 250, 7, 11, 75, 
                       249, 135, 121, 69, 80, 195, 15, 5 };

This is not a valid way to define a byte array in Java, because the integers must be between -127 and 128 (signed integers). That’s another funny fact: if you ever have to brute force a Java byte array like this, I would start with keys where every byte starts with a 0 bit (MSB=0). I saw a lot more of those keys. It seems that some Java developers don’t know about the instantiation with signed integers and therefore choose only numbers between 0 and 128.