How To Win At Java Code Audit

Reviewing Java source code can pose a challenge for a security auditor, as methods used to exploit programs in C or C++, namely memory corruption bugs, are mitigated by Java itself, which hides the details of memory management from the programmer. This same tendency to hide implementation details with a layer of abstraction leads to an entire class of common Java programming errors which can have a critical impact on the security of the application.

Java vulnerabilities are most commonly found in places where unsanitized user input is passed, directly or indirectly, on to an underlying library or service. To put it another way, vulnerabilities aren’t found in the Java code itself; they are found by following user input through the Java source and out the other side.

The tendency of Java to hide implementation details from the developer actually creates these vulnerabilities in places where they might not otherwise exist. Java developers use wrapper libraries for backend services, such as SQL or LDAP, and assume that they automatically sanitize their inputs, when usually they do not. In most cases, Java wrapper libraries themselves are simply classes that store and manipulate strings which are just passed directly on to the wrapped service. In many of these implementations, such as the ORM library Hibernate, there are architectural reasons why this behavior cannot be changed.

In this post, I will describe a class of extremely common Java vulnerabilities, specifically these “pass-through” bugs, characterized by user input passing directly through Java unexamined.

For our first example, we’ll look at one of the most commonly used (and misused) constructs in the Java programming language:

The File Class Constructor

The File class has several constructors, but the most common takes a single string argument, which is the full path to a file. The second most commonly used constructor takes two string arguments, which are effectively appended together and treated the same as the single string argument.

The Java documentation uses the word ‘canonicalization’ all over the place, and many people understand it as “All the dot-dot-slashes are removed.” Strictly speaking, the File constructor only normalizes separators; a path is canonicalized when getCanonicalPath() is called, and the operating system resolves any remaining “../” sequences when the file is opened.

While it is technically true that a canonicalized path has no path meta-characters, canonicalization doesn’t simply remove them – it resolves them correctly!

For example, the path “/www/hosts/mydomain.org/docs/../../../../etc/” would be “/etc/” after canonicalization.
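As a rough illustration of the resolution described above, the sketch below uses Path.normalize(), which resolves “.” and “..” segments purely lexically. This is an assumption made for the sake of a portable example: File.getCanonicalPath() additionally consults the file system (for instance, to resolve symbolic links), so its result can differ. The class and method names are hypothetical.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

final class NormalizeDemo {
    // Lexically resolves "." and ".." segments without touching the file system.
    static Path resolveDots(String rawPath) {
        return Paths.get(rawPath).normalize();
    }

    public static void main(String[] args) {
        // The example path from the text: four "../" segments climb out of docs.
        System.out.println(resolveDots("/www/hosts/mydomain.org/docs/../../../../etc/"));
    }
}
```

On a Unix-like system this prints /etc: the “../” segments were resolved, not stripped.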

This confusion over what canonicalization means commonly leads to directory traversal vulnerabilities in Java-based services.

Imagine a simple web server in Java, which does the following:

Accepts an HTTP request for a particular URL: “http://www.mydomain.org/PATH”
Calls the File constructor with the web root and path: File f = new File("/www/hosts/mydomain.org/docs", PATH);
Opens the file and returns it to the requester as an HTTP response.

Perhaps the assumption is that somehow the File constructor filters out path meta-characters such as “../”, which it doesn’t. Some developers assume that the first argument to the File constructor will somehow act like a chroot and prevent “../” in the second argument from traversing to a higher directory. This is not the case, as both arguments are simply appended and treated as one big path string.
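A minimal sketch of the two-argument constructor's behavior, assuming the web root from the example above and a hypothetical attacker-supplied PATH. The class and method names are illustrative only, and Path.normalize() is used here just to show where the joined path really points.

```java
import java.io.File;
import java.nio.file.Path;

final class TwoArgFileDemo {
    // Builds the File the same way the web server sketch in the text does.
    static File fromRequest(String webRoot, String requestPath) {
        return new File(webRoot, requestPath);
    }

    // Lexically resolves ".." to show where the joined path really points.
    static Path effectivePath(String webRoot, String requestPath) {
        return fromRequest(webRoot, requestPath).toPath().normalize();
    }

    public static void main(String[] args) {
        String root = "/www/hosts/mydomain.org/docs";
        String path = "../../../../etc/passwd"; // hypothetical attacker-supplied PATH
        System.out.println(fromRequest(root, path).getPath());
        System.out.println(effectivePath(root, path));
    }
}
```

On a Unix-like system the first line printed still contains every “../” segment, and the second is /etc/passwd: the first argument did nothing to confine the second.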

Whatever the developer assumptions, this error appears in different variations across a surprisingly large percentage of Java code.

This category of errors comes from the fact that Java can’t interact with the file system directly – it has to pass path information through to the operating system. In fact, the specific path meta-characters that can lead to injection will vary from platform to platform – even though Java tries to be “platform independent”! An obvious example: “../” on a Linux system is the same as “..\” on Windows.

To find these errors, simply search for places where user-controlled input is passed directly in to the File class constructor, without any additional logic to reject path meta-characters such as “../” or “..\”, or to verify that the resolved path stays inside the intended directory.
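When auditing, it helps to know what the missing logic might look like. The following is one possible sketch, not a definitive fix: it canonicalizes both the web root and the requested file, then checks that the result is still under the root. The class name, method name, and the choice of SecurityException are all assumptions made for illustration.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;

final class SafeFileResolver {
    // Resolves requestPath under webRoot, or throws if the result escapes webRoot.
    static File resolveUnderRoot(File webRoot, String requestPath) {
        try {
            File root = webRoot.getCanonicalFile();
            File candidate = new File(root, requestPath).getCanonicalFile();
            // Path.startsWith compares whole path components, not raw characters.
            if (!candidate.toPath().startsWith(root.toPath())) {
                throw new SecurityException("Path escapes web root: " + requestPath);
            }
            return candidate;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // The system temp directory stands in for a real web root here.
        File root = new File(System.getProperty("java.io.tmpdir"));
        System.out.println(resolveUnderRoot(root, "index.html"));
        try {
            resolveUnderRoot(root, "../outside.txt");
        } catch (SecurityException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Checking the resolved path, rather than stripping “../” from the input string, avoids having to enumerate every platform-specific meta-character.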

Our next example of a “pass-through” bug is in the use of a common Java logging library:

Log4J Javascript Injection

The most commonly used Java logging library is Log4J from Apache. Log4J provides a number of different methods that write data to a log file, for example:
Logger log = Logger.getLogger("mylogger");
log.error("This is an error message");
log.warn("This is a warning message");
log.debug("This is a debug message");

Log4J does not sanitize the strings passed in to its logging methods; it simply takes the string it is given and writes it directly to the log file.

Most web applications that use Log4J include user-supplied values in at least some of their logging messages, for example:

protected void doGet(HttpServletRequest request, HttpServletResponse response)
        throws ServletException, IOException {
    // do stuff, then on error something like:
    Logger logger = Logger.getLogger("GetLogger");
    logger.error("Invalid value for parameter fnord: " + request.getParameter("fnord"));
}

This makes some kind of sense – an error condition has been caused by invalid input, so the developer wants to see what the bad input was.

Web developers are also in the habit of viewing their web application log files directly from the web server. Sometimes they even include HTML formatting tags in their calls to Log4J methods so that the logs will be formatted nicely in the browser.

Imagine the following common scenario:

The production web server is on the domain http://somesite.com and the QA server is on http://qa.somesite.com.
Developers working on the QA server routinely view the Log4J logs through the browser by visiting http://qa.somesite.com/logs/mylog.
The subdomain www.somesite.com redirects to somesite.com, so all of the regular site’s domain cookies are for somesite.com.

Now we construct a standard Cross-Site Scripting cookie-stealing attack by injecting some Javascript into the “fnord” parameter mentioned above. The program will then save this Javascript into the Log4J logs.

When the developer opens these logs in the browser, the Javascript will execute, and have access to the cookies for the domain qa.somesite.com.

However, the situation is much worse than this. There is a single exception to the Same Origin Policy which happens to apply to this situation.

A script can set its effective domain by assigning to the “document.domain” property. The exception to the Same Origin Policy allows this, provided the script only sets its domain to a suffix of its current domain.

This means that an attack script that runs from qa.somesite.com can reset its domain to somesite.com, and then access all of somesite.com‘s cookies!

A successful log-file script injection on a QA, development, or staging server which is a subdomain can result in the theft of developers’ saved credentials for the main domain!

This category of attacks passes directly through Java into a log file, which is then loaded by a browser.

To find these types of errors, look for calls to Log4J or similar logging functions which include HTML formatting tags and/or unsanitized user input.
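For comparison, here is a rough sketch of the kind of sanitization whose absence you are looking for. It is a hypothetical helper, not part of Log4J: it HTML-encodes markup characters and replaces line breaks before a user-supplied value is concatenated into a log message. The class name, method name, and the “_” replacement character are assumptions.

```java
final class LogSanitizer {
    // Neutralizes HTML metacharacters and line breaks before a value is logged.
    static String forLog(String value) {
        if (value == null) {
            return "null";
        }
        StringBuilder out = new StringBuilder(value.length());
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            switch (c) {
                case '<': out.append("&lt;"); break;
                case '>': out.append("&gt;"); break;
                case '&': out.append("&amp;"); break;
                case '"': out.append("&quot;"); break;
                case '\'': out.append("&#39;"); break;
                case '\r':
                case '\n': out.append('_'); break; // keep each entry on one line
                default: out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String fnord = "<script>alert(1)</script>"; // hypothetical attacker input
        System.out.println("Invalid value for parameter fnord: " + forLog(fnord));
    }
}
```

In the servlet example, the call would become something like logger.error("Invalid value for parameter fnord: " + LogSanitizer.forLog(request.getParameter("fnord"))). A logging call that concatenates request data without a step like this is worth flagging.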

Stay tuned for the second exciting installment of “How to Win at Java Code Audit”, including LDAP injection, Null Byte Injection, and ORM injection!
