MySQL Regex Conditional Errors and SQL Injection

Until a few days ago, I did not know any way of causing MySQL to throw an error based on a condition in a query. There is no documented way of doing this, but sometimes when trying to exploit a non-trivial SQL injection, you face situations where you do need to be able to force the backend DB to throw an error in order to get some feedback and carry on with your attack. This happened to a colleague of mine, who was stuck trying to exploit a SQL injection against a PHP + MySQL web application. The scenario was the following: the website appeared to be vulnerable to SQL injection, but no helpful error messages were shown back to the user, and common exploitation vectors seemed to be filtered (use of SELECT, ORDER BY, AND, UNION, etc.). After doing some research, I found a technique that allows MySQL users to force the database to throw an error using regular expressions, and without the need of using any of the mentioned filtered statements. I would like to highlight that by no means am I showcasing a fancy new technique, this has been out there for a while now, it is just something that is not very common and that I was not aware of.

Regular Expressions and MySQL

Since the latest version of MySQL 3, this database provides support for Regular Expressions within a SQL query (https://dev.mysql.com/doc/refman/5.7/en/regexp.html). To this aim, MySQL comes with the operators REGEX and RLIKE (just a synonym). A simple example of the use of a regular expression (regex) in a SQL query could be the following, which would return ‘1’ (true) as the number 2 belongs to the matching string: SELECT '12345' RLIKE '2'; 1 On the contrary, the following example would return ‘0’ (false) as the number 6 does not belong to the matching string: SELECT '12345' RLIKE '6'; 0 Now, let’s try something slightly different. What would happen if we left the regular expression empty? SELECT '12345' RLIKE ''; ERROR 1139 (42000): Got error 'empty (sub)expression' from regexp As we can see, MySQL returns an error, complaining about an empty expression in the query. This is quite interesting, and can prove very helpful when exploiting a hard filtered SQL injection. Think of the following example: a website has a page that lists the current users, and a search function that filters by username. For the sake of the example, let’s also suppose the scenario is similar to the one found by my colleague, which is: no helpful errors outputted, and a certain level of filtering against common SQL statements. A basic request would look something like: https://www.example.com/searchUser?username=John And we can guess that the underlying (vulnerable) SQL query would be similar to: SELECT * FROM `example.user` WHERE `username` = [$username]; We could use basic blind SQL injection techniques in conjunction with RLIKE to force the application to crash under some circumstances: https://www.example.com/searchUser?username=(‘a’)RLIKE(if(1=1,'(‘,’a’)) In the above request, we are saying that if the condition (1=1 in this case) is true, then a regex with the value of ‘(‘ should be evaluated, otherwise it should evaluate ‘a’. The key thing here is that if the regex is evaluated with ‘(‘, it will provoke an error (similar to the one we saw before with the empty expression) and crash. By contrast, if the expression is evaluated with a valid regex (‘a’ in this case) it will continue its execution flow normally. This of course, can be used to get more relevant information.

Speeding Up Exploitation

It is not only possible to cause MySQL to throw an error using this technique, but it is also possible to speed up the exploitation process under some circumstances. MySQL returns different error messages depending on the cause of the failure, as shown below. This behaviour can be used to speed up the exploitation of a SQL injection but with a caveat, that it will only work if the application shows the error message back to the user. SELECT 1 REGEXP '' ERROR 1139 (42000): Got error 'empty (sub)expression' from regexp SELECT 1 REGEXP '(' ERROR 1139 (42000): Got error 'parentheses not balanced' from regexp SELECT 1 REGEXP '[' ERROR 1139 (42000): Got error 'brackets ([ ]) not balanced' from regexp SELECT 1 REGEXP '\\' ERROR 1139 (42000): Got error 'trailing backslash (\)' from regexp SELECT 1 REGEXP '*' ERROR 1139 (42000): Got error 'repetition-operator operand invalid' from regexp SELECT 1 REGEXP 'a{1,1,1}' ERROR 1139 (42000): Got error 'invalid repetition count(s)' from regexp SELECT 1 REGEXP '[a-9]' ERROR 1139 (42000): Got error 'invalid character range' from regexp SELECT 1 REGEXP 'a{1,' ERROR 1139 (42000): Got error 'braces not balanced' from regexp SELECT 1 REGEXP '[[.ab.]]' ERROR 1139 (42000): Got error 'invalid collating element' from regexp SELECT 1 REGEXP '[[:ab:]]' ERROR 1139 (42000): Got error 'invalid character class' from regexp Usually, when exploiting a blind SQL injection 8 requests would need to be sent to a vulnerable web application to extract one byte of data from its database. Since the only value one request can extract is either true or false, one request for each of the 8 bits in a byte is needed. By utilizing conditional errors, instead of having 2 distinguishable states, 11 different states can be distinguished, 10 for the different error messages and 1 if no error occurred. Using these 11 states, 47% of all the 256 possible values of a byte could be determined in only 2 requests, another 47% in 3 requests, and the remaining 6% in 4 requests. If the possible values were narrowed down to only the printable characters (ASCII decimal 32-127), then 100% could be determined in 2 requests, or if the possible values were further narrowed down to numeric (0-9), only 1 request for each digit would be needed. A simple example to understand this is the following, where the first character of the MySQL version is being guessed: https://www.example.com/searchUser?username=('a')RLIKE (IF(ASCII(SUBSTRING((SELECT version()),1,1))<31,'', IF(ASCII(SUBSTRING((SELECT version()),1,1))<52,'(', IF(ASCII(SUBSTRING((SELECT version()),1,1))<73,'[', IF(ASCII(SUBSTRING((SELECT version()),1,1))<94,'\\\\', IF(ASCII(SUBSTRING((SELECT version()),1,1))<115,'*', IF(ASCII(SUBSTRING((SELECT version()),1,1))<136,'a{1,1,1}', IF(ASCII(SUBSTRING((SELECT version()),1,1))<157,'[a-9]', IF(ASCII(SUBSTRING((SELECT version()),1,1))<178,'a{1', IF(ASCII(SUBSTRING((SELECT version()),1,1))<199,'[[.ab.]]', IF(ASCII(SUBSTRING((SELECT version()),1,1))<230,'[[:ab:]]',1))))))))))) Let’s suppose that the application returns the “Got error ‘braces not balanced’ from regexp” error message, aligning with our third IF statement. This means that the ASCII char value for the first character of our version() result value is in the ASCII range of 115-136. Subsequent requests using this very same technique can be used to further narrow down the value until this is guessed. This method can be applied to pretty much every SQL injection you find out there that shows the error message generated by the request. To sum up, although not documented, it is possible to force MySQL to throw errors based on certain conditions within a query. This can be used to exploit SQL injections against hard filtered applications that do not allow other more common attack vectors, and also to substantially speed up the exploitation process under some circumstances.


Find out how we can help with your cyber challenge

Please enter your contact details using the form below for a free, no obligation, quote and we will get back to you as soon as possible. Alternatively, you can email us directly at [email protected]
Contact Us

Contact Us React out to one of our cyber experts and we will arrange a call