After digging deeper on how prepared statements work and reading up on the many many sources about SQL injection and possible prevention I have to come to the following conclusion:
Use prepared statements.
This is why.
Prepared statements are parsed and planned by the query planner. After all this is done the parameters are added in the query-tree for execution. This means the parameters are no longer parsed and used as-is. Which means any SQL injected into these values will not be executed but used as values (as intended). This is what makes it safe.
The downside of this, as discussed at length in this thread, is that the query planner has to devise a query plan based on incomplete data. It doesn’t know what the passed values are and therefor doesn’t know the cardinality of these values. So it has to create a generic query-plan which can be efficient for some values and inefficient for others.
Now if you have a small database you will hardly notice the difference but when you have millions of rows this could be an performance issue. Depending on your use-case your mileage may vary.
So what about using plain sql where you concatenate all the values in your SQL statement?
Essentially it all comes down to your string values. Anything between 2 single quotes is considered a literal string. Anything outside these quotes are not strings and could be non-string values, SQL commands or logic.
And since we are using Java/JavaScript we have the benefit of checking what datatype a value is so we can safely assume numbers are numbers and dates are dates.
Consider the following:
SELECT * FROM users WHERE id='<id>' AND tenantid=1
And the following passed value:
' OR '1' = '1' --
Merging these together will give you the following SQL:
SELECT * FROM users WHERE id='' OR '1' = '1' --' AND tenantid=1
Mind you the – is the SQL comment so anything after – will be seen as a comment and not parsed/executed
So this type of SQL injection will now show ALL your user data.
What can be done against this? Escaping the quotes in your parameters.
So if we take the above value and escape it you get the following SQL:
SELECT * FROM users WHERE id='\' OR \'1\' = \'1\' --' AND tenantid=1
So now the SQL sees the passed value as is and the query will function as designed.
So that is simple to fix right?
Actually, no.
Each DB vendor has their own implementation of how it parses SQL. For example PostgreSQL has a feature called Dollar Quoted Strings which is a way of escaping a string by wrapping it inside $$$$. The thing is that when you have a $$ inside your string that that acts the same as doing a single quote. So everything behind the $$ will be seen as outside the string like so:
SELECT $$<value>$$
And the value:
test$$,now(),$$
Will result in:
SELECT $$test$$,now(),$$$$
This will result in ‘test’ and the current date.
So for PostgreSQL you need to escape these special characters as well. And MySQL has it’s own share of special characters. So does MSSQL and Oracle.
There seem to be other use cases as well that are exploitable like using ASCII() or CHAR() functions and the like.
I did find some proposed solutions online that encodes the values into HEX and then let it be de-encoded in the query. This might solve the issue of escaping but now you have another problem. Performance, which was the main issue of this thread. Encoding the HEX and de-encoding the HEX is not slow, it’s just that you usually don’t have indexes defined that support these. So your queries will not be using any indexes unless you created them as such (using functional indexes).
So long story short.
Be safe, use prepared statements. If it’s slow consider rewriting the query.
And if you really REALLY need to use plain sql queries make really sure your input values are valid. Perhaps check if they fall in a certain domain that is suitable for your query. Anything outside this domain should be rejected.
Even values that are already in your database can be a risk. Lets say you inserted them using a prepared statement and then later use those values in a merged SQL statement…I think you get the picture.
I guess the only safe use would be if your input values are generated by your own code like a date object or UUID.
Just my 2 cents.
Hope this helps.
Reading material:
https://www.owasp.org/index.php/SQL_Inj … heat_Sheet
https://en.wikipedia.org/wiki/SQL_injection
http://ferruh.mavituna.com/sql-injectio … sheet-oku/