Can parameterized statement stop all SQL injection?
When articles talk about parameterized queries stopping SQL attacks they don't really explain why, it's often a case of "It does, so don't ask why" -- possibly because they don't know themselves. A sure sign of a bad educator is one that can't admit they don't know something. But I digress. When I say I found it totally understandable to be confused is simple. Imagine a dynamic SQL query
sqlQuery='SELECT * FROM custTable WHERE User=' + Username + ' AND Pass=' + password
so a simple sql injection would be just to put the Username in as ' OR 1=1-- This would effectively make the sql query:
sqlQuery='SELECT * FROM custTable WHERE User='' OR 1=1-- ' AND PASS=' + password
This says select all customers where they're username is blank ('') or 1=1, which is a boolean, equating to true. Then it uses -- to comment out the rest of the query. So this will just print out all the customer table, or do whatever you want with it, if logging in, it will log in with the first user's privileges, which can often be the administrator.
Now parameterized queries do it differently, with code like:
sqlQuery='SELECT * FROM custTable WHERE User=? AND Pass=?'
parameters.add("User", username)
parameters.add("Pass", password)
where username and password are variables pointing to the associated inputted username and password
Now at this point, you may be thinking, this doesn't change anything at all. Surely you could still just put into the username field something like Nobody OR 1=1'--, effectively making the query:
sqlQuery='SELECT * FROM custTable WHERE User=Nobody OR 1=1'-- AND Pass=?'
And this would seem like a valid argument. But, you would be wrong.
The way parameterized queries work, is that the sqlQuery is sent as a query, and the database knows exactly what this query will do, and only then will it insert the username and passwords merely as values. This means they cannot effect the query, because the database already knows what the query will do. So in this case it would look for a username of "Nobody OR 1=1'--" and a blank password, which should come up false.
This isn't a complete solution though, and input validation will still need to be done, since this won't effect other problems, such as XSS attacks, as you could still put javascript into the database. Then if this is read out onto a page, it would display it as normal javascript, depending on any output validation. So really the best thing to do is still use input validation, but using parameterized queries or stored procedures to stop any SQL attacks.
The links that I have posted in my comments to the question explain the problem very well. I've summarised my feelings on why the problem persists, below:
Those just starting out may have no awareness of SQL injection.
Some are aware of SQL injection, but think that escaping is the (only?) solution. If you do a quick Google search for
php mysql query
, the first page that appears is themysql_query
page, on which there is an example that shows interpolating escaped user input into a query. There's no mention (at least not that I can see) of using prepared statements instead. As others have said, there are so many tutorials out there that use parameter interpolation, that it's not really surprising how often it is still used.A lack of understanding of how parameterized statements work. Some think that it is just a fancy means of escaping values.
Others are aware of parameterized statements, but don't use them because they have heard that they are too slow. I suspect that many people have heard how incredibly slow paramterized statements are, but have not actually done any testing of their own. As Bill Karwin pointed out in his talk, the difference in performance should rarely be used as a factor when considering the use of prepared statements. The benefits of prepare once, execute many, often appear to be forgotten, as do the improvements in security and code maintainability.
Some use parameterized statements everywhere, but with interpolation of unchecked values such as table and columns names, keywords and conditional operators. Dynamic searches, such as those that allow users to specify a number of different search fields, comparison conditions and sort order, are prime examples of this.
False sense of security when using an ORM. ORMs still allow interpolation of SQL statement parts - see 5.
Programming is a big and complex subject, database management is a big and complex subject, security is a big and complex subject. Developing a secure database application is not easy - even experienced developers can get caught out.
Many of the answers on stackoverflow don't help. When people write questions that use dynamic SQL and parameter interpolation, there is often a lack of responses that suggest using parameterized statements instead. On a few occasions, I've had people rebut my suggestion to use prepared statements - usually because of the perceived unacceptable performance overhead. I seriously doubt that those asking most of these questions are in a position where the extra few milliseconds taken to prepare a parameterized statement will have a catastrophic effect on their application.