Why would adding LIMIT slow down a SQL query?

Could any of the SQL gurus please educate me on why limiting a SQL query to the last 100 records would cause an unacceptable slowdown in the Query results?

Documents table has about 400k records with ‘iddc’ being the Integer PK. The code below limits to 100 records - an identical Query without the limit (vServer, query, args, -1) produces much faster results.

		var query 		= "SELECT iddc FROM documents WHERE email_account = ? AND doc_type = ? ORDER BY doc_date desc";
		
		var args 		= new Array()
		args[0] 		= vEMaccount;		
		args[1]			= vDocType;
		
		var dataset 	= databaseManager.getDataSetByQuery(vServer, query, args, 100);

Hi Graham,

I guess you don’t have an index on the column doc_date ?
In that case the database process to obtain the sorted first 100 records could take some time…

Regards,

Thanks Lambert

I had to double-check but yes there is an index on doc_date.

CREATE INDEX doc_date
ON documents
USING btree
(doc_date DESC NULLS LAST);

Hmm, strange…

Have you called this query almost 2000 times with the same parameters ?
In other words : what happens if you put indexes on columns email_account and doc_type ?
Or : this query would perform better if you could use doc_date (date range) in the where clause since
doc_date has an index.

On large tables we always put an index on some kind of date column and tell our customers to use
that column in end-user queries. Our initial form foundset sort is also set to that date column.
The servoy load foundset always works with limit ? and we see great performance gain when filtering and sorting on indexed columns.

Regards,

Yes doc_type and email_account also have indexes.

Both Queries are triggered by clicking on tab_labels and will have slightly different parameters each time. The fast - unlimited - query is when Users select their Email_Account + Inbox/Pending/Drafts etc., that will almost always have less than Servoy’s 200 record initial load.

The slow - limit=100 - query is for the Emails Sent/Received selections that have many thousands of records - however most of the time they initially only want to see the last day or two. I (mistakenly) thought that by reducing the limit it would be faster.

On next update will remove the LIMIT=100 but was curious to learn what I might be doing wrong.

Appreciate your time on this.

Hi Graham,

You say the one without the limit gives you already a small resultset (less than 200) and the one with the limit would normally return more than that.
If so then it is not necessarily strange that the query is slower. It will have to select all the records matching your WHERE clause, then SORT it all before it can apply the LIMIT.
How large is your resultset (before the limit) ?
Also doing an EXPLAIN ANALYZE in a query editor will give more insight in how the 2 queries behave (I assume you are using PostgreSQL here).

Hi Robert

The original Query (without limit) for Inbox/Pending/Drafts will generally just result in 0-100 records.

Selecting all Sent/Received Emails for a persons Email_Account would result in many thousands of records - but generally they only want to check on Sent/Received in last day or so. That gave me the idea of limiting to the last 100 as there is a comprehensive Search tab for checking further.

Yes - I’m using Postgres and will take a look at the Query Analyzer - thanks for the tip.

Have a good weekend.

Graham