Speeding up a SQL query

Riccardino · February 3, 2006, 2:38pm

Hi,

I’m in the need of gather data from a SQL query in the most efficient, but my little SQL knowledge prevents me to find an alternative.

I’ve built a query for a daily planner that retrieves all the appointments of a certain time unit (1/2 hour) of a given room.
The query is the following:

select ap.appoid, an.lastname, ap.apponotes, i.esacolor, ap.anagid
from appointments ap, customers an, categories i
where ap.appodate = ‘2006-01-31’
and ap.matrix_agenda like ‘%|09:00|%’
and ap.anagid = an.anagid
and ap.roomid = 15
and i.categoriesid = ap.tipologid
UNION SELECT ap.appoid, NULL, ap.apponotes, i.esacolor, ap.anagid
from appointments ap, categories i
where ap.appodate = ‘2006-01-31’
and ap.matrix_agenda like ‘%|09:00|%’
and ap.roomid = 15
and ap.anagid = 0
and i.categoriesid = ap.tipologid
order by 1

This query is not particularly slow (according to Interactive SQL, execution time is between 0,011 and 0,022 seconds), but it’s fired 21 times per room (all the 1/2 hour from 9:00 to 19:00) and there are 7 rooms, which gives me a total of more or less a couple of seconds. Another 1/2 second is taken from html rendering, so I need almost 3 second for moving to a day to another: too much if the user is in a hurry.
I tried to enhance the response of the db creating indexes, but execution times remained the same.

So, my question is: is it possible to write this query in a more efficient way?
Since the query is executed 147 times, even a small reduction could do the difference…

Any help appreciated :-)

ROCLASI · February 3, 2006, 3:02pm

Without really analyzing your query…why not use 1 (or 2) query to get the whole set of data and then loop thru the result set?
This is how I build my HTML based overviews.
Also the HTML rendering is a big bottleneck. It’s slow.

Riccardino · February 3, 2006, 3:08pm

ROCLASI:
Without really analyzing your query…why not use 1 (or 2) query to get the whole set of data and then loop thru the result set?

It’s a backup option: since I wasn’t sure it would be faster, I’ve kept this as the last alternative, because it requires me to change the whole method… :-)

This is how I build my HTML based overviews.
Also the HTML rendering is a big bottleneck. It’s slow.

Yes, it’s not very fast (especially on macs). But if I can lower the whole process of 50%, despite of html rendering I get a totally acceptable 1 second of refresh time.

I’ll try to do as you suggest.

Thanks, Robert

swingman · February 3, 2006, 6:40pm

Hi Riccardo,

You can use a pivot table or a sequence-generating function + a join to retrieve all appointments for one day in one go.

In Postgres there is a function to create a sequence of something, so you could create a list of all the timeslots in one go…

select current_date + s.a as dates from generate_series(0,14,7) as s(a)
------------ 
2004-02-05 
2004-02-12 
2004-02-19 
(3 rows)

you could do this with time instead of dates

Riccardino · February 4, 2006, 9:02am

swingman:
Hi Riccardo,

You can use a pivot table or a sequence-generating function + a join to retrieve all appointments for one day in one go.

In Postgres there is a function to create a sequence of something, so you could create a list of all the timeslots in one go…
select current_date + s.a as dates from generate_series(0,14,7) as s(a)

2004-02-05
2004-02-12
2004-02-19
(3 rows)




you could do this with time instead of dates

Hi, Christian

thanks for your tip.
Unfortunately, my problem is not retrieving all the appos of a certain date: my function need to query the appos one by one because I have a predefined grid of intervals (30 min. each) and I need to know:
if there’s an appointment in this cell of the grid (that what the matrix is used for);
IF THERE IS: if the appointment belongs to a registered customer or not
IF THERE’S NOT: the cell must be white if the office is open, blue if it’s closed.

So, quite a lot of situations

But I found a way to improve the performance: I do a pre-query that retrieves all the appos of the day, then I place them in an array.
After that, before launching the stadard query, I check if there are appointments for the current interval: this will reduce the number of queries and speed up the whole process

swingman · February 4, 2006, 6:54pm

Ciao Ric,

the idea was to use a function or pivot to generate a list of all slots. Then you join the list of slots with your appointments. You end up with one row per slot, regardless of there being appointments or not.

Try the following:
define a stored calc in servoy which changes the time of the appointment into an integer 00.00 up to 00.30 = 1, 00.30 up to 01.00 = 2 etc.

Define a table called pivot with one field, an autoincrementing integer. Create 48 records so you have enough for the diary slots.

Then join the appointment table with the pivot GROUP BY the pivot integer… you can do the eqivalent of 48 queries in one go. And you can iterate over the rows of the resulting dataset and create one cell per row…

Riccardino · February 5, 2006, 6:33pm

swingman:
Ciao Ric,

the idea was to use a function or pivot to generate a list of all slots. Then you join the list of slots with your appointments. You end up with one row per slot, regardless of there being appointments or not.

Try the following:
define a stored calc in servoy which changes the time of the appointment into an integer 00.00 up to 00.30 = 1, 00.30 up to 01.00 = 2 etc.

Define a table called pivot with one field, an autoincrementing integer. Create 48 records so you have enough for the diary slots.

Then join the appointment table with the pivot GROUP BY the pivot integer… you can do the eqivalent of 48 queries in one go. And you can iterate over the rows of the resulting dataset and create one cell per row…

Wow: definitely a very interesting option.
I’ll give it a try

tweetie · February 6, 2006, 8:16am

like '%|09:00|%'
```Any chance of removing the first % ?
Searches wit blah% are fast....searches with %blah% are slow.
In time this will be getting slower and slower when your db grows, see my topic on this http://forum.servoy.com/viewtopic.php?t=5797
If your column is made up |09:00|10:00|11:00| maybe you could this in a separate table and make 3 rows instead of |09:00|10:00|11:00|

Riccardino · February 6, 2006, 9:13am

tweetie:

like '%|09:00|%'
```Any chance of removing the first % ?
Searches wit blah% are fast....searches with %blah% are slow.
In time this will be getting slower and slower when your db grows, see my topic on this http://forum.servoy.com/viewtopic.php?t=5797
If your column is made up |09:00|10:00|11:00| maybe you could this in a separate table and make 3 rows instead of |09:00|10:00|11:00|

I didn’t consider the “like” operator slowdown (the number of records is not very large).
Anyway, it seems a good option, but are you sure that adding a join wouldn’t waste the speed gain of moving from “like” to “=”?

tweetie · February 6, 2006, 9:28am

Riccardino:
I didn’t consider the “like” operator slowdown (the number of records is not very large).
Anyway, it seems a good option, but are you sure that adding a join wouldn’t waste the speed gain of moving from “like” to “=”?

It’s not the LIKE operator that is slow…but the %blah% part. Indexing on the column only indexes the first word (correct me if I am wrong, but that was the feeling I had with my really large db)…when you do a search in a large database like mentioned (%blah%) in my other thread it actually does a full text search on a column. If you search on the first word of the column (blah%)…the searches are fast no mather the size of the db.
If you want to do some real testing on large db. Create a new db (don’t use the ones you have) and use this SQL-statement (in Sybase Central’s Interactive SQL) to create a large db so you can test your queries

begin
declare i integer;
SET i = 1;
WHILE i <= 10000 LOOP
   INSERT INTO dbname( pk, columnname ) VALUES ( i,'|09:00|10:00|11:00');
   SET i = i + 1;
END LOOP;
end

Replace dbname, pk, columnname by the real names

chris.gruber · February 8, 2006, 2:52pm

While I agree with the comment of trying to get the page to be populated by a single query is the way to go. If you consider that 7*21 is 147 hits to the database for one page, each with its own different query is very expensive. Having one query could reduce the interactions with the database to a fraction of 147.

Your query has a UNION which can be expensive and a LIKE operator which is not easily index.

Lets start with an idea to remove the UNION. Within the UNION, the upper SELECT statement and lower SELECT statement share criteria and column list is very similar.

Here are the common criteria/predicates between the upper and lower SELECTs.

–where ap.appodate = ‘2006-01-31’
–and ap.matrix_agenda like ‘%|09:00|%’
–and ap.roomid = 15
–and i.categoriesid = ap.tipologid

Leaving only the following predicates as different.

and ap.anagid = 0 – not really sure what this means but I will defer
and ap.anagid = an.anagid – use to join the customers and appointments table

There is an IF expression with SQL Anywhere that may be helpful to accomodate the special anagid=0 case. The following query may be a format that gets your results without the need for a UNION.

select ap.appoid, if ap.anagid=0 then NULL else an.lastname endif as lastname, ap.apponotes, i.esacolor, ap.anagid
from appointments ap, customers an, categories i
where ap.appodate = ‘2006-01-31’
and ap.matrix_agenda like ‘%|09:00|%’
and (ap.anagid = an.anagid or ap.anagid=0)
and ap.roomid = 15
and i.categoriesid = ap.tipologid

To remove the LIKE clause, you can look at the nature of the data. It is a date manipulation that you are working with. The HOUR function may help you here. For example, HOUR(ap.matrix_agenda)=9 and to accomodate half hour segments MINUTE( ap.matrix_agenda) <30.

So your query could look like the following:

select ap.appoid, if ap.anagid=0 then NULL else an.lastname endif as lastname, ap.apponotes, i.esacolor, ap.anagid
from appointments ap, customers an, categories i
where ap.appodate = ‘2006-01-31’
–and ap.matrix_agenda like ‘%|09:00|%’
and (MINUTE( ap.matrix_agenda) <30 and HOUR(ap.matrix_agenda)=9 )
and (ap.anagid = an.anagid or ap.anagid=0)
and ap.roomid = 15
and i.categoriesid = ap.tipologid

SQL Anywhere can index functions which may prove useful to enable indexed lookups with this functions. See the CREATE INDEX statement on details on this.

My suggestions may be way off, if I am misinterpreting the meaning. Please test and verify the results.

Chris Gruber
iAnywhere Solutions

PS - Just as an idea for a query that could service most your data requirements for the page could be as follows:

select ap.appoid, if ap.anagid=0 then NULL else an.lastname endif as lastname, ap.apponotes, i.esacolor, ap.anagid , (HOUR(ap.matrix_agenda) || if MINUTE( ap.matrix_agenda)<30 then ‘:00’ else ‘:30’ endif) as timesegment
from appointments ap, customers an, categories i
where ap.appodate = ‘2006-01-31’
–and ap.matrix_agenda like ‘%|09:00|%’
–and (MINUTE( ap.matrix_agenda) <30 and HOUR(ap.matrix_agenda)=9 )
and (ap.anagid = an.anagid or ap.anagid=0)
–and ap.roomid = 15
and i.categoriesid = ap.tipologid
order by roomid,timesegment,ap.appoid

Riccardino · May 6, 2006, 4:10pm

chris.gruber:
Your query has a UNION which can be expensive and a LIKE operator which is not easily index.

Lets start with an idea to remove the UNION. Within the UNION, the upper SELECT statement and lower SELECT statement share criteria and column list is very similar.

Here are the common criteria/predicates between the upper and lower SELECTs.

–where ap.appodate = ‘2006-01-31’
–and ap.matrix_agenda like ‘%|09:00|%’
–and ap.roomid = 15
–and i.categoriesid = ap.tipologid

Leaving only the following predicates as different.

and ap.anagid = 0 – not really sure what this means but I will defer
and ap.anagid = an.anagid – use to join the customers and appointments table

There is an IF expression with SQL Anywhere that may be helpful to accomodate the special anagid=0 case. The following query may be a format that gets your results without the need for a UNION.

select ap.appoid, if ap.anagid=0 then NULL else an.lastname endif as lastname, ap.apponotes, i.esacolor, ap.anagid
from appointments ap, customers an, categories i
where ap.appodate = ‘2006-01-31’
and ap.matrix_agenda like ‘%|09:00|%’
and (ap.anagid = an.anagid or ap.anagid=0)
and ap.roomid = 15
and i.categoriesid = ap.tipologid

I re-wrote the query in order to follow your suggestion about UNION clause, Chris.
To be honest, I see very little difference in performance (at the moment, I haven’t touch the indexes, though).

chris.gruber:
To remove the LIKE clause, you can look at the nature of the data. It is a date manipulation that you are working with. The HOUR function may help you here. For example, HOUR(ap.matrix_agenda)=9 and to accomodate half hour segments MINUTE( ap.matrix_agenda) <30.

Not exactly: the column ap.matrix is a text field that stores data regarding the cells in which this appointment should show up (see attachment).
Since it’s a sort of grid, the matrix column indicates the cells to be occupied, storing data like:

|11:00|11:30|12:00|

which means that this appointment will begin at 9:00 and end at 10:00.

Since, if I understand well, the LIKE clause can be very expensive, do you think there’s a way to avoid using it even if the data I should manage are text?

chris.gruber:
SQL Anywhere can index functions which may prove useful to enable indexed lookups with this functions. See the CREATE INDEX statement on details on this.

I certainly will.
Thanks, Chris

Topic		Replies	Views
query optimization Classic Servoy	4	3502	June 1, 2006
Any thoughts on optimizing this Sybase SQL query? Classic Servoy	6	4157	September 24, 2009
SQL generated searching related records Classic Servoy	2	2560	October 11, 2010
speeding new records Classic Servoy	22	8349	February 27, 2013
databaseManager.getDataSetByQuery Classic Servoy	5	2856	June 22, 2009

Speeding up a SQL query

Related topics