I am reading a lot of records (>50k) from one table on one system, making a few changes on the fly, and writing them back to another system.
I read a big chunk of data at once (using getDataSetByQuery), but writing is very slow since I use a loop with newRecord(). Is there a way to speed this up, apart from using a pure SQL INSERT query with the whole batch at once? I was thinking of extending the foundset or using addRow.
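For reference, the slow part looks roughly like this today (a minimal sketch; the server, query, columns and the on-the-fly change are placeholders):

```
// read the whole source table in one go
var ds = databaseManager.getDataSetByQuery('source_server',
	'SELECT id, name, amount FROM source_table', null, -1);

// write to the target system, one record per row
for (var i = 1; i <= ds.getMaxRowIndex(); i++) {
	controller.newRecord();                   // this is the slow call
	var rec = foundset.getSelectedRecord();
	rec.name = ds.getValue(i, 2);
	rec.amount = ds.getValue(i, 3) * 100;     // the "bit of change on the fly"
}
```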
lesouef:
I am reading a lot of records (>50k) from one table on one system, making a few changes on the fly, and writing them back to another system.
I read a big chunk of data at once (using getDataSetByQuery), but writing is very slow since I use a loop with newRecord(). Is there a way to speed this up, apart from using a pure SQL INSERT query with the whole batch at once? I was thinking of extending the foundset or using addRow.
Is autosave true? You probably want it false here and save all the data at once.
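Something along these lines, as a minimal sketch (the actual record creation left out):

```
databaseManager.setAutoSave(false);   // stop saving after every single change
// ... create and fill all the records here ...
databaseManager.saveData();           // write everything in one go
databaseManager.setAutoSave(true);
```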
It is on. I just tried with it off, saving at the end: the improvement is 40 recs in 12 s instead of 14 s, not the big change I was expecting.
Is there an equivalent of SQL's INSERT DELAYED for this kind of large import?
How can I be as fast as importing from a file (about 100 times faster)? Note that my time measurement does not include the SQL source response time; it starts once I know the getDataSetByQuery size.
lesouef:
It is on. I just tried with it off, saving at the end: the improvement is 40 recs in 12 s instead of 14 s, not the big change I was expecting.
Is there an equivalent of SQL's INSERT DELAYED for this kind of large import?
How can I be as fast as importing from a file (about 100 times faster)? Note that my time measurement does not include the SQL source response time; it starts once I know the getDataSetByQuery size.
That is a long time… what takes that long? I don’t think a simple newRecord would take 0.3 seconds. Maybe some query is run that takes more time? Can you add logging in your code and see which statement takes the most time?
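For example, a rough timing sketch using application.output (the measured statements are placeholders):

```
var t = new Date().getTime();
// statement 1, e.g. the newRecord() call
application.output('newRecord took ' + (new Date().getTime() - t) + ' ms');

t = new Date().getTime();
// statement 2, e.g. filling the columns from the dataset
application.output('filling columns took ' + (new Date().getTime() - t) + ' ms');
```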
Log in my script? What is that? Will that time the duration line by line?
function is attached…
And I just tested outputting the queried dataset to a file; that is 100 times faster…
query.txt (3.52 KB)
Hi Lesouef,
Is the form you use the controller on visible, perhaps?
If so, try using a non-visible form, or use the foundset instead of the controller.
That way you don’t get delays from screen redraws.
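For example, something like this: a sketch that reads the source data as in the first post and writes through a foundset that no visible form is showing (server, datasource and column names are placeholders):

```
// read the source data as before
var ds = databaseManager.getDataSetByQuery('source_server',
	'SELECT id, name, amount FROM source_table', null, -1);

// a foundset not attached to any visible form, so nothing has to repaint
var fs = databaseManager.getFoundSet('db:/target_server/target_table');

for (var i = 1; i <= ds.getMaxRowIndex(); i++) {
	var rec = fs.getRecord(fs.newRecord());
	rec.name = ds.getValue(i, 2);
	rec.amount = ds.getValue(i, 3) * 100;
}
databaseManager.saveData();
```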
Hope this helps.
The thing takes so long that I refresh the screen myself to make sure it is running.
Under normal conditions it refreshes once finished, so I don’t think that is the problem.
The console can’t even be used reliably; it times out after a while.
I’ll try the foundset way and see if that is any easier.
Actually, I’d like to know what Servoy uses to import from a file, because that is the right way. I have tested source to file, which takes 4.5 ms/record, then file to source, which takes > 0.32 ms/record, while direct SQL import is 0.3 s/record, which is 42 times slower. So I’d like to create my records the way the import does, if that is not foundset.newRecord.
I’ll report back after trying the foundset way.
foundset.newRecord() turns out to be an excellent improvement, as long as there is no side operation to slow it down, such as using a related record to fill one of the fields. It puts me in the import speed range.
```
databaseManager.setAutoSave(false);
createNewRecords();
databaseManager.startTransaction();
if (databaseManager.saveData())
{
	databaseManager.commitTransaction();
}
else
{
	databaseManager.rollbackTransaction();
}
databaseManager.setAutoSave(true);
```
That should be the fastest way to create new records.
The only thing is that if you really create many, many new records, then a saveData() plus commit/start should probably be done every X rows, as in the sketch below.
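A possible shape for that, as a sketch (the chunk size of 500 is arbitrary, and createOneRecord() stands in for whatever fills one record):

```
databaseManager.setAutoSave(false);
databaseManager.startTransaction();
for (var i = 1; i <= total; i++) {
	createOneRecord(i);                  // placeholder for the real record creation
	if (i % 500 == 0) {                  // flush a chunk
		databaseManager.saveData();
		databaseManager.commitTransaction();
		databaseManager.startTransaction();
	}
}
databaseManager.saveData();              // save and commit the remainder
databaseManager.commitTransaction();
databaseManager.setAutoSave(true);
```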
jcompagner:
```
databaseManager.setAutoSave(false);
createNewRecords();
databaseManager.startTransaction();
if (databaseManager.saveData())
{
	databaseManager.commitTransaction();
}
else
{
	databaseManager.rollbackTransaction();
}
databaseManager.setAutoSave(true);
```
That should be the fastest way to create new records. The only thing is that if you really create many, many new records, then a saveData() plus commit/start should probably be done every X rows.
Hi Johan,
Besides your remark above, there is a big performance penalty when the foundset you add or update records in is the foundset of a currently active and shown form…
In our framework we do all our multi-record updates in a duplicateFoundSet of the active form to avoid this, and the performance gain is remarkable.
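In code that idea looks roughly like this (a sketch, assuming the method runs on the form whose foundset is visible; the column, value and record count are placeholders):

```
// work on a duplicate, so the visible form's foundset is not repainted per record
var workSet = foundset.duplicateFoundSet();
for (var i = 1; i <= numberOfNewRecords; i++) {
	var rec = workSet.getRecord(workSet.newRecord());
	rec.some_column = 'some value';
}
databaseManager.saveData();
```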
Regards,
Yes, that’s true, because then the UI must update itself after every newRecord call.
Especially when you are also changing the selection to that new record; that really shouldn’t be done.
jcompagner:
Yes, that’s true, because then the UI must update itself after every newRecord call.
Especially when you are also changing the selection to that new record; that really shouldn’t be done.
Yep, you’re right.
I thought this would be useful performance info for the Servoy community on the forum…
Regards,
To answer the last 4 posts:
I import roughly 75k records in this case.
I save data every 500 records to give the user some action on screen!
About using a duplicated foundset: I did not do that, thinking that only controller.newRecord refreshed, not foundset.newRecord. It seems both refresh? What is the refresh rate, every record? So far I don’t see anything unless I trigger a refresh myself, even when using controller.newRecord in a loop…
Sorry for bumping this old thread… We are also in the process of importing a lot of data. A lot means 150k records, but we are now testing with a 50k subset.
What we do is indeed a loop like this:
```
var vSet = databaseManager.getFoundSet("db:/<our table>");
var vRecord;
databaseManager.setAutoSave(false);
databaseManager.startTransaction();
for (var i = 1; i <= vAmountOfRecords; i++) {
	vRecord = vSet.getRecord(vSet.newRecord());
	// import the record's columns
	// (and even some related records as well)
	if (i % 500 == 0) {
		databaseManager.commitTransaction(true);
		// we thought clearing the foundset might be a good idea
		vSet.clear();
		// start a new transaction
		databaseManager.startTransaction();
	}
}
// we are done, but there might still be an open transaction
if (databaseManager.hasTransaction()) {
	databaseManager.commitTransaction(true);
}
```
So we commit every 500 records. This is quite fast at the start, but the problem is that each commit takes longer to execute than the previous one. Committing every 200 or 1000 records does not really matter.
Can anyone explain why each commit takes longer than the one before?
I don’t know the answer to that, but we do every (large) import by building 500 INSERT rows in plain text and firing them via rawSQL.
Blazingly fast, even with 100,000 rows…
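Roughly like this, as a sketch (assuming the rawSQL plugin; server, table and columns are placeholders, the source data is assumed to already be in a JSDataSet named ds, the multi-row VALUES syntax must be supported by the backend, and real code should escape or parameterize the values):

```
var rows = [];
for (var i = 1; i <= ds.getMaxRowIndex(); i++) {
	rows.push('(' + ds.getValue(i, 1) + ", '" + ds.getValue(i, 2) + "')");
	if (rows.length == 500 || i == ds.getMaxRowIndex()) {
		plugins.rawSQL.executeSQL('target_server', 'target_table',
			'INSERT INTO target_table (id, name) VALUES ' + rows.join(','));
		rows = [];
	}
}
// tell Servoy clients the table changed behind their back
plugins.rawSQL.flushAllClientsCache('target_server', 'target_table');
```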
We thought about that too and tried that in the past. However, we have quite a lot of stored calculations in our solution (unfortunately).
Using rawSQL means you have to recalculate all the records afterwards, which would take the same amount of time in the end.
Apart from that, you don’t want to recalculate 100k records in one go. That would cause memory problems.
I ran some more tests, and it turns out the commits take longer when creating related records. When I create records in the main table only, commit time stays the same.
So despite the vSet.clear() after the commit, the related foundsets are somehow still in memory and still part of the next commit.
Can this behaviour be explained?
Hi Ruben,
What if you get a new foundset instead of clearing it?
So use

```
vSet = databaseManager.getFoundSet("db:/<our table>");
```

instead of

```
vSet.clear();
```
Hi Robert,
Same thing happens… By the way, here is a comparison of the logged commit times with and without related records, committing every 200 records.
| With related records | Without related records |
| --- | --- |
| 5 s | 2 s |
| 4 s | 2 s |
| 5 s | 2 s |
| 6 s | 3 s |
| 7 s | 3 s |
| 7 s | 2 s |
| 7 s | 2 s |
| 8 s | 3 s |
| 9 s | 3 s |
| 9 s | 3 s |
| 10 s | 2 s |
| 10 s | 2 s |
| 11 s | 3 s |
| 11 s | 3 s |
| 13 s | 3 s |
| 12 s | 2 s |
| 14 s | 2 s |
| 12 s | 3 s |
| 13 s | 2 s |
| 13 s | 3 s |
| 13 s | 2 s |
| 17 s | 2 s |
| 18 s | 3 s |
| 18 s | 3 s |
Is this a Smart Client?
What if you give the client much more memory, does that help?
Also, if you look at the database performance tab on the admin page, do you see the timings there go up?
Or what kinds of queries are being done; are they piling up?
BTW: if it is a Smart Client, consider doing the import server-side using the headless client (HC) plugin to save on latency.
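A rough sketch of that last idea, assuming the headless client plugin's createClient/queueMethod API (the solution name, credentials, method name, arguments and the onImportDone callback are all placeholders):

```
// start a headless client on the server and run the import there,
// so all the per-record work stays close to the databases
var jsClient = plugins.headlessclient.createClient('import_solution', 'importer', 'secret', null);
if (jsClient && jsClient.isValid()) {
	// 'doImport' is a hypothetical global method inside the server-side solution
	jsClient.queueMethod(null, 'doImport', [50000], onImportDone);
}
```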