[Modeling-users] SQLite inserts slow

Discussion:

Ernesto Revilla

2004-06-22 20:08:13 UTC

Dear Sébastien and everybody else:

I found SQLite inserts a bit slow. When enabling logging, I found that
new keys are retrieved using the following SQL statement:

Transaction: BEGIN
Evaluating: UPDATE PK_SEQ_ROOM SET id=((select max(id) from PK_SEQ_ROOM)+1)
Evaluating: select id FROM PK_SEQ_ROOM
Transaction: COMMIT

Then, after checking the table definition I saw that PK_SEQ_ROOM has
just the id field, and as I read anywhere below, just one row, so the
subquery should not be needed, not?

(file SQLiteAdaptorChannel.py near line 82)

On the other hand, changing this does not make it much faster. at least
with my box, committing on each update and read for new primary keys is
a bit slow, but for this delay sqlite has to be blamed.

Interesting enough that the following instruction makes it much faster
causing SQLite to turn off disk sync:
*PRAGMA default_synchronous = OFF;

*
With best regards,
Erny

Sebastien Bigaret

2004-06-23 15:11:02 UTC

Permalink

I found SQLite inserts a bit slow. When enabling logging, I found that new
Transaction: BEGIN
Evaluating: UPDATE PK_SEQ_ROOM SET id=((select max(id) from PK_SEQ_ROOM)+1)
Evaluating: select id FROM PK_SEQ_ROOM
Transaction: COMMIT
Then, after checking the table definition I saw that PK_SEQ_ROOM has just
the id field, and as I read anywhere below, just one row, so the subquery
should not be needed, not?
(file SQLiteAdaptorChannel.py near line 82)

Well, right: in fact, since sqlite db is not mt-safe at all, the
subquery has no interest (and BTW the second one should read SELECT
LAST_INSERT_ID() in mysql e.g., I need to change that).

On the other hand, changing this does not make it much faster. at least with
my box, committing on each update and read for new primary keys is a bit
slow, but for this delay sqlite has to be blamed.

Probably the pb. for you here is that every single new object fetches
its own PK; yes, it would be really more efficient to fetch the <n> next
PK if we need <n> new values... Could you fill a RFE so that I do not
forget?

Interesting enough that the following instruction makes it much faster
*PRAGMA default_synchronous = OFF;

Yes, but this makes the sqlite-db file very fragile, I believe this can
completely waste your data when some errors happen, or am I wrong? (no
time to check that in details right now)

-- Sébastien.

Ernesto Revilla

2004-06-24 07:21:01 UTC

Permalink

Hi again,

Post by Sebastien Bigaret

Post by Ernesto Revilla
Interesting enough that the following instruction makes it much faster
*PRAGMA default_synchronous = OFF;

Yes, but this makes the sqlite-db file very fragile, I believe this can
completely waste your data when some errors happen, or am I wrong? (no
time to check that in details right now)

It depends. sqlite writes all changes to a journal file, and when
commiting processes the journal file and appends the results to the end
of the file. The fragile point is after the journal file has been
processed, bu before the data has been written to disk. Of course, this
makes it more fragile, but the probability of a system crash is not very
high, provided we are using powersupplies and stable OS. If we have a
patch for this issue, I'll turn on syncing again.

With best regards,
Erny

Post by Sebastien Bigaret
-- Sébastien.

Sebastien Bigaret

2004-06-29 16:40:22 UTC

Permalink

Ok fine, and indeed, I've just read the documentation
http://www.hwaci.com/sw/sqlite/lang.html#pragma
and this definitely seems to be far less dangerous than what I thought it
could be.

Do you think this could be useful enough to have a setting to control that
within the framework? For example we could have a env.variable
'SQLITE_SYNCHRONOUS_MODE' set to either 'NORMAL' (default), 'FULL' or 'OFF',
an env.var. that the framework could use to set the synchr.mode when opening a
connection.

While we're at it, are there other PRAGMAs that you all using
sqlite think they should be controlled in the same way?

-- Sébastien.

Post by Ernesto Revilla
Hi again,

Post by Sebastien Bigaret

Post by Ernesto Revilla
Interesting enough that the following instruction makes it much faster
*PRAGMA default_synchronous = OFF;

Yes, but this makes the sqlite-db file very fragile, I believe this can
completely waste your data when some errors happen, or am I wrong? (no
time to check that in details right now)

It depends. sqlite writes all changes to a journal file, and when commiting
processes the journal file and appends the results to the end of the
file. The fragile point is after the journal file has been processed, bu
before the data has been written to disk. Of course, this makes it more
fragile, but the probability of a system crash is not very high, provided we
are using powersupplies and stable OS. If we have a patch for this issue,
I'll turn on syncing again.
With best regards,
Erny

Ernesto Revilla

2004-06-29 17:02:00 UTC

Permalink

Hi,

I don't know, but I think this is not needed. Just a little comment in
the documentation should be enough, because changing the
* *default_synchronous * *setting with PRAGMA is permament setting. Of
course, if you have to rebuild the database hundreds of times, this
would be a nive feature.

Anyway, perhaps it would be a nice thing to have the possibility to do
initialization for each connection. For postgresql and others it can be
used to set the isolation level, any encoding or language parameters,
query modes, etc.

Regards,
Erny

Post by Sebastien Bigaret
Ok fine, and indeed, I've just read the documentation
http://www.hwaci.com/sw/sqlite/lang.html#pragma
and this definitely seems to be far less dangerous than what I thought it
could be.
Do you think this could be useful enough to have a setting to control that
within the framework? For example we could have a env.variable
'SQLITE_SYNCHRONOUS_MODE' set to either 'NORMAL' (default), 'FULL' or 'OFF',
an env.var. that the framework could use to set the synchr.mode when opening a
connection.
While we're at it, are there other PRAGMAs that you all using
sqlite think they should be controlled in the same way?
-- Sébastien.

Post by Ernesto Revilla
Hi again,

Post by Sebastien Bigaret

Post by Ernesto Revilla
Interesting enough that the following instruction makes it much faster
*PRAGMA default_synchronous = OFF;

Yes, but this makes the sqlite-db file very fragile, I believe this can
completely waste your data when some errors happen, or am I wrong? (no
time to check that in details right now)

It depends. sqlite writes all changes to a journal file, and when commiting
processes the journal file and appends the results to the end of the
file. The fragile point is after the journal file has been processed, bu
before the data has been written to disk. Of course, this makes it more
fragile, but the probability of a system crash is not very high, provided we
are using powersupplies and stable OS. If we have a patch for this issue,
I'll turn on syncing again.
With best regards,
Erny

Mario Ruggier

2004-06-29 18:15:03 UTC

Permalink

Post by Sebastien Bigaret
While we're at it, are there other PRAGMAs that you all using
sqlite think they should be controlled in the same way?

I have not played with sqlite pragmas, so none that I can think of.
But, this makes me think of another minor issue... given that the
adaptorName is specified in the model, should there not be a place or
hook to specify any adaptor/db-specific initialization sql/code?

Or, stated differently, should not adaptorName be specified in the
config file, along with the connection dictionary? And, should not the
config file contain or point to sql or other code to be executed on
specific events, such as when db is created, when connection is
established, etc. This flexibility, while it may risk being abused ;),
could avoid getting stuck by the frameworks limitations at any given
moment....

mario

Ernesto Revilla

2004-06-30 12:42:07 UTC

Permalink

Hi,

Yes, I have the same opinion as Mario:
1. the adaptor (name) to be used should not be part of the model, but
read in vía config. file, although I know that code generation is a bit
different for each db adaptor. (I hope that soon dynamic will get a big
push and within a year everyone uses dynamic, so we have no code
generation phase: just change the config file and reinit the
application, and voilà, you use another database backend.)

2. there should be hooks on a per db adaptor base (at least, at db
creation time, and at connection time). Perhaps, which instructions to
send to the database should be part of either a general config file with
a section for each db adaptor, or a different config file for each db
adaptor.

With best regards,
Erny

Post by Mario Ruggier

Post by Sebastien Bigaret
While we're at it, are there other PRAGMAs that you all using
sqlite think they should be controlled in the same way?

I have not played with sqlite pragmas, so none that I can think of.
But, this makes me think of another minor issue... given that the
adaptorName is specified in the model, should there not be a place or
hook to specify any adaptor/db-specific initialization sql/code?
Or, stated differently, should not adaptorName be specified in the
config file, along with the connection dictionary? And, should not the
config file contain or point to sql or other code to be executed on
specific events, such as when db is created, when connection is
established, etc. This flexibility, while it may risk being abused
;), could avoid getting stuck by the frameworks limitations at any
given moment....
mario

Sebastien Bigaret

2004-07-01 16:31:08 UTC

Permalink

1. the adaptor (name) to be used should not be part of the model, but read
in vía config. file, although I know that code generation is a bit different
for each db adaptor. (I hope that soon dynamic will get a big push and
just change the config file and reinit the application, and voilà, you use
another database backend.)

OProvided that you've not noticed that already, you'll be pleased to
know that this is already possible! Look at the models used in the
tests: they do not have the adaptorName set, but they get it from the
files Postgresql.cfg, MySQL.cfg etc.

This is driven by the env. variable MDL_DB_CONNECTIONS_CFG, described
at: http://modeling.sourceforge.net/UserGuide/env-vars-core.html

2. there should be hooks on a per db adaptor base (at least, at db creation
time, and at connection time). Perhaps, which instructions to send to the
database should be part of either a general config file with a section for
each db adaptor, or a different config file for each db adaptor.

Agreed, I've just added this to the TODO list; BTW, I'm more inclined to
have specific python hooks that could be triggered at creation time and
at connection time.

-- Sébastien.