Discussion:
[Modeling-users] one model, multiple database instances
Delio Brignoli
2004-08-09 21:57:11 UTC
Permalink
Hello,

I'm evaluating the use of Modeling for our new project(s) and was wondering if some of you could give me pointers on how to access multiple database instances of the same model in the same application. (I understand
that each database instance will be isolated, and I don't expect to insert objects from one objectStore into another.)
After creating my first model definition and a 30-minute dive into the docs, it seems I'll have to create an EC,
passing it a 'properly crafted' :-) ObjectStore instance. Am I on the right track?

regards
--
Delio
Sebastien Bigaret
2004-08-10 03:48:21 UTC
Permalink
Hi,
Post by Delio Brignoli
Hello,
I'm evaluating the use of Modeling for our new project(s) and was
wondering if some of you could give me pointers on how to access
multiple database instances of the same model in the same
application. (I understand that each database instance will be
isolated, and I don't expect to insert objects from one objectStore into
another.) After creating my first model definition and a 30-minute dive
into the docs, it seems I'll have to create an EC, passing it a 'properly
crafted' :-) ObjectStore instance. Am I on the right track?
Without more details, I guess you intend to do some load-balancing between db-instances in a read-only app., right?

First, if you want to do that in different processes, simply change
the model's connection dictionary before using it; each process will
then use its own db instance.
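
A minimal sketch of that per-process setup (the helper below and the
setConnectionDictionary() call are assumptions from memory, not verified API;
the point is only to rebind the model before any EC is created in the process):

    import os

    # Hypothetical per-instance connection parameters; adjust to your adaptor.
    CONN_DICTS = {
        'inst1': {'host': 'db1', 'database': 'app_inst1', 'user': 'app', 'password': 'secret'},
        'inst2': {'host': 'db2', 'database': 'app_inst2', 'user': 'app', 'password': 'secret'},
    }

    def bind_model_to_instance(model, instance_id=None):
        """Rebind the (single) model to this process's db instance."""
        instance_id = instance_id or os.environ.get('APP_DB_INSTANCE', 'inst1')
        # setConnectionDictionary() is assumed here -- check the Model API.
        model.setConnectionDictionary(CONN_DICTS[instance_id])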

Regarding EC(parentObjectStore): it's usually used to make child ECs;
passing it a dedicated ObjectStore is completely untested. But you're on
the right track: you can either give it a dedicated DBContext, or use
the default behaviour (which assigns an ObjectStoreCoordinator) and
design your own ObjStoreCoord. so that e.g. it returns different
DBContexts for different ECs.
In both cases, you'll have to take care of the DBContext's Database
(see e.g. DBContext.handleNotification): either each one gets its own,
or they all share the same. If every DBContext gets its own Database
object, then each EC should *always* get the same DBContext, or you'll
defeat the snapshotting ref. counting.
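
Whatever object ends up handing out DBContexts, the constraint above boils down
to a stable mapping; purely as an illustration (nothing here is framework API):

    # One DBContext per key (e.g. per client ID); repeated lookups for the
    # same key must return the same DBContext, never a fresh one.
    _dbcontext_by_key = {}

    def dbcontext_for(key, factory):
        if key not in _dbcontext_by_key:
            _dbcontext_by_key[key] = factory(key)
        return _dbcontext_by_key[key]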

-- Sébastien.
Delio Brignoli
2004-08-10 07:18:05 UTC
Permalink
On 10 Aug 2004 07:47:43 +0200
Sebastien Bigaret <***@users.sourceforge.net> wrote:

Hi,
I realized I wrote my first message in a hurry, so I'm writing this up to give you more info on what I want to achieve, so you can have a better idea of what's involved.

The application runs in a multithreaded application server; a client-to-database-instance mapping
is already done by the application (in its first incarnation it was accessing the DB (almost) directly).
Let's say that we have a mapping between an ID string and a database connection string.
These are my ideas and they may not play well with the way Modeling is designed, but I hope they will be a starting point.

What I want to achieve:

When I create an EC, I'd like to pass the ID string as a parameter and this should give me an EC
that works 'like' a plain old connection to the mapped database instance.

How to get there:

- change the mapping into: ID -> ( connection params, model )
?- subclass EC to support the ID string being passed to __init__
?- subclass ObjectStoreCoordinator to use the ID string information in EC to create/select a proper DatabaseContext.
?- at which level do caches work in Modeling? I obviously don't want rows from different database instances to mix up :-P
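
A rough sketch of the ID mapping above, purely illustrative (all names made up):

    # ID string -> (connection parameters, model name)
    REGISTRY = {
        'acme':   ({'host': 'db1', 'database': 'acme_db',   'user': 'app'}, 'model_acme'),
        'globex': ({'host': 'db2', 'database': 'globex_db', 'user': 'app'}, 'model_globex'),
    }

    def resolve(client_id):
        """Return (connection_dict, model_name) for a client ID."""
        try:
            return REGISTRY[client_id]
        except KeyError:
            raise LookupError('no database instance mapped to %r' % client_id)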

regards
--
Delio
Sebastien Bigaret
2004-08-12 20:24:00 UTC
Permalink
Post by Sebastien Bigaret
Without more details, I guess you intend to do some load-balancing
between db-instances in a read-only app., right?
not at this stage, load balancing is further down the track; right now we
need to access several DBs with the same schema, and it is a requirement
that each customer has its own separate database instance; fixing one
fixes the other as well.
Oh oh, so this is something completely different than what I initially had in
mind.

If I understand you right, you have a reference model, something like
a template model, that you want to use for every customer account.
Each one will get its own model, say a copy of the reference model,
that it should use to access its data in a separate database.

But since in the framework, entities' names share a single namespace, you won't
be able to load two different models with entities having the same name in the
same process.

What seems to be a serious problem at first sight also suggests that
there exists a possible and easy solution.

Say your reference model contains one entity 'Document', then for each account
'login', you can create a model named e.g. 'model_login', containing the entity
'Document_login'. This way you'll be able to dynamically create the models,
create or destroy the dedicated databases from them (see the script
mdl_generate_db_schema.py for details on how this can be done), and access each
account's data in a dedicated database (since every 'model_login' will get its
own connection dictionary, as expected).
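
As a sketch of the naming scheme only (this is plain data, not the model API;
the helper and entity list are made up for illustration):

    REFERENCE_ENTITIES = ['Document']          # entities of the reference model

    def per_login_model_description(login, base_conn_dict):
        """Derive the per-account model name, entity names and conn. dict."""
        model_name = 'model_%s' % login
        entities = ['%s_%s' % (name, login) for name in REFERENCE_ENTITIES]
        conn_dict = dict(base_conn_dict, database='db_%s' % login)
        return model_name, entities, conn_dict

    # per_login_model_description('alice', {'host': 'localhost', 'user': 'app'})
    # -> ('model_alice', ['Document_alice'], {..., 'database': 'db_alice'})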

[...]
My knowledge of the framework is limited, I literally unpacked the
source less than 24 hours ago, ported some of our biz objects and logic,
and have one test from the test suite successfully completing (by the way,
being able to express the model in python made my life easier) (in
case you are interested, I tried the dynamic metaclass-based approach
first but I'd get an exception when populating the class as soon as I
defined an __init__ method: the check_old_init method tried to len() args
but python choked ;) ) ...
I'm interested; if you could supply a simple example and file a bug report
@sf.net, that would help a lot ;)
but back to the issue,
I'll look into the framework more to understand exactly what is
involved in adding this feature, and since I can already hear my boss
asking about it, I'll ask you straight away how much it would cost to
contract you to add this feature (and how long it would take).
Hopefully the solution exposed above can make it; if not, then do not hesitate
to get back to me!


Oh, but wait... sorry, I didn't realize you sent another email with more details:
When I create an EC, I'd like to pass the ID string as a parameter and
this should give me an EC that works 'like' a plain old connection to
the mapped database instance.
- change the mapping into: ID -> ( connection params, model )
?- subclass EC to support the ID string being passed to __init__
?- subclass ObjectStoreCoordinator to use the ID string information in
EC to create/select a proper DatabaseContext.
?- at which level do caches work in Modeling? I obviously don't want rows
from different database instances to mix up :-P
Okay, so good news, we're definitely talking about the same thing ;)

So what you need to do is: use the 'ID' (the login, e.g.) to dynamically build
(or load, or...) the corresponding model 'model_ID', use the correct entity in
this model, e.g. 'Document_ID', when fetching objects, and use the correct
class (the one you'd probably want to dynamic.build...() from the model) to
build new objects --this way they get automatically inserted into the right
table in the right database (look in class ClassDescription or
EntityClassDescription for a method automatically building the right object
of the right class given its entity).
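
A hypothetical convenience wrapper for the fetching part (ec.fetch() is used
as elsewhere in this thread; the helper itself is made up):

    def fetch_for_login(ec, login, base_entity):
        """Fetch through the per-login entity, e.g. 'Document_alice'."""
        return ec.fetch('%s_%s' % (base_entity, login))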

I think this makes everything work the way you want; and obviously you won't
run into any cache trouble or the like, since you're just using different
models, that's all!

What do you think?

-- Sébastien.
Mario Ruggier
2004-08-10 08:36:00 UTC
Permalink
Post by Sebastien Bigaret
Post by Delio Brignoli
Hello,
I'm evaluating the use of modeling for our new project(s) and was
wondering if some of you could give me pointers on how to access
multiple database instances of the same model in the same
application. (I understand that each database instance will be
isolated and don't expect to insert object from an objectStore into
another) After creating my first model definition and 30 minutes dive
in the docs, it seems I'll have to create an EC passing a 'properly
crafted' :-) ObjectStore instance. Am I on the right track?
Without more details, I guess you intend to do some load-balancing
between db-instances in aread-only app., right?
Another situation in which this could apply is migrating from one
db+model to another... a little Modeling mapping code could read from
one EC and populate another that uses a modified model. This may not be
the most efficient way to do this, but possibly the most flexible,
and the one that would best guarantee data integrity and consistency.
Any opinions on whether this would be a good idea or not?
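
A very rough sketch of what such mapping code could look like (ec.fetch() and
saveChanges() as used elsewhere in this thread; insert() and the transform
callable are assumptions, just to illustrate the shape of the loop):

    def migrate_entity(source_ec, target_ec, entity_name, transform):
        """Copy all objects of one entity from source_ec into target_ec.

        transform(old) must return a new object built against the target
        model (possibly with a different entity/attribute layout).
        """
        for old in source_ec.fetch(entity_name):
            new = transform(old)
            target_ec.insert(new)      # exact method name may differ
        target_ec.saveChanges()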

mario
Sebastien Bigaret
2004-08-12 20:28:09 UTC
Permalink
Mario Ruggier wrote:
[...]
Post by Mario Ruggier
Another situation in which this could apply is migrating from one
db+model to another... a little Modeling mapping code could read from
one EC and populate another that uses a modified model. This may not
be the most efficient way to do this, but possibly the most
flexible, and the one that would best guarantee data integrity and
consistency. Any opinions on whether this would be a good idea or not?
I'm sure we've played with that idea some day in the past; if I remember correctly it
was probably with Yannick Gingras (I can't check the archives right now).

The problem is that, while this sounds interesting, this could
probably only be done with small databases --because, unless we
want to automatically analyse a model in detail, handling
relationships will end up loading a whole database into memory.

Moreover, I'm convinced that it will be difficult to cover every
transformation. But I can be proved wrong!

-- Sébastien.
Mario Ruggier
2004-08-18 12:33:03 UTC
Permalink
Post by Sebastien Bigaret
[...]
Post by Mario Ruggier
Another situation in which this could apply is migrating from one
db+model to another... a little Modeling mapping code could read from
one EC and populate another that uses a modified model. This may not
be the most efficient way to do this, but possibly the most
flexible, and the one that would best guarantee data integrity and
consistency. Any opinions on whether this would be a good idea or not?
I'm sure we've played with that idea some day in the past; if I remember correctly it
was probably with Yannick Gingras (I can't check the archives right now).
The problem is that, while this sounds interesting, this could
probably only be done with small databases --because, unless we
want to automatically analyse a model in detail, handling
relationships will end up loading a whole database into memory.
OK, as things are, it sounds like in practice this will not be the
way to go.
It would be difficult to guarantee that the object graph of the target
EC will stay within a certain size... before the target model integrity
is respected and one could actually do a saveChanges()...
Post by Sebastien Bigaret
Moreover, I'm convinced that it will be difficult to cover every
transformation. But I can be proved wrong!
Well, I assume that the transformation logic itself will be written
case by case... looping over the tables and their relationships,
copying/mapping them from one EC to another. There should not be any
limitations there... python is pretty capable ;)

Anyhow, w.r.t. the initial topic of this thread, what you suggest,
i.e. to dynamically create as many models as you need but with class
names of the form Table_ID, seems rather opaque and awkward to me...

How big of a shake-up would it be to give the EditingContext control over
which database it is working with? Even if it is the same model and
entity namespace in memory, why not allow the EditingContext to specify
its connection dictionary? i.e.

ec1 = EditingContext(connDict1)
ec2 = EditingContext(connDict2)

I guess EditingContext could also grow an optional model parameter...
for when the system will support more than the defaultModel.

For EditingContexts that inherit from another, they would need to exist
in the same db context, so they will be obliged to use their parent's
connection dictionary...
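
Spelled out as a hypothetical signature (this is not the current API, just
what the suggestion would amount to):

    # class EditingContext:
    #     def __init__(self, connectionDictionary=None, model=None,
    #                  parentObjectStore=None):
    #         # a child EC (parentObjectStore given) would ignore
    #         # connectionDictionary and reuse its parent's binding
    #         ...

    # connDict1/connDict2 above would then be ordinary connection dictionaries:
    connDict1 = {'host': 'db1', 'database': 'customers_a', 'user': 'app'}
    connDict2 = {'host': 'db2', 'database': 'customers_b', 'user': 'app'}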

In fact this reminds me of a remark some months ago, about extracting
the connection dictionary out of the model altogether, where I had
claimed, without really substantiating it, that that does not seem to be
where the connection dictionary belongs... ;)

mario
Sebastien Bigaret
2004-08-24 18:57:51 UTC
Permalink
Hi Mario and all,

Mario Ruggier <***@ruggier.org> wrote:
[...]
Anyhow, w.r.t. the initial topic of this thread, what you suggest, i.e. to
dynamically create as many models as you need but with class names of the form
Table_ID, seems rather opaque and awkward to me...
Well, awkward, I don't know, probably a matter of taste! But anyhow it
was not my suggestion, it was the initial requirement: Delio wants a
database per user. Each of these databases needs to be uniquely defined
within the server, and I think that naming each of them after the
user/login name is quite straightforward ;)

Now, since all entities live in a single namespace, they need to be
uniquely defined as well --that's why they are also named after the
user's name; it is the only reason, but also a very strong one.
How big of a shake-up would it be to give the EditingContext control over
which database it is working with? Even if it is the same model and
entity namespace in memory, why not allow the EditingContext to
specify its connection dictionary? i.e.
ec1 = EditingContext(connDict1)
ec2 = EditingContext(connDict2)
I guess EditingContext could also grow an optional model
parameter... for when the system will support more than the
defaultModel.
For EditingContexts that inherit from another, they would need to
exist in the same db context, so they will be obliged to use their
parent's connection dictionary...
NB: I'll answer here as if you suggested that we could have entities
sharing the same name in different models having different connection
dicts; this is equivalent to what you say if we consider that a model
and its connection dict. *cannot* be separated --later in the post
I'll explain why they cannot be decorrelated.


The real problem here is that you're gonna have objects corresponding
to different entities, all named Document but living in different
databases. Say you have databases DB1 and DB2: you suggest that we can
have at runtime objects doc1 and doc2, resp. being objects for entities
Document in DB1 and Document in DB2.

Now you'll see the real problem if you realize that each object is
uniquely identified by its GlobalID. And a GlobalID is the combination
of an entity name and its PK value(s). If obj1 and obj2 have the same
PK, then they will get the same gID. The consequences are, as expected,
devastating: snapshots are all mixed up, faults will pick up from the db
cache some data that in fact belongs to another database, etc...
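
To illustrate the clash with a toy model of a GlobalID as just
(entity name, PK) -- a simplification, not the framework's actual class:

    gid_from_db1 = ('Document', 42)   # Document with PK 42, fetched from DB1
    gid_from_db2 = ('Document', 42)   # Document with PK 42, fetched from DB2

    assert gid_from_db1 == gid_from_db2   # same key: the framework sees one object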

In brief: your suggestion requires that the entities' names live in a
separate namespace, or put differently, that an entity is identified by
its name and its model's name. Alas, the whole framework has been
coded with the idea of unique, distinct entity names, so while this
is possible, and while this could be added as a TODO item, this
definitely won't come up as a priority item for a long time :/
In fact this reminds me of a remark some months ago, about extracting
the connection dictionary out of the model altogether, where I had
claimed, without really substantiating it, that it does not seem that
that is where the connection dictionary belongs... ;)
I strongly disagree here: the connection dictionary does belong to the
model: this is entity-relationship modeling, entities describe classes
stored in databases. Or, if you prefer: as soon as an object is
fetched, the model is tightly bound to its connection dictionary: its
connection dictionary becomes a read-only attribute. So, even if the
values for the conn. dict. can be specified elsewhere (as you suggested,
and as has since been implemented with MDL_DB_CONNECTIONS_CFG), it is a
constituent part of the model.


-- Sébastien.
Mario Ruggier
2004-08-26 12:36:12 UTC
Permalink
Post by Sebastien Bigaret
Hi Mario and all,
[...]

Thanks for the detailed reply! It helps me understand better what's
involved.... further comments below.
Post by Sebastien Bigaret
Post by Mario Ruggier
How big of shaker would it be to give the EditingContext control over
which database it is working with? Even if it is the same model and
entity namespace in memory, why not allow the EditingContext to
specify its connection dictionary? i.e.
ec1 = EditingContext(connDict1)
ec2 = EditingContext(connDict2)
I guess EditingContext could also grow an optional model
parameter... for when the system will support more than the
defaultModel.
For EditingContexts that inherit from another, they would need to
exist in the same db context, so they will be obliged to use their
parent's connection dictionary...
NB: I'll answer here as if you suggested that we could have entities
sharing the same name in different models having different connection
dicts; this is equivalent to what you say if we consider that a model
and its connection dict. *cannot* be separated --later in the post
I'll explain why they cannot be decorrelated.
Yes, exactly... a model instance cannot change its connection
dictionary!
I think what I was saying, but without the intimate knowledge of the
internals of the framework to be so precise, is for a model "class" to
be instantiated at EditingContext() time, as opposed to model load time
(which is what gives you the model "class"...).
Post by Sebastien Bigaret
The real problem here is that you're gonna have objects corresponding
to different entities, all named Document but living in different
databases. Say you have databases DB1 and DB2: you suggest that we can
have at runtime objects doc1 and doc2, resp. being objects for entities
Document in DB1 and Document in DB2.
Now you'll see the real problem if you realize that each object is
uniquely identified by its GlobalID. And a GlobalID is the combination
of an entity name and its PK value(s). If obj1 and obj2 have the same
PK, then they will get the same gID.
But, is this simply a matter of changing the algorithm of how a GID is
calculated?
What if you throw in for example the database name as part of that
calculation?
Then, two objects may have the same entity name, same model name, same
PK even... but if they come from a different db, there will not be any
ambiguity... Note that this should also allow for the possibility to
have the same object from the same database but in two distinct
EditingContexts (not one that inherits from the other), to be correctly
identified by the framework as the same object.
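
Continuing the toy illustration from above, folding some database key into the
GID would indeed keep the two apart (the key format here is made up):

    gid_from_db1 = ('db1/customers_a', 'Document', 42)
    gid_from_db2 = ('db2/customers_b', 'Document', 42)

    assert gid_from_db1 != gid_from_db2   # distinct databases, distinct identities
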
Post by Sebastien Bigaret
The consequences are, as expected,
devastating: snapshots are all mixed up, faults will pick up from the db
cache some data that in fact belongs to another database, etc...
In brief: your suggestion requires that the entities' names live in a
separate namespace, or put differently, that an entity is identified by
its name and its model's name.
Yes, entities' names should be in a different namespace... but, I would
go even further, and suggest also that models live in a separate
namespace, defined by the database instance. Thus, you can instantiate
the same model **exactly**, but bound to different databases.
Post by Sebastien Bigaret
Alas, the whole framework has been
coded with the idea of unique, distinct entities names, so while this
is possible, and while this could be added as a TODO item, this
definitely won't come up as a priority item for long :/
Certainly, and this is exactly the kind of feedback I'm looking for
when I put out this suggestion... I want to understand better if the
idea makes any sense, if having such a way of initialising an EC brings
other advantages or disadvantages, and how difficult it is to modify
the framework so drastically.

I cannot appreciate how much work such a thing could be, but from what
you say above I suspect it may be less than I thought ;)

In any case, I think it would be too early to add this as a TODO item;
further poking and contemplation is probably a good idea.
Anyone else following this issue have an opinion?
Post by Sebastien Bigaret
Post by Mario Ruggier
In fact this reminds me of a remark some months ago, about extracting
the connection dictionary out of the model altogether, where I had
claimed, without really substantiating it, that it does not seem that
that is where the connection dictionary belongs... ;)
I strongly disagree here: the connection dictionary does belong to the
model: this is entity-relationship modeling, entities describe classes
stored in databases. Or, if you prefer: as soon as an object is
fetched, the model is tightly bound to its connection dictionary: its
connection dictionary becomes a read-only attribute. So, even if the
values for the conn.dict. can be specified elsewhere (as you suggested,
and as it was implemented w/ MDL_DB_CONNECTIONS_CFG since), it is a
constituent part of the model.
Yes, this emphasises what was said above... once an object is
fetched, the model is bound to its conn dictionary... i.e. to the
database. So, if the db can be specified at EC() creation time, then
(conceptually, at least ;-\ ) there need not be any GID clashing
between objects coming from the same model and the same entity...
Post by Sebastien Bigaret
-- Sébastien.
Thanks again!

mario
Sebastien Bigaret
2004-08-30 18:35:15 UTC
Permalink
Mario Ruggier <***@ruggier.org> wrote:
[...]
Post by Mario Ruggier
Post by Sebastien Bigaret
The real problem here is that you're gonna have objects corresponding
to different entities, all named Document but living in different
databases. Say you have databases DB1 and DB2: you suggest that we can
have at runtime objects doc1 and doc2, resp. being objects for entities
Document in DB1 and Document in DB2.
Now you'll see the real problem if you realize that each object is
uniquely identified by its GlobalID. And a GlobalID is the combination
of an entity name and its PK value(s). If obj1 and obj2 have the same
PK, then they will get the same gID.
But, is this simply a matter of changing the algorithm of how a GID is
calculated?
What if you throw in for example the database name as part of that
calculation?
Then, two objects may have the same entity name, same model name, same PK
even... but if they come from a different db, there will not be any
ambiguity... Note that this should also allow for the possibility to have
the same object from the same database but in two distinct EditingContexts
(not one that inherits from the other), to be correctly identified by the
framework as the same object.
If by "database name" you mean adaptor name + db name + user name
(Oracle for example can handle two db with the same name for two
different users), then yes, in theory this works.
Post by Mario Ruggier
Post by Sebastien Bigaret
The consequences are, as expected,
devastative: snapshots are all mixed up, faults will pick up from the db
cache some data that in fact belongs to another database, etc...
In brief: your suggestion requires that the entities' names live in a
separate namespace, or put differently, that an entity is identified by
its name and its model's name.
Yes, entities' names should be in a different namespace... but, I would go
even further, and suggest also that models live in a separate namespace,
defined by the database instance. Thus, you can instantiate the same model
**exactly**, but bound to different databases.
Again, in this case, yes, this would work. But this requires a lot of
changes. And back to reality, I believe that:

1. Any app. that would require/benefit from such a feature can anyhow
solve the problem differently and without much trouble,

2. this would add either some heavy changes in the API or some sort of
"implicit" knowledge about the EC, e.g. for some operations. For
example: would we require that ec.fetch() specifies the exact Entity
object (or an equivalent string, including the whole conn. dict.,
possibly hashed)? Or would we rather consider by default the model to
which the EC is bound, making the portion of code::

    ec.fetch('Person')

fuzzy about which Entity it fetches?
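
For the record, "the whole conn. dict., possibly hashed" could look like this
(a toy sketch, not framework code):

    import hashlib

    def conn_dict_key(conn_dict):
        """Stable digest of a connection dictionary, usable as part of a key."""
        items = sorted('%s=%s' % (k, v) for k, v in conn_dict.items())
        return hashlib.md5(';'.join(items).encode('utf-8')).hexdigest()
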
Post by Mario Ruggier
[...] I want to understand better if the idea makes any sense, if
having such a way of initialising an EC brings other advantages or
disadvantages, and how difficult it is to modify the framework so
drastically.
I cannot appreciate how much work such a thing could be, but from what
you say above I suspect it may be less than I thought ;)
In any case, I think it would be too early to add this as a TODO item
as yet, as further poking and contemplation is probably a good
idea. Anyone else following this issue have an opinion?
You're right, that's way too early!... As you probably already guessed,
the more I'm thinking about such an extension, the less I like it. The
concept is simple, but the consequences, esp. on the API, are not
desirable. I believe that asking for more than an Entity to be supplied
to e.g. ec.fetch() is not a very good idea, and I do not like the idea
of adding the level of implicitness described above in statements like
ec.fetch().

Plus, I now tend to believe that this is another YAGNI --the use-case
that initiated the discussion can be solved more elegantly (as far as
I'm concerned) by other means, as described in other posts.

-- Sébastien.
Sebastien Bigaret
2004-08-30 18:40:01 UTC
Permalink
Mario Ruggier <***@ruggier.org> wrote:
[...]
Post by Mario Ruggier
Post by Sebastien Bigaret
The real problem here is that you're gonna have objects corresponding
to different entities, all named Document but living in different
databases. Say you have databases DB1 and DB2: you suggest that we can
have at runtime objects doc1 and doc2, resp. being objects for entities
Document in DB1 and Document in DB2.
Now you'll see the real problem if you realize that each object is
uniquely identified by its GlobalID. And a GlobalID is the combination
of an entity name and its PK value(s). If obj1 and obj2 have the same
PK, then they will get the same gID.
But, is this simply a matter of changing the algorithm of how a GID is
calculated?
What if you throw in for example the database name as part of that
calculation?
Then, two objects may have the same entity name, same model name, same PK
even... but if they come from a different db, there will not be any
ambiguity... Note that this should also allow for the possibility to have
the same object from the same database but in two distinct EditingContexts
(not one that inherits from the other), to be correctly identified by the
framework as the same object.
If by "database name" you mean adaptor name + db name + user name
(Oracle for example can handle two db with the same name for two
different users), then yes, in theory this works.
Post by Mario Ruggier
Post by Sebastien Bigaret
The consequences are, as expected,
devastative: snapshots are all mixed up, faults will pick up from the db
cache some data that in fact belongs to another database, etc...
In brief: your suggestion requires that the entities' names live in a
separate namespace, or put differently, that an entity is identified by
its name and its model's name.
Yes, entities' names should be in a different namespace... but, I would go
even further, and suggest also that models live in a separate namespace,
defined by the database instance. Thus, you can instantiate the same model
**exactly**, but bound to different databases.
Again, in this case, yes, this would work. But this requires a lot of
changes. And back to reality, I believe that:

1. Any app. that would require/benefit from such a feature can anyhow
solve the problem differently and without much trouble,

2. this would add either some heavy changes in the API or some sort of
"implicit" knowledge about the EC, e.g. for some operations. For
example: would we require that ec.fetch() specifies the exact Entity
object (or an equivalent string, including the whole conn. dict.,
possibly hashed)? Or would we rather consider by default the model to
which the EC is bound, making the portion of code::

    ec.fetch('Person')

fuzzy about which Entity it fetches?
Post by Mario Ruggier
[...] I want to understand better if the idea makes any sense, if
having such a way of initialising an EC brings other advantages or
disadvantages, and how difficult it is to modify the framework so
drastically.
I cannot appreciate how much work such a thing could be, but from what
you say above I suspect it may be less than I thought ;)
In any case, I think it would be too early to add this as a TODO item
as yet, as further poking and contemplation is probably a good
idea. Anyone else following this issue have an opinion?
You're right, that's way too early!... As you probably already guessed,
the more I'm thinking about such an extension, the less I like it. The
concept is simple, but the consequences, esp. on the API, are not
desirable at all. I believe that asking for more than an Entity to be
supplied to e.g. ec.fetch() is not a very good idea, and I do not like
the idea of adding the level of implicitness described above in
statements like ec.fetch(). Makes sense?

-- Sébastien.
Sebastien Bigaret
2004-08-30 18:57:58 UTC
Permalink
Ooops, two posts; only the last paragraphs differ.
The second one is the right one. I removed the
discussion of whether this is a YAGNI, just because I
think that the consequences on the API and on the clarity of
MDL statements are sufficient to discard the feature
(unless it is proven that both approaches --requiring more
details, or going the implicit way-- can be avoided).

Just to make it clear ;)

-- Sébastien.
n***@tin.it
2004-08-31 22:38:20 UTC
Permalink
On Mon, 30 Aug 2004 22:55:37 +0200
Post by Sebastien Bigaret
Ooops, two posts; only the last paragraphs differ.
The second one is the right one. I removed the
discussion of whether this is a YAGNI, just because I
think that the consequences on the API and on the clarity of
MDL statements are sufficient to discard the feature
(unless it is proven that both approaches --requiring more
details, or going the implicit way-- can be avoided).
Hello,

I'm back from the dead (sorta); unfortunately I haven't had time to try
Sebastien's solution (append the login name to entities and model name).
The tricky part is writing your biz logic once:

class A:

    __metaclass__ = dynamic.CustomObjectMeta
    entityName = 'A'
    mdl_define_properties = 1

    def custom_method( self ):
        pass

and then have classes A_login1, A_login2 generated on the fly by
playing with the entityName attribute. I'll give this a try later.
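
A sketch of what I mean, with the business logic written once in a plain base
class and the per-login classes built with type() (dynamic.CustomObjectMeta
and mdl_define_properties come from the snippet above; build_class_for and the
base class are made up for illustration):

    class ABusinessLogic:
        """Business logic written once, shared by every per-login class."""
        def custom_method(self):
            pass

    def build_class_for(login):
        """Build e.g. A_login1 on the fly, pointing at entity 'A_login1'."""
        name = 'A_%s' % login
        # In the real thing the class dict would also carry the metaclass
        # (__metaclass__ = dynamic.CustomObjectMeta) and mdl_define_properties,
        # so that properties get generated from the per-login entity.
        return type(name, (ABusinessLogic,), {'entityName': name})

    A_login1 = build_class_for('login1')
    A_login2 = build_class_for('login2')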

thanks for your feedback
--
Delio
Mario Ruggier
2004-09-10 12:12:00 UTC
Permalink
Sorry about the delay in getting back on this. Also, as this discussion
is on a very low priority enhancement, and one that Sébastien clearly
does not like, I do not want to belabor the issue too much. But for the
sake of correctness, I will just add a few comments that I hope are
relevant.
Post by Sebastien Bigaret
[...]
Post by Mario Ruggier
Post by Sebastien Bigaret
The real problem here is that you're gonna have objects corresponding
to different entities, all named Document but living in different
databases. Say you have databases DB1 and DB2: you suggest that we can
have at runtime objects doc1 and doc2, resp. being objects for entities
Document in DB1 and Document in DB2.
Now you'll see the real problem if you realize that each object is
uniquely identified by its GlobalID. And a GlobalID is the combination
of an entity name and its PK value(s). If obj1 and obj2 have the same
PK, then they will get the same gID.
But, is this simply a matter of changing the algorithm of how a GID is
calculated?
What if you throw in for example the database name as part of that
calculation?
Then, two objects may have the same entity name, same model name, same PK
even... but if they come from a different db, there will not be any
ambiguity... Note that this should also allow for the possibility to have
the same object from the same database but in two distinct
EditingContexts
(not one that inherits from the other), to be correctly identified by the
framework as the same object.
If by "database name" you mean adaptor name + db name + user name
(Oracle for example can handle two db with the same name for two
different users), then yes, in theory this works.
I guess "database name" will have to be equivalent to the conn
dictionary, and mapped down to a single database instance, however that
might be handled by the underlying db.
Post by Sebastien Bigaret
Post by Mario Ruggier
Post by Sebastien Bigaret
The consequences are, as expected,
devastative: snapshots are all mixed up, faults will pick up from the db
cache some data that in fact belongs to another database, etc...
In brief: your suggestion requires that the entities' names live in a
separate namespace, or put differently, that an entity is
identified by
its name and its model's name.
Yes, entities' names should be in a different namespace... but, I would go
even further, and suggest also that models live in a separate
namespace,
defined by the database instance. Thus, you can instantiate the same model
**exactly**, but bound to different databases.
Again, in this case, yes, this would work. But this requires a lot of
changes. And back to reality, I believe that:
1. Any app. that would require/benefit from such a feature can anyhow
solve the problem differently and without much trouble,
I can appreciate that the framework change may not be feasible at all,
and that workarounds are available. My concern though is primarily to
simplify and improve how the framework can be used. It is (kind of ;)
like saying that there is no need to make a piece of code clearer
because it works...
Post by Sebastien Bigaret
2. this would add either some heavy changes in the API or some sort of
"implicit" knowledge about the EC, e.g. for some operations. For
example: would we require that ec.fetch() specifies the exact Entity
object (or an equivalent string, including the whole conn. dict.,
possibly hashed)? Or would we rather consider by default the model to
which the EC is bound, making the portion of code::

    ec.fetch('Person')

fuzzy about which Entity it fetches?
I am not sure I understand your first point here. I was assuming
something along the lines of the second one, that is:
- an EC instance *must* be initialised with a Model instance, and a
ConnDict
- each object in the EC corresponds to a Type (from the model) and a
table row in the db identified by the conndict
- if a second EC is initialised in the same way, then if the same
object is loaded in either of these two ECs, it will have the same gid.
- nested ECs must assume the same model and conndict of the parent EC

Thus, except for the difference in EC initialisation, I do not see why
the API should change at all... at least the exposed API, as internally
there may well be major issues that I am not even considering.
Post by Sebastien Bigaret
Post by Mario Ruggier
[...] I want to understand better if the idea makes any sense, if
having such a way of initialising an EC brings other advantages or
disadvantages, and how difficult it is to modify the framework so
drastically.
I cannot appreciate how much work such a thing could be, but from what
you say above I suspect it may be less than I thought ;)
In any case, I think it would be too early to add this as a TODO item
as yet, as further poking and contemplation is probably a good
idea. Anyone else following this issue have an opinion?
You're right, that's way too early!... As you probably already guessed,
the more I'm thinking about such an extension, the less I like it. The
concept is simple, but the consequences, esp. on the API, are not
desirable at all. I believe that asking for more than an Entity to be
supplied to e.g. ec.fetch() is not a very good idea, and I do not like
the idea of adding the level of implicitness described above in
statements like ec.fetch(). Makes sense?
Again, I do not get this point, i.e. why do you need to specify more
than the Entity to ec.fetch() ? The ec instance itself knows what the
context of the fetch is, and that is the model instance plus the
conndict it was initialised with. The fetch interface should not change
at all.... but, I think I may be missing something here!
Post by Sebastien Bigaret
-- Sébastien.
Cheers, mario
