Discussion:
[Modeling-users] Couple of questions...
Aaron Freeman
2004-04-20 14:47:06 UTC
Permalink
(1) Is there a debug flag somewhere that dump all SQL queries to the
console?

(2) How are nested joins accomplished?

For example, in the AuthorBooks schema, lets assume there is another table
called publisher that has a 1-M relationship with books, like so (sorry
about the formatting):

............................................................................
.............
.+-----------+.(0,1)..............(0,*).+-------+.(0,*)..............(0,1).+
-----------+.
.|..Writer...|<-author----------books->>|.Book..|<<-books-------publisher->|
.Publisher.|.
.|-----------|.(nullify)......(cascade).|-------|..........................|
-----------|.
.|.lastName..|..........................|.title.|..........................|
.bizName...|.
.|.firstName.|..........................|.price.|..........................|
.location..|.
.|.age.......|.(0,1)....................+-------+..........................+
-----------+.
.|.birthday..|<-pygmalion--+................................................
.............
.+-----------+.(nullify)...|................................................
.............

.....|....................|.................................................
............
......+--------------------+................................................
.............
............................................................................
.............

How would I get writers by publisher?

More specifically, is there an elegant way to do a nested join in a single
query, or do I just have to do a single writer-book join and then iterate
over the resultset for the second join?

Code examples would be great!

Thanks for your time.

-Aaron
Sebastien Bigaret
2004-04-20 16:55:27 UTC
Permalink
Hi Aaron,
Post by Aaron Freeman
(1) Is there a debug flag somewhere that dump all SQL queries to the
console?
Absolutely: set the environment variable MDL_ENABLE_DATABASE_LOGGING to
any true value. For details, see:
http://modeling.sourceforge.net/UserGuide/env-vars-core.html
Post by Aaron Freeman
(2) How are nested joins accomplished?
For example, in the AuthorBooks schema, lets assume there is another table
called publisher that has a 1-M relationship with books, like so (sorry
[...
Writer <-author----books->> Book <<-books----publisher-> Publisher
...]
Post by Aaron Freeman
How would I get writers by publisher?
More specifically, is there an elegant way to do a nested join in a single
query, or do I just have to do a single writer-book join and then iterate
over the resultset for the second join?
ec.fetch('Writer', 'books.publisher.bizName=="P1"')
returns every writer who published (at least) a book w/ publisher P1.

The generated SQL query is (postgresql here):

SELECT DISTINCT t0.ID, t0.LAST_NAME, t0.FIRST_NAME, t0.AGE,
t0.FK_WRITER_ID, t0.BIRTHDAY
FROM WRITER t0
INNER JOIN ( BOOK t1
INNER JOIN PUBLISHER t2
ON t1.FK_PUBLISHER=t2.ID )
ON t0.ID=t1.FK_WRITER_ID
WHERE t2.BIZ_NAME = 'P1';
Post by Aaron Freeman
Code examples would be great!
I'm including in the end of the message a PyModel named EAuthorBooks
(E stands for Extended :) derived from the one used in the User's
Guide, with sample data, demonstrating the approach.

Regards,


-- Sébastien.



PS: you'll need to setup a database before running the example, e.g. w/
postgresql:

$ mdl_generate_DB_schema.py \
--admin-dsn "localhost:template1:postgres:" \
-v -C pymodel_extAutborBooks.py


------------------------------------------------------------------------
#! /usr/bin/env python
# -*- coding: iso-8859-1 -*-
"Extended AuthorBooks model, with a Publisher"

from Modeling.PyModel import *

##
# Defaults
AFloat.defaults['precision'] = 10
AFloat.defaults['scale'] = 2
AString.defaults['width'] = 40

Association.defaults['delete']=['nullify', 'nullify']

Entity.defaults['properties'] = [
APrimaryKey('id', isClassProperty=0, isRequired=1, doc='PK')
]

##
# Adapt to fit your own configuration!!

_connDict = {'database': 'EAUTHOR_BOOKS', 'host':'localhost',
'user':'postgres','password':''}

model = Model('EAuthorBooks',adaptorName='Postgresql',
connDict=_connDict)

model.doc = ' ... '
model.version='0.1'
model.entities = [
#
Entity('Book',
properties=[ AString('title', isRequired=1, columnName='title'),
AFloat('price'),
],
),
Entity('Writer',
properties=[ AString('lastName',isRequired=1, width=30 ),
AString('firstName'),
AInteger('age', displayLabel='Age'),
ADateTime('birthday', usedForLocking=0),
]
),
Entity('Publisher',
properties=[ AString('bizName',isRequired=1),
AString('location'),
]
),
]
#---
model.associations=[
Association('Book', 'Writer',
relations=['author', 'books'],
delete=['nullify', 'cascade'],
keys=['FK_Writer_Id', 'id']),
Association('Writer', 'Writer',
relations=['pygmalion', None],
delete=['nullify', None],
keys=['FK_Writer_id', 'id']),
Association('Book', 'Publisher', relations=['publisher', 'books'])
]

model.build()
model=model.component

if __name__=="__main__":
#import pdb ; pdb.set_trace()

from Modeling.dynamic import build
build(model, define_properties=1)

from Modeling.ModelSet import defaultModelSet
defaultModelSet().addModel(model)

from Modeling.EditingContext import EditingContext
ec=EditingContext()

from EAuthorBooks.Publisher import Publisher
from EAuthorBooks.Writer import Writer
from EAuthorBooks.Book import Book

# Initialization
p1=Publisher(); p1.bizName="P1"
p2=Publisher(); p2.bizName="P2"

w1=Writer(); w1.lastName="in p1"
w2=Writer(); w2.lastName="in p2"
w3=Writer(); w3.lastName="in p1 and p2"

b1w1=Book(); b1w1.title="b1w1"
b2w1=Book(); b2w1.title="b2w1"
b1w2=Book(); b1w2.title="b1w2"
b2w2=Book(); b2w2.title="b2w2"
b1w3=Book(); b1w3.title="b1w3"
b2w3=Book(); b2w3.title="b2w3"
ec.insert(p1);
ec.insert(p2); ec.insert(w1); ec.insert(w2); ec.insert(w3)
ec.insert(b1w1); ec.insert(b2w1); ec.insert(b1w2); ec.insert(b2w2)
ec.insert(b1w3); ec.insert(b2w3)

w1.addToBooks(b1w1); b1w1.author=w1; w1.addToBooks(b2w1); b2w1.author=w1
w2.addToBooks(b1w2); b1w2.author=w2; w2.addToBooks(b2w2); b2w2.author=w2
w3.addToBooks(b1w3); b1w3.author=w3; w3.addToBooks(b2w3); b2w3.author=w3

b1w1.publisher=p1; p1.addToBooks(b1w1)
b2w1.publisher=p1; p1.addToBooks(b2w1);
b1w2.publisher=p2; p2.addToBooks(b1w2)
b2w2.publisher=p2; p2.addToBooks(b2w2);
b1w3.publisher=p1; p1.addToBooks(b1w3)
b2w3.publisher=p2; p2.addToBooks(b2w3);

ec.saveChanges()

# query
ec=EditingContext()
print [w.lastName
for w in ec.fetch('Writer', 'books.publisher.bizName=="P1"')]
# returns: w1 and w3
ec=EditingContext()
print [w.lastName
for w in ec.fetch('Writer', 'books.publisher.bizName=="P2"')]
# returns: w2 and w3
------------------------------------------------------------------------
John Lenton
2004-04-20 16:58:06 UTC
Permalink
Post by Aaron Freeman
(1) Is there a debug flag somewhere that dump all SQL queries to the
console?
(2) How are nested joins accomplished?
For example, in the AuthorBooks schema, lets assume there is another table
called publisher that has a 1-M relationship with books, like so (sorry
............................................................................
.............
.+-----------+.(0,1)..............(0,*).+-------+.(0,*)..............(0,1).+
.|..Writer...|<-author----------books->>|.Book..|<<-books-------publisher->|
.Publisher.|.
.|-----------|.(nullify)......(cascade).|-------|..........................|
.|.lastName..|..........................|.title.|..........................|
.bizName...|.
.|.firstName.|..........................|.price.|..........................|
.location..|.
.|.age.......|.(0,1)....................+-------+..........................+
.|.birthday..|<-pygmalion--+................................................
.............
.+-----------+.(nullify)...|................................................
.............
.....|....................|.................................................
............
......+--------------------+................................................
.............
............................................................................
.............
How would I get writers by publisher?
More specifically, is there an elegant way to do a nested join in a single
query, or do I just have to do a single writer-book join and then iterate
over the resultset for the second join?
I don't think so. I'd do it with

[ i.valueForKeyPath('books.author') for i in ec.fetch('Publisher') ]

which is clean enough for me. I guess what you'd want is for
valueForKeyPath to be callable on a class... :)
--
John Lenton (***@vialibre.org.ar) -- Random fortune:
Sobre toda cosa guardada guarda tu corazón, porque de él emana la vida.
-- Salomón.
Sebastien Bigaret
2004-04-20 19:47:01 UTC
Permalink
[...]
Post by John Lenton
Post by Aaron Freeman
How would I get writers by publisher?
More specifically, is there an elegant way to do a nested join in a single
query, or do I just have to do a single writer-book join and then iterate
over the resultset for the second join?
I don't think so. I'd do it with
[ i.valueForKeyPath('books.author') for i in ec.fetch('Publisher') ]
which is clean enough for me. I guess what you'd want is for
valueForKeyPath to be callable on a class... :)
Yes, that's another approach indeed. But while I'm a fervent defender
for KeyValueCoding ;)) in such cases one should be careful. Here are
some thoughts:

- if the database have a lot of books, the KVC approach will load them
all in memory: KVC traverses relationships at the object level, and
in this case this means fetching Books published by every Publisher,
then these Books' Authors. This is probably the worst case one can
get. That's probably why you were asking for a callable vFKP on a
class, weren't you?

- In the resulting lists, if an author has published N books for a
publisher, the corresponding list will receive N duplicates for that
author.
Post by John Lenton
Post by Aaron Freeman
ec.fetch('Writer', 'books.publisher.bizName=="P1"')
does not have these possible drawbacks.


However and admittedly the dotted notation as it is designed for now can
only handle one kind of semantics for to-many relationships: for
example, there is no way to ask for authors whose books were published
by one single publisher only.

In other words, then you'll have to use KVC or plain python, traverse
the relationships and check properties, or, hmmm, get back to plain
SQL w/ things like:

SELECT DISTINCT t0.ID, t0.LAST_NAME, t0.FIRST_NAME, t0.AGE,
t0.FK_WRITER_ID, t0.BIRTHDAY
FROM WRITER t0
WHERE NOT EXISTS (SELECT * FROM BOOK,WRITER,PUBLISHER
WHERE WRITER.ID=t0.ID
AND WRITER.ID=BOOK.FK_WRITER_ID
AND BOOK.FK_PUBLISHER=PUBLISHER.ID
AND PUBLISHER.BIZ_NAME!='P1')

just to get the authors who have no other publisher than P1.


This makes me think that it could be nice to have something like
Post by John Lenton
Post by Aaron Freeman
ec.fetch('Writer', 'NOT EXISTS(books.publisher.bizName!="P1")'
so that we could avoid having to write plain SQL just like above (yerk:)


What do you all think?

-- Sébastien.
Mario Ruggier
2004-04-21 06:21:07 UTC
Permalink
Hello!

I had come across these kind of limitations a few months ago... not
developing at the moment, but I find the issue interesting... so, just
my 2 centimes ;)
Post by Sebastien Bigaret
This makes me think that it could be nice to have something like
ec.fetch('Writer', 'NOT EXISTS(books.publisher.bizName!="P1")'
I would say yes this would be very nice to have... and this almost
seems to me a consistent and natural extension of what is supported
already, i.e. 'AND', 'OR', 'NOT', 'IN', 'NOT IN', 'like', ilike', ...
anything else?

Almost... because EXISTS applies to a independent subquery, while the
other keywords are constraints on the current query. Would this be a
possible source of confusion?

What happens if we intermix these keywords, e.g.

ec.fetch('Writer', 'books.publisher.numPages>50
AND (books.publisher.bizName IN ["P1","P2"])
AND NOT EXISTS(books.publisher.bizName NOT IN ["P1","p2"] ) '

to get the authors that have books with at least 51 pages as well as
having published with both of P1 and P2 but with no one else. Would
this work? There would be a simpler way to do this?

mario
Mario Ruggier
2004-04-21 18:56:05 UTC
Permalink
Post by Mario Ruggier
ec.fetch('Writer', 'books.publisher.numPages>50
AND (books.publisher.bizName IN ["P1","P2"])
AND NOT EXISTS(books.publisher.bizName NOT IN ["P1","p2"] ) '
oops, sorry, just realized that's pretty silly. Lousy example...
Make that something like:

ec.fetch('Writer', 'books.publisher.numPages>50
AND (books.publisher.bizName IN ["P1","P2"])
AND NOT EXISTS(books.publisher.country NOT IN ["c1","c2"] ) '

or so...

mario
Sebastien Bigaret
2004-04-27 17:03:03 UTC
Permalink
Post by Mario Ruggier
Almost... because EXISTS applies to a independent subquery, while the other
keywords are constraints on the current query. Would this be a possible source
of confusion?
What happens if we intermix these keywords, e.g.
ec.fetch('Writer', 'books.publisher.numPages>50
AND (books.publisher.bizName IN ["P1","P2"])
AND NOT EXISTS(books.publisher.bizName NOT IN ["P1","p2"] ) '
to get the authors that have books with at least 51 pages as well as having
published with both of P1 and P2 but with no one else. Would this work? There
would be a simpler way to do this?
oops, sorry, just realized that's pretty silly. Lousy example...
ec.fetch('Writer', 'books.publisher.numPages>50
AND (books.publisher.bizName IN ["P1","P2"])
AND NOT EXISTS(books.publisher.country NOT IN ["c1","c2"] ) '
or so...
Not a problem:) My idea of this is that everything should nicely play
one w/ the other, or we'll lose most of the feature, so we should be
able to intermix every keyword w/o problem. I'll play with this some
day for sure. To be more precise, in our case (a fetchspec) EXISTS do
apply to an independent query but this query is always bound to the
original one: look at this:

=# SELECT DISTINCT t0.ID, t0.LAST_NAME, t0.FIRST_NAME, t0.AGE,
-# t0.FK_WRITER_ID, t0.BIRTHDAY
-# FROM WRITER t0
-# WHERE NOT EXISTS (SELECT * FROM BOOK,WRITER,PUBLISHER
(# WHERE WRITER.ID=t0.ID
(# AND WRITER.ID=BOOK.FK_WRITER_ID
(# AND BOOK.FK_PUBLISHER=PUBLISHER.ID
(# AND PUBLISHER.BIZ_NAME!='P1');

the second WHERE clause binds the "independent query" to the first one
with WRITER.ID=t0.ID.

--> that's why I think this shouldn't be difficult to implement, but I
need to experiment with nested EXISTS clause (I do not have enonugh
sql knowledge to foresee the possible coming complications).


BTW I'll will be mostly off for a week, not sure if I'll read email
for that time, and I'll be back online for sure some day next
week.

-- Sébastien.
John Lenton
2004-04-21 20:13:05 UTC
Permalink
Post by Sebastien Bigaret
Yes, that's another approach indeed. But while I'm a fervent defender
for KeyValueCoding ;)) in such cases one should be careful. Here are
- if the database have a lot of books, the KVC approach will load them
all in memory: KVC traverses relationships at the object level, and
in this case this means fetching Books published by every Publisher,
then these Books' Authors. This is probably the worst case one can
get. That's probably why you were asking for a callable vFKP on a
class, weren't you?
- In the resulting lists, if an author has published N books for a
publisher, the corresponding list will receive N duplicates for that
author.
agreed, and I think I pointed out (I know I tried to point out) the
second one of these. However the first is what I understood was being
requested.

The way I think I'd *like* to solve the second one is along the lines of

ec.fetch('Writer.books.publisher', unique=True)

or, perhaps (if something sensible can be found to fill in the
ellipsis),

ec.fetch('Writer.books.publisher', groupBy='...')


The first could be ameliorated by having fetch return an iterator
instead of a list...

in fact...

if fetch returned something smarter than a list, it might be
possible to do something like

ec.fetch('Writer') - \
ec.fetch('Writer', 'books.publisher.bizName == "P1"')

this means ec.fetch returns a set, as in relational algebra[1] sets,
which you can operate on. These set objects wouldn't perform the
select on the database until you accessed them as sequences, so the
above wouldn't even touch the database until someone used the result
for something other than operating on other sets. Maybe 'fetch' is a
bad name for a method returning something that hasn't actually
fetched :)

If this falls in the 'intersting, but show me the code' category,
I'm afraid it'll have to wait :(
Post by Sebastien Bigaret
ec.fetch('Writer', 'books.publisher.numPages>50
AND (books.publisher.bizName IN ["P1","P2"])
AND NOT EXISTS(books.publisher.country NOT IN ["c1","c2"] ) '
well, that's the same as

ec.fetch('Writer', 'books.publisher.numPages > 50 \
AND books.publisher.bizName IN ["P1", "P2"]') - \
ec.fetch('Writer', 'books.publisher.country IN ["c1","c2"]')

right?


References:
1) http://www.cs.sfu.ca/CC/354/zaiane/material/notes/Chapter3/node7.html
--
John Lenton (***@vialibre.org.ar) -- Random fortune:
Above all else -- sky.
Sebastien Bigaret
2004-04-27 17:09:04 UTC
Permalink
Post by John Lenton
agreed, and I think I pointed out (I know I tried to point out) the
second one of these. However the first is what I understood was being
requested.
Seems like I misunderstood your post, sorry
Post by John Lenton
The way I think I'd *like* to solve the second one is along the lines of
ec.fetch('Writer.books.publisher', unique=True)
or, perhaps (if something sensible can be found to fill in the
ellipsis),
ec.fetch('Writer.books.publisher', groupBy='...')
The first could be ameliorated by having fetch return an iterator
instead of a list...
[+ more on "smarter lists", such as algebra sets]

Interesting idea indeed, I'm afraid i've not enough time to comment on this
now though. Should I forget about it, you can recall me of your comments in a
week when I'll be back online. Even if it sounds like dropping support for
py2.1 (not sure however), it seems bright enough to be worth the
investigation ;)

[do not misunderstannd me here, I do not plan to drop support for
py2.1 by now]
Post by John Lenton
1) http://www.cs.sfu.ca/CC/354/zaiane/material/notes/Chapter3/node7.html
Still did not find the time or energy :) to look at the reference, but I
promise we'll discuss this deeper.

-- Sébastien.

Aaron Freeman
2004-04-21 15:50:04 UTC
Permalink
First, thank you all for the fast and details replies; they are much
appreciated. ;-)

I've been playing around with the code that was posted, and while I still
don't have it working, it is getting closer.

First, I tried the Sebastien's code example (called extended.py here); this
is what I got:

jabartik extendedTest # python2.1 extended.py
Traceback (most recent call last):
File "extended.py", line 70, in ?
build(model, define_properties=1)
File "/usr/lib/python2.1/site-packages/Modeling/dynamic.py", line 168, in
build
add_properties(c, e)
File "/usr/lib/python2.1/site-packages/Modeling/dynamic.py", line 148, in
add_properties
prop=property(getattr(aClass, 'get'+part_func_name),
NameError: global name 'property' is not defined

I also tried loading (and tweaking for the added Publisher table) the
AuthorBooks example into the Zope ZModeler tool and then using it generate
the code skeletons and database schema; that worked just fine. I then tried
using these along with the code from Sebastien's example with the via the
Post by Sebastien Bigaret
from Modeling.EditingContext import EditingContext
ec=EditingContext()
from AuthorBooks.Publisher import Publisher
from AuthorBooks.Writer import Writer
from AuthorBooks.Book import Book
p1=Publisher()
p1.bizname="P1"
ec.insert(p1)
ec.saveChanges()
Traceback (most recent call last):
File "<stdin>", line 1, in ?
File "/usr/lib/python2.1/site-packages/Modeling/EditingContext.py", line
737, in saveChanges
self.validateChanges()
File "/usr/lib/python2.1/site-packages/Modeling/EditingContext.py", line
765, in validateChanges
self.objectForGlobalID(gID).validateForSave()
File "/usr/lib/python2.1/site-packages/Modeling/CustomObject.py", line
595, in validateForSave
error.finalize()
File "/usr/lib/python2.1/site-packages/Modeling/Validation.py", line 145,
in finalize
if self._dict: raise self
Modeling.Validation.ValidationException: Validation for key
OBJECT_WIDE_VALIDATION failed:
- Validation of object <AuthorBooks.Publisher.Publisher instance at
0x849e07c> as a whole failed
Validation for key bizname failed:
- Key is required but value is void

After some scratching, I realized that the setBizname method in
Publisher.py, below, isn't getting called.

def setBizname(self, bizname):
"Change the Publisher / bizname attribute value"
self.willChange()
self._bizname = bizname
Post by Sebastien Bigaret
dir(p1)
['_CustomObject__editingContext', '_bizname', '_books', '_location',
'_v_classDescription', 'bizname']

So the 'bizname' attribute is getting appended and self._bizname = bizname
isn't getting called, which causes the insert to die since bizname is set to
non-null in the database. Manually forcing it with a p1._bizname="xyz" and
then doing an insert works fine.

So, the questions:

1) Based on some archived posts, I was under the impression that Modeling
core supports Python 2.1 (I am using 2.1.3 since this is leading up to a
Zope 2.6.4 product); is that still true?

2) Do I need the Modeling.dynamic module? This appears to require python
2.2, but if the classes are being generated via an XML file or the ZModeler
tool, this shouldn't be required, right?

At any rate, thanks again for your time and help.

-Aaron
Sebastien Bigaret
2004-04-27 16:53:02 UTC
Permalink
Hi Aaron,
Post by Aaron Freeman
First, thank you all for the fast and details replies; they are much
appreciated. ;-)
I've been playing around with the code that was posted, and while I still
don't have it working, it is getting closer.
I'm sorry, I assumed py2.2+, not py2.1

We definitely still support py2.1, but of course then some of the
features are not available. For example, there's nothing like properties
in py2.1.

The code I posted should work given that:

- you replace p.bizname="x" w/ p.setBizname("o") and
p.bizname w/ the appropriate getter p.getBizname()

(this will make the ValidationException at saveChanges() disappear)
Post by Aaron Freeman
build(model, define_properties=0)
since py2.1 has no properties.


And that's obviously what you discovered on your own.
Post by Aaron Freeman
1) Based on some archived posts, I was under the impression that Modeling
core supports Python 2.1 (I am using 2.1.3 since this is leading up to a
Zope 2.6.4 product); is that still true?
To be clear: yes, it is.
Post by Aaron Freeman
2) Do I need the Modeling.dynamic module? This appears to require python
2.2, but if the classes are being generated via an XML file or the ZModeler
tool, this shouldn't be required, right?
You absolutely do not need too, right. I just use it (even though it's
not documented yet) since most people tends to think its the easiest way
(including me :) --but that does not mean that code generation, either
from the ZModeler or from mdl_generate_python_code.py, is deprecated.


-- Sébastien.
Loading...