[egenix-users] cursor.rowcount abnormalities and performance
question
stuff at mailzilla.net
stuff at mailzilla.net
Wed Jul 30 13:51:21 CEST 2003
Thanks Marc-Andre, Steve and Jim for the quick responses to my questions.
I guess I shouldn't be too surprised that MS doesn't provide a mechanism
for retrieving the number of rows but it is rather comical that they do
provide 2 distinct values which you can at least (hopefully, correctly)
infer that there are 0 rows that match -or- at least 1 row that matches.
You'd think that they could go the proverbial extra mile and provide a
real value in the latter case, but then again, it would probably put a
huge load on SQL Server. Cynicism of MS aside, at least I can safely test
for the existence of data using cursor.rowcount != 0.
Interesting discussion about fetchone, fetchall and fetchmany. Seems
like fetchmany is the way to go if there is a large data set expectancy.
For smaller datasets it's more of a grey area but I suppose fetchall is
the easiest to implement.
If time permits I'll create a few test cases using all 3 methods to
retrieve small, medium and large rowsets with small, medium and large
column sets to see if I can make any general assumptions of which fetchXYZ
method to use based on the dataset expectancy.
Thanks again for the replies,
Phil
On Wed, 30 Jul 2003, M.-A. Lemburg wrote:
> stuff at mailzilla.net wrote:
> > Hello,
> >
> > When using mxODBC with the Windows SQL Server Driver I've noticed that
> > when performing a query with executedirect the cursor object has a
> > rowcount of -1 or 0 and nothing else. I am using Python 2.2 and the
> > mxODBC package that ships with BlackAdder.
> >
> > -1 seems to be the initial value of rowcount before queries are performed
> > but I'm unsure as to why this would remain the value after a query
> > considering that the count should be >= 0, at least fundamentally. In my
> > tests I have only seen values of 0 (when there are no records) and -1 when
> > there are some records-- in these cases, I am expecting this to be a
> > positive value representing the number of records that matched the query.
>
> -1 means "no information available".
>
> > If I do a fetch operation (i.e. fetchone) on the cursor that has a
> > rowcount of -1 I do indeed retrieve records. There seems to be a
> > disconnect between rowcount and being able to fetch records.
> >
> > Is SQL Server/ODBC sending over misleading info or is this a problem with
> > mxODBC?
>
> It's a general problem with the ODBC driver you are using. MS Access
> has similar problems. Both seem to not know how many rows there
> actually are in the result set until you try to fetch them all.
>
> Some ODBC drivers update the .rowcount information after the first
> fetch which is why mxODBC tries hard to update the information
> (it does so each time you request .rowcount until something
> other than -1 is returned).
>
> > I generally use rowcount to determine whether I need to retreive
> > records using fetchone inside a while loop (no need to execute the loop
> > if rowcount is 0). It's a little cleaner if I can reliably use rowcount
> > but if the value doesn't sync with the actual result set then I can safely
> > do something like this:
> >
> > while 1:
> > row = c.fetchone()
> > if not row: break
> > processTheRecord(row)
> >
> > If rowcount can be relied upon, I could do this, which is much cleaner:
> >
> > for i in range(c.rowcount):
> > row = c.fetchone
> > processTheRecord(row)
>
> True. In this case, however, you will have to complain to
> Microsoft for not providing the necessary information to the
> application using their ODBC driver :-/
>
> > Performance question- Since I need to perform some manipulation on each
> > record I believe it is more efficient to perform a while loop and
> > fetchone rather than getting everything via fetchall and then iterating
> > over that result (since theoretically, fetchall would be iterating to get
> > each record so technically speaking that would require iterating twice).
> > If this assumption is false, please let me know. Perhaps there are some
> > internal ODBC/mxODBC optimizations that allow fetchall to actually perform
> > much faster than several fetchone's. My concern is that if fetchall
> > iterates through the result set to produce the list of rows and then I am
> > iterating through the fetched list then I can accomplish the same goal w/
> > 1 less iteration via a series of fetchone's. Also, I imagine it would be
> > safer to process each record individually then fetching the entire result
> > set at once in terms of system resource utilization (i.e. if the result
> > set were millions of records there would need to be data structs allocated
> > to hold them).
> >
> > I'm curious to see if my reluctance to use fetchall is well-founded or
> > overly paranoid.
>
> This is hard to tell: by using .fetchall() you save quite
> a few Python function calls which are costly, .fetchone()
> OTOH save memory. You might even be better off by using
> .fetchmany() and process e.g. 10 rows at a time.
>
> I'd suggest you write a short test script and time the results.
>
> > Thanks for any info,
>
>
More information about the egenix-users
mailing list