From jan.murre at catalyz.nl  Tue Sep 26 16:08:53 2017
From: jan.murre at catalyz.nl (Jan Murre)
Date: Tue Sep 26 15:09:00 2017
Subject: [egenix-users] Data truncation by Microsoft ODBC driver for NVARCHAR
Message-ID:

Hi,

I am querying a MS SQL database from Red Hat Linux using the "Microsoft
ODBC Driver 13 for SQL Server".

There is an NVARCHAR(30) field in our database that is filled with data
having a 2-byte UTF-8 character in the last position. When querying, the
ODBC driver issues this warning:

mx.ODBC.Error.Warning: ('01004', 0, '[Microsoft][ODBC Driver 13 for SQL Server]String data, right truncation', 8668)

This results in corrupted data in the result set, because only the first
byte of this 2-byte UTF-8 character ends up in the column.

I tried several settings for 'connection.encoding' and
'connection.stringformat', but without success.

Is this an ODBC driver issue? Would it be possible to work around this
with certain settings of mxODBC?

Regards, Jan

From jan.murre at catalyz.nl  Tue Sep 26 16:46:32 2017
From: jan.murre at catalyz.nl (Jan Murre)
Date: Tue Sep 26 15:46:38 2017
Subject: [egenix-users] Re: Data truncation by Microsoft ODBC driver for NVARCHAR
In-Reply-To:
References:
Message-ID:

An issue in the PHP driver from Microsoft that seems related:

https://github.com/Microsoft/msphpsql/issues/231

On Tue, Sep 26, 2017 at 3:08 PM, Jan Murre wrote:
> Hi,
>
> I am querying a MS SQL database from Red Hat Linux using the "Microsoft
> ODBC Driver 13 for SQL Server".
>
> There is an NVARCHAR(30) field in our database that is filled with data
> having a 2-byte UTF-8 character in the last position.
> When querying, the ODBC driver issues this warning:
>
> mx.ODBC.Error.Warning: ('01004', 0, '[Microsoft][ODBC Driver 13 for SQL
> Server]String data, right truncation', 8668)
>
> This results in corrupted data in the result set, because only the first
> byte of this 2-byte UTF-8 character ends up in the column.
>
> I tried several settings for 'connection.encoding' and
> 'connection.stringformat', but without success.
>
> Is this an ODBC driver issue? Would it be possible to work around this
> with certain settings of mxODBC?
>
> Regards, Jan

From info at egenix.com  Tue Sep 26 18:08:33 2017
From: info at egenix.com (eGenix Team: M.-A. Lemburg)
Date: Tue Sep 26 17:08:52 2017
Subject: [egenix-users] ANN: Python Meeting Düsseldorf - 27.09.2017
Message-ID: <0d939a54-0f9c-2fec-c625-74ad2ba10508@egenix.com>

[This announcement was originally written in German since it targets
 a local user group meeting in Düsseldorf, Germany]

________________________________________________________________________

ANNOUNCEMENT

Python Meeting Düsseldorf

http://pyddf.de/

A meeting of Python enthusiasts and interested people
in a relaxed atmosphere.

Wednesday, 27.09.2017, 18:00
Room 1, 2nd floor, Bürgerhaus Stadtteilzentrum Bilk
Düsseldorfer Arcaden, Bachstr. 145, 40217 Düsseldorf

This message is also available online:
http://www.egenix.com/company/news/Python-Meeting-Duesseldorf-2017-09-27
________________________________________________________________________

NEWS

 * Talks already registered:

   Dr. Uwe Ziegenhagen
       "Datenanalyse mit Python pandas" (Data analysis with Python pandas)

   Charlie Clark
       "Typ-Systeme in Python" (Type systems in Python)

   Further talks are welcome and can be registered at info@pyddf.de.
   However, there will probably be no room left at this meeting, only
   at the next one on 27.09.2017.
 * Start time and location:

   We meet at 18:00 at the Bürgerhaus in the Düsseldorfer Arcaden.

   The Bürgerhaus shares its entrance with the swimming pool and is
   located on the side of the Düsseldorfer Arcaden where the underground
   car park entrance is.

   Above the entrance there is a large "Schwimm' in Bilk" logo. Behind
   the door, turn directly left to the two elevators, then go up to the
   2nd floor. The entrance to Room 1 is directly on the left as you
   leave the elevator.

   Google Street View: http://bit.ly/11sCfiw
________________________________________________________________________

INTRODUCTION

The Python Meeting Düsseldorf is a regular event in Düsseldorf aimed at
Python enthusiasts from the region:

 * http://pyddf.de/

Our YouTube channel, where we publish the talks after the meetings,
gives a good overview of the talks:

 * http://www.youtube.com/pyddf/

The meeting is organized by eGenix.com GmbH, Langenfeld, in cooperation
with Clark Consulting & Research, Düsseldorf:

 * http://www.egenix.com/
 * http://www.clark-consulting.eu/
________________________________________________________________________

PROGRAM

The Python Meeting Düsseldorf uses a mix of (lightning) talks and open
discussion.

Talks can be registered in advance or brought in spontaneously during
the meeting. A projector with XGA resolution is available.

To register a (lightning) talk, please send an informal email to
info@pyddf.de
________________________________________________________________________

COST SHARING

The Python Meeting Düsseldorf is organized by Python users for Python
users.

To recover at least part of the costs, we ask participants for a
contribution of EUR 10.00 incl. 19% VAT. School and university students
pay EUR 5.00 incl. 19% VAT.

We would like to ask all participants to bring the amount in cash.
________________________________________________________________________

REGISTRATION

Since we only have seating for about 20 people, we ask you to register
by email. This does not commit you to anything, but it makes planning
easier for us.

To register for the meeting, please send an informal email to
info@pyddf.de
________________________________________________________________________

FURTHER INFORMATION

Further information is available on the meeting's website:

http://pyddf.de/

Kind regards,
-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Experts (#1, Sep 26 2017)
>>> Python Projects, Coaching and Consulting ... http://www.egenix.com/
>>> Python Database Interfaces ... http://products.egenix.com/
>>> Plone/Zope Database Interfaces ... http://zope.egenix.com/
________________________________________________________________________

::: We implement business ideas - efficiently in both time and costs :::

eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48
D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
Registered at Amtsgericht Duesseldorf: HRB 46611
http://www.egenix.com/company/contact/
http://www.malemburg.com/

From mal at egenix.com  Wed Sep 27 00:06:11 2017
From: mal at egenix.com (M.-A. Lemburg)
Date: Tue Sep 26 23:06:19 2017
Subject: [egenix-users] Data truncation by Microsoft ODBC driver for NVARCHAR
In-Reply-To:
References:
Message-ID: <93e5d91c-4e56-50e4-a393-e46beeabe605@egenix.com>

Hi Jan,

just to clarify: you have the field filled with 29 characters
and the last one is a character which needs a two-byte UTF-8
representation?

You may want to try setting the connection's .stringformat
to NATIVE_UNICODE_STRINGFORMAT. This will result in mxODBC
requesting data as Unicode.
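
A minimal sketch of the setting suggested above, assuming the connection
was opened through one of the mxODBC subpackages and that this subpackage
exposes the NATIVE_UNICODE_STRINGFORMAT constant. The helper name
enable_native_unicode and its odbc_module parameter are made up for
illustration; only .stringformat and NATIVE_UNICODE_STRINGFORMAT are taken
from this thread.

```python
def enable_native_unicode(connection, odbc_module):
    """Ask mxODBC to fetch string data as Unicode on this connection.

    Hypothetical helper: `odbc_module` is assumed to be the mxODBC
    subpackage that created `connection` and to expose the
    NATIVE_UNICODE_STRINGFORMAT constant.
    """
    connection.stringformat = odbc_module.NATIVE_UNICODE_STRINGFORMAT
    return connection


# Usage sketch (not executed here; subpackage and DSN are placeholders):
#   import mx.ODBC.unixODBC as odbc
#   connection = odbc.DriverConnect("DSN=mydsn;UID=user;PWD=secret")
#   enable_native_unicode(connection, odbc)
```

Passing the module in keeps the helper independent of which subpackage
is installed on a given platform.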
However, please note that your specific case may also be
a bug in the driver, since ODBC drivers often use UTF-8 strings
internally to store Unicode data and then "forget" to
adjust the buffer lengths to accommodate the increase
in size when the strings have multi-byte representations.

If you could provide an example and the specific driver and database
versions, we can try to replicate the problem.

Thanks,
-- 
Marc-Andre Lemburg
eGenix.com

On 26.09.2017 15:08, Jan Murre wrote:
> Hi,
>
> I am querying a MS SQL database from Red Hat Linux using the "Microsoft
> ODBC Driver 13 for SQL Server".
>
> There is an NVARCHAR(30) field in our database that is filled with data
> having a 2-byte UTF-8 character in the last position. When querying, the
> ODBC driver issues this warning:
>
> mx.ODBC.Error.Warning: ('01004', 0, '[Microsoft][ODBC Driver 13 for SQL
> Server]String data, right truncation', 8668)
>
> This results in corrupted data in the result set, because only the first
> byte of this 2-byte UTF-8 character ends up in the column.
>
> I tried several settings for 'connection.encoding' and
> 'connection.stringformat', but without success.
>
> Is this an ODBC driver issue? Would it be possible to work around this with
> certain settings of mxODBC?
>
> Regards, Jan
>
> _______________________________________________________________________
> eGenix.com User Mailing List http://www.egenix.com/
> https://www.egenix.com/mailman/listinfo/egenix-users

From mal at egenix.com  Wed Sep 27 11:02:09 2017
From: mal at egenix.com (M.-A. Lemburg)
Date: Wed Sep 27 10:02:19 2017
Subject: [egenix-users] Data truncation by Microsoft ODBC driver for NVARCHAR
In-Reply-To:
References: <93e5d91c-4e56-50e4-a393-e46beeabe605@egenix.com>
Message-ID: <7808fbb3-75fa-66a4-55e8-687c65d6d6c6@egenix.com>

Hi Jan,

I'm glad this fixes your problem. We've seen several such issues
with ODBC drivers before.

In general, using NATIVE_UNICODE_STRINGFORMAT is the best way to go,
since this causes the fewest surprises (and raises errors early in Python
when e.g. sending data to the database which cannot be properly encoded).

The only reason we're not making this the default is backwards
compatibility.

Thanks,
-- 
Marc-Andre Lemburg
eGenix.com

On 27.09.2017 09:43, Jan Murre wrote:
> Hi Marc-Andre,
>
> The situation is that the NVARCHAR(30) field is filled with this string
> "GUARANTEE 2 mobilhomes côte ? ". That is 30 characters.
> When it comes back from a cursor.fetchall() the resulting string is:
>
> "GUARANTEE 2 mobilhomes c\xc3\xb4te \xc3".
>
> So the UTF-8 characters are in the result, but the result has been
> wrongly truncated to 30 bytes.
>
> I tried "NATIVE_UNICODE_STRINGFORMAT" and that gives me correct
> results!
> I thought I had tried this already, but maybe not in the right
> combination with other parameters.
> I am glad this fixes our problem!
>
> However, the truncation may still be some kind of bug in the driver.
> The driver being used is the native MS SQL 13.0 (and also 13.1) driver
> for Linux (RHEL).
>
> Regards, Jan
>
> On Tue, Sep 26, 2017 at 11:06 PM, M.-A. Lemburg wrote:
>
> Hi Jan,
>
> just to clarify: you have the field filled with 29 characters
> and the last one is a character which needs a two-byte UTF-8
> representation?
>
> You may want to try setting the connection's .stringformat
> to NATIVE_UNICODE_STRINGFORMAT. This will result in mxODBC
> requesting data as Unicode.
>
> However, please note that your specific case may also be
> a bug in the driver, since ODBC drivers often use UTF-8 strings
> internally to store Unicode data and then "forget" to
> adjust the buffer lengths to accommodate the increase
> in size when the strings have multi-byte representations.
>
> If you could provide an example and the specific driver and database
> versions, we can try to replicate the problem.
>
> Thanks,
> --
> Marc-Andre Lemburg
> eGenix.com
>
> Professional Python Services directly from the Experts (#1, Sep 26 2017)
> >>> Python Projects, Coaching and Consulting ... http://www.egenix.com/
> >>> Python Database Interfaces ... http://products.egenix.com/
> >>> Plone/Zope Database Interfaces ... http://zope.egenix.com/
> ________________________________________________________________________
>
> ::: We implement business ideas - efficiently in both time and costs :::
>
> eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48
> D-40764 Langenfeld, Germany. CEO Dipl.-Math.
> Marc-Andre Lemburg
> Registered at Amtsgericht Duesseldorf: HRB 46611
> http://www.egenix.com/company/contact/
> http://www.malemburg.com/
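
A minimal sketch of the byte-versus-character mismatch discussed in this
thread, using only plain Python and no database or driver. The sample
value follows Jan's report; the second accented character is garbled in
the archive, so "à" is an assumption here. The 30-byte limit is a
hypothetical stand-in for a buffer sized in bytes rather than characters,
and does not reproduce the driver itself.

```python
# Sample value from the thread: 30 characters, two of them non-ASCII.
# ASSUMPTION: the second accented character is "à" (U+00E0); the archive
# only shows that its UTF-8 encoding starts with the byte 0xC3.
value = u"GUARANTEE 2 mobilhomes c\u00f4te \u00e0 "

# NVARCHAR(30) counts characters, but UTF-8 needs two bytes for each of
# the accented characters, so the encoded form is longer than 30 bytes.
encoded = value.encode("utf-8")

# Hypothetical stand-in for a buffer sized in bytes instead of characters.
BUFFER_BYTES = 30
truncated = encoded[:BUFFER_BYTES]


def decodes_cleanly(raw):
    """Return True if raw is complete, valid UTF-8."""
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


print(len(value), len(encoded))      # characters vs. bytes
print(repr(truncated[-4:]))          # tail of the truncated buffer
print(decodes_cleanly(truncated))    # a dangling lead byte breaks decoding
```

The truncated buffer ends in a lone lead byte, which matches the trailing
\xc3 in the string Jan got back from cursor.fetchall().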