Issue 6347 - JLDAP: LDIFReader should use UTF-8 character set instead of US-ASCII
Status: UNCONFIRMED
Alias: None
Product: JLDAP
Classification: Unclassified
Component: JLDAP
Version: unspecified
Hardware: All All
Importance: --- normal
Target Milestone: ---
Assignee: OpenLDAP project
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2009-10-23 09:07 UTC by sbrys@novell.com
Modified: 2020-04-30 18:56 UTC (History)
Description sbrys@novell.com 2009-10-23 09:07:27 UTC
Full_Name: 
Version: 
OS: 
URL: ftp://ftp.openldap.org/incoming/
Submission from: (NULL) (81.240.252.176)


The JLDAP com.novell.ldap.LDIFWriter class uses the UTF-8 character set when
writing LDIF content to output streams. However, the com.novell.ldap.LDIFReader
class uses the US-ASCII character set to read LDIF content from input streams.
The LDIFReader class should also use UTF-8, so that LDIF files containing
non-ASCII characters can be read successfully.

Proposed patch for

   http://www.openldap.org/devel/cvsweb.cgi/~checkout~/com/novell/ldap/util/LDIFReader.java?rev=1.42&cvsroot=JLDAP&hideattic=1&sortbydate=0

-----------------------/snip/-----------------------
--- LDIFReader.java.ori	2009-10-23 10:56:00.000000000 +0200
+++ LDIFReader.java	2009-10-23 10:56:35.000000000 +0200
@@ -117,7 +117,7 @@
         }
 
         setVersion( version );
-        InputStreamReader isr = new InputStreamReader(in, "US-ASCII");
+        InputStreamReader isr = new InputStreamReader(in, "UTF-8");
         bufReader = new BufferedReader(isr);
 
         // In order to determine if it is a LDIF content file or LDIF change
-----------------------/snip/-----------------------
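To illustrate the effect of the one-line change, here is a minimal standalone sketch. It is not JLDAP code: the class LdifCharsetDemo, the helper readFirstLine, and the sample line are made up for illustration. It only mirrors the InputStreamReader/BufferedReader pair from the patch, feeding the same UTF-8 bytes through both charsets.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of JLDAP.
final class LdifCharsetDemo {

    // Reads the first line of the stream with the given charset name,
    // using the same reader stack as the patched constructor.
    static String readFirstLine(InputStream in, String charsetName) {
        try {
            InputStreamReader isr = new InputStreamReader(in, charsetName);
            BufferedReader bufReader = new BufferedReader(isr);
            return bufReader.readLine();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Sample attribute line with one non-ASCII character (e with diaeresis).
        String line = "cn: Zo\u00eb";
        byte[] utf8 = line.getBytes(StandardCharsets.UTF_8);

        String asAscii = readFirstLine(new ByteArrayInputStream(utf8), "US-ASCII");
        String asUtf8 = readFirstLine(new ByteArrayInputStream(utf8), "UTF-8");

        // The US-ASCII decoder cannot represent the multi-byte sequence,
        // so the non-ASCII character is lost.
        System.out.println(line.equals(asAscii)); // false
        // The UTF-8 decoder round-trips the original text.
        System.out.println(line.equals(asUtf8));  // true
    }
}
```

The ASCII part of the line survives in both cases, which is presumably why the problem only shows up with non-ASCII attribute values.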
Comment 1 Howard Chu 2009-11-30 04:44:26 UTC
moved from Incoming to Contrib
Comment 2 drmuey+github 2020-04-30 18:30:19 UTC
Would this make it so that non-ASCII strings are not base64-encoded?
Comment 3 Howard Chu 2020-04-30 18:56:14 UTC
(In reply to drmuey+github from comment #2)
> Would this make it so that non-ASCII strings are not base64-encoded?

Changing how the reader works has no bearing on whether the writer does base64 encoding. Your question makes no sense.

The LDIF spec says the input values may be in UTF-8 or base64-encoded, so this is a legitimate bug that should be fixed.
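
For reference, the two value forms mentioned above can be sketched as follows. This is a simplified illustration, not how LDIFReader is implemented: the class LdifValueDemo and the method decodeValue are hypothetical, the input is assumed to be a single unfolded attribute line, the ":<" URL form is ignored, and a base64 value is assumed to hold UTF-8 text rather than arbitrary binary data.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper, not part of JLDAP.
final class LdifValueDemo {

    // Returns the value of one unfolded LDIF attribute line.
    // "attr: value" carries the value as text; "attr:: value" carries it base64-encoded.
    static String decodeValue(String ldifLine) {
        int colon = ldifLine.indexOf(':');
        if (colon < 0) {
            throw new IllegalArgumentException("not an attribute line: " + ldifLine);
        }
        boolean base64 = ldifLine.startsWith("::", colon);
        int start = colon + (base64 ? 2 : 1);
        // Skip the optional spaces between the separator and the value.
        while (start < ldifLine.length() && ldifLine.charAt(start) == ' ') {
            start++;
        }
        String raw = ldifLine.substring(start);
        if (!base64) {
            return raw;
        }
        // Assumption: the decoded bytes are UTF-8 text.
        byte[] bytes = Base64.getDecoder().decode(raw);
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String name = "Zo\u00eb";
        String encoded = Base64.getEncoder()
                .encodeToString(name.getBytes(StandardCharsets.UTF_8));

        // Both forms yield the same value once the line has been read as UTF-8.
        System.out.println(name.equals(decodeValue("cn: " + name)));     // true
        System.out.println(name.equals(decodeValue("cn:: " + encoded))); // true
    }
}
```

The plain form only works if the reader decodes the stream as UTF-8 in the first place; the base64 form is pure ASCII on the wire and is unaffected by the reader's charset.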