Issue 5256 - encoding of guide.sdf
Summary: encoding of guide.sdf
Status: VERIFIED FIXED
Alias: None
Product: OpenLDAP
Classification: Unclassified
Component: documentation
Version: unspecified
Hardware: All All
Importance: --- normal
Target Milestone: ---
Assignee: OpenLDAP project
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2007-12-02 15:15 UTC by dieter@dkluenter.de
Modified: 2009-02-17 06:50 UTC

Description dieter@dkluenter.de 2007-12-02 15:15:12 UTC
Full_Name: Dieter Kluenter
Version: REL_ENG_2_4
OS: Linux
URL: ftp://ftp.openldap.org/incoming/
Submission from: (NULL) (84.142.202.173)


Hi,
the guide *.sdf files are UTF-8 encoded, but the HTML files available online
seem to be ISO 8859-1 encoded, which leads to display errors. One example is
section 9.2.1:
... proxy���s efficiency ...

-Dieter 

Comment 1 Gavin Henry 2007-12-02 20:09:56 UTC
<quote who="dieter@dkluenter.de">
> Full_Name: Dieter Kluenter
> Version: REL_ENG_2_4
> OS: Linux
> URL: ftp://ftp.openldap.org/incoming/
> Submission from: (NULL) (84.142.202.173)
>
>
> Hi,
> the guide *.sdf files are UTF-8 encoded, but the online available HTML
> files
> seem to be iso8859-1 encoded, which leads to display errors, an example is
> part
> 9.2.1.
> ... proxyâs efficiency ...

Is this something I can fix, or is it part of the doc build for the main
site?

>
> -Dieter
>
>
>

Comment 2 Howard Chu 2007-12-02 20:20:25 UTC
ghenry@suretecsystems.com wrote:
> <quote who="dieter@dkluenter.de">
>> Hi,
>> the guide *.sdf files are UTF-8 encoded, but the online available HTML
>> files
>> seem to be iso8859-1 encoded, which leads to display errors, an example is
>> part
>> 9.2.1.
>> ... proxy’s efficiency ...
> 
> Is this something I can fix, or is it part of the doc build for the main
> site?

It's in the SDF source files, therefore something you should fix. I don't know 
what text editor you used on those files, but you should use something else, 
or use one that you can set to just ISO8859-1.

I can't think of a grep regex that will detect 8-bit characters, but you need 
to find them all and get rid of them.
-- 
   -- Howard Chu
   Chief Architect, Symas Corp.  http://www.symas.com
   Director, Highland Sun        http://highlandsun.com/hyc/
   Chief Architect, OpenLDAP     http://www.openldap.org/project/

Comment 3 Howard Chu 2007-12-02 20:32:50 UTC
hyc@symas.com wrote:
> ghenry@suretecsystems.com wrote:
>> <quote who="dieter@dkluenter.de">
>>> Hi,
>>> the guide *.sdf files are UTF-8 encoded, but the online available HTML
>>> files
>>> seem to be iso8859-1 encoded, which leads to display errors, an example is
>>> part
>>> 9.2.1.
>>> ... proxy’s efficiency ...
>> Is this something I can fix, or is it part of the doc build for the main
>> site?
> 
> It's in the SDF source files, therefore something you should fix. I don't know
> what text editor you used on those files, but you should use something else,
> or use one that you can set to just ISO8859-1.
> 
> I can't think of a grep regex that will detect 8-bit characters, but you need
> to find them all and get rid of them.

(I used a binary editor to generate a regexp for the byte range [0x80-0xff].)
There were only two occurrences in the sdf files, both in backends.sdf. Both
are the same character: a typographic apostrophe (U+2019) used instead of an
ASCII single quote.

(binary patch xx)
X=`cat xx`
% grep "[$X]" *.sdf
backends.sdf:same connection. This connection pooling strategy can enhance the proxy’s
backends.sdf:maybe stored procedures can’t be considered programming, anyway ;).
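
For what it's worth, the binary-edited pattern file is not strictly needed. Below is a minimal sketch, assuming a grep that matches one byte per character when run in the C locale (current GNU and BSD greps generally do). The function name find_8bit is made up for illustration.

```shell
# find_8bit FILE...: print file:line:text for every line that contains a byte
# outside printable ASCII (space through tilde) other than tab.
# Assumption: in the C locale the range ' -~' covers exactly bytes 0x20-0x7e,
# so anything in 0x80-0xff (and stray control characters) is reported.
find_8bit() {
    tab=$(printf '\t')
    # /dev/null is listed first so grep prefixes the file name even when
    # only one real file is given.
    LC_ALL=C grep -n "[^ -~$tab]" /dev/null "$@"
}
```

Run it as `find_8bit *.sdf`. The exit status is 0 when at least one offending line was found and 1 when the files are clean.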

-- 
   -- Howard Chu
   Chief Architect, Symas Corp.  http://www.symas.com
   Director, Highland Sun        http://highlandsun.com/hyc/
   Chief Architect, OpenLDAP     http://www.openldap.org/project/

Comment 4 Kurt Zeilenga 2007-12-02 20:39:46 UTC
On Dec 2, 2007, at 8:10 PM, ghenry@suretecsystems.com wrote:

> <quote who="dieter@dkluenter.de">
>> Full_Name: Dieter Kluenter
>> Version: REL_ENG_2_4
>> OS: Linux
>> URL: ftp://ftp.openldap.org/incoming/
>> Submission from: (NULL) (84.142.202.173)
>>
>>
>> Hi,
>> the guide *.sdf files are UTF-8 encoded, but the online available
>> HTML
>> files
>> seem to be iso8859-1 encoded, which leads to display errors, an
>> example is
>> part
>> 9.2.1.
>> ... proxy’s efficiency ...
>
> Is this something I can fix, or is it part of the doc build for the  
> main
> site?


You can avoid using non-ASCII characters...


>
>
>>
>> -Dieter
>>
>>
>>
>
>

Comment 5 Hallvard Furuseth 2007-12-02 20:44:36 UTC
hyc@symas.com writes:
> I can't think of a grep regex that will detect 8-bit characters, but
> you need to find them all and get rid of them.

grep '[^ -~<tab>]'

where you type ^V tab to the shell to get the <tab>.

To search for probably-UTF-8 chars:

perl -ne '/[\300-\377][\200-\277]/ && print "$ARGV:$_"'

To search for probably-not-UTF-8 8-bit chars:

perl -ne '/[\300-\377](?![\200-\277])|(^|[^\200-\377])[\200-\277]/ && print "$ARGV:$_"'
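
Both patterns are heuristics (hence "probably"). For a strict yes/no answer per file, here is a small sketch that assumes an iconv utility is on the PATH. Converting from UTF-8 to UTF-8 should fail on any byte sequence that is not well-formed UTF-8. The name is_valid_utf8 is invented for this example.

```shell
# is_valid_utf8 FILE: succeed only if FILE decodes cleanly as UTF-8.
# Assumption: iconv exits non-zero when it meets an invalid input sequence.
is_valid_utf8() {
    iconv -f UTF-8 -t UTF-8 "$1" >/dev/null 2>&1
}
```

A file that contains 8-bit bytes and fails this check is most likely in a legacy single-byte encoding such as ISO 8859-1. Pure ASCII files pass, since ASCII is a subset of UTF-8.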

-- 
Regards,
Hallvard

Comment 6 Gavin Henry 2007-12-02 21:11:47 UTC
hyc@symas.com wrote:
> hyc@symas.com wrote:
>> ghenry@suretecsystems.com wrote:
>>> <quote who="dieter@dkluenter.de">
>>>> Hi,
>>>> the guide *.sdf files are UTF-8 encoded, but the online available HTML
>>>> files
>>>> seem to be iso8859-1 encoded, which leads to display errors, an example is
>>>> part
>>>> 9.2.1.
>>>> ... proxy’s efficiency ...
>>> Is this something I can fix, or is it part of the doc build for the main
>>> site?
>> It's in the SDF source files, therefore something you should fix. I don't know
>> what text editor you used on those files, but you should use something else,
>> or use one that you can set to just ISO8859-1.
>>
>> I can't think of a grep regex that will detect 8-bit characters, but you need
>> to find them all and get rid of them.
> 
> (Used a binary editor to generate a regexp for [ 0x80 - 0xff ]). There were 
> only two occurrences in the sdf files; both in backends.sdf. Both are the same 
> character, an apostrophe used instead of a single quote.
> 
> (binary patch xx)
> X=`cat xx`
> % grep "[$X]" *.sdf
> backends.sdf:same connection. This connection pooling strategy can enhance the proxy’s
> backends.sdf:maybe stored procedures can’t be considered programming, anyway ;).
> 

I use Vim and gedit. The characters above were probably copied and pasted from the FAQ.

I'll do a Vim trigger to clean up things like this before a commit/make.
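
The same cleanup could also be scripted outside the editor. This is a rough sketch, not the actual trigger, and it assumes the only offenders are the typographic single quotes U+2018 and U+2019 in UTF-8 form. The function name asciify_quotes is hypothetical.

```shell
# asciify_quotes: filter stdin to stdout, replacing the UTF-8 encodings of
# U+2018 and U+2019 (typographic single quotes) with an ASCII apostrophe.
# Assumption: no other non-ASCII characters need handling.
asciify_quotes() {
    lsq=$(printf '\342\200\230')   # U+2018 as UTF-8 bytes
    rsq=$(printf '\342\200\231')   # U+2019 as UTF-8 bytes
    # The C locale makes sed work on raw bytes.
    LC_ALL=C sed -e "s/$lsq/'/g" -e "s/$rsq/'/g"
}
```

For example, `asciify_quotes < backends.sdf > backends.sdf.new`, followed by a diff to review the result before replacing the original.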

Thanks all.

-- 
Kind Regards,

Gavin Henry.
Managing Director.

T +44 (0) 1224 279484
M +44 (0) 7930 323266
F +44 (0) 1224 824887
E ghenry@suretecsystems.com

Open Source. Open Solutions(tm).

http://www.suretecsystems.com/

Comment 7 dieter@dkluenter.de 2007-12-03 20:23:38 UTC
"Gavin Henry" <ghenry@suretecsystems.com> writes:

> <quote who="dieter@dkluenter.de">
>> Full_Name: Dieter Kluenter
>> Version: REL_ENG_2_4
>> OS: Linux
>> URL: ftp://ftp.openldap.org/incoming/
>> Submission from: (NULL) (84.142.202.173)
>>
>>
>> Hi,
>> the guide *.sdf files are UTF-8 encoded, but the online available HTML
>> files
>> seem to be iso8859-1 encoded, which leads to display errors, an example is
>> part
>> 9.2.1.
>> ... proxyâs efficiency ...
>
> Is this something I can fix, or is it part of the doc build for the main
> site?

That is up to whoever compiles the *.sdf files and puts the result online, so
I think it is part of the doc build process.

-Dieter

-- 
Dieter Klünter | Systemberatung
http://www.dkluenter.de
GPG Key ID:8EF7B6C6

Comment 8 Howard Chu 2007-12-05 04:53:05 UTC
moved from Incoming to Documentation
Comment 9 Gavin Henry 2008-02-19 21:15:38 UTC
>> Is this something I can fix, or is it part of the doc build for the main
>> site?
> 
> That is up to whoever compiles the *.sdf files and puts the result online, so
> I think it is part of the doc build process.
> 
> -Dieter
> 

So, build process or my mistake?

Comment 10 Gavin Henry 2008-04-16 08:14:51 UTC
changed state Open to Test
Comment 11 Gavin Henry 2008-08-27 20:13:55 UTC
changed state Test to Release
Comment 12 Gavin Henry 2008-11-24 19:36:36 UTC
changed state Release to Closed
Comment 13 Howard Chu 2009-02-17 06:50:38 UTC
moved from Documentation to Archive.Documentation