Windows 7 Forums



˙ūK in text files

 
 
Todd
 
      06-24-2011
Hi All,

I asked about this once before and found the answer but the
article had expired. :-(

I can open certain text files in notepad. If I open them
in Leafpad (A Linux text editor), all I get is "˙ūK".

Would the kind individual who told me which encoding was missing
please remind me so I can ask the Leafpad folks to include
it?

Many thanks,
-T

Funny thing, I open the file in Hexedit and it looks just fine.
Hmmmm.

For the time being, I am stuck with Notepad in Wine.
(I would use Notepad in Wine a lot more, except for the bug
where the Open With path leaves off the first two letters of
the path. I reported it to Wine.)
 
 
Sunny Bard
 
      06-24-2011
Todd wrote:

> I can open certain text files in notepad. If I open them
> in Leafpad (A Linux text editor), all I get is "˙ūK".


Sounds like some of your text files use Unicode character encoding, but
Leafpad doesn't handle them, only your 8-bit ASCII files ...

 
 
Paul
 
      06-24-2011
Todd wrote:
> Hi All,
>
> I asked about this once before and found the answer but the
> article had expired. :-(
>
> I can open certain text files in notepad. If I open them
> in Leafpad (A Linux text editor), all I get is "˙ūK".
>
> Would the kind individual who told me which encoding was missing
> please remind me so I can ask the Leafpad folks to include
> it?
>
> Many thanks,
> -T
>
> Funny thing, I open the file in Hexedit and it looks just fine.
> Hmmmm.
>
> For the time being, I am stuck with Notepad in Wine.
> (I would use Notepad in Wine a lot more, except for the bug
> where the Open With path leaves off the first two letters of
> the path. I reported it to Wine.)


When you have a URL in your bookmarks where the original
article has been deleted, take the URL here and try the
archive. If the original web site uses "No Robots", the site cannot
be archived. But if the web site is a regular one, you can
go back several years and still read the original page.

http://www.archive.org

I'm finding fewer matches on that site, than I used to,
and I don't know exactly what that means. The archive.org
server has room for (5500) 1TB disks, and you'd think they'd
never have to throw out old content.

Paul
 
 
Todd
 
      06-24-2011
On 06/23/2011 08:33 PM, Sunny Bard wrote:
> Todd wrote:
>
>> I can open certain text files in notepad. If I open them
>> in Leafpad (A Linux text editor), all I get is "˙ūK".

>
> Sounds like some of your text files use Unicode character encoding, but
> Leafpad doesn't handle them, only your 8-bit ASCII files ...
>


Hi Sunny,

If memory serves me, and I don't think it does at the moment,
I think it is Unicode 16 or some such.

-T
 
 
Jeff Layman
 
      06-24-2011
On 24/06/2011 04:27, Todd wrote:
> Hi All,
>
> I asked about this once before and found the answer but the
> article had expired. :-(
>
> I can open certain text files in notepad. If I open them
> in Leafpad (A Linux text editor), all I get is "˙ūK".
>
> Would the kind individual who told me which encoding was missing
> please remind me so I can ask the Leafpad folks to include
> it?
>
> Many thanks,
> -T
>
> Funny thing, I open the file in Hexedit and it looks just fine.
> Hmmmm.
>
> For the time being, I am stuck with Notepad in Wine.
> (I would use Notepad in Wine a lot more, except for the bug
> where the Open With path leaves off the first two letters of
> the path. I reported it to Wine.)


Just wondering why you chose to post this in a Win7 newsgroup. Wouldn't
it be better to post in a linux group?

If you got the answer previously, would it be available through Google
Groups?

--

Jeff
 
 
Todd
 
      06-24-2011
On 06/24/2011 12:57 AM, Jeff Layman wrote:
> On 24/06/2011 04:27, Todd wrote:
>> Hi All,
>>
>> I asked about this once before and found the answer but the
>> article had expired. :-(
>>
>> I can open certain text files in notepad. If I open them
>> in Leafpad (A Linux text editor), all I get is "˙ūK".
>>
>> Would the kind individual who told me which encoding was missing
>> please remind me so I can ask the Leafpad folks to include
>> it?
>>
>> Many thanks,
>> -T
>>
>> Funny thing, I open the file in Hexedit and it looks just fine.
>> Hmmmm.
>>
>> For the time being, I am stuck with Notepad in Wine.
>> (I would use Notepad in Wine a lot more, except for the bug
>> where the Open With path leaves off the first two letters of
>> the path. I reported it to Wine.)

>
> Just wondering why you chose to post this in a Win7 newsgroup. Wouldn't
> it be better to post in a linux group?


Actually no. The offending file came from W7's regedit. No one
over on the Linux group would know what I am talking about.

A note about my office server/workstation: it runs Linux. I run
several virtual machines to support the various other OSes
that my customers use: XP, Vista, W7, Fedora, and others.
When I am doing my own work, I try to stay in the host; the VMs
are slower, although XP does a good job of keeping up.

I was modifying a .reg file for a customer that I had exported
from W7 and was annoyed that my favorite Linux text editor
(Leafpad) wouldn't read the darned thing. So I am asking this group again
what the encoding is called so I can ask the Leafpad guys
to support it. Funny thing about Open Source: Open Office
(you all should switch to Libre Office ASAP) being the exception,
if you write a well-documented, respectful letter to the author,
you usually get what you want.

I mention my full setup, Linux and all, because I thought it
was best to disclose everything that was going on, in case
others knew of something I was missing. Though, when the
troll/evangelists find out you are using other OSes as well
(XP, Linux, etc.), you do run the risk of getting snotted on.
But these trolls' knowledge is usually very pedestrian, so
they are never very helpful anyway. And the real deals (non-posers
and other experts) don't care. Or they just ask me what
is going on, like you did, like a professional.

A tip on getting rid of the troll/evangelists: abbreviate
Microsoft as M$ enough times and they will eventually killfile
you. Then you are left with helpful folks -- snot free.
And the trolls seldom add to the knowledge of mankind, other
than to say that M$ can do no wrong.


>
> If you got the answer previously, would it be available through Google
> Groups?
>


For some reason, I have never been able to find my posting to
this group over on Google. Other groups, but not this one.
Do you have a tip on this?

Many thanks,
-T
 
 
Paul
 
      06-25-2011
Todd wrote:

>
> For some reason, I have never been able to find my posting to
> this group over on Google. Other groups, but not this one.
> Do you have a tip on this?
>
> Many thanks,
> -T


Alt.windows7.general is a "new" group, in terms of date created.
It is carried on AIOE and Eternal-September, and probably servers
like them.

Google, on the other hand, is deaf-dumb-blind. They don't have
an effective "abuse" address. The only way alt.windows7.general
would get added to their archive, is if a valid, signed, newgroup
request (server to server messaging) of some sort was received.
And because Google is clueless, we don't even know if anyone monitors
that stuff or not. There aren't any external signs of intelligence
at Google.

Not just any "newgroup" command will work, because in the past,
hundreds of thousands of them have been created, to the point
server admins just ignored them. (When they can't be authenticated.)
It means alternatives have to be used, to manage groups.

This also caused problems for microsoft.* , because it wasn't
created by normal server to server messaging. A guy used to "fake"
the necessary messages, to make it look like Microsoft was
managing their connection to USENET. When Microsoft shut down
their own USENET server, they had the option of emitting
a couple thousand "rmgroup" messages, to cause other server
admins to consider deleting all those groups from their servers.
At least one server administrator claimed, if a valid signed
set of messages had been received, he would have considered
removing microsoft.* . As you can see, microsoft.* still
exists, and as far as I know, Google is still archiving it.

So there are proper ways to do things, and a lot of "epic fails"
along the way. And it takes a public presence (working "abuse"
address, or server admin address to send requests to), to make
a properly functioning USENET operation.

*******

Based on your description, that this is a text file from Regedit,
I was able to recreate the condition here. I tested creation of
both .txt and .reg , by exporting from Windows 7 regedit. I used
a hexeditor, to examine the file, which is where I got the hex
code from (the 0xFE thing).

The sequence in the .txt is actually

0xFF 0xFE K e y N a m e

implying this is a sixteen-bit-wide text encoding. And the first
two bytes are a declaration of the encoding.

In the .reg file I can see

0xFF 0xFE W i n d o w s R e g i s t r y E d i t o r

so the same thing is happening.
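
To make the byte layout above concrete, here is a minimal Python sketch. It assumes the export is little-endian UTF-16 with a leading byte order mark, which is what the 0xFF 0xFE prefix suggests; the sample string and variable names are only for illustration. The last step models an editor that treats the data as an 8-bit, NUL-terminated string, which is an assumption about how a non-UTF-16 editor might behave, not a statement about Leafpad's internals.

```python
import codecs

# Illustrative sample: the first word seen in the exported .txt file.
text = "KeyName"

# Prepend the little-endian BOM, then encode each character as two bytes.
data = codecs.BOM_UTF16_LE + text.encode("utf-16-le")

# Space-separated hex dump of the first eight bytes.
print(" ".join(f"{b:02x}" for b in data[:8]))  # ff fe 4b 00 65 00 79 00

# A reader that stops at the first zero byte keeps only BOM + "K".
visible = data.split(b"\x00", 1)[0]
print(visible)  # b'\xff\xfeK'
```

Those three surviving bytes, rendered in whatever 8-bit code page the editor falls back to, would be consistent with a three-character result that ends in "K".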

Notepad seems to be aware of this encoding, which is why
everything appears "normal". It is even possible, if you
installed WINE on the Linux box, it comes with a
"Notepad" lookalike, which likely supports whatever encoding
that is as well. (Just the basic WINE install, should
give you a working Notepad lookalike, without actually
copying a Notepad over from elsewhere.)

And here is the answer, with regard to the encoding -

http://en.wikipedia.org/wiki/Byte_order_mark

"The byte order mark (BOM) is a Unicode character used to
signal the endianness (byte order) of a text file or stream.
Its code point is U+FEFF. BOM use is optional, and, if used,
should appear at the start of the text stream. Beyond its specific
use as a byte-order indicator, the BOM character may also indicate
which of the several Unicode representations the text is encoded in."
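
As a rough illustration of the "which representation" part of that quote, the sketch below checks the first bytes of a file against the BOM constants in Python's standard `codecs` module. The function name `sniff_bom` and the table layout are hypothetical, chosen for this example only.

```python
import codecs

# Known BOM byte sequences, checked longest first so that UTF-32-LE
# (FF FE 00 00) is not mistaken for UTF-16-LE (FF FE).
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]

def sniff_bom(head: bytes):
    """Return (encoding, bom_length) for a leading BOM, or (None, 0)."""
    for bom, name in _BOMS:
        if head.startswith(bom):
            return name, len(bom)
    return None, 0

# The first four bytes of the exported .reg file described above.
print(sniff_bom(b"\xff\xfeW\x00"))  # ('utf-16-le', 2)
```

A file with no BOM returns `(None, 0)`, in which case the encoding has to be guessed some other way. Note also that a UTF-16-LE file whose first character is U+0000 would look like UTF-32-LE to this check; that corner case is ignored here.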

HTH,
Paul
 
 
Dave "Crash" Dummy
 
      06-25-2011
Todd wrote:
>> Just wondering why you chose to post this in a Win7 newsgroup.
>> Wouldn't it be better to post in a linux group?

>
> Actually no. The offending file came from W7's regedit. No one over
> on the Linux group would know what I am talking about.


There is your answer. Windows 7 Regedit exports files in 16-bit Unicode
(UTF-16) format, probably to accommodate the many languages Windows supports. You
need to convert the files to 8-bit ANSI or use a Linux-compatible text
editor that will accommodate Unicode.
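
One way the conversion might be done is sketched below in Python. It is a sketch under stated assumptions: the input starts with a UTF-16 byte order mark, and UTF-8 is used as the 8-bit target instead of an ANSI code page so that no characters are lost (a substitution for the "8-bit ANSI" suggestion, not the same thing). The function names are hypothetical.

```python
def reg_to_utf8(src: str, dst: str) -> None:
    """Rewrite a BOM-prefixed UTF-16 file as UTF-8 (no BOM)."""
    # The "utf-16" codec reads the BOM, picks the byte order, and drops it.
    # newline="" leaves the CRLF line endings untouched.
    with open(src, "r", encoding="utf-16", newline="") as f:
        text = f.read()
    with open(dst, "w", encoding="utf-8", newline="") as f:
        f.write(text)

def utf8_to_reg(src: str, dst: str) -> None:
    """Convert an edited UTF-8 file back to little-endian UTF-16 with a BOM."""
    with open(src, "r", encoding="utf-8", newline="") as f:
        text = f.read()
    # "utf-16-le" never emits a BOM on its own, so write U+FEFF explicitly.
    with open(dst, "w", encoding="utf-16-le", newline="") as f:
        f.write("\ufeff" + text)
```

For example, `reg_to_utf8("export.reg", "export-utf8.reg")` would produce a copy to edit (both file names are placeholders), and `utf8_to_reg` restores the original byte layout afterwards. Converting back before re-importing is the cautious choice, since it returns the file to the same form Regedit exported.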
--
Crash

"The fewer the facts, the stronger the opinion."
~ Arnold H. Glasow ~
 
 
Todd
 
      06-26-2011
On 06/25/2011 03:56 AM, Paul wrote:
> Todd wrote:
>
>>
>> For some reason, I have never been able to find my posting to
>> this group over on Google. Other groups, but not this one.
>> Do you have a tip on this?
>>
>> Many thanks,
>> -T

>
> Alt.windows7.general is a "new" group, in terms of date created.
> It is carried on AIOE and Eternal-September, and probably servers
> like them.
>
> Google, on the other hand, is deaf-dumb-blind. They don't have
> an effective "abuse" address. The only way alt.windows7.general
> would get added to their archive, is if a valid, signed, newgroup
> request (server to server messaging) of some sort was received.
> And because Google is clueless, we don't even know if anyone monitors
> that stuff or not. There aren't any external signs of intelligence
> at Google.
>
> Not just any "newgroup" command will work, because in the past,
> hundreds of thousands of them have been created, to the point
> server admins just ignored them. (When they can't be authenticated.)
> It means alternatives have to be used, to manage groups.
>
> This also caused problems for microsoft.* , because it wasn't
> created by normal server to server messaging. A guy used to "fake"
> the necessary messages, to make it look like Microsoft was
> managing their connection to USENET. When Microsoft shut down
> their own USENET server, they had the option of emitting
> a couple thousand "rmgroup" messages, to cause other server
> admins to consider deleting all those groups from their servers.
> At least one server administrator claimed, if a valid signed
> set of messages had been received, he would have considered
> removing microsoft.* . As you can see, microsoft.* still
> exists, and as far as I know, Google is still archiving it.
>
> So there are proper ways to do things, and a lot of "epic fails"
> along the way. And it takes a public presence (working "abuse"
> address, or server admin address to send requests to), to make
> a properly functioning USENET operation.
>
> *******
>
> Based on your description, that this is a text file from Regedit,
> I was able to recreate the condition here. I tested creation of
> both .txt and .reg , by exporting from Windows 7 regedit. I used
> a hexeditor, to examine the file, which is where I got the hex
> code from (the 0xFE thing).
>
> The sequence in the .txt is actually
>
> 0xFF 0xFE K e y N a m e
>
> implying this is a sixteen-bit-wide text encoding. And the first
> two bytes are a declaration of the encoding.
>
> In the .reg file I can see
>
> 0xFF 0xFE W i n d o w s R e g i s t r y E d i t o r
>
> so the same thing is happening.
>
> Notepad seems to be aware of this encoding, which is why
> everything appears "normal". It is even possible, if you
> installed WINE on the Linux box, it comes with a
> "Notepad" lookalike, which likely supports whatever encoding
> that is as well. (Just the basic WINE install, should
> give you a working Notepad lookalike, without actually
> copying a Notepad over from elsewhere.)
>
> And here is the answer, with regard to the encoding -
>
> http://en.wikipedia.org/wiki/Byte_order_mark
>
> "The byte order mark (BOM) is a Unicode character used to
> signal the endianness (byte order) of a text file or stream.
> Its code point is U+FEFF. BOM use is optional, and, if used,
> should appear at the start of the text stream. Beyond its specific
> use as a byte-order indicator, the BOM character may also indicate
> which of the several Unicode representations the text is encoded in."
>
> HTH,
> Paul


Excellent response. Thank you! Now I can ask the Leafpad
guys to support it.

-T

Wine does have a Notepad. I am using it, but try to avoid it
due to a bug where Notepad drops the first two letters from
the "Open With" path.
 
 
Todd
 
      06-26-2011
On 06/25/2011 04:57 AM, Dave "Crash" Dummy wrote:
> Todd wrote:
>>> Just wondering why you chose to post this in a Win7 newsgroup.
>>> Wouldn't it be better to post in a linux group?

>>
>> Actually no. The offending file came from W7's regedit. No one over
>> on the Linux group would know what I am talking about.

>
> There is your answer. Windows 7 Regedit exports files in 16-bit Unicode
> (UTF-16) format, probably to accommodate the many languages Windows supports. You
> need to convert the files to 8-bit ANSI or use a Linux-compatible text
> editor that will accommodate Unicode.


Thank you. That was what I was looking for!

-T
 

All times are GMT +1. The time now is 04:32 PM.
W7Forums is an independent website and is not affiliated with Microsoft Corporation.
