Do 64-bit OS's really always run 10% slower than 32-bit OS's?

J. P. Gilliver (John)

Richard said:
They don’t. If they are displayed like that it is a bug in your
newsreader.
Strange: the post to which I posted the last followup most definitely
did (and my newsreader just presents the content inline anyway, so no
problem); this one didn't. I've noticed it increasingly recently that
some posts (not just yours) are in that form. It's not a problem!
I have no idea why you’d think that (see below for further discussion).
OK, see below.
“n-bit” isn’t a very exact way of describing a CPU’s capabilities, but
TTBOMK everything that people call a “32-bit CPU” has 32-bit user
addresses. That’s not something an application can readily do anything
about.
Agreed. I'm out of touch with what goes on in today's 32 and 64 bit
world; certainly some 8 bit processors had 16 bit address buses, and I
think some 16-bit ones had 32. (Sometimes implemented as individual
pins, sometimes half that with a high/low pin.)
CPU caches are measured in kilobytes or megabytes, not bits.
They have a width as well as a depth.
Consider an instruction fetch, as a motivating example. In the x86 ISA
one instruction may be as little as 8 bits wide. The other 56 bits of
the memory read are not wasted, because they contain the next few
instructions, so no access to physical memory is required for them -
they are already inside the CPU.
I didn't realise that such packing occurs by default.
The situation is similar with data. While it’s not inevitable that data
that is used together is stored together, it’s something that often
arises naturally even before one puts any effort into performance
improvement.
Indeed.
[]
For many kinds of parallel operation there is a separate set of very
wide registers used by the SIMD instructions, and those are available in
the 32-bit ISA too.
It all gets very confusing when you have cores that _can_ do operations
on, say, four or eight 8-bit values at once (a lot of image processing
for example), and then you also have multiple cores ... all too new for
me!
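
To make the "four 8-bit values at once" idea concrete, here is a minimal sketch of lane-wise addition on four bytes packed into one 32-bit word. It is a software emulation of the idea (sometimes called SWAR), not the actual SIMD instructions, and the name packed_add_u8x4 is made up for illustration.

```python
HIGH_BITS = 0x80808080  # top bit of each 8-bit lane
LOW_BITS = 0x7F7F7F7F   # low seven bits of each 8-bit lane

def packed_add_u8x4(a, b):
    """Add four unsigned 8-bit lanes packed in 32-bit ints, wrapping per lane."""
    # Adding only the low 7 bits guarantees no carry crosses into the next lane.
    low_sum = (a & LOW_BITS) + (b & LOW_BITS)
    # The top bit of each lane is then combined separately with XOR.
    return (low_sum ^ ((a ^ b) & HIGH_BITS)) & 0xFFFFFFFF

# Lanes 0x01+0x01, 0xFF+0x01, 0x7F+0x01, 0x10+0x20 are all added in one pass.
print(hex(packed_add_u8x4(0x01FF7F10, 0x01010120)))  # 0x2008030
```

Note that the 0xFF + 0x01 lane wraps to 0x00 without disturbing its neighbour, which is the behaviour a real packed-add instruction provides in hardware.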
--
J. P. Gilliver. UMRA: 1960/<1985 MB++G()AL-IS-Ch++(p)Ar@T+H+Sh0!:`)DNAf

The Daily Mail has led the campaign to limit pornography - "it demeans and
belittles women," they explain, "and that's our job." (Sandi Toksvig
[scripted], News Quiz 2013-7-26.)
 
Paul

J. P. Gilliver (John) said:
(Did you know that your posts take the form of a text attachment?)



I've been thinking that!

No, that figure of 10% was puzzling me too. If run on totally 32-bit
_hardware_, they'd have to do two fetches per instruction, but that
would only be slower (and that by 50%, not 10%) if the code wasn't
optimised for 64-bit. (Which a lot of it isn't; on the whole, I wouldn't
run a 64-bit OS on 32-bit hardware.)
On Intel Core2, the feature is called MacroFusion.

http://abinstein.blogspot.ca/2007/06/decoding-x86-from-p6-to-core-2-part-3.html

"So instead of trying to fused all possible macroinstruction pairs,
Core 2 Duo fuses only the selected macroinstructions -

* The first macroinstruction must be a TEST X, Y or a CMP X, Y
where only one operand of X and Y is an immediate or a memory word.
* The second macroinstruction must be a conditional jump that checks
the carry flag (CF) or zero flag (ZF).
* The macroinstructions are not working in 64-bit mode. <---

...the frequency of macro-fused operations in SPEC2006 CPU ranges from
0-16% in integer codes and just 0-8% in floating-point codes. In other
words, in the best case, macro-fusion would reduce the number of
macroinstructions from 100% to 92% for integer and just 96% for
floating-point execution, hardly the whopping 20-25% reduction as
described by Intel's marketing department
"

And that's what is supposed to make 64 bit code slower than 32 bit code
on an Intel Core2. "macroinstructions are not working in 64-bit mode".
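
A quick sketch of where the 92% and 96% figures in that quote come from, assuming "frequency of macro-fused operations" means the share of macroinstructions that belong to a fused pair, with each pair collapsing into a single operation. The function name is hypothetical.

```python
def remaining_fraction(fused_fraction):
    # Each fused pair turns two macroinstructions into one, so half of the
    # instructions that take part in fusion disappear from the count.
    return 1.0 - fused_fraction / 2.0

for label, freq in (("integer", 0.16), ("floating-point", 0.08)):
    print(f"{label}: {remaining_fraction(freq):.0%} of macroinstructions remain")
```

Under that assumption the best-case saving is 8% for integer code and 4% for floating-point code, in line with the numbers quoted.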

Apparently AMD doesn't use macrofusion in their design, and without
that feature, the execution rate of 32 bit and 64 bit is the same.

Paul
 
JJ

Huh ? AFAIK there are no PAE enabled applications, PAE is an internal OS
thing. As J.O. Aho wrote, PAE does not extend the applications' virtual
addressing space.
Or do you mean ChromiumOS ? Or AWE (Address Windowing Extensions)
instead of PAE ?
Crap. My brain has been glitchy lately.
Yes, I meant the AWE.

Chromium (Canary?), which is the base of Google Chrome.
Its 32-bit version is not yet large address aware, in the PE header.
32-bit Firefox already has this bit enabled.
 
JJ

Huh ? AFAIK there are no PAE enabled applications, PAE is an internal OS
thing. As J.O. Aho wrote, PAE does not extend the applications' virtual
addressing space.
Or do you mean ChromiumOS ? Or AWE (Address Windowing Extensions)
instead of PAE ?
Crap. My brain has been glitchy lately.
Yes, I meant the AWE.

Chromium (Canary?), which is the base of Google Chrome.
Its 32-bit version is not yet large address aware, in the PE header.
32-bit Firefox already has this bit enabled.
Or... is this bit for PAE instead of AWE?
 
Pascal Hambourg

JJ wrote:
Chromium (Canary?), which is the base of Google Chrome.
Its 32-bit version is not yet large address aware, in the PE header.
32-bit Firefox already has this bit enabled.
Or... is this bit for PAE instead of AWE?
Neither. "Large Address Aware" indicates that a 32-bit application can
handle the full 32-bit 4 GiB addressing space instead of the traditional
2 GiB space (the remaining 2 GiB being reserved for the kernel on 32-bit
Windows versions). Of course this is only relevant for 32-bit
applications running on 64-bit Windows versions (WOW64).
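
For anyone who wants to check the flag on a given executable, here is a minimal sketch that reads it straight from the PE header. It relies on the documented PE/COFF layout (the PE signature offset stored at 0x3C, and the IMAGE_FILE_LARGE_ADDRESS_AWARE bit, 0x0020, in the COFF Characteristics field); the function name is made up for illustration.

```python
import struct

IMAGE_FILE_LARGE_ADDRESS_AWARE = 0x0020  # bit in the COFF Characteristics field

def is_large_address_aware(data):
    """Return True if the PE image in `data` (bytes) has the LAA bit set."""
    if data[:2] != b"MZ":
        raise ValueError("not an MZ/PE image")
    # The offset of the PE signature is stored at 0x3C in the DOS header.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("PE signature not found")
    # The 20-byte COFF header follows the 4-byte signature; Characteristics
    # is its last 2-byte field, 18 bytes in.
    (characteristics,) = struct.unpack_from("<H", data, pe_offset + 4 + 18)
    return bool(characteristics & IMAGE_FILE_LARGE_ADDRESS_AWARE)
```

Typical use would be something like is_large_address_aware(open(path, "rb").read(4096)), where path is whatever executable you want to inspect; the first few kilobytes are normally enough to cover the headers, though that is an assumption rather than a guarantee.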
 
Yousuf Khan

I have a 1.6GHz Lenovo W510 with 16GB of memory and I'm not sure
if I should put 32 bit or 64 bit Windows on it.

I have 64-bit Linux on now but I'm making it a dual boot system.

Normally I'd just "go" with 64-bit, to follow the crowd, without
really knowing why - but I was always told that 64-bit OS's always
run about 10% slower than 32-bit OS's.

Is that true that applications *always* run slower on 64-bit OS's
than on 32-bit OS's?
No, it was never as much as 10% slower, it was more like 1-2% slower
sometimes. Most of the reason for the slowness was operating
system-dependent. Because the real OS kernel is 64-bit, the 32-bit apps
are presented with a simulated 32-bit OS to work with. The simulated
32-bit kernel is just a wrapper for the 64-bit kernel, converting 32-bit
calls to 64-bit and vice-versa (they called this "thunking"). That extra
layer of OS added the 1-2%. That of course is only a problem for 32-bit
apps, and it's not much of a problem at that. Native 64-bit apps have no
such additional layer to go through. Also as I said, it was
OS-dependent, so some OS's might have had a much more efficient thunking
layer than others, e.g. think Linux vs. Windows.

However, on the plus side, on a 64-bit OS, all 32-bit apps see their own
full private 3GB of address space, they aren't sharing it with other
32-bit apps. So this may help performance with those apps as they are
much less memory constrained than on their own native 32-bit OS.

And then finally, you have 16GB of RAM! You will be wasting more than
12GB of that RAM if you load it down with a 32-bit OS! Since 32-bit OS's
don't see much more than 3-4GB of RAM, usually (though there are
complicated ways around that).
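
The arithmetic behind that, as a rough sketch: it assumes a plain 32-bit physical address limit with no PAE-style workaround, and the helper names are made up for illustration.

```python
GIB = 2 ** 30

def addressable_bytes(address_bits):
    # Number of distinct byte addresses for a given address width.
    return 2 ** address_bits

def unused_ram_gib(installed_gib, address_bits=32):
    # RAM beyond the addressable limit cannot be used by the OS.
    usable = min(installed_gib * GIB, addressable_bytes(address_bits))
    return (installed_gib * GIB - usable) / GIB

print(addressable_bytes(32) // GIB)  # 4
print(unused_ram_gib(16))            # 12.0
```

In practice part of that 4 GiB range is taken up by device mappings, which is why the usable figure tends to land nearer 3GB and the waste comes out at more than 12GB.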

Yousuf Khan
 
Pascal Hambourg

Yousuf Khan wrote:
However, on the plus side, on a 64-bit OS, all 32-bit apps see their own
full private 3GB of address space, they aren't sharing it with other
32-bit apps.
AFAIK, the applications see their own full private address space and
don't share it with other applications on 32-bit systems too. Of course
they share the physical RAM, but that does not change on 64-bit systems.
 
Peter Köhlmann

Pascal said:
Yousuf Khan wrote:

AFAIK, the applications see their own full private address space and
don't share it with other applications on 32-bit systems too. Of course
they share the physical RAM, but that does not change on 64-bit systems.
That is not completely right.
On 32-bit systems, even with PAE enabled (which excludes all Windows systems
except a few selected Server versions), any application can see a max of about
3 GByte of memory. The remaining 1 GByte is mapped to the OS.
On 64-bit systems, any 32-bit application can have a max of nearly 4 GByte of
memory. More is not possible since in 32-bit mode, only 4 GByte are
addressable.
 
Pascal Hambourg

Peter Köhlmann wrote:
That is not completely right.
What part exactly is not right?
On 32-bit systems, even with PAE enabled (which excludes all Windows systems
except a few selected Server versions), any application can see a max of about
3 GByte of memory. The remaining 1 GByte is mapped to the OS.
On 64-bit systems, any 32-bit application can have a max of nearly 4 GByte of
memory.
This has been mentioned earlier in the thread. However, it does not mean
that applications share their address space with others, does it?
 
