Yes, yes, I know it's supposed to sound impressive. But the floating-point unit really hasn't changed, and most processing is still done in 32 bits even in 64-bit mode. The 64-bit mode is intended primarily to increase the available address space, and running through such a large address space is time-consuming, which makes it less useful than one might think.
The following experiment will give Windows users a feel for what I am talking about. First, select "Run" from the Start Menu and run debug.exe. This will bring up a window that, aside from the title bar, is mostly black. If you look closely, it has white text and starts off with a single, solitary dash. For this next part, I will use bold text for the text shown by the program, regular text for the text you as the user will enter, and italics for any commentary. Press Enter at the end of each line.
-a
0B33:0100 mov si, 0 As shown on my system.
0B33:0103 mov di, 0 The text before the colon may vary.
0B33:0106 mov dl, 0
0B33:0108 mov bx, 1000
0B33:010B mov ax, [bx]
0B33:010D add si, 1
0B33:0110 adc di, 0
0B33:0113 adc dl, 0
0B33:0116 jnc 108
0B33:0118 int 3
0B33:0119 <Here you just hit enter without typing anything>
-g
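The window will appear not to do anything for a while as it simulates accessing about 2 terabytes of memory. SI, DI, and DL together form a 40-bit counter, so the loop runs 2^40 times and reads two bytes on each pass. On my system, this takes about 26 minutes.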
AX=C033 BX=1000 CX=0000 DX=0000 SP=FFEE BP=0000 SI=0000 DI=0000
DS=0B33 ES=0B33 SS=0B33 CS=0B33 IP=0118 NV UP EI PL ZR AC PE CY
0B33:0118 CC INT 3
-q This will exit the program. Do NOT click on the X in the upper-right corner. That would be... bad.
Okay, I know that program is not as "user friendly" as some of you may be accustomed to. It pretty much assumes that you know what you're doing, although it does have a help feature in case you forget the syntax of some of the commands. There are some biases in the simulation. Because the simulation reads the same memory location repeatedly, that location will be held in faster "cache" memory, which speeds up the process. A real 64-bit program accessing a real memory space would be slower.
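For anyone who would rather check the arithmetic than wait for the loop, here is a rough Python sketch of what the listing above does. It is only an illustration under stated assumptions: the function name count_iterations and the scaled-down register widths are made up for this example so that it finishes instantly, and the 26-minute figure is simply the timing quoted above for my system.

```python
def count_iterations(widths):
    """Model the add/adc carry chain from the listing.

    widths gives the bit width of each counter register, lowest first
    (SI, DI, DL are 16, 16, 8 in the real program). Returns the number of
    passes made before the carry out of the last register ends the loop.
    """
    regs = [0] * len(widths)
    passes = 0
    while True:
        passes += 1                  # one pass = one 'mov ax, [bx]'
        carry = 1                    # 'add si, 1' adds one to the lowest register
        for i, bits in enumerate(widths):
            total = regs[i] + carry
            regs[i] = total & ((1 << bits) - 1)
            carry = total >> bits    # carry flag feeds the next 'adc'
        if carry:                    # 'jnc 108' falls through once carry is set
            return passes


# Hypothetical scaled-down widths (2 + 2 + 1 = 5 bits) so the check is quick.
small_passes = count_iterations([2, 2, 1])

# The real program: SI (16 bits), DI (16 bits), DL (8 bits).
WIDTHS = (16, 16, 8)
BYTES_PER_READ = 2                   # 'mov ax, [bx]' loads one 16-bit word
passes = 2 ** sum(WIDTHS)            # 2**40 passes through the loop
total_bytes = passes * BYTES_PER_READ
seconds = 26 * 60                    # timing reported above, not measured here
rate = passes / seconds              # passes per second on that system

print(small_passes, passes, total_bytes, round(rate))
```

The scaled-down run confirms that a counter with n bits in total makes 2^n passes before the final carry, so the full-size version makes 2^40 passes and reads 2^41 bytes, which is the 2 terabytes mentioned above. At 26 minutes, that works out to roughly 700 million passes per second.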