A ping from threaders' prison

Carl Sassenrath, CTO
REBOL Technologies
17-Feb-2007 20:38 GMT

Article #0060

Just sending out a ping that I am here... but just that...

I'm being held captive in threaders' prison.

You may know what that means. If you don't, here's an example:

Earlier this week, quite by chance during the coding of a port handler, I noticed the single simple line of C code that pushes a value on the stack:


generated this machine code:

004057B5 8B 55 FC             mov         edx,dword ptr [ebp-4]
004057B8 A1 C4 24 46 00       mov         eax,[__tls_index (004624c4)]
004057BD 64 8B 0D 2C 00 00 00 mov         ecx,dword ptr fs:[2Ch]
004057C4 8B 04 81             mov         eax,dword ptr [ecx+eax*4]
004057C7 8B 0D C4 24 46 00    mov         ecx,dword ptr [__tls_index (004624c4)]
004057CD 64 8B 35 2C 00 00 00 mov         esi,dword ptr fs:[2Ch]
004057D4 8B 0C 8E             mov         ecx,dword ptr [esi+ecx*4]
004057D7 8B 89 34 00 00 00    mov         ecx,dword ptr [ecx+34h]
004057DD 83 C1 01             add         ecx,1
004057E0 8B 35 C4 24 46 00    mov         esi,dword ptr [__tls_index (004624c4)]
004057E6 64 8B 3D 2C 00 00 00 mov         edi,dword ptr fs:[2Ch]
004057ED 8B 34 B7             mov         esi,dword ptr [edi+esi*4]
004057F0 89 8E 34 00 00 00    mov         dword ptr [esi+34h],ecx
004057F6 8B 0D C4 24 46 00    mov         ecx,dword ptr [__tls_index (004624c4)]
004057FC 64 8B 35 2C 00 00 00 mov         esi,dword ptr fs:[2Ch]
00405803 8B 0C 8E             mov         ecx,dword ptr [esi+ecx*4]
00405806 8B 89 34 00 00 00    mov         ecx,dword ptr [ecx+34h]
0040580C C1 E1 04             shl         ecx,4
0040580F 8B 80 30 00 00 00    mov         eax,dword ptr [eax+30h]
00405815 03 C1                add         eax,ecx
00405817 8B 0A                mov         ecx,dword ptr [edx]
00405819 89 08                mov         dword ptr [eax],ecx
0040581B 8B 4A 04             mov         ecx,dword ptr [edx+4]
0040581E 89 48 04             mov         dword ptr [eax+4],ecx
00405821 8B 4A 08             mov         ecx,dword ptr [edx+8]
00405824 89 48 08             mov         dword ptr [eax+8],ecx
00405827 8B 52 0C             mov         edx,dword ptr [edx+0Ch]
0040582A 89 50 0C             mov         dword ptr [eax+0Ch],edx

Even though this is non-optimized, in a perfect world on a perfect CPU, that should be about 4 or 5 instructions.

It sure got me rethinking the usage of TLS variables, at least on x86 Win32 implementations. I decided not to be held captive by the compiler to any degree (on any OS model) and to recode large parts of the VM and natives to avoid TLS references (caching them SP-relative instead, that is, in locals and arguments on the C stack).
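For anyone who hasn't hit this, here is a rough sketch of the idea in C. It is not the actual REBOL code: the names `Value`, `ThreadState`, `push_direct`, `push_cached` and `current_thread` are made up for illustration, and the layout (a 16-byte cell, a top index that is bumped before the store) is only guessed from the shape of the listing above. It uses C11 `_Thread_local`; on the Win32 compiler of the day that would be spelled `__declspec(thread)`.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical 16-byte value cell; not taken from the REBOL sources. */
typedef struct { unsigned int word[4]; } Value;

#define STACK_CELLS 64

/* Hypothetical per-thread interpreter state. */
typedef struct {
    Value  cells[STACK_CELLS];  /* this thread's data stack */
    size_t top;                 /* index of the current top cell */
} ThreadState;

/* One instance per thread, zero-initialized. */
static _Thread_local ThreadState thread_state;

/* Direct style: every mention of thread_state is a TLS access. An
   unoptimized build may redo the TLS address lookup for each one. */
static void push_direct(const Value *v)
{
    thread_state.top++;
    assert(thread_state.top < STACK_CELLS);
    thread_state.cells[thread_state.top] = *v;
}

/* Resolve the TLS address once; the caller keeps the result in a local. */
static ThreadState *current_thread(void)
{
    return &thread_state;
}

/* Cached style: the thread state arrives as an ordinary pointer argument,
   so the body contains no TLS access at all. */
static void push_cached(ThreadState *ts, const Value *v)
{
    ts->top++;
    assert(ts->top < STACK_CELLS);
    ts->cells[ts->top] = *v;
}
```

A native would call `current_thread()` once on entry and hand the pointer down to everything it calls, so the pointer lives in a register or a stack slot and the lookup cost is paid once per call chain rather than once per reference. The price is an extra argument on a lot of functions, which is the "human-based global flow analysis" part.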

I really didn't think I'd need to be doing this in the year 2007. A human-based global flow analysis!? Makes me homesick for the old A5 CPU register, you know what I mean? Or a CPU with a thread base register, or I'd even take a thread-local remap on a VM base page for TLS globals. Or just maybe... cool stuff like that happens when -O2 is enabled? Please say "yes".


Updated 24-Mar-2017 - Copyright REBOL Technologies