WineHQ
Bug Tracking Database – Bug 10273

 Bugzilla

 

Last modified: 2011-10-12 04:46:09 CDT  

satisfy SafeDisc 2.x heuristic API analyzer by "adjusting" API exports/entry statistics of wine builtins (affects e.g. adobe photoshop)

Bug 10273 - satisfy SafeDisc 2.x heuristic API analyzer by "adjusting" API exports/entry statistics of wine builtins (affects e.g. adobe photoshop)
satisfy SafeDisc 2.x heuristic API analyzer by "adjusting" API exports/entry ...
Status: CLOSED FIXED
AppDB: Show Apps affected by this bug
Product: Wine
Classification: Unclassified
Component: tools
0.9.48.
x86 Linux
: P1 normal
: ---
Assigned To: Mr. Bugs
: Installer, obfuscation
Depends on:
Blocks:
  Show dependency tree
 
Reported: 2007-11-01 19:22 CDT by Anastasius Focht
Modified: 2011-10-12 04:46 CDT (History)
10 users (show)

See Also:
Regression SHA1:
Fixed by SHA1: c32e02e48d3aeba7588b566973c9fb02253e8a10
Distribution: ---
Staged patchset:


Attachments
Patch which should fix SafeDisc 2.x copy protection api analyzer issue (58.47 KB, patch)
2007-11-01 19:22 CDT, Anastasius Focht
Details | Diff
Hack to put PE header in .init section. (776 bytes, patch)
2007-11-05 06:59 CST, Alexandre Julliard
Details | Diff
stubs for user32 functions present in Vista (10.48 KB, patch)
2007-11-13 12:42 CST, Mikolaj Zalewski
Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Anastasius Focht 2007-11-01 19:22:18 CDT
Created attachment 8924 [details]
Patch which should fix SafeDisc 2.x copy protection api analyzer issue

Hello,

if not interested in technical details goto (2) ;-)

I made this a separate bug report like http://bugs.winehq.org/show_bug.cgi?id=9925 (SafeDisc 1.x stopper) because SafeDisc has many flavors that differ in various technical ways and can't be discussed/handled in a single SafeDisc "metabug" like http://bugs.winehq.org/show_bug.cgi?id=219
SafeDisc Major version based separation allows better tracking of "completion" state (1.x/2.x/3.x/4.x).

-----------
(1)

It took me a couple of days to get an idea what SafeDisc 2.x really does with the API exports and to find a way to cope with it...
Various anti-debugging techniques and nasty runtime code obfuscation made this journey somewhat challenging.

Parts of SafeDisc 2.x analyze the API code of the following system libraries:

kernel32.dll (wine builtin)
user32.dll (wine builtin)
gdi32.dll (wine builtin)
cdasdtst.dll (developer backdoor?)

The latter one is probably a "developer backdoor", used to verify their code/algorithms (not needed to be present).

For each of these libraries all named exports are taken into account.
A number of API entry opcode sequences are read and used for statistical analysis (number depends on type of encountered opcodes).
From what I've seen there is some kind of "behavioural probability" of each API entry calculated, with typical opcode sequences having a "weighting".
Just think of advanced heuristic detection methods of antivirus scanners and you get the idea ;-).

Anyway, wine's builtin libraries differ in their signatures from windows entry code and to make things worse, wine's +relay/+snoop feature heavily interferes with the analyzer results. :-(

First I made some experiments with additional (dummy) function exports, small __asm__ wrappers with opcode sequences to see how the distribution of opcodes influences the analyzer results.
The result was rather disappointing.
I needed a large number of exports to gain some results but not significantly enough to get below the "bad" threshold.

Then I came across the magic "0" value. As soon as I used this value either as first .byte on API entry or init value for data export, the entry scan was short circuited.
This value seemed to have the highest influence of all code/data sequences used.
I added quite a number of data exports - aliased to initialized "0" value - safe enough to let even wine's +relay work (remember: relay code influences analyzer).
If you look at the patch don't get mad ;-)
You will see a crapload of named "__wine_safedisc2" data exports/aliases added to wine builtin kernel32, user32 and gdi32.

For the record: http://bugs.winehq.org/show_bug.cgi?id=9926 which talks about "problematic" gdi32 data exports (pfnPalette stuff) in SafeDisc 3.x.
SafeDisc 2.x code triggers SEH upon exported function pointers too but this is gracefully handled. No harm at all.
I wonder if this is really a problem in SafeDisc 3.x ...

I thought about other possible solutions but found no one easier to implement to prove my findings without hurting wine too much :-)
Making wrappers of the whole kernel32, user32, gdi32 named exports API just to please the entry analyzer is not an option to me.

Another way could be the relocation of export table upon module loading, expanding it at runtime by adding the required number of "statistics fakers" (data exports with zero init).
This requires modifying the ntdll loader .. though i'm not sure if this approach breaks other applications/braindamaged PE protection stuff which expect certain conditions to be met (tables present at specific sections/memory areas).

------------------------------
(2)

Well, take it as proof of concept. Play with it.

I tested my patch with a few SafeDisc 2.x games I have original media.
No cracks/no-cd patches were used.
Only official game patches were used.

Battlefield 1924 (1.6x) - SafeDisc 2.6/2.8
Road To Rome (BF1942 expansion) - SafeDisc 2.8
Battlefield Vietnam - SafeDisc 2.9

All of these work fine for me with the patch applied.

Please test this patch on many SafeDisc 2.x games as possible and report your results (works or works not).
Make sure you have original media mounted and drive/data is visible within wine.
If it fails for first time, the media might not ready yet, try again then (I experienced this sometimes).

Use this link to verify what SafeDisc Version is used: http://www.120search.net/ (alcohol software copy protection database)

Regards
Comment 1 Stefan Dösinger 2007-11-01 19:34:45 CDT
I don't know if this is related, but Ivan Leo talked about some hook testing done by safedisk 2. Appart of other things, it checks if all CALLs have a RET. It started at the exported functions, but eventually dived into Linux libraries. GCC generated code which broke these checks, e.g.

...
xxx:
pop eax;
jmp eax;
...
call xxx
...

To find the address of the code in memory or somesuch.

Could this be the statistical heuristic you see?

Comment 2 Dan Kegel 2007-11-01 19:39:57 CDT
The patch does seem to help at least one other app get further.
Comment 3 Anastasius Focht 2007-11-01 20:03:41 CDT
Hello,

--- quote ---
I don't know if this is related, but Ivan Leo talked about some hook testing
done by safedisk 2. Appart of other things, it checks if all CALLs have a RET.
It started at the exported functions, but eventually dived into Linux
libraries. GCC generated code which broke these checks, e.g.

...
Could this be the statistical heuristic you see?
--- quote ---

Well Micro$oft compilers seem to generate such code sequences on occasion too but probably not many to have such significance.

The tests for hooked/detoured code (jump trampolines) are likely part of that "behavioral analysis".
They probably used some sort of disassembler/tracer.
But this is probably only a part of that analysis. I experimented with various opcode sequences, covering standard entry code but even a large number of them had no real significance (> of all gcc generated entries). 

--- quote ---
The patch does seem to help at least one other app get further.
--- quote ---

I am missing the application name ;-)

Regards
Comment 4 Dan Kegel 2007-11-01 20:33:03 CDT
Three guesses :-)
Comment 5 Anastasius Focht 2007-11-01 20:59:16 CDT
Hmm ... my first guess would be some Adobe application because you always come up with this stuff ;-)

Regards
Comment 6 Anastasius Focht 2007-11-02 04:38:30 CDT
Hello again,

well I rethought about why the analyzer would see gcc generated code (mostly -fPIC in wine builtins) as "problematic"...

The analyzer seems to have it's trace depth (nesting level) somewhat limited though I didn't figured out the exact number ... something around 4-5.
Usually all calls at first trace levels have standard stack frame (ebp, esp) and a RET.
Sure, there is code that deals with global offsets table, like the infamous "__i686.get_pc_thunk.bx"

--- snip ---
__i686.get_pc_thunk.bx+0: MOV EAX, DWORD PTR [ESP]
__i686.get_pc_thunk.bx+3: RET 
--- snip ---

Even if the analyzer doesn't know about the trick to get current EIP, the call should still considered harmless seen from tracer perspective, because it remains in code section and sequence is too short for code cave/hostile code.
It traces some levels deep to see which sections/modules the code brances to (either they end up in ntdll.dll or remain in builtin code).

After another debugging session I have to revise my statement about the "weighting" of opcodes .. I probably mixed it up with their disassembler state machine variables (reading partly obfuscated disassembly code isn't easy when you're not the one that wrote the logic behind it) ;-)

It's only the analyzed API entry itself that gets some sort of "behavioral probability"/"statistical value" which is calculated from nesting level/recursion depth encountered, amount of trace instructions spent in own/system library code, parameters (callstack) and the like.
That combinations and the large number of exports makes it somewhat hard to find the exact cause why the overall wine builtins entry code statistics say "bad guy".

Fortunately the overall number of exports are taken into final calculations (percentage based).
That renders few "invalid" ones, like data exports with pointer values (which might confuse the disassembler) harmless.
If 2 of 300 exports are considered completely "invalid" (like the pfnXXX gdi32 data exports) they don't have any significance on final statistics.

Just by adding enough fake API entries/data exports the overall analyzer stats can be significantly dropped below the "bad guy" threshold. 
Play with the exports and you will see, e.g. change number of dumnmy exports, aliased values, entry opcodes.
For the non-debugging freaks the result is just "loads" or "does not load", the infamous "ExitProcess( 0xfeedface)" if you trace log.

Regards
Comment 7 Alexandre Julliard 2007-11-03 12:22:15 CDT
Does it make a difference if you compile out the debug traces?  Maybe it doesn't like all these calls to libwine?
Comment 8 Peter Beutner 2007-11-03 20:44:38 CDT
From what i remember when debugging safedisc2 it was exactly this
__i686.get_pc_thunk.bx call it dislikes because it determines that it goes outside of the dll image.
Safedisc takes the PE header as start of the image. But in wine the pe header actually starts somewhere in the .text section and the __i686.get_pc_thunk.bx func is before that. ( Haven't really looked into why the builtin dlls are linked this way and if this can be changed.)
But building those three dlls without -fPIC(so no __i686.get_pc_thunk.bx) makes it work(for at least some safedisc versions).
I successfully tested it with:
Max Payne V1.0  - Safedisc: 2.30.033
          V1.05 -           2.51.020
Stronghold V1.0 -           2.40.010

Any greater version of safedisc2(>= 2.6x?) seems to have additional checks and it fails again with the usual "ExitProcess( 0xfeedface)".

btw.: ProtectionID helps with identifying the used copy protection and its version. Just google for it.
Comment 9 Anastasius Focht 2007-11-04 11:29:50 CST
Hello,

@Peter Beutner: good point with

-- quote --
__i686.get_pc_thunk.bx call it dislikes because it determines that it goes
outside of the dll image.
-- quote --

I wrote earlier "... the call should still considered harmless seen from tracer perspective, because it remains in code section .." and missed the fact that this little code snippet is indeed *outside* of PE dll section boundary (but within same ELF .text section) .. *bangs his head*.
Thanks for the hint ;-)

Additionally I ran some debugging series, gathering some statistics.
I hope bugzilla doesn't mess all the tables up...

used: wine-0.9.48-196-gbeab2c1

How to read the data:

c1,c2,c3 are statistical values, calculated from all entry point code (all named exports) of each library (tracer).
They are required to be below given threshold "condition".
The conditions are hard coded and the same for all system libraries.

Hex values are percent (0x64 = 100%, 0x32 = 50% ...)
"*" means condition not satisfied

standard:

    kernel32   user32      gdi32       condition (cx < threshold)
--------------------------------------------------------------------
c1:  0x59       0x53        0x5D        0x5F
c2:  0x46*     0x47*       0x50*       0x3C
c3:  0x43       0x4B        0x4B        0x5A

(c2 is not met for kernel32, user32, gdi32)

with relay:

    kernel32   user32      gdi32       condition (cx < threshold)
--------------------------------------------------------------------
c1:  0x61*       0x63*      0x63*       0x5F
c2:  0x59*       0x64*      0x64*       0x3C
c3:  0x00        0x00       0x00        0x5A

(c1 and c2 are not met for kernel32, user32, gdi32)

with WINE_NO_TRACE_MSGS and WINE_NO_DEBUG_MSGS (tracing code compiled out):

    kernel32   user32      gdi32       condition (cx < threshold)
-------------------------------------------------------------------
c1:  0x51       0x4B        0x57        0x5F
c2:  0x38       0x28        0x44*       0x3C
c3:  0x39       0x3E        0x43        0x5A

(c2 is not met by gdi32)

*** Now the interesting ones ***

kernel32, user32, gdi32 compiled without -fPIC:

    kernel32   user32      gdi32       condition (cx < threshold)
--------------------------------------------------------------------
c1:  0x20       0x10        0x19        0x5F
c2:  0x07       0x01        0x05        0x3C
c3:  0x00       0x00        0x00        0x5A

(significantly dropped, all conditions are met)

200 fake data exports (aliased to 0 DWORD value):

    kernel32   user32      gdi32       condition (cx < threshold)
--------------------------------------------------------------------
c1:  0x4A       0x41        0x3F        0x5F
c2:  0x37       0x2A        0x26        0x3C
c3:  0x38       0x3B        0x33        0x5A

(all conditions are met)

200 fake data exports (aliased to 0 DWORD value) + relay:

    kernel32   user32      gdi32       condition (cx < threshold)
--------------------------------------------------------------------
c1:  0x51       0x4D        0x44        0x5F
c2:  0x40*      0x3B        0x2E        0x3C
c3:  0x00       0x00        0x00        0x5A

(c2 is not met for kernel32, so "relay" feature is not "safe" to use)

400 fake data exports (aliased to 0 DWORD value):

    kernel32   user32      gdi32       condition (cx < threshold)
--------------------------------------------------------------------
c1:  0x40       0x25        0x30        0x5F
c2:  0x25       0x20        0x18        0x3C
c3:  0x30       0x30        0x27        0x5A

(all conditions are met)

400 fake data exports (aliased to 0 DWORD value) + relay:

    kernel32   user32      gdi32       condition (cx < threshold)
--------------------------------------------------------------------
c1:  0x46       0x3F        0x34        0x5F
c2:  0x2E       0x29        0x1B        0x3C
c3:  0x00       0x00        0x00        0x5A

(all conditions are met)

===================================

The reason why +relay shifts statistics to the worse is because relay stubs have get_pc_thunk.bx in entry code.

Now the question remains whether to drop -fPIC for these libraries or finding a way to relocate the "get_pc_thunk.bx" code snippet into PE section boundaries.
There is one interesting thing I found while dumping kernel32 ELF sections:

7b8303fc l     F .text  00000000              .hidden __i686.get_pc_thunk.cx
7b8140e7 l     F .text  00000000              .hidden __i686.get_pc_thunk.bx

The address of "__i686.get_pc_thunk.cx" is within "visible" PE section space (starts at 7b820000) whereas "__i686.get_pc_thunk.bx" is not.
Both calculate global offset table addresses (choice is probably made by compiler what register does not need to be saved in that code context).
If the code location could be fixed to "visible" PE section space like for the get_pc_thunk.cx we might fix this problem while keeping -fPIC.

--- snip ---
Any greater version of safedisc2(>= 2.6x?) seems to have additional checks and
it fails again with the usual "ExitProcess( 0xfeedface)".
--- snip ---

Could you elaborate which games?
I have successfully tested the Battlefield game series which have SafeDisc 2.5 and 2.8 (either with dummy exports or by dropping -fPIC).
I've also tested one SafeDisc 3.x (doom3) and one 4.x game (BF2).
Maybe we're seeing different issues.

Regards
Comment 10 Anastasius Focht 2007-11-04 14:18:25 CST
Hello again,

well I digged further...

--- quote ---
7b8303fc l     F .text  00000000              .hidden __i686.get_pc_thunk.cx
7b8140e7 l     F .text  00000000              .hidden __i686.get_pc_thunk.bx

The address of "__i686.get_pc_thunk.cx" is within "visible" PE section space
(starts at 7b820000) whereas "__i686.get_pc_thunk.bx" is not.
Both calculate global offset table addresses (choice is probably made by
compiler what register does not need to be saved in that code context).
If the code location could be fixed to "visible" PE section space like for the
get_pc_thunk.cx we might fix this problem while keeping -fPIC.
--- quote ---

I dumped some of the generated kernel32 object files.
Because they have been compiled with "-fPIC" all of them have "__i686.get_pc_thunk.bx" and some have additionally "__i686.get_pc_thunk.cx" defined.
Both functions have their own .text, flagged as gnu linkonce (important for linking step).

--- snip ---
.text.__i686.get_pc_thunk.bx:080000DE                 public __i686_get_pc_thunk_bx
.text.__i686.get_pc_thunk.bx:080000DE __i686_get_pc_thunk_bx proc near
.text.__i686.get_pc_thunk.bx:080000DE                 mov     ebx, [esp+0]
.text.__i686.get_pc_thunk.bx:080000E1                 retn
.text.__i686.get_pc_thunk.bx:080000E1 __i686_get_pc_thunk_bx endp
--- snip ---

I outputted the linker script of my (wine)gcc, using -Wl,--verbose to see how it handles .text sections:

--- snip ---
..
.text           :
  {
    *(.text .stub .text.* .gnu.linkonce.t.*)
    KEEP (*(.text.*personality*))
    /* .gnu.warning sections are handled specially by elf32.em.  */
    *(.gnu.warning)
  } =0x90909090
..
--- snip ---

The final .text section is composed of all the .text, .stub, .text.*, and .gnu.linkonce.t.* sections that the linker encounters from all input files - in that (file) order.
Supplying the verbose flag to linker it outputs the following:

--- snip ---
..
attempt to open /usr/lib/gcc/i386-redhat-linux/4.1.2/../../../crti.o succeeded
/usr/lib/gcc/i386-redhat-linux/4.1.2/../../../crti.o
attempt to open /usr/lib/gcc/i386-redhat-linux/4.1.2/crtbeginS.o succeeded
/usr/lib/gcc/i386-redhat-linux/4.1.2/crtbeginS.o
attempt to open kernel32.dll-tWCtEp.spec.o succeeded
kernel32.dll-tWCtEp.spec.o
attempt to open comm.drv.spec.o succeeded
comm.drv.spec.o
attempt to open krnl386.exe.spec.o succeeded
krnl386.exe.spec.o
attempt to open stress.spec.o succeeded
stress.spec.o
..
--- snip ---

The culprit is "crtbeginS.o". That module contains code to deal with global constructors and destructors (_init and _fini functions).
The "S" version is built with ... guess .. -fPIC.

Dumping the object code for this library indeed reveals that "__i686_get_pc_thunk_bx" is defined here.

Due to "gcc -shared" this object file is automagically used as one of the first (implicit) object file arguments to the linker.
Because the linker combines .text sections in order of supplied files, the definition of "__i686_get_pc_thunk_bx" is pulled in by object files that deal with startup code (before wine code runs) - within ELF .text section part - just before the dll PE header/section begin.

.init section -> _init_proc() -> call_gmon_start(), frame_dummy(), __wine_spec_init_ctor() ....

The reason that "__i686.get_pc_thunk.cx" code is within "valid" PE section space is because none of the internal startup code defines/needs it.
The definition is taken from first kernel32 object file that defines/references it. 

So it seems the issue is out of wine's scope, one has either to drop -fPIC or to use dummy data exports :(

Regards
Comment 11 Mikolaj Zalewski 2007-11-04 17:35:19 CST
If the only problem is a 4-byte function then IMHO this is fixable. As the last resort we could copy it into the DLL code and as the last build step analyze the ELF file and rewrite all the calls of the original to calls of the copy. But maybe it's also possible to fix it cleanly by changing some linker flags or the wine loader? I'll try to search something more about it and if I find a game protected by it I could try to write some code.

Also http://support.microsoft.com/kb/935448 suggests that Windows doesn't allow to relocate some core DLLs. If all of kernel32, gdi32 and user32 are such then maybe dropping -fPIC is a valid solution (of course on x86)?

Does Alexandre have an opinion how clean the fix should be or maybe know how this should be fixed?
Comment 12 Peter Beutner 2007-11-04 20:06:21 CST
(In reply to comment #9)
> Now the question remains whether to drop -fPIC for these libraries or finding a
> way to relocate the "get_pc_thunk.bx" code snippet into PE section boundaries.

I rather hoped it would be possible to relocate the PE header somehow so that it sits right before the ELF .text section.
For example by putting the PE header in an extra section and
then with a custom linker script put that section in front of the text section.(which I think should work at least theoretically ;) )
But I'm still trying to understand how all that building/linking&loading stuff exactly works in wine, so no guarantees it is possible at all.

> Could you elaborate which games?
> I have successfully tested the Battlefield game series which have SafeDisc 2.5
> and 2.8 (either with dummy exports or by dropping -fPIC).
> I've also tested one SafeDisc 3.x (doom3) and one 4.x game (BF2).
> Maybe we're seeing different issues.

The game was Stronghold patched to V1.2 which reported SafeDisc 2.60.something.
And it definitely didn't work with wine-0.9.48(and dropping -fPIC). But after updating to the latest git today it suddenly works :). Maybe it was that ioctl problem that got fixed.(forgot the bug number). Will test it with some other games as well when I find the time.
Comment 13 Anastasius Focht 2007-11-05 03:22:53 CST
Hello,

--- quote ---
The game was Stronghold patched to V1.2 which reported SafeDisc 2.60.something.
And it definitely didn't work with wine-0.9.48(and dropping -fPIC). But after
updating to the latest git today it suddenly works :). Maybe it was that ioctl
problem that got fixed.(forgot the bug number).
--- quote ---

You probably ran into this one: http://bugs.winehq.org/show_bug.cgi?id=10225 ... That took me some time because I first blamed the SafeDisc security driver.
In the end I verified the security driver returned errors on purpose which triggered a wine server bug ;-)

--- quote ---
I rather hoped it would be possible to relocate the PE header somehow so that
it sits right before the ELF .text section.
For example by putting the PE header in an extra section and
then with a custom linker script put that section in front of the text
section.(which I think should work at least theoretically ;) 
--- quote ---

Yes, a custom linker script would be needed to place a section directly before all other init/startup code. Probably wine tools need to be modified too... 
I played a bit with linker flags and it seems -shared always pulls in startup object files first (despite the argument placement) ... the wine object file with PE section header follows as first visible argument, usually <dllname>-<tempname>.spec.o

--- quote ---
Also http://support.microsoft.com/kb/935448 suggests that Windows doesn't allow
to relocate some core DLLs. If all of kernel32, gdi32 and user32 are such then
maybe dropping -fPIC is a valid solution (of course on x86)?
--- quote ---

Well if micro$oft claims that for their core dlls that would be a welcome incident for dropping -fPIC at least for wine "core" libraries.
AFAIK at least builtin kernel32.dll is automagically mapped by loader on process init, so relocation incident is very unlikely...

Regards
Comment 14 Alexandre Julliard 2007-11-05 06:59:40 CST
Created attachment 8969 [details]
Hack to put PE header in .init section.

You can try a hack like this one.
Comment 15 Anastasius Focht 2007-11-05 12:31:16 CST
Hello,

unfortunately the PE header patch works only for one of the 3 conditions mentioned earlier...

Comparison with previous data (from comment #9):

standard:

    kernel32   user32      gdi32       condition (cx < threshold)
--------------------------------------------------------------------
c1:  0x59       0x53        0x5D        0x5F
c2:  0x46*      0x47*       0x50*       0x3C
c3:  0x43       0x4B        0x4B        0x5A

with PE header .init section patch (moves PE header before pc thunk code):

    kernel32   user32      gdi32       condition (cx < threshold)
--------------------------------------------------------------------
c1:  0x59       0x53        0x5C        0x5F
c2:  0x46*      0x43*       0x53*       0x3C
c3:  0x00       0x00        0x00        0x5A

At least condition c3 is completely satisfied.
By doing some dirty stuff *g*, I forced the analyzer to load my own, custom made dll at the point where it checks the system libraries.
I ran some series to find more about the meaning of calculated stats values.

--------

The custom dll exports a specific number of functions (10 for percentage values) with pure asm code to control generation of opcodes.

I made some test cases with different nesting levels and calling destinations.

"t_x" = testcase x
"n_x" = nesting level x (e.g. func0() { func1(); }  func1() { ...} )
"i_fn" = internal function
"exp" = exports

t_1: 10 data exp, init with 0
t_2: 10 exp, call to i_fn, n_1, simple register calculation
t_3: 10 exp, call to i_fn, n_1, emulates get_pc_thunk.bx()
t_4: 10 exp, call to i_fn, n_1, { mov eax, <outside_of_pe_dest> ; jmp eax } 
t_5: 10 data exp, init with 0xE8 (call xxx opcode) to check if the analyzer makes difference of section (.data vs .text)
t_6: 10 exp, call to i_fn, n_1, calls GetModuleHandleA("kernel32.dll")
t_7: 10 exp, call to i_fn, n_10, at n_10 call GetModuleHandleA("kernel32.dll") - nesting level test
t_8: 10 exp, call to GetModuleHandleA("kernel32.dll")
t_9: 5 exp, call GetModuleHandleA("kernel32.dll") and
     5 exp, call to i_fn, n_1, calls GetModuleHandleA("kernel32.dll")

     t1     t2     t3     t4     t5     t6      t8    t9    condition (cx < th)
--------------------------------------------------------------------
c1:  0x00   0x64   0x64   0x64   0x64   0x64   0x64   0x64   0x5F
c2:  0x00   0x64   0x64   0x64   0x64   0x64   0x64   0x32   0x3C
c3:  0x00   0x00   0x00   0x64   0x50   0x00   0x64   0x32   0x5A

Well ... determining the meaning of conditions seem to be pretty hard because several information is taken into account and I only varied call destination and nesting level.
I dropped the idea of testing various other opcode sequences/entry signatures (length of sequences, stack parameters, ..) due to time constraints ... 

c3 seems to be directly related to the destination of first call/jump encountered at nesting level 0.
So the get_pc_thunk.bx() location problem was only one reason.

That makes two options left: drop -fPIC or go with fake data exports.
Though dropping -fPIC for the libraries in question seems to be more natural solution ;-)

Regards
Comment 16 Peter Beutner 2007-11-05 13:11:12 CST
(In reply to comment #13)
> I played a bit with linker flags and it seems -shared always pulls in startup
> object files first (despite the argument placement) ... the wine object file
> with PE section header follows as first visible argument, usually
> <dllname>-<tempname>.spec.o

You can change the order as I found out today. Pass -nostartfiles to gcc and it won't link against those crt*.o files itself and you can specify them on the cmdline in the order you need, i.e. after the *.spec.o file. That seems to work
as the pe header now is the first thing in the .text section. But same as AJ's hack it isn't sufficient to please safedisc :(.


(In reply to comment #15)
> So the get_pc_thunk.bx() location problem was only one reason.

And the location obviously wasn't even a real problem as that test(c3) did pass before already. And changing the location produces even worse c2 results for gdi32. So it seems it is the fact that there is a call at all in these first 8 instructions that safedisc considers "bad".
Or does compiling without -fPIC does change something else that might be tested by safedisc?

Just dropping -fPIC would be really the easiest solution. Crawling through the safedisc code and find out how exactly these tests work isn't exactly fun :(
Comment 17 Alexandre Julliard 2007-11-05 14:07:30 CST
I don't think we can drop -fPIC, some distros don't allow text relocations, and we can't guarantee that the specified load address will always be free.
Comment 18 Anastasius Focht 2007-11-05 15:46:21 CST
Hello,

--- quote ---
I don't think we can drop -fPIC, some distros don't allow text relocations, and
we can't guarantee that the specified load address will always be free.
--- quote ---

Pity .. so we have to think about other solutions.

--- quote ---
And changing the location produces even worse c2 results for
gdi32.
--- quote ---

Thats something I wondered about too. I disassembled and compared the entry code of the first checked exports - it did not change at all: same stack frame, same location of __i686.get_pc_thunk.bx().
Usually within 8 instruction range, which makes it "bad guy".
So c2 must include additional information from somewhere.

--- quote ---
 So it seems it is the fact that there is a call at all in these first 8
instructions that safedisc considers "bad".
--- quote ---

True, any instruction which transfers execution flow within that range seems to be considered bad by analyzer (jmp, jcond, call ...) - regardless of the destination (within own .text section other outside).
Though "outside" destinations seem to have "modifier" effect on c3.

They probably analyzed windows API entry code and made up some statistics how many instructions are needed to safely detect any "hostile" execution flow transfer (jump trampolines) without having too much false positives.
Besides there are other ways to trace/hook system calls without the need to overwrite opcodes on entry code ;-)

I did some more testing with my custom "backdoor" dll and presented the analyzer some lengthy code sequences (which do just nothing but waste instructions + cpu cycles).

"t_x" = testcase x
"exp" = exports

t_1: 10 exp, seq len: 20 instr, standard stack frame (ebp), register math
t_2: 9 exp, seq len: 20 instr, standard stack frame (ebp)
     1 exp, seq len: 6 instr, standard stack frame (ebp)
t_3: 5 exp, seq len: 20 instr, standard stack frame (ebp)
     5 exp, seq len: 6 instr, standard stack frame (ebp)
t_4: 10 exp, seq len: 10 instr, standard stack frame (ebp), 9 x nop + 1 x ret

     t1     t2     t3     t4     condition (cx < th)
--------------------------------------------------------------------
c1:  0x00   0x0A   0x32   0x00   0x5F
c2:  0x00   0x01   0x19   0x00   0x3C
c3:  0x00   0x00   0x00   0x00   0x5A

The instruction classes used seem to have no influence. The disassembler only treats any control transfer instructions special (including ret).

So if we pad the entry code at least 8-9 instructions long with NOP or other padding opcodes we should get around these idiotic checks.

For testing I abused the wine tools and wine's relay thunk feature.
Seemed to me quickest solution for testing, sorry for that ;-)

--- snip ---

diff --git a/tools/winebuild/spec32.c b/tools/winebuild/spec32.c
index 8df5b85..9164827 100644
--- a/tools/winebuild/spec32.c
+++ b/tools/winebuild/spec32.c
@@ -124,7 +124,16 @@ static void output_relay_debug( DLLSPEC *spec )

         output( "\t.align %d\n", get_alignment(4) );
         output( ".L__wine_spec_relay_entry_point_%d:\n", i );
-
+
+        output( "\tnop\n" );
+        output( "\tnop\n" );
+        output( "\tnop\n" );
+        output( "\tnop\n" );
+        output( "\tnop\n" );
+        output( "\tnop\n" );
+        output( "\tnop\n" );
+        output( "\tnop\n" );
+
         if (odp->flags & FLAG_REGISTER)
             output( "\tpushl %%eax\n" );
         else
--- snip ---

(rebuild)

The results seem very positive to the current state of affair.
With the "hacked" relay thunk (first 8 bytes in thunk are NOP) the following stats are generated:

*** Note: you have to run the target with +relay to let this test work ***

    kernel32   user32      gdi32       condition (cx < threshold)
--------------------------------------------------------------------
c1:  0x12       0x04        0x14        0x5F
c2:  0x02       0x00        0x04        0x3C
c3:  0x00       0x00        0x00        0x5A

So if wine could make default thunks/entry code with just enough no-op opcode padding before jumping to real thing we can keep -fPIC.
This "feature" should be tested thoroughly, maybe some other brain damaged PE protectors/virii might find this entry code somewhat suspicious.

Though this kind of entry code is not uncommon, Micro$oft has such kind of API entry padding too to support "hot patching" technique (their infamous "mov edi, edi").

Regards
Comment 19 Alexandre Julliard 2007-11-06 16:20:44 CST
(In reply to comment #18)

> So if wine could make default thunks/entry code with just enough no-op opcode
> padding before jumping to real thing we can keep -fPIC.

We don't have thunks for normal entry points, so there's nowhere to put the nops, they would have to be generated directly by gcc.

But with proper use of the ELF visibility attribute it should be possible to prevent gcc from loading the PIC register at least on simple functions, I'm looking into that.
Comment 20 Alexandre Julliard 2007-11-08 07:09:11 CST
I committed a number of fixes that should help. The checks now succeed for me when building with gcc 4.2. Other gcc versions may yield different results depending on how good they are at avoiding PIC register loads.
Comment 21 Anastasius Focht 2007-11-08 08:21:54 CST
Hello,

statistics with the gcc hidden visibility attr (make functions internal) and stub entry points "NOP" patches committed:

before (wine-0.9.48-xxx):

    kernel32   user32      gdi32       condition (cx < threshold)
--------------------------------------------------------------------
c1:  0x59       0x53        0x5D        0x5F
c2:  0x46*      0x47*       0x50*       0x3C
c3:  0x43       0x4B        0x4B        0x5A

current wine-0.9.48-422-g0bfba69:

    kernel32   user32      gdi32       condition (cx < threshold)
--------------------------------------------------------------------
c1:  0x44       0x4B        0x3C        0x5F
c2:  0x2E       0x3A        0x21        0x3C
c3:  0x3A       0x47        0x37        0x5A

The standard case seems safe now, although user32 is near c2 threshold.
We should watch this, it's possible that future changes/additions might trigger it again.

====

current wine-0.9.48-422-g0bfba69 + relay:

    kernel32   user32      gdi32       condition (cx < threshold)
--------------------------------------------------------------------
c1:  0x52       0x5E*       0x53        0x5F
c2:  0x43*      0x5C*       0x42*       0x3C
c3:  0x00       0x00        0x00        0x5A

AJ could you please make the relay thunk entry code NOP padded too (tools/winebuild/spec32.c:output_relay_debug)?
If people run the games/apps with +relay tracking other bugs down, SafeDisc 2.x won't tolerate this.
NOP padding relay thunks won't hurt and helps people to submit logs for other bugs.

--

If anyone encounters problems with SafeDisc 2.x games not running despite recent patches, please file bug reports.
Although I think this was the major showstopper for SafeDisc 2.x left.
New targets... ;-)

Regards
Comment 22 Peter Beutner 2007-11-08 16:43:19 CST
user32 doesn't pass the check here.(compiled with gcc-4.2.2)
Just changing one other function to not retrieve the PIC register and it works.

You are not by any chance running a distribution where -fstack-protector is enabled by default? Because compiling with that makes it work, too.

btw. compiling with gcc-3.4.6 and it works right away.
Comment 23 Anastasius Focht 2007-11-08 18:01:23 CST
Hello,

--- snip ---
You are not by any chance running a distribution where -fstack-protector is
enabled by default? Because compiling with that makes it work, too.
--- snip ---

Not the case here. Gcc's stack smashing guard code won't solve this problem and additionally it applies only to certain classes of functions.
Just an example, the following entry code was generated on my ix86 system with gcc version 4.1.2 20070925 (Red Hat 4.1.2-27) and -fstack-protector:

--- snip ---
push    ebp
mov     ebp, esp
sub     esp, 48h
mov     [ebp-0Ch], ebx
call    __i686_get_pc_thunk_bx
add     ebx, 4D866h
mov     [ebp-8], esi
mov     esi, eax
mov     [ebp-30h], edx
mov     eax, large gs:14h         ; fetch random cookie from TLS
mov     [ebp-10h], eax            ; setup stack guard
xor     eax, eax                  ; clear value
...
--- snip ---

The stack cookie/guard setup code is inserted *after* PIC register load, so no help from compiler here.

--- snip ---
btw. compiling with gcc-3.4.6 and it works right away.
--- snip ---

Thats the problem. Different gcc compiler versions, different code generation (optimization techniques).

Regards
Comment 24 Peter Beutner 2007-11-08 19:50:24 CST
(In reply to comment #23)
> 
> Not the case here. Gcc's stack smashing guard code won't solve this problem 

It won't solve the problem. But it definitely has an influence on how many functions pass the safedisc check. Was just a thought that this might explain the differences we are seeing.
 
> The stack cookie/guard setup code is inserted *after* PIC register load, so no
> help from compiler here.

no, sometimes it is inserted before the PIC register load.
see:
000b0940 <GetSystemMetrics>:
   b0940:       55                      push   %ebp
   b0941:       89 e5                   mov    %esp,%ebp
   b0943:       83 ec 58                sub    $0x58,%esp
   b0946:       8b 45 08                mov    0x8(%ebp),%eax
   b0949:       89 5d f4                mov    %ebx,-0xc(%ebp)
   b094c:       65 8b 15 14 00 00 00    mov    %gs:0x14,%edx
   b0953:       89 55 f0                mov    %edx,-0x10(%ebp)
   b0956:       31 d2                   xor    %edx,%edx
   b0958:       e8 33 57 f6 ff          call   16090 <__i686.get_pc_thunk.bx>
   b095d:       81 c3 97 86 03 00       add    $0x38697,%ebx

When compiled with -fstack-protector this function will pass the safedisc check, without it would fail.
Comment 25 Anastasius Focht 2007-11-09 03:43:53 CST
Hello,

--- quote ---
But it definitely has an influence on how many
functions pass the safedisc check. Was just a thought that this might explain
the differences we are seeing.
--- quote ---

Yep, the gcc 4.1.2 snippet was just an example to show every gcc version behaves different (might generate different entry code).

--- quote ---
When compiled with -fstack-protector this function will pass the safedisc
check, without it would fail.
--- quote ---

Yes, it is the stack protectors 3 additional instructions that makes more functions pass because they move the PIC load (first call) beyond SafeDisc 2.x analyzer threshold of 8 entry code instructions.

Considering the wide range of used gcc versions more user32 functions should be marked hidden to gain more distance to analyzer c2 threshold.

Regards
Comment 26 Dan Kegel 2007-11-09 18:07:06 CST
With today's wine from git, building with gcc-4.0.3 didn't
let me install an Adobe product properly, but doing a 'make clean'
and configuring with CC=gcc-3.3  (3.3.6) worked fine.

I still couldn't activate the product, but that's probably a different problem.
Comment 27 Dan Kegel 2007-11-12 01:24:36 CST
Increasing priority
Comment 28 Alexandre Julliard 2007-11-12 05:52:31 CST
(In reply to comment #25)
> Considering the wide range of used gcc versions more user32 functions should be
> marked hidden to gain more distance to analyzer c2 threshold.

All functions that can be marked hidden already are, there's no possibility to do more in that area. We could probably fix a couple more functions by removing traces, but there isn't a lot of play here either. And considering that apparently many gcc versions work fine, I don't think it's worth a lot of effort to fix the few broken ones.
Comment 29 Mikolaj Zalewski 2007-11-13 12:42:49 CST
Created attachment 9141 [details]
stubs for user32 functions present in Vista

I was wandering whether adding stubs for more functions in the user32.spec won't improve the ratio. But as I see there is a lot of stubs commented out in user32.spec so I don't know if we can add stubs there or is the risk too high that it will break applications calling GetProcAddress and trying to use the new functionality? And does this patch make any improvement?
Comment 30 Alexandre Julliard 2007-11-13 12:55:33 CST
(In reply to comment #29)
> I was wandering whether adding stubs for more functions in the user32.spec
> won't improve the ratio. But as I see there is a lot of stubs commented out in
> user32.spec so I don't know if we can add stubs there or is the risk too high
> that it will break applications calling GetProcAddress and trying to use the
> new functionality? And does this patch make any improvement?

It would probably improve the ratio a bit, but in general we can't add stubs for functions that only exist in newer versions, because apps then start calling them and crash.

Comment 31 Dan Kegel 2007-11-14 15:27:37 CST
Dapper's default compiler seems to
be gcc-4.0.3, which doesn't work, but it comes
with an alternate compiler, gcc-3.3, which does.

If we're not going to support some versions
of gcc, we might want to add configure checks
to warn about known incompatible versions
so people can fall back sensibly.
Comment 32 Peter Beutner 2007-11-17 11:29:02 CST
(In reply to comment #28)
> All functions that can be marked hidden already are, there's no possibility to
> do more in that area. We could probably fix a couple more functions by removing
> traces, but there isn't a lot of play here either. And considering that
> apparently many gcc versions work fine, I don't think it's worth a lot of
> effort to fix the few broken ones.

hmm, you can hardly call the compiler broken just because it loads the PIC register during the first eight instructions and not one or two instructions later. And relying that much on a specific function entry sequence seems rather fragile to me.(and the fact worries me that the working gcc versions are mostly older ones, while the most recent ones(4.2.2 and some 4.3 snapshot i tested) don't work)

Anyway, thinking about it I was wondering if we couldn't do more using the visibility attribute. Shouldn't it be possible to mark the exported API functions of a dll with visibility=hidden as well? Because (if I see this correctly) as far as the elf linker is concerned they don't need to be exported at all. When you link a dll/program against lets say user32, no user32 api reference actually is linked to the corresponding symbol in user32.dll.so. Instead they are linked to the PE import table that winebuild/winegcc creates and links into that dll/program. So as long as everybody uses winegcc/winebuild you can mark everything in the dll as hidden. Or am I missing something?
When marking those functions hidden the number of functions that pass the safedisc test nearly doubles(from 143 to 242, if my testing app counts correctly, threshold is somewhere around 145). Imho that would be much safer than relying on the question if the gcc optimization of the day would still produce enough functions with a "safe" function entry sequence.

Another option would be to mark these functions with visibility=protected, which means that every references to symbols defined in the same object are satisfied locally, i.e. calling functions in the same dll don't go through the plt. But the symbols are still exported and visible outside the library.
That doesn't work for functions where at some point a reference to that function is used, but that are only a few cases(3 in user32).
Though I vaguely remember having read somewhere that using "protected" isn't as optimal as it sounds.
Comment 33 Alexandre Julliard 2007-11-18 07:21:43 CST
(In reply to comment #32)
> hmm, you can hardly call the compiler broken just because it loads the PIC
> register during the first eight instructions and not one or two instructions
> later. And relying that much on a specific function entry sequence seems rather
> fragile to me.(and the fact worries me that the working gcc versions are mostly
> older ones, while the most recent ones(4.2.2 and some 4.3 snapshot i tested)
> don't work)

I'm not saying the compiler is broken in general (though that may well be true of many 4.0 versions...)

> Anyway, thinking about it I was wondering if we couldn't do more using the
> visibility attribute. Shouldn't it be possible to mark the exported API
> functions of a dll with visibility=hidden as well? Because (if I see this
> correctly) as far as the elf linker is concerned they don't need to be exported
> at all. When you link a dll/program against lets say user32, no user32 api
> reference actually is linked to the corresponding symbol in user32.dll.so.
> Instead they are linked to the PE import table that winebuild/winegcc creates
> and links into that dll/program. So as long as everybody uses winegcc/winebuild
> you can mark everything in the dll as hidden. Or am I missing something?

They need to be exported for the benefit of C++ winelib apps that need to import  some dlls directly (like a gcc-compiled mfc), because the name mangling of gcc is not compatible with msvc.

> Another option would be to mark these functions with visibility=protected,
> which means that every references to symbols defined in the same object are
> satisfied locally, i.e. calling functions in the same dll don't go through the
> plt. But the symbols are still exported and visible outside the library.
> That doesn't work for functions where at some point a reference to that
> function is used, but that are only a few cases(3 in user32).
> Though I vaguely remember having read somewhere that using "protected" isn't as
> optimal as it sounds.

The semantics of protected would be what we need, but yes, I don't think it works all that well in practice.

The only really foolproof way would be to hack gcc to not put a call in the first 8 instructions. But I think the chances of getting such a patch into upstream gcc is fairly small...
Comment 34 M Welinder 2007-12-01 18:47:21 CST
That sounds fairly simple to fix, even without gcc changes.

1. Change the .c.o rules into two rules: .c.s and .s.o

2. For the .s rule, use a specialized assembler program which throws a
   set of nops into the text as needed and feeds that to the regular
   assembler.

That should be a <50 lines perl script assembler.
Comment 35 M Welinder 2007-12-01 19:36:50 CST
Specifically...

In Makefile, replace .c.o rule with these.  (The first one cancels the
built-in .c.o rule.)

%.o : %.c

.c.s:
	$(CC) -S $(ALLCFLAGS) -o $@.tmp $<
	$(TOPSRCDIR)/tools/asmfixer <$@.tmp >$@
	$(RM) $@.tmp

And add this in tools/asmfixer:

#!/usr/bin/perl -w

use strict;

my $instrs = 100000;

while (<STDIN>) {
    $instrs = 0 if (/^[a-zA-Z_][a-zA-Z_0-9]*:/);
    while ($instrs < 8 && /^\t(call|j)/) { print "\tnop\n"; $instrs++; }
    $instrs++ if /^\t[a-z]/;
    print;
}
Comment 36 Dan Kegel 2008-01-14 18:47:02 CST
FWIW, here are two "bad" compilers, i.e. if you build today's git
with these compilers, photoshop CS's installer will abort before
asking you to activate:
  Ubuntu 7.10's default compiler, gcc-4.1.3 20070929 (prerelease)
  Ubuntu 6.06's default compiler, gcc-4.0.3 (Ubuntu 4.0.3-1ubuntu5)
On both of these operating systems, to build Wine properly, you can
install gcc-3.3 as a workaround, e.g.

  sudo apt-get install gcc-3.3
  cd wine-git
  make distclean || true
  ./configure CC=gcc-3.3
Comment 37 Alexandre Julliard 2008-01-16 13:39:23 CST
It works for me with gcc-4.1.3 20070929, but it requires my .init section patch. I've just committed it.
Comment 38 Dan Kegel 2008-04-03 08:28:59 CDT
Shall we mark this fixed, then?
Comment 39 Anastasius Focht 2008-04-03 13:49:46 CDT
Hello,

yes because it targeted a specific issue encountered with SafeDisc v2 and it should work for the majority of people now.

It might be feasible to setup a small wine wiki entry, describing the problem and solution (bad/recommended gcc versions) so this knowledge is not lost.
Future bug reports targeting this issue should be closed and redirected to wiki entry then.

Pending issues with later SafeDisc versions are tracked separately.
Though I'm not really happy with the existence of the "generic" ones like SafeDisc (http://bugs.winehq.org/show_bug.cgi?id=219) and SecuROM (http://bugs.winehq.org/show_bug.cgi?id=7065).
Such "meta" bugs will stay for a very long time because they cover various versions.

I made the fault of keeping various meta bugs alive by adding more and more info/patches/workarounds.
Ideally there should be bugs for major versions or version ranges, which employ different techniques (3.x-5.x, 6.x-7.x, ...) with a few games each as testbed.

Regards
Comment 40 Alexandre Julliard 2008-04-04 10:06:22 CDT
Closing bugs fixed in 0.9.59.
Comment 41 Anastasius Focht 2011-10-12 04:46:09 CDT
Hello,

filling/correcting fields ...

Regards


Hosted By CodeWeavers