WineHQ
Bug Tracking Database – Bug 3817

Last modified: 2016-01-19 07:45:00 CST  

InstallShield very slow when copying many small files

Bug 3817 - InstallShield very slow when copying many small files
Status: NEW
Product: Wine
Classification: Unclassified
Component: -unknown
Version: 0.9.1.
Hardware: Other Linux
Importance: P2 minor
Target Milestone: ---
Assigned To: Mr. Bugs
URL: http://www.gamefront.com/files/files/...
Keywords: download, Installer, performance
Depends on:
Blocks:
Reported: 2005-11-11 19:13 CST by Scott Ritchie
Modified: 2016-01-19 07:45 CST (History)
CC List: 14 users

See Also:
Regression SHA1:
Fixed by SHA1:
Distribution: ---
Staged patchset:


Attachments
Endlessly looping configure after running make in package build (173.15 KB, text/plain)
2007-07-15 15:55 CDT, Scott Ritchie
callgrind profiling path (55.60 KB, image/png)
2008-02-03 16:34 CST, ebfe

Description Scott Ritchie 2005-11-11 19:13:41 CST
InstallShield seems to run very slowly when it is moving many small files,
rather than a few big files.  This results in programs that have hundreds of
little bitmap files to install taking several hours, like Hearts of Iron 2.
Comment 1 Erik Moden 2005-11-12 18:16:41 CST
I also get this problem, and I can add that this also causes a high cpu load:

Tasks: 145 total,   2 running, 143 sleeping,   0 stopped,   0 zombie
Cpu(s): 22.1% us, 31.7% sy,  0.0% ni, 45.8% id,  0.0% wa,  0.0% hi,  0.3% si
Mem:   1035296k total,  1018028k used,    17268k free,    28176k buffers
Swap:  2048248k total,     2720k used,  2045528k free,   728936k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
19310 erik      16   0  365m  24m 8848 R 74.3  2.4  14:23.37 wine
19305 erik      15   0  7028 5124 4788 S 30.2  0.5   4:48.88 wineserver
Comment 2 Robert Lundmark 2006-02-19 15:07:48 CST
This is still a big problem in 0.9.8. Would be nice to get some input from the
devs on why this happens.
Comment 3 Vitaliy Margolen 2006-02-19 20:42:18 CST
I think it's the asynch, case-insensitivity and slowness in wineserver
communications. You could run oprofile and see where Wine wastes all that time.
Comment 4 Rob Shearman 2006-06-28 15:40:27 CDT
I have made a few optimisations to the COM and RPC code in recent weeks, so it
would be good if this was retested.
Comment 5 Scott Ritchie 2006-07-12 13:49:54 CDT
The problem is still present in 0.9.16, however it looks like you made quite a
few patches that didn't get in until 0.9.17.  I'll test that after I get the
0.9.17 package built.
Comment 6 Scott Ritchie 2006-07-13 01:52:42 CDT
After testing 0.9.17, there doesn't seem to be much improvement in terms of
overall speed.

CPU usage, however, is pretty low.  I suspect there may be something more
nebulous at work here.

My filesystem type is reiserfs, by the way, and I can send you a CD image of
Hearts of Iron 2 if you'd like to test it.
Comment 7 Rob Shearman 2006-07-21 04:25:32 CDT
Yes, getting a CD image of Hearts of Iron 2 would be useful.
Comment 8 Rob Shearman 2006-07-30 18:03:42 CDT
Using oprofile, I generated a list of all of the functions using over 1% of the
time in the application:
407955   20.6001  ntdll.dll.so             RtlUpcaseUnicodeString
309817   15.6445  libwine.so.1             wine_cp_wcstombs
218996   11.0584  ntdll.dll.so             RtlIsNameLegalDOS8Dot3
198561   10.0265  libwine.so.1             wine_utf8_mbstowcs
118514    5.9845  ntdll.dll.so             RtlUpcaseUnicodeStringToCountedOemString
90594     4.5746  ntdll.dll.so             wine_nt_to_unix_file_name
74824     3.7783  ntdll.dll.so             RtlUnicodeToOemN
58600     2.9591  libwine.so.1             memicmpW
54659     2.7601  ntdll.dll.so             ntdll_umbstowcs
45692     2.3073  ole32.dll.so             StorageImpl_GetNextBlockInChain
43063     2.1745  ntdll.dll.so             RtlUnicodeStringToOemSize
35359     1.7855  ntdll.dll.so             hash_short_file_name
26010     1.3134  ole32.dll.so             StorageUtl_ReadDWord
25451     1.2852  ntdll.dll.so             .plt
24873     1.2560  ntdll.dll.so             find_free_area
21688     1.0952  libwine.so.1             tolowerW

As can be seen, RtlUpcaseUnicodeString is the top CPU user which is likely being
called by the file or loader code.
Comment 9 Rob Shearman 2006-07-31 19:59:09 CDT
There is a case where performance can be improved, but I'm afraid this bug looks
like it boils down to the basic problem of it being damn hard to prove a
case-insensitive filename doesn't exist on a case-sensitive filesystem.
Approximately 88% of the CPU time is in doing this.
Comment 10 Scott Ritchie 2006-07-31 23:01:48 CDT
I'm not so sure it's that difficult.

The best place for this might be kernel space, but couldn't we cache the (case
insensitive) file names in a directory somewhere and just compare our current
file with that?  I'm not sure why that would necessarily be much more cpu
intensive than a straight comparison.
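
A minimal sketch of the per-directory cache idea suggested above, in Python for brevity. It is not Wine's code: the function names are hypothetical, and `str.lower()` only approximates the Windows upcase rules. It also ignores the invalidation problem (other processes creating or deleting files).

```python
def build_index(names):
    """Map each lower-cased name to the on-disk spellings that fold to it."""
    index = {}
    for name in names:
        index.setdefault(name.lower(), []).append(name)
    return index


def lookup(index, wanted):
    """Return every on-disk name that matches `wanted` ignoring case."""
    return index.get(wanted.lower(), [])


# Typical use (assumption): build the index once from a directory listing,
# e.g. build_index(os.listdir(path)), then answer many lookups from it.
index = build_index(["Hello.TXT", "llama.txt", "kitty.txt"])
print(lookup(index, "hello.txt"))
print(lookup(index, "hell.txt"))
```

With the index in place, proving that a name does not exist is a single dictionary miss instead of a pass over the whole directory.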
Comment 11 Jan Zerebecki 2006-08-03 09:47:16 CDT
see http://wiki.winehq.org/CaseInsensitiveFilenames for more info
Comment 12 Stephen Mirowski 2006-10-06 17:47:37 CDT
This looks like the same problem I am having with Star Wars: KotOR I and II. 
For Knights I, it starts going slow right after inserting disc 3 and continues
slow with disc 4 (both have wav files).  Total wine install is about 16 hrs. 
Probably a 15 min install on a Windows box.

Knights II, the problem starts on disc 4.

Again, CPU goes 100% installing these files.  Interesting, though, the CD is
spinning so very hard for each file.  Almost like it's searching the entire disc
for the file.

I have tried this on 0.9.21 and .22.
Comment 13 Scott Ritchie 2007-03-12 03:11:14 CDT
I posted a link to this bug on a web forum, and got a few comments:

http://forums.somethingawful.com/showthread.php?threadid=2351561&perpage=40&pagenumber=3

The next two posts are reposts of their comments.
Comment 14 Scott Ritchie 2007-03-12 03:11:37 CDT
1st comment:

The bug pretty clearly shows a specific case of showing nonexistence of a
case-insensitive name on a case sensitive filesystem, which isn't exactly a
string comparison like I talked about before (in-memory comparison). The problem
is higher level than the one that call would help out with. Personally, I would
have written the operation as a lookup on a trie generated for each directory's
file listings. I specifically say trie because so, so many people just dump shit
into a hash table and don't give a rip about memory use.

I'm not precisely sure how the vfat / fat32 driver handles existence of
case-insensitive filenames, but I'd recommend looking at that for a solution
from those who have put more than 2 minutes thinking about the issue. From what
I can tell in the posts, the existing method is really, really inefficient for
large numbers of files (and even worse for longer filename inputs).

I think it wouldn't be a problem to add some code to Wine itself to cache
filenames it encounters on a path internally - not necessarily as a kernel
extension for userspace. Another option as a stopgap solution is to write
something for FUSE to cover the case insensitivity problem at the filesystem
level and have Wine wrap around the local filesystem via FUSE. I dunno how
lightweight FUSE is, but it may be worth a shot.
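
The trie lookup mentioned in the first paragraph of this comment could look roughly like the following. This is an illustrative sketch only: the class and function names are made up, `str.lower()` stands in for the real case-folding rules, and nothing here reflects how Wine stores directory listings.

```python
class TrieNode:
    __slots__ = ("children", "names")

    def __init__(self):
        self.children = {}   # lower-cased character -> TrieNode
        self.names = []      # on-disk spellings that end at this node


def trie_insert(root, name):
    """Insert an on-disk name under its lower-cased spelling."""
    node = root
    for ch in name.lower():
        node = node.children.setdefault(ch, TrieNode())
    node.names.append(name)


def trie_find(root, wanted):
    """Return matching on-disk names, or [] as soon as a character has no branch."""
    node = root
    for ch in wanted.lower():
        node = node.children.get(ch)
        if node is None:
            return []
    return node.names


root = TrieNode()
for entry in ["hello.txt", "HeLlO.txt", "llama.txt", "kitty.txt"]:
    trie_insert(root, entry)
print(trie_find(root, "HELLO.TXT"))
print(trie_find(root, "hell.txt"))
```

Non-existence is established after at most `len(wanted)` steps, independent of how many files the directory holds.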
Comment 15 Scott Ritchie 2007-03-12 03:11:56 CDT
It shouldn't be that complicated.
Get a list of every file in a directory, and then determine the distance on the
character table between lower case and upper case.

Write a comparison function, it theoretically should just be twice as slow as
finding a case insensitive string in a list of case insensitive strings.

If the function is building a list of every possible filename outcome for each
file before checking against it, _THAT_ is a waste of CPU cycles and will work
slow as hell. It should just be done realtime...I haven't looked at the code yet.

The method I described would be fast, any modern PC could do it hundreds of
thousands of times a second.


So, it's a matter of... let's say we have a directory

c:\whatever
contains
hello.txt
HeLlO.txt (at least in *nix FS)
llama.txt
kitty.txt

We're looking for case insensitive hell.txt

so we iterate through the list looking for H and h; for each match, take the
next letter of hell and compare against e and E. It's just a couple of nested
loops, and it would be plenty fast. Do the same for the last two letters of
hell. Don't apply this algorithm to the list of files preemptively; apply it to
the file we're looking for on the fly, in a nested loop that only has to scan
the list once.
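
The single-pass comparison described in this comment can be sketched as follows, using the example directory above. The helper names are hypothetical and `str.lower()` is a stand-in for the real case mapping; note that each lookup still costs one full pass over the directory.

```python
def names_equal_nocase(a, b):
    """Compare two names character by character, ignoring case."""
    if len(a) != len(b):
        return False
    for x, y in zip(a, b):
        # Bail out at the first character that differs in both cases.
        if x != y and x.lower() != y.lower():
            return False
    return True


def scan_directory(entries, wanted):
    """Scan the listing once and collect every case-insensitive match."""
    return [e for e in entries if names_equal_nocase(e, wanted)]


entries = ["hello.txt", "HeLlO.txt", "llama.txt", "kitty.txt"]
print(scan_directory(entries, "hell.txt"))
print(scan_directory(entries, "HELLO.TXT"))
```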
Comment 16 Rob Shearman 2007-03-12 05:41:57 CDT
The vfat / fat32 driver does already have a function for finding a filename
case-insensitively and Wine already uses it. It doesn't help the people that are
installing this app on other filesystems.

The comments about using trees and caching the information don't take into
account problems with other processes creating files and having to keep the
cache up to date. The comment about efficient upper-case comparison doesn't
realise that the RtlUpcaseUnicodeString function is about as efficient as it can
be, but it is just being called too many times.
Comment 17 Scott Ritchie 2007-03-12 18:33:06 CDT
Does that mean this bug doesn't exist if the .wine directory resides on a
case-insensitive file system?
Comment 18 Jan Zerebecki 2007-03-24 01:13:58 CDT
Rob Shearman, how about watching with inotify and updating the cache from its events?
Comment 19 Rob Shearman 2007-03-24 07:32:48 CDT
Sure, you can do that. You would only be able to do it for a selected number of
directories at a time and you'd have to decide when to process the inotify
events - a separate thread may not be an option due to applications not
expecting this.
Comment 20 jvlad 2007-07-08 17:42:46 CDT
What about a TTL approach?
Say we have read a directory to look up a file. Next time we can check whether
the cache is dirty by applying TTL logic. If the TTL has not expired, we can
use the cache; otherwise it's necessary to refill it.
With, say, a 2-second TTL, I'd expect a great performance increase. As a bonus,
we would not need to keep coherency at all. Nothing is wrong if one process has
created a file and another process doesn't see it for 2 seconds. The same goes
for file deletion/rename operations. The only thing needed is a critical section
that will isolate concurrent access from different threads.
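
A rough sketch of the TTL idea, assuming a 2-second lifetime and a lock in place of the critical section. The class name and its `lister`/`clock` parameters are hypothetical and exist only so the behaviour can be exercised without touching a real filesystem.

```python
import os
import threading
import time


class TtlDirCache:
    def __init__(self, ttl=2.0, lister=os.listdir, clock=time.monotonic):
        self.ttl = ttl
        self.lister = lister      # function returning the names in a directory
        self.clock = clock        # monotonic time source
        self.lock = threading.Lock()
        self.entries = {}         # directory -> (expiry, {lower name: on-disk name})

    def lookup(self, directory, wanted):
        """Return the on-disk name matching `wanted` ignoring case, or None."""
        with self.lock:
            now = self.clock()
            cached = self.entries.get(directory)
            if cached is None or now >= cached[0]:
                # Missing or expired: re-read the directory once.
                names = {}
                for name in self.lister(directory):
                    names.setdefault(name.lower(), name)
                cached = (now + self.ttl, names)
                self.entries[directory] = cached
            return cached[1].get(wanted.lower())
```

Within the TTL window a file created by another process is simply not visible, which is the trade-off the comment accepts.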
Comment 21 Scott Ritchie 2007-07-15 15:55:20 CDT
Created attachment 7122 [details]
Endlessly looping configure after running make in package build
Comment 22 Scott Ritchie 2007-07-15 15:56:40 CDT
Comment on attachment 7122 [details]
Endlessly looping configure after running make in package build

posted to wrong bug due to Bugzilla's "helpful" feature of jumping to the next
one on your list whenever you make a comment...
Comment 23 Scott Ritchie 2008-01-18 03:13:35 CST
"The vfat / fat32 driver does already have a function for find a filename
case-insensitively and Wine already uses it. It doesn't help the people that are
installing this app on other filesystems."

When does Wine determine to do this?  If I drag a folder from my ~/.wine/drive_c directory to a vfat filesystem and then create a symlink to it, is Wine smart enough to use that?

I tried doing exactly this, and performance is still really slow for the application (System Shock 2).  I'm not 100% certain the load time is due to this issue, but I do know that System Shock 2 works by loading up some files from one folder and then looking for newer ones with the same (case-insensitive) name in the main folder to use instead as newer replacements.
Comment 24 ebfe 2008-02-03 16:34:13 CST
Created attachment 10600 [details]
callgrind profiling path

profiling wine while saving a single file in a directory of a thousand files.
Comment 25 ebfe 2008-02-03 16:34:26 CST
i've taken a look at this and imho there are several things important about this whole thing:

- the major performance-drawback comes from wine trying to find files in large directories. I've attached a profiling-graph that shows how 70% of the cpu-cycles for a single file-lookup are spent in the function 'wine_nt_to_unix_file_name'. different profiling-scenarios reveal similar results.

- the basic problem is to map a case-insensitive filesystem like NTFS/FAT to a case-sensitive filesystem like linux vfs. when wine gets a request for "foo.BAR", it looks for that filename. when it is not found, wine scrolls through the entire directory, converting every filename to utf8 and doing a case-insensitive comparison on that. this is done for every filename-lookup, so in a directory of a thousand files, adding another 100 files leads to 100,000 utf8/case-conversions.

- the hash- and utf8-conversion functions are horribly slow. those functions rely on byte-wise operations which are in fact only emulated on newer (read: 486 and up) processors. rewriting the utf8-conversion to do 2 bytes at a time (e.g. build a conversion-table at runtime for two-byte input) will effectively double those functions' performance.
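
To make the arithmetic in the second point concrete, here is a small model of the lookup as described there: try the exact name, then fall back to a linear case-insensitive scan. It is a simplified sketch with hypothetical function names, not a transcription of `wine_nt_to_unix_file_name`.

```python
def find_name(entries, wanted):
    """Exact match first, then a linear case-insensitive scan.

    Returns (on-disk name or None, number of case-insensitive comparisons).
    """
    if wanted in entries:
        return wanted, 0
    comparisons = 0
    folded = wanted.lower()
    for name in entries:
        comparisons += 1
        if name.lower() == folded:
            return name, comparisons
    return None, comparisons


def cost_of_adding(existing, new_names):
    """Count the comparisons needed to check each new name before creating it."""
    entries = list(existing)
    total = 0
    for name in new_names:
        _, n = find_name(entries, name)
        total += n
        entries.append(name)
    return total


existing = [f"file{i:04d}.bmp" for i in range(1000)]
new_names = [f"NEW{i:03d}.BMP" for i in range(100)]
print(cost_of_adding(existing, new_names))  # 104950
```

Every new file misses the exact check and scans the whole, growing directory, so the total is 1000 + 1001 + ... + 1099 = 104,950 comparisons, in line with the rough 100,000 figure above.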


there are some general problems:

- currently wine simply opens a stream to the directory and reads all entries. as there is no ordering of filenames applied, the first entry that matches wins. the order is completely left to the OS and therefore random from our point of view. this may lead to the situation where wine returns the file "foo.Bar" one time and "Foo.bar" another time, when asked for "FOO.BAR".
- as far as i can see, there is an unsolvable file-locking problem. creating a file on the "real" fs implies automatic locking against other processes which may do the same. since filenames are immutable between all processes performing on the "real" filesystem, but not between those and wine, we will always theoretically face TOCTOU bugs.


imho the best solution for this (except the locking-thing) would be to rely on a cached intermediate state. when looking up files in a certain directory, we read the entire directory into a cache-structure (including ordering Foo.bar and foo.Bar) and start monitoring this directory using inotify.
one big advantage about inotify is the clarity of messages: when a new file is added to the directory in question, we get to know the file's name in advance and don't need to re-read the entire directory. consider a directory with 1,000 files, adding 100 files would require reading the directory once, reading from the cache 100 times and adding to the cache 100 times.
if inotify is not present, we can always fall back to the slower version already implemented.
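
A sketch of the cached intermediate state proposed above. The inotify side is deliberately left out: the `("create", name)` and `("delete", name)` events below are hypothetical stand-ins for whatever a real watcher would deliver, and the class name is made up. Sorting each bucket gives the deterministic choice between Foo.bar and foo.Bar.

```python
class DirCache:
    def __init__(self, initial_names):
        # One directory read fills the cache.
        self.by_folded = {}
        for name in initial_names:
            self.add(name)

    def add(self, name):
        bucket = self.by_folded.setdefault(name.lower(), [])
        if name not in bucket:
            bucket.append(name)
            bucket.sort()   # fixed order, so the same spelling always wins

    def remove(self, name):
        bucket = self.by_folded.get(name.lower(), [])
        if name in bucket:
            bucket.remove(name)
        if not bucket:
            self.by_folded.pop(name.lower(), None)

    def apply_event(self, kind, name):
        """Apply one change notification instead of re-reading the directory."""
        if kind == "create":
            self.add(name)
        elif kind == "delete":
            self.remove(name)
        else:
            raise ValueError(f"unknown event kind: {kind}")

    def find(self, wanted):
        bucket = self.by_folded.get(wanted.lower())
        return bucket[0] if bucket else None


cache = DirCache(["foo.Bar", "Foo.bar", "readme.txt"])
cache.apply_event("create", "new.dat")
print(cache.find("FOO.BAR"), cache.find("NEW.DAT"))
```

The locking concern raised above is untouched by this sketch: between a cache lookup and the actual file operation another process can still change the directory.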
Comment 26 Austin English 2008-06-12 13:19:59 CDT
Is this still an issue in current (1.0-rc4 or newer) wine?
Comment 27 Antonio López 2008-08-09 18:43:36 CDT
Yes, this is still an issue in wine 1.1.2 (Jack Keane installation).
Comment 28 Radosław Ciechowski 2009-07-05 13:59:23 CDT
This is still an issue in wine 1.1.25 (Hearts of Iron 2 installation).
Comment 29 Joseph Miller 2009-09-05 09:10:55 CDT
> imho the best solution for this (except the locking-thing) would be to rely on
> a cached intermediate state. when looking up files in a certain directory, we
> read the entire directory into a cache-structure (including ordering Foo.bar
> and foo.Bar) and start monitoring this directory using inotify.
> one big advantage about inotify is the clarity of messages: when a new file is
> added to the directory in question, we get to know the file's name in advance
> and don't need to re-read the entire directory. consider a directory with 1,000
> files, adding 100 files would require reading the directory once, reading from
> the cache 100 times and adding to the cache 100 times.
> if inotify is not present, we can always fall back to the slower version
> already implemented.

This sounds like the best solution to me as well.  The only problem with it is that it is unlikely one would want wine to monitor every single directory that any program accesses.  Windows programs are notorious for opening many files in many directories.

Perhaps the cache could be designed as a needs-based cache.  The cache could keep a list of each directory accessed, the last time it was accessed, and a count for how many times the directory has been accessed in the last 2-3 seconds.  Once the count rises above a threshold for a directory, caching of the directory would begin.  Once a directory hasn't been accessed for a period of time, it could be removed from monitoring and caching.  This would provide for smart memory management and significant performance increases as well.
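
The needs-based policy in the previous paragraph might be tracked along these lines. The class, the threshold of five hits, the 3-second window and the 30-second idle timeout are all assumptions chosen for illustration; the sketch only decides which directories deserve a cache and does not do any caching itself.

```python
from collections import defaultdict, deque


class AccessTracker:
    def __init__(self, window=3.0, threshold=5, idle_timeout=30.0):
        self.window = window              # seconds over which hits are counted
        self.threshold = threshold        # hits within the window that trigger caching
        self.idle_timeout = idle_timeout  # seconds without access before eviction
        self.hits = defaultdict(deque)    # directory -> recent access times
        self.cached = set()               # directories currently being cached

    def record_access(self, directory, now):
        """Note one access; return True if the directory is (now) cached."""
        hits = self.hits[directory]
        hits.append(now)
        while hits and now - hits[0] > self.window:
            hits.popleft()
        if len(hits) >= self.threshold:
            self.cached.add(directory)
        return directory in self.cached

    def expire_idle(self, now):
        """Stop caching and tracking directories that have gone quiet."""
        for directory in list(self.hits):
            hits = self.hits[directory]
            if not hits or now - hits[-1] > self.idle_timeout:
                self.cached.discard(directory)
                del self.hits[directory]
```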

More comments are needed on this so a request can be submitted for the best way to do this.  Wine's got a good history of rejecting patches and I'm not going to be very excited about starting work only to have to redo everything once I finish.

We also need a test program to demo this.  I will try to make a simple EXE to demonstrate this and do a simple benchmark so progress can be shown.
Comment 30 Scott Ritchie 2009-09-05 17:12:39 CDT
Another route that might be more robust is to incorporate ciopfs (the case-insensitive on purpose filesystem) into the Wine packaging itself.  If the entire ~/.wine directory is case-insensitive then we get both speed improvements and avoid the possibility of file confusion (such as when a user extracts a patched FOO.DAT into a folder containing Foo.dat).


Wine will need to be modified to know when it's running on a ciopfs location and use the code similar to fat32 above, though that shouldn't be too hard.
Comment 31 Austin English 2009-09-05 21:58:52 CDT
(In reply to comment #30)
> Another route that might be more robust is to incorporate ciopfs (the
> case-insensitive on purpose filesystem) into the Wine packaging itself.  If the
> entire ~/.wine directory is case-insensitive then we get both speed
> improvements and avoid the possibility of file confusion (such as when a user
> extracts a patched FOO.DAT into a folder containing Foo.dat).
> 
> 
> Wine will need to be modified to know when it's running on a ciopfs location
> and use the code similar to fat32 above, though that shouldn't be too hard.

And if a native program writes to ~/.wine, what happens?
Comment 32 Joseph Miller 2009-09-05 22:33:58 CDT
> And if a native program writes to ~/.wine, what happens?

I agree.  The point of wine is portability.  The problem with doing something like forcing a filesystem is that one cannot foresee every possible use scenario and many things are likely to break. It also seems that wine is tending toward integration with native, not separation e.g. desktop integration, etc.

The other angle is that if one wants to use a different filesystem for their ~/.wine then they can already set this up manually.
Comment 33 ebfe 2009-09-06 00:34:42 CDT
Strong -1 on the cache solution. No matter how it is implemented, it will remain arbitrary to a certain degree.

It *will* be an endless source of weird bugs and glitches related to file access in programs running under wine.
Comment 34 Scott Ritchie 2009-09-06 01:13:31 CDT
> And if a native program writes to ~/.wine, what happens?

That's the point - the ~/.wine prefix would be mounted case-insensitive at login (rather than wineserver start), so there would never be a conflicting file created.
Comment 35 Austin English 2009-09-06 01:31:46 CDT
(In reply to comment #34)
> > And if a native program writes to ~/.wine, what happens?
> 
> That's the point - the ~/.wine prefix would be mounted case-insensitive at
> login (rather than wineserver start), so there would never be a conflicting
> file created.

I assume you intend for this to be done for any WINEPREFIX creation, not just ~/.wine?

There may be performance penalties associated with doing so..
Comment 36 sehe 2009-11-07 17:09:53 CST
Is there already a way to let wine know that it is running on a ci-fs (like ciopfs or zfs-fuse with 'casesensitivity=insensitive' property set)?
Comment 37 Elad Alfassa 2010-01-22 12:11:29 CST
Wine should determine the prefix's file system, and check if it is in a list of known case-insensitive file systems (ntfs, fat, ciopfs and such). It should be easy to implement this small check.
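
A sketch of such a check, working on text in the `/proc/mounts` format (device, mount point, type, ...). The function names and the contents of `CASE_INSENSITIVE_FS` are assumptions: the type strings a given system reports for FUSE filesystems vary, and a filesystem type by itself does not guarantee case-insensitive behaviour (that can depend on mount options).

```python
# Illustrative, assumed list of type names; adjust for the system at hand.
CASE_INSENSITIVE_FS = {"vfat", "msdos", "fuse.ciopfs"}


def fs_type_for(path, mounts_text):
    """Return the filesystem type of the longest mount point containing `path`."""
    path = path.rstrip("/") + "/"
    best_mount, best_type = "", None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue
        # Note: /proc/mounts escapes spaces in paths as \040; not handled here.
        mount_point, fs_type = fields[1], fields[2]
        prefix = mount_point.rstrip("/") + "/"
        if path.startswith(prefix) and len(mount_point) >= len(best_mount):
            best_mount, best_type = mount_point, fs_type
    return best_type


def is_case_insensitive(path, mounts_text):
    return fs_type_for(path, mounts_text) in CASE_INSENSITIVE_FS


mounts = "/dev/sda1 / ext4 rw 0 0\n/dev/sdb1 /home/u/.wine vfat rw 0 0\n"
print(fs_type_for("/home/u/.wine/drive_c", mounts))
print(is_case_insensitive("/home/u/docs", mounts))
```

On a real system the text would come from reading `/proc/mounts`; the sample string above is made up.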

Integrating ciopfs into wine is a great idea. Native apps usually don't write into ~/.wine. We can mount all the prefixes on login as suggested above.

Please fix this bug ASAP.
Comment 38 Xavier Vachon 2010-08-26 08:13:13 CDT
Is this still an issue in current git? (wine-1.3.1-182-g56b8d5d)

I remember playing KOTOR1 not very long ago and I don't think it took that long to install. I can try to install it again soon and see how much time it takes.
Comment 39 Dan Kegel 2010-09-24 12:14:48 CDT
Seems fast enough to me lately.  Can someone give an example of an
installer with this problem?  (Preferably downloadable.)
Comment 40 Béla Gyebrószki 2011-11-20 08:35:37 CST
(In reply to comment #39)
> Seems fast enough to me lately.  Can someone give an example of an
> installer with this problem?  (Preferably downloadable.)

Hearts of Iron 3 demo also uses InstallShield; the installed demo contains more than 40k files, mostly small .tga, .txt and .dds files.
Installation took nearly 1 hour with Wine-1.3.33 on an EXT4 filesystem.
The same installation took about 10 minutes in VirtualBox (WinXP guest).

http://www.fileplanet.com/203292/200000/fileinfo/Hearts-of-Iron-3-Demo
Comment 41 Béla Gyebrószki 2012-07-09 10:32:16 CDT
Still present in Wine 1.5.8, tested with Hearts of Iron 3 demo. Installation took 48 minutes on an EXT4 partition which was 88% full.

The 2 patches in bug #17956 (attachment #40936 [details] and attachment #40937 [details]) greatly reduce the time needed to install HoI 3 demo: installation took only 12 minutes on the same partition. That's roughly the same amount of time that I measured in Win XP (running in VirtualBox).
Comment 42 Justin Noah 2013-04-28 03:52:20 CDT
Still present in Wine 1.5.29, tested with Hearts of Iron 3 demo. At 12 minutes in, the install was at 24%, cpu usage spiked, and internal fans were spinning on high so I did not let it finish.

I ran the install from a fresh wine prefix, no patches to wine (unless Mint14Cinnamon adds any).
Comment 43 Ken Sharp 2014-02-01 09:21:50 CST
Still present in wine-1.7.11-206-g82b3813.

Tested with HOI3 Demo.
http://www.gamefront.com/files/files/14192631/HeartsOfIron3_Demo.exe

