<?xml version="1.0" encoding="iso-8859-1"?>
<!DOCTYPE chapter PUBLIC "-//Samba-Team//DTD DocBook V4.2-Based Variant V1.0//EN" "http://www.samba.org/samba/DTD/samba-doc">
<chapter id="largefile">
<chapterinfo>
	&author.jeremy;
	&author.jht;
	<pubdate>March 5, 2005</pubdate>
</chapterinfo>
<title>Handling Large Directories</title>

<para>
<indexterm><primary>performance degradation</primary></indexterm>
<indexterm><primary>large numbers of files</primary></indexterm>
<indexterm><primary>large directory</primary></indexterm>
Samba-3.0.12 and later include a fix for sites that have seen performance degrade when Samba-3 serves
applications that need large numbers of files (100,000 or more) per directory.
</para>

<para>
<indexterm><primary>read directory into memory</primary></indexterm>
<indexterm><primary>strange delete semantics</primary></indexterm>
The key was fixing the directory handling so that it reads only the list of entries currently requested. The old
behavior (up to Samba-3.0.11) was to read the entire directory into memory before doling out names.
Normally this change would have broken OS/2 applications, which have very strange delete semantics, but by
stealing logic from Samba4 (thanks, Tridge), the current code in 3.0.12 handles this correctly.
</para>

<para>
<indexterm><primary>large directory</primary></indexterm>
<indexterm><primary>performance</primary></indexterm>
To set up an application that needs large numbers of files per directory without unduly hurting
performance, follow these steps:
</para>

<para>
<indexterm><primary>canonicalize files</primary></indexterm>
First, canonicalize all the filenames in the directory to a single case, upper or lower; take your pick
(I chose upper because all my files already had uppercase names). Then set up a new custom share for the
application as follows:
<smbconfblock>
<smbconfsection name="[bigshare]"/>
<smbconfoption name="path">/data/manyfilesdir</smbconfoption>
<smbconfoption name="read only">no</smbconfoption>
<smbconfoption name="case sensitive">True</smbconfoption>
<smbconfoption name="default case">upper</smbconfoption>
<smbconfoption name="preserve case">no</smbconfoption>
<smbconfoption name="short preserve case">no</smbconfoption>
</smbconfblock>
</para>

<para>
<indexterm><primary>case options</primary></indexterm>
<indexterm><primary>match case</primary></indexterm>
<indexterm><primary>uppercase</primary></indexterm>
Use your own path and settings, but set the case options to match the case of all the files in your
directory. The path should point at the large directory needed by the application. With the settings shown,
&smbd; forces any new files created there, and in any paths under it, into uppercase. In return, &smbd; no
longer has to scan the directory for names: it knows that if a file does not exist in uppercase, it does not
exist at all.
</para>
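
<para>
If the files in the directory are all lowercase instead, the same approach should carry over with the case
reversed. The following is a sketch only, not a tested configuration: the share name
<literal>[bigshare-lower]</literal> and its path are hypothetical, and the only intended difference from the
<literal>[bigshare]</literal> stanza is the value of <smbconfoption name="default case"/>.
<smbconfblock>
<smbconfcomment>hypothetical share name and path, for illustration only</smbconfcomment>
<smbconfsection name="[bigshare-lower]"/>
<smbconfoption name="path">/data/manyfilesdir-lower</smbconfoption>
<smbconfoption name="read only">no</smbconfoption>
<smbconfoption name="case sensitive">True</smbconfoption>
<smbconfcomment>assumes every existing name under the path is already lowercase</smbconfcomment>
<smbconfoption name="default case">lower</smbconfoption>
<smbconfoption name="preserve case">no</smbconfoption>
<smbconfoption name="short preserve case">no</smbconfoption>
</smbconfblock>
</para>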

<para>
<indexterm><primary>case-insensitive</primary></indexterm>
<indexterm><primary>consistent case</primary></indexterm>
<indexterm><primary>smbd</primary></indexterm>
The secret to this is really in the <smbconfoption name="case sensitive">True</smbconfoption>
line. This tells &smbd; never to scan for case-insensitive versions of names. So if an application asks for a file
called <filename>FOO</filename>, and it cannot be found by a simple stat call, then &smbd; returns file not
found immediately without scanning the containing directory for a version in a different case. The other
case-related lines (<smbconfoption name="default case"/>, <smbconfoption name="preserve case"/>, and
<smbconfoption name="short preserve case"/>) make this work by forcing a consistent case on all files created by
&smbd;.
</para>

<para>
<indexterm><primary>uppercase</primary></indexterm>
<indexterm><primary>stanza</primary></indexterm>
<indexterm><primary>lowercase filenames</primary></indexterm>
Remember that with this &smb.conf; stanza all files and directories under the <parameter>path</parameter>
directory must be in uppercase, because &smbd; will not be able to find lowercase filenames with these settings.
Also note that these settings apply per share, so they can be set only for a share that serves an application
with this problematic behavior (large numbers of entries in a directory). The rest of your &smbd; shares
need not be affected.
</para>
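
<para>
As an illustration of the per-share scope, an ordinary share can sit in the same &smb.conf; next to
<literal>[bigshare]</literal> without any of the case options, in which case Samba's default case handling
applies to it. The share name <literal>[docs]</literal> and its path are hypothetical and serve only as a sketch.
<smbconfblock>
<smbconfcomment>hypothetical ordinary share: no case options set, so the defaults apply</smbconfcomment>
<smbconfsection name="[docs]"/>
<smbconfoption name="path">/data/docs</smbconfoption>
<smbconfoption name="read only">no</smbconfoption>
</smbconfblock>
</para>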

<para>
This makes &smbd; much faster when dealing with large directories. My test case has over 100,000 files, and
&smbd; now handles it efficiently.
</para>

</chapter>