Managing a Large Site
There is no limit on the number of webs and users a TWiki site can have. However, running a site with thousands of webs and tens of thousands of users requires several considerations, which this topic discusses.
User Management
The default user management scheme using TWikiUserMappingContrib and TWiki::Users::HtPasswdUser is not suitable for tens of thousands of users because of the following factors.
- TWikiUserMappingContrib maintains the list of users on the TWiki.TWikiUsersTemplate topic
- TWiki::Users::HtPasswdUser stores user account data on a text file
Web Management
If you have thousands of webs, you face the following issues.
- Getting the list of all webs takes a long time due to directory traversal. This happens e.g. when you move a topic.
- The frequency of administrator help requests increases. Each web should be made as self-service as possible.
$TWiki::cfg{Mdrepo}{WebRecordRequired} = 1;
The second issue can be solved by making webs autonomous following AutonomousWebs.
Eliminating Impractical Operations
If you have thousands of webs, some operations take too long. Here are those costly operations and how to suppress them.
In all public webs
Setting the {NoInAllPublicWebs} configuration parameter to true has the following effects.
- On the "More topic actions" page, "in all public webs" links are suppressed since they are likely to time out.
- On the WebSearch and WebSearchAdvanced topics in all webs, the "All public webs" checkbox is suppressed.
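As a minimal sketch, assuming site settings are kept in lib/LocalSite.cfg as in a standard TWiki installation, the parameter could be turned on like this:

```perl
# lib/LocalSite.cfg (file location is an assumption)
# Suppress "in all public webs" links and the "All public webs" checkbox.
$TWiki::cfg{NoInAllPublicWebs} = 1;
```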
SiteChanges
TWiki.SiteChanges, the topic showing all recent changes across all webs, should be deleted.
Statistics script use from browser
This is not about the number of webs, but about the number of accesses. If there are millions of page views in a month, the statistics script takes too long and times out if it is invoked from a browser.
Setting the {Stats}{DisableInvocationFromBrowser} configuration parameter to true disables invocation of the statistics script from a browser.
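A minimal sketch of the setting, again assuming it is placed in lib/LocalSite.cfg:

```perl
# lib/LocalSite.cfg (file location is an assumption)
# Refuse to run the statistics script when it is requested from a browser.
$TWiki::cfg{Stats}{DisableInvocationFromBrowser} = 1;
```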
Multiple servers
For higher performance and availability, you may have multiple TWiki servers behind a load balancer for a single TWiki site. By putting $TWiki::cfg{DataDir} and $TWiki::cfg{PubDir} on NFS or another file sharing mechanism, you can easily run multiple servers for a single TWiki site.
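As an illustration only, every server's configuration could point both directories at the same shared mount. The mount point /mnt/twiki below is hypothetical and does not come from this topic:

```perl
# /mnt/twiki is a hypothetical NFS mount point visible to all servers.
$TWiki::cfg{DataDir} = '/mnt/twiki/data';
$TWiki::cfg{PubDir}  = '/mnt/twiki/pub';
```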
If a topic is saved simultaneously by two or more people on different servers sharing $TWiki::cfg{DataDir}, something may break: cases of broken RCS files have been reported, though their causes haven't been identified.
Even if $TWiki::cfg{DataDir} and $TWiki::cfg{PubDir} are shared by multiple servers, log files should not be shared, because of how frequently they are updated.
For example:
use Sys::Hostname;
$TWiki::cfg{LogFileName} = '/var/twiki/logs/log%DATE%.' . hostname . '.txt';
This gives each server its own log file, named logYYYYMM.SERVER_HOSTNAME.txt.
If each server has its own log file, the statistics script needs to read the log files of all the servers to produce accurate data.
If the {Stats}{LogFileGlob} configuration parameter is set as shown below, the statistics script reads access log files matching the file glob (wildcard) instead of the file specified by {LogFileName}.
$TWiki::cfg{Stats}{LogFileGlob} = "/var/twiki/logs/log%DATE%.*.txt";
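To see which files such a glob would pick up, the following stand-alone sketch expands %DATE% to a YYYYMM value and lists the matches. It is not the statistics script's actual implementation. The helper name logs_for_month, the server names web1 and web2, and the temporary directory are all made up for illustration.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical helper: substitute a YYYYMM value for %DATE% in the glob
# setting, then return every file matching the resulting wildcard.
sub logs_for_month {
    my ($glob_setting, $yyyymm) = @_;
    (my $pattern = $glob_setting) =~ s/%DATE%/$yyyymm/g;
    my @files = glob($pattern);
    return sort @files;
}

# Throwaway files standing in for logs written by two servers.
my $dir = tempdir(CLEANUP => 1);
for my $name ('log201401.web1.txt', 'log201401.web2.txt', 'log201312.web1.txt') {
    open(my $fh, '>', "$dir/$name") or die "cannot create $name: $!";
    close($fh);
}

# Only the two January 2014 logs match; the December 2013 log is skipped.
my @logs = logs_for_month("$dir/log%DATE%.*.txt", '201401');
print "$_\n" for @logs;
```

Because the server hostname sits between the date and the .txt suffix, a single wildcard in that position covers every server without listing them individually.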