We have both free and restricted texts. The restricted texts must be protected by usernames and passwords, and access to them requires a contract.
The Oslo interface is good enough for this user category, or it will require only small modifications, e.g. links to the documents containing the hits, preferably with the hits highlighted.
Typically, these users are linguists or language technology developers who bring their own tools, e.g. another disambiguator or a separate morphological analyser, or who more generally need command-line access to the whole corpus to achieve what they want. Other scholars may also belong to this group. These users will need access to our corpus machine(s), and will always be required to accept our user contract.
Users who want shell access to our corpus have to be members of the group bound. These users will have shell access to both the free and the bound corpus on our machine. Users who are not members of this group will have no shell access to any of our corpus files.
There will be two groups with access to /usr/local/share/corp, with the following access rights:
| Group | Description | Intended users |
|---|---|---|
| bound | Access to read the bound corpus | External linguists |
| corpus | Access to alter our orig. catalogues | Project workers (group as today) |
External users will each get their own user account, belonging to the groups myself and bound, and will be able to install their own tools and programs for corpus processing, analysis, etc. External users will not get access to the orig/ directory.
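The group rules above can be summarised in a small sketch. The Python function below only illustrates the policy; it does not describe how the machine is actually configured. The function name `allowed_areas` and the area labels `free`, `bound` and `orig` are assumptions made for the example, and it further assumes that project workers are members of both `bound` and `corpus`.

```python
from typing import Iterable, Set

READ_GROUP = "bound"    # required for any shell access to the corpus
ALTER_GROUP = "corpus"  # project workers who may alter orig/


def allowed_areas(groups: Iterable[str]) -> Set[str]:
    """Return the corpus areas a shell user may reach, given their Unix groups.

    The labels "free", "bound" and "orig" are illustrative only.
    """
    groups = set(groups)
    # Users outside the bound group get no shell access to any corpus files.
    if READ_GROUP not in groups:
        return set()
    areas = {"free", "bound"}
    # Assumption: project workers are in both groups and also reach orig/.
    if ALTER_GROUP in groups:
        areas.add("orig")
    return areas
```

Under these assumptions an external linguist in the groups myself and bound would reach the free and bound corpus but not orig/, while a user in neither group would reach nothing.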
To let members of the bound group run analyses, we need to make some minor adjustments. Like other users, they automatically have full access to the Xerox tools, and the compiled FSTs are available as /opt/smi/sme/bin/sme-num.fst etc. The Xerox tools and vislcg are available in /opt/Xerox/bin. A couple of tools are missing right now, and need to be added to /opt/ by a cron job.
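One way to find out which tools still need to be added is to check a list of expected names against the tool directory. The following is a minimal sketch, not an existing script: the function name `missing_tools` is hypothetical, and the list of expected tools has to be supplied by the caller, since the text names only vislcg explicitly.

```python
import os
from typing import Iterable, List


def missing_tools(bin_dir: str, names: Iterable[str]) -> List[str]:
    """Return the names that are not executable regular files in bin_dir."""
    missing = []
    for name in names:
        path = os.path.join(bin_dir, name)
        # A tool counts as present only if it exists and can be executed.
        if not (os.path.isfile(path) and os.access(path, os.X_OK)):
            missing.append(name)
    return missing
```

A call such as `missing_tools("/opt/Xerox/bin", ["vislcg"])` would return an empty list if vislcg is installed and executable for the calling user, so it should be run as a member of the bound group to reflect what those users actually see.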
TODO:
Users of only the free corpus won't need anything but a browser.
Users of the bound corpus will need a username and password to the Oslo computer (until the base is moved to Tromsø). These usernames and passwords will be created and administered by the Oslo people, later by ourselves.
TODO:
09 for each user group
Divide our texts into two parts, also for the graphical interface:
Altering the CVS group may be a topic for future discussion:
Today, the cvs group has access to alter and read our linguistic source code. In the future, we may split this access into alter OR read, and make it more fine-grained, according to subtree (gt, kt, st, xtdoc), or even according to language.
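To make the idea of a finer-grained split concrete, here is a hedged sketch of how such rules could be looked up. Nothing in it is decided: the group names (`cvs-read`, `sme-alter`), the rule table, the function name `may`, and the choice that alter access implies read access are all assumptions for illustration. Only the existing `cvs` group, the subtree name gt, and the language code sme come from the text.

```python
from typing import Dict, Iterable, Optional, Set, Tuple

# (subtree, language) -> {"read": groups, "alter": groups}.
# A language of None means the rule covers the whole subtree.
# All group names except "cvs" are hypothetical.
Rules = Dict[Tuple[str, Optional[str]], Dict[str, Set[str]]]

RULES: Rules = {
    ("gt", None): {"read": {"cvs-read"}, "alter": {"cvs"}},
    ("gt", "sme"): {"read": {"cvs-read"}, "alter": {"cvs", "sme-alter"}},
}


def may(action: str, subtree: str, language: Optional[str],
        groups: Iterable[str], rules: Rules = RULES) -> bool:
    """Check whether any of the user's groups permits the action."""
    # Prefer a language-specific rule, fall back to the subtree-wide one.
    rule = rules.get((subtree, language)) or rules.get((subtree, None))
    if rule is None:
        return False
    allowed = set(rule.get(action, set()))
    if action == "read":
        # Assumption: anyone who may alter may also read.
        allowed |= rule.get("alter", set())
    return bool(allowed & set(groups))
```

With this table, a member of the hypothetical `sme-alter` group could alter gt files for sme but not for other languages, while members of `cvs-read` could read but not alter anything under gt.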