Managing data
see also ManagingConfigs
Quick Start
- We combine subversion (online book, project home) and SVK (online book, project home) to get a distributed versioned file system that can store all kinds of data, not just software projects.
- Subversion cheat sheet/20 minute tutorial: Subversion Basic Work Cycle
- SVK cheat sheet/20 minute tutorial: SVKCheatSheet
- Our central subversion repository has restricted access (ask rsajan@horizonHYRUQAPZ.csueastbay.edu for more info). It can be found at http://acc.csueastbay.edu/svn/ahat.
- trunk is the main development line; branches and tags are used as usual. snapshots holds frozen-in-time versions of things other than software, e.g., submitted versions of papers complete with the relevant data sets.
- The repository is backed up (daily incrementals, weekly differentials, monthly fulls).
- We no longer back up user directories on acc.csueastbay.edu. Therefore, you should keep files you care about in the versioned system, not on individual servers.
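The basic work cycle and the snapshots convention described in the list above can be sketched as follows. This is a minimal sketch, not official lab tooling: the project name `myproject`, the snapshot name, and the assumption that trunk and snapshots sit directly under the repository root are all hypothetical. The script only prints the svn commands; it runs nothing against the repository.

```python
import shlex

# Central repository URL from the list above.
REPO = "http://acc.csueastbay.edu/svn/ahat"


def work_cycle(project, message):
    """Return the svn commands for one checkout/edit/commit cycle.

    Assumes (hypothetically) that projects live at REPO/trunk/<project>.
    """
    return [
        ["svn", "checkout", f"{REPO}/trunk/{project}", project],
        ["svn", "update", project],   # pick up other people's changes
        ["svn", "status", project],   # see what you changed locally
        ["svn", "diff", project],     # review the changes line by line
        ["svn", "commit", project, "-m", message],
    ]


def snapshot(project, name, message):
    """Return the svn command that freezes trunk/<project> under snapshots/."""
    return [
        "svn", "copy",
        f"{REPO}/trunk/{project}",
        f"{REPO}/snapshots/{name}",
        "-m", message,
    ]


def show(commands):
    """Render commands as copy-pasteable shell lines without running them."""
    return [shlex.join(cmd) for cmd in commands]


for line in show(work_cycle("myproject", "Describe your change")):
    print(line)
print(shlex.join(snapshot("myproject", "myproject-submitted",
                          "Freeze submitted version")))
```

A server-side `svn copy` between two URLs is cheap in Subversion, which is why a frozen snapshot can carry its data sets along without duplicating storage.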
Pocket Labs and Mirrored Repositories
As a research lab, we need the option of keeping different files on different machines, rather than having the same files on every server where you have an account. It's also important for us to be able to share code on many of our projects. We have separated the abstractions of authentication and authorization from the abstraction of a file system, and are using a state-of-the-art answer to the question of file system. (We'll get back to you about state-of-the-art answers to authentication and authorization.) We use subversion (online book, project home) + SVK (online book, project home), a distributed authoring and versioning file system, aka mirrored repository, aka DAV_FS. Subversion is the versioned file system (repository), while SVK is the distributed (mirrored) component of that technical mumbo-jumbo.
Because we're a research lab and our needs are less structured and predictable than those of an academic department or an ISP, you essentially organize your files as you please, checking them out and in on whatever machine you are working on at the moment. Data in the versioned file system is organized (conceptually) into projects. Projects can be shared or private, depending on the authorization settings. What's more, you can mirror part or all of a repository on a machine with SVK, so you don't need network access to check your changes in. You can synchronize with the central repository when you get network access. Once you have your own private project, you don't need any further authorization to create private subprojects, etc.
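The mirror, work offline, synchronize pattern described above can be sketched as the following sequence of SVK commands. This is a sketch under assumptions: the project path `trunk/myproject` and the depot paths `//mirror/...` and `//local/...` are hypothetical (though conventional), and the argument order of `svk mirror` shown here is the SVK 2.x form, so check `svk help mirror` on your installation. The script only prints the commands.

```python
import shlex


def svk_workflow(source_url, project):
    """Return svk commands grouped by when they can be run.

    The depot paths //mirror/<project> and //local/<project> are a
    common convention, used here as an assumption.
    """
    mirror = f"//mirror/{project}"
    local = f"//local/{project}"
    return {
        # Done once per machine, with network access.
        "setup": [
            ["svk", "depotmap", "--init"],            # create the local depot
            ["svk", "mirror", mirror, source_url],    # register the mirror
            ["svk", "sync", mirror],                  # fetch history into it
            ["svk", "copy", "-p", "-m", "Create local branch", mirror, local],
            ["svk", "checkout", local, project],      # working copy
        ],
        # No network needed: commits go to the local branch.
        "offline": [
            ["svk", "commit", "-m", "Work done offline", project],
        ],
        # Back on the network: merge in both directions.
        "online": [
            ["svk", "pull", project],   # central repository -> local branch
            ["svk", "push", project],   # local branch -> central repository
        ],
    }


workflow = svk_workflow(
    "http://acc.csueastbay.edu/svn/ahat/trunk/myproject",  # hypothetical path
    "myproject",
)
for phase, commands in workflow.items():
    print(f"# {phase}")
    for cmd in commands:
        print(shlex.join(cmd))
```

Committing to a local branch rather than directly to the mirrored path is what makes offline check-ins possible; a commit made directly on a mirrored path is sent straight to the central repository and so needs network access.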
Getting started with Subversion
If you've never used a version control system, you'll want to read (at least) Fundamental Concepts & Basic Usage. Don't panic, though; they are quick reads.
If you have used a version control system such as RCS, or an older distributed authoring system such as CVS, you can get started with a quick skim of Basic Usage. You're cheating yourself, however, if you don't read more. Subversion is much, much more than simply a friendlier CVS or RCS for groups.
When working with group project files and/or server configuration project files, you'll also need to read Branching and Merging.
Getting started with SVK
The SVK documentation is less mature than the subversion documentation, so the best approach is to go through the subversion docs, then our SVKCheatSheet.
Backups
Mirroring provides backup capability as well as version control, so long as all the important information is kept in the mirrored file system. That leaves the question of backing up the repository itself, which subversion and SVK both support.
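On the Subversion side, backing up the repository itself is usually done with `svnadmin dump` (portable dump files, full or incremental) or `svnadmin hotcopy` (a ready-to-use copy of the repository directory). The sketch below builds those commands; the server-side path `/srv/svn/ahat` and the revision numbers are hypothetical, and this is not a description of how our actual backup jobs are configured. The script only prints the commands.

```python
import shlex


def full_dump(repo_path):
    """Command for a full dump; svnadmin writes the dump to stdout."""
    return ["svnadmin", "dump", repo_path]


def incremental_dump(repo_path, first_rev, last_rev):
    """Command for a dump holding only revisions first_rev..last_rev."""
    if first_rev > last_rev:
        raise ValueError("first_rev must not exceed last_rev")
    return ["svnadmin", "dump", repo_path,
            "-r", f"{first_rev}:{last_rev}", "--incremental"]


def hotcopy(repo_path, dest_path):
    """Command for a consistent copy of a live repository."""
    return ["svnadmin", "hotcopy", repo_path, dest_path]


REPO_PATH = "/srv/svn/ahat"  # hypothetical location on the server

# Dumps go to stdout, so redirect them into a file when running by hand.
print(shlex.join(full_dump(REPO_PATH)), "> ahat-full.dump")
print(shlex.join(incremental_dump(REPO_PATH, 101, 150)), "> ahat-101-150.dump")
print(shlex.join(hotcopy(REPO_PATH, "/backup/ahat-hotcopy")))
```

A full dump followed by the incremental dumps taken since then can be replayed in order with `svnadmin load` to rebuild the repository.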
A mirror case study
The online subversion book gives all sorts of great info about using a read-only mirror with svnsync as a continuous backup of all your data. Actually doing the configuration and setup is left as an exercise for the reader, as it were.
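As a starting point for that exercise, here is a minimal sketch of the usual svnsync setup: create an empty mirror repository, allow revision property changes on it (svnsync needs this), initialize it against the source, then synchronize. The mirror location `/srv/svn/ahat-mirror` is a hypothetical local path, and the permissive hook shown is the simplest possible one; a real setup would normally restrict it to the sync user. The script prints the commands and does not run them.

```python
import shlex
from pathlib import Path

# Simplest possible pre-revprop-change hook: allow every change.
# Assumption: acceptable only because the mirror is read-only for users.
HOOK = "#!/bin/sh\nexit 0\n"


def install_hook(mirror_path):
    """Write the hook into an existing mirror repository."""
    hook = Path(mirror_path) / "hooks" / "pre-revprop-change"
    hook.write_text(HOOK)
    hook.chmod(0o755)  # hooks must be executable
    return hook


def svnsync_steps(mirror_path, source_url):
    """Return the commands that create and fill a read-only mirror."""
    mirror_url = f"file://{mirror_path}"  # mirror_path must be absolute
    return [
        ["svnadmin", "create", mirror_path],
        ["svnsync", "initialize", mirror_url, source_url],
        ["svnsync", "synchronize", mirror_url],  # repeat this one, e.g. from cron
    ]


steps = svnsync_steps("/srv/svn/ahat-mirror",
                      "http://acc.csueastbay.edu/svn/ahat")
for i, cmd in enumerate(steps):
    print(shlex.join(cmd))
    if i == 0:
        # The hook has to exist before `svnsync initialize` runs.
        print("# install hooks/pre-revprop-change here (see install_hook)")
```

Only the last command needs repeating; each `svnsync synchronize` run copies just the revisions added to the source since the previous run.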
see also ServerAdminGuidelines