Transifex and SSH keys
Returned from GUADEC, back to the development of transifex.
I’m at a point where most (if not all) of the basic functionality works for locally-hosted repositories. The init.py script creates 4 dummy “remote” repositories (cvs, svn, hg, git) in /var/tmp/ and checks out/clones them “locally”, and you can commit/push to them through transifex.
Transifex should be able to commit to remote repos, and the norm for doing this is over SSH. So, the next step is to make all operations happen over SSH, which basically means allowing modules to have repositories starting with ssh:// or something, and switching the module’s authtype (in the DB) from ‘None’ to ‘ssh’ (I’ll think about whether this is needed at all).
A lengthy email was sent to Fedora Infrastructure asking for advice on how to handle SSH keys in general. So the story is that we’ve got the TG app running on a server under a user (eg. transifex@translate.fpo), which will need to commit to many remote systems (eg. fedoral10n@hg.hosted.fpo). There are a number of things to consider for the VCS checkout/commit/push to take place:
We need SSH keys (obviously).
Normally ssh stores keys in ~/.ssh/. If we’d like to override that and, say, store the keys in the database (because we are in a load-balanced environment where >1 filesystems are used) or in a different directory (eg. because we hate ~tildes since they remind us of the curly hair of the first girl that broke our heart), then we start to run into problems. For example, CVS doesn’t support passing arguments to ssh, which means we would have to create our own special ssh executable that actually calls ssh with the right options (u-g-l-y).
The keys should apparently be encrypted (ie. created with ssh-keygen -N'something'). If so, who will “type” the passphrase to unlock them?
If the web server gets compromised, can we make sure that it can’t access the SSH keys? What about the passphrase?
Maybe we ought to create a separate process/daemon that will actually do the VCS operations (and have access to the keys), and the webserver will instruct it to do whatever it wants. This way, the worst that could happen is damage to a remote VCS, but at least the keys won’t be accessed (since they are owned by a different user than the one the webserver is running as).
Some upstreams might want to run the code that does the VCS operations themselves to make sure it’s OK. If so, we’d need to write an XML-RPC interface or something that lets either them “pull” the operations from our service with a cron script (easier) or us “push” commit requests to their service (instant/live).
Things might get simpler if we rely on ssh-agent to handle the passphrase (either with or without the separate process/user). If the admin (a human) types the passphrase, we won’t need to store it anywhere, which is fantastic.
Finally, we should decide where the SSH key cherry-picking will take place. Will it be in the DB (easier to maintain) or in ~/.ssh/config (easier to code, more secure)? Probably the second, which means we should solve the “how to update & load balance it” deployment issues.
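For the CVS point in the list above, one possible shape for the “special ssh executable” is a tiny generated shell wrapper that CVS is told to use through its CVS_RSH environment variable. This is only a sketch under assumptions: the function names, the wrapper file name and the key location are all made up for illustration and are not taken from the transifex code.

```python
import os
import shlex
import stat


def write_ssh_wrapper(key_path, wrapper_dir):
    """Create a small shell script that calls ssh with a fixed identity file.

    Both arguments are hypothetical: key_path is wherever the deployment
    keeps the private key, wrapper_dir is any directory the app can write to.
    """
    wrapper_path = os.path.join(wrapper_dir, "transifex-ssh")
    # -i selects the key; IdentitiesOnly stops ssh from offering other keys.
    command = 'exec ssh -i {} -o IdentitiesOnly=yes "$@"\n'.format(
        shlex.quote(key_path)
    )
    with open(wrapper_path, "w") as handle:
        handle.write("#!/bin/sh\n" + command)
    # Readable, writable and executable by the owner only.
    os.chmod(wrapper_path, stat.S_IRWXU)
    return wrapper_path


def cvs_environment(wrapper_path, base_env=None):
    """Return a copy of the environment with CVS_RSH pointing at the wrapper."""
    env = dict(os.environ if base_env is None else base_env)
    env["CVS_RSH"] = wrapper_path
    return env
```

The returned environment would then be passed to whatever runs the cvs command (for example via the env argument of subprocess.call), so the override stays local to that one child process instead of touching ~/.ssh/.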
All this sounds a bit complicated, but it could just be because of my bad English. For a start, we can just run ssh-agent and then the webserver (./start-transifex.py), and all operations that require SSH (eg. hg clone ssh://...) will “just work”. The downside is that a compromise of the web server can expose our SSH keys, but at least the passphrase is protected. It could be possible to use SELinux to protect the keys from fopen() calls and only allow the VCS commands to access them. Not sure how much security this adds though; probably a fair amount. (TODO: Send an email to fedora-selinux-list.)
I think that if we are to do it the ssh-agent way, it will mean that I finished my Google Summer of Code project 10 days ago and we are ready for a test drive of transifex.
Besides, what’s the worst thing that could happen? Someone commits a bunch of files to some Fedora VCS repos (for a start). Which itself is undoable.
Unless it happens 2 seconds before a release. Ah well.
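A minimal sketch of the “run ssh-agent, then the webserver” idea, assuming a small Python launcher is acceptable. The function names are hypothetical; the only external commands used are ssh-agent -s (which prints shell-style variable assignments) and ssh-add (which prompts for the passphrase on the terminal).

```python
import os
import re
import subprocess


def parse_agent_output(text):
    """Extract the variables printed by `ssh-agent -s` into a dict."""
    return dict(re.findall(r"(SSH_[A-Z_]+)=([^;\n]+);", text))


def start_with_agent(server_command):
    """Start ssh-agent, load the default keys, then run the web server."""
    output = subprocess.check_output(["ssh-agent", "-s"], text=True)
    env = dict(os.environ)
    env.update(parse_agent_output(output))
    # ssh-add prompts on the terminal, so a human types the passphrase
    # and it is never written to disk.
    subprocess.check_call(["ssh-add"], env=env)
    # Child processes (hg, git, svn, cvs over ssh) inherit SSH_AUTH_SOCK.
    return subprocess.call(server_command, env=env)
```

Calling start_with_agent(["./start-transifex.py"]) from an admin’s terminal would be the intended use. Note that this sketch does not stop the agent afterwards; the SSH_AGENT_PID value it parses is what a cleanup step would need.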

Not sure if this was mentioned (or implemented) before but – somewhat orthogonal to your concerns about SSH keys, I would try to ensure that the system limits what can be committed; e.g. only changes to po/*.po.
If you implement this as a separate process (the one with access to the ssh keys), it seems reasonably safe.
Right, besides imposing ACLs on the VCS side, module maintainers have the choice to do so on the transifex side with a regex. This way, only the relevant files (eg. po/*.po, po/Changelog, po/LINGUAS) are presented to the translator.
Of course, as you already mentioned, this is orthogonal to the SSH keys: it only works inside the web system. But it certainly does a lot to control what one can and cannot do in the standard procedures of the web system.
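To make the regex idea concrete, here is a small sketch of such a filter. The pattern and the function name are hypothetical and only mirror the example files named above; how transifex actually stores and applies a module’s regex may differ.

```python
import re

# Hypothetical per-module pattern; it mirrors the examples in the reply above.
ALLOWED = re.compile(r"po/([^/]+\.po|Changelog|LINGUAS)")


def allowed_files(paths, pattern=ALLOWED):
    """Keep only the repository-relative paths that fully match the pattern."""
    return [path for path in paths if pattern.fullmatch(path)]
```

Using fullmatch rather than search means a path has to match from start to end, so something like po/../etc/passwd or po/el.po.bak is rejected instead of slipping through on a partial match.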