"GOT", but the "O" is a cute, smiling pufferfish. Index | Thread | Search

From:
Stefan Sperling <stsp@stsp.name>
Subject:
Re: install symbolic links in the work tree
To:
Sebastien Marie <semarie@online.fr>
Cc:
gameoftrees@openbsd.org
Date:
Sun, 31 May 2020 14:46:59 +0200

Download raw body.

Thread
On Sun, May 31, 2020 at 02:27:49PM +0200, Sebastien Marie wrote:
> On Thu, May 28, 2020 at 12:51:58PM +0200, Stefan Sperling wrote:
> > This adds support for creating symbolic links in the work tree.
> > 
> > At present symbolic links in the repository are installed as regular
> > files in the work tree, which contain the link target path.
> > We keep doing so as a fallback if a particular symbolic link cannot
> > be installed.
> > 
> > A big concern here is to avoid creating symbolic links which point
> > anywhere outside of the work tree. Did I get this right?
> > 
> > Only tested with 'got checkout' yet, but more tests will likely follow.
> > 
> 
> I am a bit late in the party, but here my review of the code.
> 
> First, a general remark about "if a symbolic link cannot be installed we instead
> create a regular file".
> 
> I am a bit unsure about this semantic. Does this particular file is managed in a
> special way ? or does the next got command will just assume a symlink was
> deleted and replaced by a plain file ?

We can never trust the on-disk state anyway. Even if we installed a symlink
the user can replace that link with a regular file, a directory, a fifo, etc.

The file index records the file type (regular or symlink):

struct got_fileindex_entry {
[...]
	uint16_t mode;
#define GOT_FILEIDX_MODE_FILE_TYPE	0x000f
#define GOT_FILEIDX_MODE_REGULAR_FILE	1
#define GOT_FILEIDX_MODE_SYMLINK	2

So that's where we can look if we want to know whether we did originally
install a symlink or a regular file. If it is a symlink the original link
target can be obtained from repository: It is contained in the blob the
file index entry points at.

Then we can look at what's on disk. It might be a real symlink, or something
else. Each operation can now do something that fits the semantics we want.

Is it enough if we simply render symlinks harmless upon checkout?
I was considering whether 'got commit' should reject symlinks that point
outside of the work tree, as a precaution. What do think about that?
There is no way to prevent anyone from creating repositories with bad symlinks.
But I'm thinking maybe we should prevent accidental commit of such symlinks.

> > +			/* Path too long; install as a regular file. */
> > +			got_object_blob_rewind(blob);
> > +			return install_blob(worktree, ondisk_path, path,
> > +			    GOT_DEFAULT_FILE_MODE, st_mode, blob,
> > +			    restoring_missing_file, reverting_versioned_file,
> > +			    repo, progress_cb, progress_arg);
> 
> inside install_symlink() function, the code for got_object_blob_rewind() + install_blob()
> is used 3 times. it might be more readable to have it in a separated block with
> "goto install_as_file" ?

Probably, yes. Or we could just rewind inside install_blob().

> > +	if (!got_path_is_absolute(target_path)) {
> > +		char *parent = dirname(ondisk_path);
> 
> technically, dirname(3) could return NULL (in ENAMETOOLONG case).

Yes, good catch. I'll fix this.

> > +	resolved_path = realpath(abspath ? abspath : target_path, NULL);
> > +	if (resolved_path == NULL) {
> > +		if (errno != ENOENT)
> > +			return got_error_from_errno2("realpath", target_path);
> 
> abspath is leaked if allocated. the following should prevent that:
> 
> 			err = got_error_from_errno2("realpath", target_path);
> 			goto done;

Thanks. Will fix.

> at this point, I think there is too many variables/possibilities for the path:
> - target_path : user string
> - abspath : NULL if target_path is absolute
> - resolved_path : from target_path or abspath, could be NULL too

Yes, I was a bit annoyed at that, too ;-)

> it complexifies the code without necessity.

> strdup() is here to permit uniform exit path with free(3).

And I wanted to avoid this extra strdup(). But you may be right that
the code would be easier to follow if we just did that. I'll try this.

> > +	/* Only allow symlinks pointing at paths within the work tree. */
> > +	if (!got_path_is_child(resolved_path ? resolved_path : target_path,
> > +	        worktree->root_path, strlen(worktree->root_path))) {

> I think it isn't enough. target_path could point to something under $wd/.got/ ,
> and I think it is unwanted.

Oh, indeed. I overlooked this. Thanks!