From: Sebastien Marie Subject: Re: install symbolic links in the work tree To: gameoftrees@openbsd.org Date: Sun, 31 May 2020 14:27:49 +0200 On Thu, May 28, 2020 at 12:51:58PM +0200, Stefan Sperling wrote: > This adds support for creating symbolic links in the work tree. > > At present symbolic links in the repository are installed as regular > files in the work tree, which contain the link target path. > We keep doing so as a fallback if a particular symbolic link cannot > be installed. > > A big concern here is to avoid creating symbolic links which point > anywhere outside of the work tree. Did I get this right? > > Only tested with 'got checkout' yet, but more tests will likely follow. > I am a bit late in the party, but here my review of the code. First, a general remark about "if a symbolic link cannot be installed we instead create a regular file". I am a bit unsure about this semantic. Does this particular file is managed in a special way ? or does the next got command will just assume a symlink was deleted and replaced by a plain file ? but not creating the symlink at all would be the same problem, and I am also unsure that simply abording the checkout would be a good solution too. So I share my uncertainty, and I will be fine with it :) > diff bcbc22724ddb271a0f40cdfec5dcb49f2e52b8da /home/stsp/src/got > blob - 1ad69b7dff719dd615ae921d1fc94b3ec3fabf39 > file + lib/worktree.c > --- lib/worktree.c > +++ lib/worktree.c > @@ -943,18 +943,175 @@ get_ondisk_perms(int executable, mode_t st_mode) > return (st_mode & ~(S_IXUSR | S_IXGRP | S_IXOTH)); > } > > +/* forward declaration */ > static const struct got_error * > install_blob(struct got_worktree *worktree, const char *ondisk_path, > const char *path, mode_t te_mode, mode_t st_mode, > struct got_blob_object *blob, int restoring_missing_file, > int reverting_versioned_file, struct got_repository *repo, > + got_worktree_checkout_cb progress_cb, void *progress_arg); > + > +static const struct got_error * > +install_symlink(struct got_worktree *worktree, const char *ondisk_path, > + const char *path, mode_t te_mode, mode_t st_mode, > + struct got_blob_object *blob, int restoring_missing_file, > + int reverting_versioned_file, struct got_repository *repo, > got_worktree_checkout_cb progress_cb, void *progress_arg) > { > const struct got_error *err = NULL; > + char target_path[PATH_MAX]; > + size_t len, target_len = 0; > + char *resolved_path = NULL, *abspath = NULL; > + const uint8_t *buf = got_object_blob_get_read_buf(blob); > + size_t hdrlen = got_object_blob_get_hdrlen(blob); > + > + /* > + * Blob object content specifies the target path of the link. > + * If a symbolic link cannot be installed we instead create > + * a regular file which contains the link target path stored > + * in the blob object. > + */ > + do { > + err = got_object_blob_read_block(&len, blob); > + if (len + target_len >= sizeof(target_path)) { > + /* Path too long; install as a regular file. */ > + got_object_blob_rewind(blob); > + return install_blob(worktree, ondisk_path, path, > + GOT_DEFAULT_FILE_MODE, st_mode, blob, > + restoring_missing_file, reverting_versioned_file, > + repo, progress_cb, progress_arg); inside install_symlink() function, the code for got_object_blob_rewind() + install_blob() is used 3 times. it might be more readable to have it in a separated block with "goto install_as_file" ? > + } > + if (len > 0) { > + /* Skip blob object header first time around. */ > + memcpy(target_path + target_len, buf + hdrlen, > + len - hdrlen); > + target_len += len - hdrlen; > + hdrlen = 0; > + } > + } while (len != 0); > + target_path[target_len] = '\0'; > + > + /* > + * Relative symlink target lookup should begin at the directory > + * in which the blob object is being installed. > + */ > + if (!got_path_is_absolute(target_path)) { > + char *parent = dirname(ondisk_path); technically, dirname(3) could return NULL (in ENAMETOOLONG case). if ondisk_path is assumed to be <= PATH_MAX, it should be fine, but you checked the return value of dirname(ondisk_path) later :) > + if (asprintf(&abspath, "%s/%s", parent, target_path) == -1) { > + err = got_error_from_errno("asprintf"); > + goto done; > + } > + } > + > + /* > + * unveil(2) restricts our view of paths in the filesystem. > + * ENOENT will occur if a link target path does not exist or > + * if it points outside our unveiled path space. > + */ > + resolved_path = realpath(abspath ? abspath : target_path, NULL); > + if (resolved_path == NULL) { > + if (errno != ENOENT) > + return got_error_from_errno2("realpath", target_path); abspath is leaked if allocated. the following should prevent that: err = got_error_from_errno2("realpath", target_path); goto done; > + } at this point, I think there is too many variables/possibilities for the path: - target_path : user string - abspath : NULL if target_path is absolute - resolved_path : from target_path or abspath, could be NULL too it complexifies the code without necessity. target_path is the user string, so we should keep it. next, having just one variable for "absolute path" (resolved or not, the same of target_path or not) would be enough. in pseudo code: if (got_path_is_absolute(target_path)) abspath = strdup(target_path) else asprintf(&abspath, "%s/%s", parent, target_path) resolved_path = realpath(abspath, NULL) if (resolved_path == NULL) { resolved_path = strdup(abspath); // and maybe : got_canonpath(abspath, resolved_path); } free(abspath); strdup() is here to permit uniform exit path with free(3). > + > + /* Only allow symlinks pointing at paths within the work tree. */ > + if (!got_path_is_child(resolved_path ? resolved_path : target_path, > + worktree->root_path, strlen(worktree->root_path))) { > + /* install as a regular file */ > + got_object_blob_rewind(blob); > + err = install_blob(worktree, ondisk_path, path, > + GOT_DEFAULT_FILE_MODE, st_mode, blob, > + restoring_missing_file, reverting_versioned_file, > + repo, progress_cb, progress_arg); > + goto done; > + } I think it isn't enough. target_path could point to something under $wd/.got/ , and I think it is unwanted. > + > + if (symlink(target_path, ondisk_path) == -1) { > + if (errno == ENOENT) { > + char *parent = dirname(ondisk_path); > + if (parent == NULL) { > + err = got_error_from_errno2("dirname", > + ondisk_path); > + goto done; > + } > + err = add_dir_on_disk(worktree, parent); > + if (err) > + goto done; > + /* > + * Retry, and fall through to error handling > + * below if this second attempt fails. > + */ > + if (symlink(target_path, ondisk_path) != -1) { > + err = NULL; /* success */ > + goto done; > + } > + } > + > + /* Handle errors from first or second creation attempt. */ > + if (errno == EEXIST) { > + struct stat sb; > + ssize_t elen; > + char etarget[PATH_MAX]; > + if (lstat(ondisk_path, &sb) == -1) { > + err = got_error_from_errno2("lstat", > + ondisk_path); > + goto done; > + } > + if (!S_ISLNK(sb.st_mode)) { > + err = got_error_path(ondisk_path, > + GOT_ERR_FILE_OBSTRUCTED); > + goto done; > + } > + elen = readlink(ondisk_path, etarget, sizeof(etarget)); > + if (elen == -1) { > + err = got_error_from_errno2("readlink", > + ondisk_path); > + goto done; > + } > + if (elen == target_len && > + memcmp(etarget, target_path, target_len) == 0) > + err = NULL; > + else > + err = got_error_path(ondisk_path, > + GOT_ERR_FILE_OBSTRUCTED); > + } else if (errno == ENAMETOOLONG) { > + /* bad target path; install as a regular file */ > + got_object_blob_rewind(blob); > + err = install_blob(worktree, ondisk_path, path, > + GOT_DEFAULT_FILE_MODE, st_mode, blob, > + restoring_missing_file, reverting_versioned_file, > + repo, progress_cb, progress_arg); > + } else if (errno == ENOTDIR) { > + err = got_error_path(ondisk_path, > + GOT_ERR_FILE_OBSTRUCTED); > + } else { > + err = got_error_from_errno3("symlink", > + target_path, ondisk_path); > + } > + } > +done: > + free(resolved_path); > + free(abspath); > + return err; > +} Thanks. -- Sebastien Marie