From: Stefan Sperling
Subject: Re: got-archive(1) (now with a patch)
To: Benjamin Stürz
Cc: gameoftrees@openbsd.org
Date: Fri, 29 Dec 2023 01:40:55 +0100

On Thu, Dec 28, 2023 at 06:17:05PM +0100, Benjamin Stürz wrote:
> On 12/28/23 12:51, Benjamin Stürz wrote:
> > I'll take a look into the code of got, and see if I can do something,
> > if no one is already working on it.
> Here's a patch implementing a WIP archive command.
> There are still a few things to do,
> like adding options and a section in the man page.
> But I think it's ready for testing.
> Most of the code is copied from the checkout command,
> the rest is either written by me or stolen from a man page.

I am not sure about my overall end-goals for the design, and I do not
have a lot of time to think about it now, but here are some quick
thoughts:

The archive command could work with a packing list file that contains
a list of files to include in archives. The file could be versioned in
some fully automatic or semi-automatic way. It could be visible, under
some well-known name such as got-archive.conf or got-archive-list.txt,
or be hidden somewhere in meta-data.

If the command assumed an existing work tree as a starting point,
rather than checking out an entire repository, then it could be more
flexible. Consider multi-project repositories (fairly rare with Git,
but they do exist) which could be checked out with a path-prefix via
'got checkout -p' to obtain the subtree of the repository that needs
to be archived.

A mixed-commit work tree would be rejected with an error, similar to
how some other commands already do it.

Local modifications to versioned files that are not yet committed
would be tolerated but perhaps shown to the user for verification.
E.g. it would be OK to locally tweak the version string in a Makefile,
but having some non-committed changes in the code or in the packing
list would probably be bad.

One advantage of using an existing work tree and a packing list is
that invoking the command could potentially become as simple as
running something like this in a work tree:

  got archive got-0.96.tar.gz

Version numbers are tricky. There are many conventions, your script
already has a specific flag to deal with tag names that have a 'v'
prefix, but this seems overly specific. Leaving the version entirely
up to the user as a user-specified string might be the most general
solution, rather than trying to be clever about it. Perhaps just let
the version be a part of a user-specified archive name and leave it
at that?

I understand that your wrapper tool does its work based on tags in the
repository but is it strictly necessary to integrate this new feature
with tags so tightly? A wrapper script which checks out all tags and
runs 'got archive' on each would still be possible.

Instead of fts_open(3) this feature could use the work tree status
crawl to pick up the versioned and unversioned files to package.

I would prefer to ignore file types like fifos and device nodes.
Such files should not appear in a reasonable source code tarball.
Unless I am missing something, any file types other than regular
files, directories, or symlinks would be out of the ordinary, wouldn't
they? Besides, Git doesn't version such files either.

Writing the tar headers directly is clever and avoids having to run
an external 'tar' program. I hope though that this won't run into
subtle bugs that result in incompatibilities with some implementations
of tar. The header looks simple enough but I do not have enough
experience with the tar file format to judge this.

If a packing list doesn't exist yet then 'got archive' could create it
based on the contents of the work tree. Users could then edit the list
as needed. The list would contain both versioned and unversioned files
that are expected to be present in the archive. It could also contain
per-file annotations (like those used in OpenBSD's port tree's PLIST)
in case that helps the design somehow.