llvm.org GIT mirror llvm / 69d7e1c
[docs] GitHub Proposal for LLVM

This document was crafted from the various (320+) emails between 2nd June and 20th July regarding the move to GitHub. It tried to consolidate every issue that was raised and every solution that was presented to have a GitHub repository with sub-modules.

It *does not* try to argue whether sub-modules are better or worse than any other Git solution, nor if Git is better than any other VCS, nor if GitHub is better than any other free code hosting service. This is just the final conclusions of 48 days and 320 emails (plus a lot of IRC discussions) on the LLVM community.

This document will be presented at the survey that the foundation will setup for us to decide if we move to this solution or not.

It reflects what was discussed on the lists, but it's not authoritative. If something is not clear enough, please refer to the mailing list discussions (hint: search for "GitHub").

Review: https://reviews.llvm.org/D22463

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@276097 91177308-0d34-0410-b5e6-96231b3b80d8

Renato Golin 4 years ago
1 changed file(s) with 268 addition(s) and 0 deletion(s).
0 ==============================
1 Moving LLVM Projects to GitHub
2 ==============================
4 Introduction
5 ============
7 This is a proposal to move our current revision control system from our own
8 hosted Subversion to GitHub. Below are the financial and technical arguments as
9 to why we need such a move and how people (and validation infrastructure) will
10 continue to work with a Git-based LLVM.
12 There will be a survey pointing at this document, through which we'll learn the community's
13 reaction and, if we collectively decide to move, the time-frames. Be sure to make
14 your views count.
16 Essentially, the proposal is divided into the following parts:
18 * Outline of the reasons to move to Git and GitHub
19 * Description of what the workflow will look like (compared to SVN)
20 * Remaining issues and potential problems
21 * The proposed migration plan
23 Why Git, and Why GitHub?
24 ========================
26 Why move at all?
27 ----------------
29 The strongest reason for the move, and why this discussion started in the first
30 place, is that we currently host our own Subversion server and Git mirror on a
31 voluntary basis. The LLVM Foundation sponsors the server and provides limited
32 support, but there is only so much it can do.
34 The volunteers are not sysadmins themselves, but compiler engineers who happen
35 to know a thing or two about hosting servers. We also don't have 24/7 support,
36 and we sometimes wake up to see that continuous integration is broken because
37 the SVN server is either down or unresponsive.
39 With time and money, the foundation and volunteers could improve our services,
40 implement more functionality and provide around the clock support, so that we
41 can have a first class infrastructure with which to work. But the cost is not
42 small, both in money and time invested.
44 On the other hand, there are multiple services out there (GitHub, GitLab,
45 BitBucket among others) that offer that same service (24/7 stability, disk space,
46 Git server, code browsing, forking facilities, etc.) for the very affordable price
47 of *free*.
49 Why Git?
50 --------
52 Most new coders nowadays start with Git. A lot of them have never used SVN, CVS
53 or anything else. Websites like GitHub have changed the landscape of open source
54 contributions, reducing the cost of first contribution and fostering
55 collaboration.
57 Git is also the version control system most LLVM developers use. Despite the sources
58 being stored in an SVN server, most people develop using the Git-SVN integration.
59 That shows not only that Git is more powerful than SVN, but also that people
60 have resorted to using a bridge because Git's features are now indispensable to
61 their internal and external workflows.
63 In essence, Git allows you to:
65 * Commit, squash, merge, fork locally without any penalty to the server
66 * Add as many branches as necessary to allow for multiple threads of development
67 * Collaborate with peers directly, even without access to the Internet
68 * Have multiple trees without multiplying disk space.
70 In addition, because Git seems to be replacing every project's version control
71 system, there are many more tools that can use Git's enhanced feature set, so
72 new tooling is much more likely to support Git first (if not only), than any
73 other version control system.
75 Why GitHub?
76 -----------
78 GitHub, like GitLab and BitBucket, provides free code hosting for open source
79 projects. Essentially, they will completely replace *all* the infrastructure that
80 we have today that serves code repository, mirroring, user control, etc.
82 They also have a dedicated team to monitor, migrate, improve and distribute the
83 contents of the repositories depending on region and load. That is a level of
84 quality that we'd never have without spending money that would be better spent
85 elsewhere, for example on development meetings, sponsoring disadvantaged people
86 to work on compilers, and fostering diversity and equality in our community.
88 GitHub has the added benefit that we already have a presence there. Many
89 developers use it already, and the mirror from our current repository is already
90 set up.
92 Furthermore, GitHub has an *SVN view* (https://github.com/blog/626-announcing-svn-support)
93 where people that still have/want to use SVN infrastructure and tooling can
94 slowly migrate or even stay working as if it was an SVN repository (including
95 read-write access).
97 So, any of the three solutions solves the cost and maintenance problem, but GitHub
98 has two additional features that would be beneficial to the migration plan as
99 well as the community already settled there.
102 What will the new workflow look like
103 ====================================
105 In order to move version control, we need to make sure that we get all the
106 benefits with the least amount of problems. That's why the migration plan will
107 be slow, one step at a time, and we'll try to make it look as close as possible
108 to the current style without impacting the new features we want.
110 Each LLVM project will continue to be hosted as a separate GitHub repository
111 under a single GitHub organisation. Users can continue to choose to use either
112 SVN or Git to access the repositories to suit their current workflow.
114 In addition, we'll create a repository that will mimic our current *linear
115 history* repository. The most accepted proposal, then, was to have an umbrella
116 project that will contain *sub-modules* (https://git-scm.com/book/en/v2/Git-Tools-Submodules)
117 of all the LLVM projects and nothing else.
119 This repository can be checked out on its own, in order to have *all* LLVM
120 projects in a single check-out, as many people have suggested, but it can also
121 only hold the references to the other projects, and be used for the sole purpose
122 of understanding the *sequence* in which commits were added by using the
123 ``git rev-list --count hash`` or ``git describe hash`` commands.
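
As a rough illustration of the commands above, the sketch below wraps
``git rev-list --count`` in Python to obtain an SVN-like sequence number for a
commit in the umbrella repository. The function name ``revision_number`` and
its injectable ``run`` parameter are illustrative assumptions and are not part
of any LLVM tooling.

```python
import subprocess


def revision_number(commit, repo_dir=".", run=subprocess.check_output):
    """Return how many commits are reachable from `commit`.

    In a strictly linear history this count grows by one per commit,
    so it can stand in for an SVN-style revision number.
    `run` is injectable (an assumption of this sketch) so the function
    can be exercised without a real git checkout.
    """
    out = run(["git", "rev-list", "--count", commit], cwd=repo_dir)
    return int(out.decode().strip())
```

Calling ``revision_number("HEAD")`` inside a checkout of the umbrella
repository would return the position of the current commit in the linear
history, assuming git is installed and the history has no merges.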
125 One example of such a repository is Takumi's llvm-project-submodule
126 (https://github.com/chapuni/llvm-project-submodule), which when checked out,
127 will have the references to all sub-modules but not check them out, so one will
128 need to *init* the module manually. This will allow the *exact* same behaviour
129 as checking out individual SVN repositories, as it will keep the correct linear
130 history.
132 There is no need for additional tags, flags and properties, or external
133 services controlling the history, since both SVN and *git rev-list* can already
134 do that on their own.
136 We will need additional server hooks to avoid non-fast-forward commits (e.g.
137 merges, forced pushes, etc) in order to keep the linearity of the history.
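
The rule such a hook has to enforce can be sketched as follows. This is a
minimal model under stated assumptions: the commit graph is given as a plain
dictionary (``parents``, a hypothetical stand-in for the real repository), and
where the check would actually run (for example, behind a status check) is
left open.

```python
def is_linear_fast_forward(parents, old_head, new_head):
    """Check that moving a branch from old_head to new_head keeps history linear.

    `parents` maps each commit id to the list of its parent ids.
    The update is accepted only if every new commit has exactly one
    parent and walking back from new_head reaches old_head.
    """
    commit = new_head
    while commit != old_head:
        commit_parents = parents.get(commit, [])
        if len(commit_parents) != 1:
            # Zero parents: we ran past the root without meeting old_head,
            # which is what a forced push looks like.
            # Two or more parents: a merge commit.
            return False
        commit = commit_parents[0]
    return True
```

With this rule, a plain push of new commits on top of the current head is
accepted, while a merge commit or a rewritten branch is rejected.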
139 The three types of hooks to be implemented are:
141 * Status Checks: By placing status checks on a protected branch, we can guarantee
142 that the history is kept linear and sane at all times, on all repositories.
143 See: https://help.github.com/articles/about-required-status-checks/
144 * Umbrella updates: By using GitHub web hooks, we can update a small web-service
145 inside LLVM's own infrastructure to update the umbrella project remotely. The
146 maintenance of this service will be lower than the current SVN maintenance and
147 the scope of its failures will be less severe.
148 See: https://developer.github.com/webhooks/
149 * Commits email update: By adding an email web hook, we can make every push show
150 in the lists, allowing us to retain history and do post-commit reviews.
151 See: https://help.github.com/articles/managing-notifications-for-pushes-to-a-repository/
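
To make the umbrella update more concrete, here is a small sketch of what the
web-service might do with a GitHub push event. It assumes, hypothetically,
that each submodule path in the umbrella repository equals the name of the
pushed repository and that only the ``master`` branch is tracked. The function
names are illustrative. Recording the new commit with
``git update-index --cacheinfo`` is one possible approach rather than a
decided design, and the commit and push steps are omitted.

```python
import json


def umbrella_update_from_push(payload_text, branch="master"):
    """Turn a GitHub push-event payload into an umbrella update request.

    Returns (submodule_path, commit_sha), or None when the push should
    be ignored (another branch, a tag, or a branch deletion).
    """
    payload = json.loads(payload_text)
    if payload.get("ref") != "refs/heads/" + branch:
        return None
    if payload.get("deleted"):
        return None
    # Assumption: submodule path == repository name (e.g. "llvm").
    return payload["repository"]["name"], payload["after"]


def gitlink_command(submodule_path, commit_sha):
    """Build a git command that records commit_sha for a submodule.

    Mode 160000 is git's index mode for a submodule ("gitlink") entry,
    so the submodule itself does not need to be checked out.
    """
    return ["git", "update-index", "--cacheinfo",
            "160000,{},{}".format(commit_sha, submodule_path)]
```

The service would run the resulting command inside a clone of the umbrella
repository and then commit and push, one umbrella commit per received push.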
153 Access will be transferred one-to-one to GitHub accounts for everyone that already
154 has commit access to our current repository. Those who don't have accounts will
155 have to create one in order to continue contributing to the project. In the
156 future, people only need to provide their GitHub accounts to be granted access.
158 In a nutshell:
160 * The projects' repositories will remain identical, with a new address (GitHub).
161 * They'll continue to have SVN access (Read-Write), but will also gain Git RW access.
162 * The linear history can still be accessed in the (RO) submodule meta project.
163 * Individual projects' history will be local (i.e. not interlaced with the other
164 projects, as the current SVN repos are), and we need the umbrella project
165 (using submodules) to have the same view as we had in SVN.
167 Additionally, each repository will have the following server hooks:
169 * Pre-commit hooks to stop people from applying non-fast-forward merges
170 * Webhook to update the umbrella project (via buildbot or web services)
171 * Email hook to each commits list (llvm-commit, cfe-commit, etc)
173 Essentially, we're adding Git RW access in addition to the already existing
174 structure, with all the additional benefits of it being in GitHub.
176 What will *not* be changed
177 --------------------------
179 This is a change of version control system, not the whole infrastructure. There
180 are plans to replace our current tools (review, bugs, documents), but they're
181 all orthogonal to this proposal.
183 We'll also be keeping the buildbots (and migrating them to use Git) as well as
184 LNT, and any other system that currently provides value upstream.
186 Any discussion regarding those tools is out of scope for this proposal.
188 Remaining questions and problems
189 ================================
191 1. How much does the SVN view emulate, and how much will it break tools/CI?
193 For this one, we'll need people that will have problems in that area to tell
194 us what's wrong and how to help them fix it.
196 We also recommend that people and companies migrate to Git, for its many other
197 additional benefits.
199 2. Which tools will need changing?
201 LNT may break, since it relies on SVN's history. We can continue to
202 use LNT with the SVN-View, but it would be best to move it to Git once and for
203 all.
205 The LLVMLab bisect tool will also be affected and will need adjusting. As with
206 LNT, it should be fine to use GitHub's SVN view, but changing it to work on Git
207 will be required in the long term.
209 Phabricator will also need to change its configuration to point at the GitHub
210 repositories, but since it already works with Git, this will be a trivial change.
212 Migration Plan
213 ==============
215 If we decide to move, we'll have to set a date for the process to begin.
217 As usual, we should be announcing big changes in one release to happen in the
218 next one. But since this won't impact external users (if they rely on our source
219 release tarballs), we don't necessarily have to.
221 We will have to make sure all the *problems* reported are solved before the
222 final push. But we can start all non-binding processes (like mirroring to GitHub
223 and testing the SVN interface in it) before any hard decision.
225 Here's a proposed plan:
227 STEP #1 : Pre Move
229 0. Update docs to mention the move, so people are aware that it's going on.
230 1. Register an official GitHub project with the LLVM foundation.
231 2. Set up another (read-only) mirror of llvm.org/git at this GitHub project,
232 adding all necessary hooks to avoid broken history (merge, dates, pushes), as
233 well as a webhook to update the umbrella project (see below).
234 3. Make sure we have an llvm-project (with submodules) set up in the official
235 account, with all necessary hooks (history, update, merges).
236 4. Make sure bisecting with llvm-project works.
237 5. Make sure no one has any other blocker.
239 STEP #2 : Git Move
241 6. Update the buildbots to pick up updates and commits from the official git
242 repository.
243 7. Update Phabricator to pick up commits from the official git repository.
244 8. Tell people living downstream to pick up commits from the official git
245 repository.
246 9. Give things time to settle. We could play some games like disabling the SVN
247 repository for a few hours on purpose so that people can test that their
248 infrastructure has really become independent of the SVN repository.
250 Until this point nothing has changed for developers; it will just
251 boil down to a lot of work for buildbot and other infrastructure
252 owners.
254 Once all dependencies are cleared, and all problems have been solved:
256 STEP #3: Write Access Move
258 10. Collect people's GitHub account information, adding them to the project.
259 11. Switch SVN repository to read-only and allow pushes to the GitHub repository.
260 12. Mirror Git to SVN.
262 STEP #4 : Post Move
264 13. Archive the SVN repository, if GitHub's SVN is good enough.
265 14. Review and update *all* LLVM documentation.
266 15. Review website links pointing to viewvc/klaus/phab etc. to point to GitHub
267 instead.