Change pull request refs after Commit instead of after Approval or workaround?

Deleted user December 3, 2013

Hi

We are trying to implement workflow in Stash using pull-requests reviews. And also connect this with our Jenkins CI...

But we encountered couple of troubles due to that and all exisiting solutions seems doesn't help

What we want to acomplish is described in those steps:

1. Create Branch from JIRA link in the stash

2. Pull created branch locally and start working on it

3. Commit changes to remote branch

4. Open Pull Request in Stash with changed branch

5. After Pull Request is openned trigger "Pull Request Build Job" in Jenkins

(There was some mentioning about that like here , and we solved this using Git Plugin in Jenkins and specifying refspec parameters: "+refs/pull-requests/*:refs/remotes/origin/pull-requests/*" and building the "origin/pull-requests/*/from" branch)

Also we are using Jenkins WebHook plugin to trigger job by commit to Stash repo

6. Initiate Review process with changes disscussion

7. (important) Do commit changes and update pull-request with new commits due to disscussion

8. (important) Trigger Jenkins build for new commits in the "Pull Request Build Job"

9. Review again, approve, merge

And in point 8 comes the trouble, cause stash updates the reference - "refs/pull-requests/*/from" only if it was Approved, the same with "refs/pull-requests/*/merge" when it's merged....And by commit, "Jenkins WebHook Plugin" is triggered but poll scm at jenkins side is doing nothing cause nothing updated in the remote refs... We tried also to force refs update, but got - "Stash manages these refs automatically, and they may not be updated by users.". And it makes a real pain in the process and make not much sense to use "Pull Requests" cause developer can't check his commit (if it's build successfully) after he updated pull request with disscussed changes, so to show reviewers that it passes, but he needs to wait approve and than again initiate unaprove, which already strange...

There is ofcourse option to specify branch pattern without using refspecs, but it could trigger not needed jobs and would trigger some of them before pull request was made - which is totally what we don't need...

Is there any workaround or a way to update Stash pull-requests refspecs or add additional which will be up to date with pull-request initiate branch ??

I think in github they have for this purpose "refs/pull/*/head"

5 answers

1 accepted

6 votes
Answer accepted
Bryan Turner
Atlassian Team
Atlassian Team members are employees working across the company in a wide variety of roles.
December 3, 2013

Sergey,

I am the developer who wrote the majority of the server side of Stash's pull request implementation. I'll try to answer this question.

Foremost: I'm going to assume when you say "we are using Jenkins WebHook plugin to trigger job by commit to Stash" you mean when there is a push to Stash, since Stash doesn't support a commit hook (and a manually installed one would never be triggered).

And in point 8 comes the trouble, cause stash updates the reference - "refs/pull-requests/*/from" only if it was Approved, the same with "refs/pull-requests/*/merge" when it's merged....And by commit, "Jenkins WebHook Plugin" is triggered but poll scm at jenkins side is doing nothing cause nothing updated in the remote refs...

This is incorrect. Whether the pull request is approved has no bearing on when refs/pull-requests/*/from is updated, and while actually merging a pull request will ensure refs/pull-requests/*/merge is up-to-date, it is certainly not the only way to do so.

The hashes on a pull request change continuously. The majority of the pull requests in a given repository target a very small number of branches (usually master, for example). Every change to master updates every pull request which targets it. If the system aggressively merged pull requests on every change to their hashes, it simply would not scale. Even throwing a massive box with SSD storage at it would not be enough to keep up with even a moderately busy repository.

To manage that, Stash performs the merge lazily. All you need to do to get the refs/pull-requests/*/* refs to update is view the pull request. Viewing the pull request will automatically trigger those refs on disk to be updated to match their state in the database (which is updated automatically after every push or pull request merge). If the pull request has comments on it, that will also trigger the refs to be updated automatically by a background thread in Stash.

Neither of those will happen in time for your hook, however. Stash doesn't update anything (the database or the refs on disk) synchronouusly with pushes. The amount of processing that will be involved is not sufficiently stable to allow it to be done that way. It may be very fast, or it may take a noticeable amount of time depending on the number of pull requests targeting a given branch.

And it makes a real pain in the process and make not much sense to use "Pull Requests" cause developer can't check his commit (if it's build successfully) after he updated pull request with disscussed changes, so to show reviewers that it passes, but he needs to wait approve and than again initiate unaprove

Actually, all the developer would need to do is view the pull request and the refs would be updated. Approving or unapproving the pull request doesn't do anything in terms of updating them.

Before I continue, there are two things I want to note:

  1. The way pull request refs work changes radically in Stash 2.9. Prior to 2.9, pull request refs accumulate endlessly. Even when a pull request is declined or merged the system does not remove its entries in refs/pull-requests. As of Stash 2.9, once a pull request is declined or merged its refs are removed and can no longer be fetched. This helps keep the repositories working more quickly because having tens of thousands of refs imposes very noticeable slowdown on many operations, as well as preventing the ref advertisement from becoming very large.
  2. Doing anything which makes Stash update the pull request refs on disk more frequently may substantially increase the load on your server and cause performance problems. The merges are done lazily because they are very expensive.

With that said. The hook that you are using could be changed to add something like this:

@EventListener
public void onPullRequestRescoped(PullRequestRescopedEvent event) {
    PullRequest pullRequest = event.getPullRequest();

    //Only trigger this if the pull request was rescoped on the from side, meaning new changes
    //were pushed. Doing this after every change on the to side will cause severe performance
    //degredation for your Stash server because it happens too often
    if (!(event.getPreviousFromHash().equals(pullRequest.getFromRef().getLatestChangeset() ||
            //Using getToRef() here is required; pull requests are "scoped" to their target repository
            pullRequestService.canMerge(pullRequest.getToRef().getRepository().getId(), pullRequest.getId()).isConflicted())) {
        //Notify Jenkins; the pull request refs have been updated
    }
}

Calling PullRequestService.canMerge will force the refs to be updated if they don't match the database. PullRequestRescopedEvent is raised whenever the database hashes are updated. As noted in the comments in the code above, you will only want to do this when the from ref on the pull request is updated. If you call it on every rescope event you'll bury your server in merges (and your Jenkins server too, for that matter).

If PullRequestService.canMerge(...).isConflicted() returns true, it means the pull request has conflicts which will prevent it from being merged. If you're building the refs/pull-requests/*/merge commit, it won't build cleanly (and, from Stash 2.9+, that ref does not exist at all if the pull request has conflicts). Hence you'll only want to trigger Jenkins to build the pull request if it can be merged. If you're building refs/pull-requests/*/from, you may choose to notify Jenkins even if isConflicted() returns true; that ref will always exist as long as the pull request is open, even if it has conflicts (even on 2.9+)

Hope this helps,
Bryan Turner
Atlassian Stash

Deleted user December 3, 2013

Hi Bryan,

Thank you very much for fast reply and very detailed answer!

Best Regards

vassilevsky October 8, 2014

Bryan, thank for the detailed answer. We would still much prefer our Stash instance to update all refs/pull-requests/*/merge branches after every change. Sure, that will give the CPU and the disks a good load of work, but isn't it what they are for? We need to rebuild everything in the background, so that when we open a pull request, we already see the results. Is there a secret configuration parameter to turn off the laziness? Stash 3.2.0.

Bryan Turner
Atlassian Team
Atlassian Team members are employees working across the company in a wide variety of roles.
October 9, 2014

Ilya, The short answer is no, there's no such setting. The longer answer is still no, because there really is no such setting; I'm not keeping any secrets. But the longer answer comes with questions about whether you have monitoring in place on your hardware. Just serving hosting operations, especially the continuous clones performed by heavy CI, generally already "give the CPU and the disks a good load of work". Do you know how much idle CPU you have? Do you know how much spare disk I/O you have? Have you turned on profiling to see how long updating those merges is taking (per merge; you can look for "git: create new merge" in the profiling logs)? Before attempting to make merges less lazy, those are details it's pretty important to not only know but have continuous monitoring on. Best regards, Bryan Turner Atlassian Stash

vassilevsky October 14, 2014

Zabbix data: 7d CPU usage: 53.2% max, 2.85% avg. 7d load average: 3.28% max, 0.185% avg. Unfortunately, no disk i/o data available. It's a SAS RAID.

Linh Nguyen December 30, 2014

Hi Bryan, The answer is very detailed and makes a lot of sense. Our team is using this pull request refspec extensively, and we really don't have that many pull requests (only 9 or 10 at a time). We would love to have a feature that Ilya requested. I understand the CPU/load concern, but we are aware of that. Could it be possible to make it an optional setting, for example, Stash by default will perform the lazy reference updates, but it is still an option for us to turn on the continuous update if we wish? I am sure it would be very helpful for not only our team, but other teams as well. Also, as a workaround, is there anyway for us to automate the "view" of the pull requests? We would want to have a cron job to view all the pull request every so often to update the reference like you said. (The webhook plugin no longer works for us). I tried using curl/Stash rest apis, none of those work. I have to have my browser open and refresh the pull request page for the references to be updated.

Adel Ben Yacoub January 27, 2017

Bryan, thank you for this answer.

I was able to trigger refs to update by calling the "changes" API method :

https://developer.atlassian.com/static/rest/stash/3.11.6/stash-rest.html#idp2587184

hope it helps.

dilipraj November 17, 2018

@Adel Ben Yacoub Hi, Adel. How did you establish Authentication? Did you use any plugin or you did it using vanilla rest?

I'm struggling with establishing authentication with Jenkins job.

0 votes
Joerg S July 8, 2019

No sure whether the issue we're currently facing relate to this issue. But it is looking very similar.

We have a setup with a forked repository where we create PR's to the original repository. This used to work fine. However since to day we're facing the following issue:

The `/refs/pull-requests/*/from` and `/refs/pull-requests/*/merge` references are getting updated immediately after updating the PR with a new commit.

However the new commit will not yet be available. When I last checked it took about 1 minute until the commit was available for the next fetch.

This of course makes it impossible to work with PR's for this repository. Furthermore intermittent fails happen even in unrelated jobs which are simply trying to checkout master. But since the referenced commit cannot be found on the remote even a simple `git fetch` is failing:

stderr: error: refs/pull-requests/66/from does not point to a valid object!
error: Could not read 7a99185626f667ab8c34d502417c47e99778e633
error: Could not read 7a99185626f667ab8c34d502417c47e99778e633
remote: error: Could not read 7a99185626f667ab8c34d502417c47e99778e633        
remote: fatal: revision walk setup failed        
error: git upload-pack: git-pack-objects died with error.
fatal: git upload-pack: aborting due to possible repository corruption on the remote side.
remote: aborting due to possible repository corruption on the remote side.
fatal: protocol error: bad pack header
Bryan Turner
Atlassian Team
Atlassian Team members are employees working across the company in a wide variety of roles.
July 8, 2019

@Joerg S

That's definitely not this issue. You should open a support ticket, or have someone who is an administrator on your Bitbucket Server instance open one, so our support team can help diagnose and address the issue. (https://support.atlassian.com)

The refs are only ever updated on disk after the necessary objects have been fetched into the repository (and Git itself has code that verifies this), so there must be more to this story than simple ref updates.

Best regards,
Bryan Turner
Atlassian Bitbucket

0 votes
Jean-Serge Gagnon December 9, 2016

This appears to be a very old thread but I was wondering if the newest version of Bitbucket had changed this behavior.

Our infrastructure team recently updated from Stash to Bitbucket v4.10.2 and we are seeing this behavior now. My understanding is that the upgrade was specifically for performance reasons.

Basically, we have a ton of CI jobs that we have configured to trigger jenkins git polling on commits to pull requests - this is no longer working so we are guessing it is a result of this lazy ref update process as the git polling trigger happens on a new push, but jenkins always says "no changes found" unless we do a new push many hours later or we visit the pull request and click the trigger jenkins button.

I guess I don't understand the benefit since anytime anyone does a push, they would manually visit the pull request which will cause the refs update to happen anyway - I personally don't see any of the developers not wanting a build when they push code to a pull request branch, so they would always visit the pull request thereby causing the refs update... 

 Perhaps there's an option in Bitbucket 4.10.2 to enable refs update (or disable the lazy update)?

Bryan Turner
Atlassian Team
Atlassian Team members are employees working across the company in a wide variety of roles.
August 3, 2017

> I guess I don't understand the benefit since anytime anyone does a push, they would manually visit the pull request which will cause the refs update to happen anyway

There's some truth to that, but that's only thinking about the from side of the pull request. Ultimately, very few of the rescopes that happen to a pull request over its lifetime are on the from side; the vast majority tend to be on the target side.

In most repositories, the majority of pull requests are to the development branch. Looking at the repository for Bitbucket Server's own source code at the moment, for example, shows 11 of the 21 pull requests currently open at for our master branch. In the Confluence repository, 31 of 41 pull requests are targeting master--and 6 of those are over one year old. If all of those pull requests were eagerly re-merged whenever they were rescoped they'd be producing significant server load for no purpose; it's pretty clear no one is viewing those pull requests.

> Perhaps there's an option in Bitbucket 4.10.2 to enable refs update (or disable the lazy update)?

Sorry, no. The behavior remains unchanged all the way through current 5.x releases.

Matei Copot July 11, 2019

What about just updating the from, without a merge? Maybe creating a new ref `refs/pull-requests/*/from-latest` so the rest is backwards-compatible, that acts like a symlink to `refs/heads/<pr branch>`? This shouldn't be computationally expensive at all.

We have a policy to rebase on master before creating a PR anyway, so we can set up teamcity to only build the `from` part and we'll be mostly fine. Sadly, `from` currently updates with the same rules that `merge` does, so it doesn't make much of a difference, but I don't think a `from-latest` would be an unreasonable ask

Like jonas.dekegel likes this
0 votes
Chuck Burgess December 5, 2013

I do like the sound of that 2.9 improvement about PR refs tracking.

In our setup, our repo is still new enough and small enough that there are only a few PRs in existence so far. There are probably more individual repos right now as we migrate than there are total PRs.

In the instances where I've seen "my bug", the PR has been from my own fork. So, that would preclude the issue of access control. Also, looking at the PR's Overview view was not enough to clear the 404s. Yes, the list of commits on the PR were correctly visible and listed on the PR's Commits view. Further, links to all the older commits worked fine. It was only for the newer commits, those that had happened since the last time anyone had hit the PR's Diff view, that exhibited the 404s. And again, until this 404 issue cleared, no Jenkins builds would happen. As a matter of fact, even going to Jenkins and issuing a manual build was no good, because the SCM lookup would not see the new commits either (these build jobs watch the from, merge, and merge-clean refs). But once the 404 issue was resolved, the builds were triggered. In a couple of cases, this 404 issue existed over an entire weekend, because a Friday commit never triggered a build, until a Diff rendering on Monday suddenly triggered the build. Also, in thinking about viewing the Overview page kicking off a perhaps slow lazy merge, some of these 404 instances went on for over an hour while I scratched my head about why they were happening. I even tried declining a PR and reopening it... no fix. I then opened a new PR from the same fork... no fix, as the same commits were 404ing on the new PR too. The only fix came after viewing the Diff view.

Bryan Turner
Atlassian Team
Atlassian Team members are employees working across the company in a wide variety of roles.
December 5, 2013

Chuck,

Foremost, I need to correct one thing. The URLs for the web UI are not "smart" about selecting repositories--it's notification e-mail links that do that. My apologies for that mistake. Since clicking on a commit in a notification e-mail doesn't hit the overview (or any other pull request screen), it's more likely they will trigger this type of error. Stash handles those differently to try and avoid that. We may need to modify the web UI to use the same approach.

That said, this sounds like a different issue. I would recommend creating an issue on https://support.atlassian.com so that we can help debug this further, because viewing the overview should guarantee that all commits are available within 10-12 seconds (at the extreme end; really it should be 3 seconds or less). If you open a support request, we can work with you to turn up/adjust some logging and trace out the cause of the issue.

We use forks and cross-repository pull requests extensively internally (all of my own pull requests are from my personal fork, for example, as I develop Stash itself), and I've never seen the commits not be available within seconds of viewing the overview (so fast I've never actually triggered a 404 on one).

Best regards,
Bryan Turner
Atlassian Stash

0 votes
Chuck Burgess December 5, 2013

Hey Bryan,

That lazy merge might explain the occasional "link to a PR's new commit will repeatedly 404" behavior I've seen (Stash v2.6.2). When I've seen this happen, trying to view individual commits on a PR, the link to the Commit page will 404. Viewing the PR itself was not enough to clear the issue. However, viewing the PR's overall Diff page does fix it, for all Commits that were 404ing.

Clearing this 404 issue this way also results in the Jenkins notifications happening.

Is it actually the Diff view being generated the first time that actually does this lazy merge, rather than just viewing the PR in any other way?

Bryan Turner
Atlassian Team
Atlassian Team members are employees working across the company in a wide variety of roles.
December 5, 2013

Chuck,

If the pull request is cross-repository, the lazy merge is what fetches the changesets into the destination repository. To be clear about your problem, viewing the commit list on the pull request works, correct? It's only when you click into a specific commit that you get a 404? The most likely cause is that the pull request is cross-repository, probably coming from another user's personal fork, and you do not have access to the source repository, only the target. In that situation, the URL Stash builds for viewing the changeset is in the target repository and the lazy merge is the piece that makes it all work. Since a pull request is automatically opened after it's created, the merge should be kicked off automatically as well. However, there is a known race condition there, or after new changes are pushed to the pull request's source branch.

Viewing the pull request overview triggers the lazy merge via REST to allow it to enable/disable the "Merge" button. It does _not_ block page rendering. For viewing the diff, loading the tree on the left or the diff on the right, either one, will block what you see until the merge is complete. This means visiting the overview will also fix it, but it won't be as obvious when "enough" time has gone by for the merge to complete.

If the target repository has had a lot of pull requests in it, or if you're running Stash on an underpowered or virtual machine, these merges can sometimes take several secodns. Prior to Stash 2.9, the more pull requests a repository has had the slower the merge becomes, due to the buildup of their refs. Stash 2.9 introduces changes to clean up those refs. Internally, on a repository with a little under 3,000 pull requests so far, we noticed significant speed improvements for pull requests with those changes in place. See STASH-3469 for more details on that change.

Hope this helps,
Bryan Turner
Atlassian Stash

Suggest an answer

Log in or Sign up to answer
TAGS
AUG Leaders

Atlassian Community Events