When pushing a new branch to a Stash repository, a pre-receive hook gets a RefChange with fromHash=0. Is this the intended behaviour?
Putting that in a ChangesetsBetweenRequest.Builder.exclude(fromHash)... leads to the full repo history being returned.
How can I get only the changesets that are not already in the target repo?
Hi Marian,
This is, perhaps somewhat counter-intuitively, expected behaviour of Git (and thus Stash). Git makes no distinction about what are new commits vs commits that _just happened_ to exist in the repository when you pushed.
Perhaps an ASCII diagram to help:
o---o---o---o---o A
     \
      --o---o---o---o B
             \
              --o---o C
The question is: what commits are 'on' branch C? Is it 2 commits, or 4? The answer (as far as Git is concerned) depends entirely on what you're comparing it to. If you compare it to B then it has 2 commits, but if you compare it to A then it has 4. The fact that you might have pushed C after B is irrelevant, and as such when you create a new branch the hook is passed fromHash=0, and not the last 'seen' commit. What would happen if branch B had been pushed and then deleted? If Git hadn't gc'd those commits yet then it would need to walk the entire graph to work out what was now 'visible'.
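To make the "2 or 4" point concrete, here is a minimal sketch of the diagram as a toy commit graph. The commit names (a1, b1, c1, ...) and the exact fork points are assumptions made for illustration, and commitsOn() is a hypothetical helper, not anything from Stash or Git. It does roughly what excluding a commit from a history walk does: keep everything reachable from the tip, drop everything reachable from the excluded tips.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Toy model of the diagram; commit names and fork points are assumptions.
class CommitGraphDemo {
    // Each commit maps to its parent commits (an empty list marks the root).
    static final Map<String, List<String>> PARENTS = new HashMap<>();
    static {
        link("a1", "a2", "a3", "a4", "a5"); // branch A, tip a5
        link("a2", "b1", "b2", "b3", "b4"); // branch B forks from a2, tip b4
        link("b2", "c1", "c2");             // branch C forks from b2, tip c2
    }

    // Records a chain in which each commit is the parent of the next one.
    static void link(String... chain) {
        PARENTS.putIfAbsent(chain[0], List.of());
        for (int i = 1; i < chain.length; i++) {
            PARENTS.put(chain[i], List.of(chain[i - 1]));
        }
    }

    // Every commit reachable from the tip by following parent links.
    static Set<String> reachable(String tip) {
        Set<String> seen = new HashSet<>();
        Deque<String> todo = new ArrayDeque<>();
        todo.push(tip);
        while (!todo.isEmpty()) {
            String commit = todo.pop();
            if (seen.add(commit)) {
                PARENTS.get(commit).forEach(todo::push);
            }
        }
        return seen;
    }

    // Commits reachable from the tip but from none of the excluded tips.
    static Set<String> commitsOn(String tip, String... excludedTips) {
        Set<String> result = reachable(tip);
        for (String excluded : excludedTips) {
            result.removeAll(reachable(excluded));
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(commitsOn("c2", "b4").size()); // 2 (compared to B)
        System.out.println(commitsOn("c2", "a5").size()); // 4 (compared to A)
        System.out.println(commitsOn("c2").size());       // 6 (nothing excluded)
    }
}
```

The last call is the fromHash=0 situation from the question: with nothing to exclude, the walk returns the whole history behind the new branch.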
I won't lie though - this is the cause of many frustrations for Stash, because it would be very handy to know things like - which branch a commit was first 'seen' on, and who 'pushed' it. (And wait until you start having to worry about forks).
For now, as you've discovered, you can use ChangesetIndex.isMemberOf(), which is something we use to index which commits we've 'seen' before. But that is definitely a Stash specific tool/concept, and not related to how Git actually works. We may enhance how some of this works over time to meet our own requirements, but it's not going to be a trivial task.
I hope that helps?
Charles
Thanks for the explanation, it definitely helps to understand the problem.
I do have one problem with ChangesetIndex.isMemberOf() though. I consistently see the same result with these steps: fork, clone, create a new branch, push the new branch to Stash (no new commits). For all commits (which already exist), isMemberOf() returns false. The javadoc does state: "true if the provided changeset is (indexed as) a member of repository"
So it seems the repo is not yet indexed. This raises the question: how do I trigger an indexing of a Repository?
Thanks,
Marian
Hi Marian,
Indexing happens on another thread just after a push. Note that it can't happen during a pre-receive because it needs the refs to exist in the repository. Basically you can't really trigger it manually.
As I hinted at in my previous message, forks get even harder. We basically don't fully index forks because of the potential explosion in DB relationships. For large repositories each commit would have a relationship with every fork. We are still investigating how to efficiently store/retrieve this information.
My suggestion for forks would be to maybe try calling getChangeset() instead of isMemberOf(), which will return null if it was the first time it was pushed.
Let me know if that works.
Charles
Hi Charles,
indeed, getChangeset() works on the forked repo where isMemberOf() does not.
Can you confirm that getChangeset bypasses the ChangesetIndex when getting the data?
Many thanks,
Marian
Hi Marian,
It doesn't bypass the index at all - it's a direct connection to the database entry. While this is going to sound confusing, we do index commits in a fork, but we don't index the 'membership'. So a null value from that method indicates that Stash has never seen the commit before in _any_ repository, whereas a non-null IndexedChangeset indicates that Stash has indexed it _somewhere_, but IndexedChangeset.getRepositories() may return nothing if the commit is only on a fork. When you merge that commit to the main repository it will then be indexed correctly and isMemberOf() will start to work for that parent repository only.
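The three cases described above could be sketched roughly like this. The nested ChangesetIndex and IndexedChangeset interfaces here are simplified stand-ins named after the Stash types mentioned in this thread, not the real API (for instance, they use plain repository names instead of Repository objects), and classify(), SeenState and demoIndex() are hypothetical names. Check the Stash javadoc for the real signatures before relying on any of it.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

class SeenStateDemo {
    // Simplified stand-ins for the Stash types in this thread (NOT the real API).
    interface IndexedChangeset {
        Set<String> getRepositories(); // repository names, an assumption for brevity
    }

    interface ChangesetIndex {
        IndexedChangeset getChangeset(String changesetId); // null when never indexed
    }

    enum SeenState { NEVER_SEEN, INDEXED_WITHOUT_MEMBERSHIP, MEMBER }

    // Maps the two lookups onto the three situations described in the answer.
    static SeenState classify(ChangesetIndex index, String changesetId, String repository) {
        IndexedChangeset indexed = index.getChangeset(changesetId);
        if (indexed == null) {
            return SeenState.NEVER_SEEN; // not indexed in any repository
        }
        return indexed.getRepositories().contains(repository)
                ? SeenState.MEMBER                      // membership is indexed
                : SeenState.INDEXED_WITHOUT_MEMBERSHIP; // e.g. commit only on a fork
    }

    // Hypothetical in-memory index used only to exercise classify().
    static ChangesetIndex demoIndex() {
        Map<String, Set<String>> data = new HashMap<>();
        data.put("merged1", Set.of("main-repo")); // merged into the main repository
        data.put("forkonly1", Set.of());          // indexed, but no membership recorded
        return id -> {
            Set<String> repos = data.get(id);
            if (repos == null) {
                return null;
            }
            return () -> repos;
        };
    }

    public static void main(String[] args) {
        ChangesetIndex index = demoIndex();
        System.out.println(classify(index, "merged1", "main-repo"));   // MEMBER
        System.out.println(classify(index, "forkonly1", "main-repo")); // INDEXED_WITHOUT_MEMBERSHIP
        System.out.println(classify(index, "brandnew1", "main-repo")); // NEVER_SEEN
    }
}
```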
Does that make sense?
You're really getting in the nitty-gritty of how Stash works at this point, and my hope is that over time we can introduce a more accurate (but not too costly) indexing of forks.
Cheers,
Charles
Hey Charles,
thanks for the insight. I think I get the idea - which is probably wrong :)
For our workflow it looks like getChangeset is a good substitute, as we are not fork-heavy and "not in any forked repo or master" is identical to "not in this repo", but this won't be the case for everyone out there.
Looking forward to seeing those improvements soon and thanks again for your support,
Marian
Is there an Atlassian feature request for a simpler API for this? It's a common function for most pre-receive hooks. As well as the way mentioned here, the YACC plugin basically excludes refs/heads/*, which seems to work, although it feels like a different hack (see getBranches in https://github.com/sford/yet-another-commit-checker/blob/master/src/main/java/com/isroot/stash/plugin/ChangesetsServiceImpl.java#L84)
BTW, the Atlassian-supplied (but not supported) filesize plugin (https://bitbucket.org/atlassianlabs/stash-filesize-hook-plugin) also gets this wrong: with a small max-file-size configured, using the demo project "git clone https://stash/stash/projects/PROJECT_1/repos/rep_1/browse;git checkout -b test_branch; git push" fails because the plugin checks all the commits in the branch history.
Marian / Charles - Can either of you show me some code that uses "ChangesetIndex.getChangeset"? [ ChangesetIndex is an interface, not a class, and I don't see any class in the API that implements ChangesetIndex ]
Hi Zeeshan, Simply use constructor injection, see this topic: https://answers.atlassian.com/questions/245320/is-there-any-way-to-access-a-changesetindex-without-implementing-one-yourself
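As a rough illustration of what constructor injection looks like here: the class just declares the interface as a constructor parameter and never instantiates it itself. In the sketch below the nested ChangesetIndex is a simplified stand-in (NOT the real Stash interface), NewChangesetFilter, isNewRef() and neverSeen() are hypothetical names, and main() passes in a fake implementation where a real plugin would presumably receive one from the plugin container. It also assumes that the fromHash=0 from the question shows up as an all-zero hash string.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

class NewChangesetFilter {
    // Simplified stand-in for Stash's ChangesetIndex (NOT the real interface).
    interface ChangesetIndex {
        Object getChangeset(String changesetId); // null when never indexed
    }

    private final ChangesetIndex changesetIndex;

    // Constructor injection: the dependency is handed in, never created here.
    NewChangesetFilter(ChangesetIndex changesetIndex) {
        this.changesetIndex = changesetIndex;
    }

    // True when fromHash consists only of zeros, i.e. the ref is being created.
    static boolean isNewRef(String fromHash) {
        return fromHash != null && fromHash.matches("0+");
    }

    // Keeps only the changesets the index has never seen, preserving order.
    List<String> neverSeen(List<String> changesetIds) {
        List<String> result = new ArrayList<>();
        for (String id : changesetIds) {
            if (changesetIndex.getChangeset(id) == null) {
                result.add(id);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Fake index for the demo: only a1 and a2 count as already indexed.
        Set<String> alreadyIndexed = Set.of("a1", "a2");
        NewChangesetFilter filter =
                new NewChangesetFilter(id -> alreadyIndexed.contains(id) ? id : null);

        System.out.println(isNewRef("0000000000000000000000000000000000000000")); // true
        System.out.println(filter.neverSeen(List.of("a1", "a2", "c1", "c2")));    // [c1, c2]
    }
}
```

Because the index arrives through the constructor, the same class can be exercised with a fake like the one in main() without a running Stash instance.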
Found a workaround using ChangesetIndex.isMemberOf()
Still, it would be nice to find out if fromHash=0 is the intended behaviour.