feat: add Bitbucket SCM provider support #328
Abhinandan-Khurana wants to merge 6 commits into ColeMurray:main from
Conversation
Greptile Summary

This PR adds Bitbucket Cloud as a fully-implemented SCM provider alongside GitHub, covering the control plane, web auth flow, sandbox clone-token wiring, and Terraform/Modal deployment config. The abstraction is well-structured: provider selection, token refresh, workspace enforcement, and repo/branch listing all follow consistent patterns and are backed by new unit tests. Issues found:
Diff context (`packages/control-plane/src/source-control/providers/bitbucket-provider.ts`):

```typescript
      Accept: "application/json",
    };
  }

  throw new SourceControlProviderError(
    "Bitbucket app credentials not configured",
    "permanent"
  );
}

private async buildApiHeaders(auth?: SourceControlAuthContext): Promise<HeadersInit> {
  if (auth?.token) {
    return {
      Authorization: `Bearer ${auth.token}`,
      Accept: "application/json",
    };
  }

  return this.buildAppHeaders();
}

private async fetchJson<T>(url: string, init?: RequestInit): Promise<T> {
  const response = await fetch(url, init);
  if (!response.ok) {
    const text = await response.text();
```
**Client-credentials token fetched on every app-level API call**

`buildAppHeaders` calls `getBitbucketClientCredentialsToken` unconditionally on every invocation — there is no token caching:

```typescript
if (this.config.clientId && this.config.clientSecret) {
  const token = await getBitbucketClientCredentialsToken({ ... }); // network call every time
  return { Authorization: `Bearer ${token.access_token}`, ... };
}
```

Because `buildApiHeaders` falls through to `buildAppHeaders` whenever no user `auth` is supplied, this means:

- **`listRepositories`** (pagination loop): a new token is fetched from Bitbucket for *each page*.
- **`generatePushAuth`**: also calls `getBitbucketClientCredentialsToken` directly — another separate token fetch per sandbox creation.
- **Repo-image scheduler**: `getBranchHeadSha` is called in a `Promise.all` over all enabled repos, each triggering a fresh token fetch.

Client-credentials tokens are valid for at least one hour. The token should be cached inside the provider instance (or at module scope with an expiry check) so that multiple calls within the same request — or across requests within the same Worker instance — reuse it rather than hitting Bitbucket's token endpoint on every call.
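A minimal sketch of the suggested caching, assuming the usual `{ access_token, expires_in }` shape of a client-credentials token response. The `TokenCache` class and the injected fetcher are illustrative stand-ins, not the provider's real API; in the provider, the fetcher would wrap `getBitbucketClientCredentialsToken`:

```typescript
// Hypothetical cached-token shape (illustrative, not the provider's real type).
interface CachedToken {
  accessToken: string;
  expiresAt: number; // epoch millis
}

class TokenCache {
  private cached: CachedToken | null = null;
  private pending: Promise<CachedToken> | null = null;

  constructor(
    // Injected fetcher; in the real provider this would wrap
    // getBitbucketClientCredentialsToken.
    private fetchToken: () => Promise<{ access_token: string; expires_in: number }>
  ) {}

  async get(): Promise<string> {
    // Reuse the cached token while it has more than 60s of validity left.
    if (this.cached && this.cached.expiresAt - Date.now() > 60_000) {
      return this.cached.accessToken;
    }
    // Coalesce concurrent callers onto a single in-flight fetch, so a
    // Promise.all over many repos triggers at most one token request.
    if (!this.pending) {
      this.pending = this.fetchToken()
        .then((t) => {
          const entry: CachedToken = {
            accessToken: t.access_token,
            expiresAt: Date.now() + t.expires_in * 1000,
          };
          this.cached = entry;
          return entry;
        })
        .finally(() => {
          this.pending = null;
        });
    }
    const inflight = this.pending;
    return (await inflight).accessToken;
  }
}
```

`buildAppHeaders` and `generatePushAuth` would both call `cache.get()` instead of hitting the token endpoint directly; the 60-second margin avoids handing out a token that expires mid-request.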
Path: `packages/control-plane/src/source-control/providers/bitbucket-provider.ts`, lines 73-97.
Diff context (`packages/control-plane/src/source-control/providers/bitbucket-provider.ts`):

```typescript
    targetBranch: data.destination?.branch?.name ?? config.targetBranch,
  };
}

async checkRepositoryAccess(
  config: GetRepositoryConfig,
  auth?: SourceControlAuthContext
): Promise<RepositoryAccessResult | null> {
  try {
    const data = await this.fetchJson<BitbucketRepositoryResponse>(
      `${BITBUCKET_API_BASE}/repositories/${encodeURIComponent(config.owner)}/${encodeURIComponent(config.name)}`,
      {
        headers: await this.buildApiHeaders(auth),
      }
    );
    const repository = this.mapRepository(data);
```
**`getBranchHeadSha` skips the workspace enforcement check**

Every other method that fetches repo data (`checkRepositoryAccess`, `listBranches`) calls `ensureRepositoryInWorkspace` before trusting the Bitbucket response. `getBranchHeadSha` fetches directly without that guard:

```typescript
async getBranchHeadSha(config: GetBranchHeadShaConfig): Promise<string | null> {
  const data = await this.fetchJson<...>(
    `.../${config.owner}/${config.name}/refs/branches/...`,
    { headers: await this.buildAppHeaders() } // no workspace check
  );
  return data.target?.hash ?? null;
}
```

While the repos passed to `getBranchHeadSha` today are sourced from `resolveInstalledRepo` (which already validates the workspace), the omission means:

1. The contract documented on other methods is silently broken here.
2. If `getBranchHeadSha` is called through the interface in a future code path, or if a cross-workspace repo somehow ends up in the DB, the branch SHA of an out-of-workspace repo would be resolved and returned without any error.

Adding the same workspace check that `checkRepositoryAccess` performs (by fetching and calling `ensureRepositoryInWorkspace` on the repo response first, or by reusing `checkRepositoryAccess`) would make the method consistent with the rest of the provider.
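A sketch of what the guarded version could look like. The type shapes, the `allowedWorkspace` field, and the injected `fetchJson` are simplified stand-ins for the provider's real internals; the point is the ordering — fetch the repo, enforce the workspace, and only then resolve the branch head:

```typescript
// Simplified stand-ins for the provider's types (hypothetical shapes).
interface RepoConfig { owner: string; name: string; branch: string; }
interface RepoResponse { workspace?: { slug?: string }; }
interface BranchResponse { target?: { hash?: string }; }

class WorkspaceError extends Error {}

class BitbucketProviderSketch {
  constructor(
    private allowedWorkspace: string,
    private fetchJson: (url: string) => Promise<unknown>
  ) {}

  // Same guard checkRepositoryAccess applies before trusting repo data.
  private ensureRepositoryInWorkspace(repo: RepoResponse): void {
    if (repo.workspace?.slug !== this.allowedWorkspace) {
      throw new WorkspaceError("repository is outside the configured workspace");
    }
  }

  async getBranchHeadSha(config: RepoConfig): Promise<string | null> {
    const base =
      `https://api.bitbucket.org/2.0/repositories/` +
      `${encodeURIComponent(config.owner)}/${encodeURIComponent(config.name)}`;
    // Fetch the repository first and enforce the workspace before
    // resolving any branch data for it.
    const repo = (await this.fetchJson(base)) as RepoResponse;
    this.ensureRepositoryInWorkspace(repo);
    const data = (await this.fetchJson(
      `${base}/refs/branches/${encodeURIComponent(config.branch)}`
    )) as BranchResponse;
    return data.target?.hash ?? null;
  }
}
```

The extra repo fetch costs one API call per invocation; reusing `checkRepositoryAccess`, as the comment suggests, would get the same guarantee without duplicating the guard.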
Path: `packages/control-plane/src/source-control/providers/bitbucket-provider.ts`, lines 214-229.
Diff context (`packages/web/src/lib/auth.ts`):

```typescript
}

function cacheBitbucketJwtState(keys: string[], state: CachedBitbucketJwtState): void {
  for (const key of keys) {
    cachedBitbucketJwtStates.set(key, state);
```
**Unbounded memory growth in `cachedBitbucketJwtStates`**

The `cacheBitbucketJwtState` function only ever adds entries to the Map — it never removes stale or rotated keys. After every successful token rotation inside `getRefreshableScmJwtToken`, the new refresh token is registered as a new map key but the previous key is never deleted.

In a long-lived server process (self-hosted Next.js, warm Vercel function instances, Cloudflare Workers module workers), every token rotation leaves an orphaned entry behind. For a deployment with many active Bitbucket users refreshing tokens hourly, the Map grows without bound over time.

The fix is to delete the old `initialCacheKeys` from the map inside the `.then()` callback after the new keys have been registered, so only the current refresh-token key remains active per user state.
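A self-contained sketch of the suggested eviction. `JwtState` and the function names are simplified from the real `auth.ts` helpers; the shape of the fix — register the new keys, then delete the `initialCacheKeys` that were not re-registered — is what matters:

```typescript
// Hypothetical simplified state; the real CachedBitbucketJwtState carries tokens.
interface JwtState { accessToken: string; }

const cachedStates = new Map<string, JwtState>();

// Mirrors cacheBitbucketJwtState: registers a state under every given key.
function cacheJwtState(keys: string[], state: JwtState): void {
  for (const key of keys) cachedStates.set(key, state);
}

// Sketch of the rotation path: after the refreshed token is cached under its
// new keys, the keys belonging to the previous refresh token are evicted so
// the Map holds only the current entry per user state.
function rotateJwtState(
  initialCacheKeys: string[],
  newKeys: string[],
  newState: JwtState
): void {
  cacheJwtState(newKeys, newState);
  for (const oldKey of initialCacheKeys) {
    // Guard against deleting a key that was re-registered this rotation.
    if (!newKeys.includes(oldKey)) {
      cachedStates.delete(oldKey);
    }
  }
}
```

In `getRefreshableScmJwtToken` this cleanup would live in the `.then()` callback after the new keys are registered, as the comment proposes, keeping Map size proportional to active users rather than to total rotations.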
Path: `packages/web/src/lib/auth.ts`, lines 207-211.
closing due to inactivity
Summary
Add Bitbucket as a supported SCM provider across the web app, control plane, and deployment config, with follow-up fixes to harden auth and workspace-scoped behavior.
Included
Verification
- `node node_modules/vitest/vitest.mjs run src/lib/auth.test.ts --config vitest.config.ts`
- `npm test -w @open-inspect/control-plane -- src/source-control/providers/bitbucket-provider.test.ts`
- `npm test -w @open-inspect/control-plane -- src/source-control/config.test.ts`
- `npm test -w @open-inspect/control-plane -- src/session/participant-service.test.ts`