legal-ai/paperclip-bug-report.md
Chaim b409f1c7eb Add case data, benchmark embeddings, and bug report
Add cases symlink, Google Vision extraction and benchmark
embedding data, and Paperclip bug report.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-09 17:20:40 +00:00


Bug: Skill import from Gitea — wrong raw URL format causes empty SKILL.md

File at: https://github.com/paperclipai/paperclip/issues/new

Title

Skill import from Gitea: wrong raw URL format causes empty SKILL.md

Body

Bug Summary

When importing skills from a self-hosted Gitea instance, Paperclip fetches the git tree successfully via the /api/v3/ endpoint (which Gitea supports). It then uses the wrong raw file URL format to download the SKILL.md content, which returns a 404, so only a near-empty stub is saved.

Environment

  • Paperclip server: @paperclipai/server@2026.403.0
  • Gitea instance: self-hosted Gitea

Steps to Reproduce

  1. Host a skill repo on a Gitea instance with a SKILL.md (32 KB or larger) plus scripts/ and references/ directories
  2. Import the skill via URL: https://my-gitea.example.com/org/skill-name.git
  3. Observe that only a stub SKILL.md (~283 bytes) is saved, and subdirectories are missing

Root Cause

In server/dist/services/github-fetch.js, the resolveRawGitHubUrl() function builds:

https://{hostname}/raw/{owner}/{repo}/{ref}/{file}

This format works for GitHub Enterprise, but not for Gitea. Gitea expects:

https://{hostname}/{owner}/{repo}/raw/branch/{ref}/{file}
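The two layouts can be compared side by side with a minimal sketch. The helper names `gheStyleRawUrl` and `giteaRawUrl` and the `target` object are hypothetical and are not taken from Paperclip's source; only the URL templates come from this report.

```javascript
// Hypothetical helpers for illustration; not Paperclip's actual API.

// Layout that resolveRawGitHubUrl() is reported to build.
function gheStyleRawUrl({ hostname, owner, repo, ref, file }) {
  return `https://${hostname}/raw/${owner}/${repo}/${ref}/${file}`;
}

// Layout Gitea expects when the ref is a branch.
function giteaRawUrl({ hostname, owner, repo, ref, file }) {
  return `https://${hostname}/${owner}/${repo}/raw/branch/${ref}/${file}`;
}

// Example values mirroring the placeholder host used in this report.
const target = {
  hostname: "my-gitea.example.com",
  owner: "org",
  repo: "skill-repo",
  ref: "main",
  file: "SKILL.md",
};

console.log(gheStyleRawUrl(target));
console.log(giteaRawUrl(target));
```

The sketch covers branch refs only, since that is the only form this report exercises.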

Proof

# Paperclip's URL format -> 404
$ curl -s -o /dev/null -w "%{http_code}" "https://my-gitea.example.com/raw/org/skill-repo/main/SKILL.md"
404

# Correct Gitea format -> 200
$ curl -s -o /dev/null -w "%{http_code}" "https://my-gitea.example.com/org/skill-repo/raw/branch/main/SKILL.md"
200

Secondary Issue

When SKILL.md is at the repository root, path.posix.dirname("SKILL.md") returns ".", so the inventory filter becomes entry.startsWith("./"). Repository-relative tree paths such as scripts/ and references/ do not begin with "./", so every sibling directory is missed. Even if the raw URL worked, subdirectories would still be excluded from the file inventory.
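The behaviour can be reproduced in isolation with a short sketch. The `entries` array and the `inSkillDir` helper are hypothetical. The sketch also assumes the filter prefix is derived from the skill directory, which is not confirmed against Paperclip's source.

```javascript
const path = require("node:path");

// Hypothetical tree entries, shaped like repository-relative paths.
const entries = ["SKILL.md", "scripts/run.sh", "references/notes.md"];

// For a root-level SKILL.md, dirname yields "." rather than an empty string.
const skillDir = path.posix.dirname("SKILL.md");

// Assumed shape of the current filter: the prefix becomes "./",
// which no repository-relative path starts with.
const current = entries.filter((entry) => entry.startsWith(`${skillDir}/`));

// One possible fix: treat "." as the repository root and keep every entry.
function inSkillDir(entry, dir) {
  return dir === "." || entry.startsWith(`${dir}/`);
}
const fixed = entries.filter((entry) => inSkillDir(entry, skillDir));

console.log(skillDir); // "."
console.log(current);  // []
console.log(fixed);    // all three entries
```

For a skill in a subdirectory, `inSkillDir` falls back to the prefix check, so only the root-level case changes.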

Suggested Fix

  1. Detect Gitea vs GitHub Enterprise (e.g., probe for the /api/v1/ endpoint, which is Gitea-specific, versus /api/v3/)
  2. Use the correct raw URL format per platform:
    • GitHub/GHE: https://{hostname}/raw/{owner}/{repo}/{ref}/{file}
    • Gitea: https://{hostname}/{owner}/{repo}/raw/branch/{ref}/{file}
  3. Fix root-level SKILL.md inventory: when skillDir === ".", include all files instead of filtering by entry.startsWith("./")
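Step 1 could be sketched as a small probe against Gitea's `/api/v1/version` endpoint, which returns JSON with a `version` field. The function name `detectPlatform`, the injectable `fetchImpl` parameter, the stub responses, and the host `ghe.example.com` are all hypothetical. This is one possible detection strategy under those assumptions, not Paperclip's implementation, and other forges that expose the same endpoint would also be classified as "gitea".

```javascript
// Hypothetical sketch: fetchImpl is injectable so the probe can run offline.
async function detectPlatform(hostname, fetchImpl = fetch) {
  try {
    const res = await fetchImpl(`https://${hostname}/api/v1/version`);
    if (res.ok) {
      const body = await res.json();
      if (body && typeof body.version === "string") return "gitea";
    }
  } catch {
    // Network or JSON parse failure: fall through to the default.
  }
  // Default to the existing GitHub / GitHub Enterprise behaviour.
  return "github";
}

// Offline demo with stubbed responses (the values are made up).
const stubGitea = async () => ({ ok: true, json: async () => ({ version: "1.21.0" }) });
const stubNotFound = async () => ({ ok: false, json: async () => ({}) });

async function demo() {
  console.log(await detectPlatform("my-gitea.example.com", stubGitea));  // gitea
  console.log(await detectPlatform("ghe.example.com", stubNotFound));    // github
}
demo();
```

Defaulting to "github" on any failure keeps current behaviour unchanged for hosts where the probe is blocked or times out.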

Workaround

Manually clone the repo into ~/.paperclip/instances/default/skills/{company_id}/{slug}/, then update the company_skills table directly with the correct markdown content and file_inventory.