The idea
A command-line tool that reads your project's dependency manifests (package.json, requirements.txt, Cargo.toml, go.mod) and resolves the SPDX license for every direct and transitive dependency. It compares each license against a configurable policy file and, when any dependency violates the policy, exits non-zero with a violation report. Designed to drop into CI so license drift is caught before code ships, not during a legal audit six months later.
Why build this
Companies shipping commercial software need to know whether their dependency tree contains copyleft licenses like GPL-3.0 that can require them to release their own code under the same terms. Most teams discover this late: during an acquisition due-diligence review, or when a lawyer asks. Existing tools like FOSSA and Snyk solve this but are SaaS products whose per-seat pricing puts them out of reach for small teams. The raw data is freely available: npm, PyPI, and crates.io all expose license metadata via public APIs. An open-source CLI that runs in CI without sending your dependency list to a third party fills a real gap.
Stack sketch
- Language: Python. tomllib and json are in the standard library (tomllib since Python 3.11) and requirements-parser is a small third-party package, which makes multi-ecosystem parsing straightforward
- License resolution: npm registry JSON API (registry.npmjs.org/<pkg>/<version>), PyPI JSON API (pypi.org/pypi/<pkg>/<version>/json), crates.io REST API for Rust
- SPDX parsing: spdx-tools Python library for parsing and comparing SPDX expressions, including compound expressions like MIT OR Apache-2.0
- Policy file: a single YAML file listing either allowed licenses (allowlist mode) or denied licenses (denylist mode)
- Output: human-readable table, JSON, and JUnit XML for CI integration (GitHub Actions, GitLab CI, Jenkins)
- Caching: local SQLite cache keyed by package + version, invalidated after 7 days, with a --cache-dir flag for a shared CI cache volume
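The caching bullet could be sketched roughly as follows. This is a minimal illustration, not a settled design: the class name LicenseCache, the file name licenses.sqlite3, the extra ecosystem column in the key, and the injectable clock are all assumptions made for the example.

```python
import sqlite3
import time
from pathlib import Path

TTL_SECONDS = 7 * 24 * 60 * 60  # entries older than 7 days count as stale


class LicenseCache:
    """Tiny SQLite-backed cache keyed by (ecosystem, package, version)."""

    def __init__(self, cache_dir, now=time.time):
        # cache_dir would come from --cache-dir; the file name is hypothetical.
        path = Path(cache_dir)
        path.mkdir(parents=True, exist_ok=True)
        self._now = now  # injectable clock, which makes expiry easy to test
        self._db = sqlite3.connect(str(path / "licenses.sqlite3"))
        self._db.execute(
            "CREATE TABLE IF NOT EXISTS licenses ("
            " ecosystem TEXT, package TEXT, version TEXT,"
            " license TEXT, fetched_at REAL,"
            " PRIMARY KEY (ecosystem, package, version))"
        )

    def get(self, ecosystem, package, version):
        """Return the cached license, or None if missing or older than the TTL."""
        row = self._db.execute(
            "SELECT license, fetched_at FROM licenses"
            " WHERE ecosystem = ? AND package = ? AND version = ?",
            (ecosystem, package, version),
        ).fetchone()
        if row is None or self._now() - row[1] > TTL_SECONDS:
            return None  # caller should fall back to the registry
        return row[0]

    def put(self, ecosystem, package, version, license_id):
        """Store or refresh an entry, stamping it with the current time."""
        self._db.execute(
            "INSERT OR REPLACE INTO licenses VALUES (?, ?, ?, ?, ?)",
            (ecosystem, package, version, license_id, self._now()),
        )
        self._db.commit()

    def close(self):
        self._db.close()
```

Adding the ecosystem to the key is a guess at what a multi-registry tool would need, since the same package name can exist on both npm and PyPI.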
Scope for v1
- Scan package.json (npm) and requirements.txt (PyPI); these two ecosystems cover the majority of mixed projects
- Resolve direct dependencies only; log a warning that transitive resolution is skipped, with a --transitive flag as a follow-up stub
- YAML policy with an allowlist of SPDX identifiers (MIT, Apache-2.0, BSD-2-Clause, etc.)
- Three distinct exit codes: 0 for pass, 1 for policy violations, 2 for resolution errors, so CI can distinguish "you have a GPL dep" from "the registry was unreachable"
- A --unknown-licenses flag accepting warn, fail, or ignore to handle packages with no declared license
- No Cargo or Go support in v1
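A rough sketch of how the v1 pieces might fit together: reading direct dependencies from both manifests, then mapping results to the three exit codes. The function names are hypothetical, the requirements.txt reader is a deliberately simplified stand-in for requirements-parser that only understands pinned name==version lines, and the choice to let resolution errors take priority over policy violations is an assumption the scope list does not settle.

```python
import json
import re

EXIT_PASS, EXIT_VIOLATION, EXIT_RESOLUTION_ERROR = 0, 1, 2


def direct_npm_deps(package_json_text):
    """Return {name: version_spec} from the "dependencies" table only."""
    manifest = json.loads(package_json_text)
    return dict(manifest.get("dependencies", {}))


def direct_pypi_deps(requirements_text):
    """Return {name: version} for plain `name==version` lines.

    Simplified on purpose: comments, blank lines, option lines such as
    `-r other.txt`, and unpinned requirements are all skipped.
    """
    deps = {}
    for line in requirements_text.splitlines():
        line = line.split("#", 1)[0].strip()
        match = re.fullmatch(r"([A-Za-z0-9._-]+)\s*==\s*([^\s;]+)", line)
        if match:
            deps[match.group(1)] = match.group(2)
    return deps


def exit_code(resolved, allowlist, unknown_mode="warn"):
    """Map resolution results to one of the three exit codes.

    resolved maps package name -> SPDX id, None (no usable license),
    or an Exception instance (the registry lookup itself failed).
    """
    violations, errors = [], []
    for name, result in sorted(resolved.items()):
        if isinstance(result, Exception):
            errors.append(name)
        elif result is None:
            if unknown_mode == "fail":
                violations.append(name)
            elif unknown_mode == "warn":
                print(f"warning: {name} has no recognised license")
            # "ignore" falls through silently
        elif result not in allowlist:
            violations.append(name)
    if errors:
        # Assumption: an incomplete scan is reported before policy results.
        return EXIT_RESOLUTION_ERROR
    return EXIT_VIOLATION if violations else EXIT_PASS
```

For instance, exit_code({"a": "MIT", "b": "GPL-3.0-only"}, {"MIT"}) evaluates to 1, while the same call with a TimeoutError in place of one result evaluates to 2.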
Where it could go
Add Cargo and Go module support to cover Rust and Go shops. crates.io exposes a license field through its public API; go.mod carries no license field, so Go support would probably mean detecting licenses from the LICENSE files in each module. A --suggest flag could propose MIT-licensed alternatives for violating packages where a known drop-in exists, turning the tool from a linter into an active dependency advisor.
A lightweight self-hosted dashboard that ingests the JSON report over time would let legal and engineering share a single compliance view without both needing CLI access. At that point the project crosses from a developer utility into a fundable niche: open-core compliance tooling for teams that can't afford enterprise SaaS but have real legal obligations.
Watch out for
Many packages declare no license field at all, or use a freeform string like "UNLICENSED" or "SEE LICENSE IN LICENSE.md" that doesn't map to an SPDX identifier. Handle this from day one with the --unknown-licenses flag, or teams will see noisy failures on their first run and disable the check entirely.
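One way to triage raw license values before the policy check is sketched below. The function name classify_license, the reason strings, and the tiny KNOWN_SPDX_IDS sample are assumptions for illustration; a real tool would load the full SPDX license list and parse compound expressions properly.

```python
import re

# Assumption: a hand-picked sample, standing in for the full SPDX license list.
KNOWN_SPDX_IDS = {
    "MIT", "Apache-2.0", "BSD-2-Clause", "BSD-3-Clause", "ISC", "GPL-3.0-only",
}
_BY_LOWER = {spdx_id.lower(): spdx_id for spdx_id in KNOWN_SPDX_IDS}


def classify_license(raw):
    """Return ("spdx", canonical_id) or ("unknown", reason) for a raw value."""
    if raw is None or not str(raw).strip():
        return ("unknown", "no license declared")
    text = str(raw).strip()
    if text.upper() == "UNLICENSED":
        return ("unknown", "explicitly unlicensed")
    if re.match(r"see license in\b", text, flags=re.IGNORECASE):
        return ("unknown", "license is in a file, not in metadata")
    # Case-insensitive lookup that hands back the canonical spelling.
    canonical = _BY_LOWER.get(text.lower())
    if canonical:
        return ("spdx", canonical)
    return ("unknown", f"unrecognised value: {text!r}")
```

Everything tagged "unknown" would then be routed through whichever mode --unknown-licenses selects. Note that "UNLICENSED" is deliberately kept separate from "Unlicense", which is a genuine SPDX identifier for a public-domain dedication.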