Commit 4536040

Add new Inputs section in the documentation (#1965)
Signed-off-by: tdruez <tdruez@aboutcode.org>
1 parent 0b80226 commit 4536040

3 files changed, 230 insertions(+), 2 deletions(-)

docs/faq.rst

Lines changed: 24 additions & 0 deletions
@@ -95,6 +95,30 @@ existing data, allowing for more comprehensive analysis and insights.
 It's essential to set up :ref:`MatchCode.io <scancodeio_settings_matchcodeio>` before
 executing this pipeline.
 
+What input types are supported?
+-------------------------------
+
+ScanCode.io supports **multiple input types** for your projects:
+
+- **File Upload**: Upload archives, source files, packages, or SBOMs directly.
+  See :ref:`inputs_file_upload`.
+
+- **Download URL**: Provide an HTTP/HTTPS URL to fetch remote files.
+  See :ref:`inputs_download_url`.
+
+- **Package URL (PURL)**: Reference packages from popular registries (npm, PyPI,
+  Maven, Cargo, NuGet, RubyGems, and more) using the PURL specification.
+  See :ref:`inputs_package_url`.
+
+- **Docker Reference**: Fetch Docker images directly from container registries
+  using the ``docker://`` syntax.
+  See :ref:`inputs_docker_reference`.
+
+- **Git Repository**: Clone a Git repository using its HTTPS URL.
+  See :ref:`inputs_git_repository`.
+
+For complete details on all input methods, refer to the :ref:`inputs` documentation.
+
 What is the difference between scan_codebase and scan_single_package pipelines?
 -------------------------------------------------------------------------------

docs/index.rst

Lines changed: 3 additions & 2 deletions
@@ -47,11 +47,12 @@ In this documentation, you’ll find:
    custom-pipelines
    scanpipe-pipes
    project-configuration
-   policies
-   data-models
+   inputs
    output-files
    command-line-interface
    rest-api
+   policies
+   data-models
    automation
    webhooks
    application-settings

docs/inputs.rst

Lines changed: 203 additions & 0 deletions
@@ -0,0 +1,203 @@
.. _inputs:

Inputs
======

ScanCode.io supports multiple input types for projects, giving you flexibility in how
you supply data for analysis. This section covers all supported input methods.

.. _inputs_file_upload:

File Upload
-----------

You can **upload files directly** to a project through the Web UI or REST API.
Supported file types include archives (e.g., ``.tar``, ``.zip``, ``.tar.gz``),
individual source files, pre-built packages, and **SBOMs** (SPDX or CycloneDX in
JSON format).

When uploading through the Web UI, navigate to your project and use the upload
interface in the "Inputs" panel.

For REST API uploads, refer to the :ref:`rest_api` documentation for endpoint details.

.. _inputs_download_url:

Download URL
------------

Instead of uploading files directly, you can provide a **URL pointing to a remote file**.
ScanCode.io will fetch the file and add it to your project inputs.

**HTTP and HTTPS URLs** are supported::

    https://example.com/path/to/archive.tar.gz

The fetcher handles HTTP redirects and extracts the filename from either the
``Content-Disposition`` header or the URL path.
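
The filename lookup described above can be illustrated with a minimal sketch that uses only the Python standard library. The function name ``filename_from_response`` is hypothetical, and checking the header before the URL path is an assumption; this is not ScanCode.io's own code.

```python
from email.message import Message
from pathlib import PurePosixPath
from urllib.parse import unquote, urlparse


def filename_from_response(url, content_disposition=None):
    """Return a filename from the Content-Disposition header, else from the URL path."""
    if content_disposition:
        # Reuse the stdlib MIME header parser to read the "filename" parameter.
        message = Message()
        message["Content-Disposition"] = content_disposition
        filename = message.get_filename()
        if filename:
            # Keep only the last path component of whatever the server sent.
            return PurePosixPath(filename).name

    # Fall back to the last segment of the URL path (query string ignored).
    path = unquote(urlparse(url).path)
    return PurePosixPath(path).name or None


print(filename_from_response("https://example.com/path/to/archive.tar.gz"))
print(
    filename_from_response(
        "https://example.com/download?id=42",
        'attachment; filename="release-1.0.zip"',
    )
)
```

The first call falls back to the URL path and the second one reads the header. A URL whose path has no final segment yields ``None`` in this sketch.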

.. tip::
    For files behind authentication, see :ref:`inputs_authentication`.

.. _inputs_package_url:

Package URL (PURL)
------------------

ScanCode.io can fetch packages from the package repositories listed below using the
`Package URL (PURL) specification <https://github.com/package-url/purl-spec>`_.

A **PURL** is a URL string used to identify and locate a software package in a
mostly universal and uniform way across package managers and ecosystems.

The **general PURL syntax** is::

    pkg:<type>/<namespace>/<name>@<version>?<qualifiers>#<subpath>
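
To make the syntax concrete, here is a simplified sketch that splits a PURL string into the components named above. The ``parse_purl`` helper is hypothetical and deliberately skips many rules of the specification (type-specific normalization, validation, edge cases in percent-encoding); a maintained PURL library is the better choice for real use.

```python
from urllib.parse import parse_qsl, unquote


def parse_purl(purl):
    """Split a PURL into its components (simplified, not spec-complete)."""
    scheme, _, remainder = purl.partition(":")
    if scheme != "pkg" or not remainder:
        raise ValueError(f"Not a Package URL: {purl!r}")

    # Peel off the optional trailing parts first: subpath, then qualifiers.
    remainder, _, subpath = remainder.partition("#")
    remainder, _, qualifiers = remainder.partition("?")

    # The version, when present, follows the last "@".
    version = ""
    if "@" in remainder:
        remainder, _, version = remainder.rpartition("@")

    # What is left is <type>/<namespace>/<name>, where namespace is optional.
    package_type, _, path = remainder.strip("/").partition("/")
    namespace, _, name = path.rpartition("/")

    return {
        "type": package_type.lower(),
        "namespace": unquote(namespace) or None,
        "name": unquote(name),
        "version": unquote(version) or None,
        "qualifiers": dict(parse_qsl(qualifiers)),
        "subpath": unquote(subpath) or None,
    }


print(parse_purl("pkg:cargo/rand@0.7.2"))
```

Components that are absent from the input come back as ``None`` (or an empty dict for qualifiers) in this sketch.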

Cargo (Rust)
^^^^^^^^^^^^

Fetches packages from `crates.io <https://crates.io/>`_::

    pkg:cargo/rand@0.7.2

Resolves to: ``https://crates.io/api/v1/crates/rand/0.7.2/download``

RubyGems
^^^^^^^^

Fetches packages from `rubygems.org <https://rubygems.org/>`_::

    pkg:gem/bundler@2.3.23

Resolves to: ``https://rubygems.org/downloads/bundler-2.3.23.gem``

npm
^^^

Fetches packages from the `npm registry <https://www.npmjs.com/>`_::

    pkg:npm/is-npm@1.0.0

Resolves to: ``https://registry.npmjs.org/is-npm/-/is-npm-1.0.0.tgz``

Hackage (Haskell)
^^^^^^^^^^^^^^^^^

Fetches packages from `Hackage <https://hackage.haskell.org/>`_::

    pkg:hackage/cli-extras@0.2.0.0

Resolves to: ``https://hackage.haskell.org/package/cli-extras-0.2.0.0/cli-extras-0.2.0.0.tar.gz``

NuGet (.NET)
^^^^^^^^^^^^

Fetches packages from `nuget.org <https://www.nuget.org/>`_::

    pkg:nuget/System.Text.Json@6.0.6

Resolves to: ``https://www.nuget.org/api/v2/package/System.Text.Json/6.0.6``
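
The Cargo, RubyGems, npm, Hackage, and NuGet examples above each map a name and a version onto a fixed URL pattern. The sketch below restates those patterns as string templates taken from the "Resolves to" lines. The ``DOWNLOAD_URL_TEMPLATES`` table and ``simple_download_url`` function are hypothetical names, and packages with a namespace (such as scoped npm packages) are not handled.

```python
# Templates copied from the "Resolves to" examples in this section.
# The names below are illustrative, not ScanCode.io identifiers.
DOWNLOAD_URL_TEMPLATES = {
    "cargo": "https://crates.io/api/v1/crates/{name}/{version}/download",
    "gem": "https://rubygems.org/downloads/{name}-{version}.gem",
    "npm": "https://registry.npmjs.org/{name}/-/{name}-{version}.tgz",
    "hackage": "https://hackage.haskell.org/package/{name}-{version}/{name}-{version}.tar.gz",
    "nuget": "https://www.nuget.org/api/v2/package/{name}/{version}",
}


def simple_download_url(package_type, name, version):
    """Fill in the URL template for a package that has no namespace."""
    try:
        template = DOWNLOAD_URL_TEMPLATES[package_type]
    except KeyError:
        raise ValueError(f"No template for package type {package_type!r}") from None
    return template.format(name=name, version=version)


print(simple_download_url("npm", "is-npm", "1.0.0"))
```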

GitHub
^^^^^^

Fetches release archives from `GitHub <https://github.com/>`_ repositories::

    pkg:github/aboutcode-org/scancode-toolkit@3.1.1?version_prefix=v

Resolves to: ``https://github.com/aboutcode-org/scancode-toolkit/archive/v3.1.1.tar.gz``

The ``version_prefix`` qualifier is used when the repository tags include a prefix
(commonly ``v``) before the version number.
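
As a small illustration of how the prefix ends up in the archive URL, the sketch below joins the prefix and the version into the tag name. The ``github_archive_url`` function is a hypothetical helper that follows the URL pattern of the example above.

```python
def github_archive_url(namespace, name, version, version_prefix=""):
    """Build the archive URL for a tag named <version_prefix><version>."""
    tag = f"{version_prefix}{version}"
    return f"https://github.com/{namespace}/{name}/archive/{tag}.tar.gz"


print(github_archive_url("aboutcode-org", "scancode-toolkit", "3.1.1", version_prefix="v"))
```

Without the qualifier, the tag is assumed to be the bare version number.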

Bitbucket
^^^^^^^^^

Fetches archives from `Bitbucket <https://bitbucket.org/>`_ repositories::

    pkg:bitbucket/robeden/trove@3.0.3

Resolves to: ``https://bitbucket.org/robeden/trove/get/3.0.3.tar.gz``

GitLab
^^^^^^

Fetches archives from `GitLab <https://gitlab.com/>`_ repositories::

    pkg:gitlab/tg1999/firebase@1a122122

Resolves to: ``https://gitlab.com/tg1999/firebase/-/archive/1a122122/firebase-1a122122.tar.gz``

Maven (Java)
^^^^^^^^^^^^

Fetches artifacts from Maven repositories. The default repository is Maven Central::

    pkg:maven/org.apache.commons/commons-io@1.3.2

Resolves to: ``https://repo.maven.apache.org/maven2/org/apache/commons/commons-io/1.3.2/commons-io-1.3.2.jar``

You can specify an alternative repository using the ``repository_url`` qualifier::

    pkg:maven/org.apache.commons/commons-io@1.3.2?repository_url=https://repo1.maven.org/maven2

You can also fetch POM files or source JARs using the ``type`` and ``classifier``
qualifiers::

    pkg:maven/org.apache.commons/commons-io@1.3.2?type=pom
    pkg:maven/org.apache.commons/commons-math3@3.6.1?classifier=sources
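
The way these qualifiers shape the final URL can be sketched with the standard Maven repository layout, in which the dots of the group ID become path separators and the classifier is appended to the file name. The ``maven_artifact_url`` function and its parameter names are hypothetical; the ``extension`` parameter stands in for the ``type`` qualifier.

```python
MAVEN_CENTRAL = "https://repo.maven.apache.org/maven2"


def maven_artifact_url(
    group_id,
    artifact_id,
    version,
    repository_url=MAVEN_CENTRAL,
    extension="jar",
    classifier=None,
):
    """Build an artifact URL following the standard Maven repository layout."""
    # Dots in the group ID map to directory separators.
    group_path = group_id.replace(".", "/")

    # File name: <artifact>-<version>[-<classifier>].<extension>
    filename = f"{artifact_id}-{version}"
    if classifier:
        filename += f"-{classifier}"
    filename += f".{extension}"

    base = repository_url.rstrip("/")
    return f"{base}/{group_path}/{artifact_id}/{version}/{filename}"


print(maven_artifact_url("org.apache.commons", "commons-io", "1.3.2"))
print(maven_artifact_url("org.apache.commons", "commons-math3", "3.6.1", classifier="sources"))
```

The first call reproduces the "Resolves to" URL shown for the default repository.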

.. _inputs_docker_reference:

Docker Reference
----------------

ScanCode.io can **fetch Docker images directly** from container registries using the
``docker://`` reference syntax.

Examples::

    docker://nginx:latest
    docker://alpine:3.22.1
    docker://ghcr.io/perfai-inc/perfai-engine:main
    docker://osadl/alpine-docker-base-image:v3.22-latest

The Docker image fetcher uses `Skopeo <https://github.com/containers/skopeo>`_ under
the hood. When fetching multi-platform images, ScanCode.io automatically selects the
first available platform.
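
The text above does not say how the platform is picked internally, so the following is only a sketch of one way to read "first available platform". It assumes the raw manifest of a multi-platform image is an image index with a ``manifests`` list, as printed for example by ``skopeo inspect --raw``. The ``first_platform`` function and the sample digests are made up for illustration.

```python
import json


def first_platform(raw_manifest):
    """Return (os, architecture, variant) of the first listed platform, or None."""
    manifest = json.loads(raw_manifest)
    # Single-platform images have no top-level "manifests" list.
    for entry in manifest.get("manifests", []):
        platform = entry.get("platform") or {}
        os_name = platform.get("os")
        architecture = platform.get("architecture")
        if os_name and architecture:
            return os_name, architecture, platform.get("variant")
    return None


# Hand-written sample index; the digests are placeholders, not real images.
SAMPLE_INDEX = json.dumps(
    {
        "schemaVersion": 2,
        "mediaType": "application/vnd.oci.image.index.v1+json",
        "manifests": [
            {"digest": "sha256:aaa", "platform": {"architecture": "amd64", "os": "linux"}},
            {
                "digest": "sha256:bbb",
                "platform": {"architecture": "arm64", "os": "linux", "variant": "v8"},
            },
        ],
    }
)

print(first_platform(SAMPLE_INDEX))
```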

For private registries requiring authentication, see the following settings:

- :ref:`SCANCODEIO_SKOPEO_CREDENTIALS <scancodeio_settings_skopeo_credentials>`
- :ref:`SCANCODEIO_SKOPEO_AUTHFILE_LOCATION <scancodeio_settings_skopeo_authfile_location>`

.. _inputs_git_repository:

Git Repository
--------------

You can provide a **Git repository URL** as project input. The repository is
shallow-cloned (only the latest commit is fetched, not the full history) at the start
of pipeline execution.

Example::

    https://github.com/aboutcode-org/scancode.io.git

.. note::
    SSH URLs (``git@github.com:...``) are not supported. Use HTTPS URLs instead.
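
Taken together, the input forms described in this section can be told apart by their prefix or suffix. The sketch below shows one way to do that; it is not ScanCode.io's own dispatch logic. The ``classify_input`` function and its labels are hypothetical, and treating a ``.git`` suffix as the marker for a repository is an assumption.

```python
from urllib.parse import urlparse


def classify_input(value):
    """Return a label for the kind of input, or raise ValueError if unsupported."""
    value = value.strip()
    if value.startswith("pkg:"):
        return "package_url"
    if value.startswith("docker://"):
        return "docker_reference"
    if value.startswith("git@"):
        raise ValueError("SSH Git URLs are not supported; use an HTTPS URL instead.")

    scheme = urlparse(value).scheme.lower()
    if scheme in ("http", "https"):
        # Assumption: an HTTPS URL ending in ".git" is a repository to clone.
        if scheme == "https" and value.rstrip("/").endswith(".git"):
            return "git_repository"
        return "download_url"

    raise ValueError(f"Unsupported input: {value!r}")


for example in (
    "https://example.com/path/to/archive.tar.gz",
    "pkg:npm/is-npm@1.0.0",
    "docker://nginx:latest",
    "https://github.com/aboutcode-org/scancode.io.git",
):
    print(example, "->", classify_input(example))
```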

.. _inputs_authentication:

Authentication
--------------

For files hosted on private servers or behind authentication, several settings are
available to configure credentials. See :ref:`scancodeio_settings_fetch_authentication`
for details on:

- :ref:`Basic authentication <scancodeio_settings_fetch_basic_auth>`
- :ref:`Digest authentication <scancodeio_settings_fetch_digest_auth>`
- :ref:`HTTP request headers <scancodeio_settings_fetch_headers>` (e.g., for GitHub tokens)
- :ref:`.netrc file <scancodeio_settings_netrc_location>`
- :ref:`Docker private registries <scancodeio_settings_skopeo_credentials>`
