zephyr/scripts/west_commands/zspdx/spdxids.py
Steve Winslow fd31b9b4ac west: spdx: Generate SPDX 2.2 tag-value documents
This adds support to generate SPDX 2.2 tag-value documents via the
new west spdx command. The CMake file-based APIs are leveraged to
create relationships from source files to the corresponding
generated build files. SPDX-License-Identifier comments in source
files are scanned and filled into the SPDX documents.

Before `west build` is run, a specific file must be created in the
build directory so that CMake generates its file-based API reply.
This can be done by running:

    west spdx --init -d BUILD_DIR
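
The initialization step amounts to placing a query file where CMake's file-based API looks for one. The following is a minimal sketch, not the command's actual implementation: it assumes the query is the stateless `codemodel-v2` file under `.cmake/api/v1/query/` (the location defined by CMake's file API), and `create_query_file` is a hypothetical helper name.

```python
import os
import tempfile


def create_query_file(build_dir):
    """Create a CMake file-API query file under build_dir (hypothetical helper)."""
    query_dir = os.path.join(build_dir, ".cmake", "api", "v1", "query")
    os.makedirs(query_dir, exist_ok=True)
    query_path = os.path.join(query_dir, "codemodel-v2")
    # Assumption: the presence of the file is what matters, so it is left empty.
    with open(query_path, "w"):
        pass
    return query_path


if __name__ == "__main__":
    # Demonstrate against a throwaway directory instead of a real build tree.
    with tempfile.TemporaryDirectory() as build_dir:
        print(create_query_file(build_dir))
```

With such a file in place, the next CMake configure run can write its reply files, which a later `west spdx -d BUILD_DIR` call would read.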

After `west build` is run, SPDX generation is then activated by
calling `west spdx`; currently this requires passing the build
directory as a parameter again:

    west spdx -d BUILD_DIR

This will generate three SPDX documents in `BUILD_DIR/spdx/`:

1) `app.spdx`: This contains the bill-of-materials for the
application source files used for the build.

2) `zephyr.spdx`: This contains the bill-of-materials for the
specific Zephyr source code files that are used for the build.

3) `build.spdx`: This contains the bill-of-materials for the built
output files.

Each file in the bill-of-materials is scanned, so that its hashes
(SHA256 and SHA1) can be recorded, along with any detected licenses
if an `SPDX-License-Identifier` appears in the file.
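
The per-file scan described above can be sketched as follows. This is an illustrative approximation under stated assumptions, not the scanner shipped with `west spdx`: `scan_bytes` and `scan_file` are hypothetical names, and the regular expression assumes the license expression runs to the end of the line, optionally followed by a closing comment marker.

```python
import hashlib
import re

# Assumed pattern: everything after the tag, up to the end of the line.
LICENSE_RE = re.compile(r"SPDX-License-Identifier:\s*(.+?)\s*$")


def scan_bytes(data):
    """Return SHA1, SHA256 and the first detected license expression (or None)."""
    license_id = None
    text = data.decode("utf-8", errors="replace")
    for line in text.splitlines():
        match = LICENSE_RE.search(line)
        if match:
            # Drop a trailing C-style comment terminator such as " */".
            license_id = match.group(1).rstrip("*/ ").strip()
            break
    return {
        "sha1": hashlib.sha1(data).hexdigest(),
        "sha256": hashlib.sha256(data).hexdigest(),
        "license": license_id,
    }


def scan_file(path):
    """Read a file in binary mode and scan its contents."""
    with open(path, "rb") as f:
        return scan_bytes(f.read())


if __name__ == "__main__":
    sample = b"/* SPDX-License-Identifier: Apache-2.0 */\nint main(void) { return 0; }\n"
    print(scan_bytes(sample))
```

Hashing the raw bytes rather than decoded text keeps the checksums independent of the file's encoding.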

SPDX Relationships are created to indicate dependencies between
CMake build targets; build targets that are linked together; and
source files that are compiled to generate the built library files.
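
To make the relationship idea concrete, here is a small sketch that emits SPDX tag-value `Relationship:` lines. The relationship types used (`GENERATED_FROM`, `STATIC_LINK`) are defined by SPDX 2.2, but which types `west spdx` actually emits is an assumption here, and the SPDX IDs and the `format_relationship` helper are hypothetical.

```python
def format_relationship(from_id, rel_type, to_id):
    """Format one SPDX tag-value relationship line (hypothetical helper)."""
    return f"Relationship: {from_id} {rel_type} {to_id}"


def relationships_for_build(sources_by_output, links_by_output):
    """Build relationship lines from two illustrative mappings.

    sources_by_output: built file ID -> list of source file IDs
    links_by_output:   built file ID -> list of linked library IDs
    """
    lines = []
    for output_id, source_ids in sources_by_output.items():
        for source_id in source_ids:
            lines.append(format_relationship(output_id, "GENERATED_FROM", source_id))
    for output_id, lib_ids in links_by_output.items():
        for lib_id in lib_ids:
            lines.append(format_relationship(output_id, "STATIC_LINK", lib_id))
    return lines


if __name__ == "__main__":
    # All IDs below are made up for illustration.
    sources = {"SPDXRef-File-libapp.a": ["SPDXRef-File-main.c"]}
    links = {"SPDXRef-File-zephyr.elf": ["SPDXRef-File-libapp.a"]}
    for line in relationships_for_build(sources, links):
        print(line)
```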

`west spdx` can be called with optional parameters for further
configuration:

* `-n PREFIX`: specifies a prefix for the Document Namespaces that
will be included in the generated SPDX documents. See SPDX spec 2.2
section 2.5 at
https://spdx.github.io/spdx-spec/2-document-creation-information/.
If `-n` is omitted, a default namespace will be generated according
to the default format described in section 2.5 using a random UUID.

* `-s SPDX_DIR`: specifies an alternate directory where the SPDX
documents should be written. If not specified, they will be saved
in `BUILD_DIR/spdx/`.

* `--analyze-includes`: in addition to recording the compiled
source code files (e.g. `.c`, `.S`) in the bills-of-materials, if
this flag is specified, `west spdx` will attempt to determine the
specific header files that are included for each `.c` file. This
will take longer, as it performs a dry run using the C compiler
for each `.c` file (using the same arguments that were passed to it
for the actual build).

* `--include-sdk`: if `--analyze-includes` is used, then adding
`--include-sdk` will create a fourth SPDX document, `sdk.spdx`,
which will list any header files included from the SDK.
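
The `-n PREFIX` behavior can be sketched as below. This is an approximation under assumptions: the default follows the `http://spdx.org/spdxdocs/[DocumentName]-[UUID]` pattern suggested in section 2.5 of the SPDX 2.2 spec, while the way a user-supplied prefix is joined with the document name is a guess, and `make_namespace` is a hypothetical helper.

```python
import uuid


def make_namespace(doc_name, prefix=None):
    """Return a Document Namespace for doc_name (hypothetical helper)."""
    if prefix is not None:
        # Assumption: the prefix and document name are joined with "/".
        return f"{prefix.rstrip('/')}/{doc_name}"
    # Default: spec-suggested base URL plus a random (version 4) UUID.
    return f"http://spdx.org/spdxdocs/{doc_name}-{uuid.uuid4()}"


if __name__ == "__main__":
    for name in ("app", "zephyr", "build"):
        print(make_namespace(name))
    print(make_namespace("app", prefix="https://example.com/spdx"))
```

Because the UUID is random, two runs without `-n` produce different namespaces; passing a prefix makes them reproducible.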

Signed-off-by: Steve Winslow <steve@swinslow.net>
2021-05-05 11:14:06 -04:00

# Copyright (c) 2020, 2021 The Linux Foundation
#
# SPDX-License-Identifier: Apache-2.0

import re


def getSPDXIDSafeCharacter(c):
    """
    Converts a character to an SPDX-ID-safe character.

    Arguments:
        - c: character to test
    Returns: c if it is SPDX-ID-safe (ASCII letter, digit, '-' or '.');
             '-' otherwise
    """
    # str.isalpha() and str.isdigit() also accept non-ASCII characters,
    # so match the allowed ASCII set explicitly.
    if re.fullmatch(r"[A-Za-z0-9.-]", c):
        return c
    return "-"


def convertToSPDXIDSafe(s):
    """
    Converts a filename or other string to only SPDX-ID-safe characters.
    Note that a separate check (such as in getUniqueFileID, below) will
    need to be used to confirm that this is still a unique identifier,
    after conversion.

    Arguments:
        - s: string to be converted.
    Returns: string with all non-safe characters replaced with dashes.
    """
    return "".join([getSPDXIDSafeCharacter(c) for c in s])


def getUniqueFileID(filenameOnly, timesSeen):
    """
    Find an SPDX ID that is unique among others seen so far.

    Arguments:
        - filenameOnly: filename only (directories omitted) seeking ID.
        - timesSeen: dict of all filename-only to number of times seen.
    Returns: unique SPDX ID; updates timesSeen to include it.
    """

    converted = convertToSPDXIDSafe(filenameOnly)
    spdxID = f"SPDXRef-File-{converted}"

    # determine whether spdxID is unique so far, or not
    filenameTimesSeen = timesSeen.get(converted, 0) + 1
    if filenameTimesSeen > 1:
        # we'll append the # of times seen to the end
        spdxID += f"-{filenameTimesSeen}"
    else:
        # first time seeing this filename
        # edge case: if the filename itself ends in "-{number}", then we
        # need to add a "-1" to it, so that we don't end up overlapping
        # with an appended number from a similarly-named file.
        p = re.compile(r"-\d+$")
        if p.search(converted):
            spdxID += "-1"

    timesSeen[converted] = filenameTimesSeen
    return spdxID
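
The following usage sketch shows how the ID scheme above behaves for repeated and numerically suffixed filenames. To stay self-contained it uses a condensed restatement of the functions, with snake_case names that are not part of the module and a single `re.sub` call assumed to be equivalent for ASCII input; the filenames are made up.

```python
import re


def convert_to_spdx_id_safe(s):
    """Replace every character outside [A-Za-z0-9.-] with a dash."""
    return re.sub(r"[^A-Za-z0-9.-]", "-", s)


def get_unique_file_id(filename_only, times_seen):
    """Condensed restatement of getUniqueFileID, for illustration only."""
    converted = convert_to_spdx_id_safe(filename_only)
    spdx_id = f"SPDXRef-File-{converted}"
    count = times_seen.get(converted, 0) + 1
    if count > 1:
        # Repeated name: append how many times it has been seen.
        spdx_id += f"-{count}"
    elif re.search(r"-\d+$", converted):
        # First sighting of a name that already ends in "-<number>".
        spdx_id += "-1"
    times_seen[converted] = count
    return spdx_id


if __name__ == "__main__":
    seen = {}
    print(get_unique_file_id("main.c", seen))       # SPDXRef-File-main.c
    print(get_unique_file_id("main.c", seen))       # SPDXRef-File-main.c-2
    print(get_unique_file_id("my file_2.c", seen))  # SPDXRef-File-my-file-2.c
    print(get_unique_file_id("log-2", seen))        # SPDXRef-File-log-2-1
    print(get_unique_file_id("log", seen))          # SPDXRef-File-log
    print(get_unique_file_id("log", seen))          # SPDXRef-File-log-2
```

The extra `-1` on `log-2` is what keeps it from colliding with the second occurrence of `log`, which receives `SPDXRef-File-log-2`.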