Supply chain security is at the fore of the industry’s collective consciousness. We’ve recently seen a significant rise in software supply chain attacks, a Log4j vulnerability of catastrophic severity and breadth, and even an Executive Order on Cybersecurity.
It is against this background that Google is seeking contributors to a new open source project called GUAC (pronounced like the dip). GUAC, or Graph for Understanding Artifact Composition, is in the early stages yet is poised to change how the industry understands software supply chains. GUAC addresses a need created by the burgeoning efforts across the ecosystem to generate software build, security, and dependency metadata. True to Google’s mission to organize and make the world’s information universally accessible and useful, GUAC is meant to democratize the availability of this security information by making it freely accessible and useful for every organization, not just those with enterprise-scale security and IT funding.
Thanks to community collaboration in groups such as OpenSSF, SLSA, SPDX, CycloneDX, and others, organizations increasingly have ready access to software build, security, and dependency metadata.
These data are useful on their own, but it’s difficult to combine and synthesize the information for a more comprehensive view. The documents are scattered across different databases and producers, are attached to different ecosystem entities, and cannot be easily aggregated to answer higher-level questions about an organization’s software assets.
To help address this issue we’ve teamed up with Kusari, Purdue University, and Citi to create GUAC, a free tool to bring together many different sources of software security metadata. We’re excited to share the project’s proof of concept, which lets you query a small dataset of software metadata including SLSA provenance, SBOMs, and OpenSSF Scorecards.
Graph for Understanding Artifact Composition (GUAC) aggregates software security metadata into a high-fidelity graph database, normalizing entity identities and mapping standard relationships between them. Querying this graph can drive higher-level organizational outcomes such as audit, policy, risk management, and even developer assistance.
Conceptually, GUAC occupies the “aggregation and synthesis” layer of the software supply chain transparency logical model.
GUAC has four major areas of functionality:
- Collection
  GUAC can be configured to connect to a variety of sources of software security metadata. Some sources may be open and public (e.g., OSV); some may be first-party (e.g., an organization’s internal repositories); some may be proprietary third-party (e.g., from data vendors).
- Ingestion
  From its upstream data sources GUAC imports data on artifacts, projects, resources, vulnerabilities, repositories, and even developers.
- Collation
  Having ingested raw metadata from disparate upstream sources, GUAC assembles it into a coherent graph by normalizing entity identifiers, traversing the dependency tree, and reifying implicit entity relationships, e.g., project → developer; vulnerability → software version; artifact → source repo, and so on.
- Query
  Against an assembled graph one may query for metadata attached to, or related to, entities within the graph. Querying for a given artifact may return its SBOM, provenance, build chain, project scorecard, vulnerabilities, and recent lifecycle events, along with those for its transitive dependencies.
A CISO or compliance officer in an organization wants to be able to reason about the risk of their organization. An open source organization like the Open Source Security Foundation wants to identify critical libraries to maintain and secure. Developers need richer and more trustworthy intelligence about the dependencies of their projects.
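The collation and query areas described above can be illustrated with a small Python sketch. This is not GUAC's implementation or API: the `MetadataGraph` class, its method names, the input dictionary shapes, and the purl-style key format are all assumptions made for illustration. It shows how normalizing identifiers lets documents from different sources attach to the same node, and how a query can return metadata for an artifact and its transitive dependencies.

```python
from collections import defaultdict


def normalize(ecosystem: str, name: str, version: str) -> str:
    """Build one canonical key so the same package always maps to the same node.

    The purl-style format is an illustrative assumption, not GUAC's scheme.
    """
    return f"pkg:{ecosystem.strip().lower()}/{name.strip().lower()}@{version.strip()}"


class MetadataGraph:
    """Hypothetical in-memory stand-in for an aggregated metadata graph."""

    def __init__(self):
        self.depends_on = defaultdict(set)  # node -> direct dependencies
        self.metadata = defaultdict(dict)   # node -> {document kind: value}

    def ingest_sbom(self, sbom: dict) -> None:
        # Attach the SBOM to its subject and record dependency edges.
        root = normalize(*sbom["subject"])
        self.metadata[root]["sbom"] = sbom["id"]
        for dep in sbom["dependencies"]:
            self.depends_on[root].add(normalize(*dep))

    def ingest_scorecard(self, card: dict) -> None:
        self.metadata[normalize(*card["subject"])]["scorecard"] = card["score"]

    def transitive_dependencies(self, node: str) -> set:
        # Depth-first walk over dependency edges, visiting each node once.
        seen, stack = set(), [node]
        while stack:
            for dep in self.depends_on.get(stack.pop(), ()):
                if dep not in seen:
                    seen.add(dep)
                    stack.append(dep)
        return seen

    def query(self, ecosystem: str, name: str, version: str) -> dict:
        # Return metadata for the artifact and everything it depends on.
        node = normalize(ecosystem, name, version)
        nodes = {node} | self.transitive_dependencies(node)
        return {n: dict(self.metadata.get(n, {})) for n in nodes}


graph = MetadataGraph()
# The same packages appear with inconsistent casing across documents.
graph.ingest_sbom({"id": "sbom-1", "subject": ("PyPI", "App", "1.0"),
                   "dependencies": [("pypi", "LibA", "2.1")]})
graph.ingest_sbom({"id": "sbom-2", "subject": ("pypi", "liba", "2.1"),
                   "dependencies": [("pypi", "libb", "0.3")]})
graph.ingest_scorecard({"subject": ("PyPI", "LibB", "0.3"), "score": 7.5})

result = graph.query("pypi", "app", "1.0")
for node in sorted(result):
    print(node, result[node])
```

Because every identifier passes through `normalize`, the scorecard for `LibB` lands on the same node that the second SBOM names as `libb`, so a single query for the application surfaces it.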
The good news is that, increasingly, one finds the upstream supply chain already enriched with attestations and metadata to power higher-level reasoning and insights. The bad news is that it’s difficult or impossible today for software consumers, operators, and administrators to gather this data into a unified view across their software assets.
To understand something complex like the blast radius of a vulnerability, one needs to trace the relationship between a component and everything else in the portfolio, a task that could span thousands of metadata documents across hundreds of sources. In the open source ecosystem, the number of documents could reach into the millions.
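As a rough illustration of the blast-radius question, the sketch below inverts a dependency map and walks it breadth-first to find everything that depends, directly or transitively, on a vulnerable component. The `blast_radius` function and the sample `portfolio` are hypothetical; a real deployment would run this kind of traversal against an aggregated graph rather than a hand-written dictionary.

```python
from collections import defaultdict, deque


def blast_radius(dependencies: dict, vulnerable: str) -> set:
    """Return every component that depends, directly or transitively, on `vulnerable`."""
    # Invert the edges: for each dependency, record who depends on it.
    dependents = defaultdict(set)
    for component, deps in dependencies.items():
        for dep in deps:
            dependents[dep].add(component)

    # Breadth-first walk up the reversed edges, visiting each component once.
    affected, queue = set(), deque([vulnerable])
    while queue:
        for parent in dependents.get(queue.popleft(), ()):
            if parent not in affected:
                affected.add(parent)
                queue.append(parent)
    return affected


# Hypothetical portfolio: component -> direct dependencies.
portfolio = {
    "payments-service": ["http-client", "logging-lib"],
    "http-client": ["logging-lib"],
    "batch-job": ["csv-parser"],
    "logging-lib": [],
    "csv-parser": [],
}

affected = blast_radius(portfolio, "logging-lib")
print(sorted(affected))  # ['http-client', 'payments-service']
```

The traversal itself is simple; the hard part the post describes is assembling a trustworthy dependency map from thousands of scattered documents in the first place.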
GUAC aggregates and synthesizes software security metadata at scale and makes it meaningful and actionable. With GUAC in hand, we can answer questions at three important stages of software supply chain security:
- Proactive, e.g.,
  - What are the most used critical components in my software supply chain ecosystem?
  - Where are the weak points in my overall security posture?
  - How do I prevent supply chain compromises before they happen?
  - Where am I exposed to risky dependencies?
- Operational, e.g.,
  - Is there evidence that the application I’m about to deploy meets organization policy?
  - Do all binaries in production trace back to a securely managed repository?
- Reactive, e.g.,
  - Which parts of my organization’s inventory are affected by new vulnerability X?
  - A suspicious project lifecycle event has occurred. Where is risk introduced to my organization?
  - An open source project is being deprecated. How am I affected?
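One of the operational questions above, whether all binaries in production trace back to a securely managed repository, can be sketched as a simple policy check over provenance data. Everything here is an assumption for illustration: the `policy_violations` function, the flat binary-to-repository mapping, the prefix-based notion of "trusted", and the `example.com` / `example.org` repository names.

```python
def policy_violations(production, provenance, trusted_prefixes):
    """Map each non-compliant production binary to the reason it fails the policy."""
    violations = {}
    for binary in production:
        source_repo = provenance.get(binary)  # None when no provenance was ingested
        if source_repo is None:
            violations[binary] = "no provenance found"
        elif not source_repo.startswith(tuple(trusted_prefixes)):
            violations[binary] = f"built from untrusted source: {source_repo}"
    return violations


# Hypothetical inputs: deployed binaries and the source repo each was built from.
production = ["api-server", "worker", "legacy-cron"]
provenance = {
    "api-server": "git.example.com/platform/api-server",
    "worker": "git.example.org/someone/worker-fork",
}

violations = policy_violations(production, provenance, ["git.example.com/platform/"])
for binary, reason in sorted(violations.items()):
    print(f"{binary}: {reason}")
```

Treating missing provenance as a violation, rather than silently passing it, is a deliberate choice in this sketch: a lack of evidence is itself something an operator would likely want flagged.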
GUAC is an open source project on GitHub, and we’re excited to get more folks involved and contributing (read the contributor guide to get started)! The project is still in its early stages, with a proof of concept that can ingest SLSA, SBOM, and Scorecard documents and support simple queries and exploration of software metadata. The next efforts will focus on scaling the current capabilities and adding new document types for ingestion. We welcome help and contributions of code or documentation.
Since the project will be consuming documents from many different sources and formats, we have put together a group of “Technical Advisory Members” to help advise the project. These members include representation from companies and groups such as SPDX, CycloneDX, Anchore, Aquasec, IBM, Intel, and many more. If you’re interested in participating as a contributor or advisor representing end users’ needs, or the sources of metadata GUAC consumes, you can register your interest in the relevant GitHub issue.
The GUAC team will be showcasing the project at KubeCon NA 2022 next week. Come by our session if you’ll be there and have a chat with us. We’d be happy to talk in person or virtually!