Link Search Menu Expand Document

ASWF CI Working Group

Meeting: 14 October 2020

https://zoom.us/j/757849640?pwd=QzE1K2hrL2FHSFhKK3h5Z3BWTFJsZz09

Attendees

  • Daniel Heckenberg (Animal Logic, TAC Chair)
  • Andrew Grimberg (Linux Foundation Release Engineering)
  • Patrick Hodoul (Autodesk / OCIO)
  • Jean-Francois Panisset (VES Technology Committee)
  • Marshall Elfstrand (Apple Developer Relations)
  • Jeff Bradley (DreamWorks)
  • Brian Cipriano (OpenCue)
  • Larry Gritz (OSL)
  • Aloys Baillet (Animal Logic / ASWF Docker Containers)

Agenda & Notes

  • First meeting in 2 months due to Open Source Days, skipping that meeting

ASWF CI Goals for Year 2

  • GPU Build & Test (success!)

    • How to formalize / canonize configuration setup to make it easier to use for other projects

    • JF: document in Template Project

    • Other projects: adapt what was done for OCIO

    • Make sure that AWS funding coming in for the GPU builder / CodeBuilder. Andrew: need to follow up, not sure for now.

  • Mac, Windows & Linux (New focus)

  • Packaging / Distribution

  • Testing with commercial components

Follow ups

  • GPU Build & Test

    • Documentation / template project?

    • Optix as a dependency for OSL

      • Larry: part of the driver install, missing header files

      • NVIDIA provided a direct download path for header files, so can be handed at build time, the link is versioned, can pull needed version on the file. Headers are small, rest of the install (CUDA) should already be in the containers.

      • Should be more or less ready to go, need to figure out how to build the download link, OSL will be the example to follow.

      • Action: AWSF should have membership in the NVIDIA Developer Program

    • GitHub secret sharing with PRs

      • GitHub secret sharing for PR builds: GitLab is offering that functionality, no movement yet on the GitHub side
  • VFX Reference Platform

    • ASWF Survey for window of compatibility

      • Larry: VFX Platform has kept narrow focus on purpose

      • Which years should we be targeting? How far back should we keep our packages updated? How far back should we test our current code?

      • TAC could be a good place to give guidance to projects

      • Daniel: one of the ideas floated is to try to answer some of the questions from our users via a survey.

      • What does a supported version mean? What can users expect in terms of “support”?

      • Daniel: will bring together a group to put together the survey, especially for the CI / dependency relevant part of the survey.

      • Andrew: would it be beneficial for the TAC to come up with an Integration workgroup that comes up with validation tests that project integrate properly. Is there currently a test suite for the VFX Platform? Daniel: not that’s defined for this purpose. There are high level applications that pull in various ASWF / Reference Platform components. Andrew: these issues come up in other LF projects, which are composed of a number of smaller subprojects. ASWF is somewhat unique in that it is composed of single, larger, stand-alone projects. Other LF projects typically have an integration WG that works on integrating the sub projects.

      • LFN (Linux Foundation Networking) project has an integration project. Could have something similar for the VFX Reference Platform.

      • Integration projects typically only test on release artifacts, or development artifacts at specific stages. There is a team that has the specific task of this development: the current ASWF Docker Containers already are mostly there.

      • Daniel: running the individual unit tests of every project in a single environment could go a long way.

    • Position on versions of other common dependencies, including CMake

      • Larry: build side components, dependencies not on the platform list. Are there versions that we want to link to.
    • Availability of clang-tidy etc

    • Mac, Windows

    • Larry: OpenEXR dependencies (for instance) become de-facto standard

    • Daniel: are we trying to keep the Reference Platform as small as possible, or are we trying to define the platform in a maximum way so that guarantees compatibility across years. For compilers we want “as old as possible” compatibility, but still want to use more modern components. Artifacts such as Docker Containers need to have a clearly defined purpose. We haven’t been very clear at identifying use cases.

    • Larry: platform is very specific as “there shall be this version in this year”: not looking for that necessarily, but min/max versions would help. Another edge case: sometimes packages have major upgrades within a year. For instance OpenEXR, 2020 spec says 2.4, 2021 draft says 3.0, meanwhile this year’s major release of OpenEXR is 2.5, should we test against that version since it’s not part of any specific year.

    • Should all ASWF projects release in lockstep, following the Reference Platform? Or at least make it clear that if you want a new major release included in the platform, you have to hit a specific deadline related to new platform year versions.

    • Daniel: important that vendors have clarity about the dependencies they build into their platform, we can’t lose sight of the importance of defining that narrowly.

    • Daniel: are we running any of the component tests in the context of the Docker containers as we build them, or only running the builds? Aloys: some projects are running their test suites within our containers, and some external projects such as Walt Disney Animation partio have asked for support. Also GPU tests are running within that Docker image.

    • Patrick: about dependencies between Open Source Projects, there are 2 levels. One is at the VFX Reference Platform to make sure that all the projects work together, but at the project level, there is more control. OCIO for instance has more control since it can specify exactly what it is building against. Each project should have a “build against latest” to make sure that when it comes time to integrate everything into a VFX Reference Platform year, we are already mostly there. Daniel: “top of tree” builds. Patrick: doesn’t need to be a rule, but could be a strong suggestion that every build has such as “top of tree” build. Larry: I do this for my projects, not for every project dependency, but especially against top of tree of projects that are in flux. Don’t want to be caught off guard when a project does a release and haven’t been testing builds against “latest”. Patrick: OCIO does have a top of tree build. Larry: don’t do it against OpenVDB due to build costs, same for LLVM.

  • ASWF-docker updates

    • CI vs runtime environment. Distribution?

      • Discussion about distribution complexities at last TAC meeting

      • How does redistribution of binaries trigger IP / export control issues? Does that include Docker Containers?

      • Daniel: as we step into package redistribution, it makes the issue more visible. John provided some examples / links about how other LF-aligned foundations deal with it. We should try to figure out how this applies to us.

      • Andrew: different experiences about distribution. Most of the projects he’s worked on only distribute their own build artifacts. Containers are complex since you are building a bunch of components you didn’t build. Some foundations set up a completely different LLC organization for the purposes of artifact distribution for legal reasons. John would have more visibility into that. Distributing someone else’s artifacts always opens door to external challenges.

      • Aloys: we’ve definitely avoided doing some things, all our Docker images should only include open source components. Should we provide license files for everything we combine? But what about the OS dependencies? Andrew: don’t need to provide all the licenses included in the base container image. Aloys: we do add a number of packages to the base distribution package.

      • Andrew: should be fine release as is as long as you are just pulling additional packages from upstream source. The distribution is already doing what they need to do to protect IP, and make sure that packages included in the distribution can be freely redistributed and don’t include those in their repos (hence additional repositories such as EPEL).

      • Andrew: if we were to generate a container with “difficult” dependencies such as ffmpeg, we would be stepping into murky waters. A flag to automatically instead that dependency rather than include it would be better.

      • Aloys: the list of packages is in a single place, so easy to review.

    • VFX2021 / CUDA11

      • Aloys: still an issue, unfixed problems. NVIDIA has released CentOS 7 images but not every one, missing the nvidia/cuda-opengl container. Have an open ticket with NVIDIA, we don’t necessarily care so much about CVEs since there’s no sensitive code inside those Docker images.

      • Focussing on more clang / LLVM variants in the Docker images, working with Larry and Nick from OpenVDB. Working on clang 6-7-8-9-10 variants on selected VFX platform years. Larry likes to be able to build the code with various compilers, currently has to do under Ubuntu.

      • Aloys: was about to release all the compilers, but clang 7.1.0 just released, and fixes a bug affecting OpenVDB, so will re-release the containers with that clang version.

      • Hoping this will fix OpenVDB clang 7 test suite crash.

    • Docker Hub repository changes.

      • Hoping to have those containers this week, but having weird timeout errors on GitHub Actions pulling packages. Could be related to new limits imposed by GitHub Actions? It looks like everything is now slower to pull large Docker images.

      • Pulling vfx-all taking 3.5 min

      • Larry: are they throttling our bandwidth? Aloys: Docker Hub is likely trying to throttle everyone’s bandwidth / storage.

      • Daniel: any clarity on how policy changes at Docker Hub are affecting us?

      • Andrew: haven’t heard back about application for open source, we may want to go ahead and purchase an account. We will need to make sure that all the ASWF repos are using account credentials to avoid impact. But projects outside the ASWF github repo would still be hit by throttling limits.

      • Aloys: we already have credentials in ASWF-docker project. Andrew: would need to use credentials for pulls instead of doing anonymous pulls.

      • Andrew: will start with the “smallest team”, should be around $300/year.

      • Aloys: aswf account is used on Docker Hub

      • Looks like latest builds just went through

  • Mac CI

    • Is it possible to have more control over the Mac Agent build environment on GitHub Actions (since we don’t have Docker containers available). Building in a “historical” environment vs in a “latest environment” with backward compatibility.

    • Larry: use Homebrew install on Mac, but that’s always “latest” and has pros and cons

    • MacStadium / Developer Transition Kits

      • Various early access programs, targeted to individual developers / companies, anything targeted to organizations like ASWF?

      • Marshall: the MacStadium Open Source program (opensource@macstadium.com) and for qualified OSS projects, you can contact them and they have DTKs available they can add to your account at no cost for your project. That’s the only DTK program available for open source projects and which is freely available.

      • Larry: had a call with Apple, Apple had already build OIIO and in process of building OSL, sent PR for OIIO, Larry hadn’t heard back from MacStadium since initial contract, but got prompted and should be in the process of being resolved.

      • Daniel: any examples of GitHub Actions workflows with DTK? Andrew: no other projects at LF currently using Apple DTK.

      • Marshall: the MacStadium DTK access may actually allocate dedicated hardware (access via Orka).

      • Larry: not clear how crucial this setup is, by the time the hardware is widely available, GitHub should have added it as an option for public runners. Main point of using MacStadium so that our projects are “ready to go” when Apple Silicon runners become available via GitHub Actions.

      • Daniel: so doing ad-hoc builds and fixing specific issues should be sufficient at this stage, as opposed to builds on every updates?

      • Larry: can depend on the project, and how much trouble it would be to set up. It’s not that radically different an environment. PR submitted by Apple for OIIO was pretty minor, package has already been ported to ARM on Linux. On OSL it will be more extensive because of the use of LLVM targeting x86, never exercised targeting ARM as the JIT target, so could have more potential for breakage without continuous testing.

      • Daniel: would it be enough of an incremental value add to justify setting up Orka. Larry: we would like to be able to move there, if anything to solve the versioning issue on Mac. Daniel: there seems to be enough value to go there. We need someone who has time to do this.

      • Larry: trying to get access to a DTK project on behalf of OIIO / OSL, but it is speculative.

      • JF to prototype integration between an ASWF project and Orka build (there is a GitHub repo demonstrating this integration)

      • Marshall: check out the MacStadium open source project page

  • Package management

    • Dependencies

    • Build deployment / runtime environment

Action Items

  • Larry/Daniel: NVidia Developer Program membership for ASWF to officially access Optix headers

  • JF/Andy: MacStadium account setup (ASWF and Open Source program) and GitHub Actions integration for Orca and DTK testing

  • Andy/Aloys: Docker Hub ASWF account (paid or via opensource application) to avoid restrictions

Next Steps