
ASWF CI Working Group

Meeting: 29 May 2019

Attendees

  • Daniel Heckenberg (Animal Logic, TAC Chair)
  • Jean-Francois Panisset (VES Technology Committee)
  • Michael Dolan (SPI - OpenColorIO)
  • Larry Gritz (SPI)
  • Trevor Thomson (Blue Sky Studios)
  • Andrew Grimberg (Linux Foundation Release Engineering)
  • Aloys Baillet (Animal Logic)
  • Jeff Bradley (Dreamworks)
  • Dan Bailey (ILM)

Agenda & Notes

ASWF CI Goals for Year 1 [0:00-0:05]

Timeframes

  • CI Platform decision: May
    • Update provided by Daniel to Governing Board, heads up about potential budget impact
    • We employed a CI consultant in the pre-launch phase; it would be possible to do this again (Windows / Mac builds) -> Andrew: this work was done by LF Release Engineering. We could engage LF RelEng for more intensive activities.
  • Project CII badges: (Security and static analysis) June
  • Dependency management: July

CI Platform: Confirm Azure? May [0:05-0:25]

  • LF Rel Eng Update (Andrew)
    • Responded to a couple of email threads
      • Azure / Docker: acquired academysoftwarefoundation registry on Docker Hub, tried to acquire aswf-staging but already used? Aloys got it, but can cede it to LF releng (not typo squatting!)
      • Should we create an ASWF Docker repo in GitHub? Andrew suggests putting templates in the ci-management repository, which keeps them in a central location, side by side with the Jenkins / Packer work.
      • Waiting for the azure-pipelines.yaml merge in the OpenVDB repo before the repo can be linked to Azure Pipelines. Dan: ran into similar issues with CircleCI. Will do work on his side first.
      • Higher end builders: we need to provide our own builders, which incurs costs, we would have to manage instances. Dan: can we just increase the existing resources? Andrew: hasn’t found a way to do this yet.
      • Daniel: should we try to approach Azure Pipelines tech support? Andrew: trying to figure out who to talk to at Microsoft. Someone else at LF has been in contact, waiting for introduction. Daniel: tried to go through Azure support, but that didn’t help.
      • Conan / Conda plugins for Nexus3: no binary distribution available, you have to compile / install yourself, makes it more difficult to work with configuration management. Can you present the same files via multiple plugins? Not in Nexus 3. Also disk space on Nexus 3 system can be grown dynamically. Daniel: what would be the process for managing disk space / costs? Nexus 3 offers proxy, host and meta repositories. Normally LF sets up a cleanup on proxy repository, no cleanup on release hosted repos, but cleanup on dev hosted repo (maintain snapshot releases up to 30 days by default). Not immediately clear how that would work with Conan / Conda.
      • Nexus 3 uses a blob store on the backend, so can be challenging to figure out where disk space is being used. Can do compacting to reclaim disk space.
      • All LF systems managed as “pets” have alerting configured, they know when a system is running low on space.
      • Daniel: what are the advantages of using our own Nexus3 repo as opposed to other systems? Andrew: the main advantage is the policies we set; a third-party solution (e.g. Artifactory Cloud) is more at the mercy of the policies set by the provider. On LF Nexus3 systems a dedicated user account is created per project, which prevents artifacts from one project overwriting those from another project.
      • JFrog Artifactory backed by Google Cloud (free for OSS)
      • LF has no projects currently using Artifactory, but they have a project coming online which has that as a requirement, so Artifactory will be supported.
      • GitHub has beta repository support, Azure Pipelines as well, supports a certain set of repo formats.
      • LF shifting to only using Docker Hub as container registry.
      • Daniel: looks like we’re still in exploratory stage with respect to artifact repositories. Especially once we look at workflows that involve branches and PRs.
      • Aloys: is the ci-management repo linked to Azure Pipelines? Andrew: not yet, for now just linked to Jenkins, but could be.
      • Dan: in setting Azure for OpenVDB they used native Docker Hub build system, seemed quite straightforward. When you push a commit it will build images automatically. Andrew: not a problem, Docker Hub will only re-trigger the required builds.
  • OpenColorIO on Azure Pipelines (Michael)
    • Had some cycles to spend on Azure setup for OpenColorIO, not in main branch yet (in ci_test branch)
    • Builds functional on all 3 platforms, SonarCloud working as well
    • Seems to be working well, all builds in a single platform
    • Linux uses Docker images (a single line of configuration), Mac uses Brew, Windows will use Docker, but is native for now
    • Took advantage of Azure template functionality to set up YAML templates you can call from multiple jobs with parameters, so primary Azure YAML file is quite simple
    • Andrew: seems similar to global-jjb for Jenkins, can you reference these templates from other repositories? Michael: seems to be, for now they live in another repository. Did encounter some weirdness with variable expansion when trying to do conditional routing of builds: sometimes variables don’t seem to expand when expected. Can also build expressions into steps to do conditional execution.
    • Static analysis is working and providing lots of information, running from the test branch but analysing the master branch.
    • Daniel: should we try to converge on a central location for Docker configurations? Michael: yes, would be useful. For instance, built a Docker image for Linux builds, trying to follow the VFX Reference Platform as much as possible, but it would be nice to have standard VFX Reference Platform images per year and layer on top of those. Within a centralized Docker Hub, could have locations for project-specific configurations. Currently just reading images, no requirement to push Docker images back.
    • Daniel: seems good alignment in work between Dan, Aloys and Michael. We seem to be in a middle ground to capture external dependencies in an external system (Conda, Conan) vs Docker layers. Where we store Docker images / configurations is not a gating factor.
    • Need to be able to override separation to avoid dependencies between project repos and the ci-management repo with different permissions. Dan: should we have a base master repo that can be forked by individual projects? It could get complicated to change master configs once we have multiple projects. A base image that has everything in the VFX Reference Platform would be nice, but in reality there may need to be adaptations for different projects.
    • Aloys: GPU-enabled docker images also require specific configuration, so no longer possible to inherit from a single base image. Can use Ansible recipes to manage Docker base images.
    • Jeff: a single source without too much variation will help us to get all our components working together, since that’s where they will eventually run in a facility. May be a good idea to limit flexibility in this case.
    • Daniel: simplicity is a goal, we already have the opportunity to explore things independently using the current tools / methods. Need to define a nice way to have a shared resource for Dockerfiles and Docker images while still being able to experiment on a per-project basis. We should set up a test repo to try the shared approach so we can experiment.
    • Aloys: Michael mentioned wanting to use Docker on Windows. Why not use the base image from Azure? Michael: OpenColorIO was OK with the base image, but hosted Azure agents have multiple Python versions, so it can be tricky to grab the correct version. Docker offers a more streamlined way to set up the environment, is more flexible for trying new things, and can contain special-case installs, which are not as simple to set up using tasks. Autodesk developers work on Windows natively, could use their environment tools. Aloys: Microsoft promised they wouldn’t release Docker images with Visual Studio pre-installed (would be too large). So a Docker image with the compiler might be tricky. Was able to build NumPy without Docker. Where could you host a Windows Docker image with Visual Studio pre-installed? Andrew: could be legal issues with that. Michael: used the Use Python Azure task (no access to Python dynamic libraries). But Docker can isolate from errant environment variables.
    • Aloys: Could we use the LF Foundation images from Azure? Andrew: we could have some static resources in existing cloud, but that would incur cost, and hasn’t been done before. There was no formalized budget on the CI resource use on VEXXhost, we don’t have any official budget for that. VEXXhost has just dropped prices, also changed hourly to per-minute billing. No easy way in Azure Pipelines to spin up builders dynamically. Daniel: also a blocker for GPU builds / testing.
    • Andrew: what’s the default licensing? Apache V2, we now have an aswf-docker repo. Aloys: we need to revisit the issue of Houdini / Autodesk SDK installations in Docker images, probably should do it on the fly during the build. Dan: we can also make the repositories private as an option. Azure Pipelines has a 60 min limit if all is public, so would it be an issue if a Docker image was private?
  • Artifact, Package, Container storage and process
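
The Nexus 3 cleanup convention Andrew described (cleanup on proxy repositories, no cleanup on release hosted repositories, snapshots on dev hosted repositories kept up to 30 days by default) can be illustrated with a minimal sketch. This is not LF tooling or a Nexus API: the repository labels, the `Artifact` fields and the function name are hypothetical, and applying the same 30-day window to proxy repositories is an assumption, since the notes do not state the proxy criterion.

```python
from dataclasses import dataclass
from datetime import date, timedelta

# From the notes: snapshot releases are kept up to 30 days by default.
SNAPSHOT_RETENTION_DAYS = 30


@dataclass(frozen=True)
class Artifact:
    """Hypothetical record for one stored artifact (not a Nexus type)."""
    name: str
    repo_kind: str   # hypothetical labels: "proxy", "hosted-release", "hosted-dev"
    last_used: date  # hypothetical field: last upload or download date


def is_cleanup_candidate(artifact: Artifact, today: date,
                         retention_days: int = SNAPSHOT_RETENTION_DAYS) -> bool:
    """Return True if the artifact would be eligible for cleanup."""
    if artifact.repo_kind == "hosted-release":
        # Release hosted repositories are never cleaned up.
        return False
    if artifact.repo_kind in ("proxy", "hosted-dev"):
        # Assumption: proxy repositories use the same window as dev snapshots.
        return today - artifact.last_used > timedelta(days=retention_days)
    raise ValueError(f"unknown repository kind: {artifact.repo_kind}")


if __name__ == "__main__":
    today = date(2019, 5, 29)
    samples = [
        Artifact("example-1.0.0", "hosted-release", date(2019, 1, 1)),
        Artifact("example-1.1.0-SNAPSHOT", "hosted-dev", date(2019, 4, 1)),
        Artifact("example-1.1.0-SNAPSHOT", "hosted-dev", date(2019, 5, 20)),
    ]
    for artifact in samples:
        print(artifact.name, artifact.repo_kind,
              is_cleanup_candidate(artifact, today))
```

How such a rule would map onto Conan / Conda packages remains the open question raised in the discussion.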

Project CII badges: June [0:25-0:30]

  • Someone to lead this?
    • Is anyone willing to step up to drive the CII work while Daniel is away? Andrew can drive this.
    • Static / Dynamic analysis requirements
    • For current needs we need SonarCloud, working against master branch in OpenColorIO
    • Need dynamic / fuzzing for higher level badges
    • GitHub just added a feature for private reports for security issues. Could be a good way to set policy for security reports.
    • OpenColorIO added a paragraph in their CONTRIBUTING file on how to report security issues.

Project CI requirements [0:30-0:50]

  • OpenVDB
  • OpenColorIO
  • OpenEXR
  • OpenCue
    • Brian: No progress yet on our Jenkins migration as we’ve been waiting on our repos to get transferred first. At this point and after reviewing OpenVDB’s Azure PR I’m considering whether we should just skip the Jenkins migration and go straight to Azure. It looks pretty straightforward and all of the concepts we’re using in our Jenkinsfile map very well into the Azure yaml format, from what I can tell. To be discussed at our next TSC meeting.

Action Items

Next Steps

  • Follow up meeting: 29 May 2019