Skip to content
This repository has been archived by the owner on Feb 24, 2020. It is now read-only.

rkt: implement generic no volume policy #3102

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

sgotti
Copy link
Contributor

@sgotti sgotti commented Aug 22, 2016

This patch implements a generic no volume policy instead of a custom, per
stage1, behavior based on a docker2aci specific app annotation.

This also avoids stage1 having to read the app image manifest but just uses the
pod manifest as its unique source for app setup.

Since the behavior on unspecified volumes is not defined in the appc/spec but
left to the ACE, the idea is to use a pod.RuntimeApp annotation to define the
no volume policy per runtime app (instead of extending the appc PodManifest
with rkt specific fields).

The annotation is called coreos.com/rkt/annotations/novolumepolicy and
can have two possible values:

  • empty: create an empty, per app, volume.
  • empty-copy: create an empty, per app, volume and copy image contents.

empty is the default if the annotation doesn't exists.

rkt, in the stage0, will check if the annotation is already provided by the pod
manifest and, in such case, only validate it.
If not provided and the image has been converted using docker2aci then an
annotation with value empty-copy will be defined.

This has the additional effect to let the user define (and override for
docker2aci converted images) the no volume policy for every app (with this
patch only by providing a --pod-manifest, in future, if needed, a per app
command line flag could be added)

Note that only pod.RuntimeApp annotations are checked for
coreos.com/rkt/annotations/novolumepolicy and not an app manifest
annotation (this, to correctly work, requires #3100).

Notes
Before this patch an user provided pod manifest was only validated without
changing it, now, instead, it's also enhanced.

Questions

  • Currently the behavior of a missing volume for a mount point isn't defined in the spec and left to the ACE. Should the no volume policy behavior be explicitly defined in the appc/spec? I'm not sure about this and where to define it (per app in the pod runtimeApp? per mountpoint in the image manifest?)
  • Currently rkt fly stage1 is different from other flavors as if a volume isn't specified it just doesn't mount nothing:
    • Should a new policy none be implemented to define the fly behavior?
    • Should rkt fly be adapted to also implement the empty and empty-copy behaviors (if it this makes sense).
  • Should this no volume policy be explicitly defined in the stage1 implementors guide?

This fixes one part of #3098.

This patch implements a generic no volume policy instead of a custom, per
stage1, behavior based on a docker2aci specific app annotation.

This also avoids stage1 having to read the app image manifest but just uses the
pod manifest as its unique source for app setup.

Since the behavior on unspecified volumes is not defined in the appc/spec but
left to the ACE, the idea is to use a pod.RuntimeApp annotation to define the
no volume policy per runtime app (instead of extending the appc PodManifest
with rkt specific fields).

The annotation is called `coreos.com/rkt/annotations/novolumepolicy` and
can have two possible values:

* `empty`: create an empty, per app, volume.
* `empty-copy`: create an empty, per app, volume and copy image contents.

`empty` is the default if the annotation doesn't exists.

rkt, in the stage0, will check if the annotation is already provided by the pod
manifest and, in such case, only validate it.
If not provided and the image has been converted using docker2aci then an
annotation with value `empty-copy` will be defined.

This has the additional effect to letting the user define (and override for
docker2aci converted images) the no volume policy for every app (with this
patch only by providing a --pod-manifest, in future, if needed, a per app
command line flag could be added)

Note that only pod.RuntimeApp annotations are checked for
`coreos.com/rkt/annotations/novolumepolicy` and not a app manifest
annotation (this, to correctly work, requires rkt#3100).

**Notes**
Before this patch an user provided pod manifest was only validated without
changing it, now, instead, it's also enhanced.
@lucab
Copy link
Member

lucab commented Aug 22, 2016

Do we actually have usecases for all those cases? If not, I think this could be a good moment for some cleaning and uniforming the behaviors.

The safest option would be to validate the pod manifest and just abort if volumes are not matching. However it seems that Docker is behaving in a different way (described here), which we are trying to mimic. As such, I think it makes sense to clarify in appc that mountpoints not covered by any volumes have the empty-copy behavior.

@lucab
Copy link
Member

lucab commented Aug 22, 2016

Also, semaphore is failing because KVM doesn't have the concept of stage1 return value: #2777

@lucab lucab added this to the v1.14.0 milestone Aug 22, 2016
@lucab
Copy link
Member

lucab commented Aug 23, 2016

There may be actually more to it. According to appc pod spec:

A Pod Manifest must be fully resolved (reified) before execution.
Specifically, a Pod Manifest must have all mountPoints satisfied by volumes,
and must reference all applications deterministically (by image ID).

@sgotti
Copy link
Contributor Author

sgotti commented Aug 23, 2016

@lucab That's an interesting point.

My view, given also the next sentence At runtime, the reified Pod Manifest is exposed to applications through the Metadata Service. makes me think that this reifying job (defining empty kind volumes for unspecified ones and the related runtimeApp.Mounts) should be done in the stage0 instead of the stage1s like currently done.

This change raises a question:

  • How to handle --pod-manifest: To keep backward compatibility they should also be reified or acurrently working pod manifest will expose a changed behavior (probably no empty volume mounted over mountPoint with no assigned volume). So this will change the "already reified" expectation on a provided podmanifest making it instead a sort of pod manifest template.

Then we should define, as you said above, if just one no volume policy is enough or different policies should be available (error, none, empty, empty-copy).

Related to this point the question is: how to, if needed, make the user choose the no volume policy (per app). Do like in this patch providing a runtimeApp annotation (for user specified podManifests) and a cli option or other solutions?

@lucab
Copy link
Member

lucab commented Aug 23, 2016

This reifying job (defining kind=empty volumes for unspecified ones and the related runtimeApp.Mounts) should be done in the stage0 instead of the stage1

I think this should be the way to go to reify incomplete/template manifests.

How to handle --pod-manifest

If k8s is not relying on this implicit behavior, we should stick to the spec and properly validate that all mountpoints are covered by a volume. Otherwise, this should be made explicit somewhere and whatever is producing the runtime manifest for us should be adapted first. /cc @euank for insights

how to, if needed, make the user choose the no volume policy

This looks like a corner case enough, that I would avoid exposing another knob just for it. Just to have the possibility of getting a docker-like behavior we can discuss about a kind=guest volume, which works like an empty volume but is pre-populated with all data from guest target path. /cc @jonboulle to know if this makes sense

@lucab lucab modified the milestones: v1.15.0, v1.14.0 Aug 30, 2016
@s-urbaniak
Copy link
Contributor

@sgotti bumping this to the next release.

@s-urbaniak s-urbaniak modified the milestones: v1.16.0, v1.15.0 Sep 15, 2016
@lucab
Copy link
Member

lucab commented Sep 29, 2016

ping @jonboulle

It looks like last time we encountered it (#2315 (comment)) we special-cased it, and it's now coming back. Proposal here is to have another special casing, and I feel this will come back at a later point. Given that this kind of guest-filled volumes are there in the wild and we have to provide compatibility for it, can we capture this semantics in appc?

@lucab lucab modified the milestones: v1.17.0, v1.16.0 Sep 29, 2016
@s-urbaniak s-urbaniak modified the milestones: v1.18.0, v1.17.0 Oct 12, 2016
@lucab lucab removed this from the v1.18.0 milestone Oct 25, 2016
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants