
MPIC: Create experiment form as derivative of instrument form
Closed, Resolved | Public | 3 Estimated Story Points

T370880

Description

Re-skin the instrument form to create a new experiment form that includes a new variants fieldset and excludes unneeded fields. The experiment form should be clearly branded as an experiment (i.e. remove references to "instrument").

Fields to include:

  • A multi-value single fieldset called Variants that includes the following fields:
    • Name - text input
    • Type - select dropdown (string, number, boolean). This can be removed from the frontend form and hardcoded to boolean if that saves time, but it must still be sent in the API response.
    • Values - multi-value chip input

The variants fieldset should be saved to the database as a json value with the following structure:

{
  "name": "color",
  "type": "string",
  "values": [
    "red",
    "blue",
    "green"
  ]
}
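
The structure above could be typed and checked before saving along these lines. This is only a sketch: the names Variants, VariantType and validateVariants are hypothetical and not taken from the MPIC codebase, and the rule that values must be non-empty is an assumption.

```typescript
// Hypothetical names: none of these identifiers come from the MPIC codebase.
type VariantType = 'string' | 'number' | 'boolean';

interface Variants {
  name: string;
  type: VariantType;
  values: Array<string | number | boolean>;
}

const VARIANT_TYPES: VariantType[] = ['string', 'number', 'boolean'];

// Returns a list of problems; an empty list means the input has the
// name/type/values shape described in the task.
function validateVariants(input: unknown): string[] {
  if (typeof input !== 'object' || input === null) {
    return ['variants must be an object'];
  }
  const errors: string[] = [];
  const candidate = input as Record<string, unknown>;
  if (typeof candidate.name !== 'string' || candidate.name.trim() === '') {
    errors.push('name must be a non-empty string');
  }
  if (VARIANT_TYPES.indexOf(candidate.type as VariantType) === -1) {
    errors.push('type must be one of string, number, boolean');
  }
  // Assumption: an experiment needs at least one value to be meaningful.
  if (!Array.isArray(candidate.values) || candidate.values.length === 0) {
    errors.push('values must be a non-empty array');
  }
  return errors;
}

const example: Variants = { name: 'color', type: 'string', values: ['red', 'blue', 'green'] };
console.log(validateVariants(example));
```

Returning a list of problems rather than throwing makes it easier to show all form errors at once.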

For an example, see how this is done for sample rates in the instrument store.

Fields to exclude from the frontend form:

  • Type
  • Schema type/stream name
  • Contextual Attributes

As noted in T373467#10125438, we should make these frontend excluded fields optional in the database since for now instruments and experiments live in the same table.

Acceptance Criteria

Required

  • Unit/Integration tests?
  • Documentation?
  • Passed QA?

Event Timeline

Hey, @cjming and @Sfaci 👋🏻 Apologies in advance if these are silly questions, but I need to validate the following:

  1. Shouldn't each variant have a single value assigned to it, instead of an array? Following the color example, I might have a "Submit button color" A/B test with a “blue button” variant whose value corresponds to the color “blue”, a variant with the name “green button” and the value “green” assigned to it, and so on.
  2. Is it correct to assume that variants in the same experiment can only use values of the same type? Following the example above again, the values of all the variants in the "Submit button" A/B test need to share the type "string", right?
  3. Relatedly, this sounds too sophisticated, but: would there be a way to automatically detect the type of the values entered by users instead of asking them to provide it via the UI? This sounds like a highly technical field, and end users might struggle to understand its purpose, even if we provide supporting text 🤔

Thanks for your help!

Hi, Here are my assumptions regarding your questions:

Shouldn't each variant have a single value assigned to it, instead of an array? Following the color example, I might have a "Submit button color" A/B test with a “blue button” variant whose value corresponds to the color “blue”, a variant with the name “green button” and the value “green” assigned to it, and so on.

My understanding is that each variant has a name and several values (that's why we use an array). In the previous example, the values express the different colors in which we can show the button to different user buckets. I think the button and all its possible colors would be a single variant.

Is it correct to assume that variants in the same experiment can only use values of the same type? Following the example above again, the values of all the variants in the "Submit button" A/B test need to share the type "string", right?

I think this is right. It would be odd to allow different data types for something that represents a value for the same concept. If we ever need something really heterogeneous, we always have the string type, which allows us to represent anything.

Relatedly, this sounds too sophisticated, but: would there be a way to automatically detect the type of the values entered by users instead of asking them to provide it via the UI? This sounds like a highly technical field, and end users might struggle to understand its purpose, even if we provide supporting text

I agree with you. It sounds a bit weird to ask for the data type. I guess we could infer the data type from the actual value, but we wouldn't be able to distinguish a number that needs to be treated as a string (not sure whether that makes sense). And how would we deal with the boolean data type? Just typing "true" and "false"? Anyway, I'm not sure about this, but I think MPIC is a pretty technical web application and I guess we can consider its end users to be technical users. After all, they are folks who know how to deal with data, right? In my short experience working with folks who create and work with instruments and Metrics Platform, I would say they could deal with these technical details.
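
To make the trade-off concrete, here is a rough sketch of what guessing the type from the raw chip values could look like. The helper name inferVariantType is hypothetical, and the rules (all "true"/"false" means boolean, all numeric means number, otherwise string) are assumptions rather than anything agreed in this task.

```typescript
type VariantType = 'string' | 'number' | 'boolean';

// Hypothetical helper: guesses one type for the whole list of raw chip values.
function inferVariantType(rawValues: string[]): VariantType {
  if (rawValues.length === 0) {
    return 'string'; // nothing to inspect, fall back to the most permissive type
  }
  const allBooleans = rawValues.every((v) => v === 'true' || v === 'false');
  if (allBooleans) {
    return 'boolean';
  }
  // Number('') is 0, so empty chips are excluded explicitly.
  const allNumbers = rawValues.every((v) => v.trim() !== '' && isFinite(Number(v)));
  if (allNumbers) {
    return 'number';
  }
  return 'string';
}

console.log(inferVariantType(['true', 'false']));
console.log(inferVariantType(['8', '16']));
console.log(inferVariantType(['red', '8']));
```

The ambiguity mentioned above shows up directly: values such as "10001" and "20002" would be inferred as numbers even if the experiment owner meant them as string labels, which is one argument for keeping an explicit type.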

I hope @cjming can give more clarity than me about all this

@cjming @Sarai-WMF According to the ticket description, some fields are going to be excluded here. Some of those fields are schema_type and stream_name. Shouldn't we exclude schema_title as well?
And I guess that all the fields we consider excluded for experiments are the ones we should consider optional in the database (so far these fields were defined as mandatory).

VirginiaPoundstone lowered the priority of this task from High to Medium. Sep 9 2024, 7:11 PM

I'm sorry! I have just realized that I was wrong when I said that we should remove schema_title field. There is no such field. We take schema_title from the autocompletion field we use for stream_name so, it's enough if we remove that field.
Regarding considering schema_title as optional in the database, it's still valid because we won't have to save that value when registering an experiment, but that is already covered by the specific ticket.

hi @Sarai-WMF hi @Sfaci -- I agree with Santi's inline responses to Sarai's questions.

As we were reminded recently, and given the tightening timeframe, we want to make sure to capture at a bare minimum what the Growth Team needs for their upcoming experiment, which is a simple on/off indicator, i.e. a boolean type field that toggles the feature's visibility.

In that spirit, we can cut the scope of the fieldset to be singular (not multi-value yet) which will simplify the component for the alpha release.

As for the name, type, and values of the new variants field, it's helpful to think of the feature variant itself as the element that is being targeted and its values as the different options being tested. While it's still in active development, the current incarnation of how this information will be saved as a configuration variable might also help visualize how the variant's properties map to the names of user buckets in an experiment:

[
  'experiment-1' => 'feature_a:red',
  'experiment-n' => 'feature_b:8',
  'experiment-x' => 'feature_n:true'
]

In the above example, which is the experiment enrollment data for a given user, the key is the experiment's machine-readable name and the value is the name of the user's assigned bucket: a concatenated string composed of the feature name, a colon separator, and the feature value, which is the variant that this user will see.
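
The bucket naming described here can be sketched as a pair of helpers. The function names are hypothetical, and splitting on the first colon assumes feature names never contain a colon themselves; the format is described above as still in active development.

```typescript
// Hypothetical helpers, not part of MPIC or Metrics Platform.
const SEPARATOR = ':';

// Builds a bucket name such as "feature_a:red" from a feature name and a value.
function toBucketName(featureName: string, value: string | number | boolean): string {
  return featureName + SEPARATOR + String(value);
}

// Splits a bucket name back into its parts. The value comes back as a string;
// the variants "type" would be needed to convert it to a number or boolean.
function parseBucketName(bucket: string): { feature: string; value: string } {
  const index = bucket.indexOf(SEPARATOR);
  if (index === -1) {
    throw new Error('bucket name has no separator: ' + bucket);
  }
  return { feature: bucket.slice(0, index), value: bucket.slice(index + 1) };
}

console.log(toBucketName('feature_a', 'red'));
console.log(parseBucketName('feature_b:8'));
```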

From the MPIC experiments API, the data for an experiment will output all the possible options that a feature variant can appear as to a user:

{
  "id": 2,
  "name": "experiment 1",
  "slug": "experiment-1",
  "description": "experiment description",
  "creator": "Jill Hill",
  "owner": "Growth Team",
  "purpose": "WE.1.2",
  "created_at": "2024-07-10T18:57:59.000Z",
  "updated_at": "2024-07-12T19:46:28.000Z",
  "start_date": "2024-04-30T00:00:00.000Z",
  "end_date": "2024-08-13T19:46:28.000Z",
  "task": "https://phabricator.wikimedia.org/T369544",
  "compliance_requirements": "gdpr",
  "sample_unit": "session",
  "sample_rate": {
    "default": 0.5,
    "0.1": [
      "frwiki"
    ],
    "0.01": [
      "enwiki"
    ]
  },
  "environments": "staging",
  "security_legal_review": "pending",
  "status": 1,
  "stream_name": "product_metrics.web_base",
  "schema_title": "analytics/product_metrics/web/base",
  "schema_type": "web",
  "email_address": "[email protected]",
  "type": "a/b test",
  "variants": {
    "name": "feature_a",
    "type": "string",
    "values": [
      "red",
      "blue",
      "green"
    ]
  }
}
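
As a rough illustration of how a consumer might use this response, the sketch below lists every bucket a user could be assigned to for one experiment. The interface covers only the fields it needs, and listBucketNames is a hypothetical helper, not an MPIC or Metrics Platform function.

```typescript
// Only the parts of the API response this sketch relies on.
interface ExperimentSummary {
  slug: string;
  variants: {
    name: string;
    type: 'string' | 'number' | 'boolean';
    values: Array<string | number | boolean>;
  };
}

// Hypothetical helper: one "<feature name>:<value>" bucket per variant value.
function listBucketNames(experiment: ExperimentSummary): string[] {
  return experiment.variants.values.map(
    (value) => experiment.variants.name + ':' + String(value)
  );
}

const experiment: ExperimentSummary = {
  slug: 'experiment-1',
  variants: { name: 'feature_a', type: 'string', values: ['red', 'blue', 'green'] }
};
console.log(listBucketNames(experiment));
```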

Capturing type in the variants property is an early attempt on our part to make this fieldset more robust and structured. But for our looming deadline, we could just make type a default of boolean and remove it altogether from the form (but still publish it in the api endpoint) if that saves time (not sure if it would). I feel like it's pretty trivial to save what's basically an enum field of 3 possible values (string, integer, boolean).
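
If the Type dropdown is dropped from the form, the default could be applied when the payload is built, so the API still publishes a type. A minimal sketch, assuming hypothetical names (VariantsFormInput, toVariantsPayload) and a form that collects only a name and values:

```typescript
type VariantType = 'string' | 'number' | 'boolean';
type VariantValue = string | number | boolean;

interface Variants {
  name: string;
  type: VariantType;
  values: VariantValue[];
}

// Hypothetical shape of what a form without a Type dropdown would collect.
interface VariantsFormInput {
  name: string;
  values: VariantValue[];
  type?: VariantType;
}

const DEFAULT_VARIANT_TYPE: VariantType = 'boolean';

// Fills in the default type so the saved JSON always has all three keys.
function toVariantsPayload(form: VariantsFormInput): Variants {
  return {
    name: form.name.trim(),
    type: form.type !== undefined ? form.type : DEFAULT_VARIANT_TYPE,
    values: form.values.slice()
  };
}

console.log(toVariantsPayload({ name: 'feature_n', values: [true, false] }));
```

Keeping type optional on the input means the dropdown can be reintroduced later without changing the saved structure.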

I took the liberty of updating the ACs and ticket description to tighten scope.

fields we consider as excluded for experiments are the ones that we should consider as optional in the database (so far these fields were defined as mandatory)

Yes! excellent point -- updated ACs accordingly

Per discussion with @Sfaci I think we agreed that it makes sense to have 2 separate pinia stores - one for instruments and one for experiments.

Not sure how much more we want to decouple because they do share a majority of fields/properties which is why I think we settled on keeping experiments in the instruments table and preserving the Type field (baseline or a/b test) as a column in the table (as the identifier distinguishing instruments from experiments), even tho it's to be removed from the frontend forms.
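
One way to picture the shared table is a single row type where the Type column acts as the discriminator and the fields hidden from the experiment form are optional. This is a sketch under assumptions: SharedRow, isExperiment and splitRows are hypothetical names, and only columns that appear in the API example above are included.

```typescript
// The two values of the Type column mentioned in this task.
type RecordType = 'baseline' | 'a/b test';

interface SharedRow {
  name: string;
  slug: string;
  type: RecordType;
  // Excluded from the experiment form, so optional here.
  stream_name?: string;
  schema_title?: string;
  schema_type?: string;
  // Only filled in for experiments.
  variants?: { name: string; type: string; values: Array<string | number | boolean> };
}

function isExperiment(row: SharedRow): boolean {
  return row.type === 'a/b test';
}

// Splits rows from the shared table into instruments and experiments.
function splitRows(rows: SharedRow[]): { instruments: SharedRow[]; experiments: SharedRow[] } {
  return {
    instruments: rows.filter((row) => !isExperiment(row)),
    experiments: rows.filter((row) => isExperiment(row))
  };
}

const rows: SharedRow[] = [
  { name: 'instrument 1', slug: 'instrument-1', type: 'baseline', stream_name: 'product_metrics.web_base' },
  { name: 'experiment 1', slug: 'experiment-1', type: 'a/b test' }
];
console.log(splitRows(rows));
```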

Thank you @cjming for exposing here all the considerations about cutting the scope regarding the experiments form design. I think we agreed I would do it but I forgot.

Regarding our agreement on keeping two separate stores, instruments and experiments: after exploring this approach this morning, I'm not sure about it. In our current instrument form, all the components are fully coupled to the specific store (through the v-model property), which would mean duplicating every component (Details, Duration, . . .) for the new experiments form (to couple them to a new experiments store). I think that if we use a "shared" store, we could compose the new experiments form from the existing components and just make some improvements to the current store. Also, some code in the current store already works for both instruments and experiments.

After exploring both ways (specific stores or a "shared" one) I would say that the second one is our best way to get an experiments form for now, even regarding the technical debt.

@Sfaci - sounds good re: stores -- whatever you think is the most expedient makes sense to me

@Sarai-WMF I have realized the new design for forms (baseline instruments and AB tests) has a new look and feel for the Location component. For now I have considered we could do that in a follow up ticket to try to achieve the current goals using the current component we have to fill this data. Is that ok?

Agreed, @Sfaci. We haven't agreed to those changes yet. It was just convenient for me to reflect the potential iteration on Figma for visibility. Thanks for the comment!

Change #1075207 had a related patch set uploaded (by Santiago Faci; author: Santiago Faci):

[operations/deployment-charts@master] MPIC: Deploying on staging a new relase v0.2

https://gerrit.wikimedia.org/r/1075207

Change #1075207 merged by jenkins-bot:

[operations/deployment-charts@master] MPIC: Deploying on staging a new relase v0.2

https://gerrit.wikimedia.org/r/1075207

Change #1075524 had a related patch set uploaded (by Santiago Faci; author: Santiago Faci):

[operations/deployment-charts@master] MPIC: Deploying to production a new relase v0.2

https://gerrit.wikimedia.org/r/1075524

Change #1075524 merged by jenkins-bot:

[operations/deployment-charts@master] MPIC: Deploying to production a new relase v0.2

https://gerrit.wikimedia.org/r/1075524

Sfaci moved this task from Done to Needs Review on the Data Products (Data Products Sprint 19) board.

@Sarai-WMF This work is already deployed on staging/production, so it's already available for you to review. Thanks!

VirginiaPoundstone raised the priority of this task from Medium to High. Oct 15 2024, 8:41 PM

Design review done. I didn't detect any relevant issues that aren't either already documented in Phabricator or design review docs (e.g. T374957), that are exclusive to the experiment form (e.g., T377353), or in scope for Alpha (e.g., T377336).

@Sarai-WMF Thank you very much!
Let's move this task to Done since it has already been deployed.