An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
My own task or dataset (give details below)
Reproduction
The docs for multiple choice use SWAG as an example, which is the task of selecting the next sentence given a context. Somewhat strangely, rather than being given in the format (sentence1, [sentence2a, sentence2b, sentence2c, sentence2d]), the dataset is given in the format (sentence1, sentence2_start, [sentence2_endA, sentence2_endB, sentence2_endC, sentence2_endD]).
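The code given in the docs (transformers/docs/source/en/tasks/multiple_choice.md, lines 96 to 100 in a06a0d1) basically turns the dataset into the first format, where sentence 1 is kept intact and the start of sentence 2 is concatenated to each ending.
Yet, the docs say (lines 85 to 88 of the same file):
The preprocessing function you want to create needs to: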
1. Make four copies of the `sent1` field and combine each of them with `sent2` to recreate how a sentence starts.
2. Combine `sent2` with each of the four possible sentence endings.
What is being described is formatting the dataset as (sentence1 sentence2_start, [sentence2_start sentence2_endA, sentence2_start sentence2_endB, sentence2_start sentence2_endC, sentence2_start sentence2_endD]), where there is overlap between the first and the second sentence (namely sentence2_start).
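To make the mismatch concrete, here is a small sketch on a single made-up row. The column names (`sent1`, `sent2`, `ending0` to `ending3`) are assumed to match SWAG's, and the two helper functions are hypothetical names for illustration; this is not the exact snippet from the docs.

```python
# Toy row with SWAG-style column names; the sentences themselves are invented.
row = {
    "sent1": "A man walks into a kitchen.",
    "sent2": "He",
    "ending0": "opens the fridge.",
    "ending1": "flies to the moon.",
    "ending2": "paints the ocean.",
    "ending3": "eats the table.",
}
ENDING_NAMES = ["ending0", "ending1", "ending2", "ending3"]


def pairs_as_coded(row):
    """What the code appears to do: sent1 kept intact, sent2 prepended to each ending."""
    return [(row["sent1"], f"{row['sent2']} {row[name]}") for name in ENDING_NAMES]


def pairs_as_described(row):
    """A literal reading of the description: sent1 + sent2 first, then sent2 + ending."""
    start = f"{row['sent1']} {row['sent2']}"
    return [(start, f"{row['sent2']} {row[name]}") for name in ENDING_NAMES]


for first, second in pairs_as_coded(row):
    print("coded:    ", repr(first), "|", repr(second))
for first, second in pairs_as_described(row):
    print("described:", repr(first), "|", repr(second))
```

In the second layout, `sent2` shows up on both sides of every pair, which is the overlap mentioned above.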
Expected behavior
Either the code is wrong or the description is wrong.
If the description is wrong, it should be:
The preprocessing function you want to create needs to:
1. Make four copies of the `sent1` field.
2. Combine `sent2` with each of the four possible sentence endings.
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.
Please note that issues that do not follow the contributing guidelines are likely to be ignored.
System Info
Not relevant.
Who can help?
@stevhliu @ArthurZucker
Information
Tasks
An officially supported task in the `examples` folder (such as GLUE/SQuAD, ...)
Reproduction
The docs for multiple choice use SWAG as an example, which is the task of selecting the next sentence given a context. Somewhat strangely, rather than being given in the format `(sentence1, [sentence2a, sentence2b, sentence2c, sentence2d])`, the dataset is given in the format `(sentence1, sentence2_start, [sentence2_endA, sentence2_endB, sentence2_endC, sentence2_endD])`.
The code given in the docs basically turns the dataset into the first format, where sentence 1 is kept intact and the start of sentence 2 is concatenated to each ending:
transformers/docs/source/en/tasks/multiple_choice.md
Lines 96 to 100 in a06a0d1
Yet, the docs say:
transformers/docs/source/en/tasks/multiple_choice.md
Lines 85 to 88 in a06a0d1
What is being described is formatting the dataset as `(sentence1 sentence2_start, [sentence2_start sentence2_endA, sentence2_start sentence2_endB, sentence2_start sentence2_endC, sentence2_start sentence2_endD])`, where there is overlap between the first and the second sentence (namely `sentence2_start`).
Expected behavior
Either the code is wrong or the description is wrong.
If the description is wrong, it should be:
If the code is wrong, it should be:
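The snippet that belonged here is not present in this copy of the issue, so the following is only a sketch of what code matching the description literally might look like, not the author's proposal. It assumes a batched `examples` dict with SWAG's column names and stops before tokenization; `build_pairs_from_description` is a hypothetical name.

```python
ENDING_NAMES = ["ending0", "ending1", "ending2", "ending3"]


def build_pairs_from_description(examples):
    """Build flattened (first, second) sentence lists following the description literally."""
    # Step 1 of the description: four copies of sent1, each combined with sent2.
    first_sentences = [
        [f"{context} {header}"] * 4
        for context, header in zip(examples["sent1"], examples["sent2"])
    ]
    # Step 2 of the description: sent2 combined with each of the four endings.
    second_sentences = [
        [f"{header} {examples[name][i]}" for name in ENDING_NAMES]
        for i, header in enumerate(examples["sent2"])
    ]
    # Flatten so each list has 4 entries per example, ready to be paired up.
    flat_first = [s for group in first_sentences for s in group]
    flat_second = [s for group in second_sentences for s in group]
    return flat_first, flat_second


# Made-up batch of two examples, only to exercise the function.
batch = {
    "sent1": ["A man walks into a kitchen.", "The dog sees a ball."],
    "sent2": ["He", "It"],
    "ending0": ["opens the fridge.", "chases it."],
    "ending1": ["flies to the moon.", "reads a book."],
    "ending2": ["paints the ocean.", "drives away."],
    "ending3": ["eats the table.", "sings an aria."],
}
first, second = build_pairs_from_description(batch)
print(list(zip(first, second))[:4])
```

Whether the overlap this produces is actually desirable for the model is a separate question; the point is only that it differs from what the current code does.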