Transpoemer

Poetry generation using Transformer-based architectures

This repo holds my experiments on finetuning pre-trained Transformer-based architectures for poetry generation. All experiments are done on Arabic poetry.

Generation Procedure

I follow a simple approach for poetry generation: conditioned on a verse, the model generates the next verse. The generated verse is then fed back to the model as input, and so on. More elaborate approaches would take an extended left context into account, but I leave those for later. A minimal sketch of this generation loop follows the table below.

| Timestep | Model Input | Model Output |
| --- | --- | --- |
| 1 | فيرجع الصدى | كأنه النشيج |
| 2 | كأنه النشيج | وهو المراد |
| 3 | وهو المراد | ... |
| 4 | ... | ... |
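As a minimal sketch, the loop looks like the following. `generate_next_verse` is a hypothetical placeholder for whichever decoder is used (GRU, GRU with attention, or GPT-2), not a function from this repo.

```python
# Minimal sketch of the verse-by-verse generation loop described above.
# `generate_next_verse` is a hypothetical placeholder, not code from this repo.
def generate_poem(seed_verse, generate_next_verse, num_verses=4):
    verses = [seed_verse]
    for _ in range(num_verses - 1):
        # Condition only on the most recent verse (no extended left context).
        verses.append(generate_next_verse(verses[-1]))
    return verses
```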

1. Pretraining

As a starting point, I pre-trained BERT on Arabic Wikipedia. I used the source code here to train a monolingual Masked Language Model, with the default training configuration.
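For illustration only, the sketch below shows what monolingual masked-language-model pretraining looks like with the HuggingFace `transformers` API. The repo itself used the external codebase linked above, and the corpus path and hyperparameters here are placeholders.

```python
# Illustrative sketch of MLM pretraining on a monolingual corpus (not the
# actual pretraining code used by this repo). Paths and hyperparameters are
# placeholders.
from transformers import (BertConfig, BertForMaskedLM, BertTokenizerFast,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)
from datasets import load_dataset

tokenizer = BertTokenizerFast.from_pretrained("bert-base-multilingual-cased")
model = BertForMaskedLM(BertConfig(vocab_size=tokenizer.vocab_size))

# Hypothetical plain-text dump of Arabic Wikipedia, one passage per line.
raw = load_dataset("text", data_files={"train": "ar_wiki.txt"})
tokenized = raw.map(
    lambda ex: tokenizer(ex["text"], truncation=True, max_length=128),
    batched=True, remove_columns=["text"],
)

# Randomly mask 15% of tokens; the model is trained to recover them.
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm_probability=0.15)
trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="bert-ar-wiki",
                           num_train_epochs=1,
                           per_device_train_batch_size=32),
    train_dataset=tokenized["train"],
    data_collator=collator,
)
trainer.train()
```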

2. Finetuning for Generation

  • BERT's original uses do not include language generation; its Masked Language Modelling objective makes it very difficult to sample from directly. My idea is to condition a decoder model on the contextual embeddings produced by BERT. I used two types of decoders:

      1. GRU decoder: a GRU-based decoder whose hidden state is initialized with the embeddings output by BERT.
      2. GRU decoder with attention: in addition to initializing the hidden state as before, Bahdanau attention over BERT's contextual embeddings is used.

    Finetuning is done with a maximum likelihood estimation (MLE) objective that maximizes the probability of generating the next verse. Gradients are back-propagated from the decoder into BERT, so the encoder is finetuned as well (see the sketch below).
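The following is a minimal PyTorch sketch of the second decoder variant, conditioning a GRU decoder on BERT's contextual embeddings with Bahdanau (additive) attention. The model name, hidden sizes, and vocabulary size are illustrative assumptions, not the repo's actual configuration.

```python
# Sketch: GRU decoder with Bahdanau attention over BERT contextual embeddings.
# Assumes HuggingFace transformers; all sizes and the checkpoint name are placeholders.
import torch
import torch.nn as nn
from transformers import BertModel

class AttnGRUDecoder(nn.Module):
    def __init__(self, bert_name="bert-base-multilingual-cased",
                 vocab_size=30000, emb_dim=256, hid_dim=768):
        super().__init__()
        self.bert = BertModel.from_pretrained(bert_name)  # encoder, finetuned jointly
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.gru = nn.GRU(emb_dim + hid_dim, hid_dim, batch_first=True)
        # Bahdanau (additive) attention parameters
        self.W_enc = nn.Linear(hid_dim, hid_dim, bias=False)
        self.W_dec = nn.Linear(hid_dim, hid_dim, bias=False)
        self.v = nn.Linear(hid_dim, 1, bias=False)
        self.out = nn.Linear(hid_dim, vocab_size)

    def forward(self, src_ids, src_mask, tgt_ids):
        # Contextual embeddings of the input verse, shape (B, S, H)
        enc = self.bert(input_ids=src_ids, attention_mask=src_mask).last_hidden_state
        # Initialize the decoder hidden state with BERT's [CLS] embedding
        h = enc[:, 0].unsqueeze(0)                        # (1, B, H)
        logits = []
        for t in range(tgt_ids.size(1)):
            # Additive attention over the encoder states
            score = self.v(torch.tanh(self.W_enc(enc) + self.W_dec(h[-1]).unsqueeze(1)))
            alpha = torch.softmax(score, dim=1)           # (B, S, 1)
            ctx = (alpha * enc).sum(dim=1)                # (B, H)
            emb = self.embed(tgt_ids[:, t])               # teacher forcing
            _, h = self.gru(torch.cat([emb, ctx], dim=-1).unsqueeze(1), h)
            logits.append(self.out(h[-1]))
        return torch.stack(logits, dim=1)                 # (B, T, V)
```

Training would apply a token-level cross-entropy loss between these logits and the gold next verse (the MLE objective above); since `self.bert` is part of the module, its parameters receive gradients as well.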

Notebook

GPT-2

TODO
