@janellecshane Maybe I should fine-tune the GPT2 models for the English texts.
I thought I wanted to start from scratch, so that Harry does not fly spaceships. On the other hand, the power of GPT2 probably comes from its much larger text corpus.
For the German model it does not make sense to fine-tune GPT2, because its training data seems to contain very little German text.
I am using the original GPT2 training code from the rkfg fork, following instructions from various posts in GitHub issues.