Results of the condition-matching test. For a given conditioning text, the sound effect generated by our model was perceived as correct 56% of the time, as almost matching 20% of the time, and as incorrect 24% of the time.