This prototype accurately transcribes raw polyphonic piano audio to MIDI. It also offers several tools for editing the resulting MIDI file: it generates new melodic sequences, harmonic chords and melodic continuations based on a primer MIDI file. The program integrates MuseScore 3 to display the result as a score, which can then be edited further by hand in MuseScore 3. All features are available through a graphical user interface.



Can artificial intelligence be creative? Machine learning is gaining more and more momentum and influences nearly every aspect of modern life. So let's have fun with it and use it as a tool to create.

Creativity and technology have long been linked. Apart from AI that creates art autonomously, artists mostly use the practical benefits of today's technology to create new art. In music, this can be seen in large recording studios, DJ mixing desks or even the electric guitar. But the average hobby musician does not have access to expensive software or extravagant studios. This project therefore provides a creative outlet for music lovers who would like to experiment with improvising and creating their own compositions. Additionally, the software teaches the basic principles of harmony as well as the correct notation of sheet music. It helps self-taught musicians, as well as students in educational institutions, to improve their musical understanding. And maybe this prototype will encourage someone to create their own art and leave an impact on this world.


The idea of this project is to build a Python program that transcribes audio to sheet music. The program has three main steps:

  1. Getting the audio input
  2. Transcribing and generating melodic sequences
  3. Generating sheet music and saving it as a PDF file

These applications should integrate seamlessly. A graphical user interface rounds off the prototype and makes it user-friendly.
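The third step can be scripted through MuseScore's command-line converter mode, where `-o` names the output file and the file extension selects the format. The following is only a sketch: the page does not say whether the project uses the converter mode, the function names are hypothetical, and the executable path in the usage note is an assumed default install location.

```python
import subprocess
from pathlib import Path


def build_musescore_command(musescore_exe, midi_path, pdf_path=None):
    """Build the argument list that asks MuseScore to convert a MIDI file to PDF.

    Hypothetical helper, not taken from the repository.
    """
    midi_path = Path(midi_path)
    if pdf_path is None:
        # Default: same file name as the MIDI file, with a .pdf extension.
        pdf_path = midi_path.with_suffix(".pdf")
    # "-o <file>" switches MuseScore to converter mode; the output
    # format follows from the extension of <file>.
    return [str(musescore_exe), "-o", str(pdf_path), str(midi_path)]


def export_pdf(musescore_exe, midi_path, pdf_path=None):
    """Run the conversion; raises CalledProcessError if MuseScore reports failure."""
    command = build_musescore_command(musescore_exe, midi_path, pdf_path)
    subprocess.run(command, check=True)
    return command[2]
```

A call could look like `export_pdf(r"C:\Program Files\MuseScore 3\bin\MuseScore3.exe", "transcription.mid")`, with the path adjusted to wherever MuseScore 3 is installed on your machine.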

Program Details

We use the Magenta package provided by Google, which builds on a variety of audio processing libraries such as librosa and mir_eval. Furthermore, Magenta provides powerful machine learning models. These RNNs have been trained on large datasets and therefore offer great accuracy in their predictions. Magenta also lets you train your own models, but in this project we use only the pre-trained checkpoints and models that Magenta provides. These are the main applications of this project:



  • Python 3.7 on PyCharm 2020.1.1
  • Windows 10 64-Bit


Use pip install tensorflow==1.15.3 to install the specific version of TensorFlow that is compatible with Magenta. Then add Magenta with pip install magenta. The installed version should be 1.3.1.

Get SoX from the website http://sox.sourceforge.net/ (the 32-bit build works on 64-bit systems) and add it to your PATH. Tutorial for Windows: https://stackoverflow.com/questions/17667491/how-to-use-sox-in-windows

Downgrade the Numba library to version 0.48.0 (pip install numba==0.48.0).

Download all the pre-trained models and save them on your computer:

  1. Transcription: https://github.com/tensorflow/magenta/tree/master/magenta/models/onsets_frames_transcription (scroll down to Transcription Script and click "checkpoint")
  2. Melody Generation: https://github.com/magenta/magenta/tree/master/magenta/models/melody_rnn (attention_rnn model)
  3. Improvising Further: https://github.com/magenta/magenta/tree/master/magenta/models/pianoroll_rnn_nade (pianoroll_rnn_nade model)
  4. Bach Improvising: https://github.com/magenta/magenta/tree/master/magenta/models/polyphony_rnn (polyphony_rnn model)
  5. Harmonizing Chords: https://github.com/magenta/magenta/tree/master/magenta/models/improv_rnn (chord_pitches_improv model)
  6. Creating Sheet Music: download MuseScore 3 from https://musescore.org/de/download/musescore.msi


  • Clone my repository from https://gitlab.ldv.ei.tum.de/komcrea/musik/-/tree/master
  • Fill in the correct paths to your checkpoints and models in "config.json.example". I recommend setting up one folder that contains all the models.
  • Rename "config.json.example" to "config.json" so that the code can use it.
  • Run "GUI.py". A simple (but beautiful) graphical user interface will pop up.
  • Have fun. 
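Before starting the GUI, it can help to check that every path in config.json really exists. The sketch below assumes the config is a flat JSON object whose string values are file or folder paths. The actual keys in the repository's config.json.example are not shown on this page, so the helper names and the key in the usage note are hypothetical.

```python
import json
from pathlib import Path


def load_config(config_path="config.json"):
    """Read the JSON config file and return its content as a dict."""
    with open(config_path, encoding="utf-8") as handle:
        return json.load(handle)


def find_missing_paths(config):
    """Return the keys whose configured path does not exist on disk.

    Assumption: every string value in the config is meant to be a path.
    """
    return [
        key
        for key, value in config.items()
        if isinstance(value, str) and not Path(value).exists()
    ]
```

If `find_missing_paths(load_config())` returns, say, `["melody_rnn_bundle"]`, the path stored under that (hypothetical) key needs to be corrected before the matching model can be loaded.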

Please make sure you select an existing .wav file before transcribing.
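A small guard in front of the transcription call can catch a missing or broken file early. This is a sketch using only Python's standard `wave` module; the function name is hypothetical and the check is not part of the repository as described here.

```python
import wave
from pathlib import Path


def is_readable_wav(path):
    """Return True if path is an existing, non-empty WAV file.

    Note: the standard wave module only parses PCM WAV files, so other
    encodings are reported as unreadable by this check.
    """
    path = Path(path)
    if not path.is_file() or path.suffix.lower() != ".wav":
        return False
    try:
        with wave.open(str(path), "rb") as wav_file:
            return wav_file.getnframes() > 0
    except (wave.Error, EOFError):
        # Raised for files that are truncated or not in RIFF/WAVE format.
        return False
```

The GUI could call this before transcribing and show a message instead of handing an invalid file to the model.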

If Python reports an error, you have to restart the GUI, because the GUI keeps running even after an error has occurred.

For more information, visit:



How To Write a Python Program

  • Scan the internet for existing code, i.e. do !!extensive!! research on GitHub

After weeks of research into the darkest corners of GitHub, I found this powerful GitHub repository by none other than Google: https://github.com/magenta/magenta. After running it through several tests, it was apparent that this code is by far the most accurate in note detection compared to the other scripts available on GitHub.

  • Clone a useful GitHub repository into your local IDE
    • Get the requirements

I use PyCharm because it has great GitHub integration. I cloned the repository onto my PC and then installed the dependencies: Magenta, TensorFlow and SoX. I also had to downgrade Numba to version 0.48.0.

  • Tweak and edit the code to your liking
    • Get rid of unused, irrelevant functions and variables
    • Add scripts that enhance your application 

This program uses five different models that Magenta provides: onsets_frames_transcription, melody_rnn, improv_rnn, polyphony_rnn and pianoroll_rnn_nade. Because we use only pre-trained models and checkpoints, only the "...generate.py" scripts are needed. Therefore, open your own project and copy only the five "...generate.py" files into it. The original code relies on so-called "flags" (command-line options), which are difficult to work with in our program, so hardcoding the variables and setting up a config.json that holds the fixed values is recommended. Other functions in the code, such as the "main" function, are not used either, so you can delete them. Since this code deals with audio, recording from a microphone should also be possible. We add a .py file that uses pyaudio and does nothing but record audio and save it as a WAV file. Example code for this can be found on the internet.
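The saving half of such a recording script needs only the standard library. The sketch below assumes the recorder has already collected raw PCM chunks, for example the bytes objects returned by a pyaudio input stream's `read()` calls. The function name and the default values (mono, 16-bit, 16 kHz) are illustrative assumptions, not the settings used in the repository.

```python
import wave


def save_frames_as_wav(frames, path, channels=1, sample_width=2, rate=16000):
    """Write a sequence of raw PCM chunks (bytes) to a WAV file.

    Assumed defaults: mono, 2 bytes per sample (16-bit), 16000 Hz.
    They must match the settings the audio was recorded with.
    """
    with wave.open(str(path), "wb") as wav_file:
        wav_file.setnchannels(channels)
        wav_file.setsampwidth(sample_width)
        wav_file.setframerate(rate)
        # Join the chunks so the header is written with the final length.
        wav_file.writeframes(b"".join(frames))
```

Keeping the file writing separate from the microphone handling also makes it possible to test this part without any audio hardware.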

  • Build a !!beautiful!! GUI that makes this program understandable for "normal" people

Since these five models live in five separate .py files that are not integrated with each other, we need to "connect" them with a GUI. PyQt5 is a great tool for building graphical user interfaces. Make sure you use its signals and slots system to pass values such as the MIDI file to the next function. This ensures that, for example, the MIDI file saved after the piano transcription can immediately be used to generate new sequences. The GUI makes these transitions seamless. MuseScore 3, a free program, can also be integrated through the GUI. The GUI thus connects all the separate parts into one coherent program.
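The hand-off described above can be sketched with two small QObject classes. This illustrates the signals and slots mechanism only: the class names, the signal name and the placeholder "transcription" are hypothetical, and the real model calls are left out.

```python
from PyQt5.QtCore import QObject, pyqtSignal, pyqtSlot


class Transcriber(QObject):
    """Stands in for the transcription step; announces the MIDI file it wrote."""

    # Signal carrying the path of the finished MIDI file.
    midi_ready = pyqtSignal(str)

    def transcribe(self, wav_path):
        # Placeholder: the actual model call is omitted; only the
        # output path is derived and announced.
        midi_path = wav_path.rsplit(".", 1)[0] + ".mid"
        self.midi_ready.emit(midi_path)


class MelodyGenerator(QObject):
    """Stands in for a generation step that needs a primer MIDI file."""

    def __init__(self):
        super().__init__()
        self.primer_midi = None

    @pyqtSlot(str)
    def set_primer(self, midi_path):
        # Remember the file so the next generation run can use it.
        self.primer_midi = midi_path


transcriber = Transcriber()
generator = MelodyGenerator()
# Connect the signal to the slot: every finished transcription
# is passed on to the generator automatically.
transcriber.midi_ready.connect(generator.set_primer)
transcriber.transcribe("recording.wav")
```

After the last line, `generator.primer_midi` holds "recording.mid" without the two classes knowing about each other. In a real GUI the same connection would typically be made once, when the main window is set up.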


  • Add formalities to make the code easier for others to understand
    • Add README.md, requirements.txt, LICENSE and more
    • Write great documentation

A README.md is extremely important so that others can test and follow your work. It also does not hurt to make it a bit more entertaining. Imagine how you would like to be introduced to a program. During your research, you may have stumbled upon repositories with a bad README.md whose code you could not get running because the documentation was lacking. Learn from their mistakes. Try to make yours as understandable as possible. It is also very important to specify which versions you used, as some libraries might not be compatible with others depending on their versions.

  • Upload your project on GitHub or GitLab to share with the world

When you are happy and content with your work, upload it to GitHub or GitLab and share it with others.


Things I Wish I Had Known Before I Started All of This:

  • StackOverflow is your best friend
  • Trial and Error is your best friend
  • The debugging beetle in PyCharm is your best friend
  • If your windows don't have blinds yet, get them, they are your best friend
  • Use the dark theme in PyCharm, your eyes will thank me
  • Give your new functions and variables reasonable names
  • Code that only runs through command line exists, I did not know that...
  • In PyCharm, you can pass command-line parameters through the run configuration settings of your script
  • Always update your GitLab
