Automatic music transcription (AMT) is the process of converting an acoustic musical signal into a symbolic musical representation such as a MIDI piano roll, which encodes the pitches, onsets and offsets of the notes and, possibly, their dynamics and sources (i.e., instruments). Existing AMT algorithms commonly identify pitches and their saliences frame by frame and then form notes in a post-processing stage that applies a combination of thresholding, pruning and smoothing operations. Very few existing methods consider the temporal evolution of notes over multiple frames during the pitch identification stage. In this work, we propose a note-based spectrogram factorization method that uses the entire temporal evolution of piano notes as a template dictionary. The method first detects note onsets from the audio spectral flux with an artificial neural network. Next, it estimates the notes present in each audio segment between two successive onsets using a greedy search algorithm. Finally, the spectrogram of each segment is factorized as a discrete combination of note templates, each comprising the full spectrogram of an individual piano note sampled at a given dynamic level. We also propose a new psychoacoustically informed measure of spectrogram similarity.
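The greedy segment-wise factorization described above can be illustrated with a minimal sketch. This is not the paper's implementation: the function name, the `max_notes` cap, and the use of a plain Euclidean error in place of the psychoacoustically informed similarity measure are all simplifying assumptions made for illustration. It shows the core idea of greedily adding full-note template spectrograms until no remaining template reduces the reconstruction error of the segment.

```python
import numpy as np

def greedy_note_selection(segment, templates, max_notes=4):
    """Greedily pick note templates whose sum best matches the segment.

    segment:   (freq, time) magnitude spectrogram of one inter-onset segment
    templates: dict mapping note labels to (freq, time) template spectrograms,
               assumed here to be pre-trimmed to the segment length
    NOTE: Euclidean distance stands in for the paper's psychoacoustic measure.
    """
    approx = np.zeros_like(segment)          # running sum of chosen templates
    chosen = []
    best_err = np.linalg.norm(segment)       # error of the empty estimate
    while len(chosen) < max_notes:
        best = None
        for label, tmpl in templates.items():
            if label in chosen:
                continue
            err = np.linalg.norm(segment - (approx + tmpl))
            if err < best_err:               # keep the template that helps most
                best, best_err = label, err
        if best is None:                     # no template reduces the error
            break
        chosen.append(best)
        approx = approx + templates[best]
    return chosen
```

A dictionary covering several dynamic levels per pitch fits naturally into this scheme: each (pitch, dynamic) pair is simply a separate entry in `templates`, and the greedy search selects at most one of them per note.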