[UPDATE 3] Inspiration strikes a chord!


This is an update to my previous post, “Inspiration strikes a chord!”, detailing my pet project to decipher the songs in some decorative piano rolls I found in a restaurant. Click the following links for post 1, post 2, and post 3.

Time to up the ante. Up to this point, all my development and testing has centered on a single photograph with ideal lighting, minimal glare, and a bird’s-eye viewing angle. These niceties have translated into gross assumptions in my algorithm, which will likely stick out like a sore thumb when I use different data.

So let’s do exactly that!

Here are results from three completely different photographs. I’m quickly discovering what does and does not generalize in the current algorithm. Two examples: variations in lighting cause the threshold function to generate both false positives and false negatives, and my pitch-scale error minimization doesn’t always converge.
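To make the lighting problem concrete, here is a minimal sketch of why a single global cutoff breaks down under uneven illumination, and how comparing each pixel to its local neighborhood can help. This is not the threshold function from my pipeline. It assumes a grayscale image stored as a list of rows, with notes darker than the surrounding paper. The names `synthetic_roll`, `global_threshold`, and `adaptive_threshold`, along with the `radius` and `offset` parameters, are made up for illustration.

```python
def synthetic_roll(width=20, height=9):
    """Hypothetical test image: paper brightness ramps left to right,
    with two dark 'notes' punched into row 4."""
    img = [[100 + 6 * x for x in range(width)] for _ in range(height)]
    for r, c in [(4, 2), (4, 17)]:
        img[r][c] -= 60  # each note is 60 levels darker than nearby paper
    return img


def global_threshold(img, t):
    """Flag every pixel darker than one fixed cutoff t."""
    return [[p < t for p in row] for row in img]


def adaptive_threshold(img, radius, offset):
    """Flag a pixel only if it is darker than the mean of its
    (2*radius+1)-square neighborhood by more than `offset`.
    Windows are clipped at the image border."""
    h, w = len(img), len(img[0])
    mask = [[False] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            rows = range(max(0, r - radius), min(h, r + radius + 1))
            cols = range(max(0, c - radius), min(w, c + radius + 1))
            window = [img[i][j] for i in rows for j in cols]
            local_mean = sum(window) / len(window)
            mask[r][c] = img[r][c] < local_mean - offset
    return mask


def flagged(mask):
    """Return the set of (row, col) positions marked as notes."""
    return {(r, c) for r, row in enumerate(mask) for c, v in enumerate(row) if v}


if __name__ == "__main__":
    img = synthetic_roll()
    print("global  :", len(flagged(global_threshold(img, 128))), "pixels flagged")
    print("adaptive:", sorted(flagged(adaptive_threshold(img, radius=2, offset=30))))
```

On this toy image, the fixed cutoff of 128 flags the whole dim left edge as notes and misses the note on the bright right side, while the local comparison picks out only the two real notes. A real photograph would need a larger window than the notes themselves, and glare that saturates a region entirely cannot be recovered by thresholding alone.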


ID: 000
ID: 003
ID: 006

…and here’s some metadata…

000 struggled with lighting and alignment. Notes in the lower registers were lost to glare, and the upper register saw false positives from lyrics written in the top margin. Strong scale signal, though.
003, again, lost a lot of notes to glare. I suspect there’s some error in the perspective transformation as well. The scale error signal appeared quite weak.
Perhaps the best of the three, 006 produced a strong scale signal and featured well-lit, high-contrast notes. The resulting MP3 sounds delightful!
