Automatic Fingering Prediction 0.1ish

Nicholas
Posts: 12393

Post by Nicholas » 07-23-09 8:10 pm

TonE wrote:Then I do not understand why B2 equals to 15, as seen in the screenshot above.
Wait, what?

Which screenshot? I was talking about the text file vicentefer attached to his last message.

TonE
Synthesia Donor
Posts: 1180

Post by TonE » 07-23-09 8:16 pm

I know, but maybe I mixed up your "group 1" with the field which is marked as "1" in the AFP 0.1ish screenshot. Anyway, I hope my point was still clear and sorry for confusing you a little :P .

Nicholas
Posts: 12393

Post by Nicholas » 07-23-09 8:31 pm

oh, hehe... no, I was referring to regular expression groups. The parentheses in that code block mean that if you run it through a RegEx engine, it should come back with one "group", and if you ask it for the contents of that group, you should get the number part just after the "n=".

I was totally confused for a minute there. ;)
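
To make the "group" idea concrete, here is a minimal Python sketch of what a capture group returns. The sample line is an assumption written in the MIDI-to-text style discussed in this thread, not a line copied from vicentefer's file, and `note_number` is a name made up for the example.

```python
import re

# Parentheses mark the capture group; the rest must match but is not returned.
NOTE_ON = re.compile(r"On ch=[0-9]+ n=([0-9]+)")

def note_number(line):
    """Return the number after 'n=' as an int, or None if the line is not a note-on."""
    match = NOTE_ON.search(line)
    return int(match.group(1)) if match else None

sample = "480 On ch=1 n=60 v=90"  # hypothetical line, for illustration only
print(note_number(sample))
```

Asking for `group(1)` gives only the digits inside the parentheses, which is the "number part just after the n=".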

Nicholas
Posts: 12393

Post by Nicholas » 07-23-09 11:57 pm

I got a chance to mess around with your app. *Very* cool. Without knowing too much about fingering, the generated fingers looked pretty awesome to me.

(Oh, and I recommend that you just update your top post with the latest version so you don't have to search the whole thread for it.)

---

In lieu of writing a MIDI converter (because vicentefer already found a nicely functional one), here is some (.NET 2.0) source for parsing those files:

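    using System;
    using System.Collections.Generic;
    using System.IO;
    using System.Text;
    using System.Text.RegularExpressions;

    // One note-on event: when it happened and which MIDI note it was.
    struct Note : IComparer<Note>
    {
        public long time;
        public int noteId;

        // Orders notes by time, so an empty Note can be handed to List.Sort as the comparer.
        public int Compare(Note x, Note y) { return x.time.CompareTo(y.time); }
    }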

    class Program
    {
        static void Main(string[] args)
        {
            if (args.Length != 1 || !File.Exists(args[0]))
            {
                Console.WriteLine("Pass in the name of a file that exists.");
                return;
            }

            List<Note> notes = new List<Note>();
            using (StreamReader reader = new StreamReader(args[0]))
            {
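                // Group 1 is the time, group 2 the note number. A velocity of 0 does not match, so those lines are skipped.
                Regex expression = new Regex("([0-9]+) On ch=[1-9][0-9]? n=([1-9][0-9]*) v=[1-9][0-9]*");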
                while (!reader.EndOfStream)
                {
                    Match m = expression.Match(reader.ReadLine());
                    if (!m.Success) continue;

                    long time;
                    int noteId;
                    if (!long.TryParse(m.Groups[1].Value, out time)) continue;
                    if (!int.TryParse(m.Groups[2].Value, out noteId)) continue;

                    Note note = new Note();
                    note.time = time;
                    note.noteId = noteId;

                    notes.Add(note);
                }
            }

            notes.Sort(new Note());

            StringBuilder result = new StringBuilder();
            foreach (Note n in notes) result.AppendFormat("{0} ", n.noteId);

            Console.WriteLine(result.ToString().Trim());
        }
    }

Dragging vicentefer's 02hanon.txt file onto the compiled program produces this:
36 40 45 43 41 43 41 40 38 41 47 45 43 45 43 41 40 43 48 47 45 47 45 43 41 45 50 48 47 48 47 45 43 47 52 50 48 50 48 47 45 48 53 52 50 52 50 48 47 50 55 53 52 53 52 50 48 52 57 55 53 55 53 52 50 53 59 57 55 57 55 53 52 55 60 59 57 59 57 55 53 57 62 60 59 60 59 57 55 59 64 62 60 62 60 59 57 60 65 64 62 64 62 60 59 62 67 65 64 65 64 62 67 62 59 60 62 60 62 64 65 60 57 59 60 59 60 62 64 59 55 57 59 57 59 60 62 57 53 55 57 55 57 59 60 55 52 53 55 53 55 57 59 53 50 52 53 52 53 55 57 52 48 50 52 50 52 53 55 50 47 48 50 48 50 52 53 48 45 47 48 47 48 50 52 47 43 45 47 45 47 48 50 45 41 43 45 43 45 47 48 43 40 41 43 41 43 45 47 41 38 40 41 40 41 43 45 40 36 38 40 38 40 41 36
Hehe, pasted into AFP 0.1.01:
1 2 4 3 2 4 3 2 1 2 4 3 2 3 2 1 1 3 5 4 3 4 3 2 1 3 5 4 3 4 3 2 1 3 5 3 2 4 3 2 1 2 4 3 2 3 2 1 1 3 5 4 3 4 3 2 1 3 5 3 2 4 3 2 1 2 4 3 2 3 2 1 1 3 5 4 3 4 3 2 1 3 5 4 3 4 3 2 1 3 5 3 2 4 3 2 1 2 4 3 2 3 2 1 1 2 4 3 2 3 2 1 5 3 1 1 2 1 2 3 4 2 1 2 3 2 3 4 5 3 1 2 3 2 3 4 5 3 1 2 3 2 3 4 5 3 1 1 2 1 2 3 4 2 1 2 3 2 3 4 5 3 1 2 3 2 3 4 5 3 1 1 2 1 2 3 4 2 1 2 3 2 3 4 5 3 1 2 3 2 3 4 5 3 1 2 3 2 3 4 5 3 1 1 2 1 2 3 4 2 1 2 3 2 3 4 5 3 1 2 3 1 2 3 1
Right now the parser will blend all tracks together.

An alternative is to drop the notes.Sort(...) line and detect whenever the time drops to less than it was last time, and call that a new track. (Or... just stop parsing or something, depending on how you want to handle tracks.)
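
The "time drops, so it must be a new track" idea can be sketched in a few lines of Python. This is only an illustration under the assumption that events arrive as (time, note) pairs in file order; `split_tracks` is a made-up name.

```python
def split_tracks(events):
    """Split (time, note) pairs into tracks wherever the time value goes backwards."""
    tracks = []
    last_time = None
    for time, note in events:
        # A drop in time is taken to mean the file has started a new track.
        if last_time is None or time < last_time:
            tracks.append([])
        tracks[-1].append(note)
        last_time = time
    return tracks

# Hypothetical events: two tracks, the second restarting at time 0.
events = [(0, 36), (120, 40), (240, 45), (0, 60), (120, 64)]
print(split_tracks(events))
```

One limitation of this heuristic: a track whose first note starts later than the previous track's last note would not be detected as a new track.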

tommai78101
Posts: 762

Post by tommai78101 » 07-24-09 1:28 am

Is there going to be a guide on how to use this?

Reading the first and second posts made by Frost is too much for my eyes. Sorry if I couldn't give out more detailed explanations.
Hardware Information: Windows Vista Home Premium SP1, 358MB Mobile Intel Graphics Media Accelerator X3100, Synthesia 0.7.1 preview r697, 2 GB DDRAM, 1.6 GHz Intel Pentium Dual-Core Processor T2330, Acer Aspire 5720-4126
New Hardware Information: Windows 10 Pro, 2GB Nvidia GeForce 860M, 8GB RAM, 1.7GHz Core-i5 4210U, Alienware 13 R1.

Nicholas
Posts: 12393

Post by Nicholas » 07-24-09 2:47 am

I just made my little program run the MIDI-to-Text conversion for you, so, at this point, it should be as easy as:
  1. Make a very simple (one-track, one-hand, no simultaneous notes) MIDI.
  2. Drop it on the MidiTextParser.exe program attached to this post. (It will copy something to your clipboard.)
  3. Paste it into the big first box in Frost's program.
  4. Click "Calculate".
The "predicted fingering" box will have your results. 1 is your right thumb, 2 your index finger, etc.
Attachments
MidiTextParser.zip
Both .exe's inside need to be extracted to the same directory. Drag .mid files to MidiTextParser.exe (not mf2t.exe!)
(26.72 KiB) Downloaded 371 times

TonE
Synthesia Donor
Posts: 1180

Post by TonE » 07-24-09 5:02 am

Thanks Nicholas, it works, only copying to clipboard creates:
Unhandled Exception: System.Runtime.InteropServices.ExternalException: Requested Clipboard operation did not succeed.
at System.Windows.Forms.Clipboard.ThrowIfFailed(Int32 hr)
... SetDataObject ...
... SetText ...
... SetText(String text)
at MidiTextParser.Program.Main(String[] args)

TonE
Synthesia Donor
Posts: 1180

Post by TonE » 07-24-09 5:17 am

... it works apart from the fact AFP 0.1.01 does not convert midi pitch numbers correctly to note names.

It seems AFP starts mapping
0 --> R (est)
1 --> A1
2 --> A1#
...
60 --> G5#
...
127 --> D11#

Instead it should be as follows, keeping 0 reserved for a rest (unconventional, but practical in AFP, since no one normally uses pitches that low in .mid files):

1 --> C#x
2 --> Dx
...
12 --> C(x+1)
...

Replace x with a number of your choice; see the other thread for discussion.
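
A minimal Python sketch of the mapping proposed here, with 0 kept as a rest and the octave number rising at every C. The default `first_octave=-1` (which makes 60 come out as C4) is an assumption; the post leaves x open.

```python
NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

def midi_to_name(pitch, first_octave=-1):
    """Name a MIDI pitch in 0..127, treating 0 as a rest as AFP does.

    first_octave is the 'x' from the post; -1 is an assumed default.
    """
    if not 0 <= pitch <= 127:
        raise ValueError("MIDI pitch must be in 0..127")
    if pitch == 0:
        return "R"  # reserved for a rest
    return f"{NAMES[pitch % 12]}{first_octave + pitch // 12}"

print(midi_to_name(1), midi_to_name(2), midi_to_name(12), midi_to_name(60))
```

With `first_octave=0` the same function gives 12 as C1, so the choice of x changes only the octave label and never the letter.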

Frost
Posts: 51

Post by Frost » 07-24-09 7:12 am

Couldn't wrist position be "learned" over a few notes?
that's a good idea. right now it uses dynamic programming, so it finds the absolute minimum, but adding heuristics on top of it could provide improvements. I got an idea-ish after reading your post; it's still hazy, though. in a few days it should be clearer.
What is C1 for a midi pitch number in the range of 0..127 or is this unimportant?
it's unimportant. the notes are just for finding the distances between the keys; the note names do not matter at all. (c-d-e is the same as f-g-a due to the placement of the black keys)
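
For readers who have not met the technique, here is a minimal Python sketch of a dynamic-programming search over fingerings. It is not AFP's code: `transition_cost` is a made-up stand-in for the real rule set, and both of its numbers (2 semitones per finger, a penalty of 4) are assumptions chosen only to keep the example small.

```python
def transition_cost(f1, n1, f2, n2):
    """Hypothetical cost of going from finger f1 on note n1 to finger f2 on note n2."""
    interval = n2 - n1
    natural = 2 * (f2 - f1)   # assumed: neighbouring fingers rest about 2 semitones apart
    cost = abs(interval - natural)
    if f1 == f2 and interval != 0:
        cost += 4             # assumed penalty for hopping with the same finger
    return cost

def best_fingering(notes):
    """Return (total_cost, fingers) with the lowest summed transition cost."""
    fingers = range(1, 6)
    if not notes:
        return 0, []
    # best[f] = (cost, fingering) of the cheapest way to play the notes so far ending on finger f
    best = {f: (0, [f]) for f in fingers}
    for prev, cur in zip(notes, notes[1:]):
        step = {}
        for f2 in fingers:
            step[f2] = min(
                (best[f1][0] + transition_cost(f1, prev, f2, cur), best[f1][1] + [f2])
                for f1 in fingers
            )
        best = step
    return min(best.values())

print(best_fingering([60, 62, 64, 65, 67]))
```

Because only the cheapest path ending on each finger is kept at every note, the result is the true minimum of the summed cost without trying all 5^n fingerings. Only the differences between note numbers enter the cost, which is why the note names themselves do not matter.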

For now, the GUI feels like a research interface.
exactly, the actual version won't have hundreds of parameters (I'm still hiding about 100 parameters that I was too lazy to implement in the GUI, hence those empty tabs :) ); they would probably be hidden in an advanced user section. the only parameters may be "hand size: small - medium - large", "left-right hand" etc. But right now, I/you constantly modify the parameters, so I tried to squeeze in as many as I could :)
You should look at lilypond both for input and output.
wow, yes it's simple, and I was thinking of doing a non-Synthesia version for sheet music in the future and had searched for a MIDI sheet generator with fingerings to use along with AFP. lilypond looks very nice and simple, I'll definitely look into it.

Until someone makes this library, maybe we can use a "midi to txt" like you can find in http://www.midiox.com/
..
I just made my little program run the MIDI-to-Text conversion for you
thanks! that really helps. I can add the midi drag box to AFP as soon as I get the time. it should be a lot easier to test.
it works apart from the fact AFP 0.1.01 does not convert midi pitch numbers correctly to note names.

It seems AFP starts mapping
0 --> R (est)
1 --> A1
2 --> A1#
...
60 --> G5#
...
127 --> D11#

Instead it should be as follows, keeping 0 reserved for a rest (unconventional, but practical in AFP, since no one normally uses pitches that low in .mid files):

1 --> C#x
2 --> Dx
...
12 --> C(x+1)
...
ah, yes. that was mentioned for a little bit.
Now, the note numbers are not correct. Right now, it starts from "A", not "C". I was fixing it, but your post made me confused.

The reason why I started from 1=A is, well, the leftmost key on a piano is A. Actually the notes themselves are correct, it's the octave that's wrong. Right now:
A1 A1# B1 C1 C1# D1..
1___2__3__4__5___6..

However, on piano it should be:
A0 A0# B0 C1 C1# D1..
1___2__3__4__5___6..

So I was going to start the octave from "C" instead of "A" and take the leftmost 3 keys as "leftovers" from the previous octave. Now, you say midi pitches start with 1=C# ?
Which standard would you prefer? Should I take 1 as the first key of the piano, or as the midi pitch number? Which would be more comfortable for you? I mean, when it's finished, most people would use it with midi files, but right now, creating a midi file for a short fragment would be harder than writing it manually. I can, of course, modify the midi and hand input separately, but which standard would people be more comfortable with?
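
The two numbering schemes differ only by a constant offset, so supporting both is cheap. A minimal Python sketch, assuming an 88-key piano whose leftmost A is MIDI pitch 21 (the common convention in which 60 is C4); the function names are made up for the example.

```python
NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]
A0_MIDI = 21  # MIDI pitch of the leftmost key of an 88-key piano, with 60 = C4

def key_to_midi(key):
    """Convert an 88-key piano key number (1 = leftmost A) to a MIDI pitch."""
    if not 1 <= key <= 88:
        raise ValueError("key must be in 1..88")
    return key + A0_MIDI - 1

def key_to_name(key):
    """Name a piano key so that the octave number changes at C, not at A."""
    pitch = key_to_midi(key)
    letter = NAMES[pitch % 12]
    octave = pitch // 12 - 1
    # Written in the thread's style, with the octave before the sharp: A0#, C1#.
    return letter[0] + str(octave) + letter[1:]

print(" ".join(key_to_name(k) for k in range(1, 7)))
```

The first six keys come out as A0 A0# B0 C1 C1# D1, which is the "on piano it should be" row above, while `key_to_midi` lets hand input and MIDI input share one internal representation.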

Frost
Posts: 51

Post by Frost » 07-24-09 7:22 am

The reason why I started from 1=A is, well, the leftmost key in piano is A.
now, thinking about it, I only thought of the 88-key full size piano, but what is the layout of smaller, 49-key, 76-key etc. keyboards? Do they also start with an A? C? Manufacturer dependent?

Frost
Posts: 51

Post by Frost » 07-24-09 7:54 am

Is there going to be a guide on how to use this?

Reading the first and second posts made by Frost is too much for my eyes. Sorry if I couldn't give out more detailed explanations.
well, right now it's for testing only, so it's a bit messy.
To use:
-Write numbers into the first box (sample: 1 2 3 4 5 ....)
or
-Write notes into the second box (sample: C4 C4# D4 G3...)
-Click calculate. The program will give the fingering as "1 4 3.." etc.
-The fingerings may be worse than what you expect, so you play with the parameters to make the prediction better.

"What parameter does what" is slightly complicated, there are "+" signs along them, which will give help if you hover on them with the mouse pointer. If that's too messy and you don't want to play with the parameters, you can just help by reviewing your favourite piece. As an example, you can say:

Song: (C1 D1 G1..)
Prediction: (1 2 4..)
Actual fingering: (1 2 5..)
Why actual is better: (stretch between 2 and 4 is uncomfortable, using 2-5 is better..)
Parameters (if changed): "R12.Thumb pass= 2.5 ..."

Nicholas
Posts: 12393

Post by Nicholas » 07-24-09 10:17 am

TonE wrote:Thanks Nicholas, it works, only copying to clipboard creates: [an exception]
That's weird. Do you have some weird permissions thing going on? You must have .NET 2.0 or it wouldn't run, so that shouldn't be the problem.

You extracted both to a folder?... Again, you must have otherwise it would have crashed before that looking for mf2t.exe...

Are you trying to convert a super-huge MIDI? The clipboard is supposed to be able to take anything, but if there are a few thousand notes, maybe that is doing something weird?

I'm not sure what it could be. That should be the safest line in there.
Frost wrote:... but what is the layout of smaller, 49-key, 76-key etc. keyboards? Do they also start with an A? C? Manufacturer dependent?
It's size dependent. All 76-key keyboards will start on the same note, etc., but not necessarily the same note as any other size keyboard. I dug up pictures of each on Google a while back when I needed the values for Synthesia. I don't have them on me right now.

TonE
Synthesia Donor
Posts: 1180

Post by TonE » 07-24-09 10:46 am

Nicholas wrote:Are you trying to convert a super-huge MIDI? The clipboard is supposed to be able to take anything, but if there are a few thousand notes, maybe that is doing something weird?
No, only Hanon 10 exercise, right hand part only.
Now, you say midi pitches start with 1=C# ?
Which standard would you prefer? Should I take 1 as the first key of the piano, or as the midi pitch number? Which would be more comfortable for you?
Frost, 1=C# because 0=C. I would prefer the MIDI standard, simply because it is a standard. Actually, you should not worry much about me, as I will not test much if I do not know exactly what is going on inside. I would prefer to wait and first find, read, and understand the papers you mentioned; there is only one paper I could not find yet (Jacobs, Refinements to the Ergonomic Model for Keyboard Fingering, 2001).

Having glanced at the last paper you mentioned (Kasimi et al., A Simple Algorithm for Automatic Generation of Polyphonic Piano Fingerings, 2007), I did not like the model shown in Figure 5, the cost function cvertical for finger pair 2,5.

To me it seems the point x=1 with its value y=10 should be lower than both points at x=10 and x=11, or the other way around: both points at x=10 and x=11 should be much higher than 10. Why? I tried it on the keys myself, and to me they did not feel equally difficult. But of course I do not know exactly what you did in your AFP application. Still, it is good to see someone come into the forum with fresh energy and input. :)

As far as I know, all 5-octave keyboards, as most synthesizers are, start with a C; to be more precise, they start at 36 and also end on a C = 96.

Frost
Posts: 51

Post by Frost » 07-24-09 11:51 am

Having glanced at the last paper you mentioned (Kasimi et al., A Simple Algorithm for Automatic Generation of Polyphonic Piano Fingerings, 2007), I did not like the model shown in Figure 5, the cost function cvertical for finger pair 2,5.

To me it seems the point x=1 with its value y=10 should be lower than both points at x=10 and x=11, or the other way around: both points at x=10 and x=11 should be much higher than 10. Why? I tried it on the keys myself, and to me they did not feel equally difficult. But of course I do not know exactly what you did in your AFP application. Still, it is good to see someone come into the forum with fresh energy and input.
I attached the Jacobs paper.
Well, the Kasimi paper seems very simple; I don't know how it will fare, though (it will only affect the results in the case of polyphonic fingering). About the Cvertical score: on a 1-note interval, fingers 2-5 will not *ever* be played (at least in my method), so any high enough score would do. But you are right, a 1-note interval can possibly be played even though it is uncomfortable, whereas 10-11 key intervals are very hard; I also changed it to 6-7.
By the way, that is the part I haven't gotten to yet. Would you be interested in finding hand stretch scores between finger pairs? As in the 2-5 example, but for all pairs ( 1-2, 1-3, 1-4, 1-5, 2-3, 2-4, 2-5, 3-4, 3-5 and 4-5) ?

Actually you should not care much about me as I will not test anything much if I do not know exactly what is going on inside, I would prefer waiting, finding, reading and understanding the papers you mentioned
actually, the method is really very straightforward. I'll try to write a simple guide about the internal workflow and the rules, so people can suggest their own rules to make it better. I'll also release the source code once I clean it up a little bit.
Attachments
Refinements to the Ergonomic Model for Keyboard Fingering .pdf
(91.63 KiB) Downloaded 257 times

Nicholas
Posts: 12393

Post by Nicholas » 07-24-09 12:08 pm

TonE wrote:No, only Hanon 10 exercise, right hand part only.
(Without derailing this too much further, because what you guys are discussing is way more important, the best I can find is that some other application has the clipboard "open" so the copy can't go through. Don't you have AutoHotKey monitoring the clipboard? Although... if Synthesia can still write to it, this little app should be able to, too.)

tommai78101
Posts: 762

Post by tommai78101 » 07-24-09 1:34 pm

Nicholas wrote:(Without derailing this too much further, because what you guys are discussing is way more important, the best I can find is that some other application has the clipboard "open" so the copy can't go through. Don't you have AutoHotKey monitoring the clipboard? Although... if Synthesia can still write to it, this little app should be able to, too.)
On a scale of 1 to 100, how important is it? If it's over 80, please make a subforum for this active development in the Development Updates. If possible, you can promote the developers to be a part of the Synthesia work team, possibly Concept Division.

Nicholas
Posts: 12393

Post by Nicholas » 07-24-09 4:12 pm

tommai78101 wrote:On a scale of 1 to 100...
You derailed it further! :lol:

Gah, so did I just now!

Nicholas
Posts: 12393

Post by Nicholas » 07-24-09 9:57 pm

Alright, I have something to add this time for real. ;)

Regarding finding the right values for the (hundreds of) parameters: isn't that what genetic algorithms are really good at doing?

If we had some larger body of samples with known fingerings, (I wonder if the Hanon set is a diverse enough training set just by itself), you could start doing the generational approach with parameter searching. You would generate results, compare them to the known "optimal" (by-hand) fingerings, and spawn some slightly different "children" parameter experiments based on those parameters. Then you just cull the children that go too far off in the wrong direction.

We might find some convergences that make this a lot easier.
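
The generate, compare, spawn, cull loop can be sketched very compactly. The Python below is a (1+λ)-style evolutionary search, the simplest relative of a genetic algorithm (mutation only, no crossover). Everything in it is illustrative: `toy_score` stands in for "distance from the by-hand fingerings" and its target values are made up.

```python
import random

def evolve(score, start, generations=30, children=8, step=0.5, seed=0):
    """Mutate the current best parameters, score the children, keep only improvements.

    score(params) must return a number where lower is better.
    """
    rng = random.Random(seed)
    best, best_score = list(start), score(start)
    for _ in range(generations):
        # Spawn children that are small random variations of the current best.
        brood = [[p + rng.uniform(-step, step) for p in best] for _ in range(children)]
        challenger = min(brood, key=score)
        challenger_score = score(challenger)
        # Cull the whole brood unless its best member beats the parent.
        if challenger_score < best_score:
            best, best_score = challenger, challenger_score
    return best, best_score

def toy_score(params):
    """Stand-in for comparing predictions to known fingerings; the target is invented."""
    target = [2.5, 1.0, 4.0]
    return sum((p - t) ** 2 for p, t in zip(params, target))

start = [0.0, 0.0, 0.0]
params, final = evolve(toy_score, start)
print(final <= toy_score(start))
```

Because the parent survives unless a child beats it, the score can never get worse from one generation to the next. In a real run `score` would execute the fingering predictor on the training set, which is where almost all of the time would go.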

TonE
Synthesia Donor
Posts: 1180

Post by TonE » 07-24-09 10:15 pm

actually, the method is really very straightforward. I'll try to write a simple guide about the internal workflow and the rules, so people can suggest their own rules to make it better. I'll also release the source code once I clean it up a little bit.
Frost, thanks for sharing the "refinements paper"; I feel much better now after reading your text above. I would love to contribute wherever I can, as many others in this forum might as well. However, to join forces here, it would help a great deal if insights already gained were shared in a short and efficient way in the forum, so that followers of the topic do not have to repeat the same work. They could instead spend their energy on checking the steps made so far for errors and on taking further steps from that point. We would reduce redundant work.

So what does it mean in practice?

I would like to see:

1. A growing set of reference .mid files with "perfect" or official fingering for each note in that piece from various sources, e.g. Czerny, Hanon and more. The fingering information should be written into a separate .txt file, with one finger number per line or written continuously with spaces in between, as is the case in your AFP. This "testing set of phrases/midi files" is important for evaluating the current algorithm's performance as its input parameters change.

2. A list of the assumptions made and the requirements for anything we use in the algorithm. Best would be some sort of pseudo-code description in natural language, with comments added to that pseudo-code. Then any follower could see where each part is used in the algorithm and why it is there, along with the requirements and limitations of that part alone. For example, so far I do not know exactly what you are using from the mentioned papers, where, how, and to what degree, and each choice should come with an explanation of why we have to use it. Maybe we can just ignore it, or maybe we have to work it out in much more detail than the paper did, for whatever reason at the time: lack of time, lack of interest, lack of craziness, lack of original external ideas... :-) To give another example: until reading your question "Would you be interested in finding hand stretch scores between finger pairs? As the 2-5 example, but for all pairs ( 1-2, 1-3, 1-4, 1-5, 2-3, 2-4, 2-5, 3-4, 3-5 and 4-5) ?" I did not know that no such information is used anywhere in the algorithm yet, or that those papers had not answered this question sufficiently. The goal of the list in this point 2 would be exactly to prevent such situations.

3. A list of unsolved problems that are important for, or related to, the algorithm above. The same example from above could also be listed here. Unsolved problems should be described using input and output requirements. Here is one example: http://www.cs.indiana.edu/~epnichol/pag ... rawer.html

In the above problem of "finding right, not left, hand stretch scores between finger pairs" it could look like:

input:
a) starting finger number from a set of five numbers 1..5 (considering only right hand, meaning strong thumb = 1 is on the left, not always but most times),

b) the maximum distance in half steps, counted upward from the thumb position. E.g. I would set this to 14 for the pair 1-5, meaning from C to the D in the next higher octave, with automatic exclusion of the pressings that this boundary makes impossible. Example: thumb=1 is on C, finger 5 is on the D in the next higher octave. This is the maximum space we can control in the positive direction on the piano keyboard = +14 half steps from the thumb position. Now try this out: put your fingers like this on the piano keyboard. Where are your other three fingers (2,3,4) now? (We could consider their current locations as their natural initial points for stretch distance measurements.) Then try to reach the next D just after your thumb position with finger 2. Can you reach it? :-)

output:
List of start-end finger pairs, giving the following information for each, as shown here for the pair 1-5:
a) 1-5: values of start and end.
b) 0-14: maximum controlled space in the positive direction.
c) 0 (finger 1=start), 14 (finger 5=end), 12 (finger 4), 10 (finger 3), 8 (finger 2): initial relaxed locations for all fingers, which we can define here as "the optimum finger placements in the current range". So we are considering only the range between the lowest and highest used pitch. The direction of hand movement at that particular time is ignored here, although it could also influence the optimal placement, depending on whether the hand is moving right, moving left, or staying almost in the same place.
d) minimum of a yet-to-be-defined range: stretch value for that position

So we would need the following output list for the finger pair 1-5:
0-14: stretchvalue(0,14)
0-13: stretchvalue(0,13)
0-12: stretchvalue(0,12)
0-11: stretchvalue(0,11)
...
0-1: stretchvalue(0,1)
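
The shape of that output list can be sketched in Python as below. The scoring function is deliberately left pluggable: `placeholder` is an invented stand-in (the span as a fraction of the widest span), since the real stretch values are exactly the open problem described here and would have to be measured.

```python
def stretch_table(max_span, stretch_value):
    """Return rows (start, end, value) from the widest span down to one half step."""
    return [(0, end, stretch_value(0, end)) for end in range(max_span, 0, -1)]

def placeholder(start, end, max_span=14):
    """Made-up score, not a measurement: the span as a fraction of the widest span."""
    return (end - start) / max_span

# Finger pair 1-5 with the maximum of 14 half steps suggested above.
for start, end, value in stretch_table(14, placeholder):
    print(f"{start}-{end}: {value:.2f}")
```

Repeating the call for each of the ten finger pairs, each with its own maximum span and its own measured function, would give the full table.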

Nicholas
Posts: 12393

Post by Nicholas » 07-24-09 10:23 pm

All three of those are excellent, TonE. Your #1 point is exactly what I was talking about in my post. An increasingly large training set is going to be critical the further along this effort goes.

It also helps with "regression testing" like you alluded to. You can catch things like a new parameter actually hurting the quality of the generated fingerings.

The sooner we have a single button or command-line that will test against some huge set of data and return a single number (or maybe a handful) representing how well we did, well... the better.
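
As a starting point, such a "single number" could be as plain as the fraction of notes whose predicted finger matches the reference. A minimal Python sketch, assuming fingerings are stored as text with one finger per line or separated by spaces, as TonE proposed; the function names and the sample data are made up.

```python
def parse_fingering(text):
    """Read finger numbers written one per line or separated by spaces."""
    return [int(token) for token in text.split()]

def agreement(predicted, reference):
    """Fraction of notes where the predicted finger equals the reference finger."""
    if len(predicted) != len(reference):
        raise ValueError("fingerings must cover the same number of notes")
    if not reference:
        return 1.0
    hits = sum(p == r for p, r in zip(predicted, reference))
    return hits / len(reference)

def suite_score(cases):
    """Average agreement over (predicted_text, reference_text) pairs: one number per run."""
    scores = [agreement(parse_fingering(p), parse_fingering(r)) for p, r in cases]
    return sum(scores) / len(scores)

# Invented example data, not taken from any published fingering.
cases = [("1 2 4 3", "1 2 5 3"), ("1\n3\n5", "1\n3\n5")]
print(suite_score(cases))
```

Exact-match agreement is a blunt measure, because a different fingering can be just as playable, so a cost-based comparison might replace it later. It is still enough to catch a parameter change that makes the results clearly worse.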
