Get DeepFormants working again by iskunk · Pull Request #9 · MLSpeech/DeepFormants

iskunk · 2020-01-17T20:19:18Z

* Minor syntax tweaks to make the code Python 3 compatible
* Fixes for various NumPy warnings/errors, either due to use of "float" where "int" is required, or domain errors on log functions
* Replaced the use of the obsolete Python-2-only scikits.talkbox library with a compatible LPC implementation from the Conch project
* Documentation update to indicate that an old version of "rnn" is required
* Invoke Lua scripts via "luajit" directly, instead of going through the "th" frontend (to reduce the dependency footprint)
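
For readers unfamiliar with the two classes of NumPy problem mentioned above, here is a minimal sketch of how they typically arise and get fixed when porting to Python 3. The function names `frame_count` and `safe_log` and the `floor` value are hypothetical and are not taken from this PR's diff.

```python
import numpy as np

def frame_count(num_samples, frame_len, hop):
    # Under Python 3, "/" always returns a float, which NumPy rejects as
    # an array size or index. "//" keeps the result an int.
    return 1 + (num_samples - frame_len) // hop

def safe_log(x, floor=1e-10):
    # Clamp to a small positive floor (an assumed value) so that zeros do
    # not trigger divide-by-zero warnings or produce -inf.
    return np.log(np.maximum(x, floor))
```

Whether a floor or some other guard suits a given feature is a judgment call; the point is only that the input to the log is kept strictly positive.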


iskunk · 2020-01-17T21:03:30Z

(Pushed again to fix a minor goof in the README.)

This PR should address issues #3 (in part), #5, and #7.

I would appreciate an especially careful review of my changes to the extract_features.py file, as I'm not completely certain that I didn't fudge up the math. I did, however, find that the librosa implementation of LPC, which was the most obvious successor of the old Talkbox library, gave quite different results. (Talkbox includes a test_lpc.py file which was helpful in determining this.) Thankfully, this thread led me to a compatible (if Python-only) implementation from the Conch project that appears to do the trick.
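
As background for reviewers: one plausible reason for the mismatch is that librosa's `lpc` is documented as using Burg's method, whereas Talkbox's LPC, as far as I know, used the autocorrelation method with Levinson-Durbin recursion. The sketch below illustrates that autocorrelation approach in plain NumPy. It is not the Conch code used in this PR, the name `lpc_autocorr` is hypothetical, and it assumes a signal with nonzero energy.

```python
import numpy as np

def lpc_autocorr(signal, order):
    """Return (a, err) with a[0] == 1, using the autocorrelation method."""
    x = np.asarray(signal, dtype=float)
    n = len(x)
    # Autocorrelation for lags 0..order
    r = np.array([np.dot(x[:n - k], x[k:]) for k in range(order + 1)])
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0]  # assumed nonzero (signal is not all zeros)
    for i in range(1, order + 1):
        # Reflection coefficient for order i
        acc = r[i] + np.dot(a[1:i], r[i - 1:0:-1])
        k = -acc / err
        # Update a[1..i-1] from the previous-order coefficients
        prev = a.copy()
        a[1:i] = prev[1:i] + k * prev[i - 1:0:-1]
        a[i] = k
        # Prediction error shrinks by (1 - k^2) at each order
        err *= (1.0 - k * k)
    return a, err
```

A quick sanity check is to compare `a[1:]` against a direct solve of the Toeplitz normal equations for the same autocorrelation values.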

More work is needed, of course. First, the tracking model needs to be rebuilt using a current version of rnn, to completely resolve #3. Second, the memory usage is out of control and needs to be addressed.

I tested my changes with two speech files; one was 19 seconds long, the other 67 seconds. I ran DeepFormants on a multiprocessor system (Intel Xeon, no GPU) with 48 GB RAM. In both cases, the feature-extraction stage took a while to run, presumably due to the pure-Python replacement LPC implementation. No big deal. But the second stage, when Torch is invoked... the small file led to a peak memory usage of 33 GB. It didn't take particularly long, which makes me suspect all that memory was allocated but hardly used. With the large file, the usage got up to 50 GB, and once it was clear that swapping was slowing the program down to a crawl, I terminated the run.
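
For anyone who wants to reproduce the memory figures, peak usage of a child process can be measured roughly as follows. This is a sketch for Unix-like systems only, using the standard `resource` module. The helper name `run_and_report_peak` is hypothetical, and the command to run (for example, the luajit invocation) is left to the caller.

```python
import resource
import subprocess
import sys

def run_and_report_peak(cmd):
    """Run cmd to completion; return (exit_code, peak_rss_bytes)."""
    proc = subprocess.run(cmd)
    # Largest resident set size among child processes waited for so far
    peak = resource.getrusage(resource.RUSAGE_CHILDREN).ru_maxrss
    # ru_maxrss is reported in kilobytes on Linux and in bytes on macOS
    peak_bytes = peak if sys.platform == "darwin" else peak * 1024
    return proc.returncode, peak_bytes
```

Note that this reports resident memory, so it would help distinguish memory that is actually touched from memory that is merely allocated.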

iskunk mentioned this pull request Jan 17, 2020

Is this only for Python 2? #7

Open

iskunk force-pushed the revamp branch from bd9d6e9 to e759570 Compare January 17, 2020 20:57

iskunk wants to merge 1 commit into MLSpeech:master from iskunk:revamp
