Al T

If you were to create a rhyming dictionary, how would you structure the database?


Hi,

 

I'm trying to think of the best structure for an offline rhyming dictionary database.

 

I'm willing to make it an open source offline dictionary, but I need some ideas on how best to build it without bogging it down or making it bloated.

 

Thank you in advance!

Edited by Al T


I'm not an expert on this, but I'd guess you have to structure the database around a phonetic representation of the words.

Have a look at this, for example:

https://www.rhymedb.com/

 

As for handling other languages, I assume a phonetic representation could cover several languages at once there too.

Edited by Rollo62


Maybe the Soundex algorithm would be a way to start? I have used it before to check for misspellings of names, since two words that sound the same will generate the same value. Unfortunately there are false positives, and you might want to drop the first character and use just the numeric value, but it might get fairly close (see https://en.wikipedia.org/wiki/Soundex). I wouldn't truncate at 3 digits; let it go all the way to maxint if necessary.

 

As for database layout, you could just store each word with its Soundex value (integer portion). When you need sounds-like words, just return all records with the same Soundex integer from your database.
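
A minimal sketch of that layout in Python, under a few assumptions: the names `soundex_digits` and `build_index` are made up for illustration, the leading letter is dropped so only the digits remain, and nothing is truncated. A plain dict stands in for the database table keyed by the integer.

```python
from collections import defaultdict

# Standard Soundex consonant classes; vowels and h, w, y carry no digit.
_CODES = {}
for digit, letters in enumerate(["bfpv", "cgjkqsxz", "dt", "l", "mn", "r"], start=1):
    for ch in letters:
        _CODES[ch] = str(digit)


def soundex_digits(word: str) -> int:
    """Digits-only, untruncated Soundex-style code (hypothetical helper)."""
    letters = [ch for ch in word.lower() if ch.isalpha()]
    if not letters:
        return 0
    # The first letter is not emitted, but it still suppresses an
    # immediately following letter of the same class, as in Soundex.
    prev = _CODES.get(letters[0], "")
    digits = []
    for ch in letters[1:]:
        code = _CODES.get(ch, "")
        if code and code != prev:
            digits.append(code)
        # h and w do not separate equal codes; vowels do.
        if ch not in "hw":
            prev = code
    return int("".join(digits)) if digits else 0


def build_index(words):
    """Group words by code; stands in for a table indexed on the integer."""
    index = defaultdict(list)
    for word in words:
        index[soundex_digits(word)].append(word)
    return index


index = build_index(["cable", "table", "label", "cobble", "night"])
print(index[soundex_digits("cable")])
```

With this word list, "cable", "table" and "label" share a code, but so does "cobble", which is the kind of false positive mentioned above.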

Edited by Steven Kamradt


I don't think Soundex is a good algorithm; I would certainly look for phonetic 


Maybe these resources are helpful too, with real pronunciation examples and theoretical background.

https://englishexplorations.check.uni-hamburg.de/basic-concepts-of-english-phonetics-and-pronunciation/

https://en.wikipedia.org/wiki/International_Phonetic_Alphabet

https://www.internationalphoneticassociation.org/IPAcharts/IPA_chart_orig/pdfs/IPA_Kiel_2020_full.pdf

 

If you store such IPA transcriptions in the DB, then maybe it's possible to find and run a Soundex-like algorithm designed for them.

I doubt that the original Soundex for the usual alphabet would help here, but it's worth a try.

Edited by Rollo62


Forget Soundex. Soundex is actually LOSSY on vowels: it throws them away, and rhymes depend on vowel sounds. It would be counter-productive (at least for English). A modified Metaphone algorithm would probably serve better.

 

First off, what language? In Finnish you could probably just reverse the strings and do a simple SQL LIKE. Hungarian too. Perhaps.
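
A rough sketch of that idea with Python's built-in `sqlite3`, assuming spelling maps closely enough to sound that a shared ending means a rhyme. The table layout, the function names and the `tail_len` parameter are all assumptions for illustration, and the sample words are just a tiny Finnish test list.

```python
import sqlite3


def build_db(words):
    """Store each word next to its reversed spelling (hypothetical layout)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE words (word TEXT PRIMARY KEY, reversed TEXT NOT NULL)"
    )
    conn.execute("CREATE INDEX idx_reversed ON words (reversed)")
    conn.executemany(
        "INSERT INTO words VALUES (?, ?)", [(w, w[::-1]) for w in words]
    )
    return conn


def ending_matches(conn, word, tail_len=3):
    """Words sharing the last `tail_len` letters, via a prefix LIKE."""
    # Reversing turns "same ending" into "same beginning", so the
    # pattern only needs a trailing wildcard.
    tail = word[-tail_len:][::-1]
    rows = conn.execute(
        "SELECT word FROM words WHERE reversed LIKE ? AND word <> ? ORDER BY word",
        (tail + "%", word),
    )
    return [row[0] for row in rows]


conn = build_db(["talo", "valo", "kala", "pala", "sana"])
print(ending_matches(conn, "talo"))
print(ending_matches(conn, "kala"))
```

A trailing-wildcard pattern is the kind of query an index on the reversed column can help with, though whether it is actually used depends on the engine and its collation settings.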

English? Not so simple. You need a serious ORTHOGRAPHY to PHONETICS (SAMPA or another representation) /database/. English never had a spelling reform, so the mapping from orthography to phonetics is illogical. Impossible to "calculate". Retrieve and receive? ie, ei... And loads of other stuff.

Such a database might be possible to "scrape" from sources. Oxford dictionary of modern English, perhaps?

You would probably also need the orthographic "lemma" forms, all converted to a phonetic representation.

 

Your biggest road-block would definitely NOT be the RDBMS structure.

 

I would start by buying my NLP-doctor-friend at the royal university of tech a couple of beers.

 

Actually, when thinking about this, maybe this is one of the avenues where it is actually prudent to research the use of ML.

Edited by Bounceback


Since it's about rhymes, the phonetic abstraction of the trailing parts of the words is probably the interesting bit, and maybe that can more or less be looked up in a DB.

Playing around with this tool, it turns text like:

Quote

a cable
laying on this table

into

Quote

ə ˈkeɪbl
ˈleɪɪŋ ɒn ðɪs ˈteɪbl

 

But you're right, it's hard to predict the general outcome without deeper knowledge of semantics, orthography, etc.

On the other hand, isn't phonetic transcription all about the phonetic representation of spoken words?

So it would be my first candidate for the job.
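
To make that concrete, here is a small Python sketch that derives a rhyme key from the IPA transcriptions quoted above. It assumes the common working definition that two words rhyme when they match from the last stressed vowel onward. The vowel set is a deliberately small subset of IPA, and `rhyme_key` and `rhymes_for` are hypothetical names.

```python
from collections import defaultdict

# Small, incomplete set of IPA vowel symbols; an assumption for this sketch.
IPA_VOWELS = set("aeiouyæɑɒɐəɛɜɪɔʊʌ")


def rhyme_key(ipa: str) -> str:
    """Part of the transcription from the last stressed vowel onward."""
    # Start after the last primary stress mark; if there is none,
    # rfind returns -1 and the whole word is considered.
    start = ipa.rfind("ˈ") + 1
    tail = ipa[start:]
    for i, ch in enumerate(tail):
        if ch in IPA_VOWELS:
            return tail[i:]
    return tail


# Transcriptions taken from the example in this post.
lexicon = {"cable": "ˈkeɪbl", "table": "ˈteɪbl", "laying": "ˈleɪɪŋ"}

# The key would be an indexed column in a real database; a dict stands in here.
by_key = defaultdict(list)
for word, ipa in lexicon.items():
    by_key[rhyme_key(ipa)].append(word)


def rhymes_for(word: str):
    """Other words in the lexicon sharing the same rhyme key."""
    return [w for w in by_key[rhyme_key(lexicon[word])] if w != word]


print(rhymes_for("cable"))
```

The lookup itself is then a single equality match on the key, so the hard part stays where it was said to be: getting reliable transcriptions into the table in the first place.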

 

Edited by Rollo62

