This is a list of mostly familiar words, phrases, and names, most of which have a weight or score from 1–100 assigned.
More accurately, the "list" is actually a database containing words and phrases along with a bit of metadata, and I periodically export lists to text files using various parameters.
Though others may find some utility in it, it was developed with constructors of word puzzles (specifically crossword constructors) in mind.
As such, entries have been selected for inclusion and scored based on their usefulness and desirability for word puzzle construction, so the makeup of this list is somewhat different from that of other, general-purpose wordlists.
For instance, in addition to a reasonably complete set of standard, unobscure English "dictionary words," the list includes: proper nouns, phrasal verbs, idioms and conversational phrases, texting shorthand, prefixes and suffixes, partial phrases such as "I let," Roman numerals, foreign words and phrases relevant to English speakers, and anything else allowed by standard crossword construction rules.
This has been a work in progress for roughly a decade now, and is composed of entries typed in by me, scraped from various lists (important movies, Billboard hit songs and artists, world capitals, etc.), and harvested from thousands of puzzles from dozens of sources both indie and mainstream.
Special thanks to Joe Krozel, wordlist curator and constructor par excellence, for helping me clean up and rescore a significant number of junk entries.
I have always prioritized comprehensiveness, so anything reasonably legitimate that may somehow, some day prove useful or interesting for any word-related endeavours has been kept. Also, the scope and quantity of the things that have been entered over the years pretty much guarantee that undesirable things have snuck in. It certainly contains a wealth of useless dreck including - but not limited to - esoterica, misspellings, and horrifically offensive terms that should never see the light of day. In fact, I have found that although having a large list at my disposal has helped immensely with algorithmic theme generation and has by and large enabled me to fill crossword grids with more interesting content, it has not sped up the construction process, since the amount of trash to wade through grows in proportion to the size of the list.
I make no guarantees about the legitimacy, correctness, or usefulness of the contents of this list at any given time. I take no responsibility for misspelled words or names (or straight-up nonsense entries) that end up in a puzzle, offensive or triggering words that you may come across, scoring choices that may cause you frustration, or any other misfortune that may befall a user of this list.
That said, if you come across anything that you feel should be fixed or removed entirely, do let me know. If you'd like to get even more involved in the ongoing curation of the list, also let me know that and we'll figure something out.
You are free to use this list in any way you'd like. This includes commercial uses, though I'd appreciate it if you didn't just turn around and try to sell it as is (but I mean, I'll still offer it for free to anyone so that wouldn't really be a smart business venture anyway). If you do use the list to assist with a commercial venture or even if it just helps get you out of a tricky corner when constructing a puzzle, let me know! This is in no way any sort of licensing obligation; I just like to hear about awesome things that people are doing and would love to know that I've helped make them happen in some small way.
My contact information can be found at the bottom of this page.
Of the total entries, have been marked as usable and included in the lists shared below.
The set of entries marked as "unusable" includes which have been flagged as "theme entries" and are likely unusable outside of a themed puzzle similar to the one they were scraped from (e.g. THYMEOUT from a herb puns puzzle, TOTBELLY from a puzzle involving changing P to T, and TWHEELER most likely from a number rebus puzzle). I haven't included these in the lists shared here, but email me if you'd like these as well.
The rest of the unusable entries are marked as such due to obscurity or offensiveness (or, perhaps in rare cases, both). Note that there are still a number of obscure and/or offensive/vulgar/crass entries in the lists shared below; the "unusable" flag is reserved for only the most extreme examples (e.g. while some profanity and sexual slang is included in the lists, I have made an effort to keep racial slurs out).
Although I now enter new additions using full spacing, upper/lower case, and punctuation, only about % of the list ( entries) is in this format. But I convert everything to ALLCAPSNOSPACESNOPUNCTUATION when I export it anyway since this is how it would be used in crossword constructing software. Let me know if you'd prefer the list to include the full text versions of the entries and I can get you that one instead. In either case there will be some entries that include digits (e.g. 9TO5, APOLLO13, etc.), though there shouldn't be any that include other special characters in all but the "full text" lists below.
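The conversion from full text to grid text described above can be sketched in a few lines of Python. This is only an illustration of the idea, not necessarily how the export is actually done: the function name `to_grid_text` is hypothetical, and folding accented letters to their base letter is an assumption.

```python
import unicodedata


def to_grid_text(entry: str) -> str:
    """Convert a full text entry to grid text: uppercase letters and digits only."""
    # Assumption: accented letters are folded to their unaccented base letter.
    # NFKD splits such letters into a base letter plus a combining mark.
    decomposed = unicodedata.normalize("NFKD", entry)
    # Keep only ASCII letters and digits; spaces, punctuation, and the
    # combining marks produced above are dropped.
    return "".join(ch for ch in decomposed.upper() if ch.isascii() and ch.isalnum())


if __name__ == "__main__":
    for text in ["Give me one example", "9 to 5", "Apollo 13"]:
        print(text, "->", to_grid_text(text))
```

Digits survive the conversion, which is why entries like 9TO5 and APOLLO13 appear in the grid text lists.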
The scoring is a bit all over the place, since I've changed my preferences and methodology over the years.
Also, only about % of entries (, to be exact) are scored (and that's not even entirely accurate because some of those scores were applied in bulk to entire classes of entries at once rather than by weighing each entry individually).
But note that 100% of the entries on the scored list have scores attached because I apply a default score to unscored entries when exporting.
A few notes on scoring:
- The cut-off point for general usefulness is 50. The 35-49 range is mostly things that I would only consider using in a pinch in a conventional crossword, and anything below that is likely unusable for almost any purpose. Yes, I know this distribution of scores makes no sense.
- Unscored entries have been given a default score of 50.
- Proper nouns tend to be scored pretty low, and lately I've been scoring these in the 50-60 range, depending on familiarity.
- The length of an entry is taken into account when scoring it. For instance, I favour common, simple vocabulary for short entries (generally scored around 85) whereas I favour noun and verb phrases, idioms, and especially unusual letter combinations for longer entries (these are scored anywhere from about 56 to 100, whereas dry, long words are scored at 50 or 51).
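If you want to act on the score bands from the notes above in your own scripts, something like the following Python sketch would do it. The function names, the band labels, and the sample entries and scores are all made up for illustration; only the thresholds (50 and 35) come from the notes.

```python
def usefulness(score: int) -> str:
    """Map a 1-100 score to the rough bands described in the scoring notes."""
    if score >= 50:
        return "generally useful"
    if score >= 35:
        return "only in a pinch"
    return "likely unusable"


def filter_entries(scored, min_score=50):
    """Keep only (entry, score) pairs at or above the chosen cut-off."""
    return [(entry, score) for entry, score in scored if score >= min_score]


if __name__ == "__main__":
    # Hypothetical entries and scores, not taken from the real list.
    sample = [("EXAMPLEONE", 85), ("EXAMPLETWO", 50), ("EXAMPLETHREE", 40), ("EXAMPLEFOUR", 20)]
    for entry, score in sample:
        print(entry, score, usefulness(score))
    print(filter_entries(sample))
```

Since unscored entries are exported with the default score of 50, a cut-off of 50 keeps them, while a cut-off of 51 would drop them along with everything else sitting exactly at the default.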
I offer unscored versions of the list as well, just in case you find my scores more of a hindrance than a help.
The grid text version contains all nontheme entries written in all capitals with no spaces or punctuation, and is suitable for use with construction software.
Each line contains one entry, and in the scored version this is followed by a semicolon and then a numerical score from 1-100, like this:
The full text version contains the same set of entries (and scores, in the scored version) but some are written with spaces, punctuation, and mixed case.
Additionally, since there are entries which contain semicolons, a double colon is used to separate the entry from the score in the scored version, like this:
Here's an example::57
...and, one more::62
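For anyone reading the scored files with their own code rather than construction software, here is a minimal Python sketch of a line parser for both formats. The function name and the `full_text` flag are hypothetical. It splits on the last separator in the line, so a full text entry that itself contains semicolons or colons still parses, and it strips the trailing `\r\n`.

```python
def parse_scored_line(line: str, full_text: bool = False) -> tuple[str, int]:
    """Split one line of a scored list into (entry, score).

    Grid text lines look like ENTRY;score and full text lines look like
    entry::score.
    """
    separator = "::" if full_text else ";"
    # rpartition splits on the LAST separator, so separators inside the
    # entry itself are left alone.
    entry, found, score = line.rstrip("\r\n").rpartition(separator)
    if not found:
        raise ValueError(f"no {separator!r} separator in line: {line!r}")
    return entry, int(score)


if __name__ == "__main__":
    print(parse_scored_line("Here's an example::57\r\n", full_text=True))
    print(parse_scored_line("...and, one more::62\r\n", full_text=True))
    print(parse_scored_line("GIVEMEONEEXAMPLE;50\r\n"))  # hypothetical score
```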
DOS/Windows line endings are used (i.e. \r\n) in all files, and to my knowledge no user has ever encountered a line ending-related problem when using these lists with construction software.
Each grid text list is available with either a .txt or .dict extension for convenience (since not all construction software recognizes the same file extensions).
However, these files differ in name/extension only; they are otherwise byte for byte identical.
Lastly, I update the master list often (at least once or twice a week), so check back regularly. Updates may include new additions, deletions, scoring changes, and additional "full text" data (i.e. changing an entry like GIVEMEONEEXAMPLE to "Give me one example" in the full text version).
Be aware, however, that if you have saved my list in the past and made your own changes through the wordlist interface in your construction software, importing the updated version of my list may overwrite your changes.
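One way to sidestep that problem is to keep your own changes in a separate scored file and merge the two outside of your construction software before importing. The Python sketch below assumes that workflow, which is not something the software or this list provides on its own; the function name and the sample entries and scores are hypothetical.

```python
def merge_lists(updated: dict[str, int], personal: dict[str, int]) -> dict[str, int]:
    """Combine an updated download with personal changes; personal scores win."""
    merged = dict(updated)   # start from the freshly downloaded list
    merged.update(personal)  # then re-apply your own scores and additions
    return merged


if __name__ == "__main__":
    # Hypothetical data standing in for two parsed scored lists.
    updated = {"EXAMPLEONE": 60, "EXAMPLETWO": 50}
    personal = {"EXAMPLETWO": 75, "MYOWNENTRY": 80}
    merged = merge_lists(updated, personal)
    # Write grid text lines with DOS/Windows line endings, as in the shared files.
    output = "".join(f"{entry};{score}\r\n" for entry, score in sorted(merged.items()))
    print(output, end="")
```

Note that this keeps your rescored and added entries but does not track deletions: an entry you removed locally comes back if it is still in the updated download.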
You can get in touch with me at:
[Breakfast ___ Tiffany's]
[Word after "polka"]