The ten hundred most common words
PowerShell v3+, 105 92 bytes
param($a,$b)$x=@();-split($b-replace"[^a-zA-Z']",' ')|%{if($_-notin$a){$x+=$_}};($x,1)[!$x]
Takes simple words like $a, and words like $b. Makes helper $x. Take each word in $b and get rid of any bad not letters, then check each one |%{...}. If that word is not in $a, then we add it to $x. At the end, we choose $x or 1 by not $x. That is sent out, either words or 1.
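For readers who do not read PowerShell, the same steps can be sketched in Python. This is a re-telling rather than the submitted code: the function and variable names are made up for illustration, and the explicit lowercasing stands in for the fact that PowerShell's -notin ignores case.

```python
import re

def not_simple_words(simple_words, text):
    # Hypothetical Python mirror of the PowerShell steps above.
    # -notin ignores case in PowerShell, so lowercase both sides here.
    simple = {w.lower() for w in simple_words}
    # Same character class as the PowerShell -replace: keep letters and '.
    cleaned = re.sub(r"[^a-zA-Z']", " ", text)
    # Collect every word that is not in the simple list (the role of $x).
    found = [w for w in cleaned.split() if w.lower() not in simple]
    # ($x,1)[!$x]: index a two-item tuple with a true/false value.
    # An empty list gives index 1 (the value 1); otherwise index 0 (the words).
    return (found, 1)[not found]
```

Indexing a pair with a boolean is the same trick the PowerShell code uses to pick between the collected words and 1 without writing an if/else.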
Some words to try
PS C:\Tools\Scripts\golfing> ('This returns "Hello, World!"','tHiS rEtUrNs TrUe...','Thing Explainer is a book written by a man.
The man writes books with simple words.','This set of stuff "¤!^¤>7\ must return true'|%{"$_";(.\ten-hundred-most-common-words.ps1 (gc .\ten-hundred-most-common-words.txt) $_)})-join"`n###`n"
This returns "Hello, World!"
###
1
###
tHiS rEtUrNs TrUe...
###
1
###
Thing Explainer is a book written by a man.
The man writes books with simple words.
###
1
###
This set of stuff "¤!^¤>7\ must return true
###
1
PS C:\Tools\Scripts\golfing> ("This code doesn't returns Hello, World!",'tHiS rEtUrN"s false...'|%{"$_`n***`n"+(.\ten-hundred-most-common-words.ps1 (gc .\ten-hundred-most-common-words.txt) $_)})-join"`n###`n"
This code doesn't returns Hello, World!
***
code
###
tHiS rEtUrN"s false...
***
s false
Python, 93 bytes
import re
lambda w,s:[w for w in re.sub("[^'\w]|\d|_",' ',w).split()if w.lower()not in s]or 1
All test cases are at ideone
Preprocessing of the list is to split on | and put it in a set (which I imagine is fine if pre-sorting is allowed). Input words as w and the set as s.
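A minimal sketch of that preprocessing and a call, assuming the word list arrives as one |-separated string. The short raw_list below is a made-up excerpt, not the real list, and the regex is written as a raw string here only to avoid escape warnings on newer Python versions.

```python
import re

# The golfed function, bound to a name so it can be called.
f = lambda w, s: [w for w in re.sub(r"[^'\w]|\d|_", ' ', w).split() if w.lower() not in s] or 1

# Hypothetical excerpt of the word list; the real list is much longer.
raw_list = "a|book|by|is|man|simple|the|with|words|writes"

# Preprocessing: split on | and put the words in a set.
simple = set(raw_list.split("|"))

print(f("The man is simple.", simple))                       # every word is in the set
print(f("The man writes books with simple words.", simple))  # "books" is not in this excerpt
```

The first call returns 1 because the list comprehension is empty and `[] or 1` falls through to 1; the second returns the list of words missing from the set.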
If that's not allowed this becomes 98 bytes with not in s becoming not in set(s).
We could also preprocess the set to hold every upper/lower-case variant of each word and save 8 bytes by dropping .lower(), but I think that might be going too far (that would be a huge set).
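To make "huge" concrete, here is a rough sketch of what that preprocessing would involve. The helper name case_variants is hypothetical; it only illustrates that a word with n letters has 2^n spellings.

```python
from itertools import product

def case_variants(word):
    # Each letter offers two choices (lower, upper); a character such as '
    # has identical lower and upper forms, so it offers only one.
    options = [{c.lower(), c.upper()} for c in word]
    # Cartesian product of the per-character choices gives every spelling.
    return {"".join(p) for p in product(*options)}

print(len(case_variants("the")))         # 2**3 spellings
print(len(case_variants("don't")))       # 2**4 spellings, the apostrophe adds none
print(len(case_variants("understand")))  # 2**10 spellings for one ten-letter word
```

Since the count doubles with every letter, the longer words dominate the size of the preprocessed set.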