Remove the Salutations

Retina, 72.8% (old test battery), 77.5% (new test battery); previously 68% (old) and 74.8% (new)

i`^h(a[iy]|eya?|i(h?i|ya|)|ello)[ ,]+

T`l`L`^.

Try it online!

Edit: Gained 4.8% (old battery) and 2.7% (new battery) coverage with help from @MartinEnder's tips.


GNU sed, 100% (previously 78%)

/^\w*[wd]\b/!s/^[dghs][eruaio]\w*\W\+//i
s/./\U&/

(49 bytes)

The test battery is quite limited. We can count which words appear first on each line:

$ sed -e 's/[ ,].*//' inputs.txt | sort | uniq -ic
 40 aight
 33 alright
 33 dear
 33 g'd
 41 good
 36 greetings
 35 guys
 31 hai
 33 hay
 27 hello
 33 hey
 37 heya
 43 hi
 34 hihi
 29 hii
 35 hiya
 45 hola
 79 how
 37 howdy
 33 kowabunga
 39 salutations
 32 speak
 34 sweet
 40 talk
 36 wassup
 34 what's
 38 yo
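
For readers without a shell at hand, the same tally can be sketched in Python. This is only an illustration: the function name first_word_counts and the sample lines are made up here, and lower-casing stands in for the case-insensitive grouping that uniq -i performs.

```python
import re
from collections import Counter

def first_word_counts(lines):
    """Count the first word of each line, ignoring case."""
    firsts = []
    for line in lines:
        line = line.rstrip("\n")
        # Same idea as s/[ ,].*// : cut the line at the first space or comma.
        first = re.sub(r"[ ,].*", "", line, count=1)
        firsts.append(first.lower())
    return Counter(firsts)

# Hypothetical sample, not the real inputs.txt.
sample = ["Hello, world", "hello there", "How are you", "Yo"]
print(sorted(first_word_counts(sample).items()))
# [('hello', 2), ('how', 1), ('yo', 1)]
```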

The salutations to be removed begin with d, g, h or s (or the uppercase versions thereof); the non-salutations beginning with those letters are:

 33 g'd
 41 good
 79 how
 32 speak
 34 sweet

Ignoring lines where they appear alone, that's up to 219 potential false positives (33 + 41 + 79 + 32 + 34). As a starting point, let's just remove initial words beginning with any of those four letters.

When a line starts with a word beginning with any of those letters (^[dghs]\w*), matched case-insensitively (the i flag) and followed by at least one non-word character (\W\+), we replace that match with the empty string. Then we replace the first remaining character with its uppercase equivalent (s/./\U&/).

That gives us

s/^[dghs]\w*\W\+//i
s/./\U&/
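
To see what this first cut does to individual lines, here is a rough Python restatement of the two commands. It is a sketch rather than an exact port: the function name is invented for illustration, and Python's \w is Unicode-aware, whereas sed's \w depends on the locale.

```python
import re

def strip_salutation_basic(line):
    """Approximate the first-cut sed script on one line (no trailing newline)."""
    # s/^[dghs]\w*\W\+//i : drop a leading word that starts with d, g, h or s,
    # together with the non-word characters that follow it.
    line = re.sub(r"^[dghs]\w*\W+", "", line, count=1, flags=re.IGNORECASE)
    # s/./\U&/ : uppercase the first remaining character.
    return line[:1].upper() + line[1:]

print(strip_salutation_basic("hello, world"))       # World
print(strip_salutation_basic("how are you today"))  # Are you today
print(strip_salutation_basic("Hi"))                 # Hi
```

The second call shows the false-positive problem: how is not a salutation, yet it is stripped. The third shows that a word standing alone is left in place, because \W\+ needs at least one non-word character after it.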

We can now refine this a bit:

  • The largest set of false-positives is how, so we make the substitution conditional by prefixing with a negative test:

    /^[Hh]ow\b/!
    
  • We can also filter on the second letter, to eliminate g'd, speak and sweet:

    s/^[dghs][eruaio]\w*\W\+//i
    
  • That leaves only good as a false positive. We can generalise the prefix test to skip lines whose first word ends in either w or d, which covers both how and good:

    /^\w*[wd]\b/!
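
Putting the three refinements together, the final script behaves roughly like the following Python sketch. The names GUARD, SALUTATION and strip_salutation are made up for illustration, and the sample lines are not taken from the test battery. As in the sed version, the guard is case-sensitive while the substitution is not.

```python
import re

# /^\w*[wd]\b/ : the first word ends in a lowercase w or d (how, good, ...).
GUARD = re.compile(r"^\w*[wd]\b")
# s/^[dghs][eruaio]\w*\W\+//i : the leading word starts with d, g, h or s,
# and its second letter is one of e, r, u, a, i, o.
SALUTATION = re.compile(r"^[dghs][eruaio]\w*\W+", re.IGNORECASE)

def strip_salutation(line):
    """Approximate the refined sed script on one line (no trailing newline)."""
    if not GUARD.search(line):  # the sed '!' negates the address
        line = SALUTATION.sub("", line, count=1)
    return line[:1].upper() + line[1:]

for text in ["howdy partner", "how are you", "good morning",
             "g'd day", "speak up", "dear sir, hello"]:
    print(strip_salutation(text))
# Partner
# How are you
# Good morning
# G'd day
# Speak up
# Sir, hello
```

Note that howdy is still removed: it contains how, but neither the w nor the d is followed by a word boundary, so the guard does not fire.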
    

Demonstration

$ diff -u <(./123478.sed inputs.txt) replaced.txt | grep ^- | wc -l
0
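
The percentages in the answer headings can be reproduced along similar lines. The following is a minimal sketch, assuming the score is simply the share of output lines that equal the expected lines; the score function and the toy data are hypothetical and not part of the challenge's tooling.

```python
def score(transform, inputs, expected):
    """Return the percentage of lines where transform(line) equals the expected line."""
    if len(inputs) != len(expected):
        raise ValueError("inputs and expected must have the same number of lines")
    if not inputs:
        return 100.0  # nothing to get wrong
    hits = sum(transform(src) == want for src, want in zip(inputs, expected))
    return 100.0 * hits / len(inputs)

# Toy data, not taken from the real test battery.
inputs = ["hello, world", "how are you", "yo"]
expected = ["World", "How are you", "Yo"]

def naive(line):
    # A deliberately weak transform: capitalise the first letter only.
    return line[:1].upper() + line[1:]

print(round(score(naive, inputs, expected), 1))  # 66.7
```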

PHP, 60.6%

50 Bytes

<?=ucfirst(preg_replace("#^[dh]\w+.#i","",$argn));

Try it online!

PHP, 59.4%

49 Bytes

<?=ucfirst(preg_replace("#^h\w+,? #i","",$argn));

Try it online!

PHP, 58.4%

50 Bytes

<?=ucfirst(preg_replace("#^[gh]\w+.#i","",$argn));

Try it online!