General method to trim non-printable characters in Clojure
I believe you are referring to so-called non-printable characters. Based on this answer for Java, you could pass the regular expression #"\p{C}" as the pattern to clojure.string/replace:
(defn remove-non-printable-characters [x]
  (clojure.string/replace x #"\p{C}" ""))
However, this will also remove line breaks such as \n. To keep those characters, we need a more complex regular expression:
;; \p{C} intersected with "not whitespace": whitespace such as \n is kept
(defn remove-non-printable-characters [x]
  (clojure.string/replace x #"[\p{C}&&[^\s]]" ""))
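This function removes non-printable characters while keeping whitespace. Let's test it with a string containing an invisible character (written here as the escape \u200B, a zero-width space, so that it survives copy-and-paste):
(= "sam\u200Bple" "sample")
;; => false
(= (remove-non-printable-characters "sam\u200Bple")
   (remove-non-printable-characters "sample"))
;; => true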
(remove-non-printable-characters "sam\nple")
;; => "sam\nple"
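A few more checks, as a minimal sketch. The helper name `strip-non-printable` is hypothetical, and the pattern uses the nested-class form of intersection, `[\p{C}&&[^\s]]`, as described in the `java.util.regex.Pattern` documentation. All invisible characters are written as `\u` escapes so they stay visible in the source.

```clojure
(require '[clojure.string :as str])

;; Hypothetical helper for illustration: drop characters in Unicode
;; category C unless Java's \s also classifies them as whitespace.
(defn strip-non-printable [s]
  (str/replace s #"[\p{C}&&[^\s]]" ""))

(strip-non-printable "a\u0000b")   ; NUL (control) is removed
;; => "ab"
(strip-non-printable "a\u200Bb")   ; zero-width space (format) is removed
;; => "ab"
(strip-non-printable "a\tb\r\nc")  ; tab, CR and LF are whitespace, so kept
;; => "a\tb\r\nc"
```

Note that this keeps everything `\s` matches, not only `\n`. If only line breaks should survive, the excluded set could be narrowed accordingly.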
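The \p{C} pattern, which matches characters in the Unicode general category "Other" (such as control and format characters), is discussed here.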
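To get a rough feel for what `\p{C}` covers, the sketch below probes three of its subcategories. The sample characters and the `category-samples` map are chosen purely for illustration and are not taken from the linked discussion.

```clojure
;; Sample characters chosen for illustration, one per subcategory.
(def category-samples
  {"Cc (control)"     "\u0007"   ; BEL
   "Cf (format)"      "\u200B"   ; zero-width space
   "Co (private use)" "\uE000"}) ; first private-use code point

;; Print whether each sample is matched by \p{C}.
(doseq [[label ch] category-samples]
  (println label "matched by \\p{C}:" (boolean (re-matches #"\p{C}" ch))))

;; Ordinary letters, digits and spaces are not in category C.
(re-find #"\p{C}" "plain text 123")
;; => nil
```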