// These are format-agnostic transformations applied to the plain-text string
// returned by Tika regardless of original file type (PDF, DOC, DOCX, etc.).
// Empty ...