Add function to turn html to text #708

@TobiasNx

In the context of OERSI (see https://gitlab.com/oersi/oersi-etl/-/issues/360), we need a Fix function that strips all HTML tags from a text and removes HTML-encoded special characters.

One idea for the implementation:

    html_to_text {
        @Override
        public void apply(final Metafix metafix, final Record record, final List<String> params, final Map<String, String> options) {
            // Parse the field value as HTML and keep only its text content (jsoup decodes entities while parsing)
            record.transform(params.get(0), s -> Jsoup.parse(s).wholeText());
        }
    },

Based on the idea here: https://stackoverflow.com/questions/3607965/how-to-convert-html-text-to-plain-text
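
To illustrate what the proposed call does outside of Metafix, here is a minimal standalone sketch. It assumes jsoup is on the classpath; the class name `HtmlToTextSketch` and the helpers `htmlToText` and `htmlToNormalizedText` are hypothetical and only exist for this illustration. It also contrasts `wholeText()` with `text()`, since the two treat whitespace differently.

```java
import org.jsoup.Jsoup;

public class HtmlToTextSketch {

    // Hypothetical helper mirroring the lambda from the proposal:
    // parse the input as HTML and return the unnormalized text of all text nodes.
    static String htmlToText(final String html) {
        return Jsoup.parse(html).wholeText();
    }

    // Hypothetical alternative: text() additionally collapses runs of
    // whitespace into single spaces and trims the result.
    static String htmlToNormalizedText(final String html) {
        return Jsoup.parse(html).text();
    }

    public static void main(final String[] args) {
        // Tags are dropped and the entity &amp; is decoded to "&".
        System.out.println(htmlToText("<p>Fish &amp; <b>Chips</b></p>"));

        // wholeText() keeps the line break and indentation from the source,
        // text() turns them into a single space.
        final String multiline = "<p>Line one\n   line two</p>";
        System.out.println(htmlToText(multiline));
        System.out.println(htmlToNormalizedText(multiline));
    }
}
```

Which of the two variants fits better probably depends on whether line breaks in the source HTML should survive in the record; that choice is left open here.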

Not sure whether we should wait for #706.

Status: Backlog