Extract hashtag from a string

Posted: 04 Feb 2020 10:06
by tsolignani
Good morning everyone.

Say I have a string like this: «good morning #everyone, have a #nice day».

I would like to extract all hashtags, meaning all words prepended with «#».

How do I do that? I can search for the # character, but then I need to get the whole word, up to the next white space, comma, full stop or other delimiter.

Thank you.

Re: Extract hashtag from a string

Posted: 04 Feb 2020 12:32
by yogi108
Hi,
You can try the following:
findAll('good morning #everyone, have a #nice day', '(?smi)(#\\w*)')
--> [#everyone, #nice]
I'm not a regex expert. You can also try it in the script Regex tester; there is a short help tutorial there as well.
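
For anyone who wants to check the pattern outside the app, here is a rough Python sketch of the same idea. It is not Automagic script, and the variable names are made up for illustration. It also shows one thing to watch for: because `\w*` allows zero word characters, a lone `#` counts as a match, while `\w+` requires at least one character after the `#`.

```python
import re

text = "good morning #everyone, have a #nice day"

# Same idea as the pattern above: '#' followed by zero or more word characters.
tags = re.findall(r"#\w*", text)
print(tags)  # ['#everyone', '#nice']

# Edge case: with \w* a bare '#' is also returned.
sample = "a # b #tag"
loose = re.findall(r"#\w*", sample)   # ['#', '#tag']
# With \w+ at least one word character must follow the '#'.
strict = re.findall(r"#\w+", sample)  # ['#tag']
print(loose, strict)
```

If a bare `#` can occur in your input, switching to `\w+` is probably the safer choice.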

Regards
Fritz

Re: Extract hashtag from a string

Posted: 04 Feb 2020 17:25
by tsolignani
It does work! Thank you very much 👍

Re: Extract hashtag from a string

Posted: 05 Feb 2020 18:54
by Desmanto
I tested a bit and thought someone might have already done it, so I googled and found this on Stack Overflow: https://stackoverflow.com/questions/385 ... -semicolon

I use the answer from garyh.

text = "Here's a #hashtag and here is #not_a_tag; which should be different. Also testing: Mid#hash. #123 #!@£ and <p>#hash</p>";

find = findAll(text, '(?<=[\\s>])#(\\d*[A-Za-z_]+\\d*)\\b(?!;)', true);
hashtag = newList();
for(i in find)
  addElement(hashtag, i[1]);
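
To see what this stricter pattern keeps and rejects, here is a rough Python port of the same regex. It is only a sketch for checking the behaviour, not Automagic script, and the names `pattern`, `hashtags` and `at_start` are made up for illustration. Only the capture group (the tag without the `#`) is returned, which corresponds to taking `i[1]` in the loop above.

```python
import re

text = ("Here's a #hashtag and here is #not_a_tag; which should be different. "
        "Also testing: Mid#hash. #123 #!@£ and <p>#hash</p>")

# '#' must follow whitespace or '>', the tag needs at least one letter or
# underscore, and a tag directly followed by ';' is rejected.
pattern = r"(?<=[\s>])#(\d*[A-Za-z_]+\d*)\b(?!;)"

# With a single capture group, findall returns just the group contents.
hashtags = re.findall(pattern, text)
print(hashtags)  # ['hashtag', 'hash']

# The lookbehind needs a character before '#', so a tag at the very
# start of the string is skipped.
at_start = re.findall(pattern, "#start and #end")
print(at_start)  # ['end']
```

So `#not_a_tag;`, `Mid#hash`, `#123` and `#!@£` are all skipped. If the text can begin with a hashtag, the lookbehind would need to be adjusted, for example to also accept the start of the string.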

Re: Extract hashtag from a string

Posted: 05 Feb 2020 21:04
by tsolignani
Very interesting, thank you!