Extract hashtag from a string
Posted: 04 Feb 2020 10:06
by tsolignani
Good morning everyone.
Say I have a string like this: «good morning #everyone, have a #nice day».
I would like to extract all hashtags, meaning all words prefixed with «#».
How do I do that? I can search for the # character, but then I need to get the whole word, up to the next white space, comma, full stop, or whatever follows.
Thank you.
Re: Extract hashtag from a string
Posted: 04 Feb 2020 12:32
by yogi108
Hi,
You can try the following:
findAll('good morning #everyone, have a #nice day', '(?smi)(#\\w*)')
--> [#everyone, #nice]
I'm not a regex expert, but you can also experiment in the script Regex tester; there is also a short help tutorial.
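For anyone who wants to try the same idea outside Automagic, here is a minimal sketch in Python (not Automagic script). It assumes Python's re module treats this simple pattern the same way as the engine behind findAll, and the helper name extract_hashtags is made up for illustration. It uses \w+ instead of \w* so that a lone # with nothing after it is not returned as a tag.

```python
import re

def extract_hashtags(text):
    # '#' followed by one or more word characters (letters, digits, underscore).
    # Hypothetical helper for illustration; not part of Automagic.
    return re.findall(r'#\w+', text)

print(extract_hashtags('good morning #everyone, have a #nice day'))
# prints ['#everyone', '#nice']
```

The match stops at the comma and at the space because neither is a word character, so no extra trimming is needed.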
Regards
Fritz
Re: Extract hashtag from a string
Posted: 04 Feb 2020 17:25
by tsolignani
It does work! Thank you very much
Re: Extract hashtag from a string
Posted: 05 Feb 2020 18:54
by Desmanto
I tested a bit and thought someone might have already done it, so I just googled and found this on Stack Overflow:
https://stackoverflow.com/questions/385 ... -semicolon
I use the answer from garyh.
Code:
text = "Here's a #hashtag and here is #not_a_tag; which should be different. Also testing: Mid#hash. #123 #!@£ and <p>#hash</p>";
// Third argument true: each match comes back as a list of groups, so i[1] is the tag without the leading #
find = findAll(text, '(?<=[\\s>])#(\\d*[A-Za-z_]+\\d*)\\b(?!;)', true);
hashtag = newList();
for (i in find)
{
  addElement(hashtag, i[1]);
}
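To see which of the test cases this stricter pattern accepts, here is a small sketch in Python (again, not Automagic script). It assumes the lookbehind, word boundary and lookahead behave the same in Python's re module as in Automagic, and extract_strict_hashtags is a hypothetical name used only here.

```python
import re

# Same pattern as above, written as a Python raw string:
#   (?<=[\s>])  the # must follow whitespace or '>'
#   (\d*[A-Za-z_]+\d*)  the tag needs at least one letter or underscore
#   \b(?!;)  the tag must end at a word boundary that is not followed by ';'
PATTERN = re.compile(r'(?<=[\s>])#(\d*[A-Za-z_]+\d*)\b(?!;)')

def extract_strict_hashtags(text):
    # With a single capture group, findall returns the group text (tag without '#').
    return PATTERN.findall(text)

text = ("Here's a #hashtag and here is #not_a_tag; which should be different. "
        "Also testing: Mid#hash. #123 #!@£ and <p>#hash</p>")
print(extract_strict_hashtags(text))
# prints ['hashtag', 'hash']
```

So #not_a_tag; is rejected because of the semicolon, Mid#hash because the # is in the middle of a word, and #123 because it contains no letter. The captured group drops the leading #, which is the same part that i[1] picks out in the loop above.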
Re: Extract hashtag from a string
Posted: 05 Feb 2020 21:04
by tsolignani
Very interesting, thank you!