Learning Regular Expression in Java

Post Body

I am just doing some Codewars katas for fun. In the "Remove consecutive duplicate words" kata, we take

"alpha beta beta gamma gamma gamma delta alpha beta beta gamma gamma gamma delta"

as the input string and output

"alpha beta gamma delta alpha beta gamma delta"

One of the solutions I saw was very elegant, but I still struggle with the regex:

    public class Kata {
        public static String removeConsecutiveDuplicates(String s) {
            // Replace each word followed by one or more copies of itself with a single copy
            return s.replaceAll("(\\b\\S+)( \\1\\b)+", "$1");
        }
    }

\\b means a word boundary. \\S+ means one or more non-whitespace characters. \\1 refers back to the first group, i.e. (\\b\\S+). So ( \\1\\b)+ means matching a repeat of the first group, i.e. a duplicate word, correct? The + means one or more occurrences. The \\b in the first group marks the position before the word; in the second group, it marks the position after it. What is the empty space in ( \\1\\b)? Can someone explain it?

Comments

ejsanders1984

I think the empty space is literally matching the space between words. The actual regex only has single backslashes; they are doubled in the code because a backslash has to be escaped inside a Java string literal.
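To make both points concrete, here is a minimal sketch. The class name EscapedPatternDemo and the REGEX constant are made up for illustration and are not part of the kata solution. It prints the pattern as the regex engine receives it, runs the replacement, and then tries a variant with the literal space removed from the second group.

```java
// Hypothetical demo class; the name is an assumption, not from the original post.
class EscapedPatternDemo {
    // The regex engine receives (\b\S+)( \1\b)+ ; every backslash is doubled
    // only because this is a Java string literal.
    static final String REGEX = "(\\b\\S+)( \\1\\b)+";

    static String removeConsecutiveDuplicates(String s) {
        // $1 keeps the first copy of the word and drops the repeats
        return s.replaceAll(REGEX, "$1");
    }

    public static void main(String[] args) {
        // Prints the pattern with single backslashes: (\b\S+)( \1\b)+
        System.out.println(REGEX);

        // Prints: alpha beta gamma delta
        System.out.println(removeConsecutiveDuplicates("alpha beta beta gamma gamma gamma delta"));

        // Variant without the literal space in group 2: nothing consumes the
        // separator between the words, so no match is found.
        // Prints: beta beta
        System.out.println("beta beta".replaceAll("(\\b\\S+)(\\1\\b)+", "$1"));
    }
}
```

The last call suggests why the space is needed: the second group has to consume the separator before it can match the repeated word.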

From regex101.com:

"(\b\S+)( \1\b)+"

Flags: gm

1st Capturing Group (\b\S+)
\b asserts position at a word boundary: (^\w|\w$|\W\w|\w\W)
\S matches any non-whitespace character (equivalent to [^\r\n\t\f\v ])
+ matches the previous token between one and unlimited times, as many times as possible, giving back as needed (greedy)

2nd Capturing Group ( \1\b)+
+ matches the previous token between one and unlimited times, as many times as possible, giving back as needed (greedy)
A repeated capturing group will only capture the last iteration. Put a capturing group around the repeated group to capture all iterations, or use a non-capturing group instead if you're not interested in the data.
The space matches the character with index 32 in decimal (20 in hexadecimal or 40 in octal) literally (case sensitive)
\1 matches the same text as most recently matched by the 1st capturing group
\b asserts position at a word boundary: (^\w|\w$|\W\w|\w\W)

Global pattern flags
g modifier: global. All matches (don't return after first match)
m modifier: multi line. Causes ^ and $ to match the begin/end of each line (not only begin/end of string)
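The note about repeated capturing groups can be checked in Java with java.util.regex.Matcher. The following is only a sketch: the class GroupInspectionDemo and the helper describeFirstMatch are hypothetical names, and the input string is a shortened version of the kata example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; names are assumptions made for illustration.
class GroupInspectionDemo {
    // Describes the first match of regex in input, listing every capturing group.
    static String describeFirstMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        if (!m.find()) {
            return "no match";
        }
        StringBuilder sb = new StringBuilder("match=[" + m.group() + "]");
        for (int i = 1; i <= m.groupCount(); i++) {
            sb.append(" group").append(i).append("=[").append(m.group(i)).append("]");
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String input = "alpha gamma gamma gamma delta";

        // Group 2 is repeated twice but only keeps its last iteration.
        // Prints: match=[gamma gamma gamma] group1=[gamma] group2=[ gamma]
        System.out.println(describeFirstMatch("(\\b\\S+)( \\1\\b)+", input));

        // Same match with a non-capturing group (?: ... ), since only $1 is used.
        // Prints: match=[gamma gamma gamma] group1=[gamma]
        System.out.println(describeFirstMatch("(\\b\\S+)(?: \\1\\b)+", input));
    }
}
```

Since the replacement string only uses $1, the non-capturing form would behave the same in replaceAll; whether it reads better is a matter of taste.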

Author

Ok_Improvement2153

Subreddit

r/learnjava

Posted: 7 months ago
External URL: reddit.com/r/learnjava/c...