Java 9 - Regex Improvements

[Updated: Dec 4, 2017, Created: Dec 4, 2017]

Java 9 added new methods to the java.util.regex.Matcher class that simplify post-processing of regex matches. Let's look at them.

Regex MatchResult as Stream

public Stream<MatchResult> results()

This method returns a stream of match results, one for each subsequence of the input that matches the pattern. It lets us use Java 8 Stream and lambda capabilities for post-match processing.

The following regex matches HTML tags:

   Pattern pattern = Pattern.compile("<(.|\n)*?>");
   String input = "<html><body><div>content</div></body></html>";
   Matcher matcher = pattern.matcher(input);
   //using the new method results()
   Stream<MatchResult> stream = matcher.results();
   stream.forEach(matchResult -> System.out.println(matchResult.group()));
<html>
<body>
<div>
</div>
</body>
</html>

The following example matches phone numbers and applies some stream operations:

   Pattern pattern = Pattern.compile("\\d{3}-\\d{3}-\\d{4}");
   String input = "111-111-1111 222-222-2222 333-333-3333 444-444-4444";
   Matcher matcher = pattern.matcher(input);
   String s = matcher.results()
                     .map(MatchResult::group)
                     .dropWhile(str -> !str.startsWith("222"))
                     .takeWhile(str -> !str.startsWith("444"))
                     .collect(Collectors.joining(", "));
   System.out.println(s);
222-222-2222, 333-333-3333

The following example matches the integers in the input string and computes their sum:

   Pattern pattern = Pattern.compile("\\d+");
   String input = "a 1 b 2 c 3 d 4 e 5 f 6";
   Matcher matcher = pattern.matcher(input);
   int sum = matcher.results()
                    .map(MatchResult::group)
                    .mapToInt(Integer::parseInt)
                    .sum();
   System.out.println(sum);
21
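
The examples above only use MatchResult::group, but each MatchResult in the stream also exposes the capture groups of its match. As a minimal sketch, assuming an input made of key=value pairs (the class name, method name and input format are hypothetical, chosen only for illustration), the groups can be collected into a map:

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.MatchResult;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

public class ResultsGroupsExample {

    // Hypothetical helper: collects key=value pairs into an insertion-ordered map.
    static Map<String, String> parsePairs(String input) {
        Pattern pattern = Pattern.compile("(\\w+)=(\\w+)");
        return pattern.matcher(input)
                      .results()
                      .collect(Collectors.toMap(
                              (MatchResult m) -> m.group(1), // group 1 is the key
                              (MatchResult m) -> m.group(2), // group 2 is the value
                              (first, second) -> second,     // on duplicate keys keep the last value
                              LinkedHashMap::new));
    }

    public static void main(String[] args) {
        System.out.println(parsePairs("host=localhost port=8080 mode=dev"));
    }
}
```

With the sample input in main, this should print {host=localhost, port=8080, mode=dev}.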

Lambda Based Replacement

replaceAll()

public String replaceAll(Function<MatchResult,String> replacer)

This method replaces every match with the string returned by the replacer function. The following example replaces each run of whitespace with a hyphen:

   Pattern pattern = Pattern.compile("\\s+");
   Matcher matcher = pattern.matcher("this is a test string");
   String s = matcher.replaceAll(matchResult -> "-");
   System.out.println(s);
this-is-a-test-string

The string returned by the function is treated as a replacement string, so group references such as $1 are still expanded. The following example reformats phone numbers:

   Pattern pattern = Pattern.compile("(\\d{3})-(\\d{3})-(\\d{4})");
   String input = "phone1: 111-111-1111\nphone2: 222-222-2222";
   System.out.println("-- input --\n" + input);
   Matcher matcher = pattern.matcher(input);
   String s = matcher.replaceAll(matchResult -> "($1) $2 $3");
   System.out.println("-- after replacement --\n" + s);
-- input --
phone1: 111-111-1111
phone2: 222-222-2222
-- after replacement --
phone1: (111) 111 1111
phone2: (222) 222 2222
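
The replacer function is most useful when the replacement has to be computed from the match itself. The following is a minimal sketch of that idea (the class and method names are hypothetical). It also uses the existing Matcher.quoteReplacement() method, because a returned string containing a literal $ would otherwise be read as a group reference:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ComputedReplacementExample {

    // Hypothetical helper: doubles every integer found in the input.
    static String doubleNumbers(String input) {
        Matcher matcher = Pattern.compile("\\d+").matcher(input);
        return matcher.replaceAll(
                matchResult -> String.valueOf(Integer.parseInt(matchResult.group()) * 2));
    }

    // Hypothetical helper: prefixes every integer with a literal dollar sign.
    // quoteReplacement() escapes the "$" so that it is not treated as a group reference.
    static String addDollarSign(String input) {
        Matcher matcher = Pattern.compile("\\d+").matcher(input);
        return matcher.replaceAll(
                matchResult -> Matcher.quoteReplacement("$" + matchResult.group()));
    }

    public static void main(String[] args) {
        System.out.println(doubleNumbers("a 1 b 20 c 300"));
        System.out.println(addDollarSign("apple 3, pear 12"));
    }
}
```

With the sample inputs in main, this should print "a 2 b 40 c 600" followed by "apple $3, pear $12".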

replaceFirst()

public String replaceFirst(Function<MatchResult,String> replacer)

This method replaces only the first match with the string returned by the replacer function:

   Pattern pattern = Pattern.compile("\\s+");
   Matcher matcher = pattern.matcher("this is a test string");
   String s = matcher.replaceFirst(matchResult -> "-");
   System.out.println(s);
this-is a test string

Other overloaded variants

The following methods now have overloads that accept a StringBuilder. Code that is already thread-safe, for example because the buffer is confined to a single thread, can use StringBuilder instead of StringBuffer for efficiency. Note that StringBuffer is synchronized, whereas StringBuilder is not.

appendReplacement()

New method:

public Matcher appendReplacement(StringBuilder sb,
                                 String replacement)

The existing variant:

public Matcher appendReplacement(StringBuffer sb,
                                 String replacement)

appendTail()

New method:

public StringBuilder appendTail(StringBuilder sb)

The existing variant:

public StringBuffer appendTail(StringBuffer sb)
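
The two StringBuilder overloads are typically used together in a find() loop. The following is a minimal sketch (the class name, method name and masking rule are hypothetical, chosen only for illustration) that masks all but the last four digits of each phone number:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class AppendReplacementExample {

    // Hypothetical helper: masks the first six digits of each phone number.
    static String maskPhoneNumbers(String input) {
        Matcher matcher = Pattern.compile("\\d{3}-\\d{3}-(\\d{4})").matcher(input);
        StringBuilder sb = new StringBuilder();
        while (matcher.find()) {
            // Appends the text between the previous match and this one,
            // then the replacement; $1 refers to the last four digits.
            matcher.appendReplacement(sb, "xxx-xxx-$1");
        }
        // Appends whatever remains after the last match.
        matcher.appendTail(sb);
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(maskPhoneNumbers("phone1: 111-111-1111, phone2: 222-222-2222."));
    }
}
```

With the sample input in main, this should print "phone1: xxx-xxx-1111, phone2: xxx-xxx-2222."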

Example Project

Dependencies and Technologies Used:

  • JDK 9
  • Maven 3.3.9
