What are regular expressions in Java

Regular expressions

Martin Kompf

Regular expressions (abbreviated: regex) are a powerful tool for processing character strings. They have been part of the Java platform (package java.util.regex) since version 1.4. This page contains a regex quick reference and a Regex Tester for interactive testing of regular expressions.

Regex tester

The Regex Tester is a Java program for interactively developing and testing regular expressions. You start the tool via Java Web Start by clicking the Launch button. A Java runtime environment must be installed on your computer.

Newer versions of Java Web Start make it difficult, or even impossible, to run unsigned applications, even if they run entirely in the sandbox. In that case you can download the program manually and run it from the command line.

The signature can be checked using my public PGP key.

Regex quick reference

Characters

x         The character x
\\        Backslash
\xhh      Character with hexadecimal value hh
\uhhhh    Unicode character with hexadecimal code hhhh
\t        Tab
\n        Newline (LF)
\r        Carriage return (CR)
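As a brief illustration of the escapes listed above, the following sketch checks that several notations describe the same character. The class EscapeDemo and its helper fullMatch are made up for this example. Note that every backslash of the regex has to be doubled inside a Java string literal.

```java
import java.util.regex.Pattern;

class EscapeDemo {
    // Hypothetical helper: true if the whole input matches the regex.
    static boolean fullMatch(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        System.out.println(fullMatch("A", "A"));         // literal character
        System.out.println(fullMatch("\\x41", "A"));     // hexadecimal value 41
        System.out.println(fullMatch("\\u0041", "A"));   // Unicode code 0041
        System.out.println(fullMatch("a\\tb", "a\tb"));  // tab between a and b
        System.out.println(fullMatch("\\\\", "\\"));     // one literal backslash
    }
}
```

All five calls print true.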

Character classes

[abc]           a, b or c
[^abc]          Any character except a, b or c
[A-Z]           All characters from A to Z
[A-Za-z]        A to Z or a to z
[a-z&&[^mn]]    a to z except m and n
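The negated class and the intersection syntax are the least obvious entries here, so a small sketch may help. The class CharClassDemo, its constants and the helper isOne are hypothetical names chosen for this example.

```java
import java.util.regex.Pattern;

class CharClassDemo {
    // a to z, minus m and n (intersection with a negated class)
    static final Pattern NO_MN = Pattern.compile("[a-z&&[^mn]]");
    // any character except a, b or c
    static final Pattern NOT_ABC = Pattern.compile("[^abc]");

    // Hypothetical helper: true if s is exactly one character from the class.
    static boolean isOne(Pattern p, String s) {
        return p.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(isOne(NO_MN, "k"));   // true
        System.out.println(isOne(NO_MN, "m"));   // false, m is excluded
        System.out.println(isOne(NO_MN, "K"));   // false, upper case is outside a-z
        System.out.println(isOne(NOT_ABC, "d")); // true
        System.out.println(isOne(NOT_ABC, "a")); // false
    }
}
```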

Predefined character classes

.     Any character (line terminators only in dotall mode)
\d    Digit: [0-9]
\D    Anything except a digit
\s    Whitespace: [ \t\n\x0B\f\r]
\S    Negation of \s
\w    Word character: [a-zA-Z_0-9]
\W    Negation of \w
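A minimal sketch of how the predefined classes might be used in everyday string handling. The class PredefinedDemo and its two helper methods are illustrative names, not part of any library.

```java
class PredefinedDemo {
    // Keeps only the digits of the input by deleting every non-digit.
    static String digitsOnly(String s) {
        return s.replaceAll("\\D", "");
    }

    // Collapses every run of whitespace into a single space.
    static String normalizeSpace(String s) {
        return s.replaceAll("\\s+", " ");
    }

    public static void main(String[] args) {
        System.out.println(digitsOnly("+49 (30) 1234-5"));  // 493012345
        System.out.println(normalizeSpace("a \t b\n\nc"));  // a b c
        System.out.println("snake_case".matches("\\w+"));   // true, _ is a word character
        System.out.println("kebab-case".matches("\\w+"));   // false, - is not
        System.out.println("\n".matches("."));              // false without dotall mode
    }
}
```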

POSIX character classes

\p{Lower}    ASCII lowercase letter: [a-z]
\p{Upper}    ASCII uppercase letter: [A-Z]
\p{ASCII}    ASCII character: [\x00-\x7F]
\p{Alpha}    Letter: [A-Za-z]
\p{Digit}    Digit: [0-9]
\p{Alnum}    Letter or digit

Unicode character classes

\p{Lu}    Unicode uppercase letter
\p{Ll}    Unicode lowercase letter
\p{Sc}    Unicode currency symbol
\p{Nl}    Unicode letter number (for example a Roman numeral)
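The difference between the POSIX and the Unicode classes shows up as soon as the input leaves ASCII. The following sketch assumes default pattern flags (without Pattern.UNICODE_CHARACTER_CLASS). The class UnicodeClassDemo and the helper isOne are hypothetical names, and the non-ASCII characters are written as Unicode escapes to stay independent of the source file encoding.

```java
import java.util.regex.Pattern;

class UnicodeClassDemo {
    // Hypothetical helper: true if s is exactly one character of the class.
    static boolean isOne(String charClass, String s) {
        return Pattern.matches(charClass, s);
    }

    public static void main(String[] args) {
        String aUmlaut = "\u00C4";   // LATIN CAPITAL LETTER A WITH DIAERESIS
        String euro = "\u20AC";      // EURO SIGN
        String romanFour = "\u2163"; // ROMAN NUMERAL FOUR

        System.out.println(isOne("\\p{Upper}", "A"));     // true
        System.out.println(isOne("\\p{Upper}", aUmlaut)); // false, not ASCII
        System.out.println(isOne("\\p{Lu}", aUmlaut));    // true
        System.out.println(isOne("\\p{Sc}", euro));       // true
        System.out.println(isOne("\\p{Nl}", romanFour));  // true
    }
}
```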

Boundaries

^     Beginning of line (beginning of input unless multi-line mode is on)
$     End of line (end of input unless multi-line mode is on)
\b    Word boundary
\A    Beginning of input
\z    End of input
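How ^ and $ behave depends on multi-line mode, while \A and \z always refer to the whole input. The sketch below tries to make that visible. BoundaryDemo and its helper found are names invented for this example.

```java
import java.util.regex.Pattern;

class BoundaryDemo {
    // Hypothetical helper: true if the regex occurs anywhere in the input.
    static boolean found(String regex, int flags, String input) {
        return Pattern.compile(regex, flags).matcher(input).find();
    }

    public static void main(String[] args) {
        String lines = "a\nb\nc";

        // Word boundary: "cat" only counts as a whole word
        System.out.println(found("\\bcat\\b", 0, "a cat sat"));     // true
        System.out.println(found("\\bcat\\b", 0, "concatenate"));   // false

        // ^ and $ see line breaks only in multi-line mode
        System.out.println(found("^b$", 0, lines));                 // false
        System.out.println(found("^b$", Pattern.MULTILINE, lines)); // true

        // Beginning and end of input ignore multi-line mode
        System.out.println(found("\\Aa", Pattern.MULTILINE, lines)); // true
        System.out.println(found("c\\z", Pattern.MULTILINE, lines)); // true
        System.out.println(found("\\Ab", Pattern.MULTILINE, lines)); // false
    }
}
```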

Repetitions

?        Once or not at all
*        Zero or more times
+        One or more times
{n}      Exactly n times
{n,}     At least n times
{n,m}    Between n and m times
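A short sketch of the quantifiers in action, always matching against the entire input. QuantifierDemo and fullMatch are illustrative names only.

```java
import java.util.regex.Pattern;

class QuantifierDemo {
    // Hypothetical helper: true if the whole input matches the regex.
    static boolean fullMatch(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        System.out.println(fullMatch("colou?r", "color"));   // true, u is optional
        System.out.println(fullMatch("colou?r", "colour"));  // true
        System.out.println(fullMatch("a*", ""));             // true, zero repetitions allowed
        System.out.println(fullMatch("a+", ""));             // false, at least one a required
        System.out.println(fullMatch("\\d{4}", "2024"));     // true, exactly four digits
        System.out.println(fullMatch("\\d{2,}", "7"));       // false, fewer than two digits
        System.out.println(fullMatch("\\d{2,4}", "12345"));  // false, more than four digits
    }
}
```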

Logical operators

XY      X followed by Y
X|Y     X or Y
(X)     X as a capturing group
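Alternation binds more weakly than concatenation, which is why a group is often needed around it. The following sketch illustrates this. AlternationDemo and the method animal are hypothetical names for this example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class AlternationDemo {
    private static final Pattern PLURAL = Pattern.compile("(cat|dog)s");

    // Returns the text captured by group 1, or null if the input does not match.
    static String animal(String input) {
        Matcher m = PLURAL.matcher(input);
        return m.matches() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println(animal("dogs"));   // dog
        System.out.println(animal("cats"));   // cat
        System.out.println(animal("birds"));  // null

        // Without the group, | splits the whole expression into "cat" or "dogs"
        System.out.println("cats".matches("cat|dogs")); // false
        System.out.println("cat".matches("cat|dogs"));  // true
    }
}
```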

Flags

(?i)     Case-insensitive (ASCII)
(?iu)    Case-insensitive (Unicode)
(?m)     Multi-line mode
(?s)     Single-line / dotall mode
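The embedded flags can be tried out with a few one-line matches. In this sketch the non-ASCII letters are written as Unicode escapes, and FlagDemo and fullMatch are made-up names. The last call uses the constants of java.util.regex.Pattern that correspond to (?iu).

```java
import java.util.regex.Pattern;

class FlagDemo {
    // Hypothetical helper: true if the whole input matches the regex.
    static boolean fullMatch(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        String lower = "\u00E4"; // a with diaeresis, lower case
        String upper = "\u00C4"; // the same letter in upper case

        System.out.println(fullMatch("(?i)abc", "ABC"));        // true
        System.out.println(fullMatch("(?i)" + lower, upper));   // false, ASCII only
        System.out.println(fullMatch("(?iu)" + lower, upper));  // true
        System.out.println(fullMatch("a.b", "a\nb"));           // false
        System.out.println(fullMatch("(?s)a.b", "a\nb"));       // true, dot matches newline

        // The same flags passed to Pattern.compile instead of embedding them
        Pattern p = Pattern.compile(lower, Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CASE);
        System.out.println(p.matcher(upper).matches());         // true
    }
}
```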

Usage patterns for regular expressions in Java

Matches and LookingAt

    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    String str = "X";
    String regex = "[A-Z]"; // test for an uppercase letter (ASCII)
    boolean matches = str.matches(regex);

    // Alternative: pre-compiled pattern
    Pattern pattern = Pattern.compile(regex);
    Matcher matcher = pattern.matcher(str);
    matches = matcher.matches();

    // lookingAt: only the beginning of the input has to match
    matcher = pattern.matcher("Hello");
    matches = matcher.matches();               // false
    boolean startsWith = matcher.lookingAt();  // true

Group

    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    String str = "17-25";
    String regex = "(\\d+)-(\\d+)"; // Double the backslash in the Java source code!
    Pattern pattern = Pattern.compile(regex);
    Matcher matcher = pattern.matcher(str);
    if (matcher.matches()) {
        // group(1) and group(2) hold the text captured by the two groups
        System.out.printf("From %s to %s\n", matcher.group(1), matcher.group(2));
    }

Split

    import java.util.regex.Pattern;

    String str = "1,2, 3, 4, 5.6";
    String regex = "\\W+"; // split at runs of non-word characters
    String[] elements = str.split(regex);

    // Alternative: pre-compiled pattern
    Pattern pattern = Pattern.compile(regex);
    elements = pattern.split(str);

Find

    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    String str = "All RDBMS use SQL";
    String regex = "\\p{Lu}{2,}"; // find all runs of at least two uppercase letters (Unicode)
    Pattern pattern = Pattern.compile(regex);
    Matcher matcher = pattern.matcher(str);
    while (matcher.find()) {
        // start() and end() delimit the current match; matcher.group() returns the same text
        System.out.println(str.substring(matcher.start(), matcher.end()));
    }

Replace

    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    String str = "value with spaces";
    String regex = "\\s+";
    String replacement = "+";
    // Replace each run of whitespace with +
    String no_spaces = str.replaceAll(regex, replacement);

    // Alternative: pre-compiled pattern
    Pattern pattern = Pattern.compile(regex);
    Matcher matcher = pattern.matcher(str);
    no_spaces = matcher.replaceAll(replacement);

Find and replace

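    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    String str = "Hello ${user.name}!";
    String regex = "\\$\\{([a-z.]+)\\}";
    // Replace ${name} with System.getProperty(name)
    Pattern pattern = Pattern.compile(regex);
    Matcher matcher = pattern.matcher(str);
    StringBuffer sb = new StringBuffer();
    while (matcher.find()) {
        // Keep the placeholder unchanged if the property is not set (avoids a NullPointerException)
        String value = System.getProperty(matcher.group(1), matcher.group());
        // quoteReplacement stops $ and backslashes in the value from being read as group references
        matcher.appendReplacement(sb, Matcher.quoteReplacement(value));
    }
    matcher.appendTail(sb);
    System.out.println(sb.toString());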

Links