Java string split by words

Java String split()

The String split() method splits a string around matches of a given regular expression and returns the resulting substrings as an array.

The regular expression must be a valid pattern; remember to escape special characters where necessary.

String str = "A-B-C-D";
String[] strArray = str.split("-"); // [A, B, C, D]

The split() method is overloaded and accepts the following parameters.

  • regex – the delimiting regular expression.
  • limit – controls the number of times the pattern is applied and therefore affects the length of the resulting array.
    • If the limit is positive then the pattern will be applied at most limit – 1 times. The result array’s length will be no greater than limit, and the array’s last entry will contain all input beyond the last matched delimiter.
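    • If the limit is zero, the pattern is applied as many times as possible and the result array can be of any size. Trailing empty strings are discarded.
    • If the limit is negative, the pattern is applied as many times as possible and the result array can be of any size. Trailing empty strings are kept.

    public String[] split(String regex);
    public String[] split(String regex, int limit);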
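The three limit cases described above can be compared side by side. The following is a minimal sketch; the class name, the helper splitWithLimit, and the sample input are assumptions made for illustration.

```java
import java.util.Arrays;

class SplitLimitDemo {
    // Splits the input on commas using the given limit.
    static String[] splitWithLimit(String input, int limit) {
        return input.split(",", limit);
    }

    public static void main(String[] args) {
        String input = "a,b,c,,"; // hypothetical sample ending in two empty fields

        // Positive limit: pattern applied at most limit - 1 times, rest kept as the last entry.
        System.out.println(Arrays.toString(splitWithLimit(input, 2)));  // [a, b,c,,]

        // Zero limit: split everywhere, trailing empty strings discarded.
        System.out.println(Arrays.toString(splitWithLimit(input, 0)));  // [a, b, c]

        // Negative limit: split everywhere, trailing empty strings kept.
        System.out.println(Arrays.toString(splitWithLimit(input, -1))); // [a, b, c, , ]
    }
}
```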

    1.2. Throws PatternSyntaxException

    Note that split() throws PatternSyntaxException if the regular expression's syntax is invalid. In the example below, "[" is an invalid regular expression.

    String[] strArray = "hello world".split("[");
    Exception in thread "main" java.util.regex.PatternSyntaxException: Unclosed character class near index 0

    The method does not accept a null argument. It throws NullPointerException if the regex argument is null.

    Exception in thread "main" java.lang.NullPointerException
        at java.lang.String.split(String.java:2324)
        at com.StringExample.main(StringExample.java:11)
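When the delimiter is meant literally, one way to avoid PatternSyntaxException is to pass it through Pattern.quote() before splitting. The sketch below assumes that approach; the class and helper names are hypothetical and are not part of the original examples.

```java
import java.util.Arrays;
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

class SplitOnLiteralDemo {
    // Splits on the delimiter taken literally, even if it contains regex metacharacters.
    static String[] splitLiteral(String input, String delimiter) {
        return input.split(Pattern.quote(delimiter));
    }

    // Returns true if the text compiles as a regular expression.
    static boolean isValidRegex(String regex) {
        try {
            Pattern.compile(regex);
            return true;
        } catch (PatternSyntaxException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isValidRegex("[")); // false
        System.out.println(Arrays.toString(splitLiteral("a[b[c", "["))); // [a, b, c]
    }
}
```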

    2. Java Programs to Split a String

    2.1. Split with Specified Delimiter

    The following Java program splits a string on the hyphen delimiter "-".

    String str = "how to do-in-java-provides-java-tutorials";
    String[] strArray = str.split("-"); // [how to do, in, java, provides, java, tutorials]

    The following Java program splits a string by space using the delimiter "\\s", which matches a single whitespace character. To split on runs of whitespace characters (spaces, tabs, etc.), use the delimiter "\\s+".

    String str = "how to do injava";
    String[] strArray = str.split("\\s"); // [how, to, do, injava]
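The difference between "\\s" and "\\s+" only shows up when words are separated by more than one whitespace character. A small sketch, using a made-up input with a double space and hypothetical helper names:

```java
import java.util.Arrays;

class WhitespaceSplitDemo {
    // Splits on every single whitespace character.
    static String[] splitOnEach(String input) {
        return input.split("\\s");
    }

    // Splits on runs of one or more whitespace characters.
    static String[] splitOnRuns(String input) {
        return input.split("\\s+");
    }

    public static void main(String[] args) {
        String input = "how  to do"; // note the two spaces after "how"
        System.out.println(Arrays.toString(splitOnEach(input))); // [how, , to, do]
        System.out.println(Arrays.toString(splitOnRuns(input))); // [how, to, do]
    }
}
```

With "\\s", the second space produces an empty string between "how" and "to".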

    Java program to split a string by delimiter comma.

    String str = "A,B,C,D";
    String[] strArray = str.split(","); // [A, B, C, D]

    2.4. Split by Multiple Delimiters

    Java program to split a string with multiple delimiters. Use regex OR operator ‘|’ symbol between multiple delimiters.

    In the given example, I am splitting the string with two delimiters, a hyphen and a dot.

    String str = "how-to-do.in.java";
    String[] strArray = str.split("-|\\."); // [how, to, do, in, java]
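When every delimiter is a single character, a character class is an alternative to the OR operator. This is a sketch of that variation, not the original author's code; inside the brackets neither the leading hyphen nor the dot needs escaping.

```java
import java.util.Arrays;

class CharClassSplitDemo {
    // Splits on a hyphen or a dot using a character class instead of "-|\\.".
    static String[] splitOnHyphenOrDot(String input) {
        return input.split("[-.]");
    }

    public static void main(String[] args) {
        String str = "how-to-do.in.java";
        System.out.println(Arrays.toString(splitOnHyphenOrDot(str))); // [how, to, do, in, java]
    }
}
```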

    3. Split a String into Maximum N tokens

    This version of the method also splits the string, but the number of tokens cannot exceed the limit argument. Once limit – 1 tokens have been found, the rest of the string is returned unsplit as the last token, even if it contains the delimiter.

    The Java program below splits a string by space so that the number of tokens does not exceed 5.

    import java.util.Arrays;

    String str = "how to do in java provides java tutorials";
    String[] strArray = str.split("\\s", 5);
    System.out.println(strArray.length); // 5
    System.out.println(Arrays.toString(strArray)); // [how, to, do, in, java provides java tutorials]

    This Java String tutorial covered the syntax and usage of the String.split() API with easy-to-follow examples. We learned to split strings using different delimiters such as commas, hyphens, and even multiple delimiters in a String.

    Java split String by words example

    This example shows how to split a string into words in Java, including how to break sentences into words using the split method.

    How to split String by words?

    The simplest way to split the string by words is by the space character as shown in the below example.
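The original listing is not reproduced here, so the following is a minimal sketch of the idea. The test sentence and the names are assumptions chosen for illustration.

```java
class SplitBySpaceDemo {
    // Splits a sentence into words on single space characters.
    static String[] splitBySpace(String sentence) {
        return sentence.split(" ");
    }

    public static void main(String[] args) {
        String sentence = "Java is a programming language"; // hypothetical test sentence
        for (String word : splitBySpace(sentence)) {
            System.out.println(word);
        }
        // Prints five lines: Java, is, a, programming, language
    }
}
```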

    As you can see from the output, it worked for the test sentence string. The sentence is broken down into words by splitting it using space.

    Let’s try some other not-so-simple sentences.
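As a sketch of what goes wrong, here is the same space-only split applied to a sentence with punctuation. The sentence is a made-up example, not the one from the original listing.

```java
class SplitBySpaceProblemDemo {
    // Same approach as before: split on single spaces only.
    static String[] splitBySpace(String sentence) {
        return sentence.split(" ");
    }

    public static void main(String[] args) {
        String sentence = "Hello, world! How are you?"; // hypothetical sentence
        for (String word : splitBySpace(sentence)) {
            System.out.println(word);
        }
        // Prints: Hello, / world! / How / are / you?
        // The punctuation stays attached to the neighbouring word.
    }
}
```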

    As you can see from the output, our code did not work as expected. The reason is that a simple split by space is not enough to separate the words in a string. Words in a sentence may also be separated by punctuation marks such as dots, commas, and question marks.

    In order to make the code handle all these punctuation and symbols, we will change our regular expression pattern from only space to all the punctuation marks and symbols as given below.
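A sketch of that change follows. The pattern is a character class listing space and the ASCII punctuation characters, followed by +; the sample sentence and the names are assumptions made for illustration.

```java
import java.util.Arrays;

class SplitByPunctuationDemo {
    // Character class of space plus ASCII punctuation; '+' merges adjacent delimiters.
    static final String DELIMITERS = "[ !\"\\#$%&'()*+,-./:;<=>?@\\[\\]^_`{|}~]+";

    // Splits a sentence into words on runs of spaces and punctuation.
    static String[] splitWords(String sentence) {
        return sentence.split(DELIMITERS);
    }

    public static void main(String[] args) {
        String sentence = "Hello, world! How are you?"; // hypothetical sentence
        System.out.println(Arrays.toString(splitWords(sentence))); // [Hello, world, How, are, you]
    }
}
```

Note that an apostrophe is also in the class, so a word like "don't" would be split in two.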

    This time we got the output we wanted. The regex pattern [ !\"\\#$%&'()*+,-./:;<=>?@\\[\\]^_`{|}~]+ includes almost all the punctuation and symbols that can be used in a sentence, including space. We applied + at the end to match one or more of these characters in a row, so that we do not get any empty words.

    Instead of this pattern, you can also split on \\P{L}+ to extract words from the sentence. Here \\p{L} is the Unicode character class for letters and the uppercase \\P negates it, so \\P{L}+ matches any run of non-letter characters. You only need to change the pattern passed to the split method.

    Please note that the \\P{L} expression works for both ASCII and non-ASCII letters (e.g. accented characters as in "café" or "kākā").
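A short sketch of the letter-based split. The sample text is made up, and the accented letters are written as Unicode escapes so the source compiles under any file encoding.

```java
class SplitByNonLettersDemo {
    // Splits on runs of characters that are not Unicode letters.
    static String[] splitWords(String text) {
        return text.split("\\P{L}+");
    }

    public static void main(String[] args) {
        // "café or kākā?" written with Unicode escapes
        String text = "caf\u00e9 or k\u0101k\u0101?";
        for (String word : splitWords(text)) {
            System.out.println(word);
        }
        // Three words are printed; the accented letters stay inside their words.
    }
}
```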


    Java String split() Method with examples

    Java String split method is used for splitting a String into substrings based on the given delimiter or regular expression.

    For example:

    Input String: [email protected] Regular Expression: @ Output Substrings:


    Java Split Method Examples

    The String class has two variants of the split() method.

    1. String[] split(String regex) : It returns an array of strings after splitting an input String based on the delimiting regular expression.

    2. String[] split(String regex, int limit) : This method is used when you want to limit the number of substrings. The only difference from the variant above is that it caps the number of strings returned by the split. For example, split("anydelimiter", 3) returns an array of at most 3 strings, even if more substrings could be produced.

    What if limit is entered as a negative number?
    If the limit is negative, the returned string array contains all the substrings, including trailing empty strings. If the limit is zero, the returned string array contains all the substrings except trailing empty strings.

    It throws PatternSyntaxException if the syntax of specified regular expression is not valid.

    String split() method Example

    If the limit is not defined:


    If positive limit is specified in the split() method: Here the limit is specified as 2 so the split method returns only two substrings.

    If the limit is specified as a negative number: the output includes the trailing empty strings.

    Limit is set to zero: This will exclude the trailing empty strings.


    Difference between zero and negative limit in split() method:

    • If the limit in split() is set to zero, it returns all the substrings but excludes trailing empty strings, if present.
    • If the limit in split() is set to a negative number, it returns all the substrings, including trailing empty strings, if present.
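One place where this difference matters is reading delimited rows whose last columns may be empty. The sketch below assumes a simple comma-separated row with no quoted fields; the row contents and the helper parseRow are hypothetical.

```java
class TrailingFieldsDemo {
    // Keeps every column of the row, including empty ones at the end.
    static String[] parseRow(String line) {
        return line.split(",", -1);
    }

    public static void main(String[] args) {
        String row = "1,Alice,,"; // hypothetical row: id, name, and two empty columns

        System.out.println(row.split(",", 0).length); // 2, trailing empty columns dropped
        System.out.println(parseRow(row).length);     // 4, all columns preserved
    }
}
```

With a zero limit the column count depends on the data, so a negative limit is the safer choice when positions are meaningful.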

    Java String split() method with multiple delimiters

    Let’s see how we can pass multiple delimiters while using split() method. In this example we are splitting input string based on multiple special characters.

    Number of substrings: 7
    Str[0]:
    Str[1]:ab
    Str[2]:gh
    Str[3]:bc
    Str[4]:pq
    Str[5]:kk
    Str[6]:bb

    Let's practice a few more examples:
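The listing that produced this output is not included, so the following is only a sketch of code that yields output of the same shape. The input string and the delimiter set are assumptions; the empty first entry comes from the input starting with a delimiter.

```java
class MultiDelimiterDemo {
    // Splits on any one of the characters , ; # $ (all literal inside a character class).
    static String[] splitOnSymbols(String input) {
        return input.split("[,;#$]");
    }

    public static void main(String[] args) {
        String input = ",ab;gh,bc;pq#kk$bb"; // hypothetical input starting with a delimiter
        String[] parts = splitOnSymbols(input);

        System.out.println("Number of substrings: " + parts.length); // 7
        for (int i = 0; i < parts.length; i++) {
            System.out.println("Str[" + i + "]:" + parts[i]);
        }
    }
}
```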

    Java Split String Examples

    Example 1: Split string using word as delimiter

    Here, a string (a word) is used as a delimiter in split() method.
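A minimal sketch of this case, with a made-up input and delimiter word. The word is still interpreted as a regular expression, so this works as written only because it contains no special characters.

```java
import java.util.Arrays;

class WordDelimiterDemo {
    // Splits the input wherever the delimiter word occurs.
    static String[] splitOnWord(String input, String word) {
        return input.split(word);
    }

    public static void main(String[] args) {
        String str = "oneandtwoandthree"; // hypothetical input
        System.out.println(Arrays.toString(splitOnWord(str, "and"))); // [one, two, three]
    }
}
```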

    Example 2: Split string by space

    String[] strArray = str.split("\\s+");
    Input: "Text with spaces"; Output: ["Text", "with", "spaces"]

    Example 3: Split string by pipe

    String[] strArray = str.split("\\|");
    Input: "Text1|Text2|Text3"; Output: ["Text1", "Text2", "Text3"]

    Example 4: Split string by dot ( . )

    String[] strArray = str.split("\\.");

    You can split a string by dot ( . ) using the \\. regex in the split method. The dot must be escaped because an unescaped dot matches any character.

    Input: "Just.a.Simple.String"; Output: ["Just", "a", "Simple", "String"]

    Example 5: Split string into array of characters

    String[] strArray = str.split("(?!^)");

    The ?! part of this regex is a negative lookahead, which works like a not operator in a regular expression. The ^ matches the beginning of the string. Together they match every position that is not the beginning of the string, so the string is split between each pair of adjacent characters.

    Input: "String"; Output: ["S", "t", "r", "i", "n", "g"]

    Example 6: Split string by capital letters

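    String[] strArray = str.split("(?=\\p{Lu})");

    \p{Lu} is the Unicode general category for uppercase letters. The (?= ) around it is a lookahead, so the regex matches the position just before each uppercase letter without consuming the letter. The extra backslash escapes the backslash inside the Java string literal. This regex splits the string before each capital letter.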

    Input: "BeginnersBook.com"; Output: ["Beginners", "Book.com"]

    Example 7: Split string by newline

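    String[] lines = str.split(System.lineSeparator());

    This is a convenient way to split a string by newline when the text was produced on the same system. The lineSeparator() method returns the line separator character sequence of the underlying system, so the code does not hard-code "\n" or "\r\n". Text that comes from a different platform may use a different separator.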
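If the text may contain line endings from more than one platform, one alternative is the \R linebreak matcher available in Java 8 and later. This is a sketch of that alternative rather than part of the original example; the sample text is made up.

```java
import java.util.Arrays;

class SplitLinesDemo {
    // Splits on any line break: "\n", "\r\n", "\r" and other Unicode line terminators.
    static String[] splitLines(String text) {
        return text.split("\\R");
    }

    public static void main(String[] args) {
        String text = "line1\nline2\r\nline3"; // mixed Unix and Windows line endings
        System.out.println(Arrays.toString(splitLines(text))); // [line1, line2, line3]
    }
}
```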

    Example 8: Split string by comma

    String[] strArray = str.split(",");

    To split the string by comma, pass "," to the split() method as shown above. The comma has no special meaning in regular expressions, so it does not need escaping.


