Java split all whitespace

How to split String in Java by WhiteSpace or tabs? Example Tutorial

You can split a String by whitespace or tabs in Java using the split() method of the java.lang.String class. This method accepts a regular expression, so you can pass a regex that matches whitespace to split a String whose words are separated by spaces. This is not as straightforward as it seems, though, especially if you do not code in Java regularly. The input String may contain leading and trailing spaces, it may contain multiple whitespace characters between words, and words may also be separated by tabs. Your solution needs to handle all of these conditions if you want just the words and no empty Strings.

In this article, I am going to show you a couple of examples to demonstrate how you can split String in Java by space. By splitting I mean getting individual words as a String array or ArrayList of String, whatever you need.

In this Java String tutorial, I’ll show you three ways to split a String where words are separated by whitespace or tabs using only the JDK, without any third-party library like Apache Commons or Google Guava. The first example is the ideal situation, where each word in the String is separated by just one space.

In the second example, you will learn how to deal with multiple whitespaces or tabs by using a greedy regular expression in Java, e.g. "\\s+", which matches one or more whitespace characters. In the third example, you will learn how to deal with leading and trailing whitespace in the input String. You can even combine all of these in one solution, depending on your requirements.

And if you are new to the Java world, I also recommend going through The Complete Java MasterClass on Udemy to learn Java in a better and more structured way. It is one of the best and most up-to-date courses for learning Java online.


1st Example — Splitting String where words are separated by regular whitespace

This is the simplest case. Usually, words are separated by just one space. To split the String and get an array of words, just call the split() method on the input String, passing a single space as the regular expression, i.e. " ". This matches exactly one space character and splits the String accordingly.

    String lineOfCurrencies = "USD JPY AUD SGD HKD CAD CHF GBP EURO INR";
    String[] currencies = lineOfCurrencies.split(" ");
    System.out.println("input string words separated by whitespace: " + lineOfCurrencies);
    System.out.println("output string: " + Arrays.toString(currencies));

Output:

    input string words separated by whitespace: USD JPY AUD SGD HKD CAD CHF GBP EURO INR
    output string: [USD, JPY, AUD, SGD, HKD, CAD, CHF, GBP, EURO, INR]

This is also the easiest way to convert a String to a String array. If you want to convert the String array into an ArrayList of String, see Java How to Program by Deitel, one of the most comprehensive books for beginner and intermediate Java programmers.
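As a quick illustration of the array-to-ArrayList conversion mentioned above, here is a minimal sketch. The class name SplitToList and the helper toList are made up for this example; the conversion itself relies only on Arrays.asList() and the ArrayList copy constructor from the JDK.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class SplitToList {

    // Hypothetical helper: splits on a single space and copies the
    // resulting array into a mutable ArrayList.
    static List<String> toList(String line) {
        String[] words = line.split(" ");
        // Arrays.asList() alone returns a fixed-size view of the array,
        // so we wrap it in a new ArrayList to allow add() and remove().
        return new ArrayList<>(Arrays.asList(words));
    }

    public static void main(String[] args) {
        List<String> currencies = toList("USD JPY AUD");
        currencies.add("INR"); // allowed because this is a real ArrayList
        System.out.println(currencies); // prints [USD, JPY, AUD, INR]
    }
}
```

Copying into a new ArrayList is a design choice: if you only need to read the words, Arrays.asList(words) by itself is enough.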

2nd Example — Splitting String where words are separated by multiple whitespaces or tabs

To handle this scenario, you need a greedy regular expression that matches any number of whitespace characters. You can use the "\\s+" regex for this purpose. If you look closely, it uses a regular expression metacharacter and a predefined character class. \s matches a single whitespace character, including a tab, but the backslash must itself be escaped inside a Java string literal, hence it becomes \\s. On its own, that still matches only one character at a time.

So we add +, which matches one or more occurrences and, being greedy, consumes the whole run of whitespace as a single delimiter. To learn more about the "\\s+" regular expression to remove white spaces, see this tutorial.

Anyway here is the code in action:

    // the words are separated by two tabs each
    String lineOfPhonesWithMultipleWhiteSpace = "iPhone\t\tGalaxy\t\tLumia";
    String[] phones = lineOfPhonesWithMultipleWhiteSpace.split("\\s+");
    System.out.println("input string separated by tabs: " + lineOfPhonesWithMultipleWhiteSpace);
    System.out.println("output string: " + Arrays.toString(phones));

Output (the gaps in the first line are tabs):

    input string separated by tabs: iPhone        Galaxy        Lumia
    output string: [iPhone, Galaxy, Lumia]

You can see how we have converted a String in which three values are separated by multiple whitespace characters into an array. If you want to learn more about how regular expressions work in Java, I suggest you read the regular expression chapter from Java: How to Program by Deitel and Deitel.
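To see why the + quantifier matters, the following sketch compares three delimiters on the same input: a single space, "\\s" and "\\s+". The class name, the helper splitBy and the sample input are assumptions made for this illustration only.

```java
import java.util.Arrays;

public class WhitespaceRegexComparison {

    // Hypothetical helper: thin wrapper so the three calls read the same way.
    static String[] splitBy(String input, String regex) {
        return input.split(regex);
    }

    public static void main(String[] args) {
        // Two spaces after "iPhone", one tab after "Galaxy"
        String phones = "iPhone  Galaxy\tLumia";

        // " " matches only a space, so the tab is not a delimiter and the
        // second space produces an empty element.
        System.out.println(splitBy(phones, " ").length);   // prints 3

        // "\\s" matches spaces and tabs, but one character at a time,
        // so the double space still produces an empty element.
        System.out.println(splitBy(phones, "\\s").length); // prints 4

        // "\\s+" treats each run of whitespace as a single delimiter.
        System.out.println(Arrays.toString(splitBy(phones, "\\s+")));
        // prints [iPhone, Galaxy, Lumia]
    }
}
```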



3rd Example — Splitting String with leading and trailing whitespace

Splitting a String by whitespace becomes tricky when your input String contains leading or trailing whitespace. Leading whitespace matches the \\s+ regular expression at the very start of the String, so an empty String is inserted as the first element of the output array. (Trailing empty Strings are discarded by split(), so trailing whitespace alone does not add an element.) To avoid the empty element, trim the String before splitting it, i.e. call trim() before calling split().

Remember that since String is immutable in Java, trim() returns a new String rather than modifying the original, so you either need to store the result of trim() or chain trim() and split() together, as shown in the following example:

    String lineWithLeadingAndTrailingWhiteSpace = " Java C++ ";
    System.out.println("input string with leading and trailing space: " + lineWithLeadingAndTrailingWhiteSpace);

    // without trim(): the leading whitespace produces an empty first element
    String[] languages = lineWithLeadingAndTrailingWhiteSpace.split("\\s+");
    System.out.println("output string without trim: " + Arrays.toString(languages));

    // with trim(): leading and trailing whitespace is removed before splitting
    languages = lineWithLeadingAndTrailingWhiteSpace.trim().split("\\s+");
    System.out.println("output string after trim() and split: " + Arrays.toString(languages));

Output:

    input string with leading and trailing space:  Java C++
    output string without trim: [, Java, C++]
    output string after trim() and split: [Java, C++]
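One edge case the trim-then-split approach does not cover is input that is empty or contains only whitespace: after trim() the String is empty, and splitting an empty String returns an array with one empty element rather than an empty array. The sketch below guards against that. The class name SafeWordSplitter and the helper splitWords are hypothetical names chosen for this example.

```java
import java.util.Arrays;

public class SafeWordSplitter {

    // Hypothetical helper: returns only real words, and an empty array
    // when the input contains no words at all.
    static String[] splitWords(String line) {
        String trimmed = line.trim();
        if (trimmed.isEmpty()) {
            return new String[0];
        }
        return trimmed.split("\\s+");
    }

    public static void main(String[] args) {
        // Plain trim() + split() on blank input yields one empty element
        System.out.println("   ".trim().split("\\s+").length); // prints 1

        // The guarded helper yields no elements for the same input
        System.out.println(splitWords("   ").length);          // prints 0

        System.out.println(Arrays.toString(splitWords("  Java   C++  ")));
        // prints [Java, C++]
    }
}
```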

If you want an ArrayList of String instead of a String array then follow the steps given in this tutorial.

Java Program to split string by spaces or tabs

Here is our sample Java program, which combines all these examples and scenarios to give you the complete idea of how to split a String by spaces in Java.

    import java.util.Arrays;

    public class StringSplitExample {

        public static void main(String[] args) {

            // You can split a String by space using the split()
            // method of the java.lang.String class.
            // It accepts a regular expression, and you just need to
            // pass a regular expression which matches a space,
            // though the separator could be a space, a tab, etc.
            // Words can also have multiple spaces in between,
            // so be careful.

            // Suppose we have a String with currencies separated by space
            String lineOfCurrencies = "USD JPY AUD SGD HKD CAD CHF GBP EURO INR";

            // Now we split this String and convert it into an array of String.
            // We use the regex " ", which matches just one space.
            String[] currencies = lineOfCurrencies.split(" ");

            System.out.println("input string words separated by whitespace: " + lineOfCurrencies);
            System.out.println("output string: " + Arrays.toString(currencies));

            // The regular expression above will not work as expected if you have
            // multiple spaces between two words, because it could pick up the extra
            // whitespace as another word. To solve this problem, we use
            // a proper greedy regular expression to match any number of whitespaces.
            // The words are actually separated by two tabs here.
            String lineOfPhonesWithMultipleWhiteSpace = "iPhone\t\tGalaxy\t\tLumia";
            String[] phones = lineOfPhonesWithMultipleWhiteSpace.split("\\s+");

            System.out.println("input string separated by tabs: " + lineOfPhonesWithMultipleWhiteSpace);
            System.out.println("output string: " + Arrays.toString(phones));

            // The regular expression above cannot handle leading
            // whitespace, as it will count an empty String
            // as another word, as shown below
            String lineWithLeadingAndTrailingWhiteSpace = " Java C++ ";
            String[] languages = lineWithLeadingAndTrailingWhiteSpace.split("\\s+");

            System.out.println("input string with leading and trailing space: " + lineWithLeadingAndTrailingWhiteSpace);
            System.out.println("output string: " + Arrays.toString(languages));

            // You can solve the problem above by trimming the String before
            // splitting it, i.e. call trim() before split(), as shown below
            languages = lineWithLeadingAndTrailingWhiteSpace.trim().split("\\s+");

            System.out.println("input string: " + lineWithLeadingAndTrailingWhiteSpace);
            System.out.println("output string after trim() and split: " + Arrays.toString(languages));
        }
    }

This program has demonstrated all three ways which we have discussed earlier to split a String by single or multiple whitespaces or tabs in Java.


Important points about the split() method:

  1. Splits this string around matches of the given regular expression.
  2. Returns the array of strings computed by splitting this string around matches of the given regular expression.
  3. Throws PatternSyntaxException if the regular expression's syntax is invalid.
  4. This method was added in Java 1.4, so it is not available in earlier versions, but you can use it in Java 5, 6, 7, 8, and later because Java is backward compatible.
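The PatternSyntaxException point above can be sketched as follows. PatternSyntaxException is an unchecked exception, so the compiler will not force you to catch it; that matters mainly when the delimiter regex comes from user input or configuration. The class name, the helper trySplit and the deliberately broken regex "(\\s+" (an unclosed group) are assumptions made for this illustration.

```java
import java.util.Arrays;
import java.util.Optional;
import java.util.regex.PatternSyntaxException;

public class InvalidRegexExample {

    // Hypothetical helper: returns an empty Optional when the regex
    // cannot be compiled, instead of letting the exception propagate.
    static Optional<String[]> trySplit(String input, String regex) {
        try {
            return Optional.of(input.split(regex));
        } catch (PatternSyntaxException e) {
            System.out.println("Invalid regex '" + e.getPattern() + "': " + e.getDescription());
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        // Valid regex: splits normally
        trySplit("Java C++", "\\s+")
                .ifPresent(words -> System.out.println(Arrays.toString(words)));
        // prints [Java, C++]

        // Invalid regex: the opening parenthesis is never closed
        System.out.println(trySplit("Java C++", "(\\s+").isPresent()); // prints false
    }
}
```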
