How to split a string in Java? [Answered]

Sample problem:

I have a string, "004-034556", that I want to split into two strings:

string1="004";
string2="034556";

That means the first string will contain the characters before '-', and the second string will contain the characters after '-'. I also want to check if the string has '-' in it. If not, I will throw an exception. How can I do this?

Split a string in Java - Answer #1:

Just use the appropriate method: String#split().

String string = "004-034556";
String[] parts = string.split("-");
String part1 = parts[0]; // 004
String part2 = parts[1]; // 034556

Note that this takes a regular expression, so remember to escape special characters if necessary.

There are 12 characters with special meanings: the backslash \, the caret ^, the dollar sign $, the period or dot ., the vertical bar or pipe symbol |, the question mark ?, the asterisk or star *, the plus sign +, the opening parenthesis (, the closing parenthesis ), the opening square bracket [, and the opening curly brace {. These special characters are often called “metacharacters”.

So, if you want to split on e.g. period/dot . which means “any character” in regex, use either backslash \ to escape the individual special character like so split("\\."), or use character class [] to represent literal character(s) like so split("[.]"), or use Pattern#quote() to escape the entire string like so split(Pattern.quote(".")).

String[] parts = string.split(Pattern.quote(".")); // Split on period.
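As a minimal sketch of the three escaping styles described above, the following class splits on a literal period in each way and also shows what happens when the period is left unescaped. The class name, method names, and sample host name are made up for illustration.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

class EscapeDemo {
    // Splits on a literal period using a backslash escape.
    static String[] splitEscaped(String s) {
        return s.split("\\.");
    }

    // Splits on a literal period using a character class.
    static String[] splitCharClass(String s) {
        return s.split("[.]");
    }

    // Splits on any literal delimiter by quoting it first.
    static String[] splitQuoted(String s, String delimiter) {
        return s.split(Pattern.quote(delimiter));
    }

    public static void main(String[] args) {
        String host = "www.example.com"; // sample input, chosen for illustration
        System.out.println(Arrays.toString(splitEscaped(host)));     // [www, example, com]
        System.out.println(Arrays.toString(splitCharClass(host)));   // [www, example, com]
        System.out.println(Arrays.toString(splitQuoted(host, "."))); // [www, example, com]

        // Unescaped, "." matches every character, so every piece is empty
        // and split() discards them all.
        System.out.println(host.split(".").length); // 0
    }
}
```

Pattern.quote() is probably the safest choice when the delimiter comes from user input, since it needs no knowledge of which characters are special.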

To test beforehand if the string contains certain character(s), just use String#contains().

if (string.contains("-")) {
    // Split it.
} else {
    throw new IllegalArgumentException("String " + string + " does not contain -");
}

Note that contains() does not take a regular expression. For that, use String#matches() instead, bearing in mind that the pattern must match the entire string.
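Putting the check and the split together, one way to meet the original requirement might look like the sketch below. The class and method names are hypothetical, and the limit argument of 2 (covered later in this answer) is an added choice so that only the first '-' is used as the separator.

```java
class DashSplitter {
    // Returns the text before and after the first '-'.
    // Throws IllegalArgumentException when the input contains no '-'.
    static String[] splitOnDash(String string) {
        if (!string.contains("-")) {
            throw new IllegalArgumentException("String " + string + " does not contain -");
        }
        // A limit of 2 yields exactly two pieces: everything after the
        // first '-' stays together, even if it is empty.
        return string.split("-", 2);
    }

    public static void main(String[] args) {
        String[] parts = splitOnDash("004-034556");
        System.out.println(parts[0]); // 004
        System.out.println(parts[1]); // 034556
    }
}
```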

If you’d like to retain the split character in the resulting parts, use a positive lookaround. To keep the split character on the left-hand side, use a positive lookbehind by wrapping the pattern in a (?<=...) group.

String string = "004-034556";
String[] parts = string.split("(?<=-)");
String part1 = parts[0]; // 004-
String part2 = parts[1]; // 034556

To keep the split character on the right-hand side, use a positive lookahead by wrapping the pattern in a (?=...) group.

String string = "004-034556";
String[] parts = string.split("(?=-)");
String part1 = parts[0]; // 004
String part2 = parts[1]; // -034556
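The same two patterns applied to an input with more than one '-' may make the difference easier to see. This is a small sketch; the class and method names are invented for the example.

```java
import java.util.Arrays;

class LookaroundDemo {
    // Lookbehind: each '-' stays at the end of the piece before it.
    static String[] keepOnLeft(String s) {
        return s.split("(?<=-)");
    }

    // Lookahead: each '-' stays at the start of the piece after it.
    static String[] keepOnRight(String s) {
        return s.split("(?=-)");
    }

    public static void main(String[] args) {
        String s = "004-034556-42";
        System.out.println(Arrays.toString(keepOnLeft(s)));  // [004-, 034556-, 42]
        System.out.println(Arrays.toString(keepOnRight(s))); // [004, -034556, -42]
    }
}
```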

If you’d like to limit the number of resulting parts, you can supply the desired number as the second argument of the split() method.

String string = "004-034556-42";
String[] parts = string.split("-", 2);
String part1 = parts[0]; // 004
String part2 = parts[1]; // 034556-42

Answer #2:

An alternative to processing the string directly would be to use a regular expression with capturing groups. This has the advantage that it makes it straightforward to impose more sophisticated constraints on the input. For example, the following splits the string into two parts, and ensures that both consist only of digits:

import java.util.regex.Pattern;
import java.util.regex.Matcher;

class SplitExample
{
    private static Pattern twopart = Pattern.compile("(\\d+)-(\\d+)");

    public static void checkString(String s)
    {
        Matcher m = twopart.matcher(s);
        if (m.matches()) {
            System.out.println(s + " matches; first part is " + m.group(1) +
                               ", second part is " + m.group(2) + ".");
        } else {
            System.out.println(s + " does not match.");
        }
    }

    public static void main(String[] args) {
        checkString("123-4567");
        checkString("foo-bar");
        checkString("123-");
        checkString("-4567");
        checkString("123-4567-890");
    }
}

As the pattern is fixed in this instance, it can be compiled in advance and stored as a static member (initialised at class load time in the example). The regular expression is:

(\d+)-(\d+)

The parentheses denote the capturing groups; the string that matched that part of the regexp can be accessed by the Matcher.group() method, as shown. The \d matches a single decimal digit, and the + means “match one or more of the previous expression”. The - has no special meaning here, so it just matches that character in the input. Note that you need to double the backslashes when writing this as a Java string literal. Some other examples:

([A-Z]+)-([A-Z]+)          // Each part consists of only capital letters 
([^-]+)-([^-]+)            // Each part consists of characters other than -
([A-Z]{2})-(\d+)           // The first part is exactly two capital letters,
                           // the second consists of digits
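To show how one of these alternative patterns could be used to both validate and split, here is a hedged sketch built on the last pattern in the list. The class name, method name, and error message are assumptions made for the example; throwing IllegalArgumentException mirrors what the question asks for.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TwoPartParser {
    // Two capital letters, a '-', then one or more digits.
    private static final Pattern CODE = Pattern.compile("([A-Z]{2})-(\\d+)");

    // Returns the two captured parts, or throws if the whole input
    // does not match the pattern.
    static String[] parse(String s) {
        Matcher m = CODE.matcher(s);
        if (!m.matches()) {
            throw new IllegalArgumentException("String " + s + " is not in the expected format");
        }
        return new String[] { m.group(1), m.group(2) };
    }

    public static void main(String[] args) {
        String[] parts = parse("AB-123");
        System.out.println(parts[0]); // AB
        System.out.println(parts[1]); // 123
    }
}
```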

Answer #3:

Use:

String[] result = yourString.split("-");
if (result.length != 2) 
     throw new IllegalArgumentException("String not in correct format");

This will split your string into two parts. The first element in the array will be the part containing the stuff before the -, and the second element in the array will contain the part of your string after the -.

If the array length is not 2, then the string was not in the format: string-string.

Check out the split() method in the String class.
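One caveat with the length check: split() with no limit drops trailing empty strings but keeps a leading one, so "123-" and "-4567" are treated differently. The sketch below (class and method names are illustrative) compares the default behaviour with a negative limit, which keeps trailing empty strings.

```java
class LengthCheckDemo {
    // Number of pieces with the default behaviour (trailing empties dropped).
    static int pieces(String s) {
        return s.split("-").length;
    }

    // Number of pieces when a negative limit keeps trailing empties.
    static int piecesKeepingEmpties(String s) {
        return s.split("-", -1).length;
    }

    public static void main(String[] args) {
        System.out.println(pieces("123-4567"));           // 2
        System.out.println(pieces("123-"));               // 1
        System.out.println(piecesKeepingEmpties("123-")); // 2
        System.out.println(pieces("-4567"));              // 2 (first piece is empty)
        System.out.println(pieces("-"));                  // 0
        System.out.println(piecesKeepingEmpties("-"));    // 2
        System.out.println(pieces("123-4567-890"));       // 3
    }
}
```

If empty parts should count as errors, an additional isEmpty() check on each element would be needed on top of the length test.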

Answer #4:

This:

String[] out = string.split("-");

should do the thing you want. The String class has many methods for operating on strings.

Answer #5:

import java.util.ArrayList;
import java.util.StringTokenizer;

// This avoids regular expressions altogether.
// Note that each character in the delimiters string is treated
// as a separate single-character delimiter.

public static String[] SplitUsingTokenizer(String subject, String delimiters) {
   StringTokenizer strTkn = new StringTokenizer(subject, delimiters);
   ArrayList<String> arrLis = new ArrayList<String>(subject.length());

   while (strTkn.hasMoreTokens())
      arrLis.add(strTkn.nextToken());

   return arrLis.toArray(new String[0]);
}
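A short, self-contained sketch of how StringTokenizer behaves in practice, using a hypothetical helper that returns a List. It illustrates two points: every character of the delimiter string acts as its own delimiter, and empty tokens between consecutive delimiters are skipped, unlike with String.split().

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.StringTokenizer;

class TokenizerDemo {
    // Collects all tokens of subject, treating each character of
    // delimiters as a separate delimiter.
    static List<String> tokens(String subject, String delimiters) {
        StringTokenizer tokenizer = new StringTokenizer(subject, delimiters);
        List<String> result = new ArrayList<>();
        while (tokenizer.hasMoreTokens()) {
            result.add(tokenizer.nextToken());
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(tokens("004-034556", "-")); // [004, 034556]
        System.out.println(tokens("a-b_c", "-_"));     // [a, b, c]

        // Consecutive delimiters produce no empty token here...
        System.out.println(tokens("222--222", "-"));   // [222, 222]
        // ...whereas String.split() keeps the empty piece in the middle.
        System.out.println(Arrays.toString("222--222".split("-"))); // [222, , 222]
    }
}
```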

Answer #6:

With Java 8:

    import java.util.List;
    import java.util.regex.Pattern;
    import java.util.stream.Collectors;

    List<String> stringList = Pattern.compile("-")
            .splitAsStream("004-034556")
            .collect(Collectors.toList());

    stringList.forEach(s -> System.out.println(s));

Answer #7:

The requirements left room for interpretation. I recommend writing a method,

public static final String[] mySplit(final String s)

which encapsulates this function. Of course, you can use String.split(..) as mentioned in the other answers for the implementation.

You should write some unit-tests for input strings and the desired results and behaviour.

Good test candidates should include:

 - "0022-3333"
 - "-"
 - "5555-"
 - "-333"
 - "3344-"
 - "--"
 - ""
 - "553535"
 - "333-333-33"
 - "222--222"
 - "222--"
 - "--4555"

By defining the expected result for each test, you specify the behaviour.

For example, should "-333" return [,333], or is it an error? Can "333-333-33" be separated into [333,333-33] or [333-333,33], or is it an error? And so on.
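As an illustration only, here is a sketch of mySplit under one possible interpretation: the input must contain exactly one '-' with non-empty text on both sides, and anything else is an error. That policy is an assumption, not something the question states, and the class name is made up. It uses indexOf rather than split() so that each rule is explicit.

```java
class StrictSplit {
    // Assumed policy: exactly one '-', with non-empty text on both sides.
    public static String[] mySplit(final String s) {
        int dash = s.indexOf('-');
        if (dash <= 0                               // no '-' at all, or nothing before it
                || dash == s.length() - 1           // nothing after it
                || s.indexOf('-', dash + 1) >= 0) { // a second '-'
            throw new IllegalArgumentException("String " + s + " is not in the format left-right");
        }
        return new String[] { s.substring(0, dash), s.substring(dash + 1) };
    }

    public static void main(String[] args) {
        String[] parts = mySplit("0022-3333");
        System.out.println(parts[0]); // 0022
        System.out.println(parts[1]); // 3333
    }
}
```

Under this policy every other candidate in the list above is rejected; a different policy would simply change which branches throw.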

Answer #8:

You can also try it like this:

 String concatenated_String="hi^Hello";

 // ^ is a regex metacharacter, so it must be escaped
 String split_string_array[]=concatenated_String.split("\\^");

Answer #9:

Assuming that

  • you don’t really need regular expressions for your split
  • you happen to already use Apache Commons Lang in your app

The easiest way is to use StringUtils#split(java.lang.String, char). That’s more convenient than the one provided by Java out of the box if you don’t need regular expressions. Like its manual says, it works like this:

A null input String returns null.

 StringUtils.split(null, *)         = null
 StringUtils.split("", *)           = []
 StringUtils.split("a.b.c", '.')    = ["a", "b", "c"]
 StringUtils.split("a..b.c", '.')   = ["a", "b", "c"]
 StringUtils.split("a:b:c", '.')    = ["a:b:c"]
 StringUtils.split("a b c", ' ')    = ["a", "b", "c"]

I would recommend using commons-lang, since it usually contains a lot of stuff that’s usable. However, if you don’t need it for anything other than doing a split, then implementing it yourself or escaping the regex is a better option.
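For the "implement it yourself" route, a regex-free split on a single character could be sketched as follows. The class and method names are hypothetical. Note one deliberate difference from StringUtils.split: this version keeps empty pieces between adjacent separators instead of dropping them.

```java
import java.util.ArrayList;
import java.util.List;

class CharSplit {
    // Splits s on a literal separator character, without any regex.
    // Empty pieces (from adjacent, leading, or trailing separators) are kept.
    static List<String> split(String s, char separator) {
        List<String> parts = new ArrayList<>();
        int start = 0;
        int next;
        while ((next = s.indexOf(separator, start)) >= 0) {
            parts.add(s.substring(start, next));
            start = next + 1;
        }
        parts.add(s.substring(start)); // the remainder after the last separator
        return parts;
    }

    public static void main(String[] args) {
        System.out.println(split("a.b.c", '.'));  // [a, b, c]
        System.out.println(split("a..b.c", '.')); // [a, , b, c]
        System.out.println(split("a:b:c", '.'));  // [a:b:c]
    }
}
```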

Answer #10:

Use org.apache.commons.lang.StringUtils’ split method, which splits a string on the character or string you specify.

Method signature:

public static String[] split(String str, char separatorChar);

In your case, you want to split a string wherever there is a "-".

You can simply do as follows:

String str = "004-034556";

String[] split = StringUtils.split(str, '-');

Output:

004
034556

If - does not exist in your string, the method returns a one-element array containing the given string, and no exception is thrown.

Answer #11:

To summarize: there are at least five ways to split a string in Java:

  1. String.split():
     String[] parts = "10,20".split(",");
  2. Pattern.compile(regexp).splitAsStream(input):
     List<String> strings = Pattern.compile("\\|")
             .splitAsStream("010|020202")
             .collect(Collectors.toList());
  3. StringTokenizer (legacy class):
     StringTokenizer strings = new StringTokenizer("Welcome to EXPLAINJAVA.COM!", ".");
     while (strings.hasMoreTokens()) {
         String substring = strings.nextToken();
         System.out.println(substring);
     }
  4. Google Guava Splitter:
     Iterable<String> result = Splitter.on(",").split("1,2,3,4");
  5. Apache Commons StringUtils:
     String[] strings = StringUtils.split("1,2,3,4", ",");

So you can choose the best option for you depending on what you need, e.g. return type (array, list, or iterable).

Hope you learned something from this post.


About ᴾᴿᴼᵍʳᵃᵐᵐᵉʳ

Linux and Python enthusiast, in love with open source since 2014, Writer at programming-articles.com, India.
