Java: finding a string within a string

How to find a whole word in a String in Java?

I need to report every match and where it occurs. Multiple occurrences should also be accounted for. However, in this case I should get a match only on ‘123woods’, not on ‘woods’, which rules out the String.contains() method. I should also be able to take a list/set of keywords and check for their occurrences at the same time. In this example, with ‘123woods’ and ‘come’ as keywords, I should get two occurrences. The method should be reasonably fast on large texts. My idea is to use StringTokenizer, but I am unsure whether it will perform well. Any suggestions?

Are you sure the logic isn’t flawed? What if your keywords are words123 and 123words? Which of them match in the text words123words?

14 Answers

The example below is based on your comments. It uses a List of keywords, which will be searched in a given String using word boundaries. It uses StringUtils from Apache Commons Lang to build the regular expression and print the matched groups.

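String text = "I will come and meet you at the woods 123woods and all the woods";
List<String> tokens = new ArrayList<String>();
tokens.add("123woods");
tokens.add("woods");
// Build \b(123woods|woods)\b so that only whole words match.
String patternString = "\\b(" + StringUtils.join(tokens, "|") + ")\\b";
Pattern pattern = Pattern.compile(patternString);
Matcher matcher = pattern.matcher(text);
// Print every keyword occurrence found in the text.
while (matcher.find()) {
    System.out.println(matcher.group(1));
}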

If you are looking for more performance, you could have a look at StringSearch: high-performance pattern matching algorithms in Java.

What if I have an ArrayList and I want to use a Pattern to build it? Seems like I have to use the trusty old StringBuilder?

@baba — You could do that, or you could iterate through the List<>. I’m not sure which would be more efficient, you may want to try both approaches if performance is a concern.
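On building the pattern from a list without Apache Commons, and on reporting where each match is: a minimal sketch using only the JDK. The class and method names (WholeWordFinder, compile, find) are made up for this illustration. It assumes every keyword starts and ends with a word character, since \b only behaves as expected in that case.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical helper, not taken from any answer in this thread.
final class WholeWordFinder {

    // Builds \b(?:kw1|kw2|...)\b. Each keyword is quoted so that regex
    // metacharacters inside a keyword are matched literally.
    static Pattern compile(List<String> keywords) {
        String alternation = keywords.stream()
                .map(Pattern::quote)
                .collect(Collectors.joining("|"));
        return Pattern.compile("\\b(?:" + alternation + ")\\b");
    }

    // Returns "keyword@startIndex" for every whole-word occurrence, in text order.
    static List<String> find(String text, List<String> keywords) {
        List<String> hits = new ArrayList<>();
        Matcher matcher = compile(keywords).matcher(text);
        while (matcher.find()) {
            hits.add(matcher.group() + "@" + matcher.start());
        }
        return hits;
    }

    public static void main(String[] args) {
        String text = "I will come and meet you at the woods 123woods and all the woods";
        // prints [woods@32, 123woods@38, woods@59]
        System.out.println(find(text, Arrays.asList("123woods", "woods")));
    }
}
```

Collectors.joining does the job of StringUtils.join here, so no StringBuilder loop is needed. An empty keyword list is not handled and would produce a pattern that matches empty strings at word boundaries.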

Use regex + word boundaries, as others answered.

"I will come and meet you at the 123woods".matches(".*\\b123woods\\b.*"); // true
"I will come and meet you at the 123woods".matches(".*\\bwoods\\b.*"); // false

String string = "I will come and meet you at the 123woods";
String keyword = "123woods";
Boolean found = Arrays.asList(string.split(" ")).contains(keyword);
if (found) {
    // Report the match (the body of this branch is missing in the source).
    System.out.println("Keyword matched the string");
}

How about something like Arrays.asList(String.split(" ")).contains("xx")?

Here is a way to match an exact word in a String on Android:

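String full = "Hello World. How are you ?";
String one = "Hell";
String two = "Hello";
String three = "are";
String four = "ar";
boolean is1 = isContainExactWord(full, one);
boolean is2 = isContainExactWord(full, two);
boolean is3 = isContainExactWord(full, three);
boolean is4 = isContainExactWord(full, four);
Log.i("Contains Result", is1 + "-" + is2 + "-" + is3 + "-" + is4);
// Result: false-true-true-false

private boolean isContainExactWord(String fullString, String partWord) {
    // The method body is missing in the source; this reconstruction assumes a
    // word-boundary regex, which reproduces the result shown above.
    Pattern p = Pattern.compile("\\b" + Pattern.quote(partWord) + "\\b");
    return p.matcher(fullString).find();
}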

Try to match using regular expressions. Match for "\b123woods\b", where \b is a word boundary.

public class FindTextInLine {
    String match = "123woods";
    String text = "I will come and meet you at the 123woods";

    public void findText() {
        if (text.contains(match)) {
            System.out.println("Keyword matched the string");
        }
    }
}

While this code snippet may solve the question, including an explanation really helps to improve the quality of your post. Remember that you are answering the question for readers in the future, and those people might not know the reasons for your code suggestion.


A solution was accepted long ago, but it can be improved, so in case someone has a similar problem:

This is a classic application for multi-pattern search algorithms.

Java's Pattern search (with Matcher.find) is not well suited to this. Searching for exactly one keyword is optimized in Java, but searching for an or-expression uses the regex nondeterministic automaton, which backtracks on mismatches. In the worst case, each character of the text is processed l times (where l is the sum of the pattern lengths).

Single-pattern search is better, but not well suited either: the whole search has to be restarted for every keyword pattern. In the worst case, each character of the text is processed p times, where p is the number of patterns.

Multi-pattern search processes each character of the text exactly once. Algorithms suitable for such a search are Aho-Corasick, Wu-Manber, and Set Backwards Oracle Matching. They can be found in libraries like Stringsearchalgorithms or byteseek.

// example with StringSearchAlgorithms
AhoCorasick stringSearch = new AhoCorasick(asList("123woods", "woods"));
CharProvider text = new StringCharProvider("I will come and meet you at the woods 123woods and all the woods", 0);
StringFinder finder = stringSearch.createFinder(text);
List all = finder.findAll();
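For readers who want to see the idea without pulling in a library, here is a rough, self-contained sketch of Aho-Corasick with a whole-word filter on top. It is not the API of StringSearchAlgorithms or byteseek; the class name AhoCorasickSketch and its methods are invented for this illustration. It also assumes that a "word" is a run of letters and digits, which is slightly different from the regex \b (which treats the underscore as a word character too).

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collection;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch only; names are hypothetical.
final class AhoCorasickSketch {

    // One trie node: child transitions, failure link, keywords ending here.
    private static final class Node {
        final Map<Character, Node> next = new HashMap<>();
        Node fail;
        final List<String> out = new ArrayList<>();
    }

    private final Node root = new Node();

    AhoCorasickSketch(Collection<String> keywords) {
        // 1. Insert all keywords into the trie.
        for (String keyword : keywords) {
            Node node = root;
            for (char c : keyword.toCharArray()) {
                node = node.next.computeIfAbsent(c, x -> new Node());
            }
            node.out.add(keyword);
        }
        // 2. Breadth-first pass to compute failure links.
        Deque<Node> queue = new ArrayDeque<>();
        for (Node child : root.next.values()) {
            child.fail = root;
            queue.add(child);
        }
        while (!queue.isEmpty()) {
            Node current = queue.remove();
            for (Map.Entry<Character, Node> e : current.next.entrySet()) {
                Node child = e.getValue();
                Node f = current.fail;
                while (f != root && !f.next.containsKey(e.getKey())) {
                    f = f.fail;
                }
                Node target = f.next.get(e.getKey());
                child.fail = (target != null) ? target : root;
                // Inherit keywords that end at the failure target.
                child.out.addAll(child.fail.out);
                queue.add(child);
            }
        }
    }

    // Scans the text once and returns "keyword@startIndex" for whole-word hits.
    List<String> findWholeWords(String text) {
        List<String> hits = new ArrayList<>();
        Node state = root;
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            while (state != root && !state.next.containsKey(c)) {
                state = state.fail;
            }
            state = state.next.getOrDefault(c, root);
            for (String keyword : state.out) {
                int start = i - keyword.length() + 1;
                if (isBoundary(text, start - 1) && isBoundary(text, i + 1)) {
                    hits.add(keyword + "@" + start);
                }
            }
        }
        return hits;
    }

    // True if the index is outside the text or holds a non-word character.
    private static boolean isBoundary(String text, int index) {
        return index < 0 || index >= text.length()
                || !Character.isLetterOrDigit(text.charAt(index));
    }
}
```

On the sample text with the keywords "123woods" and "woods", the automaton does see "woods" ending inside "123woods", but the boundary check discards it because the preceding character is a digit. A production library will be considerably faster than this map-based sketch.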

A much simpler way to do this is to use split():

String match = "123woods";
String text = "I will come and meet you at the 123woods";
// split() needs a delimiter; here the text is split on single spaces.
String[] sentence = text.split(" ");
for (String word : sentence) {
    if (word.equals(match))
        return true;
}
return false;

This is a simpler, less elegant way to do the same thing without using tokens, etc.

While simpler to understand and write, it does not answer the question I was asking. I have two or three, or maybe an indefinite number of "match" keywords, and I need to get those that were found in the "text". Of course, you could loop over my "match" keywords for each of the "words" in the split text, but I find that far less elegant than the already accepted solution.

You can use regular expressions. Use the Matcher and Pattern methods to get the desired output.

You can also use regex matching with the \b anchor (word boundary).

To match "123woods" instead of "woods", use atomic grouping in the regular expression. One thing to note is that when matching "123woods" alone in a string, it matches the first "123woods" and exits instead of searching the rest of the string.

It searches for 123woods as the primary alternative; once that is matched, it exits the search.
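The pattern this answer refers to is not shown in the text, so the following is only a guess at what was meant: an atomic group (?>...) listing the longer keyword first, combined with word boundaries. The class name AtomicGroupDemo and the method describe are made up for the example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, not the answerer's original code.
final class AtomicGroupDemo {

    // (?>...) is an atomic group: once one alternative has matched, the
    // engine does not backtrack into the group to try the other one.
    private static final Pattern PATTERN =
            Pattern.compile("\\b(?>123woods|woods)\\b");

    // Lists every match as "word@index", separated by ", ".
    static String describe(String text) {
        Matcher m = PATTERN.matcher(text);
        StringBuilder sb = new StringBuilder();
        while (m.find()) { // looping is what keeps the search going
            if (sb.length() > 0) {
                sb.append(", ");
            }
            sb.append(m.group()).append('@').append(m.start());
        }
        return sb.toString();
    }
}
```

A single call to find() stops at the first match, which is probably the "exits the search" behaviour described above. Calling find() in a loop continues from the end of the previous match and reports the remaining occurrences.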

Looking back at the original question, we need to find some given keywords in a given sentence, count the number of occurrences, and know something about where they are. I don’t quite understand what "where" means (is it an index in the sentence?), so I’ll pass on that one. I’m still learning Java, one step at a time, so I’ll see to that one in due time 🙂

Note that common sentences (such as the one in the original question) can contain repeated keywords, so the search cannot just ask whether a given keyword "exists or not" and count it as 1 if it does. There can be more than one of the same. For example:

// Base sentence (added punctuation, to make it more interesting):
String sentence = "Say that 123 of us will come by and meet you, "
        + "say, at the woods of 123woods.";
// Split it (punctuation taken into consideration, as well):
java.util.List<String> strings = java.util.Arrays.asList(sentence.split(" |,|\\."));
// My keywords:
java.util.ArrayList<String> keywords = new java.util.ArrayList<>();
keywords.add("123woods");
keywords.add("come");
keywords.add("you");
keywords.add("say");

By looking at it, the expected result would be 5 for "Say" + "come" + "you" + "say" + "123woods", counting "say" twice if we go lowercase. If we don’t, then the count should be 4, "Say" being excluded and "say" included. Fine. My suggestion is:

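// Set. ready.
int counter = 0;
// Go!
for (String s : strings) {
    // Asking if the word from the sentence exists in the keywords, not the
    // other way around, to find repeated keywords in the sentence.
    Boolean found = keywords.contains(s.toLowerCase());
    if (found) {
        counter++;
        System.out.println("Found: " + s);
    }
}
// Statistics (branch body reconstructed from the output shown below):
if (counter > 0) {
    System.out.println("In sentence: " + sentence);
    System.out.println("Count: " + counter);
}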

Found: Say
Found: come
Found: you
Found: say
Found: 123woods
In sentence: Say that 123 of us will come by and meet you, say, at the woods of 123woods.
Count: 5
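On the "where" part that this answer leaves open: one possible reading is the character index at which each keyword starts. A sketch under that assumption follows. KeywordPositions and locate are hypothetical names, words are taken to be runs of letters and digits, and keywords are expected in lowercase.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper illustrating one interpretation of "where".
final class KeywordPositions {

    // A word is a run of Unicode letters and digits.
    private static final Pattern WORD = Pattern.compile("[\\p{L}\\p{N}]+");

    // Maps each keyword found (lowercased) to the start indexes of its occurrences.
    static Map<String, List<Integer>> locate(String sentence, Set<String> keywords) {
        Map<String, List<Integer>> positions = new LinkedHashMap<>();
        Matcher m = WORD.matcher(sentence);
        while (m.find()) {
            String word = m.group().toLowerCase(Locale.ROOT);
            if (keywords.contains(word)) {
                positions.computeIfAbsent(word, k -> new ArrayList<>()).add(m.start());
            }
        }
        return positions;
    }
}
```

For the sentence above and the keywords 123woods, come, you, say, this yields say at indexes 0 and 46, come at 24, you at 41 and 123woods at 67. The total of the list sizes is the count of 5 shown in the output. Keeping the keywords in a HashSet makes each lookup constant time, so the text is scanned only once.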


Substring search in Java

I want to check whether this string contains "world". I tried the following functions, but they have some problems. I used String.indexOf(), but if I search for "w" it also says it exists. In short, I think I am looking for an exact comparison. Is there a good function for this in Java? Also, is there a function in Java that can calculate log base 2?

6 Answers

I’m assuming the problems you’re having with indexOf() are related to your using the character version (otherwise, why would you be searching for w when looking for world?). If so, indexOf() can take a string argument to search for:

String s = "hello world i am from heaven";
if (s.indexOf("world") != -1) {
    // it contains world
}

As for log base 2, that’s easy:

public static double log2(double d) {
    return Math.log(d) / Math.log(2.0d);
}
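A small addition on the log2 helper: the division is a change of base and goes through floating point, so the result can be off by a rounding error. If only the integer part is needed for a positive int, the position of the highest set bit gives it exactly. The sketch below shows both. Log2 and floorLog2 are names chosen for this example.

```java
// Hypothetical utility class for illustration.
final class Log2 {

    // Change of base: log2(d) = ln(d) / ln(2). Subject to rounding error.
    static double log2(double d) {
        return Math.log(d) / Math.log(2.0);
    }

    // Exact floor(log2(n)) for positive ints, from the highest set bit.
    static int floorLog2(int n) {
        if (n <= 0) {
            throw new IllegalArgumentException("n must be positive");
        }
        return 31 - Integer.numberOfLeadingZeros(n);
    }
}
```

For example, floorLog2(8) is 3 and floorLog2(1000) is 9, while log2(1000.0) is a double close to 9.97.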

I split ("explode") every string to do some kind of comparison in my application. Sometimes the search term is a single char, which creates these problems. If I passed the argument myself, I know I could pass a string, but the argument comes in at run time. Can you please suggest another solution?

Or, put another way: I am looking for "organization" in a string, and if at run time the argument is "or", it will again say that it exists.
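If the goal in these comments is that "or" should not count as a hit inside "organization", one option is a whole-word check instead of a plain substring check. This is a sketch of that idea, assuming words are delimited the way the regex \b anchor sees them. WordCheck and containsWord are invented names.

```java
import java.util.regex.Pattern;

// Hypothetical helper for a whole-word test.
final class WordCheck {

    // True only if 'word' occurs in 'text' as a complete word.
    static boolean containsWord(String text, String word) {
        if (word.isEmpty()) {
            return false;
        }
        // Pattern.quote keeps characters such as '.' or '+' literal.
        Pattern p = Pattern.compile("\\b" + Pattern.quote(word) + "\\b");
        return p.matcher(text).find();
    }
}
```

With this, containsWord("hello world i am from heaven", "world") is true, while searching the same text for "w", or searching "I am looking for organization" for "or", gives false.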

For an exact String comparison, you can simply do:

boolean match = stringA.equals(stringB); 

If you want to check that a string contains a substring, you can do:

boolean contains = string.contains(substring); 

For more String methods, see the javadocs

String s = "hello world i am from heaven";
if (s.contains("world")) {
    // This string contains "world"
}

This is a simple and easy-to-use function. As a one-liner:

String s = "hello world i am from heaven";
if (s.contains("world")) System.out.println("It exists. ");

Hi My version is as below:

package com.example.basic;

public class FindSubString {

    public String findTheSubString(String searchString, String inputString) {
        StringBuilder builder = new StringBuilder(inputString);
        System.out.println("The capacity of the String " + builder.capacity());
        System.out.println("pos of" + builder.indexOf(searchString));
        return builder.substring(builder.indexOf(searchString),
                builder.indexOf(searchString) + searchString.length());
    }

    public static void main(String[] args) {
        String myString = "Please find if I am in this String and will I be found";
        String searchString = "I am in this String";
        FindSubString subString = new FindSubString();
        boolean isPresent = myString.contains(searchString);
        if (isPresent) {
            System.out.println("The substring is present " + isPresent + myString.length());
            System.out.println(subString.findTheSubString(searchString, myString));
        } else {
            System.out.println("The substring is ! present " + isPresent);
        }
    }
}

Please let me know if it was useful.


I got the solution finally.

public void onTextChanged(CharSequence s, int start, int before, int count) {
    // Element type assumed; the generic parameters were lost in the source.
    ArrayList<HashMap<String, String>> arrayTemplist = new ArrayList<HashMap<String, String>>();
    String searchString = mEditText.getText().toString();
    if (searchString.equals("")) {
        // (the body of this branch is missing in the source)
    } else {
        for (int i = 0; i < arraylist.size(); i++) {
            String currentString = arraylist.get(i).get(MainActivity.COUNTRY);
            if (searchString.toLowerCase().contains(currentString.toLowerCase())) {
                // pass the character-sequence instead of currentstring
                arrayTemplist.add(arraylist.get(i));
            }
        }
    }
    adapter = new ListViewAdapter(MainActivity.this, arrayTemplist);
    listview.setAdapter(adapter);
}

Replace the above code with this one:

public void onTextChanged(CharSequence s, int start, int before, int count) {
    // Element type assumed; the generic parameters were lost in the source.
    ArrayList<HashMap<String, String>> arrayTemplist = new ArrayList<HashMap<String, String>>();
    String searchString = mEditText.getText().toString();
    if (searchString.equals("")) {
        // (the body of this branch is missing in the source)
    } else {
        for (int i = 0; i < arraylist.size(); i++) {
            String currentString = arraylist.get(i).get(MainActivity.COUNTRY);
            if (currentString.toLowerCase().contains(searchString.toLowerCase())) {
                // pass the character-sequence instead of currentstring
                arrayTemplist.add(arraylist.get(i));
            }
        }
    }
    adapter = new ListViewAdapter(MainActivity.this, arrayTemplist);
    listview.setAdapter(adapter);
}
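The only difference between the two versions is which string is searched and which one is searched for: the list entry has to be the receiver of contains(), and the typed text the argument. A stripped-down sketch of that filter without the Android parts follows. CountryFilter and filter are hypothetical names, and a plain list of strings stands in for the list of maps.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical, Android-free version of the filter for illustration.
final class CountryFilter {

    // Keeps the entries that contain the typed text, ignoring case.
    static List<String> filter(List<String> countries, String typed) {
        String needle = typed.toLowerCase(Locale.ROOT);
        List<String> result = new ArrayList<>();
        for (String country : countries) {
            // The entry is the haystack; the typed text is the needle.
            if (country.toLowerCase(Locale.ROOT).contains(needle)) {
                result.add(country);
            }
        }
        return result;
    }
}
```

Typing "ind" against India, Indonesia and China keeps the first two. With the arguments swapped, as in the first version, nothing would match until the user had typed at least a full country name. Locale.ROOT is used so that lowercasing does not depend on the device locale.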


Searching for, getting, and removing a substring in a String

The indexOf and lastIndexOf methods let you search for strings within strings. There are 4 variants of these methods:

The indexOf method searches our string for the specified string. It can search from the beginning of the string or starting from a given index (the second method). If the string is found, the method returns the index of its first character; if it is not found, it returns -1.

int indexOf(String str)
String s = "Good news everyone!"; int index = s.indexOf("ne"); // index is 5
int indexOf(String str, int from)
String s = "Good news everyone!"; int index = s.indexOf("ne", 7); // index is 16
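The two-argument form is what makes it possible to walk through all occurrences: each search resumes just after the previous hit. A short sketch follows. AllOccurrences and indexesOf are names made up for this example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper built on indexOf(String, int).
final class AllOccurrences {

    // Returns the start index of every occurrence of sub in s.
    static List<Integer> indexesOf(String s, String sub) {
        List<Integer> result = new ArrayList<>();
        if (sub.isEmpty()) {
            return result; // an empty needle would match at every position
        }
        int from = 0;
        int index;
        while ((index = s.indexOf(sub, from)) != -1) {
            result.add(index);
            from = index + 1; // step by 1 so overlapping hits are found too
        }
        return result;
    }
}
```

For "Good news everyone!" and "ne" this returns the indexes 5 and 16. To skip overlapping occurrences, advance by sub.length() instead of 1.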

The lastIndexOf method searches our string for the specified string starting from the end! It can search from the very end of the string or backwards from a given index (the second method). If the string is found, the method returns the index of its first character; if it is not found, it returns -1.

int lastIndexOf(String str)
String s = "Good news everyone!"; int index = s.lastIndexOf("ne"); // index is 16
int lastIndexOf(String str, int from)
String s = "Good news everyone!"; int index = s.lastIndexOf("ne", 15); // index is 5

9) How do I replace part of a string with another?

There are three methods for this.

The replace method replaces all occurrences of a given character with another.

The replaceAll method replaces all occurrences of one substring with another.

The replaceFirst method replaces the first occurrence of the given substring with the specified substring.

String replace(char oldChar, char newChar)
String s = "Good news everyone!"; String s2 = s.replace('o', 'a'); // "Gaad news everyane!"
String replaceAll(String regex, String replacement)
String s = "Good news everyone!"; String s2 = s.replaceAll("ne", "_"); // "Good _ws everyo_!"
String replaceFirst(String regex, String replacement)
String s = "Good news everyone!"; String s2 = s.replaceFirst("ne", "_"); // "Good _ws everyone!"

But you need to be careful here. In the last two methods (replaceAll and replaceFirst), the string being searched for is passed not as a plain string but as a regular expression. I will cover that later.
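To make the warning concrete, here is a small sketch of what goes wrong when the search string contains a regex metacharacter, and two ways around it. ReplaceDemo and its method names are made up for this illustration; the String, Pattern and Matcher calls are standard JDK methods.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo of literal vs. regex replacement.
final class ReplaceDemo {

    // Regex replacement: "." means "any character", so everything is replaced.
    static String regexDot(String s) {
        return s.replaceAll(".", "_");
    }

    // replace(CharSequence, CharSequence) treats both arguments literally
    // and still replaces every occurrence.
    static String literalDot(String s) {
        return s.replace(".", "_");
    }

    // replaceAll with both the pattern and the replacement made literal.
    static String literalReplaceAll(String s, String target, String replacement) {
        return s.replaceAll(Pattern.quote(target), Matcher.quoteReplacement(replacement));
    }
}
```

For "a.b.c", regexDot returns "_____" while literalDot returns "a_b_c". The replacement string has special characters of its own in replaceAll and replaceFirst ($ and \), which is why literalReplaceAll also wraps it in Matcher.quoteReplacement.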
