Diff java and с

Library for performing the comparison operations between texts

dnaumenko/java-diff-utils

Diff Utils is an open-source library for comparing texts: computing diffs, applying patches, generating and parsing unified diffs, and producing diff output for later display (such as a side-by-side view).

The main reason for building this library was the lack of easy-to-use libraries that cover everything you usually need when working with diff files. It was originally inspired by the JRCS library and its nicely designed diff module.

  • computing the difference between two texts
  • handling more than plain ASCII: arrays or lists of any type that implements hashCode() and equals() correctly can be diffed with this library
  • patching and unpatching a text with a given patch
  • parsing the unified diff format
  • producing human-readable differences
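
To illustrate why correct equals() matters for diffing lists of arbitrary types, here is a minimal sketch that diffs two lists with a plain longest-common-subsequence table. `ListDiffSketch` and its output format are hypothetical and are not part of Diff Utils, which uses a different algorithm internally.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Diff Utils: an LCS-based diff over lists of any type.
class ListDiffSketch {

    /** Returns an edit script: "  x" for kept, "- x" for removed, "+ x" for added elements. */
    static <T> List<String> diff(List<T> original, List<T> revised) {
        int n = original.size();
        int m = revised.size();
        // lcs[i][j] = length of the longest common subsequence of original[i..] and revised[j..]
        int[][] lcs = new int[n + 1][m + 1];
        for (int i = n - 1; i >= 0; i--) {
            for (int j = m - 1; j >= 0; j--) {
                if (original.get(i).equals(revised.get(j))) {
                    lcs[i][j] = lcs[i + 1][j + 1] + 1;
                } else {
                    lcs[i][j] = Math.max(lcs[i + 1][j], lcs[i][j + 1]);
                }
            }
        }
        // Walk the table from the start and emit one line per element.
        List<String> script = new ArrayList<>();
        int i = 0;
        int j = 0;
        while (i < n && j < m) {
            if (original.get(i).equals(revised.get(j))) {
                script.add("  " + original.get(i));
                i++;
                j++;
            } else if (lcs[i + 1][j] >= lcs[i][j + 1]) {
                script.add("- " + original.get(i));
                i++;
            } else {
                script.add("+ " + revised.get(j));
                j++;
            }
        }
        while (i < n) {
            script.add("- " + original.get(i++));
        }
        while (j < m) {
            script.add("+ " + revised.get(j++));
        }
        return script;
    }

    public static void main(String[] args) {
        List<Integer> original = Arrays.asList(1, 2, 3, 4);
        List<Integer> revised = Arrays.asList(1, 3, 4, 5);
        diff(original, revised).forEach(System.out::println);
    }
}
```

Because the comparison goes through equals(), the same method works for strings, integers, or any custom type with a sound equality contract.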

This library implements Myers' diff algorithm, but it can easily be replaced by any other algorithm that handles your texts better. I plan to add implementations of some others in the future.
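
For readers unfamiliar with Myers' algorithm, the sketch below computes only the length of the shortest edit script (insertions plus deletions) between two strings using the greedy diagonal search from Myers' paper. `MyersSketch` is a hypothetical name; this is an illustration of the idea, not the code this library ships.

```java
// Hypothetical illustration of Myers' greedy search; not the library's implementation.
class MyersSketch {

    /** Returns the minimum number of single-character insertions and deletions turning a into b. */
    static int shortestEditLength(String a, String b) {
        int n = a.length();
        int m = b.length();
        int max = n + m;
        // v[offset + k] holds the furthest x reached so far on diagonal k (where k = x - y).
        int[] v = new int[2 * max + 2];
        int offset = max;
        for (int d = 0; d <= max; d++) {
            for (int k = -d; k <= d; k += 2) {
                int x;
                if (k == -d || (k != d && v[offset + k - 1] < v[offset + k + 1])) {
                    x = v[offset + k + 1];      // step down: insertion from b
                } else {
                    x = v[offset + k - 1] + 1;  // step right: deletion from a
                }
                int y = x - k;
                // Follow the "snake" of matching characters along the diagonal.
                while (x < n && y < m && a.charAt(x) == b.charAt(y)) {
                    x++;
                    y++;
                }
                v[offset + k] = x;
                if (x >= n && y >= m) {
                    return d;
                }
            }
        }
        return max; // not reached: d == max always suffices
    }

    public static void main(String[] args) {
        System.out.println(shortestEditLength("ABCABBA", "CBABAC"));
    }
}
```

The outer loop tries edit scripts of growing length d, so the first d that reaches the end of both inputs is minimal.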

  • JDK 1.5 compatibility
  • Ant build script
  • Generate output in unified diff format (thanks to Bill James)
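
As a small illustration of what working with the unified diff format involves, the sketch below parses a hunk header line such as `@@ -1,3 +1,4 @@` into its four numbers. `HunkHeaderSketch` is a hypothetical helper written for this example; it is not the parser used by the library.

```java
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: parses only the "@@ -a,b +c,d @@" header of a unified diff hunk.
class HunkHeaderSketch {

    // In the unified format a missing count (",b" or ",d") means a count of 1.
    private static final Pattern HEADER =
            Pattern.compile("^@@ -(\\d+)(?:,(\\d+))? \\+(\\d+)(?:,(\\d+))? @@.*$");

    /** Returns {oldStart, oldCount, newStart, newCount}, or null if the line is not a hunk header. */
    static int[] parse(String line) {
        Matcher m = HEADER.matcher(line);
        if (!m.matches()) {
            return null;
        }
        return new int[] {
            Integer.parseInt(m.group(1)),
            m.group(2) == null ? 1 : Integer.parseInt(m.group(2)),
            Integer.parseInt(m.group(3)),
            m.group(4) == null ? 1 : Integer.parseInt(m.group(4))
        };
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(parse("@@ -1,3 +1,4 @@")));
    }
}
```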

Just add the code below to your Maven dependencies:

    <dependency>
        <groupId>com.googlecode.java-diff-utils</groupId>
        <artifactId>diffutils</artifactId>
        <version>1.3.0</version>
    </dependency>

Or, with Ivy:

    <dependency org="com.googlecode.java-diff-utils" name="diffutils" rev="1.3.0"/>

  • support for inline diffs in output
  • helpers for showing side-by-side, line-by-line diffs or text with inter-line and intra-line change highlights
  • customization of diff algorithm for better experience while computing diffs between strings (ignoring blank lines or spaces, etc)
  • generating output in other formats (not only unified). E.g. CVS.
This work is licensed under The Apache Software License, Version 1.1. Reason: the code contains work by HP, which contributed it under Apache-1.1 (example code). It was easier to change the license to Apache-1.1 than to contact HP Legal about code created in 2003 at HP Bristol.

Finding the Difference Between Two Strings in Java

This quick tutorial shows how to find the difference between two strings using Java.

We'll use two existing Java libraries and compare their approaches to the problem.

2. The Problem

Consider the following requirement: we want to find the difference between the strings "ABCDELMN" and "ABCFGLMN".

Depending on the output format we need, and setting aside the option of writing our own code, we found two main options available.

The first is a library written by Google called diff-match-patch. As its authors put it, the library offers robust algorithms for synchronizing plain text.

The other option is the StringUtils class from Apache Commons Lang.

Let's look at the differences between the two.

3. diff-match-patch

For this article, we'll use a fork of the original Google library, since no artifacts of the original are published on Maven Central. In addition, some class names in the fork differ from the original codebase and follow Java conventions more closely.

First, we need to add its dependency to our pom.xml file:

    <dependency>
        <groupId>org.bitbucket.cowwoc</groupId>
        <artifactId>diff-match-patch</artifactId>
        <version>1.2</version>
    </dependency>

Then, consider this code:

    String text1 = "ABCDELMN";
    String text2 = "ABCFGLMN";
    DiffMatchPatch dmp = new DiffMatchPatch();
    // the third argument (checkLines) is false: compare character by character
    LinkedList<Diff> diff = dmp.diffMain(text1, text2, false);

Running the code above, which computes the difference between text1 and text2, and printing the diff variable produces this output:

    [Diff(EQUAL,"ABC"), Diff(DELETE,"DE"), Diff(INSERT,"FG"), Diff(EQUAL,"LMN")]

The output is in fact a list of Diff objects, each consisting of an operation type (INSERT, DELETE, or EQUAL) and the portion of text associated with that operation.

Running the diff between text2 and text1 instead gives:

    [Diff(EQUAL,"ABC"), Diff(DELETE,"FG"), Diff(INSERT,"DE"), Diff(EQUAL,"LMN")]
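
To see where a result of this shape comes from, consider the simplest possible strategy: strip the common prefix and the common suffix, then report whatever remains as one deletion and one insertion. The sketch below does exactly that in plain Java. `TrimDiffSketch` is a hypothetical class and covers only this trimming step; that diff-match-patch performs a similar trimming step before its main algorithm is an assumption of this sketch, and the real library does far more for inputs with several separate changes.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: prefix/suffix trimming only, not the diff-match-patch algorithm.
class TrimDiffSketch {

    static List<String> diff(String text1, String text2) {
        int max = Math.min(text1.length(), text2.length());
        // Length of the common prefix.
        int prefix = 0;
        while (prefix < max && text1.charAt(prefix) == text2.charAt(prefix)) {
            prefix++;
        }
        // Length of the common suffix, never overlapping the prefix.
        int suffix = 0;
        while (suffix < max - prefix
                && text1.charAt(text1.length() - 1 - suffix)
                        == text2.charAt(text2.length() - 1 - suffix)) {
            suffix++;
        }
        String deleted = text1.substring(prefix, text1.length() - suffix);
        String inserted = text2.substring(prefix, text2.length() - suffix);

        List<String> ops = new ArrayList<>();
        if (prefix > 0) {
            ops.add("EQUAL(" + text1.substring(0, prefix) + ")");
        }
        if (!deleted.isEmpty()) {
            ops.add("DELETE(" + deleted + ")");
        }
        if (!inserted.isEmpty()) {
            ops.add("INSERT(" + inserted + ")");
        }
        if (suffix > 0) {
            ops.add("EQUAL(" + text1.substring(text1.length() - suffix) + ")");
        }
        return ops;
    }

    public static void main(String[] args) {
        System.out.println(diff("ABCDELMN", "ABCFGLMN"));
    }
}
```

For the two example strings the changed region is contiguous, so trimming alone is enough to reproduce the four operations shown above.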

4. StringUtils

The class from Apache Commons takes a simpler approach.

First, we add the Apache Commons Lang dependency to our pom.xml file:

    <dependency>
        <groupId>org.apache.commons</groupId>
        <artifactId>commons-lang3</artifactId>
        <version>3.12.0</version>
    </dependency>

Then, to find the difference between two texts with Apache Commons, we call StringUtils#difference:

    StringUtils.difference(text1, text2)

The result is a simple string:

    FGLMN

Whereas running the diff between text2 and text1 returns:

    DELMN

This simple approach can be enhanced with StringUtils.indexOfDifference(), which returns the index at which the two strings start to differ (in our case the fourth character, index 3). That index can then be used to take a substring of the original string, showing what the two inputs have in common in addition to what differs.
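
To make the semantics of these two methods concrete, here is a plain-Java sketch that mimics them for non-null inputs. `StringDifferenceSketch` is a hypothetical stand-in written for this example; it is not the Apache Commons source and skips the null handling the real methods provide.

```java
// Hypothetical stand-in mimicking StringUtils.difference / indexOfDifference for non-null inputs.
class StringDifferenceSketch {

    /** Index of the first differing character, or -1 if the strings are equal. */
    static int indexOfDifference(String a, String b) {
        int limit = Math.min(a.length(), b.length());
        for (int i = 0; i < limit; i++) {
            if (a.charAt(i) != b.charAt(i)) {
                return i;
            }
        }
        // One string is a prefix of the other, or they are equal.
        return a.length() == b.length() ? -1 : limit;
    }

    /** Remainder of the second string, starting where it differs from the first. */
    static String difference(String a, String b) {
        int at = indexOfDifference(a, b);
        return at == -1 ? "" : b.substring(at);
    }

    public static void main(String[] args) {
        String text1 = "ABCDELMN";
        String text2 = "ABCFGLMN";
        int at = indexOfDifference(text1, text2);
        System.out.println("common prefix: " + text1.substring(0, at));
        System.out.println("difference:    " + difference(text1, text2));
    }
}
```

Note that nothing after the first mismatch is compared, which is why the shared "LMN" ending still appears in the result.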

5. Performance

For our benchmarks, we generate a list of 10,000 strings, each with a fixed 10-character part followed by 20 random alphabetic characters.
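
The article does not show how the inputs are built, so the following is only one plausible way to generate them. The class name, the `FIXEDPART_` prefix, the lowercase alphabet, and the seed are all assumptions made for this sketch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Hypothetical input generator; the original benchmark setup is not shown in the article.
class BenchmarkInputsSketch {

    static List<String> generate(int count, String fixedPrefix, int randomLength, long seed) {
        Random random = new Random(seed); // fixed seed keeps runs reproducible
        List<String> inputs = new ArrayList<>(count);
        for (int i = 0; i < count; i++) {
            StringBuilder sb = new StringBuilder(fixedPrefix);
            for (int j = 0; j < randomLength; j++) {
                sb.append((char) ('a' + random.nextInt(26))); // random lowercase letter
            }
            inputs.add(sb.toString());
        }
        return inputs;
    }

    public static void main(String[] args) {
        List<String> inputs = generate(10_000, "FIXEDPART_", 20, 42L);
        System.out.println(inputs.size() + " strings of length " + inputs.get(0).length());
    }
}
```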

We then loop through the list and diff the n-th element against the (n+1)-th element:

    @Benchmark
    public int diffMatchPatch() {
        for (int i = 0; i < inputs.size() - 1; i++) {
            diffMatchPatch.diffMain(inputs.get(i), inputs.get(i + 1), false);
        }
        return inputs.size();
    }

    @Benchmark
    public int stringUtils() {
        for (int i = 0; i < inputs.size() - 1; i++) {
            StringUtils.difference(inputs.get(i), inputs.get(i + 1));
        }
        return inputs.size();
    }

Finally, let's run the benchmarks and compare the two libraries:

    Benchmark                                   Mode  Cnt    Score   Error  Units
    StringDiffBenchmarkUnitTest.diffMatchPatch  avgt   50  130.559 ± 1.501  ms/op
    StringDiffBenchmarkUnitTest.stringUtils     avgt   50    0.211 ± 0.003  ms/op

6. Conclusion

In terms of pure execution speed, StringUtils is clearly faster, although it only returns the substring from which the two strings start to differ.

At the same time, diff-match-patch provides a more thorough comparison result, at the cost of performance.

The implementation of these examples and snippets is available on GitHub.

Diff Utils is an open-source library for performing comparison (diff) operations between texts or other kinds of data: computing diffs, applying patches, generating and parsing unified diffs, and producing diff output for later display (such as a side-by-side view).

java-diff-utils/java-diff-utils

The main reason for building this library was the lack of easy-to-use libraries that cover everything you usually need when working with diff files. It was originally inspired by the JRCS library and its nicely designed diff module.

This is originally a fork of java-diff-utils from Google Code Archive.

Javadocs of the actual release version: JavaDocs java-diff-utils

Look here to find more helpful information and examples.

These two outputs are generated using java-diff-utils. The source code can also be found on the Examples page:

Producing a one-liner that includes all difference information.

    // create a configured DiffRowGenerator
    DiffRowGenerator generator = DiffRowGenerator.create()
            .showInlineDiffs(true)
            .mergeOriginalRevised(true)
            .inlineDiffByWord(true)
            .oldTag(f -> "~")      // introduce markdown style for strikethrough
            .newTag(f -> "**")     // introduce markdown style for bold
            .build();

    // compute the differences for two test texts
    List<DiffRow> rows = generator.generateDiffRows(
            Arrays.asList("This is a test senctence."),
            Arrays.asList("This is a test for diffutils."));

    System.out.println(rows.get(0).getOldLine());

Output (rendered as markdown, the old word is struck through and the new words are bold):

    This is a test ~senctence~**for diffutils**.

Producing a side-by-side view of the computed differences.

    DiffRowGenerator generator = DiffRowGenerator.create()
            .showInlineDiffs(true)
            .inlineDiffByWord(true)
            .oldTag(f -> "~")
            .newTag(f -> "**")
            .build();
    List<DiffRow> rows = generator.generateDiffRows(
            Arrays.asList("This is a test senctence.", "This is the second line.", "And here is the finish."),
            Arrays.asList("This is a test for diffutils.", "This is the second line."));

    System.out.println("|original|new|");
    System.out.println("|--------|---|");
    for (DiffRow row : rows) {
        System.out.println("|" + row.getOldLine() + "|" + row.getNewLine() + "|");
    }

Output:

    |original|new|
    |--------|---|
    |This is a test ~senctence~.|This is a test **for diffutils**.|
    |This is the second line.|This is the second line.|
    |~And here is the finish.~||

  • computing the difference between two texts.
  • handling more than plain ASCII: arrays or lists of any type that implements hashCode() and equals() correctly can be diffed with this library
  • patch and unpatch the text with the given patch
  • parsing the unified diff format
  • producing human-readable differences
  • inline difference construction
  • Algorithms:
    • Myers standard algorithm
    • Myers with linear space improvement
    • HistogramDiff using the JGit library

The algorithm can easily be replaced by any other that handles your texts better. I plan to add implementations of some others in the future.
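
Among the features listed above, patching and unpatching are the easiest to picture with a toy model: a change records a position, the lines it expects to find there, and the lines that replace them, so undoing it is the same operation with the two sides swapped. `PatchSketch` and `Delta` below are hypothetical names for this illustration and do not mirror the library's own classes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical toy model of patch / unpatch with a single change; not the library's API.
class PatchSketch {

    /** One change: at `position`, the lines `original` are replaced by `revised`. */
    static final class Delta {
        final int position;
        final List<String> original;
        final List<String> revised;

        Delta(int position, List<String> original, List<String> revised) {
            this.position = position;
            this.original = original;
            this.revised = revised;
        }
    }

    static List<String> apply(List<String> text, Delta delta) {
        return replace(text, delta.position, delta.original, delta.revised);
    }

    static List<String> restore(List<String> text, Delta delta) {
        return replace(text, delta.position, delta.revised, delta.original);
    }

    private static List<String> replace(List<String> text, int position,
            List<String> expected, List<String> replacement) {
        int end = position + expected.size();
        // Refuse to patch if the text does not contain the expected lines.
        if (end > text.size() || !text.subList(position, end).equals(expected)) {
            throw new IllegalStateException("text does not match the patch at line " + position);
        }
        List<String> result = new ArrayList<>(text.subList(0, position));
        result.addAll(replacement);
        result.addAll(text.subList(end, text.size()));
        return result;
    }

    public static void main(String[] args) {
        List<String> text = Arrays.asList("a", "b", "c");
        Delta delta = new Delta(1, Arrays.asList("b"), Arrays.asList("x", "y"));
        List<String> patched = apply(text, delta);
        System.out.println(patched);
        System.out.println(restore(patched, delta));
    }
}
```

A patch with several changes would need them applied from the last position to the first, so earlier positions stay valid; the sketch leaves that out.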

Recently a checkstyle process was integrated into the build. java-diff-utils follows the Sun Java format convention. Tabs are not allowed; use spaces.

    public static <T> Patch<T> diff(List<T> original, List<T> revised,
            BiPredicate<T, T> equalizer) throws DiffException {
        if (equalizer != null) {
            return DiffUtils.diff(original, revised,
                    new MyersDiff<>(equalizer));
        }
        return DiffUtils.diff(original, revised, new MyersDiff<>());
    }

This is a valid piece of source code:

  • blocks without braces are not allowed
  • after control statements (if, while, for) a whitespace is expected
  • the opening brace should be on the same line as the control statement
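
The `equalizer` parameter in the signature above is a `BiPredicate<T, T>` that decides when two elements count as equal. The sketch below shows the effect of such a predicate on a comparison, using a simple longest-common-subsequence count rather than the library's Myers implementation. `EqualizerSketch` and `commonLength` are hypothetical names for this example.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.BiPredicate;

// Hypothetical illustration of a custom equality test; not the library's MyersDiff.
class EqualizerSketch {

    /** Length of the longest common subsequence of two lists under the given equality test. */
    static <T> int commonLength(List<T> original, List<T> revised, BiPredicate<T, T> equalizer) {
        BiPredicate<T, T> eq = equalizer;
        if (eq == null) {
            eq = (a, b) -> a.equals(b); // fall back to plain equals()
        }
        int n = original.size();
        int m = revised.size();
        int[][] lcs = new int[n + 1][m + 1];
        for (int i = n - 1; i >= 0; i--) {
            for (int j = m - 1; j >= 0; j--) {
                if (eq.test(original.get(i), revised.get(j))) {
                    lcs[i][j] = lcs[i + 1][j + 1] + 1;
                } else {
                    lcs[i][j] = Math.max(lcs[i + 1][j], lcs[i][j + 1]);
                }
            }
        }
        return lcs[0][0];
    }

    public static void main(String[] args) {
        List<String> original = Arrays.asList("Hello", "World");
        List<String> revised = Arrays.asList("hello ", "world");
        // Ignore case and surrounding whitespace when comparing lines.
        BiPredicate<String, String> lenient = (a, b) -> a.trim().equalsIgnoreCase(b.trim());
        System.out.println(commonLength(original, revised, lenient));
        System.out.println(commonLength(original, revised, null));
    }
}
```

With the lenient predicate both lines match, while plain equals() finds nothing in common, which is the kind of difference an equalizer makes to the resulting diff.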

Just add the code below to your Maven dependencies:

    <dependency>
        <groupId>io.github.java-diff-utils</groupId>
        <artifactId>java-diff-utils</artifactId>
        <version>4.12</version>
    </dependency>

Or, with Gradle:

    // https://mvnrepository.com/artifact/io.github.java-diff-utils/java-diff-utils
    implementation "io.github.java-diff-utils:java-diff-utils:4.12"
