Java 8 Streams - Stream.distinct Examples



Interface:

java.util.stream.Stream

Type hierarchy: java.lang.AutoCloseable, extended by java.util.stream.BaseStream, extended by java.util.stream.Stream.

Method:

Stream<T> distinct()

Returns a stream consisting of the distinct elements of this stream, where two elements count as duplicates if java.lang.Object#equals reports them as equal.

For ordered streams, the selection of distinct elements is stable: when several elements are equal to each other but are different instances, the one that appears first in the encounter order is kept. For unordered streams, no stability guarantees are made.

This is a stateful intermediate operation.
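
Because duplicates are detected through equals, the element class decides what "distinct" means. The following is a minimal sketch of that point; the classes Plain and Keyed and the two helper methods are hypothetical names introduced only for illustration. Plain inherits identity-based equals from Object, so two instances with the same name are both kept, while Keyed overrides equals and hashCode and is deduplicated.

```java
import java.util.stream.Stream;

public class DistinctEqualsSketch {

    // Hypothetical class that does NOT override equals/hashCode (identity comparison).
    static class Plain {
        final String name;
        Plain(String name) { this.name = name; }
    }

    // Hypothetical class that overrides equals/hashCode based on its name.
    static class Keyed {
        final String name;
        Keyed(String name) { this.name = name; }

        @Override
        public boolean equals(Object o) {
            return o instanceof Keyed && name.equals(((Keyed) o).name);
        }

        @Override
        public int hashCode() {
            return name.hashCode();
        }
    }

    // Two separate Plain instances are never equal, so both survive distinct().
    static long countDistinctPlain() {
        return Stream.of(new Plain("a"), new Plain("a")).distinct().count();
    }

    // Two Keyed instances with the same name are equal, so only one survives.
    static long countDistinctKeyed() {
        return Stream.of(new Keyed("a"), new Keyed("a")).distinct().count();
    }

    public static void main(String[] args) {
        System.out.println(countDistinctPlain()); // prints 2
        System.out.println(countDistinctKeyed()); // prints 1
    }
}
```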


Examples


package com.logicbig.example.stream;

import java.util.stream.Stream;

public class DistinctExample {

    public static void main(String... args) {
        Stream<String> s = Stream.of("one", "two", "three", "four", "two", "one");
        Stream<String> s2 = s.distinct();
        s2.forEach(System.out::println);
    }
}

Output

one
two
three
four
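
The primitive stream types offer the same operation. As a small sketch (the class and method names here are hypothetical), IntStream.distinct() keeps the first occurrence of each value in a sequential ordered stream, and DoubleStream.distinct() compares elements according to Double.compare, so 0.0 and -0.0 are treated as different values while two NaN values are treated as equal.

```java
import java.util.Arrays;
import java.util.stream.DoubleStream;
import java.util.stream.IntStream;

public class PrimitiveDistinctSketch {

    // Sequential, ordered source: the first occurrence of each value is kept.
    static int[] distinctInts() {
        return IntStream.of(3, 1, 3, 2, 1).distinct().toArray();
    }

    // Compared per Double.compare: 0.0 and -0.0 differ, NaN equals NaN.
    static long distinctDoubleCount() {
        return DoubleStream.of(0.0, -0.0, Double.NaN, Double.NaN).distinct().count();
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(distinctInts())); // prints [3, 1, 2]
        System.out.println(distinctDoubleCount());           // prints 3
    }
}
```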




Stream.distinct(), IntStream.distinct(), LongStream.distinct() and DoubleStream.distinct() each return a new stream of distinct elements. For object streams, distinct elements are selected with Object.equals(); the primitive streams compare the values themselves. This example compares the performance of ordered and unordered parallel streams when using the distinct method.

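package com.logicbig.example;

import java.util.stream.IntStream;

public class DistinctExample {

    public static void main (String[] args) {
        // PerformanceTestUtil is a timing helper from the example project;
        // its source is not shown in this section.
        PerformanceTestUtil.runTest("unordered stream", () -> {
            IntStream stream = IntStream.range(0, 1000000);
            // unordered(): distinct() need not preserve encounter order
            stream.unordered().parallel().distinct().count();
        });

        PerformanceTestUtil.runTest("ordered stream", () -> {
            IntStream stream = IntStream.range(0, 1000000);
            // ordered: distinct() must keep the first occurrence in encounter order
            stream.parallel().distinct().count();
        });
    }
}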

Output

unordered stream time taken: 140.0 milliseconds
ordered stream time taken: 529.0 milliseconds
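
The listing above calls PerformanceTestUtil.runTest, whose source does not appear here. To run the comparison, a stand-in is needed. The following is only a hypothetical sketch, assuming the helper takes a label and a Runnable and prints the elapsed wall-clock time; the real helper may differ (for example, it may repeat the task or warm up the JVM first).

```java
public class PerformanceTestUtil {

    // Hypothetical stand-in, not the original helper: runs the task once,
    // prints the elapsed time in milliseconds and also returns it.
    public static double runTest(String label, Runnable task) {
        long start = System.nanoTime();
        task.run();
        double millis = (System.nanoTime() - start) / 1_000_000.0;
        System.out.println(label + " time taken: " + millis + " milliseconds");
        return millis;
    }

    public static void main(String[] args) {
        // Example usage with a trivial task.
        runTest("empty task", () -> { });
    }
}
```

A single timed run like this is sensitive to JIT warm-up and machine load, so the absolute numbers will vary; only the relative difference between the two variants is of interest.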




This example shows how stability is maintained by an ordered parallel stream during the distinct operation. For an unordered stream there is no stability guarantee. A stable distinct operation keeps the first occurrence and drops the duplicates that come later in the encounter order, whereas an unstable distinct operation may keep any one of the equal elements, so the surviving instance can differ from run to run.

package com.logicbig.example;

import java.util.Arrays;
import java.util.stream.Stream;

public class DistinctStabilityExample {
    public static void main (String[] args) {

        Object[] myObjects = createStream().parallel().distinct().toArray();
        System.out.printf("ordered distinct result 1: %s%n",
                          Arrays.toString(myObjects));

        MyObject.c = 0;
        myObjects = createStream().parallel().distinct().toArray();
        System.out.printf("ordered distinct result 2: %s%n",
                          Arrays.toString(myObjects));

        MyObject.c = 0;
        myObjects = createStream().unordered().parallel().distinct().toArray();
        System.out.printf("unordered distinct result 1: %s%n",
                          Arrays.toString(myObjects));

        MyObject.c = 0;
        myObjects = createStream().unordered().parallel().distinct().toArray();
        System.out.printf("unordered distinct result 2: %s%n",
                          Arrays.toString(myObjects));
    }

    private static Stream<MyObject> createStream () {
        return Stream.of(new MyObject("a"), new MyObject("b"),
                         new MyObject("c"), new MyObject("b"),
                         new MyObject("c"), new MyObject("c"),
                         new MyObject("a"));
    }

    private static class MyObject {
        private static int c = 0;
        private int id = ++c;
        private String str;

        public MyObject (String str) {
            this.str = str;
        }

        @Override
        public boolean equals (Object o) {
            if (this == o)
                return true;
            if (o == null || getClass() != o.getClass())
                return false;

            MyObject myObject = (MyObject) o;

            return str != null ? str.equals(myObject.str) : myObject.str == null;
        }

        @Override
        public int hashCode () {
            return str != null ? str.hashCode() : 0;
        }

        @Override
        public String toString () {
            return "MyObject{id=" + id + ", str='" + str + "\'}";
        }
    }
}

Output

ordered distinct result 1: [MyObject{id=1, str='a'}, MyObject{id=2, str='b'}, MyObject{id=3, str='c'}]
ordered distinct result 2: [MyObject{id=1, str='a'}, MyObject{id=2, str='b'}, MyObject{id=3, str='c'}]
unordered distinct result 1: [MyObject{id=7, str='a'}, MyObject{id=4, str='b'}, MyObject{id=5, str='c'}]
unordered distinct result 2: [MyObject{id=7, str='a'}, MyObject{id=2, str='b'}, MyObject{id=5, str='c'}]
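
The stability guarantee for ordered streams can also be checked programmatically rather than by reading ids in the output. The sketch below is an illustration under assumed names (the class and method are hypothetical): it creates two String instances that are equal but not identical, runs an ordered parallel distinct, and checks by reference comparison that the first instance is the one that survived.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class FirstInstanceKeptSketch {

    // Returns true if an ordered parallel distinct() kept the first-encountered instance.
    static boolean keepsFirstInstance() {
        // new String(...) forces separate instances that are equal but not identical.
        String first = new String("a");
        String later = new String("a");
        List<String> source = Arrays.asList(first, new String("b"), later);

        // A List source is ordered, so distinct() must be stable even in parallel.
        List<String> result = source.stream()
                                    .parallel()
                                    .distinct()
                                    .collect(Collectors.toList());

        // Reference comparison (==) is intentional: we check which instance survived.
        return result.size() == 2 && result.get(0) == first;
    }

    public static void main(String[] args) {
        System.out.println(keepsFirstInstance()); // prints true
    }
}
```

Adding unordered() before distinct() would remove this guarantee, so the same check could then pass or fail from run to run.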



