Streams are lazy because intermediate operations are not evaluated until a terminal operation is invoked.
Each intermediate operation creates a new stream, stores the provided operation/function, and returns the new stream.
The pipeline accumulates these newly created streams.
When the terminal operation is called, traversal of the stream begins and the associated functions are applied to the elements one by one.
Parallel streams don't process the elements 'one by one' at the terminal point. Instead, the operations are performed concurrently, depending on the available cores.
Lazy evaluation in sequential stream
import java.time.LocalTime;
import java.util.stream.IntStream;

public static void main (String[] args) {
    IntStream stream = IntStream.range(1, 5);
    // Only builds the pipeline; no element is processed here.
    stream = stream.peek(i -> log("starting", i))
                   .filter(i -> {
                       log("filtering", i);
                       return i % 2 == 0;
                   })
                   .peek(i -> log("post filtering", i));
    log("Invoking terminal method count.");
    // The terminal operation triggers the traversal.
    log("The count is", stream.count());
}

public static void log (Object... objects) {
    String s = LocalTime.now().toString();
    for (Object object : objects) {
        s += " - " + object.toString();
    }
    System.out.println(s);
    // putting a little delay so that we can see a clear difference
    // with parallel stream.
    try {
        Thread.sleep(1);
    } catch (InterruptedException e) {
        // restore the interrupt flag instead of swallowing the interruption
        Thread.currentThread().interrupt();
    }
}
Output:
19:04:55.062 - Invoking terminal method count.
19:04:55.072 - starting - 1
19:04:55.074 - filtering - 1
19:04:55.076 - starting - 2
19:04:55.077 - filtering - 2
19:04:55.078 - post filtering - 2
19:04:55.079 - starting - 3
19:04:55.080 - filtering - 3
19:04:55.081 - starting - 4
19:04:55.082 - filtering - 4
19:04:55.083 - post filtering - 4
19:04:55.084 - The count is - 2
The above output shows that all iteration and function evaluation begins only after the terminal method IntStream#count() is invoked.
In the above example we used the method IntStream#peek(). Note that this method is recommended for logging and debugging purposes only. We shouldn't perform stateful operations or apply side effects in this function. Here's the API note:
This method exists mainly to support debugging, where you want to see the elements as they flow past a certain point in a pipeline
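To make the note concrete, here is a minimal sketch (the class and method names are hypothetical, not part of the original example) in which peek() is used only to print elements, while the actual result is produced by the terminal operation rather than by mutating external state inside peek(). Another reason not to rely on peek() for side effects: from Java 9 onward, the API documentation states that the action may not be invoked at all when the implementation can optimize away the production of elements, for instance when count() can be computed directly from the source.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical illustration class; not part of the original example project.
class PeekUsageSketch {

    // Returns the even numbers. peek() only prints what flows past;
    // the result itself comes from the terminal collect() operation.
    static List<Integer> evens(List<Integer> input) {
        return input.stream()
                    .peek(i -> System.out.println("saw " + i)) // debugging only
                    .filter(i -> i % 2 == 0)
                    .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(evens(Arrays.asList(1, 2, 3, 4)));
    }
}
```

Removing the peek() line changes nothing about the returned list, which is a reasonable test of whether peek() is being used only for observation.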
Lazy evaluation in parallel stream
public static void main (String[] args) {
    IntStream stream = IntStream.range(1, 5).parallel();
    stream = stream.peek(i -> log("starting", i))
                   .filter(i -> {
                       log("filtering", i);
                       return i % 2 == 0;
                   })
                   .peek(i -> log("post filtering", i));
    log("Invoking terminal method count.");
    log("The count is", stream.count());
}
Output:
19:06:19.604 - Invoking terminal method count.
19:06:19.616 - starting - 3
19:06:19.616 - starting - 1
19:06:19.616 - starting - 4
19:06:19.616 - starting - 2
19:06:19.617 - filtering - 4
19:06:19.617 - filtering - 2
19:06:19.617 - filtering - 1
19:06:19.617 - filtering - 3
19:06:19.618 - post filtering - 2
19:06:19.618 - post filtering - 4
19:06:19.620 - The count is - 2
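The interleaved order above comes from the elements being handled by more than one thread. The following sketch (hypothetical class and field names) records which threads touched the elements of the same range. Here peek() is used purely for observation, and the thread names are stored in a thread-safe set. Which threads appear, and how many, depends on the machine and the run; typically they are the calling thread and workers of the common fork/join pool.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.stream.IntStream;

// Hypothetical illustration class; not part of the original example project.
class ParallelThreadsSketch {

    // Thread-safe set of the names of threads that processed at least one element.
    static final Set<String> threadNames = ConcurrentHashMap.newKeySet();

    // Same pipeline shape as in the example above, but recording thread names
    // instead of logging each step.
    static long countEvens() {
        return IntStream.range(1, 5)
                        .parallel()
                        .peek(i -> threadNames.add(Thread.currentThread().getName()))
                        .filter(i -> i % 2 == 0)
                        .count();
    }

    public static void main(String[] args) {
        System.out.println("count: " + countEvens());
        // The set contents vary between runs and machines.
        System.out.println("threads used: " + threadNames);
    }
}
```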
What are the advantages of Laziness?
Lazy operations improve efficiency: no work is done until a result is actually requested, and each element is read from the source only at that point, so the pipeline does not work on stale data. Laziness is useful in situations where input data is consumed gradually rather than being available as a complete set of elements beforehand. For example, consider an infinite stream created using Stream#generate(Supplier<T>), where the provided Supplier gradually receives data from a remote server. In such situations the server calls are made only once a terminal operation is invoked, when the data is actually needed.
Also consider a stream on which we have already applied a number of intermediate operations but haven't applied a terminal operation yet. We can pass such a stream around within the application without actually performing any operation on the underlying data; the terminal operation may be called in a very different part of the application or much later in time.
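The Stream#generate scenario can be sketched as follows. This is an illustration under stated assumptions, not a real client: fetchNext() is a hypothetical stand-in for a remote call and simply counts how often it is invoked, and limit(3) is added so that the infinite stream terminates.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical illustration class; not part of the original example project.
class LazyGenerateSketch {

    // Counts the simulated "remote" calls.
    static final AtomicInteger remoteCalls = new AtomicInteger();

    // Hypothetical stand-in for fetching the next item from a remote server.
    static String fetchNext() {
        return "item-" + remoteCalls.incrementAndGet();
    }

    // Infinite stream; creating it does not call fetchNext().
    static Stream<String> items() {
        return Stream.generate(LazyGenerateSketch::fetchNext);
    }

    public static void main(String[] args) {
        Stream<String> firstThree = items().limit(3);
        System.out.println("calls before terminal operation: " + remoteCalls.get());
        List<String> result = firstThree.collect(Collectors.toList());
        System.out.println(result);
        System.out.println("calls after terminal operation: " + remoteCalls.get());
    }
}
```

Building the pipeline with generate() and limit() makes no call to the supplier; the calls happen only while collect() pulls elements through.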
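A minimal sketch of passing an unevaluated pipeline around (the class and the method names prepare and consume are hypothetical): one method builds the pipeline and returns it, and another method, possibly in a different part of the application, decides when to run it.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical illustration class; not part of the original example project.
class DeferredPipelineSketch {

    // Builds a pipeline and hands it back without processing any element.
    static Stream<String> prepare(List<String> names) {
        return names.stream()
                    .filter(n -> !n.isEmpty())
                    .map(String::toUpperCase);
    }

    // Another part of the application adds a step and invokes the terminal operation.
    static List<String> consume(Stream<String> pipeline) {
        return pipeline.sorted().collect(Collectors.toList());
    }

    public static void main(String[] args) {
        Stream<String> pipeline = prepare(Arrays.asList("bob", "", "alice"));
        // Other work can happen here; no element has been filtered or mapped yet.
        System.out.println(consume(pipeline)); // prints [ALICE, BOB]
    }
}
```

Keep in mind that a stream can be consumed only once: invoking a second terminal operation on the same stream throws an IllegalStateException, so whoever receives the pipeline should be its only consumer.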
Example project
Dependencies and Technologies Used: