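Determining the types#
A custom collector implements the Collector interface. The first step is to fix its three type parameters:
- the type of the elements being collected
- the type of the accumulator (the mutable result container)
- the type of the final result
Suppose we want to implement the following collector: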
public class GroupingBy<T,K> implements Collector<T,Map<K,List<T>>,Map<K,List<T>>>
Its three types are, respectively:
- T
- Map<K,List<T>>
- Map<K,List<T>>
Implementing the collector's components#
A collector has four main components, each of which is a function:
- supplier
- accumulator
- combiner
- finisher
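To see how the four functions fit together, the following sketch drives a collector by hand over two chunks of input, roughly the way a parallel stream might. The class name `CollectorByHand` and the helpers `collectInTwoChunks` and `joining` are made up for this illustration; the real stream implementation is more involved.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collector;

public class CollectorByHand {

    // Hypothetical helper: applies the four components in order.
    static <T, A, R> R collectInTwoChunks(Collector<T, A, R> collector,
                                          List<T> first, List<T> second) {
        A left = collector.supplier().get();   // one fresh container per chunk
        A right = collector.supplier().get();
        for (T t : first) {
            collector.accumulator().accept(left, t);    // fold elements in
        }
        for (T t : second) {
            collector.accumulator().accept(right, t);
        }
        A merged = collector.combiner().apply(left, right); // merge partial results
        return collector.finisher().apply(merged);          // convert A to R
    }

    // A small collector whose accumulator type (StringBuilder)
    // differs from its result type (String).
    static Collector<String, StringBuilder, String> joining() {
        return Collector.of(
                StringBuilder::new,
                (sb, s) -> sb.append(s),
                (a, b) -> a.append(b),
                StringBuilder::toString);
    }

    public static void main(String[] args) {
        String result = collectInTwoChunks(
                joining(), Arrays.asList("a", "b"), Arrays.asList("c"));
        System.out.println(result); // prints abc
    }
}
```

The sections below implement each of these four functions for GroupingBy.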
supplier#
supplier creates the result container.
@Override
public Supplier<Map<K, List<T>>> supplier() {
    return () -> new HashMap<>();
}
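accumulator#
accumulator is the folding function. It corresponds to the second argument of the three-argument reduce and adds the next element to the result accumulated so far.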
@Override
public BiConsumer<Map<K, List<T>>, T> accumulator() {
    return (accumulator, ele) -> {
        K key = this.classifier.apply(ele);
        List<T> tList = accumulator.get(key);
        if (tList == null) {
            tList = new ArrayList<>();
        }
        tList.add(ele);
        accumulator.put(key, tList);
    };
}
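Before adding an element, the accumulator checks whether the map already holds a list for its key and creates one if it does not.
The key step is obtaining the key: it is computed by the classifier function that is passed into the collector.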
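The role of the classifier can be shown in isolation. In the sketch below, the same accumulator logic produces different groupings depending only on the classifier it is given. The names `ClassifierDemo`, `groupingAccumulator`, and `group` are hypothetical, and `Map.computeIfAbsent` stands in for the explicit null check as a more compact equivalent.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.BiConsumer;
import java.util.function.Function;

public class ClassifierDemo {

    // Builds an accumulator around a classifier: the classifier decides the key,
    // computeIfAbsent creates the list for a key the first time it is seen.
    static <T, K> BiConsumer<Map<K, List<T>>, T> groupingAccumulator(
            Function<? super T, ? extends K> classifier) {
        return (map, ele) ->
                map.computeIfAbsent(classifier.apply(ele), k -> new ArrayList<>()).add(ele);
    }

    // Feeds every element through the accumulator into one map.
    static <T, K> Map<K, List<T>> group(List<T> elements,
                                        Function<? super T, ? extends K> classifier) {
        Map<K, List<T>> result = new HashMap<>();
        BiConsumer<Map<K, List<T>>, T> accumulator = groupingAccumulator(classifier);
        for (T ele : elements) {
            accumulator.accept(result, ele);
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("ant", "bee", "bear", "apple");

        Map<Integer, List<String>> byLength = group(words, String::length);
        // TreeMap is used only to print the keys in sorted order.
        System.out.println(new TreeMap<>(byLength));    // {3=[ant, bee], 4=[bear], 5=[apple]}

        Map<Character, List<String>> byInitial = group(words, w -> w.charAt(0));
        System.out.println(new TreeMap<>(byInitial));   // {a=[ant, apple], b=[bee, bear]}
    }
}
```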
combiner#
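combiner corresponds to the third argument of reduce. It merges the containers produced for separate parts of the stream, for example by a parallel stream.
@Override
public BinaryOperator<Map<K, List<T>>> combiner() {
    return (l, r) -> {
        // Merge per key so that lists under the same key are appended, not replaced.
        r.forEach((key, list) -> l.merge(key, list, (a, b) -> {
            a.addAll(b);
            return a;
        }));
        return l;
    };
}
The merge has to happen per key: when both maps contain the same key, the right-hand list is appended to the left-hand one. A plain l.putAll(r) would replace the left-hand list and silently drop its elements.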
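How lists are combined matters whenever two partial maps share a key. The sketch below contrasts `Map.putAll`, which replaces the existing list, with a per-key `Map.merge`, which concatenates the lists. The class `CombinerMergeDemo` and its helpers `mergeGroups` and `groupOf` are hypothetical names used only for this illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class CombinerMergeDemo {

    // Per-key merge: lists stored under the same key are concatenated.
    static <K, T> Map<K, List<T>> mergeGroups(Map<K, List<T>> left, Map<K, List<T>> right) {
        right.forEach((key, values) ->
                left.merge(key, values, (a, b) -> { a.addAll(b); return a; }));
        return left;
    }

    // Builds a one-entry map holding a mutable list, for the demo only.
    static Map<Integer, List<String>> groupOf(int key, String... words) {
        Map<Integer, List<String>> map = new HashMap<>();
        map.put(key, new ArrayList<>(Arrays.asList(words)));
        return map;
    }

    public static void main(String[] args) {
        Map<Integer, List<String>> overwritten = groupOf(2, "bb");
        overwritten.putAll(groupOf(2, "cc"));
        System.out.println(overwritten); // {2=[cc]}, "bb" is lost

        Map<Integer, List<String>> merged = mergeGroups(groupOf(2, "bb"), groupOf(2, "cc"));
        System.out.println(merged);      // {2=[bb, cc]}
    }
}
```

The difference only shows up when the combiner is actually invoked, which typically means a parallel stream, so a sequential test can pass even with a lossy combiner.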
finisher#
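finisher converts the accumulator into the final result. Here the two have the same type, so it returns the accumulator unchanged.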
@Override
public Function<Map<K, List<T>>, Map<K, List<T>>> finisher() {
    return accumulator -> accumulator;
}
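Extra: characteristics#
characteristics describes properties of the collector that the stream implementation can use to optimize the reduction.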
@Override
public Set<Characteristics> characteristics() {
    return Collections.unmodifiableSet(EnumSet.of(Characteristics.IDENTITY_FINISH));
}
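IDENTITY_FINISH is only appropriate when the finisher really is the identity. As a rough illustration, the sketch below builds two list collectors with the JDK factory `Collector.of`: one without a finisher, which reports IDENTITY_FINISH, and one whose finisher wraps the list, which does not. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collector;

public class CharacteristicsDemo {

    // No finisher given: Collector.of treats the finisher as the identity.
    static Collector<String, List<String>, List<String>> identityCollector() {
        return Collector.of(
                ArrayList::new,
                List::add,
                (l, r) -> { l.addAll(r); return l; });
    }

    // The finisher changes the result (wraps it), so it cannot be skipped.
    static Collector<String, List<String>, List<String>> unmodifiableCollector() {
        return Collector.of(
                ArrayList::new,
                List::add,
                (l, r) -> { l.addAll(r); return l; },
                list -> Collections.unmodifiableList(list));
    }

    public static void main(String[] args) {
        System.out.println(identityCollector().characteristics()
                .contains(Collector.Characteristics.IDENTITY_FINISH));     // true
        System.out.println(unmodifiableCollector().characteristics()
                .contains(Collector.Characteristics.IDENTITY_FINISH));     // false
    }
}
```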
The Javadoc of the Characteristics enum explains each value:
/**
 * Characteristics indicating properties of a {@code Collector}, which can
 * be used to optimize reduction implementations.
 */
enum Characteristics {
    /**
     * Indicates that this collector is <em>concurrent</em>, meaning that
     * the result container can support the accumulator function being
     * called concurrently with the same result container from multiple
     * threads.
     *
     * <p>If a {@code CONCURRENT} collector is not also {@code UNORDERED},
     * then it should only be evaluated concurrently if applied to an
     * unordered data source.
     */
    CONCURRENT,
    /**
     * Indicates that the collection operation does not commit to preserving
     * the encounter order of input elements. (This might be true if the
     * result container has no intrinsic order, such as a {@link Set}.)
     */
    UNORDERED,
    /**
     * Indicates that the finisher function is the identity function and
     * can be elided. If set, it must be the case that an unchecked cast
     * from A to R will succeed.
     */
    IDENTITY_FINISH
}
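Putting the pieces together, a complete version of the collector might look like the sketch below. The `classifier` field and the constructor are assumptions, since the snippets above use `this.classifier` without showing where it comes from. The combiner here merges lists per key, and the wrapper class `GroupingByDemo` and the helper `groupByLength` are hypothetical names added for the usage example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.EnumSet;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.BiConsumer;
import java.util.function.BinaryOperator;
import java.util.function.Function;
import java.util.function.Supplier;
import java.util.stream.Collector;
import java.util.stream.Stream;

public class GroupingByDemo {

    static class GroupingBy<T, K> implements Collector<T, Map<K, List<T>>, Map<K, List<T>>> {

        // Assumed field: maps each element to its grouping key.
        private final Function<? super T, ? extends K> classifier;

        GroupingBy(Function<? super T, ? extends K> classifier) {
            this.classifier = classifier;
        }

        @Override
        public Supplier<Map<K, List<T>>> supplier() {
            return HashMap::new;
        }

        @Override
        public BiConsumer<Map<K, List<T>>, T> accumulator() {
            return (map, ele) -> {
                K key = classifier.apply(ele);
                List<T> list = map.get(key);
                if (list == null) {          // first element seen for this key
                    list = new ArrayList<>();
                    map.put(key, list);
                }
                list.add(ele);
            };
        }

        @Override
        public BinaryOperator<Map<K, List<T>>> combiner() {
            return (l, r) -> {
                // Append lists that share a key instead of replacing them.
                r.forEach((key, list) ->
                        l.merge(key, list, (a, b) -> { a.addAll(b); return a; }));
                return l;
            };
        }

        @Override
        public Function<Map<K, List<T>>, Map<K, List<T>>> finisher() {
            return Function.identity();
        }

        @Override
        public Set<Characteristics> characteristics() {
            return Collections.unmodifiableSet(EnumSet.of(Characteristics.IDENTITY_FINISH));
        }
    }

    // Usage: group words by their length.
    static Map<Integer, List<String>> groupByLength(String... words) {
        return Stream.of(words).collect(new GroupingBy<String, Integer>(String::length));
    }

    public static void main(String[] args) {
        Map<Integer, List<String>> groups = groupByLength("a", "bb", "cc", "ddd");
        System.out.println(groups.get(2)); // [bb, cc]
    }
}
```

For everyday code the built-in `Collectors.groupingBy` covers this case; writing the collector by hand is mainly useful for understanding how the pieces interact.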