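I wrote some code that looks up movie titles on IMDB, but a search for "Harry Potter" returns more than one movie. I would like to use multithreading, but I don't know much about it.
I'm using the strategy design pattern to search across several websites. For example, one of the methods contains the following code: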
for (Element element : elements) {
String searchedUrl = element.select("a").attr("href");
String movieName = element.select("h2").text();
if (movieName.matches(patternMatcher)) {
Result result = new Result();
result.setName(movieName);
result.setLink(searchedUrl);
result.setTitleProp(super.imdbConnection(movieName));
System.out.println(movieName + " " + searchedUrl);
resultList.add(result);
}
}
For each element (that is, each movie title), the line super.imdbConnection(movieName) opens a new connection to IMDB to look up the rating and other details.
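The problem is that I want all of these connections to run at the same time, because once 5-6 movies are found the process takes much longer than expected.
I'm not asking for code, I'd like some ideas. I thought about creating an inner class that implements Runnable and using that, but I couldn't make sense of it.
How can I rewrite this loop to use multithreading?
I'm using Jsoup for parsing; Element and Elements both come from that library.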
Posted on 2020-07-22 04:13:23
The simplest approach is parallelStream():
List<Result> resultList = elements.parallelStream()
    .map(element -> {
        String searchedUrl = element.select("a").attr("href");
        String movieName = element.select("h2").text();
        if (movieName.matches(patternMatcher)) {
            Result result = new Result();
            result.setName(movieName);
            result.setLink(searchedUrl);
            //each IMDB lookup now runs on a worker thread of the parallel stream
            result.setTitleProp(super.imdbConnection(movieName));
            System.out.println(movieName + " " + searchedUrl);
            return result;
        } else {
            //no match: dropped by the filter below
            return null;
        }
    }).filter(Objects::nonNull)
    .collect(Collectors.toList());
If you don't like parallelStream() and would rather manage the threads yourself, you can do this:
//placeholder: in your code this is the list of elements selected with Jsoup
List<Element> elements = new ArrayList<>();
//create a function which returns an implementation of `Callable`
//input: Element
//output: Callable<Result>
//the Callable is written as a lambda so that `super` still refers to the
//enclosing class's superclass (inside an anonymous class it would not)
Function<Element, Callable<Result>> scrapFunction = element -> () -> {
    String searchedUrl = element.select("a").attr("href");
    String movieName = element.select("h2").text();
    if (movieName.matches(patternMatcher)) {
        Result result = new Result();
        result.setName(movieName);
        result.setLink(searchedUrl);
        result.setTitleProp(super.imdbConnection(movieName));
        System.out.println(movieName + " " + searchedUrl);
        return result;
    } else {
        return null;
    }
};
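//create a fixed pool of threads, one per element
//(newFixedThreadPool throws IllegalArgumentException for a size of 0)
ExecutorService executor = Executors.newFixedThreadPool(Math.max(1, elements.size()));
//submit a Callable<Result> for every Element
//by using scrapFunction.apply(...)
List<Future<Result>> futures = elements.stream()
    .map(e -> executor.submit(scrapFunction.apply(e)))
    .collect(Collectors.toList());
//collect all results from Callable<Result>
//get() blocks until the corresponding task has finished
List<Result> resultList = futures.stream()
    .map(f -> {
        try {
            return f.get();
        } catch (Exception ignored) {
            //a failed lookup is skipped rather than aborting the whole search
            return null;
        }
    }).filter(Objects::nonNull)
    .collect(Collectors.toList());
//let the pool's threads exit once all results are in
executor.shutdown();
https://stackoverflow.com/questions/63022139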
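If you want the thread-pool approach without hand-written Callable and Future plumbing, CompletableFuture is another option. This is not part of the answer above, just a minimal sketch under some assumptions: the class name ConcurrentLookupSketch, the helper lookupAll, the maxThreads cap, and fakeImdbLookup (a stand-in for super.imdbConnection(movieName) that only sleeps) are all made up for illustration, and it works on plain title strings so that it runs without Jsoup. Capping the pool may be worth considering so that a search with many hits does not open dozens of connections at once.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Objects;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Function;
import java.util.stream.Collectors;

public class ConcurrentLookupSketch {

    // Runs `lookup` for every title on a bounded thread pool and returns the
    // non-null results in the same order as the input titles.
    // maxThreads must be at least 1 (hypothetical parameter, not from the answer).
    public static <R> List<R> lookupAll(List<String> titles,
                                        Function<String, R> lookup,
                                        int maxThreads) {
        if (titles.isEmpty()) {
            return Collections.emptyList(); // nothing to do, no pool needed
        }
        // One thread per title, but never more than maxThreads.
        ExecutorService pool =
                Executors.newFixedThreadPool(Math.min(titles.size(), maxThreads));
        try {
            // Start every lookup first so they all run concurrently...
            List<CompletableFuture<R>> futures = titles.stream()
                    .map(t -> CompletableFuture.supplyAsync(() -> lookup.apply(t), pool)
                            .exceptionally(ex -> null)) // a failed lookup yields null
                    .collect(Collectors.toList());
            // ...then wait for each one and drop the failures.
            return futures.stream()
                    .map(CompletableFuture::join)
                    .filter(Objects::nonNull)
                    .collect(Collectors.toList());
        } finally {
            pool.shutdown(); // let the worker threads exit
        }
    }

    // Hypothetical stand-in for super.imdbConnection(movieName):
    // sleeps briefly to mimic network latency.
    public static String fakeImdbLookup(String title) {
        try {
            Thread.sleep(200);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return title + " (details would go here)";
    }

    public static void main(String[] args) {
        List<String> titles = Arrays.asList("Movie A", "Movie B", "Movie C", "Movie D");
        long start = System.nanoTime();
        List<String> results = lookupAll(titles, ConcurrentLookupSketch::fakeImdbLookup, 4);
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        results.forEach(System.out::println);
        System.out.println("Elapsed: " + elapsedMs + " ms");
    }
}
```

In your loop, the lookup function would be the code that builds a Result for one Element, and the filter on null takes the place of the if around patternMatcher.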