
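- 1. Getting familiar with Spring Boot and Elasticsearch integration
- 1.1 Official documentation
- 1.2 Creating and configuring the integration project
- 1.3 Testing indices: create, get, delete
- 1.4 Testing documents: create, read, update, delete
- 2. Elasticsearch in practice: a JD.com-style search page with keyword highlighting
- 2.1 Creating the project
- 2.2 Basic crawler for pulling data (jsoup)
- 2.3 Writing the service layer interface and implementation
- 2.4 Writing the controller layer
- 2.5 Testing the endpoints
- 2.6 Front end/back end separation (a simple use of Vue)
- 2.7 Highlighting the keyword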
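Official Elasticsearch documentation: https://www.elastic.co/guide/index.html
Operating on Elasticsearch through the REST client is recommended; the official Java REST Client documentation is enough to follow along:
https://www.elastic.co/guide/en/elasticsearch/client/java-rest/index.html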
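1. Add the Spring Boot starter for the Elasticsearch client
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-data-elasticsearch</artifactId>
</dependency>
2. Unify the versions
<parent>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-parent</artifactId>
    <version>2.2.5.RELEASE</version>
</parent>
<properties>
    <java.version>1.8</java.version>
    <elasticsearch.version>7.6.1</elasticsearch.version>
</properties>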
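3. Add the other key dependencies used later
<dependency>
    <groupId>org.projectlombok</groupId>
    <artifactId>lombok</artifactId>
    <optional>true</optional>
</dependency>
<dependency>
    <groupId>com.alibaba</groupId>
    <artifactId>fastjson</artifactId>
    <version>1.2.70</version>
</dependency>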
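4. Create the configuration class
import org.apache.http.HttpHost;
import org.elasticsearch.client.RestClient;
import org.elasticsearch.client.RestHighLevelClient;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
@Configuration
public class ElasticSearchRestClientConfig {
    // Register the REST high-level client as a bean in the Spring container.
    // Naming the method after its return type keeps the bean name predictable for later autowiring.
    @Bean
    public RestHighLevelClient restHighLevelClient() {
        return new RestHighLevelClient(
                RestClient.builder(new HttpHost("127.0.0.1", 9200, "http")));
    }
}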
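5. Create the test entity class
@Data               // generates getters and setters (among others)
@NoArgsConstructor  // generates the no-argument constructor
@AllArgsConstructor // generates the all-arguments constructor
public class User implements Serializable {
    private String name;
    private Integer age;
}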
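1.3 Testing indices: create, get, delete
- First start the Elasticsearch service and the es-head plugin.
- The author also starts the project's main class, SpringbootElasticsearchApiApplication, before running any test so that RestHighLevelClient is registered in the Spring container, and stresses this as a lesson learned the hard way. (@SpringBootTest normally builds the application context on its own, so treat this as a precaution.)
- Put the tests in the SpringbootElasticsearchApplicationTests class under the test package.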
6.1 Create an index
@SpringBootTest
class SpringbootElasticsearchApplicationTests {
    @Autowired
    RestHighLevelClient restHighLevelClient;
    @Test
    public void testPUTCreateIndex() throws IOException {
        // Build the create-index request; the index name is set in the constructor
        CreateIndexRequest request = new CreateIndexRequest("yxj_index");
        // Send it with the default request options and read the response
        CreateIndexResponse response = restHighLevelClient.indices().create(request, RequestOptions.DEFAULT);
        System.out.println(response.isAcknowledged()); // true if the index was created
        System.out.println(response);                  // the response object itself
        restHighLevelClient.close();                   // close the client when done
    }
}
Console output:
true
org.elasticsearch.client.indices.CreateIndexResponse@5565235d
6.2 Get an index and check whether it exists
@Test
public void testGETIndexAndIsExists() throws IOException {
    // Build the get-index request
    GetIndexRequest request = new GetIndexRequest("yxj_index");
    // Fetch the index information
    GetIndexResponse response = restHighLevelClient.indices().get(request, RequestOptions.DEFAULT);
    // Check whether the index exists
    boolean exists = restHighLevelClient.indices().exists(request, RequestOptions.DEFAULT);
    System.out.println(response.getIndices()); // a String[] of index names; println shows only the array reference
    System.out.println(exists);                // whether the index exists
    restHighLevelClient.close();               // close the client when done
}
Console output:
[Ljava.lang.String;@36790bec
true
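The first line of the console output above is the default string form of a String[], not the index names themselves. A minimal plain-Java sketch of printing the names instead, assuming only that getIndices() returns a String[] as the output suggests; the class IndexNames and its format method are made up for illustration and are not part of the original project:

```java
import java.util.Arrays;

// Hypothetical helper, not part of the original project.
final class IndexNames {
    // Renders the array contents rather than the type@hash form that println gives for arrays.
    static String format(String[] indices) {
        return Arrays.toString(indices);
    }

    public static void main(String[] args) {
        String[] indices = {"yxj_index"}; // stand-in for response.getIndices()
        System.out.println(format(indices)); // prints [yxj_index]
    }
}
```

In the test above, the same effect would come from wrapping the getIndices() call in Arrays.toString(...).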
6.3 Delete an index
@Test
public void testDeleteIndex() throws IOException {
    // Build the delete-index request
    DeleteIndexRequest request = new DeleteIndexRequest("yxj_index");
    // Send it and read the response
    AcknowledgedResponse response = restHighLevelClient.indices().delete(request, RequestOptions.DEFAULT);
    System.out.println(response.isAcknowledged()); // true if the index was deleted
    restHighLevelClient.close();
}
Console output:
true
1.4 Testing documents: create, read, update, delete
1. Add a document
@Test
void testAddDocument() throws IOException {
    // Create the object to store
    User user = new User("一宿君", 21);
    // Create the request and bind it to the index
    IndexRequest request = new IndexRequest("yxj_index");
    // Equivalent to: PUT /yxj_index/_doc/1
    request.id("1");
    request.timeout("1s"); // timeout given as a string: 1 second
    request.timeout(TimeValue.timeValueMinutes(1)); // or as a TimeValue; this later call replaces the 1s set above
    // Put the data into the request as JSON
    request.source(JSON.toJSONString(user), XContentType.JSON);
    // Send the request and read the response
    IndexResponse response = restHighLevelClient.index(request, RequestOptions.DEFAULT);
    System.out.println(response.status()); // status of the index operation
    System.out.println(response);          // full response details
    restHighLevelClient.close();
}
Console output:
CREATED
IndexResponse[index=yxj_index,type=_doc,id=1,version=1,result=created,seqNo=0,primaryTerm=1,shards={"total":2,"successful":1,"failed":0}]
2. Get a document
@Test
void testGetDocumentAndIsExists() throws IOException {
    // Build the get request with the index name and document id
    GetRequest request = new GetRequest("yxj_index", "1");
    // To check existence only, skip fetching _source; this is cheaper than loading the content
    //request.fetchSourceContext(new FetchSourceContext(false));
    // Do not fetch any stored fields
    //request.storedFields("_none_");
    // Check that the document exists before fetching it
    boolean exists = restHighLevelClient.exists(request, RequestOptions.DEFAULT);
    if (exists) {
        System.out.println("Document exists");
        // Fetch the document (if _source is filtered out above, no content comes back, so keep those two lines commented out)
        GetResponse response = restHighLevelClient.get(request, RequestOptions.DEFAULT);
        System.out.println(response.getSourceAsString()); // the document source as a string
        System.out.println(response);                     // the full response (same as what the Kibana console returns)
    } else {
        System.out.println("Document does not exist!");
    }
    restHighLevelClient.close(); // close the client
}
Console output:
Document exists
{"age":21,"name":"一宿君"}
{"_index":"yxj_index","_type":"_doc","_id":"1","_version":1,"_seq_no":0,"_primary_term":1,"found":true,"_source":{"age":21,"name":"一宿君"}}
3. Update a document
@Test
void testUpdateDocument() throws IOException {
    // Build the update request
    UpdateRequest request = new UpdateRequest("yxj_index", "1");
    // Create the new data
    User user = new User("一宿君Java", 19);
    // Put the data into the request as JSON
    request.doc(JSON.toJSONString(user), XContentType.JSON);
    // Send the request
    UpdateResponse response = restHighLevelClient.update(request, RequestOptions.DEFAULT);
    System.out.println(response.status()); // status of the update
    restHighLevelClient.close();           // close the client
}
Console output:
OK
4. Delete a document
@Test
void testDeleteDocument() throws IOException {
    // Build the delete request
    DeleteRequest request = new DeleteRequest("yxj_index", "1");
    // Send the request
    DeleteResponse response = restHighLevelClient.delete(request, RequestOptions.DEFAULT);
    System.out.println(response.status()); // status of the delete
    restHighLevelClient.close();           // close the client
}
Console output:
OK
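5. Bulk-insert documents
@Test
void testBulkInsertDocument() throws IOException {
    // Create the bulk request
    BulkRequest request = new BulkRequest();
    request.timeout("1s");
    // Build the list of documents
    List<User> userList = new ArrayList<>();
    userList.add(new User("一宿君1", 1));
    userList.add(new User("一宿君2", 2));
    userList.add(new User("一宿君3", 3));
    userList.add(new User("一宿君4", 4));
    userList.add(new User("一宿君5", 5));
    userList.add(new User("一宿君6", 6));
    // Add one index request per user.
    // NOTE: the source text breaks off at "for(int i=0;i". Everything from the loop onward is a
    // reconstruction: the index name bulk_index and the ids starting at 1 are taken from the
    // search output in step 6, and the remaining lines are an assumption.
    for (int i = 0; i < userList.size(); i++) {
        request.add(new IndexRequest("bulk_index")
                .id(String.valueOf(i + 1))
                .source(JSON.toJSONString(userList.get(i)), XContentType.JSON));
    }
    BulkResponse response = restHighLevelClient.bulk(request, RequestOptions.DEFAULT);
    System.out.println(response.hasFailures()); // false means every item succeeded
    restHighLevelClient.close();
}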
6. Search documents with a condition
@Test
void testHasConditionSearch() throws IOException {
    // Create the search request
    SearchRequest request = new SearchRequest();
    // Build the search source (query, highlighting, paging)
    SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
    MatchQueryBuilder matchQueryBuilder = QueryBuilders.matchQuery("name", "一宿君");
    //TermQueryBuilder queryBuilder = QueryBuilders.termQuery("name", "一宿君");
    // Attach the query to the search source
    searchSourceBuilder.query(matchQueryBuilder);
    // Enable highlighting (no field is configured here, so highlightFields is empty in the output below)
    searchSourceBuilder.highlighter(new HighlightBuilder());
    // Paging: from is the offset of the first hit (0 = start), size is the number of hits per page (3)
    searchSourceBuilder.from(0);
    searchSourceBuilder.size(3);
    // Attach the search source to the request
    request.source(searchSourceBuilder);
    // Restrict the search to one index; without this, all indices are searched
    request.indices("bulk_index");
    // Send the request
    SearchResponse response = restHighLevelClient.search(request, RequestOptions.DEFAULT);
    System.out.println(response.status()); // status of the search
    System.out.println(response);          // the full response
    // Get the hits and iterate over them
    SearchHits hits = response.getHits(); // the whole "hits" object, including total and max_score
    System.out.println(JSON.toJSONString(hits)); // serialize it to JSON
    System.out.println("============================================================");
    // The inner hits array holds the actual documents
    for (SearchHit hit : hits.getHits()) {
        System.out.println(hit.getSourceAsString()); // the source as a JSON string
        //System.out.println(hit.getSourceAsMap()); // the source as a Map
    }
    restHighLevelClient.close(); // close the client
}
Console output:
OK
{"took":19,"timed_out":false,"_shards":{"total":1,"successful":1,"skipped":0,"failed":0},"hits":{"total":{"value":6,"relation":"eq"},"max_score":0.22232392,"hits":[{"_index":"bulk_index","_type":"_doc","_id":"1","_score":0.22232392,"_source":{"age":1,"name":"一宿君1"}},{"_index":"bulk_index","_type":"_doc","_id":"2","_score":0.22232392,"_source":{"age":2,"name":"一宿君2"}},{"_index":"bulk_index","_type":"_doc","_id":"3","_score":0.22232392,"_source":{"age":3,"name":"一宿君3"}}]}}
{"fragment":true,"hits":[{"fields":{},"fragment":false,"highlightFields":{},"id":"1","matchedQueries":[],"primaryTerm":0,"rawSortValues":[],"score":0.22232392,"seqNo":-2,"sortValues":[],"sourceAsMap":{"name":"一宿君1","age":1},"sourceAsString":"{"age":1,"name":"一宿君1"}","sourceRef":{"fragment":true},"type":"_doc","version":-1},{"fields":{},"fragment":false,"highlightFields":{},"id":"2","matchedQueries":[],"primaryTerm":0,"rawSortValues":[],"score":0.22232392,"seqNo":-2,"sortValues":[],"sourceAsMap":{"name":"一宿君2","age":2},"sourceAsString":"{"age":2,"name":"一宿君2"}","sourceRef":{"fragment":true},"type":"_doc","version":-1},{"fields":{},"fragment":false,"highlightFields":{},"id":"3","matchedQueries":[],"primaryTerm":0,"rawSortValues":[],"score":0.22232392,"seqNo":-2,"sortValues":[],"sourceAsMap":{"name":"一宿君3","age":3},"sourceAsString":"{"age":3,"name":"一宿君3"}","sourceRef":{"fragment":true},"type":"_doc","version":-1}],"maxScore":0.22232392,"totalHits":{"relation":"EQUAL_TO","value":6}}
============================================================
{"age":1,"name":"一宿君1"}
{"age":2,"name":"一宿君2"}
{"age":3,"name":"一宿君3"}
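The search above returns 3 of 6 hits because from is 0 and size is 3. A page-oriented caller has to turn a page number into that from offset. A minimal plain-Java sketch of the arithmetic, assuming 1-based page numbers; the class PageMath and both method names are hypothetical and do not come from the original project:

```java
// Hypothetical paging helper; assumes page numbers start at 1.
final class PageMath {
    // Offset of the first hit on the given page, suitable for SearchSourceBuilder.from(...).
    static int fromOffset(int pageNum, int pageSize) {
        if (pageNum < 1 || pageSize < 1) {
            throw new IllegalArgumentException("pageNum and pageSize must be >= 1");
        }
        return (pageNum - 1) * pageSize;
    }

    // Number of pages needed to show totalHits results (ceiling division).
    static long totalPages(long totalHits, int pageSize) {
        if (totalHits < 0 || pageSize < 1) {
            throw new IllegalArgumentException("totalHits must be >= 0 and pageSize >= 1");
        }
        return (totalHits + pageSize - 1) / pageSize;
    }

    public static void main(String[] args) {
        System.out.println(fromOffset(1, 3)); // 0, the first page shown above
        System.out.println(fromOffset(2, 3)); // 3, the page holding the remaining three hits
        System.out.println(totalPages(6, 3)); // 2
    }
}
```

With the six documents in bulk_index and a page size of 3, the second page would therefore use from(3) and size(3).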
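2. Elasticsearch in practice: a JD.com-style search page with keyword highlighting
2.1 Creating the project
Static page resources:
Link: https://pan.baidu.com/s/1L8_NtjVLMmOooK2m-L0Tlw
Extraction code: 9gjc
Configure application.properties:
# change the port
server.port=9090
# disable the Thymeleaf cache
spring.thymeleaf.cache=false
Add the dependencies (pay close attention to the version numbers):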
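<parent>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-parent</artifactId>
    <version>2.2.5.RELEASE</version>
</parent>
<groupId>com.wbs</groupId>
<artifactId>springboot-elasticsearch-jd</artifactId>
<version>0.0.1-SNAPSHOT</version>
<name>springboot-elasticsearch-jd</name>
<properties>
    <java.version>1.8</java.version>
    <elasticsearch.version>7.6.1</elasticsearch.version>
</properties>
<dependencies>
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-data-elasticsearch</artifactId>
    </dependency>
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-thymeleaf</artifactId>
    </dependency>
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-web</artifactId>
    </dependency>
    <dependency>
        <groupId>com.alibaba</groupId>
        <artifactId>fastjson</artifactId>
        <version>1.2.70</version>
    </dependency>
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-devtools</artifactId>
        <scope>runtime</scope>
        <optional>true</optional>
    </dependency>
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-configuration-processor</artifactId>
        <optional>true</optional>
    </dependency>
    <dependency>
        <groupId>org.projectlombok</groupId>
        <artifactId>lombok</artifactId>
        <optional>true</optional>
    </dependency>
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-test</artifactId>
        <scope>test</scope>
    </dependency>
</dependencies>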
Write the IndexController:
@Controller
public class IndexController {
    @RequestMapping({"/", "/index"})
    public String toIndex() {
        return "index";
    }
}
Start the project and open localhost:9090 to confirm that it starts correctly and the home page is reachable.
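2.2 Basic crawler for pulling data (jsoup)
Data can come from many sources:
- a database
- a message queue
- a cache
- a crawler
- and so on
1. Add the jsoup dependency
<dependency>
    <groupId>org.jsoup</groupId>
    <artifactId>jsoup</artifactId>
    <version>1.10.2</version>
</dependency>
2. Search for a product keyword on the JD.com home page
Check the URL in the address bar:
https://search.jd.com/Search?keyword=Java&enc=utf-8
3. Inspect the page elements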
4. Write a utility class to crawl the data (fetch the returned page and pick out the useful parts)
public class HtmlParseUtilTest {
    public static void main(String[] args) throws IOException {
        // 1. The request URL
        String url = "https://search.jd.com/Search?keyword=Java&enc=utf-8";
        // 2. Parse the page (jsoup returns a Document, much like the browser's document object, giving access to every HTML element)
        Document document = Jsoup.parse(new URL(url), 30000);
        // 3. Get the product list element by the id found while inspecting the page
        Element element = document.getElementById("J_goodsList");
        // 4. Get every li element under it
        Elements elements = element.getElementsByTag("li");
        // 5. Read the product attributes from each li: img, price, name, ...
        for (Element el : elements) {
            System.out.println("img-src:" + el.getElementsByTag("img").eq(0).attr("src")); // first image under the li
            System.out.println("name:" + el.getElementsByClass("p-name").eq(0).text()); // product name
            System.out.println("price:" + el.getElementsByClass("p-price").eq(0).text()); // product price
            System.out.println("shopname:" + el.getElementsByClass("hd-shopname").eq(0).text()); // shop (publisher) name
            System.out.println("================================================================================================");
        }
    }
}
The src attribute does not hold the real image URL here, because large sites with many images usually render them with lazy loading, which improves response time.
Change the attribute used for the image to: data-lazy-img
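Because some img tags may carry only one of the two attributes, a crawler can try data-lazy-img first and fall back to src. A minimal plain-Java sketch of that choice, assuming the two values have already been read with attr(...), which in jsoup returns an empty string for a missing attribute; the class LazyImage and method pickImageUrl are hypothetical names:

```java
// Hypothetical helper; the attribute values are passed in as plain strings.
final class LazyImage {
    // Prefers the lazy-load attribute and falls back to src when it is blank.
    static String pickImageUrl(String lazyAttr, String srcAttr) {
        String lazy = lazyAttr == null ? "" : lazyAttr.trim();
        if (!lazy.isEmpty()) {
            return lazy;
        }
        return srcAttr == null ? "" : srcAttr.trim();
    }

    public static void main(String[] args) {
        // Made-up example values, not real JD.com URLs.
        System.out.println(pickImageUrl("//img.example/a.jpg", ""));
        System.out.println(pickImageUrl("", "//img.example/b.jpg"));
    }
}
```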
5. Write an entity class to hold the product attributes
@Data
@NoArgsConstructor
@AllArgsConstructor
public class Product implements Serializable {
    private String name;
    private String img;
    private String price;
    private String shopname;
    // ... add more attributes as needed; only a few key ones are listed here
}
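Lombok's @AllArgsConstructor generates a constructor whose parameters follow the field declaration order, here name, img, price, shopname. Since all four are Strings, passing them in a different order compiles without complaint and silently swaps values. A hand-written sketch of the equivalent constructor makes the order visible; ProductPlain and ProductOrderDemo are illustrative names, not classes from the original project:

```java
// Hand-written equivalent of the constructor Lombok generates for Product.
final class ProductPlain {
    final String name;
    final String img;
    final String price;
    final String shopname;

    // Parameter order matches the field declaration order.
    ProductPlain(String name, String img, String price, String shopname) {
        this.name = name;
        this.img = img;
        this.price = price;
        this.shopname = shopname;
    }
}

final class ProductOrderDemo {
    // Takes the values in the order a crawler might read them (img first)
    // and re-orders them to match the constructor.
    static ProductPlain fromScrapedValues(String img, String name, String price, String shopname) {
        return new ProductPlain(name, img, price, shopname);
    }

    public static void main(String[] args) {
        // Made-up example values.
        ProductPlain p = fromScrapedValues("//img.example/a.jpg", "Java book", "99.00", "Some shop");
        System.out.println(p.name + " | " + p.img);
    }
}
```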
6. Rework the parsing utility class so that it returns the data
public class HtmlParseUtil {
    public static void main(String[] args) throws IOException {
        new HtmlParseUtil().parseJD("Java").forEach(System.out::println);
    }
    public List<Product> parseJD(String keyword) throws IOException {
        // 1. The request URL
        String url = "https://search.jd.com/Search?keyword=" + keyword + "&enc=utf-8";
        // 2. Parse the page (jsoup returns a Document, much like the browser's document object, giving access to every HTML element)
        Document document = Jsoup.parse(new URL(url), 30000);
        // 3. Get the product list element by the id found while inspecting the page
        Element element = document.getElementById("J_goodsList");
        // 4. Get every li element under it
        Elements elements = element.getElementsByTag("li");
        // 5. Create the list that collects the results
        List<Product> productArrayList = new ArrayList<>();
        // 6. Read img, price, name and shopname from each li and add them to the list
        for (Element el : elements) {
            String img = el.getElementsByTag("img").eq(0).attr("data-lazy-img"); // first image under the li
            String name = el.getElementsByClass("p-name").eq(0).text(); // product name
            String price = el.getElementsByClass("p-price").eq(0).text(); // product price
            String shopname = el.getElementsByClass("hd-shopname").eq(0).text(); // shop (publisher) name
            // Create the product; the arguments follow the field order of Product (name, img, price, shopname)
            Product product = new Product(name, img, price, shopname);
            // Add it to the list
            productArrayList.add(product);
        }
        // Return the list
        return productArrayList;
    }
}
Run it and check the result.
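parseJD concatenates the keyword into the URL as-is, which works for a plain word such as Java but not for keywords containing spaces, +, & or non-ASCII characters. A minimal sketch of encoding the keyword first using only the JDK (it compiles on Java 8, matching the project's java.version); the class JdSearchUrl and method build are hypothetical names:

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper for building the search URL used by the crawler.
final class JdSearchUrl {
    static String build(String keyword) {
        try {
            // URLEncoder applies form encoding: a space becomes '+', reserved characters become %XX.
            return "https://search.jd.com/Search?keyword="
                    + URLEncoder.encode(keyword, "UTF-8")
                    + "&enc=utf-8";
        } catch (UnsupportedEncodingException e) {
            // UTF-8 support is mandatory on every JVM, so this is not expected to happen.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(build("Java"));       // https://search.jd.com/Search?keyword=Java&enc=utf-8
        System.out.println(build("C++ primer")); // https://search.jd.com/Search?keyword=C%2B%2B+primer&enc=utf-8
    }
}
```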
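2.3 Writing the service layer interface and implementation
// Interface. The source text is truncated after "public List", and the implementation class is missing.
// The searchProduct signature is reconstructed from the controller call in 2.4;
// the element type Map<String, Object> is an assumption.
@Service
public interface ProductService {
    // Crawl the data and store it in Elasticsearch
    Boolean parseProductSafeEs(String keyword) throws IOException;
    // Paged search
    List<Map<String, Object>> searchProduct(String keyword, int pageNum, int pageSize) throws IOException;
}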
2.4 Writing the controller layer
Note: none of these methods should close the RestHighLevelClient. It is shared, so closing it leaves the other methods unable to send requests and they fail with an I/O exception.
@Controller
public class ProductController {
    @Autowired
    RestHighLevelClient restHighLevelClient;
    @Autowired
    ProductService productService;
    @RequestMapping("/createIndex")
    @ResponseBody
    public String createIndex() throws IOException {
        CreateIndexRequest request = new CreateIndexRequest("jd_pro_index");
        CreateIndexResponse response = restHighLevelClient.indices().create(request, RequestOptions.DEFAULT);
        System.out.println(response.isAcknowledged());
        if (response.isAcknowledged()) {
            return "Index created!";
        } else {
            return "Index creation failed!";
        }
    }
    @RequestMapping("/deleteIndex")
    @ResponseBody
    public String deleteIndex() throws IOException {
        DeleteIndexRequest request = new DeleteIndexRequest("jd_pro_index");
        AcknowledgedResponse response = restHighLevelClient.indices().delete(request, RequestOptions.DEFAULT);
        System.out.println(response.isAcknowledged());
        if (response.isAcknowledged()) {
            return "Index deleted!";
        } else {
            return "Index deletion failed!";
        }
    }
    @RequestMapping("/toSafeEs/{keyword}")
    @ResponseBody
    public String parseProductSafeEs(@PathVariable("keyword") String keyword) throws IOException {
        if (productService.parseProductSafeEs(keyword)) {
            return "Crawled data stored in Elasticsearch!";
        }
        return "Crawling failed";
    }
    // The generic parameters were stripped in the source ("List>"); Map<String, Object> is an assumption.
    @RequestMapping("/searchEsDoc/{keyword}/{pageNum}/{pageSize}")
    @ResponseBody
    public List<Map<String, Object>> searchProduct(
            @PathVariable("keyword") String keyword,
            @PathVariable("pageNum") int pageNum,
            @PathVariable("pageSize") int pageSize) throws IOException {
        // Returns whatever the service returns, including null
        return productService.searchProduct(keyword, pageNum, pageSize);
    }
}
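The note at the top of this section about never closing the client inside a controller method follows from the client being one shared bean: once any caller closes it, every later caller gets a dead client. A minimal plain-Java sketch of that effect, using a made-up FakeClient rather than the Elasticsearch API; the IllegalStateException is an assumption standing in for the I/O error the author reports:

```java
// Hypothetical stand-in for a shared, closeable client. This is NOT the Elasticsearch API.
final class FakeClient implements AutoCloseable {
    private boolean closed;

    String request(String path) {
        if (closed) {
            throw new IllegalStateException("client already closed");
        }
        return "OK " + path;
    }

    @Override
    public void close() {
        closed = true;
    }
}

final class SharedClientDemo {
    // Written like the unit tests earlier: it closes the client it was handed.
    static String closingHandler(FakeClient shared) {
        String result = shared.request("/createIndex");
        shared.close();
        return result;
    }

    // Written like the controller methods: it leaves the shared client open.
    static String plainHandler(FakeClient shared) {
        return shared.request("/searchEsDoc");
    }

    public static void main(String[] args) {
        FakeClient shared = new FakeClient(); // one instance for all handlers, like a singleton bean
        System.out.println(closingHandler(shared));
        try {
            System.out.println(plainHandler(shared));
        } catch (IllegalStateException e) {
            System.out.println("second call failed: " + e.getMessage());
        }
    }
}
```

In the real application the container owns the bean's lifecycle, so the handlers can simply use the client and leave closing to application shutdown.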
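2.5 Testing the endpoints
Create the index: http://localhost:9090/createIndex
Crawl the data and store it in Elasticsearch: http://localhost:9090/toSafeEs/{keyword}
Query the data: http://localhost:9090/searchEsDoc/{keyword}/{pageNum}/{pageSize}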
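2.6 Front end/back end separation (a simple use of Vue)
- Download Vue: used to render the front-end page
- Download axios: used for Ajax requests to the back-end endpoints
Both Vue and axios can be downloaded from their official sites. A small trick picked up from Kuangshen (狂神): create a local folder whose path contains only English characters and open cmd in it (Node.js must already be installed):
# if the folder has not been initialized before, run init first
npm init
# install vue
npm install vue
# install axios
npm install axios
Modify the index.html home page:
一宿君 Java-ES: JD.com clone in practice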