RSS River Plugin
RSS River Plugin offers a simple way to index RSS feeds into Elasticsearch.
It reads your feeds at a regular interval and indexes their content.
As with all rivers, creating an RSS river is quite simple:
- Install the plugin and start Elasticsearch
- Create your index (with mapping if needed)
- Define the river
- Search for RSS content
$ bin/plugin -install fr.pilato.elasticsearch.river/rssriver/0.2.0
$ bin/elasticsearch
$ curl -XPUT 'http://localhost:9200/lefigaro/' -d '{}'
$ curl -XPUT 'http://localhost:9200/lefigaro/page/_mapping' -d '{
  "page" : {
    "properties" : {
      "title" : {"type" : "string", "analyzer" : "french"},
      "description" : {"type" : "string", "analyzer" : "french"},
      "author" : {"type" : "string"},
      "link" : {"type" : "string"}
    }
  }
}'
$ curl -XPUT 'localhost:9200/_river/lefigaro/_meta' -d '{
  "type": "rss",
  "rss": {
    "feeds" : [ {
        "name": "lefigaro",
        "url": "http://rss.lefigaro.fr/lefigaro/laune"
      }
    ]
  }
}'
$ curl -XGET 'http://localhost:9200/lefigaro/_search?q=taxe'
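The search above queries all fields. If you want to restrict it to one of the mapped fields, a request body can be used instead. The helper below is only a sketch: `rss_search_body` is a hypothetical name, not something shipped with the plugin, and it assumes the field name and search term contain no double quotes or backslashes.

```shell
#!/bin/sh
# Hypothetical helper (not part of the plugin): prints a query_string
# search body restricted to a single field.
# Assumes field and term contain no double quotes or backslashes.
rss_search_body() {
  field=$1
  term=$2
  printf '{"query":{"query_string":{"default_field":"%s","query":"%s"}}}\n' "$field" "$term"
}

# Possible usage against the index created above (needs a running node):
#   rss_search_body title taxe |
#     curl -XGET 'http://localhost:9200/lefigaro/_search' -d @-
```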
You can define multiple RSS feeds in the same river, so that they are all indexed into the same index:
$ curl -XPUT 'http://localhost:9200/newspapers/' -d '{}'
$ curl -XPUT 'localhost:9200/_river/newspapers/_meta' -d '{
  "type": "rss",
  "rss": {
    "feeds" : [ {
        "name": "lefigaro",
        "url": "http://rss.lefigaro.fr/lefigaro/laune"
      }, {
        "name": "lemonde",
        "url": "http://www.lemonde.fr/rss/une.xml"
      }
    ]
  }
}'
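When the list of feeds grows, writing the definition by hand gets tedious. As a rough sketch, the river definition shown above could be generated from `name=url` pairs. `rss_river_meta` is a hypothetical helper name, and it assumes names and URLs contain no double quotes or backslashes.

```shell
#!/bin/sh
# Hypothetical helper (not part of the plugin): prints a river definition
# for any number of feeds passed as name=url arguments.
# Assumes names and URLs contain no double quotes or backslashes.
rss_river_meta() {
  sep=""
  printf '{"type":"rss","rss":{"feeds":['
  for feed in "$@"; do
    name=${feed%%=*}   # text before the first "="
    url=${feed#*=}     # text after the first "="
    printf '%s{"name":"%s","url":"%s"}' "$sep" "$name" "$url"
    sep=","
  done
  printf ']}}\n'
}

# Possible usage (needs a running node):
#   rss_river_meta \
#     lefigaro=http://rss.lefigaro.fr/lefigaro/laune \
#     lemonde=http://www.lemonde.fr/rss/une.xml |
#     curl -XPUT 'localhost:9200/_river/newspapers/_meta' -d @-
```

Only the first `=` separates the name from the URL, so URLs with query strings are passed through unchanged.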
By default, update_rate (in milliseconds, defaulting to 15 minutes) is replaced by the feed's RSS ttl value, if the feed declares one.
If you need to force updates at your own update_rate, set the ignore_ttl field to true.
$ curl -XPUT 'http://localhost:9200/newspapers/' -d '{}'
$ curl -XPUT 'localhost:9200/_river/newspapers/_meta' -d '{
  "type": "rss",
  "rss": {
    "feeds" : [ {
        "name": "lefigaro",
        "url": "http://rss.lefigaro.fr/lefigaro/laune",
        "update_rate": 900000,
        "ignore_ttl": true
      }
    ]
  }
}'
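The rule described above can be sketched as a small function. This only illustrates the documented behaviour and is not the plugin's actual code: `effective_rate_ms` is a hypothetical name, and the conversion assumes that the ttl is given in minutes (as in the RSS 2.0 specification) while update_rate is in milliseconds.

```shell
#!/bin/sh
# Hypothetical illustration of the update_rate / ttl / ignore_ttl rule.
# Arguments: update_rate in ms, feed ttl in minutes (0 = none), ignore_ttl.
# Assumption: ttl is in minutes and is converted to ms before being used.
effective_rate_ms() {
  update_rate_ms=${1:-900000}   # default: 15 minutes
  ttl_minutes=${2:-0}
  ignore_ttl=${3:-false}
  if [ "$ignore_ttl" = "true" ] || [ "$ttl_minutes" -le 0 ]; then
    # ttl ignored or absent: keep the configured update_rate
    echo "$update_rate_ms"
  else
    # ttl present and honoured: it replaces update_rate
    echo $((ttl_minutes * 60000))
  fi
}
```

Under these assumptions, a feed declaring a ttl of 5 would be polled every 300000 ms unless ignore_ttl is true, in which case the configured 900000 ms applies.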