PostgreSQL + Elasticsearch synchronization in JAVA spring (JPA)
I agree with Vijay regarding the general approach.
Essentially you can use Logstash to log your PostgreSQL data by using the pgjdbc driver (simple .jar) in Logstash to connect to PostgreSQL directly and use logstash.conf
to write an SQL query to fetch the data from PostgreSQL.
I recently wrote a comprehensive guide about Spring Boot + PostgreSQL and the ELK stack and how to set things up (incl. a GitHub repo link to a sample project) in Towards Data Science, which you can find here: https://towardsdatascience.com/a-hitchhikers-guide-to-spring-boot-elasticsearch-logstash-kibana-postgresql-and-docker-5602feaa9fd3
I hope it helps everyone using this stack (even if you do not use Spring Boot but just Spring)
And here is a direct link to GitHub if you do not want to read the article: https://github.com/tech4242/spring-elastic-genie
I assume that you are using PosgreSQL as a primary storage to store your data and Elasticsearch as a secondary storage to index the data from PosgreSQL for searching.
I would suggest Logstash is the best option. You will have create a configuration file that connects PosgreSQL with elasticsearch. Use JDBC input plugin and elasticsearch output plugin and run Logstash with that configuration file. Every document will be synced from there onward.
You also have some other options:
Batch Sync: Write an application/script (perl/python) to pull the data from PosgreSQL to index into Elasticsearch. You can schedule this job to be executed at a specific time you wish.
Real-time Sync: Right after you insert/update a record into PostgreSQL successfully, you send a request in the same API to index/update this record to Elasticsearch.