In-memory data grid with Apache Ignite

In-memory data grid with Apache Ignite

Apache Ignite is a relatively new solution, but quickly increasing its popularity. It is hard to assign to a single area of database engine division because it has characteristics typical for some of them. The primary purpose of this solution is an in-memory data grid and key-value storage. It also has some common RDBMS features like support for SQL queries and ACID transactions. But that’s not to say it is full SQL and transactional database. It does not support foreign key constraints and transactions are available only at the key-value level. Despite that, Apache Ignite seems to be a very interesting solution.

Apache Ignite may be easily started as a node embedded in the Spring Boot application. The simplest way to achieve that is by using the Spring Data Ignite library. Apache Ignite implements Spring Data CrudRepository interface that supports basic CRUD operations and also provides access to the Apache Ignite SQL Grid using the unified Spring Data interfaces. Although it has support for distributed, ACID, and SQL-compliant disk store persistence we design a solution which store in-memory cache objects in MySQL database. The architecture of the presented solution is visible on the figure below and you can see it is very simple. The application put data to the in-memory cache on Apache Ignite. Apache Ignite automatically synchronizes these changes with the database in an asynchronous, background task. The way of reading data by the application also should not surprise you. If an entity is not cached it is read from the database and put to the cache for future use.

ignite

I’m going to guide you through the process of the sample application development. The result of this development is available on GitHub. I have found a few examples on the web, but there were only the basics. I’ll show you how to configure Apache Ignite to write objects from the cache in the database and create some more complex cross-cache join queries. Let’s begin by running the database.

1. Setup MySQL database

The best way to start a MySQL database locally is of course by Docker container. For Docker on Windows, MySQL database is now available on 192.168.99.100:33306.

$ docker run -d --name mysql -e MYSQL_DATABASE=ignite -e MYSQL_USER=ignite -e MYSQL_PASSWORD=ignite123 -e MYSQL_ALLOW_EMPTY_PASSWORD=yes -p 33306:3306 mysql

The next step is to create tables used by application entities to store the data: PERSON, CONTACT. Those to tables are in 1…N relation where table CONTACT holds the foreign key referenced to PERSON id.

CREATE TABLE `person` (
`id` int(11) NOT NULL,
`first_name` varchar(45) DEFAULT NULL,
`last_name` varchar(45) DEFAULT NULL,
`gender` varchar(10) DEFAULT NULL,
`country` varchar(10) DEFAULT NULL,
`city` varchar(20) DEFAULT NULL,
`address` varchar(45) DEFAULT NULL,
`birth_date` date DEFAULT NULL,
PRIMARY KEY (`id`)
);

CREATE TABLE `contact` (
`id` int(11) NOT NULL,
`location` varchar(45) DEFAULT NULL,
`contact_type` varchar(10) DEFAULT NULL,
`person_id` int(11) NOT NULL,
PRIMARY KEY (`id`)
);

ALTER TABLE `ignite`.`contact` ADD INDEX `person_fk_idx` (`person_id` ASC);
ALTER TABLE `ignite`.`contact`
ADD CONSTRAINT `person_fk` FOREIGN KEY (`person_id`) REFERENCES `ignite`.`person` (`id`) ON DELETE CASCADE ON UPDATE CASCADE;

2. Maven configuration

The easiest way to start working with Apache Ignite’s Spring Data repository is by adding the following Maven dependency to an application’s pom.xml file. All the other Ignite dependencies would be automatically included. We also need MySQL JDBC driver, Spring JDBC dependencies to configure the connection to the database. They are required because we are embedding Apache Ignite to the application and it has to establish a connection with MySQL in order to be able to synchronize cache with database tables.

<dependency>
   <groupId>org.springframework.boot</groupId>
   <artifactId>spring-boot-starter-web</artifactId>
</dependency>
<dependency>
   <groupId>org.springframework.boot</groupId>
   <artifactId>spring-boot-starter-jdbc</artifactId>
</dependency>
<dependency>
   <groupId>mysql</groupId>
   <artifactId>mysql-connector-java</artifactId>
   <scope>runtime</scope>
</dependency>
<dependency>
   <groupId>org.apache.ignite</groupId>
   <artifactId>ignite-spring-data</artifactId>
   <version>${ignite.version}</version>
</dependency>

3. Configure Ignite node

Using IgniteConfiguration class we are able to configure all available Ignite’s node settings. The most important thing here is a cache configuration (1). We should add primary key and entity classes as an indexed types (2). Then we have to enable export cache updates to database (3) and read data not found in a cache from database (4). The interaction between Ignite’s node and MySQL may be configured using CacheJdbcPojoStoreFactory class (5). We should pass there DataSource @Bean (6), dialect (7) and mapping between object fields and table columns (8).

@Bean
public Ignite igniteInstance() {
   IgniteConfiguration cfg = new IgniteConfiguration();
   cfg.setIgniteInstanceName("ignite-1");
   cfg.setPeerClassLoadingEnabled(true);

   CacheConfiguration<Long, Contact> ccfg2 = new CacheConfiguration<>("ContactCache"); // (1)
   ccfg2.setIndexedTypes(Long.class, Contact.class); // (2)
   ccfg2.setWriteBehindEnabled(true);
   ccfg2.setWriteThrough(true); // (3)
   ccfg2.setReadThrough(true); // (4)
   CacheJdbcPojoStoreFactory<Long, Contact> f2 = new CacheJdbcPojoStoreFactory<>(); // (5)
   f2.setDataSource(datasource); // (6)
   f2.setDialect(new MySQLDialect()); // (7)
   JdbcType jdbcContactType = new JdbcType(); // (8)
   jdbcContactType.setCacheName("ContactCache");
   jdbcContactType.setKeyType(Long.class);
   jdbcContactType.setValueType(Contact.class);
   jdbcContactType.setDatabaseTable("contact");
   jdbcContactType.setDatabaseSchema("ignite");
   jdbcContactType.setKeyFields(new JdbcTypeField(Types.INTEGER, "id", Long.class, "id"));
   jdbcContactType.setValueFields(new JdbcTypeField(Types.VARCHAR, "contact_type", ContactType.class, "type"), new JdbcTypeField(Types.VARCHAR, "location", String.class, "location"), new JdbcTypeField(Types.INTEGER, "person_id", Long.class, "personId"));
   f2.setTypes(jdbcContactType);
   ccfg2.setCacheStoreFactory(f2);

   CacheConfiguration<Long, Person> ccfg = new CacheConfiguration<>("PersonCache");
   ccfg.setIndexedTypes(Long.class, Person.class);
   ccfg.setWriteBehindEnabled(true);
   ccfg.setReadThrough(true);
   ccfg.setWriteThrough(true);
   CacheJdbcPojoStoreFactory<Long, Person> f = new CacheJdbcPojoStoreFactory<>();
   f.setDataSource(datasource);
   f.setDialect(new MySQLDialect());
   JdbcType jdbcType = new JdbcType();
   jdbcType.setCacheName("PersonCache");
   jdbcType.setKeyType(Long.class);
   jdbcType.setValueType(Person.class);
   jdbcType.setDatabaseTable("person");
   jdbcType.setDatabaseSchema("ignite");
   jdbcType.setKeyFields(new JdbcTypeField(Types.INTEGER, "id", Long.class, "id"));
   jdbcType.setValueFields(new JdbcTypeField(Types.VARCHAR, "first_name", String.class, "firstName"), new JdbcTypeField(Types.VARCHAR, "last_name", String.class, "lastName"), new JdbcTypeField(Types.VARCHAR, "gender", Gender.class, "gender"), new JdbcTypeField(Types.VARCHAR, "country", String.class, "country"), new JdbcTypeField(Types.VARCHAR, "city", String.class, "city"), new JdbcTypeField(Types.VARCHAR, "address", String.class, "address"), new JdbcTypeField(Types.DATE, "birth_date", Date.class, "birthDate"));
   f.setTypes(jdbcType);
   ccfg.setCacheStoreFactory(f);

   cfg.setCacheConfiguration(ccfg, ccfg2);
   return Ignition.start(cfg);
}

Here’s Spring data source configuration for MySQL running as a Docker container.

spring:
  datasource:
    name: mysqlds
    url: jdbc:mysql://192.168.99.100:33306/ignite?useSSL=false
    username: ignite
    password: ignite123

On that occasion, it should be mentioned that Apache Ignite has still some deficiencies. For example, it maps Enum to integer taking its ordinal value although it has configured VARCHAR as JDCB type. When reading such a row from database it is not mapped properly to Enum in object – you would have null in this response field.

new JdbcTypeField(Types.VARCHAR, "contact_type", ContactType.class, "type")

4. Model objects

Like I mentioned before we have two tables in the database schema. There are also two model classes and two cache configurations one per each model class. Here’s model class implementation. One of the few interesting things here is ID generation with AtomicLong class. It is one of the basic Ignite’s components acting as a sequence generator. We can also see a specific annotation @QuerySqlField, which marks the field as available for usage as a query parameter in SQL.

@QueryGroupIndex.List(
@QueryGroupIndex(name="idx1")
)
public class Person implements Serializable {

   private static final long serialVersionUID = -1271194616130404625L;
   private static final AtomicLong ID_GEN = new AtomicLong();

   @QuerySqlField(index = true)
   private Long id;
   @QuerySqlField(index = true)
   @QuerySqlField.Group(name = "idx1", order = 0)
   private String firstName;
   @QuerySqlField(index = true)
   @QuerySqlField.Group(name = "idx1", order = 1)
   private String lastName;
   private Gender gender;
   private Date birthDate;
   private String country;
   private String city;
   private String address;
   private List<Contact> contacts = new ArrayList<>();

   public void init() {
      this.id = ID_GEN.incrementAndGet();
   }

   public Long getId() {
      return id;
   }

      public void setId(Long id) {
   this.id = id;
   }

   public String getFirstName() {
      return firstName;
   }

   public void setFirstName(String firstName) {
      this.firstName = firstName;
   }

   public String getLastName() {
      return lastName;
   }

   public void setLastName(String lastName) {
      this.lastName = lastName;
   }

   public Gender getGender() {
      return gender;
   }

   public void setGender(Gender gender) {
      this.gender = gender;
   }

   public Date getBirthDate() {
      return birthDate;
   }

   public void setBirthDate(Date birthDate) {
      this.birthDate = birthDate;
   }

   public String getCountry() {
      return country;
   }

   public void setCountry(String country) {
      this.country = country;
   }

   public String getCity() {
      return city;
   }

   public void setCity(String city) {
      this.city = city;
   }

   public String getAddress() {
      return address;
   }

   public void setAddress(String address) {
      this.address = address;
   }

   public List<Contact> getContacts() {
      return contacts;
   }

   public void setContacts(List<Contact> contacts) {
      this.contacts = contacts;
   }

}

5. Ignite repositories

I assume that you are familiar with Spring Data JPA concept of creating repositories. A repository handling should be enabled on the main or @Configuration class.


@SpringBootApplication
@EnableIgniteRepositories
public class IgniteRestApplication {

   @Autowired
   DataSource datasource;

   public static void main(String[] args) {
      SpringApplication.run(IgniteRestApplication.class, args);
   }

   // ...
}

Then we have to extend our @Repository interface with base CrudRepository interface. It supports only inherited methods with the id parameter. In the PersonRepository fragment visible below I defined some find methods using the Spring Data naming convention and Ignite’s queries. In those samples, you can see that we can return a full object or selected fields as a query result – according to the needs.

@RepositoryConfig(cacheName = "PersonCache")
public interface PersonRepository extends IgniteRepository<Person, Long> {

   List<Person> findByFirstNameAndLastName(String firstName, String lastName);

   @Query("SELECT c.* FROM Person p JOIN \"ContactCache\".Contact c ON p.id=c.personId WHERE p.firstName=? and p.lastName=?")
   List<Contact> selectContacts(String firstName, String lastName);

   @Query("SELECT p.id, p.firstName, p.lastName, c.id, c.type, c.location FROM Person p JOIN \"ContactCache\".Contact c ON p.id=c.personId WHERE p.firstName=? and p.lastName=?")
   List<List<?>> selectContacts2(String firstName, String lastName);
}

6. API and testing

Finally, we can inject the repository beans into the REST controller classes. API would expose methods for adding a new object to the cache, updating or removing existing objects and some for searching using the primary key or the other more complex indices.

@RestController
@RequestMapping("/person")
public class PersonController {

   private static final Logger LOGGER = LoggerFactory.getLogger(PersonController.class);

   @Autowired
   PersonRepository repository;

   @PostMapping
   public Person add(@RequestBody Person person) {
      person.init();
      return repository.save(person.getId(), person);
   }

   @PutMapping
   public Person update(@RequestBody Person person) {
      return repository.save(person.getId(), person);
   }

   @DeleteMapping("/{id}")
   public void delete(Long id) {
      repository.delete(id);
   }

   @GetMapping("/{id}")
   public Person findById(@PathVariable("id") Long id) {
      return repository.findOne(id);
   }

   @GetMapping("/{firstName}/{lastName}")
   public List<Person> findByName(@PathVariable("firstName") String firstName, @PathVariable("lastName") String lastName) {
      return repository.findByFirstNameAndLastName(firstName, lastName);
   }

   @GetMapping("/contacts/{firstName}/{lastName}")
   public List<Person> findByNameWithContacts(@PathVariable("firstName") String firstName, @PathVariable("lastName") String lastName) {
      List<Person> persons = repository.findByFirstNameAndLastName(firstName, lastName);
      List<Contact> contacts = repository.selectContacts(firstName, lastName);
      persons.stream().forEach(it -> it.setContacts(contacts.stream().filter(c -> c.getPersonId().equals(it.getId())).collect(Collectors.toList())));
      LOGGER.info("PersonController.findByIdWithContacts: {}", contacts);
      return persons;
   }

   @GetMapping("/contacts2/{firstName}/{lastName}")
   public List<Person> findByNameWithContacts2(@PathVariable("firstName") String firstName, @PathVariable("lastName") String lastName) {
      List<List<?>> result = repository.selectContacts2(firstName, lastName);
      List<Person> persons = new ArrayList<>();
      for (List<?> l : result) {
         persons.add(mapPerson(l));
      }
      LOGGER.info("PersonController.findByIdWithContacts: {}", result);
      return persons;
   }

   private Person mapPerson(List<?> l) {
      Person p = new Person();
      Contact c = new Contact();
      p.setId((Long) l.get(0));
      p.setFirstName((String) l.get(1));
      p.setLastName((String) l.get(2));
      c.setId((Long) l.get(3));
      c.setType((ContactType) l.get(4));
      c.setLocation((String) l.get(4));
      p.addContact(c);
      return p;
   }

}

It is certainly important to test the performance of the implemented solution, especially when it is related to an in-memory data grid and databases. For that purpose, I created some JUnit tests which put a large number of objects into the cache and then invoke some find methods using random input data to test queries performance. Here’s method which generates many Person and Contact objects and puts them into cache using API endpoints.

@Test
public void testAddPerson() throws InterruptedException {
   ExecutorService es = Executors.newCachedThreadPool();
   for (int j = 0; j < 10; j++) { es.execute(() -> {
      TestRestTemplate restTemplateLocal = new TestRestTemplate();
      Random r = new Random();
      for (int i = 0; i < 1000000; i++) {
         Person p = restTemplateLocal.postForObject("http://localhost:8090/person", createTestPerson(), Person.class);
         int x = r.nextInt(6);
         for (int k = 0; k < x; k++) {
            restTemplateLocal.postForObject("http://localhost:8090/contact", createTestContact(p.getId()), Contact.class);
         }
      }
      });
   }
   es.shutdown();
   es.awaitTermination(60, TimeUnit.MINUTES);
}

Spring Boot provides methods for capturing basic metrics of API response times. To enable that feature we have to include Spring Actuator to the dependencies. Metrics endpoint is available under http://localhost:8090/metrics address. In addition to each API method processing time, it also prints such statistics like a number of running threads or free memory.

7. Running application

Let’s run our sample application with embedded Apache Ignite’s node. Following some performance suggestions available in the Ignite’s docs I defined the JVM configuration visible below.

$ java -jar -Xms512m -Xmx1024m -XX:MaxDirectMemorySize=256m -XX:+DisableExplicitGC -XX:+UseG1GC target/ignite-rest-service-1.0-SNAPSHOT.jar

Now, we can run JUnit test class IgniteRestControllerTest. It puts some data into the cache and then calls find methods. The metrics for the tests with 1M Person objects and 2.5M Contact objects in the cache are visible below. All find methods have taken about 1ms on average.

{
"mem": 624886,
"mem.free": 389701,
"processors": 4,
"instance.uptime": 2446038,
"uptime": 2466661,
"systemload.average": -1,
"heap.committed": 524288,
"heap.init": 524288,
"heap.used": 133756,
"heap": 1048576,
"threads.peak": 107,
"threads.daemon": 25,
"threads.totalStarted": 565,
"threads": 80,
...
"gauge.response.person.contacts.firstName.lastName": 1,
"gauge.response.contact": 1,
"gauge.response.person.firstName.lastName": 1,
"gauge.response.contact.location.location": 1,
"gauge.response.person.id": 1,
"gauge.response.person": 0,
"counter.status.200.person.id": 1000,
"counter.status.200.person.contacts.firstName.lastName": 1000,
"counter.status.200.person.firstName.lastName": 1000,
"counter.status.200.contact": 2500806,
"counter.status.200.person": 1000000,
"counter.status.200.contact.location.location": 1000
}

0 COMMENTS

comments user
Chengri Cui

So Awesome Article. I loved it. Is it possible to share code repository, so that I can easily test it and how it works.
And is it possible to use spring data integration which uses ignite repository so that I can use for spring boot project?
Thanks you.

Leave a Reply