Create Apps with Claude Code on Ollama
This article explains how to run Claude Code on Ollama and use local or cloud models served by Ollama to create Java apps. Read this article if you are experimenting with AI code generation and using paid APIs for this purpose. Relatively recently, Ollama has made a built-in integration with developer tools such as Codex and Claude Code available. This is a really useful feature. Using the example of integration with Claude Coda and several different models running both locally and in the cloud, you will see how it works.
You can find other articles about AI and Java on my blog. For example, if you are interested in how to use Ollama to serve models for Spring AI applications, you can read the following article.
Source Code
Feel free to use my source code if you’d like to try it out yourself. To do that, you must clone my sample GitHub repository. Then you should only follow my instructions. This repository contains several branches, each with an application generated from the same prompt using different models. Currently, the branch with the fewest comments in the code review has been merged into master. This is the version of the code generated using the glm-5 model. However, this may change in the future, and the master branch may be modified. Therefore, it is best to simply refer to the individual branches or pull requests shown below.

Below is the current list of branches. The dev branch contains the initial version of the repository with the CLAUDE.md file, which specifies the basic requirements for the generated code.
$ git branch
dev
glm-5
gpt-oss
* master
minimax
qwen3-coderShellSessionHere are instructions for AI from the CLAUDE.md file. They include a description of the technologies I plan to use in my application and a few practices I intend to apply. For example, I don’t want to use Lombok, a popular Java library that automates the generation of code parts such as getters, setters, and constructors. It seems that in the age of AI, this approach doesn’t make sense, but for some reason, AI models really like this library 🙂 Also, each time I make a code change, I want the LLM model to increment the version number and update the README.md file, etc.
# Project Instructions
- Always use the latest versions of dependencies.
- Always write Java code as the Spring Boot application.
- Always use Maven for dependency management.
- Always create test cases for the generated code both positive and negative.
- Always generate the CircleCI pipeline in the .circleci directory to verify the code.
- Minimize the amount of code generated.
- The Maven artifact name must be the same as the parent directory name.
- Use semantic versioning for the Maven project. Each time you generate a new version, bump the PATCH section of the version number.
- Use `pl.piomin.services` as the group ID for the Maven project and base Java package.
- Do not use the Lombok library.
- Generate the Docker Compose file to run all components used by the application.
- Update README.md each time you generate a new version.MarkdownRun Claude on Ollama
First, install Ollama on your computer. You can download the installer for your OS here. If you have used Ollama before, please update to the latest version.
$ ollama --version
ollama version is 0.16.1ShellSessionNext, install Claude Code.
curl -fsSL https://claude.ai/install.sh | bashShellSessionBefore you start, it is worth increasing the maximum context window value allowed by Ollama. By default, it is set to 4k, and on the Ollama website itself, you will find a recommendation of 64k for Claude Code. I set the maximum value to 256k for testing different models.

For example, the gpt-oss model supports a 128k context window size.

Let’s pull and run the gpt-oss model with Ollama:
ollama run gpt-ossShellSessionAfter downloading and launching, you can verify the model parameters with the ollama ps command. If you have 100% GPU and a context window size of ~131k, that’s exactly what I meant.

Ensure you are in the root repository directory, then run Claude Code with the command ollama launch claude. Next, choose the gpt-oss model visible in the list under “More”.

That’s it! Finally, we can start playing with AI.

Generate a Java App with Claude Code
My application will be very simple. I just need something to quickly test the solution. Of course, all guidelines defined in the CLAUDE.md file should be followed. So, here is my prompt. Nothing more, nothing less 🙂
Generate an application that exposes REST API and connects to a PostgreSQL database.
The application should have a Person entity with id, and typical fields related to each person.
All REST endpoints should be protected with JWT and OAuth2.
The codebase should use Skaffold to deploy on Kubernetes.PlaintextAfter a few minutes, I have the entire code generated. Below is a summary from the AI of what has been done. If you want to check it out for yourself, take a look at this branch in my repository.

For the sake of formality, let’s take a look at the generated code. There is nothing spectacular here, because it is just a regular Spring Boot application that exposes a few REST endpoints for CRUD operations. However, it doen’t look bad. Here’s the Spring Boot @Service implementation responsible for using PersonRepository to interact with database.
@Service
public class PersonService {
private final PersonRepository repository;
public PersonService(PersonRepository repository) {
this.repository = repository;
}
public List<Person> findAll() {
return repository.findAll();
}
public Optional<Person> findById(Long id) {
return repository.findById(id);
}
@Transactional
public Person create(Person person) {
return repository.save(person);
}
@Transactional
public Optional<Person> update(Long id, Person person) {
return repository.findById(id).map(existing -> {
existing.setFirstName(person.getFirstName());
existing.setLastName(person.getLastName());
existing.setEmail(person.getEmail());
existing.setAge(person.getAge());
return repository.save(existing);
});
}
@Transactional
public void delete(Long id) {
repository.deleteById(id);
}
}JavaHere’s the generated @RestController witn REST endpoints implementation:
@RestController
@RequestMapping("/api/people")
public class PersonController {
private final PersonService service;
public PersonController(PersonService service) {
this.service = service;
}
@GetMapping
public List<Person> getAll() {
return service.findAll();
}
@GetMapping("/{id}")
public ResponseEntity<Person> getById(@PathVariable Long id) {
Optional<Person> person = service.findById(id);
return person.map(ResponseEntity::ok).orElseGet(() -> ResponseEntity.notFound().build());
}
@PostMapping
public ResponseEntity<Person> create(@RequestBody Person person) {
Person saved = service.create(person);
return ResponseEntity.status(201).body(saved);
}
@PutMapping("/{id}")
public ResponseEntity<Person> update(@PathVariable Long id, @RequestBody Person person) {
Optional<Person> updated = service.update(id, person);
return updated.map(ResponseEntity::ok).orElseGet(() -> ResponseEntity.notFound().build());
}
@DeleteMapping("/{id}")
public ResponseEntity<Void> delete(@PathVariable Long id) {
service.delete(id);
return ResponseEntity.noContent().build();
}
}JavaBelow is a summary in a pull request with the generated code.

Using Ollama Cloud Models
Recently, Ollama has made it possible to run models not only locally, but also in the cloud. By default, all models tagged with cloud are run this way. Cloud models are automatically offloaded to Ollama’s cloud service while offering the same capabilities as local models. This is the most useful for larger models that wouldn’t fit on a personal computer. You can for example try to experiment with the qwen3-coder model locally. Unfortunately, it didn’t look very good on my laptop.

Then, I can run a same or event a larger model in cloud and automatically connect Claude Code with that model using the following command:
ollama launch claude --model qwen3-coder:480b-cloudJavaNow you can repeat exactly the same exercise as before or take a look at my branch containing the code generated using this model.

You can also try some other cloud models like minimax-m2.5 or glm-5.

Conclusion
If you’re developing locally and don’t want to burn money on APIs, use Claude Code with Ollama, and e.g., the gpt-oss or glm-5 models. It’s a pretty powerful and free option. If you have a powerful personal computer, the locally launched model should be able to generate the code efficiently. Otherwise, you can use the option of launching the model in the cloud offered by Ollama free of charge up to a certain usage limit (it is difficult to say exactly what that limit is). The gpt-oss model worked really well on my laptop (MacBook Pro M3), and it took about 7-8 minutes to generate the application. You can also look for a model that suits you better.

5 COMMENTS