Create Apps with Claude Code on Ollama

This article explains how to run Claude Code on Ollama and use local or cloud models served by Ollama to create Java apps. Read on if you are experimenting with AI code generation and currently pay for API access to do it. Quite recently, Ollama added a built-in integration with developer tools such as Codex and Claude Code, which is a really useful feature. Using the example of its integration with Claude Code and several different models, running both locally and in the cloud, you will see how it works.

You can find other articles about AI and Java on my blog. For example, if you are interested in how to use Ollama to serve models for Spring AI applications, you can read the following article.

Source Code

Feel free to use my source code if you’d like to try it out yourself. To do that, clone my sample GitHub repository and then just follow my instructions. This repository contains several branches, each with an application generated from the same prompt using a different model. Currently, the branch that received the fewest comments in code review has been merged into master. This is the version of the code generated using the glm-5 model. However, this may change in the future, and the master branch may be modified. Therefore, it is best to simply refer to the individual branches or pull requests shown below.

Below is the current list of branches. The dev branch contains the initial version of the repository with the CLAUDE.md file, which specifies the basic requirements for the generated code.

$ git branch
    dev
    glm-5
    gpt-oss
  * master
    minimax
    qwen3-coder

Here are the instructions for the AI from the CLAUDE.md file. They include a description of the technologies I plan to use in my application and a few practices I intend to apply. For example, I don’t want to use Lombok, a popular Java library that automates the generation of boilerplate such as getters, setters, and constructors. It seems that in the age of AI this approach no longer makes sense, but for some reason AI models really like this library 🙂 Also, each time I make a code change, I want the model to increment the version number, update the README.md file, etc.

# Project Instructions

- Always use the latest versions of dependencies.
- Always write Java code as the Spring Boot application.
- Always use Maven for dependency management.
- Always create test cases for the generated code, both positive and negative.
- Always generate the CircleCI pipeline in the .circleci directory to verify the code.
- Minimize the amount of code generated.
- The Maven artifact name must be the same as the parent directory name.
- Use semantic versioning for the Maven project. Each time you generate a new version, bump the PATCH section of the version number.
- Use `pl.piomin.services` as the group ID for the Maven project and base Java package.
- Do not use the Lombok library.
- Generate the Docker Compose file to run all components used by the application.
- Update README.md each time you generate a new version.
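The semantic-versioning rule from the list above can be sketched as a tiny helper. This is a hypothetical illustration, not part of the generated project: given a MAJOR.MINOR.PATCH version, only the last component is incremented on each change.

```java
// Hypothetical illustration of the PATCH-bump rule from CLAUDE.md:
// 1.0.0 -> 1.0.1, 2.3.9 -> 2.3.10; MAJOR and MINOR stay untouched.
public class VersionBump {

    static String bumpPatch(String version) {
        String[] parts = version.split("\\.");
        parts[2] = Integer.toString(Integer.parseInt(parts[2]) + 1);
        return String.join(".", parts);
    }

    public static void main(String[] args) {
        System.out.println(bumpPatch("1.0.0")); // prints 1.0.1
    }
}
```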

Run Claude on Ollama

First, install Ollama on your computer. You can download the installer for your OS from the Ollama website. If you have used Ollama before, please update to the latest version.

$ ollama --version
  ollama version is 0.16.1

Next, install Claude Code.

curl -fsSL https://claude.ai/install.sh | bash

Before you start, it is worth increasing the maximum context window size allowed by Ollama. By default, it is set to 4k, while the Ollama website itself recommends 64k for Claude Code. I set the maximum value to 256k for testing different models.
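One way to raise the limit globally is an environment variable read by the Ollama server on startup. The variable name OLLAMA_CONTEXT_LENGTH is my assumption based on recent Ollama versions; you can also change the context window in the Ollama app settings.

```shell
# Assumed global default context window for the Ollama server
# (256k tokens = 262144); restart the server afterwards so it takes effect.
export OLLAMA_CONTEXT_LENGTH=262144
echo "Context length set to $OLLAMA_CONTEXT_LENGTH tokens"
```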

For example, the gpt-oss model supports a 128k context window size.

Let’s pull and run the gpt-oss model with Ollama:

ollama run gpt-oss

After the model is downloaded and started, you can verify its parameters with the ollama ps command. If it shows 100% GPU and a context window size of ~131k, everything is configured as intended.

Ensure you are in the root repository directory, then run Claude Code with the command ollama launch claude. Next, choose the gpt-oss model visible in the list under “More”.

ollama-claude-code-gpt-oss

That’s it! Finally, we can start playing with AI.

ollama-claude-code-run

Generate a Java App with Claude Code

My application will be very simple. I just need something to quickly test the solution. Of course, all guidelines defined in the CLAUDE.md file should be followed. So, here is my prompt. Nothing more, nothing less 🙂

Generate an application that exposes REST API and connects to a PostgreSQL database.
The application should have a Person entity with id, and typical fields related to each person.
All REST endpoints should be protected with JWT and OAuth2.
The codebase should use Skaffold to deploy on Kubernetes.

After a few minutes, I have the entire code generated. Below is a summary from the AI of what has been done. If you want to check it out for yourself, take a look at this branch in my repository.

ollama-claude-code-generated

For the sake of formality, let’s take a look at the generated code. There is nothing spectacular here, because it is just a regular Spring Boot application that exposes a few REST endpoints for CRUD operations. However, it doesn’t look bad. Here’s the Spring Boot @Service implementation responsible for using PersonRepository to interact with the database.

@Service
public class PersonService {
    private final PersonRepository repository;

    public PersonService(PersonRepository repository) {
        this.repository = repository;
    }

    public List<Person> findAll() {
        return repository.findAll();
    }

    public Optional<Person> findById(Long id) {
        return repository.findById(id);
    }

    @Transactional
    public Person create(Person person) {
        return repository.save(person);
    }

    @Transactional
    public Optional<Person> update(Long id, Person person) {
        return repository.findById(id).map(existing -> {
            existing.setFirstName(person.getFirstName());
            existing.setLastName(person.getLastName());
            existing.setEmail(person.getEmail());
            existing.setAge(person.getAge());
            return repository.save(existing);
        });
    }

    @Transactional
    public void delete(Long id) {
        repository.deleteById(id);
    }
}
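The service above implies a Person entity with id, firstName, lastName, email, and age fields. Here is a minimal sketch of that class; the JPA annotations the generated version carries (such as @Entity, @Id, and @GeneratedValue) are left out as comments so the snippet compiles standalone.

```java
// Sketch of the Person entity implied by the service code above.
// In the generated project this class additionally carries JPA annotations,
// e.g. @Entity on the class and @Id/@GeneratedValue on the id field.
public class Person {
    private Long id;
    private String firstName;
    private String lastName;
    private String email;
    private Integer age;

    public Long getId() { return id; }
    public void setId(Long id) { this.id = id; }
    public String getFirstName() { return firstName; }
    public void setFirstName(String firstName) { this.firstName = firstName; }
    public String getLastName() { return lastName; }
    public void setLastName(String lastName) { this.lastName = lastName; }
    public String getEmail() { return email; }
    public void setEmail(String email) { this.email = email; }
    public Integer getAge() { return age; }
    public void setAge(Integer age) { this.age = age; }
}
```

Note that, in line with the CLAUDE.md rules, the getters and setters are written out explicitly instead of being generated by Lombok.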

Here’s the generated @RestController with the REST endpoints implementation:

@RestController
@RequestMapping("/api/people")
public class PersonController {
    private final PersonService service;

    public PersonController(PersonService service) {
        this.service = service;
    }

    @GetMapping
    public List<Person> getAll() {
        return service.findAll();
    }

    @GetMapping("/{id}")
    public ResponseEntity<Person> getById(@PathVariable Long id) {
        Optional<Person> person = service.findById(id);
        return person.map(ResponseEntity::ok).orElseGet(() -> ResponseEntity.notFound().build());
    }

    @PostMapping
    public ResponseEntity<Person> create(@RequestBody Person person) {
        Person saved = service.create(person);
        return ResponseEntity.status(201).body(saved);
    }

    @PutMapping("/{id}")
    public ResponseEntity<Person> update(@PathVariable Long id, @RequestBody Person person) {
        Optional<Person> updated = service.update(id, person);
        return updated.map(ResponseEntity::ok).orElseGet(() -> ResponseEntity.notFound().build());
    }

    @DeleteMapping("/{id}")
    public ResponseEntity<Void> delete(@PathVariable Long id) {
        service.delete(id);
        return ResponseEntity.noContent().build();
    }
}
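Once the app is running, the endpoints above can be exercised with curl. The host, port, and the way you obtain the token are assumptions for illustration; since the endpoints are protected with JWT and OAuth2, every request must carry a bearer token.

```shell
# Hypothetical usage; assumes the app listens on localhost:8080
# and that $TOKEN already holds a valid JWT access token.

# List all people
curl -H "Authorization: Bearer $TOKEN" http://localhost:8080/api/people

# Create a new person (returns HTTP 201 with the saved entity)
curl -X POST http://localhost:8080/api/people \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"firstName":"Jane","lastName":"Doe","email":"jane@doe.com","age":30}'
```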

Below is a summary in a pull request with the generated code.

ollama-claude-code-pr

Using Ollama Cloud Models

Recently, Ollama has made it possible to run models not only locally, but also in the cloud. By default, all models tagged with cloud run this way. Cloud models are automatically offloaded to Ollama’s cloud service while offering the same capabilities as local models. This is most useful for larger models that wouldn’t fit on a personal computer. For example, you can try to experiment with the qwen3-coder model locally; unfortunately, it didn’t perform very well on my laptop.

Then, I can run the same or even a larger model in the cloud and automatically connect Claude Code to that model using the following command:

ollama launch claude --model qwen3-coder:480b-cloud

Now you can repeat exactly the same exercise as before or take a look at my branch containing the code generated using this model.

You can also try some other cloud models like minimax-m2.5 or glm-5.

Conclusion

If you’re developing locally and don’t want to burn money on APIs, use Claude Code with Ollama and, e.g., the gpt-oss or glm-5 models. It’s a pretty powerful and free option. If you have a powerful personal computer, a locally launched model should be able to generate the code efficiently. Otherwise, you can launch the model in the cloud, which Ollama offers free of charge up to a certain usage limit (it is difficult to say exactly what that limit is). The gpt-oss model worked really well on my laptop (MacBook Pro M3), and it took about 7-8 minutes to generate the application. You can also look for a model that suits you better.
