[Platform] Introduce CachedPlatform #416
Conversation
$metadata->add('cached', true);
$metadata->add('prompt_cache_key', $options['prompt_cache_key']);
$metadata->add('cached_prompt_count', $data['prompt_eval_count']);
$metadata->add('cached_completion_count', $data['eval_count']);
Wouldn't it make sense to group this data into a DTO, like the existing TokenUsage, and then add that DTO to the metadata, or perhaps even reuse said DTO?
I'm not convinced about using an object here; we're only storing integers, so I don't see the benefit to be honest 🤔
@OskarStark @chr-hertel Any thoughts?
I agree, it would be great to have an object like CacheUsage, similar to TokenUsage.
Yes CacheUsage would be a good fit
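Building on this suggestion, such a value object could look roughly like the following. This is a hypothetical sketch only: the class name, property names, and metadata key are assumptions, not the final API.

```php
<?php

// Hypothetical value object grouping the cache-related metadata,
// mirroring the shape of TokenUsage. All names are illustrative.
final class CacheUsage
{
    public function __construct(
        public readonly ?string $promptCacheKey = null,
        public readonly ?int $cachedPromptTokens = null,
        public readonly ?int $cachedCompletionTokens = null,
    ) {
    }
}

// Instead of four scalar metadata entries, a single object would be stored:
// $metadata->add('cache_usage', new CacheUsage(
//     $options['prompt_cache_key'],
//     $data['prompt_eval_count'],
//     $data['eval_count'],
// ));
```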
$firstCall = $platform->invoke(new Ollama(Ollama::LLAMA_3_2), [
    'messages' => [
        [
            'role' => 'user',
            'content' => 'Say hello world',
        ],
    ],
    'model' => 'llama3.2',
], [
    'prompt_cache_key' => 'foo',
]);

$result = $firstCall->getResult();

$this->assertSame('Hello world', $result->getContent());
$this->assertSame(10, $result->getMetadata()->get('cached_prompt_count'));
$this->assertSame(10, $result->getMetadata()->get('cached_completion_count'));

$secondCall = $platform->invoke(new Ollama(Ollama::LLAMA_3_2), [
    'messages' => [
        [
            'role' => 'user',
            'content' => 'Say hello world',
        ],
    ],
    'model' => 'llama3.2',
], [
    'prompt_cache_key' => 'foo',
]);

$secondResult = $secondCall->getResult();

$this->assertSame('Hello world', $secondResult->getContent());
Suggested change (run both invocations first, then assert):

$firstCall = $platform->invoke(new Ollama(Ollama::LLAMA_3_2), [
    'messages' => [
        [
            'role' => 'user',
            'content' => 'Say hello world',
        ],
    ],
    'model' => 'llama3.2',
], [
    'prompt_cache_key' => 'foo',
]);
$secondCall = $platform->invoke(new Ollama(Ollama::LLAMA_3_2), [
    'messages' => [
        [
            'role' => 'user',
            'content' => 'Say hello world',
        ],
    ],
    'model' => 'llama3.2',
], [
    'prompt_cache_key' => 'foo',
]);
$firstResult = $firstCall->getResult();
$secondResult = $secondCall->getResult();
$this->assertSame('Hello world', $firstResult->getContent());
$this->assertSame(10, $firstResult->getMetadata()->get('cached_prompt_count'));
$this->assertSame(10, $firstResult->getMetadata()->get('cached_completion_count'));
$this->assertSame('Hello world', $secondResult->getContent());
Let's zoom out a bit here, for two reasons:

Ollama does "context caching" and/or K/V caching: it stores the X latest messages for the model window (or pending tokens to speed up TTFT); it's not a cache that returns the generated response if the request already exists.

Well, because that's the one I use the most and the easiest to implement first, but we can integrate it for every platform if that's the question; we just need to use the API contract, and both Anthropic and OpenAI already do it natively 🤔 If the question is: could we implement it at the platform layer for every platform without relying on API calls, well, that's not a big deal to be honest and we could easily integrate it 🙂

What do you think about having it as a decorator…

I like the idea of…
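The decorator idea discussed in this thread could be shaped roughly as follows. The PlatformInterface signature shown here is a simplified assumption for illustration, not the actual contract of the component, and the key-derivation strategy is likewise an assumption.

```php
<?php

use Symfony\Contracts\Cache\CacheInterface;
use Symfony\Contracts\Cache\ItemInterface;

// Simplified stand-in for the real platform contract (assumption).
interface PlatformInterface
{
    public function invoke(object $model, mixed $input, array $options = []): mixed;
}

// Decorates any platform and serves repeated identical requests from cache,
// so every bridge benefits without API-level support.
final class CachedPlatform implements PlatformInterface
{
    public function __construct(
        private readonly PlatformInterface $decorated,
        private readonly CacheInterface $cache,
    ) {
    }

    public function invoke(object $model, mixed $input, array $options = []): mixed
    {
        // Derive a stable key from the model class, payload, and options.
        $key = hash('xxh128', serialize([$model::class, $input, $options]));

        return $this->cache->get(
            $key,
            fn (ItemInterface $item): mixed => $this->decorated->invoke($model, $input, $options),
        );
    }
}
```

Because any Symfony cache adapter can be injected, this approach stays bridge-agnostic and needs no per-platform API support.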
Changes here would also belong in CachedPlatform, so every bridge can benefit from this decorator.
    vectorizer: 'ai.vectorizer.mistral_embeddings'
    store: 'ai.store.memory.research'

Cached platform
Suggested change:
- Cached platform
+ Cached Platform
---------------

Thanks to Symfony's Cache component, platforms can be decorated and use any cache adapter,
this platform allows to reduce network calls / resource comsumption:
Suggested change:
- this platform allows to reduce network calls / resource comsumption:
+ this platform allows to reduce network calls / resource consumption:
echo $firstResult->getContent().\PHP_EOL;

$secondResult = $cachedPlatform->invoke('gpt-4o-mini', new MessageBag(Message::ofUser('What is the capital of France?')));
Maybe something like:

// This call will not be executed against the API
$secondResult = $cachedPlatform->invoke('gpt-4o-mini', new MessageBag(Message::ofUser('What is the capital of France?')));
# For using Ollama
- OLLAMA_HOST_URL=http://localhost:11434
+ OLLAMA_HOST_URL=http://127.0.0.1:11434
Can we move this to an extra PR please?
->children()
    ->stringNode('platform')->isRequired()->end()
    ->stringNode('service')->isRequired()->end()
    ->stringNode('cache_key')->end()
Can/should we provide a default key here?
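If a default were provided, the node definition could carry it directly; the default value shown here is purely an illustrative assumption, not a proposal from the PR.

```php
// Hypothetical: give 'cache_key' a fallback instead of leaving it unset.
->stringNode('cache_key')->defaultValue('ai_platform')->end()
```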
"symfony/process": "^7.3|^8.0",
"symfony/var-dumper": "^7.3|^8.0"
},
"suggest": {
Please remove; in Symfony we decided against a "suggest" section in composer.json files.
Hi 👋🏻
This PR aims to introduce a caching layer for the Ollama platform (as OpenAI, Anthropic, and others already do).