feat: adds support for image builder api v3 by daniel-white · Pull Request #504 · tensorlakeai/tensorlake

daniel-white · 2026-01-13T20:03:29Z

adds v3 endpoint support
new high level builder client
replaces existing image build

src/tensorlake/builder/client_v3.py

eabatalov · 2026-01-20T22:16:54Z

src/tensorlake/applications/interface/function.py


    def __str__(self) -> str:
        # Shows a simple human readable representation of the Function. Used in error messages.
-        function_name: str = (


Please don't change this. It handles function objects that the user has misconfigured and left inconsistent.
Once an app is deployed, it's fully validated and doesn't have any inconsistencies.

eabatalov · 2026-01-20T22:18:43Z

src/tensorlake/applications/interface/function.py

        return self._awaitables_factory

+    @property
+    def function_name(self) -> str:


This new getter is most probably not needed because we're already using this information provided by SDK.
I'll figure out how you use this and write an approach that aligns with current architecture.

It was getting it via the private path

eabatalov · 2026-01-20T22:20:22Z

src/tensorlake/cli/deploy.py

+)
+from tensorlake.applications.image_builder.client_v3 import (
+    ImageBuilderClientV3,
+    ImageBuilderClientV3Options,


Why are ImageBuilderClient and ImageBuilder in separate packages? Isn't all of this ImageBuilderClient stuff?

Separation of concerns.

eabatalov · 2026-01-20T22:21:33Z

src/tensorlake/cli/deploy.py

+        f"⚙️  Preparing deployment for application(s) from {application_file_path}"
    )

+    opts = ImageBuilderClientV3Options.from_env()


Please add a type hint to opts.

eabatalov · 2026-01-20T22:26:27Z

src/tensorlake/cli/deploy.py

+
+            if not build_req.images:
+                click.secho(
+                    f"❌ No images found for application '{fn_config.function_name}'. "


This is already validated during application validation. No need to do this check again.

eabatalov · 2026-01-20T22:28:53Z

src/tensorlake/cli/deploy.py

+                raise click.Abort
+
+            await builder.build(build_req)
+        except (


If we're rewriting ImageBuilder, it should provide a clear contract on which exception types it raises and in which cases. We also shouldn't use click inside the SDK, where the new ImageBuilder now lives, because the SDK doesn't depend on Click; only the CLI does.

eabatalov · 2026-01-20T22:29:31Z

src/tensorlake/cli/deploy.py

+        except Exception:
+            # Error message and summary are already printed by builder.build()
+            # Print final error message
+            click.secho("\n❌ Image build(s) failed", err=True, fg="red")


We swallow the exception here without printing it. How are we and the user going to debug such errors?

eabatalov · 2026-01-20T22:32:56Z

src/tensorlake/applications/image_builder/__init__.py

+from typing import AsyncGenerator
+from uuid import uuid4 as uuid
+
+import click


Click is a CLI only dependency, please don't add it into SDK.

lets discuss tomorrow.

eabatalov · 2026-01-20T22:33:37Z

src/tensorlake/applications/image_builder/__init__.py

+import tempfile
+from datetime import datetime
+from typing import AsyncGenerator
+from uuid import uuid4 as uuid


We typically use nanoid (vendored module) but if you want we can use something else.

eabatalov · 2026-01-20T22:35:09Z

src/tensorlake/applications/image_builder/__init__.py

+
+
+class BuildRequest:
+    """Represents a request to build an application with multiple container images.


Do we need such verbose docstrings? Asking because these are internal non-public components.

yea i thought these were public

eabatalov · 2026-01-20T22:36:23Z

src/tensorlake/applications/image_builder/client_v3.py

@@ -0,0 +1,1122 @@
+"""


Please reduce verbosity of docstrings and make them concise and clear.

eabatalov · 2026-01-20T22:39:17Z

src/tensorlake/applications/image_builder/__init__.py

+        """
+
+        try:
+            v3_req = (


Please add type hints to all local variables to keep the code base easy to read and enforce discipline and alignment with existing coding practices in the repo.

any way to force that with a tool?

eabatalov · 2026-01-20T22:41:23Z

src/tensorlake/applications/image_builder/__init__.py

+            }
+        except ImageBuilderClientV3Error as e:
+            click.secho(str(e), err=True, fg="red")
+            raise


All these exceptions need to bubble up to the CLI and be logged there with click.echo.
ImageBuilder needs to document all exceptions it can raise, with enough detail to make exception handling by the CLI meaningful. If the CLI is not going to do different things depending on the exception type, then just document "raises Exception on error".

yea. im thinking we need some kind of verbose mode, otherwise showing full stack traces all the time is annoying
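
A minimal sketch of the split discussed in this thread, using only the standard library. The exception names (`ImageBuildError`, `ImageBuildFailedError`), the `build()` and `deploy()` functions and the `verbose` flag are all hypothetical and do not come from the PR. The SDK side raises and never prints; the CLI side decides how much to show, which is one way to get a "verbose mode" without full stack traces by default.

```python
import traceback


class ImageBuildError(Exception):
    """Hypothetical base class for everything the builder raises."""


class ImageBuildFailedError(ImageBuildError):
    """Hypothetical: the build service reported a failed build."""

    def __init__(self, image_build_id: str, reason: str) -> None:
        super().__init__(f"image build {image_build_id} failed: {reason}")
        self.image_build_id: str = image_build_id


def build(image_build_id: str, fail: bool) -> str:
    """SDK side: raises ImageBuildError on failure and prints nothing."""
    if fail:
        raise ImageBuildFailedError(image_build_id, "base image not found")
    return image_build_id


def format_error(exc: BaseException, verbose: bool) -> str:
    """CLI side: one-line message by default, full traceback when verbose."""
    if verbose:
        return "".join(
            traceback.format_exception(type(exc), exc, exc.__traceback__)
        )
    return f"{type(exc).__name__}: {exc}"


def deploy(image_build_id: str, fail: bool, verbose: bool = False) -> int:
    """CLI side: the only place that turns exceptions into user output."""
    try:
        build(image_build_id, fail)
    except ImageBuildError as exc:
        # A real CLI would use click.echo(..., err=True) here.
        print(format_error(exc, verbose))
        return 1
    return 0
```

Because `deploy()` catches only the documented base class, anything unexpected still propagates with its traceback instead of being swallowed.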

eabatalov · 2026-01-20T22:42:34Z

src/tensorlake/applications/image_builder/__init__.py

+            ]
+            _ = await asyncio.gather(*process_log_events_tasks, return_exceptions=True)
+
+        except (asyncio.CancelledError, KeyboardInterrupt, click.Abort):


Should we remove except for these exceptions from cli code?

eabatalov · 2026-01-20T22:43:39Z

src/tensorlake/applications/image_builder/__init__.py

+                    reporters.values(), log_streams, strict=True
+                )
+            ]
+            _ = await asyncio.gather(*process_log_events_tasks, return_exceptions=True)


return_exceptions=True means that all exceptions raised are put into the returned list which is ignored here.
Is this correct?
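
For reference, a small self-contained sketch of what `return_exceptions=True` does. `stream_logs` is a hypothetical stand-in for the per-image log task, not code from the PR. Nothing is raised at the `await`; failures come back as exception objects in the result list, so discarding that list with `_ =` also discards the errors.

```python
import asyncio


async def stream_logs(image_name: str, fail: bool) -> str:
    """Stand-in for a per-image log processing task (hypothetical)."""
    await asyncio.sleep(0)
    if fail:
        raise RuntimeError(f"log stream for {image_name} broke")
    return image_name


async def run_all() -> tuple:
    results = await asyncio.gather(
        stream_logs("image-a", fail=False),
        stream_logs("image-b", fail=True),
        return_exceptions=True,
    )
    # Results keep the order in which the awaitables were passed; each entry
    # is either a return value or the exception that task raised.
    succeeded = [r for r in results if not isinstance(r, BaseException)]
    failed = [r for r in results if isinstance(r, BaseException)]
    return succeeded, failed


succeeded, failed = asyncio.run(run_all())
print(succeeded, [str(e) for e in failed])
```

Inspecting the returned list like this, or dropping `return_exceptions=True` so that the first failure propagates, are the two usual ways to avoid losing the errors.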

eabatalov · 2026-01-20T22:45:15Z

src/tensorlake/applications/image_builder/__init__.py

+                self._handle_build_info_error(
+                    reporter,
+                    summary,
+                    f"Error getting final build info for {reporter.image_build_id}: {e}",


Do we need to write all these separate except blocks to add these "Error getting final build info" lines when a full backtrace and exception are available to us?

just different error messages. happy to learn

eabatalov · 2026-01-20T22:49:10Z

src/tensorlake/applications/image_builder/__init__.py

+        Args:
+            summary: Dictionary with counts for total, succeeded, failed, canceled, and unknown.
+        """
+        total = summary["total"]


Please use a @dataclass or something similar instead of untyped dicts.
Also, please don't print anything except one success message for all image builds if everything succeeded.
We need to reduce cognitive load on users, not increase it by printing more build stats than they need to see.
I think all these print methods can be removed unless you have a clear reason to keep them.
We also need to output a single final success/failure message for all applications, so this printing needs to be done in the CLI.
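
A sketch of what the typed replacement could look like. `BuildSummary` and `final_message` are hypothetical names; the counters mirror the keys listed in the docstring above (succeeded, failed, canceled, unknown), and `total` is derived rather than stored.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class BuildSummary:
    """Hypothetical typed replacement for the untyped summary dict."""

    succeeded: int = 0
    failed: int = 0
    canceled: int = 0
    unknown: int = 0

    @property
    def total(self) -> int:
        return self.succeeded + self.failed + self.canceled + self.unknown

    @property
    def all_succeeded(self) -> bool:
        return self.total > 0 and self.succeeded == self.total


def final_message(summaries: List[BuildSummary]) -> str:
    """CLI side: one message for all applications, no per-build stats."""
    if all(s.all_succeeded for s in summaries):
        return "All images built successfully"
    return "Image build(s) failed"
```

With the summary returned from the builder as a value, the decision about what to print stays entirely in the CLI.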

eabatalov · 2026-01-20T22:53:52Z

src/tensorlake/applications/image_builder/__init__.py

+        """
+        if not image_info.functions:
+            raise ValueError("image_info.functions cannot be empty")
+        self.image_info = image_info


If the whole purpose of this object is to wrap ImageInformation and provide a method that transforms it, why not just replace it with a private function def _v3_request(image_info)? This would remove more than 100 lines of docstrings and unneeded code. Also, ImageBuildRequest and BuildRequest are hard-to-understand entities; they sound like the same thing. If we keep the ImageBuildRequest class, then it needs to be clear what it is just from its name.

eabatalov · 2026-01-20T22:57:09Z

src/tensorlake/applications/image_builder/__init__.py

+    may consist of multiple container images. Each image can contain different
+    sets of functions and their dependencies.
+
+    Tensorlake applications can use multiple images to separate functions with


We don't need to explain Tensorlake Applications UX here, as this class is for internal use and we have good public docs that are responsible for explaining all of this.

eabatalov · 2026-01-20T23:01:47Z

src/tensorlake/applications/image_builder/__init__.py

+            The v3 image build request.
+        """
+        image = self.image_info.image
+        function_names = [func.function_name for func in self.image_info.functions]


Please use func._function_config.function_name. It's okay because this code is part of the SDK.
This is consistent with how the function name is currently obtained all across the SDK for properly initialized function objects. Also, a public .function_name property shouldn't be added to the Function object, because users work with Function objects directly and would be able to use it. We keep the API surface exposed to users minimal to ease maintenance.

felt kind of yucky reaching into its internals...

eabatalov · 2026-01-20T23:03:42Z

src/tensorlake/cli/deploy.py

    print_validation_messages,
    validate_loaded_applications,
 )
-from tensorlake.builder.client_v2 import BuildContext, ImageBuilderV2Client


Please delete all clients not used anymore in this PR.

eabatalov · 2026-01-20T23:05:13Z

src/tensorlake/applications/image_builder/__init__.py

+            self._print_build_summary(summary)
+            # Use os._exit() to bypass asyncio cleanup and avoid "unhandled exception" errors
+            # This exits immediately without triggering asyncio.run() cleanup issues
+            os._exit(0)


This looks like a hack, please fix this properly or explain why it's not possible at the moment.
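
One common pattern that avoids the need for `os._exit()`, sketched under an assumption the PR does not confirm: that the "unhandled exception" noise comes from tasks still pending when `asyncio.run()` closes the loop. `process_log_events` is a hypothetical stand-in, and the timeout merely simulates a user interrupt.

```python
import asyncio


async def process_log_events(name: str) -> None:
    """Stand-in for a long-running log streaming task (hypothetical)."""
    while True:
        await asyncio.sleep(0.01)


async def build(cancel_after: float) -> str:
    tasks = [
        asyncio.create_task(process_log_events(name))
        for name in ("image-a", "image-b")
    ]
    try:
        # The timeout simulates the user interrupting the build.
        await asyncio.wait_for(asyncio.gather(*tasks), timeout=cancel_after)
        return "finished"
    except asyncio.TimeoutError:
        return "canceled"
    finally:
        # Cancel whatever is still running and wait for it to unwind, so no
        # task is left pending when asyncio.run() closes the event loop.
        for task in tasks:
            task.cancel()
        await asyncio.gather(*tasks, return_exceptions=True)


print(asyncio.run(build(cancel_after=0.05)))
```

Here `return_exceptions=True` is used deliberately: the `CancelledError` results of the cleanup step are expected and safe to ignore.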

eabatalov · 2026-01-20T23:07:25Z

src/tensorlake/applications/image_builder/__init__.py

+                    f"Build service error getting final build info for {reporter.image_build_id}: {e}",
+                )
+            except Exception as e:  # pylint: disable=broad-except
+                self._handle_build_info_error(


Why do we have many except blocks that do almost the same?
Also why do we call _handle_build_info_error instead of bubbling the exception up to the caller?

eabatalov · 2026-01-20T23:09:11Z

src/tensorlake/applications/image_builder/client_v3.py

+
+from tensorlake.cli._common import ASYNC_HTTP_EVENT_HOOKS
+
+# Enable httpx debug logging if requested via environment variable


httpx is used across SDK and CLI.
What side effects does this have on them?
Is it right to put this code to client_v3.py?
Is this approach documented in httpx docs?
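
Some context for these questions: httpx emits its logs through the standard `logging` module under the `httpx` and `httpcore` logger names, and logger levels are shared by the whole process. Changing them at import time of one module therefore affects every httpx user in the SDK and CLI. A hedged sketch of an opt-in alternative follows; the environment variable name and the function are hypothetical, and the function is meant to be called once from the CLI entry point.

```python
import logging
import os
from typing import Mapping, Optional

# Hypothetical variable name; the PR's actual name is not shown in this thread.
DEBUG_ENV_VAR = "TENSORLAKE_HTTPX_DEBUG"


def configure_http_debug_logging(
    environ: Optional[Mapping[str, str]] = None,
) -> bool:
    """Enable DEBUG logs for httpx/httpcore if the env var is set to "1".

    Returns True if debug logging was enabled. A handler still has to be
    configured (for example via logging.basicConfig()) for records to print.
    """
    env = os.environ if environ is None else environ
    if env.get(DEBUG_ENV_VAR) != "1":
        return False
    for name in ("httpx", "httpcore"):
        logging.getLogger(name).setLevel(logging.DEBUG)
    return True
```

Passing `environ` explicitly keeps the function testable without mutating `os.environ`.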

eabatalov · 2026-01-20T23:10:50Z

src/tensorlake/applications/image_builder/client_v3.py

+        return self._project_id
+
+    @classmethod
+    def from_env(cls) -> "ImageBuilderClientV3Options":


This looks to me like we're copy-pasting CLI code here.
Why do we do that? Who's going to use from_env() ?

eabatalov · 2026-01-20T23:11:42Z

src/tensorlake/applications/image_builder/client_v3.py

+        Validation is not performed in __init__. Call validate() to validate the configuration.
+    """
+
+    _base_url: str = field(init=False)


If this is a simple @dataclass, then why do we make these fields private and add lots of getter properties for it? Can we remove the getter properties?

eabatalov · 2026-01-20T23:13:00Z

src/tensorlake/applications/image_builder/client_v3.py

+    _api_key: str | None = field(default=None, init=False)
+    _pat: str | None = field(default=None, init=False)
+    _organization_id: str | None = field(default=None, init=False)
+    _project_id: str | None = field(default=None, init=False)


What's the use for field here?
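
To illustrate what `field(default=None, init=False)` does: it removes the field from the generated `__init__`, so callers cannot pass it and it always starts at its default. The sketch contrasts that with an ordinary dataclass whose fields are public and need no getter properties. Both class names and `base_url`'s example value are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class WithPrivateFields:
    # init=False: excluded from the generated __init__, so it can only be
    # set after construction (for example by a from_env()-style helper).
    _api_key: Optional[str] = field(default=None, init=False)


@dataclass
class PlainOptions:
    # Ordinary dataclass fields: settable via __init__ and readable directly,
    # so no getter properties are required.
    base_url: str
    api_key: Optional[str] = None


opts = PlainOptions(base_url="https://example.invalid", api_key="secret")
print(opts.api_key)
```

If the fields need to be read-only after construction, `@dataclass(frozen=True)` achieves that without hand-written properties.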

eabatalov · 2026-01-20T23:18:46Z

src/tensorlake/applications/image_builder/client_v3.py

+
+
+@dataclass
+class ImageBuilderClientV3Options:


Can we remove this and just create ImageBuilderClientV3() directly in CLI?
We're essentially creating an abstraction here over concepts that ImageBuilderClient shouldn't know about, like pat, api_key, organization_id and project_id. This is all managed by the CLI. Long story short, there's nothing ImageBuilder-specific in ImageBuilderClientV3Options.

Also please extract common logic with APIClient, see https://github.com/tensorlakeai/tensorlake/blob/main/src/tensorlake/applications/remote/api_client.py#L157C9-L157C16 and https://github.com/tensorlakeai/tensorlake/blob/main/src/tensorlake/applications/remote/api_client.py#L490 and reuse it in both this client and APIClient. This will result in a more complete refactoring that you started already.

If you really want to keep ImageBuilderClientV3Options, then it needs to be a common, non-image-builder-specific options object accepted by both the image builder and the existing API clients. The options object needs to be defined in a common package and filled in by the CLI, which handles configuration of all these things.
Also if you use it then existing CLI code needs to use it too where applicable.

eabatalov · 2026-01-20T23:27:00Z

src/tensorlake/applications/image_builder/client_v3.py

+# ============================================================================
+
+# Type aliases for clarity (these are just str at runtime)
+ImageBuildId = str


Why do we add these type aliases instead of using str directly?
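
For context on the trade-off: a plain alias such as `ImageBuildId = str` is the very same object as `str`, so type checkers treat the two as interchangeable and the alias only documents intent. `typing.NewType` is the standard way to make a checker distinguish them. `StrictBuildId` below is a hypothetical name used only for comparison.

```python
from typing import NewType

# Plain alias, as in the diff: identical to str for both runtime and checkers.
ImageBuildId = str

# Hypothetical alternative: a distinct type for static checkers, still a
# plain str at runtime.
StrictBuildId = NewType("StrictBuildId", str)

build_id = StrictBuildId("abc123")
print(ImageBuildId is str, isinstance(build_id, str))
```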

eabatalov

Overall feedback: please remove unnecessary docstrings and abstractions + address the comments I left so far.

eabatalov · 2026-01-21T10:20:39Z

Another thing. Please move all image builder related stuff inside SDK under remote directory. This is where all Tensorlake Cloud (aka remote mode) code is strictly located. Everything else besides local and remote dirs is generic.

I didn't review the following but please ensure that it's correct:

Request timeouts for each httpx client request we're doing are correct for the expected duration of the request.
Every httpx.client request has a correct retry policy.
See APIClient for examples.
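
A generic sketch of the two properties listed above, a per-attempt timeout and a retry policy, written with `asyncio` only. It is not APIClient's actual policy, which this thread does not show; `call_with_retries` and its parameters are hypothetical, and the set of retryable errors is an assumption.

```python
import asyncio
from typing import Awaitable, Callable, TypeVar

T = TypeVar("T")


async def call_with_retries(
    make_request: Callable[[], Awaitable[T]],
    timeout_sec: float,
    max_attempts: int = 3,
    base_delay_sec: float = 0.01,
) -> T:
    """Run make_request with a per-attempt timeout and exponential backoff.

    Retries on timeouts and ConnectionError only; anything else propagates.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be >= 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return await asyncio.wait_for(make_request(), timeout=timeout_sec)
        except (asyncio.TimeoutError, ConnectionError):
            if attempt == max_attempts:
                raise
            # Delay doubles after every failed attempt.
            await asyncio.sleep(base_delay_sec * 2 ** (attempt - 1))
    raise AssertionError("unreachable")
```

The timeout passed in should match the expected duration of each request: a short value for status lookups, a much longer one for calls that stream build logs.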

daniel-white force-pushed the feat/imagebuilderv3 branch from cd6b0db to 5aa6cb6 Compare January 13, 2026 20:06

calavera reviewed Jan 13, 2026


src/tensorlake/builder/client_v3.py (outdated)

daniel-white force-pushed the feat/imagebuilderv3 branch from 81b41a2 to 112e296 Compare January 15, 2026 18:17

daniel-white added 8 commits January 20, 2026 11:17

feat: adds support for image builder api v3

bf5614d

rename to make it easier to understand what each type is

b0ac385

log and sync

ffc5722

clean up, new emojis

042ecf7

refactor

745f6fd

adds cancel app build api

e40907a

shorten display name

e596bec

periodic message

6896c00

daniel-white requested a review from eabatalov January 20, 2026 17:10

daniel-white force-pushed the feat/imagebuilderv3 branch from b9f4ecb to 6896c00 Compare January 20, 2026 17:11

daniel-white changed the title ~~WIP feat: adds support for image builder api v3~~ feat: adds support for image builder api v3 Jan 20, 2026

daniel-white marked this pull request as ready for review January 20, 2026 20:30

eabatalov reviewed Jan 20, 2026


eabatalov requested changes Jan 20, 2026


daniel-white added 2 commits January 20, 2026 20:04

data class and nano id

5496fa3

reduce docs, remove type aliases

21920f7


