
Feat: Divided docker layer to make it easier to cache (#4313)

Co-authored-by: Xingyao Wang <xingyao@all-hands.dev>
tofarr 1 year ago
commit 5fb3dece93

+ 27 - 55
docs/modules/usage/architecture/runtime.md

@@ -70,74 +70,46 @@ Check out the [relevant code](https://github.com/All-Hands-AI/OpenHands/blob/mai
 
 ### Image Tagging System
 
-OpenHands uses a dual-tagging system for its runtime images to balance reproducibility with flexibility:
+OpenHands uses a dual-tagging system for its runtime images to balance reproducibility with flexibility.
+Tags may be in one of two formats:
 
-1. Hash-based tag: `{target_image_repo}:{target_image_hash_tag}`.
-   Example: `runtime:abc123def456`
+- **Generic**: `oh_v{openhands_version}_{16_digit_lock_hash}` (e.g.: `oh_v0.9.9_1234567890abcdef`)
+- **Specific**: `oh_v{openhands_version}_{16_digit_lock_hash}_{16_digit_source_hash}`
+  (e.g.: `oh_v0.9.9_1234567890abcdef_1234567890abcdef`)
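As an illustration of the two formats, the tag strings can be assembled as below. This is a minimal sketch: the `make_tags` helper is hypothetical and not part of the OpenHands codebase.

```python
# Hypothetical helper illustrating the two tag formats; not actual OpenHands code.
def make_tags(openhands_version: str, lock_hash: str, source_hash: str) -> tuple[str, str]:
    """Return the (generic, specific) tag pair described above."""
    generic = f"oh_v{openhands_version}_{lock_hash}"  # dependencies-only tag
    specific = f"{generic}_{source_hash}"             # dependencies + source tag
    return generic, specific
```

For example, `make_tags('0.9.9', '1234567890abcdef', '1234567890abcdef')` reproduces the two example tags shown above.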
 
-   - This tag is based on the MD5 hash of the Docker build folder, which includes the source code (of runtime client and related dependencies) and Dockerfile
-   - Identical hash tags guarantee that the images were built with exactly the same source code and Dockerfile
-   - This ensures reproducibility; the same hash always means the same image contents
+#### Lock Hash
 
-2. Generic tag: `{target_image_repo}:{target_image_tag}`.
-   Example: `runtime:oh_v0.9.3_ubuntu_tag_22.04`
+This hash is built from the first 16 characters of the MD5 hash of:
+- The name of the base image upon which the image was built (e.g.: `nikolaik/python-nodejs:python3.12-nodejs22`)
+- The content of the `pyproject.toml` included in the image.
+- The content of the `poetry.lock` included in the image.
 
-   - This tag follows the format: `runtime:oh_v{OH_VERSION}_{BASE_IMAGE_NAME}_tag_{BASE_IMAGE_TAG}`
-   - It represents the latest build for a particular base image and OpenHands version combination
-   - This tag is updated whenever a new image is built from the same base image, even if the source code changes
+This effectively yields a hash for the dependencies of OpenHands, independent of the source code.
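The lock-hash computation can be sketched roughly as follows. This is illustrative only: the function name is invented, the inputs are passed as bytes rather than read from disk, and the truncation here is plain hex, whereas the actual code (shown further down in this diff) re-encodes the digest in base36 before truncating.

```python
import hashlib

def lock_hash_sketch(base_image: str, pyproject: bytes, poetry_lock: bytes) -> str:
    """Illustrative lock hash: MD5 over the base image name and both
    dependency files, truncated to 16 characters."""
    md5 = hashlib.md5()
    md5.update(base_image.encode())  # base image name is part of the hash
    md5.update(pyproject)            # pyproject.toml contents
    md5.update(poetry_lock)          # poetry.lock contents
    return md5.hexdigest()[:16]
```

Any change to the base image, `pyproject.toml`, or `poetry.lock` therefore produces a different lock hash, while source-only changes do not.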
 
-The hash-based tag ensures reproducibility, while the generic tag provides a stable reference to the latest version of a particular configuration. This dual-tagging approach allows OpenHands to efficiently manage both development and production environments.
+#### Source Hash
 
-### Build Process
+This is the first 16 characters of the MD5-based directory hash of the source directory. This gives a hash
+for the OpenHands source code only.
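A simplified stand-in for the directory hash is sketched below. Note the assumptions: the real code uses the third-party `dirhash` library with ignore patterns for hidden directories, `__pycache__`, and `*.pyc`, and re-encodes the digest in base36; this sketch hand-rolls a deterministic walk instead.

```python
import hashlib
from pathlib import Path

def source_hash_sketch(source_dir: Path) -> str:
    """Illustrative source hash: feed each file's relative path and contents
    into MD5 in sorted (deterministic) order, then truncate to 16 characters."""
    md5 = hashlib.md5()
    for path in sorted(source_dir.rglob('*')):
        if path.is_file() and path.suffix != '.pyc':
            md5.update(str(path.relative_to(source_dir)).encode())
            md5.update(path.read_bytes())
    return md5.hexdigest()[:16]
```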
 
-1. Image Naming Convention:
-   - Hash-based tag: `{target_image_repo}:{target_image_hash_tag}`.
-     Example: `runtime:abc123def456`
-   - Generic tag: `{target_image_repo}:{target_image_tag}`.
-     Example: `runtime:oh_v0.9.3_ubuntu_tag_22.04`
+#### Build Process
 
-2. Build Process:
-   - a. Convert the base image name to an OH runtime image name
-      Example: `ubuntu:22.04` -> `runtime:oh_v0.9.3_ubuntu_tag_22.04`
-   - b. Generate a build context (Dockerfile and OpenHands source code) and calculate its hash
-   - c. Check for an existing image with the calculated hash
-   - d. If not found, check for a recent compatible image to use as a base
-   - e. If no compatible image exists, build from scratch using the original base image
-   - f. Tag the new image with both hash-based and generic tags
+When generating an image, the following process applies:
 
-3. Image Reuse and Rebuilding Logic:
-   The system follows these steps to determine whether to build a new image or use an existing one from a user-provided (base) image (e.g., `ubuntu:22.04`):
-   - a. If an image exists with the same hash (e.g., `runtime:abc123def456`), it will be reused as is
-   - b. If the exact hash is not found, the system will try to rebuild using the latest generic image (e.g., `runtime:oh_v0.9.3_ubuntu_tag_22.04`) as a base. This saves time by leveraging existing dependencies
-   - c. If neither the hash-tagged nor the generic-tagged image is found, the system will build the image completely from scratch
+- OpenHands first checks whether an image with the same **Specific** tag exists. If it does,
+  no build is performed; the existing image is reused.
+- OpenHands next checks whether an image with the **Generic** tag exists. If it does,
+  OpenHands builds a new image based upon it, bypassing all installation steps (such as `poetry install`
+  and `apt-get`) except a final step that copies the current source code. The new image is tagged with a
+  **Specific** tag only.
+- If neither a **Specific** nor a **Generic** tagged image exists, a brand new image is built from the base
+  image (a slower operation). This new image is tagged with both the **Generic** and **Specific**
+  tags.
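The three-way decision above can be sketched as follows. `image_exists` is an assumed callable (e.g. a Docker daemon or registry lookup) and the return values are illustrative labels, not the actual OpenHands builder API.

```python
# Sketch of the build-strategy decision described above; not actual OpenHands code.
def pick_build_strategy(image_exists, specific_tag: str, generic_tag: str) -> str:
    if image_exists(specific_tag):
        return 'reuse'              # exact match: skip the build entirely
    if image_exists(generic_tag):
        return 'copy-source-only'   # deps preinstalled: only copy source code
    return 'from-scratch'           # full build; result receives both tags
```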
 
-4. Caching and Efficiency:
-   - The system attempts to reuse existing images when possible to save build time
-   - If an exact match (by hash) is found, it's used without rebuilding
-   - If a compatible image is found, it's used as a base for rebuilding, saving time on dependency installation
-
-Here's a flowchart illustrating the build process:
-
-```mermaid
-flowchart TD
-    A[Start] --> B{Convert base image name}
-    B --> |ubuntu:22.04 -> runtime:oh_v0.9.3_ubuntu_tag_22.04| C[Generate build context and hash]
-    C --> D{Check for existing image with hash}
-    D -->|Found runtime:abc123def456| E[Use existing image]
-    D -->|Not found| F{Check for runtime:oh_v0.9.3_ubuntu_tag_22.04}
-    F -->|Found| G[Rebuild based on recent image]
-    F -->|Not found| H[Build from scratch]
-    G --> I[Tag with hash and generic tags]
-    H --> I
-    E --> J[End]
-    I --> J
-```
-
-This approach ensures that:
+This dual-tagging approach allows OpenHands to efficiently manage both development and production environments.
 
 1. Identical source code and Dockerfile always produce the same image (via hash-based tags)
 2. The system can quickly rebuild images when minor changes occur (by leveraging recent compatible images)
-3. The generic tag (e.g., `runtime:oh_v0.9.3_ubuntu_tag_22.04`) always points to the latest build for a particular base image and OpenHands version combination
+3. The generic tag (e.g., `runtime:oh_v0.9.3_1234567890abcdef`) always points to the latest build for a particular base image and OpenHands version combination
 
 ## Runtime Plugin System
 

+ 0 - 3
openhands/runtime/builder/docker.py

@@ -59,9 +59,6 @@ class DockerRuntimeBuilder(RuntimeBuilder):
         target_image_repo, target_image_hash_tag = target_image_hash_name.split(':')
         target_image_tag = tags[1].split(':')[1] if len(tags) > 1 else None
 
-        # Check if the image exists and pull if necessary
-        self.image_exists(target_image_hash_name)
-
         buildx_cmd = [
             'docker',
             'buildx',

+ 6 - 4
openhands/runtime/modal/runtime.py

@@ -2,6 +2,7 @@ import os
 import tempfile
 import threading
 import uuid
+from pathlib import Path
 from typing import Callable, Generator
 
 import modal
@@ -14,7 +15,7 @@ from openhands.events import EventStream
 from openhands.runtime.client.runtime import EventStreamRuntime, LogBuffer
 from openhands.runtime.plugins import PluginRequirement
 from openhands.runtime.utils.runtime_build import (
-    prep_docker_build_folder,
+    prep_build_folder,
 )
 
 
@@ -148,9 +149,10 @@ class ModalRuntime(EventStreamRuntime):
             base_runtime_image = modal.Image.from_registry(runtime_container_image_id)
         elif base_container_image_id:
             build_folder = tempfile.mkdtemp()
-            prep_docker_build_folder(
-                build_folder,
-                base_container_image_id,
+            prep_build_folder(
+                build_folder=Path(build_folder),
+                base_image=base_container_image_id,
+                build_from_scratch=True,
                 extra_deps=runtime_extra_deps,
             )
 

+ 183 - 253
openhands/runtime/utils/runtime_build.py

@@ -1,16 +1,17 @@
 import argparse
 import hashlib
-import importlib.metadata
 import os
 import shutil
+import string
 import tempfile
+from pathlib import Path
+from typing import List
 
 import docker
 from dirhash import dirhash
 from jinja2 import Environment, FileSystemLoader
 
 import openhands
-from openhands import __package_name__
 from openhands import __version__ as oh_version
 from openhands.core.logger import openhands_logger as logger
 from openhands.runtime.builder import DockerRuntimeBuilder, RuntimeBuilder
@@ -20,65 +21,16 @@ def get_runtime_image_repo():
     return os.getenv('OH_RUNTIME_RUNTIME_IMAGE_REPO', 'ghcr.io/all-hands-ai/runtime')
 
 
-def _put_source_code_to_dir(temp_dir: str):
-    """Builds the project source tarball directly in temp_dir and unpacks it.
-    The OpenHands source code ends up in the temp_dir/code directory.
-    Parameters:
-    - temp_dir (str): The directory to put the source code in
-    """
-    if not os.path.isdir(temp_dir):
-        raise RuntimeError(f'Temp directory {temp_dir} does not exist')
-
-    dest_dir = os.path.join(temp_dir, 'code')
-    openhands_dir = None
-
-    try:
-        # Try to get the source directory from the installed package
-        distribution = importlib.metadata.distribution(__package_name__)
-        source_dir = os.path.dirname(distribution.locate_file(__package_name__))
-        openhands_dir = os.path.join(source_dir, 'openhands')
-    except importlib.metadata.PackageNotFoundError:
-        pass
-
-    if openhands_dir is not None and os.path.isdir(openhands_dir):
-        logger.info(f'Package {__package_name__} found')
-        shutil.copytree(openhands_dir, os.path.join(dest_dir, 'openhands'))
-        # note: "pyproject.toml" and "poetry.lock" are included in the openhands
-        # package, so we need to move them out to the top-level directory
-        for filename in ['pyproject.toml', 'poetry.lock']:
-            shutil.move(os.path.join(dest_dir, 'openhands', filename), dest_dir)
-    else:
-        # If package is not found, build from source code
-        project_root = os.path.dirname(
-            os.path.dirname(os.path.abspath(openhands.__file__))
-        )
-        logger.info(f'Building source distribution using project root: {project_root}')
-
-        # Copy the 'openhands' directory
-        openhands_dir = os.path.join(project_root, 'openhands')
-        if not os.path.isdir(openhands_dir):
-            raise RuntimeError(f"'openhands' directory not found in {project_root}")
-        shutil.copytree(openhands_dir, os.path.join(dest_dir, 'openhands'))
-
-        # Copy pyproject.toml and poetry.lock files
-        for file in ['pyproject.toml', 'poetry.lock']:
-            src_file = os.path.join(project_root, file)
-            dest_file = os.path.join(dest_dir, file)
-            shutil.copy2(src_file, dest_file)
-
-    logger.info(f'Unpacked source code directory: {dest_dir}')
-
-
 def _generate_dockerfile(
     base_image: str,
-    skip_init: bool = False,
+    build_from_scratch: bool = True,
     extra_deps: str | None = None,
 ) -> str:
     """Generate the Dockerfile content for the runtime image based on the base image.
 
     Parameters:
     - base_image (str): The base image provided for the runtime image
-    - skip_init (boolean):
+    - build_from_scratch (boolean): False implies most steps can be skipped (Base image is another openhands instance)
     - extra_deps (str):
 
     Returns:
@@ -93,69 +45,12 @@ def _generate_dockerfile(
 
     dockerfile_content = template.render(
         base_image=base_image,
-        skip_init=skip_init,
+        build_from_scratch=build_from_scratch,
         extra_deps=extra_deps if extra_deps is not None else '',
     )
     return dockerfile_content
 
 
-def prep_docker_build_folder(
-    dir_path: str,
-    base_image: str,
-    skip_init: bool = False,
-    extra_deps: str | None = None,
-) -> str:
-    """Prepares a docker build folder by copying the source code and generating the Dockerfile
-
-    Parameters:
-    - dir_path (str): The build folder to place the source code and Dockerfile
-    - base_image (str): The base Docker image to use for the Dockerfile
-    - skip_init (str):
-    - extra_deps (str):
-
-    Returns:
-    - str: The MD5 hash of the build folder directory (dir_path)
-    """
-    # Copy the source code to directory. It will end up in dir_path/code
-    _put_source_code_to_dir(dir_path)
-
-    # Create a Dockerfile and write it to dir_path
-    dockerfile_content = _generate_dockerfile(
-        base_image,
-        skip_init=skip_init,
-        extra_deps=extra_deps,
-    )
-    if os.getenv('SKIP_CONTAINER_LOGS', 'false') != 'true':
-        logger.debug(
-            (
-                f'===== Dockerfile content start =====\n'
-                f'{dockerfile_content}\n'
-                f'===== Dockerfile content end ====='
-            )
-        )
-    with open(os.path.join(dir_path, 'Dockerfile'), 'w') as file:
-        file.write(dockerfile_content)
-
-    # Get the MD5 hash of the dir_path directory
-    dir_hash = dirhash(
-        dir_path,
-        'md5',
-        ignore=[
-            '.*/',  # hidden directories
-            '__pycache__/',
-            '*.pyc',
-        ],
-    )
-    hash = f'v{oh_version}_{dir_hash}'
-    logger.info(
-        f'Input base image: {base_image}\n'
-        f'Skip init: {skip_init}\n'
-        f'Extra deps: {extra_deps}\n'
-        f'Hash for docker build directory [{dir_path}] (contents: {os.listdir(dir_path)}): {hash}\n'
-    )
-    return hash
-
-
 def get_runtime_image_repo_and_tag(base_image: str) -> tuple[str, str]:
     """Retrieves the Docker repo and tag associated with the Docker image.
 
@@ -204,7 +99,7 @@ def build_runtime_image(
     base_image: str,
     runtime_builder: RuntimeBuilder,
     extra_deps: str | None = None,
-    docker_build_folder: str | None = None,
+    build_folder: str | None = None,
     dry_run: bool = False,
     force_rebuild: bool = False,
 ) -> str:
@@ -215,7 +110,7 @@ def build_runtime_image(
     - base_image (str): The name of the base Docker image to use
     - runtime_builder (RuntimeBuilder): The runtime builder to use
     - extra_deps (str):
-    - docker_build_folder (str): The directory to use for the build. If not provided a temporary directory will be used
+    - build_folder (str): The directory to use for the build. If not provided a temporary directory will be used
     - dry_run (bool): if True, it will only ready the build folder. It will not actually build the Docker image
     - force_rebuild (bool): if True, it will create the Dockerfile which uses the base_image
 
@@ -224,160 +119,195 @@ def build_runtime_image(
 
     See https://docs.all-hands.dev/modules/usage/architecture/runtime for more details.
     """
-    # Calculate the hash for the docker build folder (source code and Dockerfile)
-    with tempfile.TemporaryDirectory() as temp_dir:
-        from_scratch_hash = prep_docker_build_folder(
-            temp_dir,
-            base_image=base_image,
-            skip_init=False,
-            extra_deps=extra_deps,
-        )
-
-    runtime_image_repo, runtime_image_tag = get_runtime_image_repo_and_tag(base_image)
-
-    # The image name in the format <image repo>:<hash>
-    hash_runtime_image_name = f'{runtime_image_repo}:{from_scratch_hash}'
-
-    # non-hash generic image name, it could contain *similar* dependencies
-    # but *might* not exactly match the state of the source code.
-    # It resembles the "latest" tag in the docker image naming convention for
-    # a particular {repo}:{tag} pair (e.g., ubuntu:latest -> runtime:ubuntu_tag_latest)
-    # we will build from IT to save time if the `from_scratch_hash` is not found
-    generic_runtime_image_name = f'{runtime_image_repo}:{runtime_image_tag}'
-
-    # Scenario 1: If we already have an image with the exact same hash, then it means the image is already built
-    # with the exact same source code and Dockerfile, so we will reuse it. Building it is not required.
-    if not force_rebuild and runtime_builder.image_exists(
-        hash_runtime_image_name, False
-    ):
-        logger.info(
-            f'Image [{hash_runtime_image_name}] already exists so we will reuse it.'
-        )
-        return hash_runtime_image_name
-
-    # Scenario 2: If a Docker image with the exact hash is not found, we will FIRST try to re-build it
-    # by leveraging the `generic_runtime_image_name` to save some time
-    # from re-building the dependencies (e.g., poetry install, apt install)
-    if not force_rebuild and runtime_builder.image_exists(generic_runtime_image_name):
-        logger.info(
-            f'Could not find docker image [{hash_runtime_image_name}]\n'
-            f'Will try to re-build it from latest [{generic_runtime_image_name}] image to potentially save '
-            f'time for dependencies installation.\n'
-        )
-
-        cur_docker_build_folder = docker_build_folder or tempfile.mkdtemp()
-        _skip_init_hash = prep_docker_build_folder(
-            cur_docker_build_folder,
-            # we want to use the existing generic image as base
-            # so that we can leverage existing dependencies already installed in the image
-            base_image=generic_runtime_image_name,
-            skip_init=True,  # skip init since we are re-using the existing image
-            extra_deps=extra_deps,
-        )
-
-        assert (
-            _skip_init_hash != from_scratch_hash
-        ), f'The skip_init hash [{_skip_init_hash}] should not match the existing hash [{from_scratch_hash}]'
-
-        if not dry_run:
-            _build_sandbox_image(
-                docker_folder=cur_docker_build_folder,
+    if build_folder is None:
+        with tempfile.TemporaryDirectory() as temp_dir:
+            result = build_runtime_image_in_folder(
+                base_image=base_image,
                 runtime_builder=runtime_builder,
-                target_image_repo=runtime_image_repo,
-                # NOTE: WE ALWAYS use the "from_scratch_hash" tag for the target image
-                # otherwise, even if the source code is exactly the same, the image *might* be re-built
-                # because the same source code will generate different hash when skip_init=True/False
-                # since the Dockerfile is slightly different
-                target_image_hash_tag=from_scratch_hash,
-                target_image_tag=runtime_image_tag,
+                build_folder=Path(temp_dir),
+                extra_deps=extra_deps,
+                dry_run=dry_run,
+                force_rebuild=force_rebuild,
             )
-        else:
-            logger.info(
-                f'Dry run: Skipping image build for [{generic_runtime_image_name}]'
-            )
-
-        if docker_build_folder is None:
-            shutil.rmtree(cur_docker_build_folder)
+            return result
 
-    # Scenario 3: If the Docker image with the required hash is not found AND we cannot re-use the latest
-    # relevant image, we will build it completely from scratch
-    else:
-        if force_rebuild:
-            logger.info(
-                f'Force re-build: Will try to re-build image [{generic_runtime_image_name}] from scratch.\n'
-            )
+    result = build_runtime_image_in_folder(
+        base_image=base_image,
+        runtime_builder=runtime_builder,
+        build_folder=Path(build_folder),
+        extra_deps=extra_deps,
+        dry_run=dry_run,
+        force_rebuild=force_rebuild,
+    )
+    return result
 
-        cur_docker_build_folder = docker_build_folder or tempfile.mkdtemp()
-        _new_from_scratch_hash = prep_docker_build_folder(
-            cur_docker_build_folder,
-            base_image,
-            skip_init=False,
-            extra_deps=extra_deps,
-        )
-        assert (
-            _new_from_scratch_hash == from_scratch_hash
-        ), f'The new from scratch hash [{_new_from_scratch_hash}] does not match the existing hash [{from_scratch_hash}]'
 
+def build_runtime_image_in_folder(
+    base_image: str,
+    runtime_builder: RuntimeBuilder,
+    build_folder: Path,
+    extra_deps: str | None,
+    dry_run: bool,
+    force_rebuild: bool,
+) -> str:
+    runtime_image_repo, _ = get_runtime_image_repo_and_tag(base_image)
+    lock_tag = f'oh_v{oh_version}_{get_hash_for_lock_files(base_image)}'
+    hash_tag = f'{lock_tag}_{get_hash_for_source_files()}'
+    hash_image_name = f'{runtime_image_repo}:{hash_tag}'
+
+    if force_rebuild:
+        logger.info(f'Force rebuild: [{runtime_image_repo}:{hash_tag}] from scratch.')
+        prep_build_folder(build_folder, base_image, True, extra_deps)
         if not dry_run:
             _build_sandbox_image(
-                docker_folder=cur_docker_build_folder,
-                runtime_builder=runtime_builder,
-                target_image_repo=runtime_image_repo,
-                # NOTE: WE ALWAYS use the "from_scratch_hash" tag for the target image
-                target_image_hash_tag=from_scratch_hash,
-                target_image_tag=runtime_image_tag,
-            )
-        else:
-            logger.info(
-                f'Dry run: Skipping image build for [{generic_runtime_image_name}]'
+                build_folder,
+                runtime_builder,
+                runtime_image_repo,
+                hash_tag,
+                lock_tag,
             )
+        return hash_image_name
+
+    lock_image_name = f'{runtime_image_repo}:{lock_tag}'
+    build_from_scratch = True
+
+    # If the exact image already exists, we do not need to build it
+    if runtime_builder.image_exists(hash_image_name, False):
+        logger.info(f'Reusing Image [{hash_image_name}]')
+        return hash_image_name
+
+    # We look for an existing image that shares the same lock_tag. If such an image exists, we
+    # can use it as the base image for the build and just copy source files. This makes the build
+    # much faster.
+    if runtime_builder.image_exists(lock_image_name):
+        logger.info(f'Build [{hash_image_name}] from [{lock_image_name}]')
+        build_from_scratch = False
+        base_image = lock_image_name
+    else:
+        logger.info(f'Build [{hash_image_name}] from scratch')
+
+    prep_build_folder(build_folder, base_image, build_from_scratch, extra_deps)
+    if not dry_run:
+        _build_sandbox_image(
+            build_folder,
+            runtime_builder,
+            runtime_image_repo,
+            hash_tag,
+            lock_tag,
+        )
 
-        if docker_build_folder is None:
-            shutil.rmtree(cur_docker_build_folder)
-
-    return f'{runtime_image_repo}:{from_scratch_hash}'
+    return hash_image_name
 
 
-def _build_sandbox_image(
-    docker_folder: str,
-    runtime_builder: RuntimeBuilder,
-    target_image_repo: str,
-    target_image_hash_tag: str,
-    target_image_tag: str,
-) -> str:
-    """Build and tag the sandbox image.
-    The image will be tagged as both:
-        - target_image_hash_tag
-        - target_image_tag
+def prep_build_folder(
+    build_folder: Path,
+    base_image: str,
+    build_from_scratch: bool,
+    extra_deps: str | None,
+):
+    # Copy the source code to directory. It will end up in build_folder/code
+    # If package is not found, build from source code
+    openhands_source_dir = Path(openhands.__file__).parent
+    project_root = openhands_source_dir.parent
+    logger.info(f'Building source distribution using project root: {project_root}')
+
+    # Copy the 'openhands' directory (Source code)
+    shutil.copytree(
+        openhands_source_dir,
+        Path(build_folder, 'code', 'openhands'),
+        ignore=shutil.ignore_patterns(
+            '.*/',
+            '__pycache__/',
+            '*.pyc',
+            '*.md',
+        ),
+    )
 
-    Parameters:
-    - docker_folder (str): the path to the docker build folder
-    - runtime_builder (RuntimeBuilder): the runtime builder instance
-    - target_image_repo (str): the repository name for the target image
-    - target_image_hash_tag (str): the *hash* tag for the target image that is calculated based
-        on the contents of the docker build folder (source code and Dockerfile)
-        e.g. 1234567890abcdef
-    -target_image_tag (str): the tag for the target image that's generic and based on the base image name
-        e.g. oh_v0.9.3_image_ubuntu_tag_22.04
-    """
-    target_image_hash_name = f'{target_image_repo}:{target_image_hash_tag}'
-    target_image_generic_name = f'{target_image_repo}:{target_image_tag}'
+    # Copy pyproject.toml and poetry.lock files
+    for file in ['pyproject.toml', 'poetry.lock']:
+        src = Path(openhands_source_dir, file)
+        if not src.exists():
+            src = Path(project_root, file)
+        shutil.copy2(src, Path(build_folder, 'code', file))
 
-    tags_to_add = [target_image_hash_name]
+    # Create a Dockerfile and write it to build_folder
+    dockerfile_content = _generate_dockerfile(
+        base_image,
+        build_from_scratch=build_from_scratch,
+        extra_deps=extra_deps,
+    )
+    with open(Path(build_folder, 'Dockerfile'), 'w') as file:  # type: ignore
+        file.write(dockerfile_content)  # type: ignore
+
+
+_ALPHABET = string.digits + string.ascii_lowercase
+
+
+def truncate_hash(hash: str) -> str:
+    """Convert the base16 hash to base36 and truncate at 16 characters."""
+    value = int(hash, 16)
+    result: List[str] = []
+    while value > 0 and len(result) < 16:
+        value, remainder = divmod(value, len(_ALPHABET))
+        result.append(_ALPHABET[remainder])
+    return ''.join(result)
+
+
+def get_hash_for_lock_files(base_image: str):
+    openhands_source_dir = Path(openhands.__file__).parent
+    md5 = hashlib.md5()
+    md5.update(base_image.encode())
+    for file in ['pyproject.toml', 'poetry.lock']:
+        src = Path(openhands_source_dir, file)
+        if not src.exists():
+            src = Path(openhands_source_dir.parent, file)
+        with open(src, 'rb') as f:
+            for chunk in iter(lambda: f.read(4096), b''):
+                md5.update(chunk)
+    # We get away with truncation because we want something that is unique
+    # rather than something that is cryptographically secure
+    result = truncate_hash(md5.hexdigest())
+    return result
+
+
+def get_hash_for_source_files():
+    openhands_source_dir = Path(openhands.__file__).parent
+    dir_hash = dirhash(
+        openhands_source_dir,
+        'md5',
+        ignore=[
+            '.*/',  # hidden directories
+            '__pycache__/',
+            '*.pyc',
+        ],
+    )
+    # We get away with truncation because we want something that is unique
+    # rather than something that is cryptographically secure
+    result = truncate_hash(dir_hash)
+    return result
 
-    # Only add the generic tag if the image does not exist
-    # so it does not get overwritten & only points to the earliest version
-    # to avoid "too many layers" after many re-builds
-    if not runtime_builder.image_exists(target_image_generic_name):
-        tags_to_add.append(target_image_generic_name)
 
-    try:
-        image_name = runtime_builder.build(path=docker_folder, tags=tags_to_add)
-        if not image_name:
-            raise RuntimeError(f'Build failed for image {target_image_hash_name}')
-    except Exception as e:
-        logger.error(f'Sandbox image build failed: {str(e)}')
-        raise
+def _build_sandbox_image(
+    build_folder: Path,
+    runtime_builder: RuntimeBuilder,
+    runtime_image_repo: str,
+    hash_tag: str,
+    lock_tag: str,
+):
+    """Build and tag the sandbox image. The image will be tagged with all tags that do not yet exist"""
+
+    names = [
+        name
+        for name in [
+            f'{runtime_image_repo}:{hash_tag}',
+            f'{runtime_image_repo}:{lock_tag}',
+        ]
+        if not runtime_builder.image_exists(name, False)
+    ]
+
+    image_name = runtime_builder.build(path=str(build_folder), tags=names)
+    if not image_name:
+        raise RuntimeError(f'Build failed for image {names}')
 
     return image_name
 
@@ -416,7 +346,7 @@ if __name__ == '__main__':
             runtime_image_hash_name = build_runtime_image(
                 args.base_image,
                 runtime_builder=DockerRuntimeBuilder(docker.from_env()),
-                docker_build_folder=temp_dir,
+                build_folder=temp_dir,
                 dry_run=True,
                 force_rebuild=args.force_rebuild,
             )

+ 25 - 22
openhands/runtime/utils/runtime_templates/Dockerfile.j2

@@ -4,10 +4,13 @@ FROM {{ base_image }}
 ENV POETRY_VIRTUALENVS_PATH=/openhands/poetry
 ENV MAMBA_ROOT_PREFIX=/openhands/micromamba
 
-{% if not skip_init %}
+{% if build_from_scratch %}
 # ================================================================
 # START: Build Runtime Image from Scratch
 # ================================================================
+# This is used in cases where the base image is something more generic like nikolaik/python-nodejs
+# rather than the current OpenHands release
+
 {% if 'ubuntu' in base_image and (base_image.endswith(':latest') or base_image.endswith(':24.04')) %}
 {% set LIBGL_MESA = 'libgl1' %}
 {% else %}
@@ -38,24 +41,14 @@ RUN mkdir -p /openhands/micromamba/bin && \
 RUN /openhands/micromamba/bin/micromamba create -n openhands -y && \
     /openhands/micromamba/bin/micromamba install -n openhands -c conda-forge poetry python=3.12 -y
 
-# ================================================================
-# END: Build Runtime Image from Scratch
-# ================================================================
-{% endif %}
-
-# ================================================================
-# START: Copy Project and Install/Update Dependencies
-# ================================================================
-RUN if [ -d /openhands/code ]; then rm -rf /openhands/code; fi
-COPY ./code /openhands/code
+# Create a clean openhands directory including only the pyproject.toml, poetry.lock and openhands/__init__.py
+RUN \
+    if [ -d /openhands/code ]; then rm -rf /openhands/code; fi && \
+    mkdir -p /openhands/code/openhands && \
+    touch /openhands/code/openhands/__init__.py
+COPY ./code/pyproject.toml ./code/poetry.lock /openhands/code
 
-# Below RUN command sets up the Python environment using Poetry,
-# installs project dependencies, and configures the container
-# for OpenHands development.
-# It creates and activates a virtual environment, installs necessary
-# tools like Playwright, sets up environment variables, and configures
-# the bash environment to ensure the correct Python interpreter and
-# virtual environment are used by default.
+# Install all dependencies
 WORKDIR /openhands/code
 RUN \
     /openhands/micromamba/bin/micromamba config set changeps1 False && \
@@ -70,16 +63,26 @@ RUN \
     /openhands/micromamba/bin/micromamba run -n openhands poetry run playwright install --with-deps chromium && \
     # Set environment variables
     echo "OH_INTERPRETER_PATH=$(/openhands/micromamba/bin/micromamba run -n openhands poetry run python -c "import sys; print(sys.executable)")" >> /etc/environment && \
-    # Install extra dependencies if specified
-    {{ extra_deps }} {% if extra_deps %} && {% endif %} \
     # Clear caches
     /openhands/micromamba/bin/micromamba run -n openhands poetry cache clear --all . && \
     # Set permissions
-    {% if not skip_init %}chmod -R g+rws /openhands/poetry && {% endif %} \
+    chmod -R g+rws /openhands/poetry && \
     mkdir -p /openhands/workspace && chmod -R g+rws,o+rw /openhands/workspace && \
     # Clean up
     apt-get clean && rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/* && \
     /openhands/micromamba/bin/micromamba clean --all
+
+# ================================================================
+# END: Build Runtime Image from Scratch
 # ================================================================
-# END: Copy Project and Install/Update Dependencies
+{% endif %}
+
 # ================================================================
+# Copy Project source files
+# ================================================================
+RUN if [ -d /openhands/code/openhands ]; then rm -rf /openhands/code/openhands; fi
+COPY ./code/pyproject.toml ./code/poetry.lock /openhands/code/
+COPY ./code/openhands /openhands/code/openhands
+
+# Install extra dependencies if specified
+{% if extra_deps %}RUN {{ extra_deps }} {% endif %}
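The layer split above means the expensive `poetry install` layer is rebuilt only when the dependency lock files change, while source-only edits touch just the final `COPY`. The lock hash that keys those cached layers can be sketched as follows (a stdlib-only sketch: reading the lock files from the current directory and truncating the raw hex digest are simplifications here; the real `truncate_hash` in `openhands/runtime/utils/runtime_build.py` re-encodes the digest before shortening it):

```python
import hashlib


def truncate_hash(digest: str, length: int = 16) -> str:
    # Illustrative truncation only; the real helper re-encodes the
    # digest before cutting it down to 16 characters.
    return digest[:length]


def get_hash_for_lock_files(base_image: str) -> str:
    """Hash the base image name plus the dependency lock files.

    The result changes only when the base image, pyproject.toml, or
    poetry.lock change, so dependency layers keyed on it can be reused
    across source-only edits.
    """
    md5 = hashlib.md5()
    md5.update(base_image.encode())
    for filename in ('pyproject.toml', 'poetry.lock'):
        with open(filename, 'rb') as f:
            md5.update(f.read())
    return truncate_hash(md5.hexdigest())
```

Feeding the base image name into the same digest means that switching base images invalidates the dependency cache even when the lock files are identical.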

+ 130 - 188
tests/unit/test_runtime_build.py

@@ -1,25 +1,29 @@
+import hashlib
 import os
 import tempfile
 import uuid
 from importlib.metadata import version
-from unittest import mock
-from unittest.mock import ANY, MagicMock, call, patch
+from pathlib import Path
+from unittest.mock import ANY, MagicMock, mock_open, patch
 
 import docker
 import pytest
 import toml
 from pytest import TempPathFactory
 
+import openhands
 from openhands import __version__ as oh_version
 from openhands.core.logger import openhands_logger as logger
 from openhands.runtime.builder.docker import DockerRuntimeBuilder
 from openhands.runtime.utils.runtime_build import (
     _generate_dockerfile,
-    _put_source_code_to_dir,
     build_runtime_image,
+    get_hash_for_lock_files,
+    get_hash_for_source_files,
     get_runtime_image_repo,
     get_runtime_image_repo_and_tag,
-    prep_docker_build_folder,
+    prep_build_folder,
+    truncate_hash,
 )
 
 OH_VERSION = f'oh_v{oh_version}'
@@ -73,39 +77,19 @@ def _check_source_code_in_dir(temp_dir):
     assert _pyproject_version == version('openhands-ai')
 
 
-def test_put_source_code_to_dir(temp_dir):
+def test_prep_build_folder(temp_dir):
     shutil_mock = MagicMock()
-    with patch(f'{_put_source_code_to_dir.__module__}.shutil', shutil_mock):
-        _put_source_code_to_dir(temp_dir)
-    shutil_mock.copytree.assert_called_once_with(
-        os.path.join(os.getcwd(), 'openhands'),
-        os.path.join(temp_dir, 'code', 'openhands'),
-    )
-    shutil_mock.copy2.assert_has_calls(
-        [
-            mock.call(
-                os.path.join(os.getcwd(), 'pyproject.toml'),
-                os.path.join(temp_dir, 'code', 'pyproject.toml'),
-            ),
-            mock.call(
-                os.path.join(os.getcwd(), 'poetry.lock'),
-                os.path.join(temp_dir, 'code', 'poetry.lock'),
-            ),
-        ]
-    )
-
-
-def test_docker_build_folder(temp_dir):
-    mock = MagicMock()
-    with patch(f'{_put_source_code_to_dir.__module__}._put_source_code_to_dir', mock):
-        prep_docker_build_folder(
+    with patch(f'{prep_build_folder.__module__}.shutil', shutil_mock):
+        prep_build_folder(
             temp_dir,
             base_image=DEFAULT_BASE_IMAGE,
-            skip_init=False,
+            build_from_scratch=True,
+            extra_deps=None,
         )
 
     # make sure that the code was copied
-    mock.assert_called_once_with(temp_dir)
+    shutil_mock.copytree.assert_called_once()
+    assert shutil_mock.copy2.call_count == 2
 
     # Now check dockerfile is in the folder
     dockerfile_path = os.path.join(temp_dir, 'Dockerfile')
@@ -113,71 +97,40 @@ def test_docker_build_folder(temp_dir):
     assert os.path.isfile(dockerfile_path)
 
 
-def test_hash_folder_same(temp_dir):
-    mock = (
-        MagicMock()
-    )  # We don't actually need to copy the rest of the files to perform this check
-    with patch(f'{_put_source_code_to_dir.__module__}._put_source_code_to_dir', mock):
-        dir_hash_1 = prep_docker_build_folder(
-            temp_dir,
-            base_image=DEFAULT_BASE_IMAGE,
-            skip_init=False,
-        )
-
-        with tempfile.TemporaryDirectory() as temp_dir_2:
-            dir_hash_2 = prep_docker_build_folder(
-                temp_dir_2,
-                base_image=DEFAULT_BASE_IMAGE,
-                skip_init=False,
-            )
-        assert dir_hash_1 == dir_hash_2
-
-
-def test_hash_folder_diff_init(temp_dir):
-    mock = (
-        MagicMock()
-    )  # We don't actually need to copy the all of the files to perform this check
-    with patch(f'{_put_source_code_to_dir.__module__}._put_source_code_to_dir', mock):
-        dir_hash_1 = prep_docker_build_folder(
-            temp_dir,
-            base_image=DEFAULT_BASE_IMAGE,
-            skip_init=False,
-        )
-
-        with tempfile.TemporaryDirectory() as temp_dir_2:
-            dir_hash_2 = prep_docker_build_folder(
-                temp_dir_2,
-                base_image=DEFAULT_BASE_IMAGE,
-                skip_init=True,
-            )
-    assert dir_hash_1 != dir_hash_2
-
-
-def test_hash_folder_diff_image(temp_dir):
-    mock = (
-        MagicMock()
-    )  # We don't actually need to copy all of the files to perform this check
-    with patch(f'{_put_source_code_to_dir.__module__}._put_source_code_to_dir', mock):
-        dir_hash_1 = prep_docker_build_folder(
-            temp_dir,
-            base_image=DEFAULT_BASE_IMAGE,
-            skip_init=False,
+def test_get_hash_for_lock_files():
+    with patch('builtins.open', mock_open(read_data='mock-data'.encode())):
+        hash = get_hash_for_lock_files('some_base_image')
+        # Since we mocked open to always return "mock-data", the hash is the
+        # result of hashing the base image name followed by "mock-data" twice
+        md5 = hashlib.md5()
+        md5.update('some_base_image'.encode())
+        for _ in range(2):
+            md5.update('mock-data'.encode())
+        assert hash == truncate_hash(md5.hexdigest())
+
+
+def test_get_hash_for_source_files():
+    dirhash_mock = MagicMock()
+    dirhash_mock.return_value = '1f69bd20d68d9e3874d5bf7f7459709b'
+    with patch(f'{get_hash_for_source_files.__module__}.dirhash', dirhash_mock):
+        result = get_hash_for_source_files()
+        assert result == truncate_hash(dirhash_mock.return_value)
+        dirhash_mock.assert_called_once_with(
+            Path(openhands.__file__).parent,
+            'md5',
+            ignore=[
+                '.*/',  # hidden directories
+                '__pycache__/',
+                '*.pyc',
+            ],
         )
 
-        with tempfile.TemporaryDirectory() as temp_dir_2:
-            dir_hash_2 = prep_docker_build_folder(
-                temp_dir_2,
-                base_image='debian:11',
-                skip_init=False,
-            )
-    assert dir_hash_1 != dir_hash_2
-
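The `dirhash` call mocked in `test_get_hash_for_source_files` walks the `openhands` package and digests it with MD5, skipping hidden directories, `__pycache__/`, and `*.pyc`. A rough stdlib-only approximation of what such a directory hash computes (the real `dirhash` library's digest will differ; this only illustrates why any source edit produces a new source hash):

```python
import hashlib
from pathlib import Path


def hash_source_tree(root: str) -> str:
    """Stdlib-only approximation of dirhash(root, 'md5', ignore=[...]).

    Walks files in sorted order, skipping hidden directories,
    __pycache__, and *.pyc, and feeds relative paths plus contents
    into one MD5 so any source edit changes the digest.
    """
    md5 = hashlib.md5()
    for path in sorted(Path(root).rglob('*')):
        rel = path.relative_to(root)
        if any(p.startswith('.') or p == '__pycache__' for p in rel.parts):
            continue  # mirrors the ignore patterns in the test above
        if path.is_file() and path.suffix != '.pyc':
            md5.update(str(rel).encode())
            md5.update(path.read_bytes())
    return md5.hexdigest()[:16]
```

Hashing relative paths alongside file contents keeps the digest stable across machines while still detecting renames.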
 
-def test_generate_dockerfile_scratch():
+def test_generate_dockerfile_build_from_scratch():
     base_image = 'debian:11'
     dockerfile_content = _generate_dockerfile(
         base_image,
-        skip_init=False,
+        build_from_scratch=True,
     )
     assert base_image in dockerfile_content
     assert 'apt-get update' in dockerfile_content
@@ -186,29 +139,29 @@ def test_generate_dockerfile_scratch():
     assert 'python=3.12' in dockerfile_content
 
     # Check the update command
-    assert 'COPY ./code /openhands/code' in dockerfile_content
+    assert 'COPY ./code/openhands /openhands/code/openhands' in dockerfile_content
     assert (
         '/openhands/micromamba/bin/micromamba run -n openhands poetry install'
         in dockerfile_content
     )
 
 
-def test_generate_dockerfile_skip_init():
+def test_generate_dockerfile_build_from_existing():
     base_image = 'debian:11'
     dockerfile_content = _generate_dockerfile(
         base_image,
-        skip_init=True,
+        build_from_scratch=False,
     )
 
-    # These commands SHOULD NOT include in the dockerfile if skip_init is True
+    # These commands SHOULD NOT be included in the dockerfile if build_from_scratch is False
     assert 'RUN apt update && apt install -y wget sudo' not in dockerfile_content
     assert '-c conda-forge' not in dockerfile_content
     assert 'python=3.12' not in dockerfile_content
     assert 'https://micro.mamba.pm/install.sh' not in dockerfile_content
+    assert 'poetry install' not in dockerfile_content
 
     # These update commands SHOULD still be in the dockerfile
-    assert 'COPY ./code /openhands/code' in dockerfile_content
-    assert 'poetry install' in dockerfile_content
+    assert 'COPY ./code/openhands /openhands/code/openhands' in dockerfile_content
 
 
 def test_get_runtime_image_repo_and_tag_eventstream():
@@ -234,114 +187,96 @@ def test_get_runtime_image_repo_and_tag_eventstream():
     )
 
 
-def test_build_runtime_image_from_scratch(temp_dir):
+def test_build_runtime_image_from_scratch():
     base_image = 'debian:11'
-    mock = (
-        MagicMock()
-    )  # We don't actually need to copy all of the files to perform this check
-    with patch(f'{_put_source_code_to_dir.__module__}._put_source_code_to_dir', mock):
-        from_scratch_hash = prep_docker_build_folder(
-            temp_dir,
-            base_image,
-            skip_init=False,
-        )
-
-        mock_runtime_builder = MagicMock()
-        mock_runtime_builder.image_exists.return_value = False
-        mock_runtime_builder.build.return_value = (
-            f'{get_runtime_image_repo()}:{from_scratch_hash}'
-        )
-
+    mock_lock_hash = MagicMock()
+    mock_lock_hash.return_value = 'mock-lock-hash'
+    mock_source_hash = MagicMock()
+    mock_source_hash.return_value = 'mock-source-hash'
+    mock_runtime_builder = MagicMock()
+    mock_runtime_builder.image_exists.return_value = False
+    mock_runtime_builder.build.return_value = (
+        f'{get_runtime_image_repo()}:{OH_VERSION}_mock-lock-hash'
+    )
+    mock_prep_build_folder = MagicMock()
+    mod = build_runtime_image.__module__
+    with (
+        patch(f'{mod}.get_hash_for_lock_files', mock_lock_hash),
+        patch(f'{mod}.get_hash_for_source_files', mock_source_hash),
+        patch(f'{mod}.prep_build_folder', mock_prep_build_folder),
+    ):
         image_name = build_runtime_image(base_image, mock_runtime_builder)
         mock_runtime_builder.build.assert_called_once_with(
             path=ANY,
             tags=[
-                f'{get_runtime_image_repo()}:{from_scratch_hash}',
-                f'{get_runtime_image_repo()}:{OH_VERSION}_image_debian_tag_11',
+                f'{get_runtime_image_repo()}:{OH_VERSION}_mock-lock-hash_mock-source-hash',
+                f'{get_runtime_image_repo()}:{OH_VERSION}_mock-lock-hash',
             ],
         )
-        assert image_name == f'{get_runtime_image_repo()}:{from_scratch_hash}'
-
-
-def test_build_runtime_image_exact_hash_exist(temp_dir):
-    base_image = 'debian:11'
-    mock = (
-        MagicMock()
-    )  # We don't actually need to copy all of the files to perform this check
-    with patch(f'{_put_source_code_to_dir.__module__}._put_source_code_to_dir', mock):
-        from_scratch_hash = prep_docker_build_folder(
-            temp_dir,
-            base_image,
-            skip_init=False,
+        assert (
+            image_name
+            == f'{get_runtime_image_repo()}:{OH_VERSION}_mock-lock-hash_mock-source-hash'
         )
+        mock_prep_build_folder.assert_called_once_with(ANY, base_image, True, None)
 
-        mock_runtime_builder = MagicMock()
-        mock_runtime_builder.image_exists.return_value = True
-        mock_runtime_builder.build.return_value = (
-            f'{get_runtime_image_repo()}:{from_scratch_hash}'
-        )
 
+def test_build_runtime_image_exact_hash_exist():
+    base_image = 'debian:11'
+    mock_lock_hash = MagicMock()
+    mock_lock_hash.return_value = 'mock-lock-hash'
+    mock_source_hash = MagicMock()
+    mock_source_hash.return_value = 'mock-source-hash'
+    mock_runtime_builder = MagicMock()
+    mock_runtime_builder.image_exists.return_value = True
+    mock_prep_build_folder = MagicMock()
+    mod = build_runtime_image.__module__
+    with (
+        patch(f'{mod}.get_hash_for_lock_files', mock_lock_hash),
+        patch(f'{mod}.get_hash_for_source_files', mock_source_hash),
+        patch(f'{mod}.prep_build_folder', mock_prep_build_folder),
+    ):
         image_name = build_runtime_image(base_image, mock_runtime_builder)
-        assert image_name == f'{get_runtime_image_repo()}:{from_scratch_hash}'
+        assert (
+            image_name
+            == f'{get_runtime_image_repo()}:{OH_VERSION}_mock-lock-hash_mock-source-hash'
+        )
         mock_runtime_builder.build.assert_not_called()
+        mock_prep_build_folder.assert_not_called()
 
 
-@patch('openhands.runtime.utils.runtime_build._build_sandbox_image')
-def test_build_runtime_image_exact_hash_not_exist(mock_build_sandbox_image, temp_dir):
+def test_build_runtime_image_exact_hash_not_exist():
     base_image = 'debian:11'
-    repo, latest_image_tag = get_runtime_image_repo_and_tag(base_image)
-    latest_image_name = f'{repo}:{latest_image_tag}'
-
-    mock = (
-        MagicMock()
-    )  # We don't actually need to copy all of the files to perform this check
-    with patch(f'{_put_source_code_to_dir.__module__}._put_source_code_to_dir', mock):
-        from_scratch_hash = prep_docker_build_folder(
-            temp_dir,
-            base_image,
-            skip_init=False,
+    mock_lock_hash = MagicMock()
+    mock_lock_hash.return_value = 'mock-lock-hash'
+    mock_source_hash = MagicMock()
+    mock_source_hash.return_value = 'mock-source-hash'
+    mock_runtime_builder = MagicMock()
+    mock_runtime_builder.image_exists.side_effect = [False, True, False, True]
+    mock_prep_build_folder = MagicMock()
+    mod = build_runtime_image.__module__
+    with (
+        patch(f'{mod}.get_hash_for_lock_files', mock_lock_hash),
+        patch(f'{mod}.get_hash_for_source_files', mock_source_hash),
+        patch(f'{mod}.prep_build_folder', mock_prep_build_folder),
+    ):
+        image_name = build_runtime_image(base_image, mock_runtime_builder)
+        assert (
+            image_name
+            == f'{get_runtime_image_repo()}:{OH_VERSION}_mock-lock-hash_mock-source-hash'
+        )
+        mock_runtime_builder.build.assert_called_once()
+        mock_prep_build_folder.assert_called_once_with(
+            ANY, f'{get_runtime_image_repo()}:{OH_VERSION}_mock-lock-hash', False, None
         )
-        with tempfile.TemporaryDirectory() as temp_dir_2:
-            non_from_scratch_hash = prep_docker_build_folder(
-                temp_dir_2,
-                base_image,
-                skip_init=True,
-            )
-
-        mock_runtime_builder = MagicMock()
-        # Set up mock_runtime_builder.image_exists to return False then True
-        mock_runtime_builder.image_exists.side_effect = [False, True]
-
-        with patch(
-            'openhands.runtime.utils.runtime_build.prep_docker_build_folder'
-        ) as mock_prep_docker_build_folder:
-            mock_prep_docker_build_folder.side_effect = [
-                from_scratch_hash,
-                non_from_scratch_hash,
-            ]
-
-            image_name = build_runtime_image(base_image, mock_runtime_builder)
-
-            mock_prep_docker_build_folder.assert_has_calls(
-                [
-                    call(ANY, base_image=base_image, skip_init=False, extra_deps=None),
-                    call(
-                        ANY,
-                        base_image=latest_image_name,
-                        skip_init=True,
-                        extra_deps=None,
-                    ),
-                ]
-            )
-
-            mock_build_sandbox_image.assert_called_once_with(
-                docker_folder=ANY,
-                runtime_builder=mock_runtime_builder,
-                target_image_repo=repo,
-                target_image_hash_tag=from_scratch_hash,
-                target_image_tag=latest_image_tag,
-            )
-            assert image_name == f'{repo}:{from_scratch_hash}'
 
 
 # ==============================
@@ -588,3 +523,10 @@ def test_image_exists_not_found():
     mock_client.api.pull.assert_called_once_with(
         'nonexistent', tag='image', stream=True, decode=True
     )
+
+
+def test_truncate_hash():
+    truncated = truncate_hash('b08f254d76b1c6a7ad924708c0032251')
+    assert truncated == 'pma2wc71uq3c9a85'
+    truncated = truncate_hash('102aecc0cea025253c0278f54ebef078')
+    assert truncated == '4titk6gquia3taj5'
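The expected strings in `test_truncate_hash` look like a base-36 re-encoding of the 128-bit MD5 value cut to 16 characters. A hypothetical sketch of that shape (digit order and encoding are guesses, so this is not expected to reproduce the exact strings above, only output of the same form):

```python
import string


def truncate_hash_sketch(hex_digest: str, length: int = 16) -> str:
    """Hypothetical truncate_hash: re-encode the hex digest in base 36
    and keep the first `length` symbols.  The exact encoding used by
    OpenHands may differ; only the output shape is meant to match.
    """
    alphabet = string.digits + string.ascii_lowercase  # base-36 symbols
    value = int(hex_digest, 16)
    out = []
    while value and len(out) < length:
        value, rem = divmod(value, 36)
        out.append(alphabet[rem])
    return ''.join(out)
```

Re-encoding before truncating packs more of the digest's entropy into 16 characters than slicing the hex string would (about 82 bits versus 64).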