6 changes: 5 additions & 1 deletion .gitignore
@@ -162,9 +162,13 @@ cython_debug/
# and can be added to the global gitignore or merged into this file. For a more nuclear
# option (not recommended) you can uncomment the following to ignore the entire idea folder.
#.idea/
input/
evaluation/reports/
output/
# *data_collection/collect/*.sh
*data_collection/collect/temp
*evaluation/temp
temp/
temp/


ignore.*
102 changes: 102 additions & 0 deletions DOCKER_LOGGING_FIX.md
@@ -0,0 +1,102 @@
# Docker Logging Configuration Fix

## Issue Description

The Docker containers created by the SWE Factory tool were using a logging configuration that disabled container logs:

```json
"ContainerIDFile": "",
"LogConfig": {
"Type": "none",
"Config": {}
}
```

This configuration meant:
- No container logs were being captured
- `docker logs <container>` would not work
- Debugging container issues was difficult
- The tool couldn't capture container output for analysis
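
Whether a given container is affected can be checked from Python with the Docker SDK; this is a minimal diagnostic sketch, and the container name is a placeholder:

```python
import docker

client = docker.from_env()
container = client.containers.get("my-test-container")  # placeholder name

# {'Type': 'none', 'Config': {}} means logging is disabled for this container.
print(container.attrs["HostConfig"]["LogConfig"])
```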

## Root Cause

The issue was in the container creation code in two files:
1. `app/agents/test_analysis_agent/docker_utils.py` - `build_container()` function
2. `evaluation/docker_build.py` - `build_container()` and `build_setup_container()` functions

These functions were creating containers without an explicit logging configuration, so Docker fell back to the daemon's default log driver, which may be set to `none` in some environments.

## Solution Applied

I've updated all container creation calls to include an explicit logging configuration:

### Before:
```python
container = client.containers.create(
image=test_image_name,
name=test_container_name,
user="root",
detach=True,
command="tail -f /dev/null",
nano_cpus=None,
platform="linux/x86_64",
)
```

### After:
```python
container = client.containers.create(
image=test_image_name,
name=test_container_name,
user="root",
detach=True,
command="tail -f /dev/null",
nano_cpus=None,
platform="linux/x86_64",
log_config={
"Type": "json-file",
"Config": {
"max-size": "10m",
"max-file": "3"
}
}
)
```
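
The same settings can also be expressed with the SDK's `LogConfig` helper type instead of a raw dict; a sketch of the equivalent call, assuming the surrounding variables from the example above:

```python
from docker.types import LogConfig

log_config = LogConfig(
    type=LogConfig.types.JSON,  # the "json-file" driver
    config={"max-size": "10m", "max-file": "3"},
)

container = client.containers.create(
    image=test_image_name,
    name=test_container_name,
    user="root",
    detach=True,
    command="tail -f /dev/null",
    nano_cpus=None,
    platform="linux/x86_64",
    log_config=log_config,
)
```

Both forms produce the same `HostConfig.LogConfig` on the created container.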

## Benefits of the Fix

1. **Container Logs Available**: You can now use `docker logs <container>` to view container output
2. **Better Debugging**: Container issues can be diagnosed more easily
3. **Log Rotation**: Logs are automatically rotated when they reach 10MB
4. **Storage Management**: Only 3 log files are kept per container
5. **Tool Functionality**: The SWE Factory tool can now capture and analyze container output

## Files Modified

1. `app/agents/test_analysis_agent/docker_utils.py` - Lines 330-340
2. `evaluation/docker_build.py` - Lines 590-600 and 650-660

## Testing the Fix

After applying this fix, you should be able to:

1. Run the SWE Factory tool as usual
2. Use `docker logs <container_name>` to view container logs
3. See container output in the tool's log files
4. Debug container issues more effectively
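
These checks can also be scripted with the Docker SDK; a small verification sketch, again with a placeholder container name:

```python
import docker

client = docker.from_env()
container = client.containers.get("my-test-container")  # placeholder name

# With the fix applied, the json-file driver should be in place...
assert container.attrs["HostConfig"]["LogConfig"]["Type"] == "json-file"

# ...and container output should be retrievable again.
print(container.logs(tail=50).decode())
```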

## Recommended Docker Configuration

For optimal performance, ensure your Docker daemon is configured with:

```json
{
"log-driver": "json-file",
"log-opts": {
"max-size": "10m",
"max-file": "3"
}
}
```

This ensures consistent logging behavior across all containers, even those created without an explicit logging configuration. On Linux the daemon configuration typically lives at `/etc/docker/daemon.json`; restart the Docker daemon after editing it.
129 changes: 129 additions & 0 deletions FILE_LOCATION_DEBUG.md
@@ -0,0 +1,129 @@
# File Location Debug Guide

## Issue Description

When running the SWE Factory tool, files are being created at the repository root instead of in the specified output directories, despite providing the correct `--output-dir`, `--setup-dir`, and `--results-path` parameters.

## Root Cause Analysis

After analyzing the codebase, I found that the system is correctly designed to use the specified output directories: all file-writing operations construct their paths with `pjoin()` or `os.path.join()` against those directories.

However, there are several potential causes for files appearing in the wrong location:

### 1. Working Directory Changes
The code uses `cd` context managers in several places (such as in the `dump_cost` function), which temporarily change the working directory. If any file operation uses a relative path while the directory is changed (inside the context, or in another thread running concurrently), it can write to the wrong location.
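
A minimal sketch of the pattern (not the project's exact helper) shows why relative writes made while the directory is changed end up elsewhere:

```python
import contextlib
import os

@contextlib.contextmanager
def cd(path):
    """Temporarily change the working directory, dump_cost-style."""
    prev = os.getcwd()
    os.chdir(path)
    try:
        yield
    finally:
        os.chdir(prev)

with cd("/tmp"):
    # This relative path resolves against /tmp, not the intended output dir.
    with open("cost.json", "w") as f:
        f.write("{}")
```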

### 2. Race Conditions
The system uses multiprocessing, and a worker process inherits whatever working directory is current when it is spawned; a process started while a `cd` context is active therefore begins in the wrong directory.

### 3. Missing Absolute Paths
Some file operations might not be using absolute paths, causing them to write relative to the current working directory.

## Changes Made

I've made the following improvements to ensure files are written to the correct locations:

### 1. Added Absolute Path Safety Checks
- Modified `run_raw_task()` to ensure `task_output_dir` is absolute
- Modified `do_inference()` to ensure `task_output_dir` is absolute
- Modified `dump_cost()` to ensure `task_output_dir` is absolute
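
The guard in each of these functions amounts to the following normalization (a sketch of the idea, not the exact diff):

```python
import os

def ensure_absolute(task_output_dir: str) -> str:
    # Resolve the path once, up front, so later `cd` contexts cannot
    # change what a relative path refers to.
    if not os.path.isabs(task_output_dir):
        task_output_dir = os.path.abspath(task_output_dir)
    return task_output_dir
```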

### 2. Added Debug Logging
- Added logging in `AgentsManager` to track where Dockerfile, eval.sh, and status.json are written
- Added logging in `TestAnalysisAgent` to track where Dockerfile and eval.sh are written

### 3. Created Debug Script
- Created `debug_file_locations.py` to monitor file creation during execution

## How to Debug the Issue

### Step 1: Run with Debug Logging
The enhanced logging will now show exactly where files are being written. Look for log messages like:
```
Writing Dockerfile to: /path/to/output/dir/Dockerfile
Writing eval.sh to: /path/to/output/dir/eval.sh
Writing status.json to: /path/to/output/dir/status.json
```

### Step 2: Use the Debug Script
Run the debug script in a separate terminal to monitor file creation:

```bash
# In one terminal, start the debug script
python debug_file_locations.py output/swe-factory-runs/kareldb-test 600 10

# In another terminal, run your command
LITELLM_API_BASE="https://api.dev.halo.engineer/v1/ai" \
OPENAI_API_KEY="${OPENAI_API_KEY?->Need a key}" \
PYTHONPATH=. python app/main.py local-issue \
--task-id "kareldb-connection-1" \
--local-repo "/Users/hector.maldonado@clearroute.io/xynova/kareldb-cp" \
--issue-file "input/kareldb_test_issue.txt" \
--model google/gemini-2.5-flash \
--output-dir "output/swe-factory-runs/kareldb-test" \
--setup-dir "output/swe-factory-runs/testbed" \
--results-path "output/swe-factory-runs/results" \
--conv-round-limit 3 \
--num-processes 1 \
--model-temperature 0.2
```

The debug script will:
- Monitor file creation every 10 seconds for 10 minutes
- Log all new files created
- Warn about files created outside the expected output directory
- Show the current working directory at each check
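
A minimal sketch of such a monitor (the real `debug_file_locations.py` may differ):

```python
import os
import sys
import time

def snapshot() -> set[str]:
    # All files under the current directory, as absolute paths.
    return {
        os.path.abspath(os.path.join(root, name))
        for root, _dirs, files in os.walk(".")
        for name in files
    }

def monitor(expected_dir: str, duration: int, interval: int) -> None:
    expected = os.path.abspath(expected_dir)
    seen = snapshot()  # baseline: pre-existing files are not "new"
    deadline = time.time() + duration
    while time.time() < deadline:
        time.sleep(interval)
        print(f"cwd: {os.getcwd()}")
        for path in sorted(snapshot() - seen):
            seen.add(path)
            flag = "" if path.startswith(expected) else "  <-- OUTSIDE OUTPUT DIR"
            print(f"new file: {path}{flag}")

if __name__ == "__main__":
    # Usage mirrors the example above: <output-dir> <seconds> <check-interval>
    monitor(sys.argv[1], int(sys.argv[2]), int(sys.argv[3]))
```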

### Step 3: Check the Logs
Look for:
1. **Expected behavior**: Files being written to the specified output directory
2. **Unexpected behavior**: Files being written to the current working directory or repository root
3. **Working directory changes**: Any unexpected changes in the current working directory

## Expected File Locations

Based on your command, files should be created in:

- **Task output files**: `output/swe-factory-runs/kareldb-test/kareldb-connection-1/`
- `Dockerfile`
- `eval.sh`
- `status.json`
- `cost.json`
- `meta.json`
- `problem_statement.txt`
- `developer_patch.diff`
- `info.log`
- `test_analysis_agent_0/` (subdirectory with test results)

- **Setup directory**: `output/swe-factory-runs/testbed/`
- Repository clones and working directories

- **Results**: `output/swe-factory-runs/results/results.json`
- Aggregated results from all tasks

## Troubleshooting

If files are still being created in the wrong location:

1. **Check the debug logs** to see exactly where files are being written
2. **Verify the output directory exists** and is writable
3. **Check for any error messages** about directory creation or file writing
4. **Ensure no other processes** are changing the working directory
5. **Verify the command line arguments** are being parsed correctly

## Additional Recommendations

1. **Use absolute paths** in your command line arguments
2. **Ensure the output directories exist** before running the command
3. **Check file permissions** on the output directories
4. **Monitor system resources** to ensure there are no disk space issues

## Code Changes Summary

The following files were modified to improve file location handling:

- `app/main.py`: Added absolute path safety checks
- `app/agents/agents_manager.py`: Added debug logging for file creation
- `app/agents/test_analysis_agent/test_analysis_agent.py`: Added debug logging for file creation
- `debug_file_locations.py`: Created debug script for monitoring file creation
- `FILE_LOCATION_DEBUG.md`: This documentation file
28 changes: 23 additions & 5 deletions app/agents/agents_manager.py
@@ -49,7 +49,7 @@ def __init__(self,
client: docker.DockerClient,
start_time: datetime,
max_iteration_num: int,
results_path:str,
results_path: str | None,
disable_memory_pool:bool,
disable_context_retrieval:bool,
disable_run_test:bool,
@@ -79,7 +79,19 @@ def __init__(self,
self.set_agent_status("context_retrieval_agent",True)
self.agents_dict['test_analysis_agent'].disable_context_retrieval= disable_context_retrieval
self.agents_dict['test_analysis_agent'].disable_run_test = disable_run_test
self.results_file = f'{results_path}/results.json'

# Handle None results_path by setting a default
if results_path is None:
results_path = os.path.join(output_dir, "results")

# Ensure results_path is absolute
if not os.path.isabs(results_path):
results_path = os.path.abspath(results_path)

# Create the results directory if it doesn't exist
os.makedirs(results_path, exist_ok=True)

self.results_file = os.path.join(results_path, 'results.json')
lock_path = self.results_file + '.lock'
self.lock = FileLock(lock_path, timeout=30)
with self.lock:
@@ -263,15 +275,21 @@ def run_workflow(self) -> None:
eval_script_content = self.agents_dict['write_eval_script_agent'].get_latest_eval_script()
eval_script_skeleton_content = self.agents_dict['write_eval_script_agent'].get_latest_eval_script_skeleton()
if dockerfile_content and eval_script_content:
with open(os.path.join(self.output_dir, "Dockerfile"), "w") as dockerfile_f:
dockerfile_path = os.path.join(self.output_dir, "Dockerfile")
logger.info(f"Writing Dockerfile to: {dockerfile_path}")
with open(dockerfile_path, "w") as dockerfile_f:
dockerfile_f.write(dockerfile_content)


with open(os.path.join(self.output_dir, "eval.sh"), "w") as eval_script_f:
eval_script_path = os.path.join(self.output_dir, "eval.sh")
logger.info(f"Writing eval.sh to: {eval_script_path}")
with open(eval_script_path, "w") as eval_script_f:
eval_script_f.write(eval_script_content)


with open(os.path.join(self.output_dir, "status.json"), "w") as status_file_f:
status_file_path = os.path.join(self.output_dir, "status.json")
logger.info(f"Writing status.json to: {status_file_path}")
with open(status_file_path, "w") as status_file_f:
json.dump({"is_finish": self.workflow_finish_status}, status_file_f)

if self.workflow_finish_status:
7 changes: 7 additions & 0 deletions app/agents/test_analysis_agent/docker_utils.py
@@ -339,6 +339,13 @@ def build_container(client,test_image_name,test_container_name,instance_id,run_t
command="tail -f /dev/null",
nano_cpus=None,
platform="linux/x86_64",
log_config={
"Type": "json-file",
"Config": {
"max-size": "10m",
"max-file": "3"
}
}
)

