The Eclipse Migration Toolkit for Java (EMT4J) is an open source project under the Eclipse Adoptium Working Group, designed to simplify migration between Java versions.
For years, enterprises have faced nontrivial hurdles in migrating to a new Java version. These challenges often stem from API incompatibilities, legacy dependencies, runtime behavioural changes that may break applications, and complex internal technology stacks. Manually identifying and resolving these issues is time-consuming and error-prone, leading to prolonged migration timelines and increased costs. EMT4J streamlines this process by offering compatibility scanning and automated code remediation. Internally, EMT4J maintains a set of scanning rules, which are paired with the OpenRewrite project to enable automated code remediation.
The rapid advancement of Large Language Models (LLMs) is showcasing AI's pivotal role in programming. Products like Cursor and GitHub Copilot, powered by LLMs, are now widely adopted by developers, regardless of their experience level. At its core, Java version migration is a programming task, which makes it a natural fit for applying AI to drive more automation. This is precisely EMT4J's next step: to build an AI agent specifically designed to automate Java version migration.
AI Agent
What?
LLMs are no longer just about generating text. They are rapidly gaining the ability to tackle intricate, multi-step challenges. Thanks to breakthroughs in their reasoning capabilities, understanding of various data types (multimodality), and proficiency in using external tools, these powerful LLMs are now the driving force behind a new class of AI systems: agents. Essentially, agents are systems built to execute tasks independently on your behalf. Think of them as intelligent software entities that can reason, plan, and execute.
An AI agent typically comprises several key components:
- Perception: The ability to receive and interpret information from its environment (e.g., reading code, analysing project configurations).
- Reasoning/Planning: The capacity to process perceived information, understand the current state, and formulate a plan to reach a desired goal. This often involves breaking down complex problems into smaller, manageable sub-tasks.
- Action: The ability to execute planned steps, which could involve writing code, modifying files, running commands, or interacting with other systems.
- Memory: The capability to store and retrieve past experiences, learning from previous interactions to improve future performance.
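To make these four components concrete, here is a minimal sketch of how they might fit together in a loop. All of the type names (`Perception`, `Planner`, `Actuator`, `AgentLoop`) are hypothetical and chosen for illustration only. They are not part of EMT4J or of any agent framework, and a real planner would call an LLM rather than a lambda.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

/** Perception: reads the current state of the environment. */
interface Perception<S> {
    S observe();
}

/** Reasoning/Planning: picks the next action, or empty when the goal is reached. */
interface Planner<S, A> {
    Optional<A> nextAction(S state, List<String> memory);
}

/** Action: carries out one step and reports what happened. */
interface Actuator<A> {
    String execute(A action);
}

/** Hypothetical loop wiring the components together; memory is a plain list of outcomes. */
final class AgentLoop<S, A> {
    private final Perception<S> perception;
    private final Planner<S, A> planner;
    private final Actuator<A> actuator;
    private final List<String> memory = new ArrayList<>();

    AgentLoop(Perception<S> perception, Planner<S, A> planner, Actuator<A> actuator) {
        this.perception = perception;
        this.planner = planner;
        this.actuator = actuator;
    }

    /** Observes, plans and acts until the planner has nothing left to do or maxSteps is hit. */
    List<String> run(int maxSteps) {
        for (int step = 0; step < maxSteps; step++) {
            S state = perception.observe();
            Optional<A> action = planner.nextAction(state, List.copyOf(memory));
            if (action.isEmpty()) {
                break; // the planner considers the goal reached
            }
            // Memory: keep each outcome so later planning can take it into account.
            memory.add(actuator.execute(action.get()));
        }
        return List.copyOf(memory);
    }
}
```

The `maxSteps` bound is a deliberate design choice in this sketch: an autonomous loop needs some stopping condition other than the planner declaring success.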
Why?
Java version migration is complex, often involving a cascade of changes that go beyond simple find-and-replace operations. Here is why AI agents are uniquely positioned to revolutionise this process:
1. Holistic Problem Solving Beyond Rule-Based Systems:
The current EMT4J excels at identifying and resolving known incompatibilities through predefined rules and static/runtime analysis. This approach can fall short when facing subtle behavioural changes or deeply intertwined dependencies that cannot easily be encoded as rules. AI agents, however, can go beyond fixed rules. They can leverage their reasoning capabilities to understand the context of the code and identify patterns that might lead to a problem in a new JDK version.
2. Dynamic Adaptation and Learning:
Java's ecosystem is constantly evolving. New libraries emerge, existing ones get updated, and best practices shift. A rule-based system requires constant manual updates to keep pace. An AI agent, drawing on its memory component, can adapt to these changes. It can learn from successful and failed migrations, continuously refining its understanding of compatibility issues and optimal repair strategies. This means it can become more effective over time without requiring human intervention for every new problem.
3. Complex Code Transformation and Refactoring:
Migrating to a new Java version often requires more than just simple API changes. It can involve significant refactoring and changes in build configurations (e.g., Maven, Gradle). An AI agent can be built to perform these more complex code transformations. Given a goal (e.g., "migrate this Spring Boot application from Java 8 to Java 17"), it can generate or modify code, update configuration files, and even run tests to validate its changes, all autonomously.
4. End-to-End Automation:
Imagine an AI agent capable of not just identifying and fixing code, but also:
- Analysing the entire project structure: Understanding dependencies, build scripts and external integrations.
- Planning the migration steps.
- Executing the migration: Modifying source code, updating dependencies, potentially even running build and test pipelines.
- Debugging and iterating: If tests fail, the agent could analyse the failures, identify potential causes, and attempt further corrections, learning from its mistakes.
This level of end-to-end automation transforms a manual process into a highly efficient, AI-driven workflow.
How?
So, how do we build an AI agent for Java migration? Here are our thoughts.
Migration Workflow
A fully functional and powerful AI agent typically plans its tasks autonomously based on the given objective. However, for Java version migration, the workflow is usually quite fixed: first, modifying the JDK version; then, building the project; and finally, testing it. If errors occur, they need to be fixed before retrying that step.
Because of this predictable sequence, we don't need the agent to plan the entire workflow. Instead, its autonomy can be focused on sub-task decomposition. For instance, if tests fail, the agent could decide whether to fix issues one by one or attempt a batch fix.
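The fixed sequence described above could be sketched roughly as follows. This is an illustration under stated assumptions, not EMT4J's implementation: `StepResult`, `Fixer` and `MigrationWorkflow` are hypothetical names, each step is supplied by the caller, and `Fixer` stands in for the point where the agent would decide how to repair a failure (one by one or in a batch).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

/** Outcome of one workflow step; errorLog is empty when the step succeeded. */
record StepResult(boolean success, String errorLog) {
    static StepResult ok() { return new StepResult(true, ""); }
    static StepResult failed(String errorLog) { return new StepResult(false, errorLog); }
}

/** Hypothetical hook where an LLM-backed agent would attempt a repair. */
interface Fixer {
    void fix(String stepName, String errorLog);
}

/** Runs caller-supplied steps in a fixed order, retrying a failed step after a fix. */
final class MigrationWorkflow {
    private final int maxAttemptsPerStep;
    private final Fixer fixer;
    private final List<String> stepNames = new ArrayList<>();
    private final List<Supplier<StepResult>> steps = new ArrayList<>();

    MigrationWorkflow(int maxAttemptsPerStep, Fixer fixer) {
        this.maxAttemptsPerStep = maxAttemptsPerStep;
        this.fixer = fixer;
    }

    /** Appends a step, for example "set-jdk", "build" or "test". */
    MigrationWorkflow step(String name, Supplier<StepResult> action) {
        stepNames.add(name);
        steps.add(action);
        return this;
    }

    /** Returns true only if every step eventually succeeded. */
    boolean run() {
        for (int i = 0; i < steps.size(); i++) {
            boolean passed = false;
            for (int attempt = 1; attempt <= maxAttemptsPerStep && !passed; attempt++) {
                StepResult result = steps.get(i).get();
                if (result.success()) {
                    passed = true;
                } else if (attempt < maxAttemptsPerStep) {
                    // Only ask for a fix when another attempt will follow.
                    fixer.fix(stepNames.get(i), result.errorLog());
                }
            }
            if (!passed) {
                return false; // give up; later steps depend on this one
            }
        }
        return true;
    }
}
```

The order of steps is fixed by the caller, so the only decision left to the agent in this sketch is what happens inside `Fixer.fix`.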
Prompt Templates
Building effective LLM prompt templates is crucial for getting consistent and accurate results from large language models. Think of a prompt template as a structured blueprint for your instructions, guiding the LLM to generate desired outputs. We need to define a prompt for every step of our interaction with the LLM. We can start with the simplest instructions and gradually refine them. There are many excellent resources available on how to optimise prompts, such as the Prompt Engineering Guide.
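As a starting point, a prompt template can be as simple as a string with named placeholders. The sketch below assumes a `{{name}}` placeholder syntax and a hypothetical `PromptTemplate` class. Neither comes from EMT4J or from any specific LLM library, and the template wording in the usage example is only an illustration.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical template with {{name}} placeholders, filled in before calling the LLM. */
final class PromptTemplate {
    private static final Pattern PLACEHOLDER = Pattern.compile("\\{\\{(\\w+)\\}\\}");
    private final String template;

    PromptTemplate(String template) {
        this.template = template;
    }

    /** Replaces every placeholder; fails fast if a value is missing. */
    String render(Map<String, String> values) {
        Matcher matcher = PLACEHOLDER.matcher(template);
        StringBuilder out = new StringBuilder();
        while (matcher.find()) {
            String key = matcher.group(1);
            String value = values.get(key);
            if (value == null) {
                throw new IllegalArgumentException("Missing value for placeholder: " + key);
            }
            // quoteReplacement keeps '$' and '\' in error logs from being treated as group references.
            matcher.appendReplacement(out, Matcher.quoteReplacement(value));
        }
        matcher.appendTail(out);
        return out.toString();
    }
}
```

A template for the "fix a build error" step might then be used like this:

```java
PromptTemplate fixBuildError = new PromptTemplate(
    "You are migrating a Java project from JDK {{source}} to JDK {{target}}.\n"
    + "The build failed with:\n{{error}}\n"
    + "Propose the smallest code change that fixes it.");
String prompt = fixBuildError.render(Map.of(
    "source", "8", "target", "17", "error", "package sun.misc does not exist"));
```

Failing fast on a missing value is a deliberate choice here: a prompt with an unfilled placeholder tends to produce confusing LLM output that is hard to trace back to its cause.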
RAG: Knowledge
We can significantly enhance the agent's capabilities by integrating Retrieval Augmented Generation (RAG) to provide the LLM with richer contextual information. For Java version migration specifically, this is incredibly valuable. Official Oracle migration documents, alongside any internal enterprise documents related to Java version migration, contain crucial knowledge. Leveraging these resources via RAG can help the agent complete tasks with greater accuracy and efficiency.
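The retrieve-then-augment idea can be illustrated with a deliberately simple sketch. It ranks documentation snippets by keyword overlap with the query and prepends the best matches to the prompt. Keyword overlap is a stand-in chosen to keep the example self-contained: a production RAG setup would more likely use embeddings and a vector store. `Snippet` and `KeywordRetriever` are hypothetical names.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

/** One chunk of migration documentation and where it came from. */
record Snippet(String source, String text) {}

/** Hypothetical retriever that scores snippets by the number of shared keywords. */
final class KeywordRetriever {
    private final List<Snippet> corpus;

    KeywordRetriever(List<Snippet> corpus) {
        this.corpus = List.copyOf(corpus);
    }

    /** Lower-cases, splits on non-alphanumerics, and drops very short tokens. */
    private static Set<String> tokens(String text) {
        return Arrays.stream(text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+"))
            .filter(token -> token.length() > 2)
            .collect(Collectors.toSet());
    }

    private static long overlap(Set<String> a, Set<String> b) {
        return a.stream().filter(b::contains).count();
    }

    /** Returns up to k snippets sharing at least one keyword with the query, best first. */
    List<Snippet> topK(String query, int k) {
        Set<String> queryTokens = tokens(query);
        return corpus.stream()
            .map(snippet -> Map.entry(snippet, overlap(queryTokens, tokens(snippet.text()))))
            .filter(entry -> entry.getValue() > 0)
            .sorted(Map.Entry.<Snippet, Long>comparingByValue().reversed())
            .limit(k)
            .map(Map.Entry::getKey)
            .collect(Collectors.toList());
    }

    /** Builds the augmented prompt: retrieved context first, then the question. */
    static String augment(String question, List<Snippet> context) {
        StringBuilder prompt = new StringBuilder("Use the following migration notes when answering.\n");
        for (Snippet snippet : context) {
            prompt.append("- [").append(snippet.source()).append("] ").append(snippet.text()).append('\n');
        }
        return prompt.append("\nQuestion: ").append(question).toString();
    }
}
```

Keeping the `source` alongside each snippet lets the agent cite where a suggestion came from in its migration report.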
Essential Tools
For the LLM to interact with a Java project, it needs a robust set of specialised tools. These tools are like the agent's hands, enabling it to perform concrete actions.
Here are some essential tools:
- Project Analysis: Helps the LLM understand the project's structure, including the build system used and its dependencies.
- Code Operation: While LLMs can directly modify project code or configurations, it's often more convenient to leverage specific, pre-built "Recipes" from the OpenRewrite project for common issues.
- Build and Test: Building the project and running tests are mandatory steps in Java version migration. Therefore, we need tools that enable the agent to perform these actions. Additionally, tools for extracting error information when failures occur are crucial, as they help the agent plan subsequent steps.
- Others: e.g., Git operations and migration reports to make it easier for users to review results.
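One way to expose such tools to the agent is a small common interface plus a registry the LLM can pick from by name. The sketch below is an assumption about how this could look, not EMT4J's design: `Tool`, `ToolRegistry` and `BuildErrorExtractor` are hypothetical names. The one concrete tool shown covers the error-extraction case and relies on Maven prefixing error lines with `[ERROR]`.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

/** Hypothetical contract for anything the agent can call. */
interface Tool {
    String name();
    String description();
    String invoke(String input);
}

/** Reduces raw Maven output to its distinct, non-empty [ERROR] lines. */
final class BuildErrorExtractor implements Tool {
    private static final String PREFIX = "[ERROR]";

    @Override public String name() { return "extract_build_errors"; }

    @Override public String description() {
        return "Returns only the error lines from Maven build output.";
    }

    @Override public String invoke(String buildOutput) {
        return buildOutput.lines()
            .filter(line -> line.startsWith(PREFIX))
            .map(line -> line.substring(PREFIX.length()).trim())
            .filter(line -> !line.isEmpty())
            .distinct()
            .collect(Collectors.joining("\n"));
    }
}

/** Looks tools up by name, so an LLM response such as "call extract_build_errors" can be dispatched. */
final class ToolRegistry {
    private final Map<String, Tool> tools = new LinkedHashMap<>();

    ToolRegistry register(Tool tool) {
        tools.put(tool.name(), tool);
        return this;
    }

    String call(String name, String input) {
        Tool tool = tools.get(name);
        if (tool == null) {
            throw new IllegalArgumentException("Unknown tool: " + name);
        }
        return tool.invoke(input);
    }

    /** One "name: description" line per tool, suitable for inclusion in a prompt. */
    String describeAll() {
        return tools.values().stream()
            .map(tool -> tool.name() + ": " + tool.description())
            .collect(Collectors.joining("\n"));
    }
}
```

Trimming build output down to the error lines before handing it to the LLM keeps prompts short, which matters when a failing build produces thousands of log lines.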
Conclusion
We believe an AI agent is currently the most promising solution for tackling the persistent challenges of Java version migration. This solution is still under development, but our vision is clear: to move beyond rigid, rule-based automation towards a more intelligent, adaptable, and autonomous approach. Ultimately, our goal with EMT4J is to empower developers, freeing them from tedious manual tasks and enabling a smoother, faster transition to the latest Java innovations.
Thanks to EMT4J, Alibaba has already successfully migrated its internal core applications to Java 11, 17, or even 21. However, a significant number of long-tail applications (both internally and among our external customers) still run on Java 8. Our immediate objective is to leverage this AI agent solution to facilitate the seamless migration of these remaining applications.
If you have any suggestions or questions, please don't hesitate to contact us.