Three challenges to human-robot collaboration
Denso operates in 35 countries and regions around the world under its corporate philosophy of “contributing to a better world by creating value together with a vision for the future.” Denso is Japan’s largest automotive parts manufacturer and the world’s second largest by sales. As digitalization transforms the automotive industry, the company is working to create new value in four areas with an eye on the future: electrification, advanced safety/automated driving, connected driving, and non-vehicle business (factory automation/agriculture).
As part of this effort, Denso has been working since April 2023 on developing an autonomous robot using OpenAI’s ChatGPT, a generative AI that has been commanding much public attention since its release in November 2022.
Takeshi Narisako, Director of the Cloud Services R&D Division and Senior Director of the Research & Development Center, explained the potential of robots: “AI has gotten much smarter over the past few years, and its applications for the autonomous control of automobiles, robots, and a wide variety of other machines are expanding. With future advancements in AI and the development of global legislation covering its use, a world where various robots autonomously provide operations and services in towns and cities is quickly becoming a reality.”
Denso has defined the value it should provide in such a future society as “realizing a society where humans and robots work together and all people can live life with a smile on their faces.” As things stand today, however, robots face several challenges to pursuing this goal. First, because robot movements are pre-programmed, robots can’t respond to unexpected situations. Society needs a myriad of tasks done, and this makes it difficult for robots both to respond to sudden or unexpected tasks and to collaborate flexibly with people.
In addition, not all people have the programming knowledge necessary to register a robot’s movements, which means that even with a robot close by, many people wouldn’t be able to use it well. Furthermore, the more tasks one wants a robot to perform, the more complicated movement registration itself becomes.
Human-like robots acting autonomously based on human language
Denso’s generative-AI-based robot control technology aims to overcome these challenges. The goal is a control technology for a robot that can understand human instructions and autonomously decide what action to take, drawing on generative AI trained on the vast amount of data available on the internet.
Keitaro Minami, Project Assistant Manager of Automation Innovation Section, Business Innovation Department, Cloud Services R&D Division, Denso Corporation, says, “Conventional robots are inflexible machines that act based on the movements and instructions they are given. In contrast, we are developing control technology in order to realize a human-like robot that acts according to human language and can also easily correct its errors in judgment when a human points them out.”
Minami explains that the most important feature is the dialogue with a person when receiving instructions. When instructed to perform a task that has not been pre-registered, a conventional robot will simply return a message saying it cannot perform that task. Denso’s robot control technology, by contrast, lets the robot propose how it could handle the task based on what it is capable of doing. For example, prior learning through generative AI can enable a robot that serves water and coffee to determine that the desired drink is coffee when a human orders “the black one” or “the drink that wakes you up.”
“You used to have to write a huge number of branching action programs to realize such a flexible response. Our proprietary control technology, Generative-AI-Robot Technology, instead enables a robot to determine what ‘it’ and ‘that’ mean and to act autonomously by combining generative AI with the rules of the real world,” says Minami.
In the manufacturing world, a trend toward high-mix, low-volume manufacturing is gaining speed against a backdrop of market maturity and diversifying needs. The Denso Group has already developed multifunctional robots that can perform multiple tasks using AI. When the new robot is completed, however, it is expected to perform an even wider range of tasks and to enable new forms of human-robot collaboration, such as a robot that records the steps of various tasks and advises the human performing each one. Verbal work instructions will significantly lower the hurdles to using the robot, so it can be expected to see use not only in factories but also in stores, in towns, and elsewhere.
Microsoft chosen for development combining Generative AI and robots
Behind Denso’s development of a new robot was the company’s early promotion of software utilization in anticipation of the coming mobility society. Since joining Denso in 2016, Narisako has been promoting the use of cloud computing by building an agile development team armed with knowledge from his previous work in the IT industry. The direct impetus for developing this robot came from a suggestion from the field that, since the team had knowledge of both IT and industrial robotics, it should take the lead in generative-AI-driven robot control; in other words, in linking the virtual and real worlds.
“We’ve had an eye on generative AI since it first came out, and through various types of verification we came to see it as a new technology that would change the world. Through discussions, we decided that this was an initiative where we could display our uniqueness, and we launched the project with the existing robot multi-skill building team at its core,” recalls Narisako.
At the time, however, generative AI was a new technology, and Denso naturally didn’t yet have the expertise to utilize it, so a stand-alone initiative was expected to run into difficulties. That’s when the company approached Microsoft about joint development. Denso had already built a relationship of trust with Microsoft through a long-standing partnership in modernizing its internal IT. In addition, Microsoft and OpenAI are extremely close partners, and the deciding factor was that Denso could expect extensive technical support for Azure OpenAI Service, Microsoft’s cloud AI service that provides OpenAI’s AI technology.
The new robot’s overall system works as follows. First, Microsoft’s cloud AI services, Azure OpenAI Service and Azure AI Services, serve as the brain that understands human instructions and directs the physical robot: the former handles interaction with humans and various decisions, while the latter takes care of speech recognition and output. Meanwhile, edge devices such as Denso Wave’s COBOTTA robotic arm and an Autonomous Mobile Robot (AMR) act as “arms and legs” in the real world, and are fitted with microphones and speakers as real-world ears and mouths.
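The division of labor described above can be sketched as a simple control loop. This is purely illustrative: the function names (`transcribe`, `decide_action`, `run_pipeline`) are assumptions for the sketch, and the placeholder bodies stand in for the real Azure AI Services and Azure OpenAI Service calls, which the article does not detail.

```python
# Illustrative sketch of the instruction pipeline described in the article.
# The service wrappers below are placeholders, not real Azure SDK calls.

def transcribe(audio: bytes) -> str:
    """Stand-in for speech-to-text (Azure AI Services, the 'ears')."""
    return audio.decode("utf-8")  # pretend the audio is already plain text

def decide_action(instruction: str) -> str:
    """Stand-in for the generative-AI 'brain' (Azure OpenAI Service)."""
    # A real system would send the instruction plus the robot's available
    # functions to the model; here a keyword match serves for illustration.
    return "ring_bell" if "bell" in instruction else "noop"

def run_pipeline(audio: bytes) -> str:
    text = transcribe(audio)       # ears: microphone + speech recognition
    action = decide_action(text)   # brain: cloud-hosted generative AI
    return action                  # arms/legs: command sent to the edge robot

print(run_pipeline(b"ring the bell"))  # -> ring_bell
```

The point of the structure is the separation the article describes: perception and decision-making live in the cloud, while the edge device only receives a concrete action to execute.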
Azure OpenAI Service’s new features are the key to improving precision of movement
The range of possibilities for combining generative AI and robots is expanding rapidly. Less than a year since development began, achievements already include assembling and ringing a bell, having a transport robot and a barista robot work together to serve coffee to customers, and making cocktails through dialogue. Other initiatives have also begun, including having robots work alongside humans as sales staff in stores, recommending products and providing other customer services.
Getting this far, however, involved some major technical difficulties. Development began with a robot control method in which examples of a control program’s output were passed to the generative AI, which was then made to generate a control program in response to human instructions. Its biggest drawback was that accuracy could not be guaranteed with the generative AI available at the time, and this became a barrier to improving the precision of the robot’s movements. To break through this barrier, Minami focused on a feature that had just been added to Azure OpenAI Service: selecting the appropriate function (movement) for responding to input text.
“This feature makes it possible to have a robot learn potential movements as functions in advance, so that the AI can select the function that it judges to be the most appropriate in response to human instructions,” says Minami.
For example, if a person instructs the robot to “ring the bell twice,” the generative AI selects the “ring the bell” function from among “grab the bottle,” “move the part to the specified location,” and the other functions (i.e., robot movements) given in advance, and from the word “twice” it further instructs the edge device to shake the arm twice. Instead of a person specifying each movement individually, the generative AI determines the corresponding movements, and their order, from “my possible actions (functions),” realizing autonomous robot movement. This method also makes it possible to increase the number of robot movements simply by adding functions, just as a person learns new skills.
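The selection-and-dispatch mechanism described above can be sketched as follows. The function names (`ring_bell`, `grab_bottle`), their signatures, and the hard-coded model reply are all illustrative assumptions; in a real system the JSON call would come from Azure OpenAI Service’s function-calling response rather than being written by hand.

```python
import json

# Candidate robot movements registered in advance (the "functions").
# These are hypothetical stand-ins, not Denso's actual movement library.
def ring_bell(times: int) -> list[str]:
    return ["ring"] * times          # shake the arm once per ring

def grab_bottle() -> list[str]:
    return ["grab_bottle"]

REGISTRY = {"ring_bell": ring_bell, "grab_bottle": grab_bottle}

# In a real system this JSON would be the model's function-calling response
# to "ring the bell twice"; it is hard-coded here for illustration.
model_reply = json.dumps({"name": "ring_bell", "arguments": {"times": 2}})

def dispatch(reply_json: str) -> list[str]:
    call = json.loads(reply_json)
    fn = REGISTRY[call["name"]]       # select the registered movement
    return fn(**call["arguments"])    # execute it with model-chosen arguments

print(dispatch(model_reply))  # -> ['ring', 'ring']
```

Extending the robot’s repertoire then amounts to adding an entry to the registry and describing it to the model, which matches the article’s point that new movements can be added “just as a person learns.”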
Radically streamlined development thanks to GitHub Copilot
Actual testing of the new feature has confirmed that it can significantly improve the accuracy of actions compared to ad-hoc code generation. Since then, the method of combining generative AI and robots has been reevaluated with this feature as the backbone. At the same time, using GitHub Copilot to develop new actions (functions) has led, in a short period of time, to the expansion of possible movements.
Minami adds, “GitHub Copilot has been very successful in both shortening the time of and reducing the cost of developing motion programs. The increased efficiency in developing functions to control robots should greatly help future development and expansion of new programs for a variety of human-instructed actions.”
While robot control posed quite a few difficulties, setting up the development environment went smoothly from the start, again thanks to the various development support features GitHub Copilot provides. Because Denso was already looking ahead to the social implementation of robots, there were many factors to consider in the system design, including scalability and functionality. GitHub Copilot not only generates code automatically, but also inserts comments (explanatory text) to aid understanding of the code, searches for the causes of errors, and offers various other features. Using these features radically improved the efficiency of the development process: it was like receiving support from professional developers, and it let the team focus on the work that only humans should do. As a result, the basic architecture was implemented in a mere two days, after which configuration changes to ensure security and other work to improve the platform’s practicality were carried out.
Looking back on the work put in so far, Narisako assesses the results as “a very good start, through R&D, into creating social value through the use of generative AI.”
“The fact that the robot was able to operate autonomously in such a short period of time is the result of the concerted efforts of our technical staff, who can prototype both hardware and software quickly while applying their AI and IT skills,” added Narisako.
A focus on finding applications through open innovation
Autonomous robotic movement is determined by a number of factors, including the way various functions and prompts (specific instruction content) are described, and thus requires multiple verifications to ensure it operates correctly. Minami points out that one of the reasons it was possible to increase the number of possible actions in such a short period of time is that Denso developed its own unique verification tool that can modify and implement functions and prompts without any extra time or effort.
However, he points out that the R&D has only just gotten underway and many challenges remain. One of the first items for consideration from a technological standpoint is improving flexibility of deployment. Currently, the features that serve as the brain of the system are located in the cloud, but depending on the usage scenario, there may be cases where running the brain locally would be preferable.
Currently, thanks to the use of containers, the system is designed to run in any location. According to Minami, however, for robots to be widely used in society, deployment flexibility will need to improve further still.
Minami states that increasing the number of things that robots can do autonomously will require that they be able to grasp various aspects of the external environment, and to provide the input for this, Denso will continue to develop linkages with its various sensors. Narisako also emphasizes that co-creation of utilization scenarios will be essential for social implementation.
“We are in the stage of studying the architecture for social implementation from a bird’s-eye view, toward the realization of a world where humans and robots work together,” says Narisako. “In order to create social value, you need to discover usage scenarios that will make people happy, and there is a limit to what our company can do alone in this regard. From now on, we need to collaborate with more companies and organizations and work to create value through open innovation. We’d like to work with Microsoft not only in technological areas such as autonomous robot control using generative AI, but also in this kind of social implementation.” He adds, “We are confident that through the partnership between Microsoft, which is strong in cutting-edge technologies such as generative AI, and Denso, which excels in hardware and software development, we will lead the way in the rapidly evolving field of robot applications of generative AI, and will be able to realize our vision of a world where people and robots work together and society as a whole is full of joy.”