
A commentary of Multi-skilled AI in MIT Technology Review 2021

Abstract: Towards the end of 2012, artificial intelligence (AI) scientists first figured out how to impart “vision” to neural networks. Later, they also learned how to enable neural networks to mimic human reasoning, hearing, speaking, and writing. Although AI has become comparable or even superior to humans at specific tasks, it still lacks the “flexibility” of the human brain, i.e., the ability to apply skills learned in one situation to another. Taking cues from how children grow and learn, we consider the following question: if senses and language can be combined, and AI can collect and process information at a level closer to that of humans, will it be able to develop an understanding of the world? The answer is yes. “Multi-modal” systems, which can simultaneously acquire human senses and language, yield significantly stronger AI and make it easier for AI to adapt to new situations and solve new problems. Hence, such algorithms can be used to solve more complex problems, or be implanted into robots for communication and collaboration with humans in our daily lives. In September 2020, researchers from the Allen Institute for AI (AI2) created a model that could generate images from captions, thus demonstrating the ability of the algorithm to associate words with visual information. In November of that year, scientists from the University of North Carolina at Chapel Hill developed a method of incorporating images into existing language models, which significantly enhanced the ability of the model to comprehend text. Early in 2021, OpenAI extended GPT-3 and released two visual language models: one associates the objects in an image with the words in its description, and the other generates a digital image based on a combination of concepts it has learned. In the long run, the progress made by “multi-modal” systems will help break through the limits of AI. It will not only unlock new AI applications, but also make these applications safer and more reliable. More sophisticated multi-modal systems will also aid the development of more advanced robot assistants. Ultimately, multi-modal systems may prove to be the first AI that we can trust.
Author: Rongrong Ji
Source: Fundamental Research (English edition), CAS, 2021, Issue 6, pp. 844-845 (2 pages)